Time to End the English-Only Internet
September 27, 2007 Internet, World
According to a study at the University of Guelph, 70% of the internet is in English. Considering how one in five humans can speak English to some level of competence, I’d say the internet is just too darned monolingual.
Thanks to translation services at AltaVista and Google, people can offer their pages in several other languages (WordPress bloggers can use GlobalTranslator, which simplifies the services), but machine translation often leaves much to be desired. Slang comes out wrong. Misspelled or misused words can make sentences unreadable in other languages (a prime example is the misuse of “their”, “there” and “they’re”). On top of that, most of us who can speak several languages tend to use only one when we write something online (who wants to write a blog post or static web page two or three times in different languages, after all?).
But despite some of the current limitations of machine translated pages, they’re still better than nothing.
Of course, there are some benefits to offering your site in multiple languages. I’ve had GlobalTranslator on this site since January and have found that a full 38% of my readers are reading content in a language other than English. Spanish seems to be the most popular, with a full 22% of the readership, followed closely by Traditional Chinese and German. If the study’s 70% English figure is accurate, then all those “make money online” blogs are potentially missing out on a massive market.
And this language issue is the crux of my problem.
I read hundreds of blogs every week from every continent save Africa (I don’t think they’ve discovered blogging, yet), and 80% of these blogs have a fatal flaw in their website: English-Only Commenting. Perhaps I should rephrase that … because it’s not the comments themselves that I want to see in other languages … it’s the name fields.
Since I read so many blogs, there are times when I want to leave a comment or two. Because there are so many other people with the name Jason, I try to mix it up a bit by entering “ジェイソン (Jason)”. This does a few things for me. When I write my name in Japanese Katakana, people are a bit more likely to click the link back to my site (I’ve noticed a significant jump in visits when using non-English characters in my name when commenting), and it sets me apart from all the other Jasons on the internet. I’ll be the first to admit that there are several people with this name here in Japan (I’ll be working with one shortly, too), but the odds of others using this visual link-clicking tactic have been slim thus far. Unfortunately, when I start using Unicode characters, websites start to complain.
The error I’m presented with is often a lovely MySQL error that spits out the raw query and tells me which tables or columns are rejecting the data. This is most often seen with privately hosted WordPress blogs, as the larger blogging sites (Blogger, WordPress.com, etc.) tend to be happy with Unicode characters. I can understand that most people don’t really give much thought to the table types and collations when setting up a database for their blog (I also understand that most people don’t even know about these things to begin with), but I’m curious to know why the default tends to be less friendly towards the East-Asian and Middle-Eastern character systems. There are far more people speaking Mandarin and Arabic than English.
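To make the character set problem a little more concrete, here is a rough Python sketch of why a name like mine cannot fit into a column that only understands a Western European character set. It does not talk to MySQL at all. The fits_charset helper is a made-up name for illustration, and Python’s "latin-1" codec is only a stand-in for a Latin-1 style column; what a real database does with such input (an error, question marks, or silent truncation) depends on its configuration.

```python
def fits_charset(text: str, charset: str) -> bool:
    """Return True if every character in text can be encoded in charset.

    Hypothetical helper for illustration only; it mimics the idea of a
    column character set, not the actual behaviour of any database.
    """
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


name = "ジェイソン (Jason)"

# Katakana has no representation in a single-byte Western charset.
print(fits_charset(name, "latin-1"))   # False

# UTF-8 can represent every Unicode character.
print(fits_charset(name, "utf-8"))     # True

# Each Katakana character takes three bytes in UTF-8, so the byte
# length is larger than the character count.
print(len(name), len(name.encode("utf-8")))   # 13 23
```

The last line is also a reminder that a name field sized in bytes rather than characters fills up faster with non-English names.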
According to the Internet World Stats page, approximately 17.8% of the world’s population has access to the internet. This is roughly the same percentage of people that also speak English on some level. As more of us connect to the global community, I’d love to see webpages become more language-friendly. Whether it’s in the form of a machine-translation service, or a nice UTF-8 Unicode-capable site that can handle people’s names and comments in their native language, any progress towards greater communication would be beneficial. Of course, if people are posting comments in languages other than the site’s primary target, these comments would need to be translated before being entered into the database (perhaps making the original comment available in the foreign language on request?).
The Internet has been called a Global Village. This might have been an apt analogy at first, but the tiny village quickly grew into a big city with its own Chinatown, Little Italy, and every other pocket of culture that we see in the physical world today.
Should we try to reduce the language barriers that exist on the Internet today? Or is it better to leave things as they are?
Comments (10)
Your “foreign language readers” number is fascinating. It’s been my experience that machine translators are pretty lacking when you want to look up more than individual words. I never realized that so many people rely on them. Those “make money online” blogs might be interested in reading this article.
I was quite surprised when I found out that WordPress doesn’t default to Unicode. It was a bit of a headache to change the encoding for posts/comments in my db. I think it’s worth it to make the switch though. Why not allow somebody to comment in Thai or whatever?
Still, I doubt machine translation is up to par for real cross-language communication in, say, blog comments. I have no proof, but I imagine most translations would come up as gibberish.
As the Japanese machine translators would say: “There is no ginger.”
I wouldn’t want people to rely exclusively on machine translation, as it tends to miss many of the subtleties of language. However, it does allow for a greater audience. After years of reading “Engrish” on IRC and various web forums and blogs, most of us can get the gist of a subject by picking out key words from the gibberish.
That said … machine translation wouldn’t do much good for those Engrish sites where mis-spellings and l33t sp34k (is it still called that?) are used heavily.
Very interesting. I wondered why you were writing your name in Japanese in your comments, and now I know!
I’ve just started exploring SecondLife and am seeing the same “pockets of culture” there, too. People generally feel more comfortable in an environment using their native language, so they all congregate in the same places.
I agree that we generally feel more comfortable around people that we can relate to and communicate with. The issue that I was trying to address is the over-abundance of English-based sites that don’t allow non-English speakers to take part without conforming in some way. Considering how so many things are just a Google away from people, it would be in every webmaster’s best interest to make their site as language-friendly as possible. Thanks to the GlobalTranslator plugin, the three big search engines (Google, Yahoo and MSN Live) have all sent me good chunks of traffic to those non-English pages. While machine translation may not be the best solution at the moment, by making sites more available to people who speak some of the world’s most common languages we can facilitate a better dialogue with peoples across the globe. If translation engines become even more necessary to help facilitate these exchanges, then the effort will be made to improve and enhance the language tools.
Well … that’s the idea, anyways.
The idea that they “haven’t discovered” blogging in Africa yet is utterly, totally wrong.
Start here:
http://www.globalvoicesonline.org/-/world/sub-saharan-africa/
http://www.globalvoicesonline.org/-/world/middle-east-north-africa/
I have to admit when I saw the Katakana in the name I wondered what this guy was up to, so I decided to take a look. A job well done on the lure!
I tried to install this plugin in my blog, but couldn’t get it to work. So I came to your site to try it out and it doesn’t seem to work here either.
Thanks for bringing this to my attention, Thomas. The errors seem pretty recent, as I’ve had several non-English readers in the last few days. I’ll look into the code to see if the plug-in needs to be updated, or if BabelFish is no longer offering this service.
Congrats on posting comment #500, too.
No problem. If you get it figured out, let me know, as I’d like to have multi-language support on my blog too.
For what it’s worth, if you copy and paste the url that shows up into your browser, it brings you to a translated page. For some reason it just doesn’t redirect you automatically.
Here’s to 500!
From what I can tell, the data that’s coming back from the translation engine is just a little different from before, and the plugin is parsing it incorrectly. I don’t have much free time anymore, but I’ll see about getting this resolved before work tomorrow.