Crowdsourcing and Localization

Traditionally, localization has been carried out by translators who are fluent in English and their native language, qualified and trained to translate between these two languages, and paid based on the number of words they translate.

A few companies are trying another approach: getting their own users to translate for free. This method can be called community localization – translation by the community of users – or crowdsourced localization. Facebook is one of the more prominent companies trying this technique, and it appears to be working well for them, since they have filed a patent for their method of localization by the community. Using this method, Facebook has expanded from an English-only website at the beginning of 2008 to supporting more than 70 languages as of this month (October 2009).

While Facebook is a good case study of what is possible when you crowdsource your translations, I really want to talk in more general terms about crowdsourcing versus traditional localization methods.

The first reaction of some people in the localization profession to the crowdsourcing model is analogous to that of any entrenched industry faced with having its business model pulled out from under its feet (consider the music industry in the era of torrents and MP3 files). But the reality is less black-and-white. There are pros and cons to consider, and while the user-generated translations themselves are free, it would be false to pretend that there is no cost to the company employing this model of localization.

So, which model – traditional or crowdsourced – is better?

Assuming you are considering the localization of your product, one of the first questions you should ask yourself is this: “Do I have a large, invested community of users?” Not every product does. For crowdsourcing to be a viable solution, you really need a large pool of users – and I don’t mean casual users here, I’m talking about users who really care about your product. After all, not every user who speaks a language will have both the free time and the passion for your product to volunteer to translate your site for free.

One challenge will become immediately clear: if you pay for translation, it will happen, on schedule. It’s predictable in a good way. The community, on the other hand, will work on their own schedule. It may be quick, or slow, or even never completed 100% for some languages. You need to ask yourself if you are comfortable surrendering control over when your localized versions go live.

Other questions which can help make the decision for you:

  • Is my user demographic enterprise, SMB, consumer, etc.?
  • Is my product free or ad-supported, or am I charging customers?
  • Does my product have a UI shell that doesn’t change very frequently?
  • Am I prepared to invest the time and effort to create a framework – tools, processes, etc. – to enable crowdsourcing the localization of my product?

Finally, at least for this blog post, let’s consider one more key point – quality of translation. Between traditional and crowdsourced translation, which produces the better translated product? It is very tempting to leap to the conclusion that the traditional model would produce the better results. After all, you are hiring trained, qualified translators to do the work. Paying for something should provide a better quality of service than something that’s free, right? (Tell that one to the Open Source movement!) But we need to step back and consider how things work in the real world.

Your trained, educated translators definitely know how to translate. That is their job, their bread and butter. There is no real dispute that, apples to apples, these professionals should produce higher-quality translation than others. But let’s consider how these people work: very often, they don’t know anything about the product they are translating. After all, these people are translating all kinds of products week after week. Chances are, they have never used or heard of your product before.

The opposite is, of course, true for the crowd. They have the benefit of knowing how your product works and which translations best describe its features, but they are not trained translators. You also have the added burden of noise. What if two well-meaning people want to translate your product but don’t agree on the best translation? Without planning for this eventuality, you may find your site continually being updated by warring translators, each overwriting the other – somewhat akin to certain articles on Wikipedia. An extreme example, perhaps, but one which your translation framework needs to plan for nonetheless.
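
To make that last point a little more concrete, here is a minimal sketch in Python of one way a framework might avoid edit wars: instead of letting each translator overwrite the last, competing translations are stored side by side, users vote, and a translation only goes live once it has enough votes and a clear lead. Everything here is hypothetical and for illustration only. The `TranslationBoard` class, its method names, and the `min_votes` threshold are my own assumptions, not a description of any particular company's system.

```python
class TranslationBoard:
    """Hypothetical store of competing translations, resolved by user votes."""

    def __init__(self, min_votes=3):
        # min_votes is an arbitrary illustrative threshold, not a recommendation.
        self.min_votes = min_votes
        # source string -> {candidate translation -> set of user ids voting for it}
        self._votes = {}

    def suggest(self, source, translation, user):
        """Register a candidate translation; the suggester counts as a vote for it."""
        candidates = self._votes.setdefault(source, {})
        candidates.setdefault(translation, set())
        self.vote(source, translation, user)

    def vote(self, source, translation, user):
        """Give the user's single vote for this source string to one candidate."""
        candidates = self._votes.get(source, {})
        if translation not in candidates:
            raise KeyError(f"unknown candidate for {source!r}: {translation!r}")
        # One vote per user per source string: withdraw any earlier vote first.
        for voters in candidates.values():
            voters.discard(user)
        candidates[translation].add(user)

    def published(self, source):
        """Return the winning translation, or None if there is no clear winner yet."""
        candidates = self._votes.get(source, {})
        ranked = sorted(candidates.items(), key=lambda item: len(item[1]), reverse=True)
        if not ranked:
            return None
        best, best_voters = ranked[0]
        runner_up = len(ranked[1][1]) if len(ranked) > 1 else 0
        # Publish only with enough votes AND a strict lead over the runner-up.
        if len(best_voters) >= self.min_votes and len(best_voters) > runner_up:
            return best
        return None


board = TranslationBoard(min_votes=2)
board.suggest("Log in", "Anmelden", user="anna")
board.suggest("Log in", "Einloggen", user="ben")
board.vote("Log in", "Anmelden", user="clara")
print(board.published("Log in"))  # prints "Anmelden"
```

The design choice worth noting is that nobody overwrites anybody: a disagreement between two translators becomes two candidates rather than a tug-of-war, and a tie simply means nothing is published until the community breaks it. A real framework would probably also want moderation, reviewer roles, and some defense against vote stuffing, none of which this sketch attempts.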