Thursday, July 30, 2009

Launchpad is now an automatic, magical translation factory!

I've been using Launchpad to host my personal finance application wxBanker for a few years now. The thing I was hearing most often was that it wasn't localized; people wanted currencies to look the way they should in their country, and the application to be in their language. Let me explain how Launchpad helped me provide translations for my application, and how much of an utterly slothful breeze it has recently become.

Image courtesy of shirt.woot.com
Normally, to handle translations, an application has to wrap strings with gettext, create a template, find translators and give the template to them, collect translation files back, and integrate them into the project. This is painful, which is why many applications aren't localized and shut out most of the world as a result. One of the amazing features of Launchpad, however, is Rosetta, which brings translators TO your translations via a simple web interface, and then makes those translations available to the developer. With Rosetta, translators don't need to understand gettext or translation file formats; they just need to know two languages!



So that's what a translator sees. Notice how Launchpad even shows how other applications translated the same string. Once you generate a template and upload it, you can assign a translation group such as "Ubuntu Translators" to your project so that your strings will be translated by volunteers on the Ubuntu project; if your project isn't relevant to the Ubuntu project, you can use the more generic Launchpad Translators group. Now all you have to do is wait for some translations, then download them and check them in to your code. Not too bad, right?

It isn't, but Launchpad has recently made it so much better. They started by adding an option to automatically import translation templates from your project. This means as you are developing, all you have to do is regenerate the template and commit, and new strings will show up for translators in Rosetta and be translated automatically (from the developer's perspective). Then today, they announced the other side of this, which is automatically committing those translations back into your code on a daily basis. This means that all I have to do is commit templates as I change strings, and Launchpad handles everything else. This is a profound tool for developers.
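To make "regenerate the template" a little more concrete, here is a minimal sketch of what a template generator does: find every string wrapped in `_()` and write it out as an empty `msgid`/`msgstr` pair. Real projects would use a tool like xgettext or pygettext for this; the function names and the sample source below are made up for illustration, and the header is trimmed to a single field.

```python
import ast


def extract_messages(source):
    """Return a sorted list of the unique string literals passed to _()."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Name)
                and node.func.id == "_"
                and node.args
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)):
            found.add(node.args[0].value)
    return sorted(found)


def escape(text):
    """Escape a string for use inside a double-quoted PO entry."""
    return text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_pot(messages):
    """Render messages as a very small translation template (.pot)."""
    # Simplified header entry; real templates carry more metadata.
    entries = ['msgid ""\nmsgstr ""\n"Content-Type: text/plain; charset=UTF-8\\n"\n']
    for message in messages:
        entries.append('msgid "%s"\nmsgstr ""\n' % escape(message))
    return "\n".join(entries)


# Hypothetical application code: two wrapped strings, one unwrapped.
SAMPLE_SOURCE = '''
print(_("Add account"))
print(_("Remove account"))
print("not translated")
'''

if __name__ == "__main__":
    print(to_pot(extract_messages(SAMPLE_SOURCE)))
```

Only the two wrapped strings end up in the template; the unwrapped one is ignored, which is exactly why forgetting to wrap a string means translators never see it.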

What's the next step? Well, from a developer's perspective the translation template is a tool to give to the translators or in this case Launchpad. In the future Launchpad could eliminate this by generating the template itself from the code (this is what developers are doing themselves, after all), so that truly all you have to do after you set up the initial i18n/l10n framework is commit code as normal, and Launchpad magically commits back translations.

All this work Launchpad is doing gives developers more time to develop while still having localized applications at a very minimal cost. This is continuous translation integration, and boy is it cool!

10 comments:

Tom said...

Couldn't it automatically translate without users? Just take the most used strings from other apps?

And how about projects that are not hosted on LP?

simpsus_science said...

Hi Michael,
nice to see you give wxBanker some publicity. Is there any news on when the transaction tagging and advanced categories will be merged? When is the 0.6 release planned?

Cheers

Simpsus

Michael said...

Tom, that could be cool and I wouldn't mind an "Automatically use suggestions for otherwise untranslated strings" checkbox, but there are going to be a bunch of previously untranslated strings that will still need translation by a person. And for short phrases there can be multiple translations. For example, the word "read" in English can be both a verb and an adjective, and "import" can be a verb and noun. In another language there might be two different words for each use though, and only a person with context (provided by translation comments) can know which one to use. Certainly though, it seems like it could help you grab some low-hanging translation fruit that's better than nothing.
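To illustrate the context problem, here is a rough sketch using gettext message contexts, which current Python (3.8 and later) exposes as `pgettext`. The in-memory catalog class is a toy stand-in for a real compiled catalog, and the context names and German strings are only illustrative.

```python
import gettext


class DictTranslations(gettext.NullTranslations):
    """Toy in-memory catalog keyed by (context, message); a stand-in for a .mo file."""

    def __init__(self, entries):
        super().__init__()
        self._entries = entries

    def pgettext(self, context, message):
        # Fall back to the untranslated message when the pair is unknown.
        return self._entries.get((context, message), message)


german = DictTranslations({
    ("verb", "Read"): "Lesen",         # as in a button that reads something
    ("adjective", "Read"): "Gelesen",  # as in a message already read
})

if __name__ == "__main__":
    print(german.pgettext("verb", "Read"))
    print(german.pgettext("adjective", "Read"))
```

The English string is identical in both calls; only the context tells the catalog, and the translator, which meaning is intended. Blindly reusing a suggestion from another project has no way of knowing that.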

As far as your second question, I'm not sure what you mean by "hosted". If the project isn't registered on Launchpad at all, it can't use Launchpad features obviously. But there is nothing stopping a project hosted anywhere from creating a Launchpad account and using it for Rosetta. If the code is hosted elsewhere (and Launchpad allows you to specify external SCMs and bug trackers, as well as import from them) you could still have a branch in Launchpad solely for Launchpad to import and export from, and handle merges back and forth on your end yourself.

Michael said...

Thanks simpsus :) The tagging branch is waiting on some work from the author, as well as some conceptual work I need to do (how can tagging be hidden/out of the way for users who don't need that, but easy to discover and turn on?) I'd like to get 0.6 by the end of August with recurring transactions and perhaps reporting (https://launchpad.net/wxbanker/+milestone/0.6), and target tagging/categories for 0.7 not much later.

Tom said...

I love the "Automatically use suggestions for otherwise untranslated strings" option; will it come? Launchpad should get a big language model like Google Wave's; that would certainly help. Maybe Mark should hire some cheap Google engineers ;)

It would be cool if Rosetta could work directly with upstream as http://transifex.org/ can. That also would be a cool addition for people who prefer git etc.

Michael said...

Tom, I'm not a Launchpad developer, but I think I have heard talk of it before.

I checked out transifex and it seems pretty interesting. I know LP has done work in the past to make Rosetta easier to use for upstreams and sync the two, and I am sure it will only get easier and better.

Jeroen Vermeulen said...

Hi, I'm one of the Launchpad Translations developers. Glad you like the new stuff we built!

A very minor point: if you want to invite translations for your project, we now have a Launchpad translation group as well as the Ubuntu translation group. The Ubuntu group is to focus more on Ubuntu (surprise), whereas the Launchpad group can serve as the "generical" translation group for other projects.

And a comment on the "automatically use translations from elsewhere" idea: it's very tempting, but there are a few obstacles.

First, there may be a copyright risk from automatically copying other projects' translations. And once we automatically start avoiding some translations, there's a bigger risk of choosing bad translations.

Second, not every translation you find will be correct, let alone consistent. Sometimes people import entire translations to the wrong language!

So ideally there should still be a human review step for those suggestions before they definitely go into a project. And that's exactly how it works now. For some programs, the "global suggestions" from other projects let you do a large chunk of the work very quickly, even in languages you only know just well enough to see which translations make sense.

Michael said...

Thanks for the comment Jeroen! I mentioned Ubuntu Translators because my package is in Ubuntu and it seemed most relevant to readers of my blog on the Ubuntu planet, but mentioning LP Translators for generic use is good to know as well.

The problems with automatic translation are definitely true, and it will be interesting to see if any of them can be mitigated. Perhaps allowing someone to import all translations with known good copyrights for untranslated strings, marking them as "Needs Review".

Sneca said...

Hi Michael.

Rosetta has come a long way, and it is a much appreciated tool. However, as a professional translator who sometimes collaborates on free software translations, I must say it is *not yet* a perfect tool. It lacks better concordance support with easier searches, for instance.
I would like to point out that for translating, one needs to know a bit more than just two languages. It is necessary to have a whole lotta things in mind, and use your brainz in a very complex way :) While I do not want to discourage any contributors, l10n is serious business; terminology consistency, some linguistic common sense (and even style) should be stressed. In the business world, bad translations leave bad impressions on your clients. In GNU/Linux realmz, where there are quite a few different desktop managers, so many programs and sometimes different names for similar things, bad translations may end up confusing people (esp. newcomers) and eventually making them, at the very least, nervous. English largely prevails in the GNU/Linux world, and the more languages AND the better the quality of translations, the closer we will get to squashing bug nº1.

We will keep on working on it. Cheers!

Sonia Krugers said...

Michael, if you want to discover newer software localization tools, have a look at the online l10n platform https://poeditor.com/

It has an intuitive and collaborative translation interface, where you can bring your own localization team and translate strings together.

You can also use it to crowdsource translations, by making the project public.
Other pluses are the GitHub and Bitbucket integrations, the Translation Memory feature and the API, which can help automate the localization workflow a lot.

Hope you like it!