Living in a multilingual world: internationalization for Web 2.0 applications
DESCRIPTION
Lars Trieloff's presentation at Web 2.0 Expo Berlin covers the why and how-to of internationalization for Web 2.0, consolidating i18n technology and enabling user-contributed translations.
TRANSCRIPT
i18n for Web 2.0
Why and how to internationalize your Web 2.0 application
Lars Trieloff, Day Software
Why internationalize?
International audiences want localized user interfaces.
Lars Trieloff
• Product Manager, Founder, Blogger, Open Source Coder
• Languages I understand:
• German, English
• Languages I do not understand:
• Mandarin, Hindi, Spanish, Arabic, Russian, Portuguese, Bengali, Malay, French, Japanese, Farsi, Urdu, Punjabi, Vietnamese, Tamil, Wu, Javanese, Turkish, Telugu, Korean, Marathi, Italian, Thai, Cantonese, Gujarati, Polish, Kannada, Burmese (and all others)
Do it yourself, or someone else will do it
What is different in Web 2.0 internationalization?
Web 2.0 internationalization
• Web sites become Web applications
• The Web as a platform
• This means:
• Internationalize your plain old Web site
• Internationalize your rich internet applications
• JavaScript, Flash, Silverlight, and more to come
• Internationalize your desktop applications
Challenge
The internationalization problem is multiplied by the use of different technologies in Web applications, rich internet applications, and desktop applications.
Solution
Consolidation of internationalization technology: each technology has its own internationalization framework, so we need a common framework for all of them.
What to do
• Keep all internationalization data in one place
• Extract internationalization strings from application parts
• repeatedly
• automatically
• Let the applications pull the i18n strings
What to do
[Diagram: Web application, RIA, and desktop application source code -> String Extractor -> Localization Database (edited by translators) -> Intermediate Format -> Intermediate Converter -> Web application, RIA, desktop application]
Example
How we did it in Mindquarry
Our technology
Our problem
• Web application framework: Apache Cocoon, with Cocoon i18n Transformer
• Rich internet application framework: Dojo Toolkit, with dojo.i18n.*
• Desktop client: Java and SWT, with Java Message Bundles
Steps to consolidated i18n
1. Find a common i18n database format
2. Extract internationalizable content automatically
3. Attach applications to i18n database
1. i18n database format
• QT Linguist .ts files
• XML files, easy to process
• QT Linguist is a good, easy-to-use and free translation editor
• Can be used by non-programmers
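To illustrate why an XML-based .ts file is easy to process, here is a minimal Python sketch that reads the finished translations from a .ts document using only the standard library. The sample content, the context name, and the function name `load_ts` are assumptions made for illustration; they are not taken from Mindquarry.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in the .ts layout: contexts that contain messages.
SAMPLE_TS = """<TS version="2.0" language="de">
  <context>
    <name>teamspace</name>
    <message>
      <source>Save</source>
      <translation>Speichern</translation>
    </message>
    <message>
      <source>Cancel</source>
      <translation type="unfinished"></translation>
    </message>
  </context>
</TS>
"""

def load_ts(ts_text):
    """Return {source string: translation} for all finished translations."""
    root = ET.fromstring(ts_text)
    catalog = {}
    for message in root.iter("message"):
        source = message.findtext("source")
        translation = message.find("translation")
        if source is None or translation is None:
            continue
        # Entries without a finished translation are skipped.
        if translation.get("type") == "unfinished" or not translation.text:
            continue
        catalog[source] = translation.text
    return catalog

print(load_ts(SAMPLE_TS))
```

Untranslated entries are left out on purpose, so a consumer can decide how to handle missing strings.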
2. Automatic string extraction
• We have three types of source code: XML, Java and JavaScript
• XML
• Ruby script parses all XML source code, finds internationalizable strings not yet in the database and adds them
• Java and JavaScript: similar, with a more complex parser
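The XML extraction step can be sketched as follows. The original used a Ruby script; this Python version is a stand-in, not the Mindquarry code. It assumes that internationalizable strings are marked with the Cocoon i18n transformer's `i18n:text` element; the sample page and the function name `find_new_strings` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the Cocoon i18n transformer (assumed to mark translatable text).
I18N_NS = "http://apache.org/cocoon/i18n/2.1"

SAMPLE_PAGE = """<page xmlns:i18n="http://apache.org/cocoon/i18n/2.1">
  <title><i18n:text>Welcome</i18n:text></title>
  <button><i18n:text>Save</i18n:text></button>
  <button><i18n:text>Cancel</i18n:text></button>
</page>
"""

def find_new_strings(xml_text, known):
    """Return marked strings that are not yet in the database, in document order."""
    root = ET.fromstring(xml_text)
    new_strings = []
    for element in root.iter("{%s}text" % I18N_NS):
        text = (element.text or "").strip()
        if text and text not in known and text not in new_strings:
            new_strings.append(text)
    return new_strings

known = {"Save"}  # strings already present in the database
print(find_new_strings(SAMPLE_PAGE, known))
```

Because only missing strings are reported, the script can be run repeatedly and automatically without touching existing translations.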
3.1. Attach Cocoon
• Apache Cocoon's internationalization databases are XML files
• Transformation via XSLT
• Multiple output files, one for each language
[Diagram: messages.ts (QT Linguist) -> XSLT -> messages_de.xml (Cocoon i18n) -> Apache Cocoon]
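The slides use XSLT for the conversion to Cocoon's format. As a stand-in in Python, the following sketch shows the shape of that transformation: a translation mapping for one language is written out as a message catalogue. The `catalogue`/`message` layout is my understanding of Cocoon's XML catalogue format and should be checked against the Cocoon documentation; the function name and sample data are hypothetical.

```python
import xml.etree.ElementTree as ET

XML_NS = "http://www.w3.org/XML/1998/namespace"

def to_cocoon_catalogue(translations, language):
    """Build a catalogue document from {key: translation} for one language."""
    catalogue = ET.Element("catalogue")
    # xml:lang lives in the predefined XML namespace.
    catalogue.set("{%s}lang" % XML_NS, language)
    for key in sorted(translations):
        message = ET.SubElement(catalogue, "message", {"key": key})
        message.text = translations[key]
    return ET.tostring(catalogue, encoding="unicode")

# One output per language, for example the content of messages_de.xml.
german = {"Save": "Speichern", "Cancel": "Abbrechen"}
print(to_cocoon_catalogue(german, "de"))
```

Calling the function once per language yields the multiple output files mentioned in the slide.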
3.2. Attach Dojo
• Dojo uses JSON as internationalization format
• Transformation via XSLT
• Handled dynamically via Cocoon
[Diagram: messages.ts (QT Linguist) -> XSLT -> messages_de.xml (Cocoon i18n) for Apache Cocoon, and messages_de.js (Dojo i18n) for the Dojo Widget]
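For the Dojo side, the target is a JSON object of key/translation pairs. The slides generate it with XSLT inside Cocoon; the Python sketch below only illustrates the output shape. The function name and the sample keys are hypothetical, and how the result is wrapped and loaded by dojo.i18n is not shown here.

```python
import json

def to_dojo_bundle(translations):
    """Serialize {key: translation} as a JSON object."""
    # ensure_ascii=False keeps umlauts and other non-ASCII text readable.
    return json.dumps(translations, ensure_ascii=False, sort_keys=True, indent=2)

german = {"save": "Speichern", "cancel": "Abbrechen", "close": "Schließen"}
print(to_dojo_bundle(german))
```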
3.3. Attach Java
• Message Bundle Reader is overridden
• Uses the internationalization database directory
• Internationalization database is distributed with the desktop client
[Diagram: messages.ts (QT Linguist) -> i18n Adapter -> Desktop Client]
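The desktop client is written in Java; to stay consistent with the other sketches, the adapter idea is shown here in Python. It illustrates the lookup behavior only. The class name, the method name, and the fallback to the untranslated source string are assumptions, not details from the slides.

```python
class I18nAdapter:
    """Serves translations from the shared database for one language."""

    def __init__(self, database, language):
        # database: {language: {source string: translation}}
        self._catalog = database.get(language, {})

    def get_string(self, source):
        # Assumed fallback: return the source string when no translation exists.
        return self._catalog.get(source, source)

database = {"de": {"Save": "Speichern"}}
adapter = I18nAdapter(database, "de")
print(adapter.get_string("Save"))    # translated entry
print(adapter.get_string("Cancel"))  # no translation yet, source text is returned
```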
How to get translations
• do it yourself
• pay someone (¥ € $)
• ask your users
User-contributed internationalization
• The holy grail
• Build a community and website at the same time
• But hard to achieve
• Wikipedia
• Open Source projects
Build your own translation website
Allows users to sign up and contribute localization strings. Costly, but allows for automatic post-processing, validation and quality control.
Ad-hoc translations: use a wiki
Allows users to contribute localization strings without sign-up. Easy to deploy, but requires manual post-processing, validation and quality control.
Pootle: OSS for web-based translations
GPL software, based on Python; works with .po or XLIFF; integration with version control; basic project management; used by 20+ open source projects
http://pootle.wordforge.org
More challenges in Web 2.0 internationalization
• User-generated content
• Rich Web design
User-generated content
• User-generated content is great
• But hard to translate
• But translating it increases network effects
• English-speaking users benefit from content generated by German-speaking users
• Is there a (partial) solution?
Solution
• Structured Content
• Sometimes easier to translate
• ratings
• locations
• time & date
• Sometimes it is still hard
• tags
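A small sketch of why structured content is easier to translate: a date or a rating is stored as data and only rendered into text per language. The pattern tables, function names, and sample values below are hypothetical; a real application would more likely rely on a locale library than on hand-written patterns.

```python
from datetime import date

# Hypothetical per-language patterns for two kinds of structured content.
DATE_PATTERNS = {"en": "{month}/{day}/{year}", "de": "{day}.{month}.{year}"}
RATING_PATTERNS = {"en": "{stars} out of {max} stars", "de": "{stars} von {max} Sternen"}

def format_date(value, language):
    """Render a stored date in the convention of the given language."""
    pattern = DATE_PATTERNS[language]
    return pattern.format(day=value.day, month=value.month, year=value.year)

def format_rating(stars, maximum, language):
    """Render a stored rating as text in the given language."""
    return RATING_PATTERNS[language].format(stars=stars, max=maximum)

review_date = date(2007, 11, 6)  # arbitrary sample value
for language in ("en", "de"):
    print(format_date(review_date, language), "|", format_rating(4, 5, language))
```

Free-form tags have no such structure, which is why they remain hard.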
Graphical text
• Looks great
• But hard to internationalize
• can break calculated box sizes
• re-creation necessary
• Do not do it
• unless you can do it right
• create dynamically on server
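One way to read "create dynamically on server" is to generate the graphic from the translated string at request time, so the box is sized from the actual text instead of being fixed in advance. The sketch below does this with SVG, which is my substitution for illustration; the slides do not name an image format. The function name and the width estimate of 0.6 em per character are assumptions.

```python
from xml.sax.saxutils import escape

def render_label_svg(text, font_size=24, padding=8):
    """Return an SVG image for a label, with the box sized from the text."""
    # Rough width estimate (assumption): 0.6 em per character.
    width = int(len(text) * font_size * 0.6) + 2 * padding
    height = font_size + 2 * padding
    return (
        '<svg xmlns="http://www.w3.org/2000/svg" width="%d" height="%d">'
        '<text x="%d" y="%d" font-size="%d">%s</text></svg>'
        % (width, height, padding, padding + font_size, font_size, escape(text))
    )

# The German label is longer, so its generated image is wider.
print(render_label_svg("Save"))
print(render_label_svg("Speichern"))
```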
Wrap-Up
• Web 2.0 needs internationalization
• Consolidate i18n over apps and platforms
• Allow for user-contributed translations
• Make it automated, repeatable and cheap
Thank you very much
[email protected]
For more information, see my weblog at http://weblogs.goshaky.com/weblogs/lars