l10n and i18n in the real world
Dan Moore, Moore Consulting
June 9, 2005
My experience
● Contractor who worked on a 3-year-old ecommerce-ish site.
  – Added to an existing working application
● Java on Tomcat with Oracle backend
● 36 countries, 15 languages, two frameworks.
● Vast majority of work outlined here was done by Zia Consulting
  – They get the blame!
● Reflects state of system as of February
Your experience
● How many folks have worked on i18n applications?
  – J2SE
  – J2EE
  – J2ME
● What locales (any programming language)?
  – European
  – Asian
  – Other
Outline
● General process
● Definitions
● Display
● Locale identification
● Data flow
● Issues
● Resources
● Ask questions anytime
Definitions
● Locale
  – Built from ISO codes (ISO 639 language, ISO 3166 country): language_country_variant
  – Examples: fr_CA, en_US_jive
● Internationalization
● Localization
● Character set
  – UTF-8 or ASCII
● Bundle (resource)
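As a rough illustration of the locale format on this slide, the sketch below splits a language_country_variant code into a java.util.Locale. The class name LocaleCodes and the parse helper are hypothetical names for this example, not part of the system described in the talk.

```java
import java.util.Locale;

// Hypothetical helper for illustration only.
final class LocaleCodes {
    /** Splits "language_country_variant" into a Locale; country and variant are optional. */
    static Locale parse(String code) {
        String[] parts = code.split("_", 3);
        String language = parts[0];
        String country = parts.length > 1 ? parts[1] : "";
        String variant = parts.length > 2 ? parts[2] : "";
        return new Locale(language, country, variant);
    }

    public static void main(String[] args) {
        Locale frCa = parse("fr_CA");
        System.out.println(frCa.getLanguage() + " / " + frCa.getCountry());
        Locale jive = parse("en_US_jive");
        System.out.println(jive.getVariant());
    }
}
```

The three-argument constructor is used because a free-form variant such as jive would be rejected by the stricter Locale.Builder.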
Bundle Examples
● Greeting.properties
  – HELLO_KEY=Hello there.
  – GOODBYE_KEY=Bye
● Greeting_fr.properties
  – HELLO_KEY=Bonjour.
  – GOODBYE_KEY=Au revoir
● Greeting_ko.properties
  – HELLO_KEY=\uc0ac\uc774\ud2b8\ub9f5
  – GOODBYE_KEY=\uc0bc\ucd74\ud2b2\ub345
● One bundle, multiple files
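A minimal sketch of how "one bundle, multiple files" resolves a key, assuming the standard java.util.ResourceBundle parent chain. To stay self-contained it loads the properties from inline strings instead of files, and the French text here deliberately omits GOODBYE_KEY (unlike the slide) so the fall-through to the base file is visible. BundleChain and load are hypothetical names.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical demo class; inline strings stand in for the .properties files.
final class BundleChain {
    static final String BASE = "HELLO_KEY=Hello there.\nGOODBYE_KEY=Bye\n";
    // Assumption for the demo: GOODBYE_KEY is left out of the French file.
    static final String FRENCH = "HELLO_KEY=Bonjour.\n";

    /** Loads properties text as a bundle whose missing keys fall through to parent. */
    static ResourceBundle load(String properties, final ResourceBundle parent) {
        try {
            return new PropertyResourceBundle(new StringReader(properties)) {
                { setParent(parent); }
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ResourceBundle base = load(BASE, null);
        ResourceBundle french = load(FRENCH, base);
        System.out.println(french.getString("HELLO_KEY"));   // found in the French text
        System.out.println(french.getString("GOODBYE_KEY")); // falls back to the base text
    }
}
```

With real files on the classpath, ResourceBundle.getBundle("Greeting", locale) builds the same kind of chain from the file names.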
Web
● Some class matches keys and locales to generate text
● Jetspeed (Velocity)
  – $l10n.HELLO_KEY
  – Path-like configuration
● Expresso (taglib)
  – <bean:message key="HELLO_KEY" schema="com.zia..." />
  – Schema pointed to one bundle.
  – Also provided a Java method to do so.
● Struts is similar to Expresso
PDF/Email
● iText
  – Specify encoding, possibly specify a TrueType (TT) font
  – com.lowagie.text.pdf.BaseFont
  – createFont() method
● Character set in Content-Type header
  – text/plain; charset=utf-8
  – text/html; charset=utf-8
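The charset in the Content-Type header only helps if the body bytes are really encoded that way. The sketch below, using only the JDK, pairs a header value with UTF-8 encoded bytes and shows what a reader sees when it decodes with the wrong charset. CharsetDemo and its methods are hypothetical names, and a real application would set the header through its servlet or mail API.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not tied to any servlet or mail library.
final class CharsetDemo {
    /** Builds a header value such as "text/html; charset=utf-8". */
    static String contentType(String mimeType) {
        return mimeType + "; charset=utf-8";
    }

    /** Encodes the body with the same charset the header announces. */
    static byte[] encodeBody(String body) {
        return body.getBytes(StandardCharsets.UTF_8);
    }

    /** Decodes bytes the way a client would if it assumed the given charset. */
    static String decodeAs(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String text = "\uc0ac\uc774\ud2b8\ub9f5"; // the Korean sample value from the bundle slide
        byte[] bytes = encodeBody(text);
        System.out.println(contentType("text/plain") + ", " + bytes.length + " bytes");
        System.out.println(decodeAs(bytes, StandardCharsets.UTF_8).equals(text));      // true
        System.out.println(decodeAs(bytes, StandardCharsets.ISO_8859_1).equals(text)); // false
    }
}
```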
Localized features
● Different sections of the site were localized in different languages
  – Product A supported for en and de, but product B only for en
  – In db, tie features (Product A) to locales
  – Always have a fallback locale of en
● Allow users to change locale easily
● Locale-specific fields on forms
  – In db, tied fields (last name) on forms to locales.
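The feature-to-locale tie with an en fallback can be sketched as follows. An in-memory map stands in for the database table, and the class, field, and method names are hypothetical; only the Product A / Product B example comes from the slide.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; the map stands in for the feature/locale table in the database.
final class FeatureLocales {
    static final Locale FALLBACK = new Locale("en");
    static final Map<String, Set<String>> SUPPORTED = new HashMap<>();
    static {
        SUPPORTED.put("Product A", new HashSet<>(Arrays.asList("en", "de")));
        SUPPORTED.put("Product B", Collections.singleton("en"));
    }

    /** Returns the requested locale if the feature is localized for it, else the en fallback. */
    static Locale resolve(String feature, Locale requested) {
        Set<String> languages = SUPPORTED.getOrDefault(feature, Collections.<String>emptySet());
        return languages.contains(requested.getLanguage()) ? requested : FALLBACK;
    }

    public static void main(String[] args) {
        System.out.println(resolve("Product A", Locale.GERMANY)); // de_DE
        System.out.println(resolve("Product B", Locale.GERMANY)); // en
    }
}
```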
Other localization possibilities
● Currency
  – DecimalFormat class
● Dates
  – Not localized: one common format
● Name of company
● Sorting
  – Not localized that I saw
  – Could have been done on client side
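For the currency bullet, a small sketch of DecimalFormat with locale-specific separators. The pattern and the AmountFormat class are assumptions for illustration; the talk does not say which pattern the site used.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical helper; the "#,##0.00" pattern is an assumption for this example.
final class AmountFormat {
    /** Formats an amount with the grouping and decimal separators of the given locale. */
    static String format(double amount, Locale locale) {
        DecimalFormat formatter =
                new DecimalFormat("#,##0.00", DecimalFormatSymbols.getInstance(locale));
        return formatter.format(amount);
    }

    public static void main(String[] args) {
        System.out.println(format(1234.5, Locale.US));      // 1,234.50
        System.out.println(format(1234.5, Locale.GERMANY)); // 1.234,50
    }
}
```

NumberFormat.getCurrencyInstance(locale) goes one step further and adds the currency symbol in the locale's usual position.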
Locale identification
● How do you handle statelessness?
● Custom solution vs headers
  – Headers: browser
  – Cookies, URL rewriting or hidden form fields
  – Localization feature required folks to switch
  – Other business reasons (pass locale via params)
  – Consider headers
● Look at user set
  – Everyone knows some English, technical crowd.
● Locale choice page
  – Image “Please choose your country...”
  – Drop-down box (in English)
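One way to combine the two approaches on this slide is to prefer an explicit locale parameter, then the browser's Accept-Language header, then a fallback. The sketch below assumes Java 8 or later for Locale.LanguageRange and Locale.lookup; the supported list, the parameter format, and the LocaleResolver name are hypothetical and this is not the custom solution the talk describes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical resolver: explicit parameter first, then Accept-Language, then en.
final class LocaleResolver {
    static final List<Locale> SUPPORTED =
            Arrays.asList(Locale.ENGLISH, Locale.GERMAN, Locale.FRENCH);
    static final Locale FALLBACK = Locale.ENGLISH;

    static Locale resolve(String localeParam, String acceptLanguage) {
        if (localeParam != null && !localeParam.isEmpty()) {
            // Accept both en_US and en-US style parameters.
            Locale explicit = Locale.forLanguageTag(localeParam.replace('_', '-'));
            if (SUPPORTED.contains(new Locale(explicit.getLanguage()))) {
                return explicit;
            }
        }
        if (acceptLanguage != null && !acceptLanguage.isEmpty()) {
            try {
                List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguage);
                Locale match = Locale.lookup(ranges, SUPPORTED);
                if (match != null) {
                    return match;
                }
            } catch (IllegalArgumentException e) {
                // Malformed header: ignore it and use the fallback.
            }
        }
        return FALLBACK;
    }

    public static void main(String[] args) {
        System.out.println(resolve(null, "fr-CA,fr;q=0.8,en;q=0.5"));
        System.out.println(resolve("de_DE", "fr"));
        System.out.println(resolve(null, "ja"));
    }
}
```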
Data Flow
● Message bundles
  – Infrequently changing
● Database loads
  – Frequently changing
● Why not all in database?
Message bundles process
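[Flow diagram; arrows lost in extraction]
● Lanes: Client, Us
● Stages: Excel/Access, native .txt files, .properties files, Framework
● Step labels: Keys defined, Export, native2ascii, deployment, manual
● Example values: HELLO_KEY=Hi, HELLO_KEY=\uc0ac\uc774\ud2b8\ub9f5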
Message bundles continued
● Dynamic generation of strings
  – Sample value: Email Dan now!
  – EMAIL_KEY1=Email and EMAIL_KEY2=now!
  – EMAIL_KEY=Email {0} now!
  – Struts allows this in its taglib; Jetspeed and Expresso don't
● Access/Excel
  – Character limits (1024)
● Images and common properties
  – Separate property file
● native2ascii
  – Ant task
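The EMAIL_KEY=Email {0} now! form on this slide maps directly onto java.text.MessageFormat. Unlike the two-key split, a single pattern lets a translator move the placeholder wherever the target language needs it. The sketch below is an illustration; the EmailLabel class and the second pattern are hypothetical.

```java
import java.text.MessageFormat;

// Hypothetical helper showing placeholder substitution with MessageFormat.
final class EmailLabel {
    /** Fills the {0} placeholder of a bundle value with the given name. */
    static String label(String pattern, String name) {
        return MessageFormat.format(pattern, name);
    }

    public static void main(String[] args) {
        System.out.println(label("Email {0} now!", "Dan"));  // Email Dan now!
        // Hypothetical pattern with the name first, as another language might need.
        System.out.println(label("{0}: email now!", "Dan")); // Dan: email now!
    }
}
```

MessageFormat treats a single quote as an escape character, so bundle values containing apostrophes need them doubled.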
Native2ascii example
<native2ascii encoding="UTF-8" src="indir" dest="outdir" ext="properties">
  <include name="*.txt"/>
  <exclude name="readme.txt"/>
  <exclude name="CVS/**"/>
</native2ascii>
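To show what the native2ascii step does to a line of a native .txt file, here is a rough Java equivalent of the conversion: every character outside 7-bit ASCII becomes a backslash-u escape with four hex digits. AsciiEscaper is a hypothetical name, and this is a sketch of the transformation, not the tool's actual implementation.

```java
// Hypothetical stand-in for the character conversion native2ascii performs.
final class AsciiEscaper {
    /** Replaces each non-ASCII UTF-16 unit with a backslash-u escape of four hex digits. */
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c > 0x7f) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The string literal below holds the four Korean characters from the bundle slide.
        System.out.println(escape("HELLO_KEY=\uc0ac\uc774\ud2b8\ub9f5"));
    }
}
```

Run on the Korean sample value, this prints the same escaped line that appears in the bundle examples.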
Data process
[Flow diagram; arrows lost in extraction]
● Lanes: Client, Us
● Stages: Legacy systems, SQL Server, Access, Text files, Test staging tables, test db, prod staging, prod
● Step labels: feeds export, Export to UTF-8 text, ODBC, sqlldr, PL/SQL
Data process continued
● Oracle
  – NLS_LANG=american_america.AL32UTF8
● Sqlldr
  – CHARACTERSET clause in the control file (Oracle character set name, e.g. AL32UTF8)
  – Case study on sqlldr i18n on OTN
● PL/SQL
● Set up your database browser
● Tried ODBC
  – 3 rows/sec
Issues
● Translation time
● QA of output
  – External testing resources
● Scheduling restart times
  – “It's 5 o'clock somewhere.”
● Locale FK everywhere that data is displayed
● Custom locale code (not en_US, rather eng_US)
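If the custom eng_US style codes use the ISO 639-2 three-letter language codes that Java reports through Locale.getISO3Language() (an assumption; the talk does not spell out the scheme), converting between the custom form and a standard Locale could look like the sketch below. CustomLocaleCodes and its methods are hypothetical names.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.MissingResourceException;

// Hypothetical converter between eng_US style codes and java.util.Locale.
final class CustomLocaleCodes {
    private static final Map<String, String> ISO3_TO_ISO2 = new HashMap<>();
    static {
        for (String iso2 : Locale.getISOLanguages()) {
            try {
                ISO3_TO_ISO2.put(new Locale(iso2).getISO3Language(), iso2);
            } catch (MissingResourceException e) {
                // No three-letter code known for this language; skip it.
            }
        }
    }

    /** Converts a custom code such as eng_US into the Locale en_US. */
    static Locale fromCustom(String code) {
        String[] parts = code.split("_", 2);
        String iso2 = ISO3_TO_ISO2.get(parts[0]);
        if (iso2 == null) {
            throw new IllegalArgumentException("Unknown language code: " + parts[0]);
        }
        return parts.length > 1 ? new Locale(iso2, parts[1]) : new Locale(iso2);
    }

    /** Converts a Locale such as en_US into the custom form eng_US. */
    static String toCustom(Locale locale) {
        String language = locale.getISO3Language();
        return locale.getCountry().isEmpty() ? language : language + "_" + locale.getCountry();
    }

    public static void main(String[] args) {
        System.out.println(fromCustom("eng_US"));
        System.out.println(toCustom(Locale.US));
    }
}
```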
Resources
● ziaconsulting.com
● blogs.msdn.com/michkap/default.aspx
● java.sun.com/docs/books/tutorial/i18n/
● mooreds.com/weblog/archives/000199.html
● joelonsoftware.com/Unicode.html
● ppewww.ph.gla.ac.uk/~flavell/www/lang-neg.html
● mooreds.com/i18n/
● databasejournal.com/features/oracle/article.php/3493691
● ant.apache.org/manual/OptionalTasks/native2ascii.html