L10n and I18n in the Real World Dan Moore Moore Consulting June 9, 2005


L10n and I18n in the Real World

Dan Moore, Moore Consulting

June 9, 2005

My experience

● Contractor who worked on a 3-year-old ecommerce-ish site.
  – Added to an existing working application
● Java on Tomcat with Oracle backend
● 36 countries, 15 languages, two frameworks.
● Vast majority of the work outlined here was done by Zia Consulting
  – They get the blame!
● Reflects state of system as of February

Your experience

● How many folks have worked on i18n applications?
  – J2SE
  – J2EE
  – J2ME
● What locales (any programming language)?
  – European
  – Asian
  – Other

Outline

● General process
● Definitions
● Display
● Locale identification
● Data flow
● Issues
● Resources
● Ask questions anytime

General process

[Diagram: the framework takes a key (“HELLO_KEY”) and a locale (“en_US”), consults the datastore, and produces the output (“Hello there.”).]
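The flow on this slide (a key plus a locale goes in, localized text comes out) can be sketched in a few lines of Java. This is only an illustration: the class `MessageLookup` and its in-memory map are hypothetical stand-ins for the framework and the datastore, not the code used on the project.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the "framework" box: resolves (locale, key) to text.
class MessageLookup {
    // Hypothetical in-memory "datastore": locale -> (key -> text).
    private final Map<String, Map<String, String>> store = new HashMap<>();

    void put(String locale, String key, String text) {
        store.computeIfAbsent(locale, l -> new HashMap<>()).put(key, text);
    }

    // Returns the text for the key in the given locale, or null if nothing is stored.
    String lookup(String locale, String key) {
        Map<String, String> messages = store.get(locale);
        return messages == null ? null : messages.get(key);
    }

    public static void main(String[] args) {
        MessageLookup framework = new MessageLookup();
        framework.put("en_US", "HELLO_KEY", "Hello there.");
        System.out.println(framework.lookup("en_US", "HELLO_KEY"));
    }
}
```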

Definitions

● Locale
  – ISO standard: language_country_variant
  – fr_CA, en_US_jive
● Internationalization
● Localization
● Character set
  – UTF-8 or ASCII
● Bundle (resource)

Bundle Examples

● Greeting.properties
  – HELLO_KEY=Hello there.
  – GOODBYE_KEY=Bye
● Greeting_fr.properties
  – HELLO_KEY=Bonjour.
  – GOODBYE_KEY=Au revoir
● Greeting_ko.properties (ko is the ISO language code for Korean; kr is the country code)
  – HELLO_KEY=\uc0ac\uc774\ud2b8\ub9f5
  – GOODBYE_KEY=\uc0bc\ucd74\ud2b2\ub345
● One bundle, multiple files
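To see how such files are read, here is a small sketch using the standard `java.util.PropertyResourceBundle`. The file contents are passed in as strings so the example is self-contained; the class `BundleDemo` and its `parse` helper are made up for illustration. In a deployed application the files would normally sit on the classpath and be selected with `ResourceBundle.getBundle("Greeting", locale)`.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical demo class: loads .properties text the way a bundle file would be loaded.
class BundleDemo {
    // Parses .properties text into a bundle; unicode escapes in values are decoded on load.
    static ResourceBundle parse(String propertiesText) {
        try {
            return new PropertyResourceBundle(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ResourceBundle base = parse("HELLO_KEY=Hello there.\nGOODBYE_KEY=Bye\n");
        ResourceBundle french = parse("HELLO_KEY=Bonjour.\nGOODBYE_KEY=Au revoir\n");
        System.out.println(base.getString("HELLO_KEY"));
        System.out.println(french.getString("GOODBYE_KEY"));
    }
}
```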

Display

● Web
● PDF/Email
● Localized features
● Other localization possibilities

Web

● Some class matches keys and locales to generate text
● Jetspeed (Velocity)
  – $l10n.HELLO_KEY
  – Path-like configuration
● Expresso (taglib)
  – <bean:message key="HELLO_KEY" schema="com.zia..." />
  – Schema pointed to one bundle.
  – Also provided a Java method to do so.
● Struts is similar to Expresso

PDF/Email

● iText
  – Specify encoding, possibly specify a TrueType font
  – com.lowagie.text.pdf.BaseFont
  – createFont() method
● Character set in Content-Type header
  – text/plain; charset=utf-8
  – text/html; charset=utf-8
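Why the charset in the Content-Type header matters can be shown with the standard library alone. The sketch below is an illustration, not the project's mail code: `CharsetHeaderDemo`, `contentType` and `roundTrip` are hypothetical names. It builds a header value like the ones above and simulates a receiver that decodes the bytes with the wrong charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical demo class: shows what happens when sender and receiver disagree on charset.
class CharsetHeaderDemo {
    // Builds a Content-Type value such as "text/html; charset=utf-8".
    static String contentType(String subtype, Charset charset) {
        return "text/" + subtype + "; charset=" + charset.name().toLowerCase(Locale.ROOT);
    }

    // Simulates a receiver decoding the body with the charset it believes was used.
    static String roundTrip(String body, Charset sentAs, Charset readAs) {
        return new String(body.getBytes(sentAs), readAs);
    }

    public static void main(String[] args) {
        String body = "Fran\u00e7ais";
        System.out.println(contentType("html", StandardCharsets.UTF_8));
        // Same charset on both sides: the text survives.
        System.out.println(body.equals(roundTrip(body, StandardCharsets.UTF_8, StandardCharsets.UTF_8)));
        // Receiver assumes ISO-8859-1: the non-ASCII character is garbled.
        System.out.println(body.equals(roundTrip(body, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1)));
    }
}
```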

Localized features

● Different sections of the site were localized in different languages
  – Product A supported for en and de, but product B only for en
  – In db, tie features (Product A) to locales
  – Always have a fallback locale of en
● Allow users to change locale easily
● Locale-specific fields on forms
  – In db, tied fields (last name) on forms to locales.
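The feature-to-locale mapping with an en fallback could look roughly like this. The real system kept the mapping in database tables; here an in-memory map plays that role, and the class `FeatureLocales` with its method names is invented for the example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: which languages each feature is localized for, with "en" as fallback.
class FeatureLocales {
    static final String FALLBACK = "en";

    // Stand-in for the database table tying features to locales.
    private final Map<String, Set<String>> supported = new HashMap<>();

    void support(String feature, String... languages) {
        supported.computeIfAbsent(feature, f -> new HashSet<>()).addAll(Arrays.asList(languages));
    }

    // Returns the requested language if the feature supports it, otherwise the fallback.
    String resolve(String feature, String requested) {
        Set<String> languages = supported.get(feature);
        return languages != null && languages.contains(requested) ? requested : FALLBACK;
    }

    public static void main(String[] args) {
        FeatureLocales features = new FeatureLocales();
        features.support("Product A", "en", "de");
        features.support("Product B", "en");
        System.out.println(features.resolve("Product A", "de"));
        System.out.println(features.resolve("Product B", "de"));
    }
}
```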

Other localization possibilities

● Currency
  – DecimalFormat class
● Dates
  – Not localized: one common format
● Name of company
● Sorting
  – Not localized that I saw
  – Could have been done on client side
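As a small illustration of locale-aware number output with `DecimalFormat`, the sketch below formats the same amount with the separators of two locales. The fixed pattern `#,##0.00` and the class name `AmountFormatDemo` are assumptions made for the example; the slide does not say which patterns or currency symbols the site used.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical demo class: same pattern, locale-specific grouping and decimal separators.
class AmountFormatDemo {
    static String format(double amount, Locale locale) {
        DecimalFormat formatter =
                new DecimalFormat("#,##0.00", DecimalFormatSymbols.getInstance(locale));
        return formatter.format(amount);
    }

    public static void main(String[] args) {
        System.out.println(format(1234.5, Locale.US));      // 1,234.50
        System.out.println(format(1234.5, Locale.GERMANY)); // 1.234,50
    }
}
```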

Locale identification

● How do you handle statelessness?
● Custom solution vs. headers
  – Headers: browser
  – Cookies, URL rewriting or hidden form fields
  – Localization feature required folks to switch
  – Other business reasons (pass locale via params)
  – Consider headers
● Look at user set
  – Everyone knows some English; technical crowd.
● Locale choice page
  – Image: “Please choose your country...”
  – Drop-down box (in English)
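For the "consider headers" option, the JDK (Java 8 and later) can match a browser's Accept-Language value against the locales an application supports. The project described here used a custom solution, so the following is only a sketch of the header-based alternative; the supported list, the fallback, and the class name `HeaderLocaleDemo` are assumptions.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical sketch: choose a locale from an Accept-Language header value.
class HeaderLocaleDemo {
    // Assumed set of supported locales, for illustration only.
    static final List<Locale> SUPPORTED =
            Arrays.asList(Locale.ENGLISH, Locale.FRENCH, Locale.GERMAN);
    static final Locale FALLBACK = Locale.ENGLISH;

    static Locale pick(String acceptLanguage) {
        if (acceptLanguage == null || acceptLanguage.trim().isEmpty()) {
            return FALLBACK;
        }
        try {
            // Parses entries such as "fr-CA,fr;q=0.8" and orders them by weight.
            List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(acceptLanguage);
            Locale match = Locale.lookup(ranges, SUPPORTED);
            return match != null ? match : FALLBACK;
        } catch (IllegalArgumentException e) {
            // Malformed header: behave as if none was sent.
            return FALLBACK;
        }
    }

    public static void main(String[] args) {
        System.out.println(pick("fr-CA,fr;q=0.8,en;q=0.5"));
        System.out.println(pick("ja"));
    }
}
```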

Data Flow

● Message bundles
  – Infrequently changing
● Database loads
  – Frequently changing
● Why not all in database?

Message bundles process

[Diagram: message bundle process, spanning Client and Us. Elements shown: keys defined (HELLO_KEY=Hi), Excel, Access, export, native .txt files, native2ascii, .properties files (HELLO_KEY=\uc0ac\uc774\ud2b8\ub9f5), manual, deployment, framework.]

Message bundles continued

● Dynamic generation of strings
  – Sample value: Email Dan now!
  – EMAIL_KEY1=Email and EMAIL_KEY2=now!
  – EMAIL_KEY=Email {0} now!
  – Struts allows this in its taglib; Jetspeed and Expresso don't
● Access/Excel
  – Character limits (1024)
● Images and common properties
  – Separate property file
● Native2ascii
  – Ant task
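The `{0}` placeholder style shown for EMAIL_KEY is the one understood by the JDK's `java.text.MessageFormat`. A minimal sketch follows; the class name `EmailMessageDemo` is made up, and the second pattern is a hypothetical value included only to show that a translator can move the placeholder, which the two-key split cannot express.

```java
import java.text.MessageFormat;

// Hypothetical demo class: fills a placeholder in a bundle value.
class EmailMessageDemo {
    static String render(String pattern, Object... arguments) {
        return MessageFormat.format(pattern, arguments);
    }

    public static void main(String[] args) {
        // Pattern as it would appear as the value of EMAIL_KEY.
        System.out.println(render("Email {0} now!", "Dan"));
        // Hypothetical pattern with the placeholder in a different position.
        System.out.println(render("{0}: email now!", "Dan"));
    }
}
```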

Native2ascii example

<native2ascii encoding="UTF-8" src="indir" dest="outdir" ext="properties">
  <include name="*.txt"/>
  <exclude name="readme.txt"/>
  <exclude name="CVS/**"/>
</native2ascii>
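For readers who have not used native2ascii, the conversion it performs on each file is roughly the following: characters outside 7-bit ASCII are replaced by \uXXXX escapes so the result is a valid .properties file. This is a simplified sketch written for illustration (the class `AsciiEscaper` is hypothetical), not the tool's actual implementation.

```java
// Hypothetical sketch of the escaping native2ascii applies to native-encoded text.
class AsciiEscaper {
    // Replaces every char above 7-bit ASCII with a backslash-u escape of its UTF-16 code unit.
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (char c : text.toCharArray()) {
            if (c > 127) {
                out.append(String.format("\\u%04x", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The Korean value from the bundle example, written with Java escapes.
        System.out.println(escape("HELLO_KEY=\uc0ac\uc774\ud2b8\ub9f5"));
    }
}
```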

Data process

[Diagram: data load process, spanning Client and Us. Elements shown: legacy systems, feeds, SQL Server, export, Access, export to UTF-8 text, text files, sqlldr, ODBC, test staging tables, PL/SQL, test db, prod staging, prod.]

Data process continued

● Oracle
  – NLS_LANG=american_america.AL32UTF8
● Sqlldr
  – CHARACTERSET UTF-8
  – Case study on sqlldr i18n on OTN
● PL/SQL
● Set up your database browser
● Tried ODBC
  – 3 rows/sec

Issues

● Translation time
● QA of output
  – External testing resources
● Scheduling restart times
  – “It's 5 o'clock somewhere.”
● Locale fk everywhere that data is displayed
● Custom locale code (not en_US, rather eng_US)
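One way to bridge a custom code such as eng_US to a standard `java.util.Locale` is to match the three-letter part against the ISO 639 three-letter codes the JDK already knows. The slides do not say how the project handled this, so the following is purely an assumption-laden sketch, and `CustomLocaleCodes.fromIso3` is a hypothetical helper.

```java
import java.util.Locale;
import java.util.MissingResourceException;

// Hypothetical sketch: map codes like "eng_US" to standard locales like en_US.
class CustomLocaleCodes {
    // Returns the matching Locale, or null if the three-letter language is unknown.
    static Locale fromIso3(String code) {
        String[] parts = code.split("_", 2);
        for (String iso2 : Locale.getISOLanguages()) {
            Locale candidate = Locale.forLanguageTag(iso2);
            try {
                if (candidate.getISO3Language().equals(parts[0])) {
                    Locale.Builder builder =
                            new Locale.Builder().setLanguage(candidate.getLanguage());
                    if (parts.length == 2) {
                        builder.setRegion(parts[1]);
                    }
                    return builder.build();
                }
            } catch (MissingResourceException e) {
                // No three-letter code available for this language; try the next one.
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(fromIso3("eng_US"));
        System.out.println(fromIso3("fra_CA"));
    }
}
```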

Resources

● ziaconsulting.com
● blogs.msdn.com/michkap/default.aspx
● java.sun.com/docs/books/tutorial/i18n/
● mooreds.com/weblog/archives/000199.html
● joelonsoftware.com/Unicode.html
● ppewww.ph.gla.ac.uk/~flavell/www/lang-neg.html
● mooreds.com/i18n/
● databasejournal.com/features/oracle/article.php/3493691
● ant.apache.org/manual/OptionalTasks/native2ascii.html

Thanks

● Reviewers
  – Ben Galde
  – Susan Mowery
  – Corey Snipes
  – Karen Josey
● Zia Consulting
  – Mike Mahon
● Y'all