SIGCSE 2008
It Sounded Like a Good Idea at the Time…
Manipulated by Strings
Margaret Menzin
Simmons College
A Data Structures Course
The assignment:
Read a file of names like President George Washington
Identify the titles from a list and strip them
Isolate the last name and invert it to Washington, George
Alphabetize the list
Known issues for students to handle:
Equivalence of upper and lower cases for purposes of alphabetization
Generating a list of titles
Matching from the list
Isolating the last name by looking backwards from the end of the name for the last blank
Usual file handling
Use of a simple sort
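The naive first attempt described above can be sketched as follows. This is only an illustration of the plan as the students would have started it, not the course's reference solution; the `TITLES` set and the function names are hypothetical, and file handling is left out (the input is assumed to be a list of lines already read from the file).

```python
# Hypothetical, abbreviated title list; students were expected to build their own.
TITLES = {"president", "senator", "dr.", "mr.", "mrs."}

def strip_leading_title(name):
    """Drop the first word if it matches a known title (case-insensitive)."""
    first, _, rest = name.partition(" ")
    return rest if rest and first.lower() in TITLES else name

def invert(name):
    """Look backwards for the last blank and return 'Last, First'."""
    cut = name.rfind(" ")
    if cut == -1:
        return name  # single-word name: nothing to invert
    return f"{name[cut + 1:]}, {name[:cut]}"

def alphabetize(names):
    """Strip one leading title, invert, and sort without regard to case."""
    inverted = [invert(strip_leading_title(n.strip())) for n in names]
    return sorted(inverted, key=str.casefold)
```

For example, `alphabetize(["President George Washington", "john adams"])` puts `adams, john` ahead of `Washington, George`, because `str.casefold` treats upper and lower case as equivalent for sorting.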
Some surprises:
Some titles are at the beginning, but also some are at the end
Titles must be stripped recursively:
Hon. Father Robert F. Drinan, S.J., L.L.D.
Rev. Dr. Martin Luther King, Jr.
Augusta Ada Byron King, Lady Lovelace
Major General Stanley
Some titles occur in the middle: Bernard Cardinal Law
Some of these titles can also be first, middle and last names – a problem which is exacerbated when we add other languages
Jr, II, etc. must be handled
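A minimal sketch of stripping titles repeatedly from both ends of a name follows; a loop stands in for the recursion. The `PREFIXES` and `SUFFIXES` sets are hypothetical and deliberately tiny, and dropping "Jr." and "II" outright is an assumption made for brevity, since the slides only say they must be handled.

```python
# Hypothetical, abbreviated lists chosen to cover the examples on this slide.
PREFIXES = {"hon.", "father", "rev.", "dr.", "major", "general"}
SUFFIXES = {"s.j.", "l.l.d.", "jr.", "jr", "ii", "iii"}

def strip_titles(name):
    """Remove known titles from the front and back until none remain."""
    # Commas only separate trailing titles here, so treat them as blanks.
    words = name.replace(",", " ").split()
    # Always keep at least one word so the name never becomes empty.
    while len(words) > 1 and words[0].lower() in PREFIXES:
        words.pop(0)
    while len(words) > 1 and words[-1].lower() in SUFFIXES:
        words.pop()
    return " ".join(words)
```

With these lists, "Hon. Father Robert F. Drinan, S.J., L.L.D." reduces to "Robert F. Drinan". The sketch does not cope with multi-word trailing titles such as "Lady Lovelace", with titles in the middle, or with a title word that is really part of the name.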
More surprises:
In alphabetizing, apostrophes and hyphens are ignored (O’Reilly and OReilly are equivalent)
We need to worry about alphabetical order using other alphabets
Alphabetize using first the Latin alphabet and then other alphabets in the order of their names in English (Cyrillic before Greek)
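These two rules could be combined into one sort key, sketched below. Reading the script from the first word of a character's Unicode name (for example "CYRILLIC CAPITAL LETTER I") is a shortcut assumed here for illustration, not a method taken from the slides, and `script_of` and `sort_key` are hypothetical names.

```python
import unicodedata

def script_of(text):
    """Guess the script from the Unicode name of the first letter."""
    for ch in text:
        if ch.isalpha():
            return unicodedata.name(ch, "UNKNOWN").split()[0]
    return "UNKNOWN"

def sort_key(name):
    """Latin first, then other scripts by English name; punctuation ignored."""
    script = script_of(name)
    rank = (0, "") if script == "LATIN" else (1, script)
    # Drop straight and curly apostrophes and hyphens before comparing.
    letters = "".join(ch for ch in name if ch not in "'\u2019-")
    return (rank, letters.casefold())
```

Under this key O’Reilly and OReilly compare equal, and a Cyrillic name sorts before a Greek one because "CYRILLIC" precedes "GREEK" alphabetically. Ordering within a non-Latin script simply follows code point order here, which is not a proper collation for that language.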
Simplification:
Ignore titles in the middle
Use an abbreviated list of titles
Ignore other alphabets
Still more surprises – where does the last name of these people begin:
Leonardo da Vinci
Catherine de Medici
Ponce de Leon
Vasco da Gama
Jean de la Fontaine
Gabriel Garcia Marquez
Vicente Fox Quesada
Wernher von Braun
Elizabeth Alexandra May Windsor
Thomas a Beckett
Mao Tse-tung (Mao Zedong)
The answers
Leonardo da Vinci
Catherine de Medici
Juan Ponce de Leon
Vasco da Gama
Jean de La Fontaine
Gabriel Garcia Marquez
Vicente Fox Quesada
Wernher von Braun
Elizabeth (Alexandra May Windsor) II
Thomas (a) Beckett
Mao Tse-tung
The solution
Use the alphabetization standards of the American Library Association
According to the A.L.A. you alphabetize using the rules of the language the person wrote/spoke in
There are special rules for monarchs and saints – they are alphabetized by first name
Note: The A.L.A. keeps the name as first_name last_name and has another field to specify the character where the last name begins!
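The idea in the note, storing the name in natural order plus the position where the last name begins, might look like the sketch below. The `NameRecord` class and its `surname_start` field are hypothetical illustrations of that idea, not the A.L.A.'s actual record format, and the offsets in the sample data were chosen by hand.

```python
from dataclasses import dataclass

@dataclass
class NameRecord:
    full_name: str      # natural order, e.g. "Vicente Fox Quesada"
    surname_start: int  # hypothetical field: index where the last name begins

    def filing_form(self):
        """Return 'Last, First', or the name unchanged if it files as written."""
        head = self.full_name[:self.surname_start].strip()
        tail = self.full_name[self.surname_start:]
        return f"{tail}, {head}" if head else tail

records = [
    NameRecord("Gabriel Garcia Marquez", 8),  # files under "Garcia Marquez"
    NameRecord("Vicente Fox Quesada", 8),     # files under "Fox Quesada"
    NameRecord("Mao Tse-tung", 0),            # family name already comes first
]
ordered = sorted(records, key=lambda r: r.filing_form().casefold())
```

The design point is that no parsing rule has to guess where the last name starts: whoever enters the record supplies the offset, so the sort only needs to read it.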
Conclusion
Internationalization is much harder than it looks!
P.S. The British use different rules for alphabetization than the U.S. does; surely other countries use other rules.