

Lexical Databases for Carrier

William J. Poser
University of Pennsylvania

[This is the revised and expanded text of the paper I had prepared, but due to illness was unable to give, for the workshop “New Methods for Creating, Exploring and Disseminating Linguistic Field Data”, organized by Steven Bird under the auspices of the Talkbank Project at the University of Pennsylvania, on 6 January 2000 in Chicago, Illinois.]

1. Introduction

Research on Carrier, a dialectally diverse Athabaskan language of the central interior of British Columbia, has resulted in a fairly large amount of information stored on-line. The material I have includes lexical databases for seven dialects containing over 36,000 Carrier entries. The dialects differ not only in details of pronunciation, e.g. Stuart Lake teł “basket” versus Lheidli T’enneh tił, and in lexical items, e.g. Stuart Lake łəztih “knife” versus Lheidli T’enneh tes, but also in morphophonemics, morphology, and syntax. The dialect differences are, moreover, considered important by the communities. Although the dialects are mutually comprehensible, people consider it important to record and pass on their own dialect. There are therefore actually seven separate Carrier lexical databases, for different dialects, varying considerably in size.

The databases in which this material is kept are used both for research and to generate printed dictionaries. The database system is home-brew, developed incrementally over a decade. A similar system, though different in detail, underlies the Montana Salish (Flathead) dictionary (Thomason 1994)¹ and the dictionaries of Oapan and Ameyaltepec Nahuatl (Amith 2002ab).² The purpose of this paper is to describe this system and explore its virtues and vices.

¹ The database files are in a similar format, and similar software is used to generate TeX files from the database. These TeX files are then further processed using TeX macros and format files written by Richmond Thomason. The database itself is maintained by the linguists, Sarah Thomason and Lucy Thomason, using Epsilon, a programmable text editor similar to Emacs that runs under DOS. (Epsilon is a product of Lugaru Software. http://www.lugaru.com.) When a printed dictionary is to be generated, the database files are transferred to a UNIX system and the software run there.

2. The Structure and Content of the Database

The database files are all pure ASCII text files. Records are delimited by extra newlines. Records are divided into fields by the field delimiter “%”. Each field consists of a tag, which identifies the field, and the contents of the field, separated from the tag by a colon.³

A typical record is displayed in (1):

(1) A Typical Record

%P:tsa
%G:beaver
%IH:beaver
%C:N
%SN:Castor canadensis
%SF:animals-water
%S:JOPA/BRBI/STJA/EDFR/VESE/JEKO/MAGO
%UID:000044
%MD:2001/04/18

The first field, with the “P” tag for “pronunciation”, is the Carrier form. The “G” field is the gloss and the “IH” field, in this case identical, is the “inverse header”, used as the headword when going from English to Carrier. The “C” field contains the syntactic category, in this case “noun”. The “SN” field contains the scientific name. The “SF” field contains a specification of the semantic field. This is used in generating topical indices; it is also useful when extracting words of particular types, such as kinship terms or biological terms. The “S” field identifies the sources of the data; entries are separated by slashes. In this case, the sources are all individual speakers, identified by abbreviations, but written sources may also be referred to. The last two fields are used for bookkeeping. The “UID” field contains an integer that uniquely identifies the record. This is used for integrity checks and for cross-referencing. Finally, the “MD” field contains the date on which the record was last modified.

² These dictionaries are generated from a single database, in SIL Shoebox format, edited using Microsoft Word under Windows. The database is copied to a GNU/Linux system for processing into a printed dictionary.

³ This format is isomorphic to the widely used Summer Institute of Linguistics Shoebox format, in which the field delimiter is the backslash (\) and tags are separated from field contents by spaces. The reason for preferring the percent sign to backslash is that backslash has special meaning to numerous UNIX programs and requires quoting and other special treatment. The percent sign is much less trouble to deal with. I prefer the colon to space for two reasons. First, when fields are parsed during processing, since many fields will contain spaces, extra work will be done breaking the field into unnecessary pieces. More important than this rather minor efficiency consideration is the fact that what appears visually to be a space may in fact be a space, a tab, or another invisible character, while a colon is unambiguous.
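The record format described in this section is simple enough to read with a few lines of code. The following Python sketch is only an illustration and is not the software actually used with the Carrier databases. It assumes that a blank line separates records, that each field occupies a single line beginning with “%”, and that a tag occurs at most once per record; the function name parse_records is hypothetical.

```python
def parse_records(text):
    """Split database text into records; each record is a dict of tag -> contents.

    Assumptions (not stated in the paper): one field per line, records
    separated by one or more blank lines, at most one field per tag.
    """
    records = []
    current = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            # A blank line ends the current record, if there is one.
            if current:
                records.append(current)
                current = {}
            continue
        if not line.startswith("%"):
            raise ValueError(f"field does not start with '%': {line!r}")
        # Split on the first colon only: field contents may contain colons.
        tag, sep, contents = line[1:].partition(":")
        if not sep:
            raise ValueError(f"field has no colon: {line!r}")
        current[tag] = contents
    if current:
        records.append(current)
    return records


sample = """\
%P:tsa
%G:beaver
%C:N
%S:JOPA/BRBI/STJA
%UID:000044
%MD:2001/04/18
"""

record = parse_records(sample)[0]
print(record["P"], record["G"])   # Carrier form and gloss
print(record["S"].split("/"))     # sources are separated by slashes
```

Splitting on the first colon only reflects the point made in the footnote on delimiters: the tag ends at an unambiguous character, and whatever follows, spaces included, is the field contents.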

Records for verbs tend to be more complex. In (2) we see most of the same fields as in (1), along with a number of new fields. The “VF” field indicates that this form is Imperfective Affirmative. The “ASP” field indicates that this is a customary aspect form. The “X” field gives the verb stem as gok. The “VA” field gives the valence prefix as l. The “STEMLISTED” field is used for bookkeeping; it indicates that the stem of this form has been identified and that this form is listed in the entry for the stem as one of the forms containing it.

(2) A Typical Verb Record

%P:tanalgok
%G:he walks on all limbs back into the water
%IH:walks on all limbs back into the water, he
%VF:IA
%ASP:customary
%C:V
%X:gok
%VA:l
%S:STJA
%STEMLISTED:Y
%ES:000148
%UID:001377
%MD:1998/12/09

The “ES” field contains a pointer to an example sentence, which is stored in another file. Here is the entry for that example sentence:

(3) A Typical Example Sentence Record

%E:Dzen totsuk duni tanalgok.
%T:Every day the moose goes into the water.
%S:MAGO/STJA
%SID:000148
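The pointer from a form to its example sentence can be followed with a lookup keyed on the sentence ID, and the same lookup supports a check for pointers that lead nowhere. The Python sketch below is a hedged illustration, not the system's own code. Records are represented as plain dictionaries from tag to contents, and the helper names index_by and dangling_pointers are hypothetical.

```python
def index_by(records, tag):
    """Map the contents of `tag` to the record that carries it."""
    return {rec[tag]: rec for rec in records if tag in rec}


def dangling_pointers(forms, sentences):
    """Return the UIDs of forms whose ES pointer matches no sentence SID."""
    known = index_by(sentences, "SID")
    return [f["UID"] for f in forms if "ES" in f and f["ES"] not in known]


forms = [
    {"P": "tanalgok", "C": "V", "ES": "000148", "UID": "001377"},
    {"P": "tsa", "C": "N", "UID": "000044"},
]
sentences = [
    {"E": "Dzen totsuk duni tanalgok.",
     "T": "Every day the moose goes into the water.",
     "SID": "000148"},
]

by_sid = index_by(sentences, "SID")
verb = forms[0]
print(verb["P"], "->", by_sid[verb["ES"]]["T"])
print(dangling_pointers(forms, sentences))   # empty: every pointer resolves
```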

Numerous other fields are available. These include: grammatical information about verb forms, including tense/mode/negation, aspect, noun classification, stem, and valence prefix; latitude, longitude, and map sheet (for places); pointers to images along with captions and picture credits; queries of several different types; remarks; indications of stylistic level; indications that the record contains information that is private or questionable and should therefore not be printed; grammatical notes; cultural notes; extended discussions of meanings; possessed forms of nouns; etymologies; and notes on phonetic detail. All in all, more than sixty distinct fields are in use.

So far we have only seen a need for two components to the database: the file containing the example sentences and their translations, and the main file containing the records for individual forms like those shown above. There are two additional components to each database, a stem list and a root list, whose presence is motivated by the nature of the Carrier verb.

Carrier verbs consist of a stem, approximately the last syllable, and a set of prefixes. In general, the stem carries the main meaning of the verb. In (4) we have forms representing the tense/mode/negation paradigm of the verb “to go around in a boat”. For each form, the last syllable, which is the stem, is shown separated from the remainder, which consists of prefixes. In this case the prefixes convey information about the subject, tense/mode, and negation, except for the initial /n/, which denotes motion in a loop. We see that the stem varies considerably in form with tense, mode, and negation.

(4) Stems of “I go around in a boat” (Stuart Lake dialect)

TMA/Neg                     Prefixes    Stem
Imperfective Affirmative    nəs         ke
Imperfective Negative       nəłəzəs     koh
Perfective Affirmative      nəsəs       ki
Perfective Negative         nəłəs       kel
Future Affirmative          nətis       keł
Future Negative             nəłtəzis    kel
Optative Affirmative        nos         ke
Optative Negative           nəłəzus     ke�

These various stems are all associated with an abstract root, in this case ke, meaning “go by boat”, from which they are considered to be derived. Although there is a pattern to the changes in stems, it is complex if not irregular, and so someone learning the language must to a considerable extent simply memorize the stem set for each verb.

The database therefore contains, in addition to the individual verb forms, a table of verb stems and a table of verb roots. Each verb stem record contains pointers to all of the actual verb forms in which it has been found as well as to the root with which it is associated. At the level of individual forms, we may represent this relationship as in (5). For example, the root ke2 “go by boat” has an Imperfective Affirmative stem /ke/ which is attested in three verb forms.


(5) Relationships of Verbs in a Carrier Dictionary

Root List
    .....
    ke2 (boat)
    ket1 (buy)
    ket2 (slip)
    .....

Stem List
    .....
    ke (IA)
    keł (FA)
    kel (FN)
    kel (PN)
    k�ih (Cust)
    ket (PA)
    ket (OA)
    .....

Verbs
    .....
    nəske
    nəts’əke
    ninke
    nətiskeł
    nəłtəziskel
    nəhikel
    nəts’əkih
    osket
    tolket
    .....

The same relationship is illustrated in (6), now at the level of the components of the database. This diagram shows that the collection of forms contains pointers, in the form of S(entenc)e IDs, to example sentences. The stem list contains pointers, in the form of U(nique) IDs, to individual verb forms, and it also contains pointers, in the form of R(oot) IDs, to verb roots.

(6) The Relational Structure of a Carrier Database

Roots <--RID-- Stems --UID--> Forms --SID--> Sentences
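The pointer structure described above can be sketched as a small data model in which each stem record holds a root ID and a list of form UIDs. The Python below is an illustration under stated assumptions, not a description of the actual files: the class and attribute names are hypothetical, the IDs are placeholders invented for the example, and the pairing of the forms ninke and osket with particular stems is assumed for the sake of illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Root:
    rid: str
    gloss: str


@dataclass
class Stem:
    shape: str
    category: str          # e.g. "IA" for Imperfective Affirmative
    rid: str               # pointer to the root
    uids: list = field(default_factory=list)   # pointers to verb forms


def forms_for_root(rid, stems, forms_by_uid):
    """Follow RID and UID pointers: root -> stems -> attested verb forms."""
    found = []
    for stem in stems:
        if stem.rid == rid:
            for uid in stem.uids:
                found.append((stem.category, forms_by_uid[uid]))
    return found


# Placeholder IDs, invented for this example.
roots = {"R1": Root("R1", "go by boat")}
stems = [Stem("ke", "IA", "R1", ["U1"]), Stem("ket", "PA", "R2", ["U2"])]
forms_by_uid = {"U1": "ninke", "U2": "osket"}

print(roots["R1"].gloss, forms_for_root("R1", stems, forms_by_uid))
```

Keeping the pointers on the stem record, as the text describes, means that a form needs no knowledge of its root; the root is reached only by way of the stem.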

3. Modifying the Database

Since the database consists of ordinary text files, it can be modified by any text editor. In practice, this is almost always Emacs (Stallman 1984), an interpreter for a dialect of Lisp with special primitives for text editing. Since Emacs really is a programming language interpreter, it is fully programmable. It is also possible to bind any function to a combination of keystrokes. Functions written in Emacs Lisp facilitate working with the database by providing such capabilities as insertion of templates for new records of particular types with certain information already filled in, a variety of specialized searches, and shortcuts for modification of selected properties of the current record. For example, typing “Control-x t” updates the time stamp in the modification time field of the current record.
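The effect of the time stamp shortcut can be illustrated outside Emacs. The Python sketch below is not the Emacs Lisp function itself, whose source is not given here; it only shows the transformation applied to a record, assuming the YYYY/MM/DD date format seen in the “MD” fields of (1) and (2). The function name touch_record is hypothetical, as is the choice to append an “MD” field when none is present.

```python
import datetime
import re


def touch_record(record_text, today=None):
    """Return the record with its MD field set to the given date (YYYY/MM/DD).

    If the record has no MD field, one is appended as the last field
    (an assumption made for this sketch).
    """
    today = today or datetime.date.today()
    stamp = "%MD:" + today.strftime("%Y/%m/%d")
    # Replace the whole MD line; a function is used as the replacement so
    # that no character in the stamp is treated as a regex escape.
    updated, count = re.subn(r"(?m)^%MD:.*$", lambda m: stamp, record_text)
    if count == 0:
        updated = record_text.rstrip("\n") + "\n" + stamp + "\n"
    return updated


record = "%P:tsa\n%G:beaver\n%UID:000044\n%MD:2001/04/18\n"
print(touch_record(record, datetime.date(2002, 1, 15)))
```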


Another useful facility of Emacs is its ability to split the screen recursively, both vertically and horizontally, allowing for the simultaneous display of multiple windows. In (7) we have a screen shot of Emacs in use editing a database. There are three Emacs-internal windows open. The lower left window displays a portion of the verb database.⁴ The lower right window displays the portion of the stem database containing the entry for the stem of the verb in the lower left window. The top window is open to the sentence database and displays an example sentence associated with the verb in the lower left window. This configuration would be useful in adding verbs to the database. If the new verb form comes with an example sentence, the example sentence would simultaneously be added to the sentence database and the sentence ID entered in the verb form’s record. If the stem of the verb form can be identified, the verb form’s ID can be added to the record for the stem, if it is already in the database. If not, the new stem can be added.

(7) EMACS with Three Windows Open

⁴ In some cases it has proved to be convenient to keep verbs and non-verbs in separate files. For simplicity’s sake, this is not reflected in our discussion of the organization of the database.


Emacs is normally not invoked directly. Instead, a database is edited by executing a shell script, which in turn invokes Emacs. The shell script backs up the database before invoking Emacs and computes certain information used by Emacs. At the end of a session, when Emacs returns, the shell script performs some integrity checks, makes a record of the differences between the previous version of the database and the current one, and performs some cleanup.
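The sequence of steps in such a wrapper (back up, edit, check, record differences) might be sketched as follows. This is Python rather than the shell script actually used, and several details are assumptions: the only integrity check shown is a test for duplicate UIDs, which is a guess at one plausible check, the backup is a simple copy with a “.bak” suffix, and the names duplicate_uids, run_emacs, and edit_session are hypothetical.

```python
import difflib
import shutil
import subprocess
from pathlib import Path


def duplicate_uids(text):
    """Integrity check (assumed): report UIDs that occur more than once."""
    seen, dupes = set(), []
    for line in text.splitlines():
        if line.startswith("%UID:"):
            uid = line[len("%UID:"):]
            if uid in seen:
                dupes.append(uid)
            seen.add(uid)
    return dupes


def run_emacs(path):
    """Invoke Emacs on the database file and wait for it to exit."""
    subprocess.run(["emacs", str(path)], check=True)


def edit_session(path, editor=run_emacs):
    """Back up, edit, check, and return (diff_lines, duplicate_uids)."""
    path = Path(path)
    backup = path.with_name(path.name + ".bak")
    shutil.copy2(path, backup)      # back up before editing
    editor(path)                    # hand control to the editor
    old = backup.read_text().splitlines()
    new = path.read_text().splitlines()
    diff = list(difflib.unified_diff(old, new, "previous", "current",
                                     lineterm=""))
    return diff, duplicate_uids(path.read_text())


# Typical use, with a hypothetical file name:
#     diff, dupes = edit_session("carrier.db")
```

Passing the editor as a parameter keeps the bookkeeping steps separate from the editing step, so the checks can be exercised without starting Emacs.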

4. Printed Dictionaries

A variety of printed dictionaries are generated from the databases. From time to time specialized dictionaries, such as lists of biological terms, have been printed, but most of the time what is desired is a comprehensive dictionary. A few sections are fixed, that is, consist of information not generated automatically from the database:

(8) Fixed Sections of Printed Dictionaries

• Explanation of the Writing System
• Grammatical Information
• Alphabetical Order
• Abbreviations for Grammatical Categories
• Appendices (Numbers, Months, Days)
• References

The appendices on numbers, months, and days are provided because language teachers and parents of small children like them.⁵ The inclusion of the appendix on numbers also provides the opportunity to explain the rules for generation of numbers from their basic components as well as the classificatory system, according to which numbers and some quantifiers take different forms depending on which of five categories the thing counted belongs to.

The bulk of the material, however, is derived from the database.

⁵ A common belief of both parents and language teachers is that the language material appropriate for young children is the same as what they concentrate on in English in preschool and kindergarten, namely numbers, dates, and colors.


(9) Sections of Printed Dictionaries Generated from Database

• Carrier to English
• English to Carrier
• Topical Index
• Root List (By English Gloss)
• Root List (Alphabetical)
• Stem List
• Stems Sorted By Root
• Notes on Stems
• Affix List
• Index of Affixes by Gloss
• Index of Scientific Names
• Index of Place Names
• Index of English Place Names
• Index of Loanwords
• Credits for Illustrations

In addition to the usual main sections, there are also quite a few indices, which group together words of certain types and simplify finding them. The index of scientific names, for example, is useful to biologists.
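Sections of this kind are essentially different sortings and groupings of the same records. As a rough illustration, the Python sketch below derives an English-to-Carrier listing from the “IH” field and a topical index from the “SF” field. It is not the generation software described in this paper, which produces TeX; the function names are hypothetical, records are plain dictionaries from tag to contents, and the sort is a simple case-insensitive ordering of the English headwords.

```python
from collections import defaultdict


def english_to_carrier(records):
    """List (inverse header, Carrier form) pairs, sorted by inverse header."""
    pairs = [(r["IH"], r["P"]) for r in records if "IH" in r and "P" in r]
    return sorted(pairs, key=lambda pair: pair[0].casefold())


def topical_index(records):
    """Group (gloss, Carrier form) pairs by the contents of the SF field."""
    index = defaultdict(list)
    for r in records:
        if "SF" in r:
            index[r["SF"]].append((r.get("G", ""), r["P"]))
    return {topic: sorted(entries) for topic, entries in sorted(index.items())}


records = [
    {"P": "tanalgok",
     "G": "he walks on all limbs back into the water",
     "IH": "walks on all limbs back into the water, he",
     "C": "V"},
    {"P": "tsa", "G": "beaver", "IH": "beaver", "C": "N",
     "SF": "animals-water"},
]

for headword, form in english_to_carrier(records):
    print(f"{headword}: {form}")
print(topical_index(records))
```

The separate “IH” field is what lets the verb be filed under “walks” rather than under “he”, without any parsing of the gloss.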

Typical pages from the Carrier-to-English and English-to-Carrier sections of a dictionary are illustrated in (10) and (11). A topical index page is illustrated in (12). The various indices are exemplified by (13), which contains a page from an index of scientific names. There are also lists of roots and lists of stems, organized alphabetically and by root. These provide a means of looking up a verb form that is not in the dictionary. If the user is sufficiently knowledgeable, he may be able to determine what stem of what root he is dealing with and on this basis analyze the form. Typical pages from the root and stem lists are illustrated in (14)-(17).

(10) Sample page from the Carrier-to-English section

(11) Sample page from the English-to-Carrier section

(12) Sample page from the topical index

(13) Sample page from an index of scientific names


�%¨��"���]���c®!¤`¬%�%¢ ¢ ¥8ª%����� �%� ¡ ¦"���� �8��¬8�%¢ ¢ �¯ ° � ���c� � ±����������� � ± �¯ °%² �]���c³%� ����� � ���%����´���ª%¬e¢ � ª%¤ ���� �8��³%� �"�¯ °%²µ ¶ ���c� ¤ � ¡ ¤ � ¤R®!��¬%� ¢ ±?·%�%� ¬"����� �8��� ¤ � ¡ ¤ � ¤ �¯ ¸ ��]���c� � ¡ ��¦"¤ ���� �8��� � ¡ � ¦"¤ �¯ ¹"¹ � � ���?�%��¦"¥%¢ � ´�¤`�c¡ ��®%®%� � ������8��¢ ¤ ��¦"�¯ ¹"¹ ��º�����¦%�%��³%� �e��¡`»c¤ � �`��ªe� �%¤R¬%¡ ±8� ª%�c¦"� ¢ ¤ � ������8��¡ � � ´%�¯ ¹"¹ ���]¼%� ¡F¬%� ��¡ �%¤ �c� �c·%� �`����� �8��¬%� ��¡ �%¤ �8�¯ ��½ � ���c¦!��¦"¥%¤ ����®�±�¦"� ¦%¦%� ª%�8����������¦"� ¦"�¯ �"� � ����¬8¤ � � ± ����������¬%¤ � � ± �¯ �"� º ¼%��¡�� »c��´ ¤R� �c»�� £ ¤������ �8��� »c��´�¤`»c� £ ¤ � �¯ �"� ¾i���R®"¤�¿ ¤ ��¡ � � »c¤�¥8¬8��ª%��¤ ¡ ���%� ����������¿ ¤ � ¡ � ��»c¤ �À °"Á �]����� ¤ �F����������� ¤ �`�À ° © ��Â�����%��ª%¬%¢ ¤�� �8¤�� ��ª�� ¤ ª�� �c��¿���ª�� ¦"¤ ª�� ��ª8Ã� ��� ª%¤ ¡ ���� �8��� ¢ ��� � à � ��� à � �À ° À � ���c®"¤`� �%� ¢ ¢ � �`���� �8��� �%��¢ ¢ � �`�À ° � �ļ%��¡R»R�%¢ � � ¦%¢ ¤?��®�Å ¤ � � �R� �e»c� £�¤���� � �8���%�F� �%¤��¦8¦%¢ � � ��� � ��ª?��¿!��ªc��®�£8� ���%��¤ Æ8� ¤ ¡ ª%� ¢%¿ ��¡ � ¤���¡�� ª8±¿ � ¡ »Â��¿�¢ ��� � »c��� � ��ª"¥����R���%¤ ª�¿ ��¢ ¢ � ª%��¬%�%¤c� � »?æ%¢ ±�� ����¡ � £�� � ± �FÇF¦%¦%� ¡ ¤ ª�� ¢ ±���¢ � ���%� ¤ ¬�¿ ��¡F� ��»c¤� � ª%��¢ ¤���®�Å ¤ � � �c¤ � � �%¤ ¡c®!¤ � ���%� ¤�� �%¤ ±�� ¡ ¤�·%���?��¡¢ ��¡ ��¤`��ª%¬��%¤ � £8± ����������¿ ��¢ ¢ à È��À ¸ �¼%��¡c� ���Z¦"¤ ��¦8¢ ¤�� ��� � �R��¡c®!¤�¢ ��� ��� ¤ ¬"���������� � � à É��À ¸ º ���c���c®�±�®"� ��� �������8�����c®�±�®"� ��� �À ¸ ¾m���c®"¤`� ¡ � ª%´8± �������8��� ¡ � ª%´8± �À ¸ � �]���c®%�%± ���� �8��®%�%± �À ¸ � º ���c� ¢ � ¦"�������8��� ¢ � ¦"�À ¹ �������c� ¡ ��� ´%����������� ¡ ��� ´!�À ��© � ����®"¤R� ��¤ ¤ � ¥�� ���%� £�¤?��« �����8¬%­�� ��� � ¤ �`�������� ��¤ ¤ � �À � À �����c� ¢ ��¦"¥%� ¢ ��¦"���������� ¢ ��¦"�À ��� �]���c� � ´��cÊ��%¤ � � � ��ª"���� �8��� � ´!�À Á ° � ���c®!¤`¿ ��� ����� �8��¿ ��� �À Á °!Ë �]���c� ��ª%� � ¤ ���������� ��ª%� � ¤ �À Á °%²µ �����R³%¢ ¤ ¥�� ¡ � ª%¬?��ª�¤ ¬%��¤ ¥8� �����%¤ ªc� 
�8��¡ ¦"¤ ª%� ª%��?´8ª%� ¿ ¤ ������8��³%¢ ¤ �À Á ¸ � � ���c� ��¤ ¢ ¢"�%¦"���� �8��� ��¤ ¢ ¢ �À Á ¹�Á �e���F� ¢ ¤ ¤ ¦"�����%� ��� ��®%� ®8±�� � ¢ ´%¥��%� ¤ ¬R��� � �R� »c��¢ ¢� �%� ¢ ¬%¡ ¤ ª`��ª%¬R� ªR� »c� � ��� � ��ªR��¿8� �%¤�� ¦!¤ ¤ � �?��¿8� »c��¢ ¢� �%� ¢ ¬%¡ ¤ ªZ��¡?��¿���¢ ¬%¤ ¡?¦!¤ ��¦%¢ ¤c� ¦"¤ ��´�� ª%�e� ��� »c��¢ ¢� �%� ¢ ¬%¡ ¤ ª"��ÌF� ¤�®�±���¬%�%¢ � � ¥���%¤ ª�ª%����Å � ´8� ª%�8¥�� �

¤ ª%¬%¤ ��¡ � ª%�8����������� ¢ ¤ ¤ ¦"�À Á ¹"¹  %�]���c®!¤`®%� � � ¤ ¡ ���� �8��®%� � � ¤ ¡ �À Á ��© Á �]���c¢ � »c¦"������8��¢ � »c¦"�À Á � Ë �]���c®!¤`¡ ¤ ¬"���������¡ ¤ ¬"�À Á � Ë º ���c®%�8¡ ª"¥%�����8��®%�%¡ ª"�À Á � ² � ���p·%� � ´%¥`�%� �%� ¢ ¢ ±p��� � �i� �%¤Z³%ª8��¤ ¡ �Ä�����8�·%� � ´%�À Á �� µ �]���c®!¤`� ��¢ ¬e� ��� �%¤`� ���8� �"�������8��� ��¢ ¬"�À �"��Í ���]¼%��¡�¢ � Ê��%� ¬�� ��®!¤`¬%¤ ¤ ¦"���������¬%¤ ¤ ¦"�À ¨���© � ���c£���»c� � �������8��£ ��»c� � �À ¨�� Ë � ���c®"¤F®%��¢ ¬"¥%®8��¡ ¤ �������8��®%��¢ ¬"�À ¨�� ² �]���c� � �%���"�������8��� ���%���!�À ¨��" %�]¼%��¡��c� � ª%� ¢ ¤F�%¤ � £�±�� ®8Å ¤ � ��� �c»c� £�¤ �������8�¬%¡ � £�¤ �À ¨ Á °"Á � ���?®%�%¡ ¦"������8��®%�%¡ ¦"�À ¨ Á �� 8�]���c´8ª%��� ´!�������8��´�ª%�8� ´%�À ¨ Á �� �ºm���c� ª%� ¦"¥8������� � �e� � � � � ��¡ � ���� �8��� ª%� ¦"�Í ° �]���c¬%� ¤F� ��¢ ¢ ¤ � � � £�¤ ¢ ± ������8��¬%� ¤F� ��¢ ¢ ¤ � � � £ ¤ ¢ ± �Í ° © �]����®!¤`»c��ª�± ����� �8��»c��ª�± �Í °!Ë � ���c®!¤`»c��ª�±����������»c��ª�±��Í ° � � ���?·%����� �������8��·8����� �Í ¸ �����R�%��ª%¬%¢ ¤�¦%¢ �%¡ ��¢8¬%¤ ¿ � �%¢ ���®8Å ¤ � � � ����%� ��� ����ª8¤��¿`� �8¤e� ¢ ��� � � ³%� ��� � ¡ ±p£�¤ ¡ ®i� � ¤ »c� �]�������Z� ¢ ��� � ûc¬%� à � �Í ¸ ºm���c»c��´ ¤ �������8��»c� ´�¤ �Í ¸  %�Î����� ����´m®�±�� »c»�¤ ¡ � � ��ª�� ªm�%� ��¢ � Ê8�8� ¬���¡� � ¤ ��»��������8��� � ¤ �F�Í © º ���c®!¤ ¥%®"¤ � ��»c¤ �������8��®"¤ �Í © ¶ ���c¿ ¤ ¤ ¢"� �%� ����ª%¤ Ï ��®"��¬%±�� ��� ��¢ ¬"�������8��� ��¢ ¬!�Í © ���]¼%��¡F�c��¡ ��ª��%¢ � ¡�� �%®%� � � ª%� ¤R� �%� �e���F� ª8� �F¥!� � ¤ ¥��¡�� �%��� ¡�� �c¬%� � � ��¢ £�¤ ���� �8��¬%� � � ��¢ £�¤ �Í ¹"¹ ��]���?� ª%��¡ ¤ ����� �8��� ª%��¡ ¤ �Í ¹"¹  %�]���c� �c®8±�� ¢ ¤ � ���"������8�� ¢ ¤ � ���"�Í �� 8�����`¤ Æ8� ¡ ¤ � ¤�·%�%� ¬"��ÌF� �8��¢ ¢ ± ¥�� �F¦%� � ��� ¡�%¡ � ª%� � ¤ ¥®%�%����¢ � ��� ¦%¦%¢ � � � ®%¢ ¤F� �c� ��¤ � � � ª%������� �8��¦8� � � �½ ° © �����c®"¤`� �%� ¡ ��� ª%¬�� � �%®%®�± ����� �8��� � �%®%®�±��½���Í ���¼%��¡F¢ � Ê��%� ¬�� �c®!��� ¢ ������8��®"� � ¢ �½���Í � º ���Z¡ ��¢ ¢ �i���%� �?� �?�%� ¤ 
¬p���%¤ ªp� �%¤���®8Å ¤ � �� � � ¤ ¢ ¿�¡ ��¢ ¢ � ����������¡ ��¢ ¢ �Ë"° �]����¬8¤ � ¢���� � �"¥8��� ¡ ´���ª"������8��¬%¤ ��¢���� � �"�Ë"° º ����»c� £�¤ ����� �8��»�� £ ¤��Ë"° ¾ ����®!¤`��¢ � £�¤ ���������¢ � £�¤ �Ë"° © � ���c¬%¡ � ª8´!����� �8��¬%¡ � ª%´%�Ë"° �������?� ¦%¢ � � ¥�� �����8��¬"¥8®�±c��ª�±�»c¤ � ª%� ¥8¤ � ������� � ���ªe� Æ8¤`��¡���� � ���c��¤ ¬%� ¤ �������8��� ¦%¢ � � �Ë"° � º ¼8��¡F¢ � ������� �c»c� £ ¤ ����������¢ � � �8��»c� £�¤ � �Ë"¸ � º ����¤ Æ8� � ª%���%� � ����¢ � �����F��¡`³%¡ ¤ �c�����8�`¤ Æ%� � ª8Ã

Ð w"Ñx Ð

(14) (15)

[Figures (16) and (17): illustrations not recoverable from the extracted text.]


The provision of a topical index is one of the more unusual features of these dictionaries. Reference dictionaries are almost always organized orthographically or phonologically, but it is not uncommon for dictionaries produced by communities whose language is endangered to be topically organized. This is probably because community members regard the language as a repository of culture and so see a topical organization as a sensible reflection of their culture. From their perspective, an alphabetically arranged dictionary is in an essentially random order.

Where the topical organization is the only one provided, this is problematic, for it is very difficult to use a topical dictionary to look up an unknown word. Furthermore, topical dictionaries are usually limited to nouns, since it is nouns that are most easily organized into topical categories. Even where other syntactic categories are included, a topical organization tends to limit the comprehensiveness of the dictionary because of the difficulty of fitting many words into topical categories.

In most circumstances, therefore, a purely topical organization is not desirable. However, a topical index to a dictionary whose principal component is organized orthographically is a useful addition. In addition to providing a facet of the dictionary that makes sense to the elders, a topical index is very useful for language teachers and curriculum developers who wish to create lessons dealing with particular topics. For similar reasons, it is useful to language learners, who can bone up on the vocabulary for a particular area. A topical index is also useful for lexicographers. If well done, it provides a good record of what has been done and shows up gaps in the coverage of the existing dictionary.

It is not terribly difficult to generate a topical index from a computer database. Indeed, I am surprised that it has not been done more often. The key is to tag each record with a semantic field specification. The records are then sorted by the semantic field specification and index entries generated in that order. Section headings are generated from the semantic field specifications by keeping track of changes and emitting a new section or subsection header whenever there is a change. Once the categories have been chosen and records tagged, the arrangement of the topical index can easily be changed by changing the ordering of the categories in the sort order specification.
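The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software actually used for the Carrier dictionaries: the record layout, the category names, the function name topical_index, and the placeholder forms are all hypothetical. Each record carries a two-level semantic field tag, the records are sorted according to an explicit category ordering, and a header is emitted whenever the section or subsection changes.

```python
# Hypothetical records: the fields, headers, and forms are placeholders,
# not data from the Carrier databases.
records = [
    {"field": ("Animals", "Fish"), "header": "Salmon, Sockeye", "form": "<form 1>"},
    {"field": ("Plants", "Trees"), "header": "Spruce", "form": "<form 2>"},
    {"field": ("Animals", "Mammals"), "header": "Beaver", "form": "<form 3>"},
    {"field": ("Animals", "Fish"), "header": "Trout", "form": "<form 4>"},
]

# The sort order specification. Reordering this list rearranges the
# whole index without touching the tagged records.
category_order = [
    ("Plants", "Trees"),
    ("Animals", "Mammals"),
    ("Animals", "Fish"),
]


def topical_index(records, category_order):
    """Return the lines of a two-level topical index."""
    rank = {field: i for i, field in enumerate(category_order)}
    # Sort by position of the semantic field, then by English header.
    ordered = sorted(records, key=lambda r: (rank[r["field"]], r["header"]))

    lines = []
    previous = (None, None)
    for record in ordered:
        section, subsection = record["field"]
        # Emit a header only when the section or subsection changes.
        if section != previous[0]:
            lines.append(section.upper())
        if (section, subsection) != previous:
            lines.append("  " + subsection)
        lines.append("    {}  {}".format(record["header"], record["form"]))
        previous = (section, subsection)
    return lines


if __name__ == "__main__":
    print("\n".join(topical_index(records, category_order)))
```

Because the headers are driven purely by changes in the sorted sequence, a section whose subsections are not adjacent in the ordering would have its header repeated, so the ordering should keep the subsections of each section together.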

For those such as myself with a propensity to classification, it is tempting to create a deep classification, one with many levels. This is unwise. It is difficult to format more than two levels of headings attractively, and a classification that is too highly ramified produces many sections with very few entries. This is not merely an aesthetic consideration: not only does the user not need an elaborate classification, but I believe that one actually makes it more difficult to find things.

We have only begun generating topical indices fairly recently, so due to our own lack of experience and, apparently, that of other lexicographers, some problems remain. One is that of redundant entries. The entries we make come from the inverse header field, which was originally intended for generating the English to Carrier headers. This field often contains multiple entries, since it is often convenient to provide more than one way to look up a word. The various inverse headers tend to be distributed through the dictionary by alphabetical sorting, but in a topical index they of course are clustered together. This looks odd and in general is probably unnecessary. For example, in the page illustrated in (12), we have both “Sockeye Salmon” and “Salmon, Sockeye”, which seems unnecessary and a little bit ugly. One solution would be to create a separate topical index header field for each record and enter it manually. Another would be to use only one of the inverse headers, perhaps the first. Still another would be to generate topical index headers automatically from the gloss field, as Nathan and Austin (1992) suggest for inverse headers. I do not know whether there is a good algorithm for this.
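A cheap variant of the second solution can be sketched as follows. It is a heuristic of my own devising for illustration, not something the current software does: within a section, two inverse headers are treated as redundant if they consist of the same words in a different order, and only the first is kept. The function names and the third sample header are hypothetical.

```python
import re


def header_key(header):
    """Order-insensitive key for an inverse header.

    'Salmon, Sockeye' and 'Sockeye Salmon' yield the same key because
    punctuation, case, and word order are all ignored.
    """
    return frozenset(re.findall(r"\w+", header.lower()))


def drop_permuted_headers(headers):
    """Keep the first of any group of headers that are word permutations."""
    seen = set()
    kept = []
    for header in headers:
        key = header_key(header)
        if key not in seen:
            seen.add(key)
            kept.append(header)
    return kept


if __name__ == "__main__":
    # The third header is a made-up example of a non-redundant entry.
    section = ["Salmon, Sockeye", "Sockeye Salmon", "Salmon, Spring"]
    print(drop_permuted_headers(section))
```

This only catches headers that are exact rearrangements of one another. Headers that are synonyms rather than permutations would still need a manually entered topical header or some other treatment.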

Another problem is what classification to use. It is not too hard to devise a reasonable classification for nouns. Indeed, there are a number of models that can be used. However, it is rare for topical dictionaries to include verbs. Where they do, they tend to list those verbs that are obviously and specifically associated with the categories used for nouns and to make no attempt at listing verbs in a comprehensive fashion. If, as seems desirable, one wants to include verbs comprehensively in a topical index, it is necessary to find a good classificatory scheme for verbs.

The classification used in the most recent Carrier dictionaries represents a first attempt at this. There being little precedent known to me, it was arrived at largely by brainstorming. I found the classification of English verbs by Levin (1993) useful as a stimulus, but it was not possible to follow it closely since it is based to a considerable extent on syntactic properties that are not relevant to a topical classification. Devising a really good classification system for verbs remains a topic for research.

One thing made possible by generating dictionaries from a database is printing them in a variety of writing systems. Our system is set up to allow for printing in four different writing systems. These are exemplified in (18), which presents the words “old man” and “frequently” in all four writing systems.

(18) Examples of the Four Available Writing Systems

Writing system                     “old man”    “frequently”
Carrier Linguistic Committee       duneti       lhghun
Morice Roman                       denethi      [garbled in extraction]
Carrier Syllabics                  [syllabic glyphs lost in extraction]
International Phonetic Alphabet    [garbled in extraction]

The writing system in general use for Carrier is the Carrier Linguistic Committee writing system, a Roman-based system developed in the 1960s. The Carrier material in the dictionary databases is represented in this system, with the slight modification that underscores, used to distinguish lamino-dental fricatives and affricates from apico-alveolars, are written as separate characters preceding the consonant in the database. A typical page is shown in (19).
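For printing, the underscore has to be moved from its position before the consonant to beneath it. The following sketch shows one way this could be done. It assumes Unicode output using the combining low line (U+0332), whereas our own software generates TeX, and it assumes that the underscore applies to the single letter that follows it. The function name and the sample strings are made up for illustration and are not Carrier words.

```python
import re

# Unicode combining character that draws a line under the preceding letter.
COMBINING_LOW_LINE = "\u0332"

# An underscore followed by one letter. The class [^\W\d_] matches word
# characters other than digits and the underscore itself, i.e. letters.
MARKED_LETTER = re.compile(r"_([^\W\d_])")


def underline_marked(text):
    """Replace each '_x' in the database form by 'x' plus a combining underline."""
    return MARKED_LETTER.sub(lambda m: m.group(1) + COMBINING_LOW_LINE, text)


if __name__ == "__main__":
    # "_sa" stands for a database form in which "s" is marked as underlined.
    print(underline_marked("_sa"))
```

Whether an underscore before a digraph should underline one letter or both is a matter of how the database encodes affricates, which this sketch does not attempt to settle.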

Some elders prefer the writing system that they learned from the 1938 edition of the Roman Catholic Prayerbook, which is the phonetic writing system used by Father Adrien-Gabriel Morice in his scholarly publications. Rather than attempting to implement allophonic rules, we generate a phonemicized version of this writing system. This is illustrated in (20).

The first writing system used for Carrier was the so-called Carrier syllabics, a derivative of the Cree syllabics introduced in 1885. It is no longer widely used, but some people strongly favor it. A page in syllabics is illustrated in (21). Finally, for the use of linguists and occasional others it is possible to produce dictionaries in the International Phonetic Alphabet as illustrated in (22).


[Figures (19) and (20): sample dictionary pages in the Carrier Linguistic Committee writing system and the Morice writing system; page content not recoverable from the extracted text.]


ß ÐNà ß

(21) (22)


5. Generating Printed Dictionaries

Printed dictionaries are generated from the databases by executing programs that extract information from the database, sort it, and generate a set of files suitable for input to the TeX formatter. The TeX program is then executed on a master format file, which incorporates the various files generated from the database along with various fixed files, such as the explanation of the writing system. The overall process is illustrated in (24).

Since each dictionary has so many pieces, and there are so many intermediate stages, the production of a dictionary is quite complex. Not only would it be difficult to keep track of by hand, but if an error occurs at some point, it is desirable not to have to regenerate everything from scratch, but only what is necessary. The generation of a dictionary is therefore controlled by the make program, a standard UNIX utility originally designed for controlling the compilation of computer programs. Where there are no errors, generation of a dictionary requires five commands, exemplified in (23):⁶

(23) Commands for Generating a Dictionary

01 make CLCDict
02 cd Forms
03 make
04 make
05 dvips -o Main.ps Main.dvi

The first line tells the make program to do whatever is necessary to generate the target CLCDict. It will look for instructions as to how to do this as explained below. Assuming that there are no errors, the result of this step will be the generation of the various files that go into the dictionary. The second line changes the directory to a subdirectory called Forms in which the master format file is kept. The third line executes the make program again. Since it is called with no argument, it will attempt to create the first target specified in the makefile. If there are no errors, the result of this step is the execution of TeX and the generation of a device-independent printer language file. The second make command, on line 04, is not an error. It is executed twice in order to make sure that the page numbers in the table of contents and figure credits are correct. Since page numbers are determined anew each time TeX is run, information containing page numbers must be written out during one run of TeX, then read back in and used during a subsequent run. The dvi file is then converted by the dvips program into the PostScript printer language. It is at this stage that images are actually incorporated. The PostScript file may then be viewed on the computer or sent to a printer.⁷
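The need for two runs can be seen in a toy model. The following Python sketch is not TeX and is not part of the dictionary software; the function typeset and the dictionary aux, which stands in for the information TeX writes out between runs, are hypothetical names chosen for illustration.

```python
def typeset(sections, aux):
    """One 'run': lay out a contents page followed by the sections.

    `sections` is a list of (title, page_count) pairs. `aux` maps titles to
    the page numbers recorded by the previous run (empty on the first run).
    Returns the contents lines and the page numbers found in this run.
    """
    # The contents page can only use what an earlier run wrote out.
    contents = [f"{title} ... {aux.get(title, '??')}" for title, _ in sections]
    page = 2  # page 1 is the contents page itself
    found = {}
    for title, page_count in sections:
        found[title] = page
        page += page_count
    return contents, found


sections = [("Carrier-English", 120), ("English-Carrier", 80)]
first_contents, aux = typeset(sections, {})     # first run: numbers unknown
second_contents, _ = typeset(sections, aux)     # second run: numbers filled in
```

On the first run the contents lines carry placeholders; only the second run, which reads what the first recorded, shows the page numbers.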

6 Here as in subsequent examples, the line numbers have been added to facilitate exposition. They are not part of the makefile or program.

7 In practice, when preparing to print rather than to examine the output on a computer monitor, dvips will generally be run more than once, each time generating a PostScript file containing a specified section of the dictionary. Sending a large document to the printer in chunks of, say, fifty pages each, is better than sending it all in one piece. It allows other users to get their jobs in, provides opportunities to let the printer rest and make adjustments, and permits minor changes to be made in sections that have not yet been printed.

[Flowchart; its layout is not reproduced here. The labels that survive are listed by group.]

Database: Roots.ldb, Stems.ldb, Sents.sdb, Words.ldb.

Intermediate and generated files: Stems.split, Words.srt, CeWordlist.tex; Wordsih.split, Wordsih.srt, EcWordlist.tex; LoanWords (.txt, .srt, .tex); PlaceNames (.txt, .srt, .tex); EngPlaces (.txt, .srt, .tex); SciNames (.txt, .srt, .tex); Affixes (.txt, .srt, .tex); EngAffixes (.txt, .srt, .tex); TopicalIndex (.split, .srt, .tex); RootList (.srt, .tex); StemList (.srt, .tex); StemsByRoot (.srt, .tex); StemNotes.tex; Contents.tex; PicCredits.tex.

Other files: AlphaOrder.tex, Appendices.tex, CarrierTitle.tex, Copyright.tex, Edition.tex, EnglishTitle.tex, GrCatAbbrev.tex, GrInfo.tex, Preface.tex, References.tex, WritingExp.tex; format, font, draft, ortho; CarrierCnt, EnglishCnt, EtymCnt, LoanCnt, PictureCnt, PlaceNameCnt, RootCnt, SciNameCnt, StemCnt.

Output: Main.tex, Main.dvi, Main.ps, PS 1 ... PS N, Printer.

Programs labelling the arrows: awk, msort, tex, dvips.

Dark boxes indicate underived files. Light boxes indicate derived files. Dotted lines indicate inclusion. Solid lines indicate derivation.

(24) Generating a Printed Dictionary


The make program reads a file called a makefile that contains specifications of dependencies among files and of how files may be created. Here is an extract from the makefile for a Carrier dictionary.

(25) Part of a Make File

01 CLCDict: CLCCeWL.tex CLCEcWordlist.tex CLCScientificNames.tex
      CLCLoanWords.tex CLCPlaceNames.tex CLCRootList.tex
      CLCEnglishRootList.tex CLCStemList.tex CLCStemsByRoot.tex
      NotesOnStems.tex CLCAffixList.tex CLCAffixesByGloss.tex
      CLCEnglishPlaceNames.tex CLCEcTopicalIndex.tex
02
03 CLCCeWL.tex: Ce.srt StemList.sp
04    gawk -f ${CAR}/Scripts/PrepDict.awk -v Orth=clc -v Version=scholars
         Ce.srt > tmp
05    cat GenInfo.tex tmp > CLCCeWL.tex
06    echo "\ortho=1" > Forms/ortho
07    rm tmp
08
09 Ce.srt: NoComments.txt
10    msort -t "^P:" -c l -s ${CAR}/orders/carriernc.ord -x exclusions
         -t "^P:" -c l -s ${CAR}/orders/carrier2.ord
         -t "^C:" -c l -s ${CAR}/orders/cats.ord
         -t "^G:" -o -s ${CAR}/orders/english.ord
         -d % NoComments.txt > Ce.srt

The first line indicates that the target CLCDict depends on the files CLCCeWL.tex, CLCEcWordlist.tex, etc. This target is not actually a file but is used to tell the make program what kind of dictionary we want to generate. We would do this by giving the command make CLCDict. If the files on which CLCDict depends do not exist or are out of date, the program will attempt to create them.

Line 03 says that one of the files needed, CLCCeWL.tex, depends on two other files, Ce.srt and StemList.sp. Lines 04 through 07 give the commands that must be executed in order to create CLCCeWL.tex from these two files. Line 04 runs an AWK program, whose main output goes into the file called tmp. The file GenInfo.tex is also created as a side effect. Line 05 combines the two files. Line 06 creates a little file that transmits information to TeX about the writing system in use, and line 07 deletes the temporary file tmp.
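The decision that make takes for each target, namely whether it is missing or out of date with respect to the files it depends on, can be sketched as follows. This is a simplified Python illustration, not make itself; the function name needs_rebuild is hypothetical, and unlike make the sketch assumes that every prerequisite already exists.

```python
import os


def needs_rebuild(target, prerequisites):
    """Return True if `target` is missing or older than any prerequisite.

    Simplification: real make would first try to build a missing
    prerequisite; here every prerequisite is assumed to exist.
    """
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    return any(os.path.getmtime(p) > target_time for p in prerequisites)
```

In terms of (25), CLCCeWL.tex would be regenerated when needs_rebuild("CLCCeWL.tex", ["Ce.srt", "StemList.sp"]) is true, and left alone otherwise, which is what saves regenerating everything from scratch after an error.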

Line 09 tells us that the file Ce.srt depends on NoComments.txt. Line 10 says that it may be created by running the msort program. The rather elaborate arguments tell msort to sort first on the pronunciation field, using a sort order specified in a file and excluding from consideration characters specified in another file, then to subsort (breaking ties) on the same field using a different sort order and with no exclusions, then if necessary to subsort on the category field and finally if necessary to subsort on the gloss.
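The effect of such a chain of keys can be imitated in a few lines of Python. This is only a sketch of the idea, not msort: the sort order, the excluded character, the category ranks, and the four records are toy values invented for illustration, and the keys follow the order of the options in line 10 of (25).

```python
def make_key(order, exclusions=""):
    """Build a key function that ranks characters by their position in
    `order` and ignores any character listed in `exclusions`."""
    rank = {ch: i for i, ch in enumerate(order)}

    def key(text):
        # Characters missing from the order sort after all listed ones.
        return [rank.get(ch, len(rank)) for ch in text if ch not in exclusions]

    return key


ORDER = "abcdefghijklmnopqrstuvwxyz-"           # toy sort order
primary = make_key(ORDER, exclusions="-")       # first pass ignores hyphens
secondary = make_key(ORDER)                     # tie-breaker sees them
CATEGORY_RANK = {"N": 0, "V": 1}                # toy category order

records = [                                     # invented records
    {"P": "-tes", "C": "N", "G": "knife (possessed)"},
    {"P": "tes", "C": "V", "G": "cut"},
    {"P": "tel", "C": "N", "G": "basket"},
    {"P": "tes", "C": "N", "G": "knife"},
]


def sort_key(record):
    # Each later element is consulted only when the earlier ones tie.
    return (primary(record["P"]), secondary(record["P"]),
            CATEGORY_RANK[record["C"]], record["G"])


ordered = sorted(records, key=sort_key)
```

Here the three records whose headwords differ only by the excluded hyphen tie on the first key; the second key puts the unhyphenated forms first, and the category key then separates the two that are still tied.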

The bulk of the work of extracting information from the database and formatting it is done by small programs written in AWK (Aho, Kernighan, and Weinberger 1988). Several features of AWK make it particularly suitable for this kind of text processing:

1. automatic storage allocation;

2. associative arrays, that is, arrays whose indices can be strings rather than integers;

3. regular expression matching and substitution;

4. automatic parsing of input into records, and of records into fields.

A small AWK program is illustrated in (26). This is a slightly simplified version of the program that extracts records containing semantic field specifications for further processing that ultimately generates a topical index.

(26) An AWK Program

01 #Extract records for topical index, that is, records
02 #containing an SF field, an IH field, and a G field.
03 BEGIN {
04    RS = "";
05    FS = "\n?%";
06 }
07 {
08    #Create an associative array with tags as indices.
09    for(i = 2; i <= NF; i++) {
10       split($i, f, ":");
11       rec[f[1]]=substr($i,index($i,":")+1);
12    }
13    if(("SF" in rec) && ("G" in rec) && ("IH" in rec)){
14       printf("%s\n\n",$0);
15    }
16    CleanUp();
17 }

The program begins with a BEGIN pattern, whose action is automatically executed at the beginning of the program, no matter what the input. Two actions are taken. The record separator RS is set to a value that makes it treat blocks of text separated by blank lines as records. The field separator FS is set to a percent sign optionally preceded by a newline character. (Using the percent sign as the basic field separator allows fields to span more than one line. Since a percent sign might also be part of the value of a field, the intent is to treat as field separators only those percent signs that follow a newline. As written, with the newline optional, the simplified pattern also matches the percent sign that begins a record, but it would split on a percent sign inside a field value as well.)

The remainder of the program consists of a single pattern-action pair in which the pattern is empty, so the action is taken for each new record. Line 09 loops through the fields that AWK automatically parses out of the current record. It starts with field number two because all of our fields start with a percent-sign field separator, so each record in effect begins with an empty field. AWK numbers the fields $1, $2, etc. The number of fields parsed out is put into the variable NF.

In line 10 the field is split into pieces separated by colons and these pieces are put into the array “f”. f[1] therefore contains the tag of the field. In line 11 the contents of the field are put into the array “rec”. The index is the tag, which in line 10 was stored in f[1]. The part of the line following the equal sign removes everything up to and including the first colon.⁸ As of line 13, therefore, the array “rec” contains the fields of the record, now indexed by their tags.⁹

Line 13 tests whether the record contains a semantic field specification, a gloss, and an inverse header. If so, line 14 prints the entire record ($0), followed by two newline characters, that is, by a blank line that separates it from the next record. Line 16 calls a function not shown here that cleans up in preparation for the next record. For example, it needs to delete the current contents of the array “rec”. If it did not, fields left over from previous records would be mixed with those for the current record.
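For readers more at home in another language, the same logic can be sketched in Python. This is a rough equivalent written for illustration, not a translation in actual use; the function names and the sample records are hypothetical. It treats only a percent sign at the start of a line as beginning a field, and because it builds a fresh dictionary for each record no counterpart of CleanUp() is needed.

```python
import re


def parse_record(record):
    """Map each field's tag to its content. A field starts with '%' at the
    beginning of a line; the tag ends at the first colon."""
    fields = {}
    # The piece before the first '%' is empty, like AWK's $1, so skip it.
    for field in re.split(r"(?m)^%", record)[1:]:
        tag, _, content = field.partition(":")
        fields[tag] = content.rstrip("\n")
    return fields


def topical_records(text):
    """Return the blank-line-separated records having SF, G and IH fields."""
    records = [r for r in re.split(r"\n\s*\n", text.strip()) if r]
    return [r for r in records
            if {"SF", "G", "IH"}.issubset(parse_record(r))]


SAMPLE = """%P:tes
%G:knife
%SF:tools
%IH:knife

%P:tel
%G:basket
"""

selected = topical_records(SAMPLE)   # only the first record qualifies
```

As in the AWK version, a later field with the same tag overwrites an earlier one, and colons inside the content are preserved because only the first colon is treated as the end of the tag.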

As can be seen from the diagram in (24), there are numerous points at which information must be sorted into the proper order. These sorts are carried out by a program called msort, described in detail in Poser (2000). This is a program for sorting text files in sophisticated ways, intended especially for linguistic databases. It allows arbitrary sort orders to be specified, with ranks defined for large numbers of multigraphs of effectively unlimited length. Records need not be single lines of text but may be delimited in a number of ways. The entire record may be used as the sort key, or a particular field may be used. Key fields may be selected either by position in the record or by matching a regular expression to a tag. msort is capable of sorting on several keys, so that when two records tie on one key, the tie may be broken on another. Each key may have its own sort order. Any or all keys may be optional. In addition to lexicographic sorting, sorting by numerical value, date or time is supported. For each key a distinct set of characters may be excluded from consideration when sorting in any combination of initial, final, and medial position in the key field.
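What it means to rank multigraphs can be illustrated with a short Python sketch. It is not msort's implementation: the function name is hypothetical, the fragment of a sort order is invented, and the longest-match-first reading of the key is an assumption made for the illustration.

```python
def multigraph_key(order):
    """Return a key function that reads a string as a sequence of the
    graphs in `order`, longest match first, and yields their ranks."""
    rank = {graph: i for i, graph in enumerate(order)}
    longest = max(map(len, order))

    def key(text):
        ranks, pos = [], 0
        while pos < len(text):
            # Try the longest candidate first so that "lh" beats "l".
            for size in range(min(longest, len(text) - pos), 0, -1):
                graph = text[pos:pos + size]
                if graph in rank:
                    ranks.append(rank[graph])
                    pos += size
                    break
            else:
                # Unknown character: rank it after everything listed.
                ranks.append(len(rank))
                pos += 1
        return ranks

    return key


# Invented fragment of a sort order in which dz, lh and ts are single units.
ORDER = ["a", "d", "dz", "e", "i", "l", "lh", "o", "s", "t", "ts", "u", "z"]
key = multigraph_key(ORDER)
words = ["tsa", "lha", "tu", "lu", "da", "dza"]
by_multigraph = sorted(words, key=key)
by_character = sorted(words)
```

With this order every word beginning with l sorts before every word beginning with lh, so lu precedes lha, whereas a character-by-character sort puts lha first.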

8 If we knew that the colon separating the tag from the contents of the field were the only one, we could just use f[2], but it is quite possible that there will be colons within the content of the field, so we can’t assume that f[2] will contain all of the content.

9 Notice that if two fields have the same tag, the last one in the record will be the only one in “rec”. For most purposes this is all right, but a different and more complex parsing technique must be used when we want to extract repeated fields.


(27) The Sort Process

read configuration
read data to be sorted
parse into records
parse records into fields
identify key fields
assign default for missing optional key
convert dates and times to numeric
transform multigraphs
exclude exclusions
sort
generate output

Key Transformations

A good example of the utility of a sorting program with capabilities like those of msort is provided by the sorting needed to generate the table of stems sorted by roots. The relevant portion of the makefile, slightly edited for presentation, is given in (28).

(28) The Makefile Entry for Sorting Stems by Roots

01 StemsByRoot.srt: StemList.ldb
02    -msort -d % -t "^ROOT" -c l -s ${CAR}/orders/carrier1.ord
03       -t "^RID:" -c n
04       -t "^TM:" -c l -s ${CAR}/orders/TM.ord
05       -t "^ASP" -c l -s ${CAR}/orders/ASP.ord
06       -t "^STEM:" -c l -s ${CAR}/orders/carrier1.ord
07       < StemList.ldb > CLCStemsByRoot.srt


(29) The Sort Order File for Aspect Categories

stat stative
mom momantaneous
cont continuous
dist distributive
prog progressive
sem semelfactive
rep repetitive
cust customary

This makes use of the ability to specify arbitrary sort orders that have nothing to do with any language’s alphabetical order as well as the ability to handle long multigraphs. Here both abbreviated and full aspect names are provided for. The longest is 13 characters long.
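One way to read a file like (29) is that every name on a line receives the rank of that line, so that an abbreviation and the corresponding full name sort together. The Python sketch below works under that assumption, which is an interpretation of the listing rather than a statement about msort's file format; the function name and the stem labels are hypothetical.

```python
# The order file as printed in (29), one rank per line.
ASPECT_ORDER = """\
stat stative
mom momantaneous
cont continuous
dist distributive
prog progressive
sem semelfactive
rep repetitive
cust customary
"""


def read_ranks(text):
    """Give every name on line n of the order file the rank n."""
    ranks = {}
    for n, line in enumerate(text.splitlines()):
        for name in line.split():
            ranks[name] = n
    return ranks


ranks = read_ranks(ASPECT_ORDER)

# Invented (aspect, stem) pairs; stems here are placeholder labels.
stems = [("cust", "c"), ("stative", "b"), ("mom", "a"), ("stat", "a")]
ordered = sorted(stems, key=lambda s: (ranks[s[0]], s[1]))
```

Both stat and stative receive rank 0, so the two stative stems come first and are then ordered by the second element of the key.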

A similar use of this ability is in putting the topical index into order. Here, the records must first be sorted by semantic field, then, within their category, alphabetically. Semantic field specifications may be quite long; the longest, at present, is gathering-plants-scrapingcambium, which is 32 characters.

The output of the programs that extract information from the database and format it consists of pieces of text formatted in TeX. The final step in printing the dictionary is to execute TeX on a file that contains the layout for the dictionary and, at appropriate places, instructions to insert the contents of other files, containing the actual dictionary information, which have been generated from the database. The bit of TeX-formatted text underlying the first entry on the Carrier-English page in (10) is shown in (30). The macro \hipu sets up a hanging-indented paragraph whose first line is unindented with respect to the left margin. Lines 02, 04, and 05 print the headword, category abbreviation, and gloss, which in this case is a cross-reference to the singular. The macros \pft, \cft, and \gft select the appropriate fonts. Line 03 does not directly print anything, but records the information that TeX uses to keep track of the first and last entries on the page so as to generate the left and right page headers.

(30) An Entry from CLCCeWL.tex

01 \hipu
02 {\pft -chaike}
03 {\mark -chaike}
04 {\cft N}
05 {\gft Plural of {\qc -chai}, q.v.}.
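Producing text of this shape from a record is a matter of simple string assembly. The following Python sketch shows the idea; it is not the PrepDict.awk program, the function name tex_entry is hypothetical, and it makes no attempt to escape characters that are special to TeX.

```python
def tex_entry(headword, category, gloss):
    """Format one entry in the shape shown in (30)."""
    return "\n".join([
        r"\hipu",
        r"{\pft " + headword + "}",    # headword, in the headword font
        r"{\mark " + headword + "}",   # recorded for the page headers
        r"{\cft " + category + "}",    # category abbreviation
        r"{\gft " + gloss + "}.",      # gloss
    ])


entry = tex_entry("-chaike", "N", r"Plural of {\qc -chai}, q.v.")
```

The result is the text of (30) without the line numbers, one macro call per line.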

6. Direct Use of the Database

Generating printed dictionaries is not the only use for the Carrier lexical databases. Another use is as on-line dictionaries. They are easily searched, by means of the search functions in Emacs and by means of other programs, such as AWK. In most respects, such an on-line lexicon is faster and more convenient to use than a printed version. I myself almost never use the printed dictionaries that I create; it is usually more convenient for me to look up anything I do not know in the computer files.

Another important use of the database is for linguistic research. An on-line lexicon can be searched in many different ways, for various combinations of information. I frequently use the regular-expression matching facilities of Emacs and of AWK to look for examples of various types.

For example, here is the Emacs Lisp code that creates a function that searches the database for Carrier words good for inclusion in critical regions of sentences elicited for the instrumental study of tone and intonation. It defines a function that uses the existing regular expression search facility to find records whose headword contains only vowels, sonorants, and voiced stops, sounds that do not significantly perturb the fundamental frequency contour.

(31) EMACS Lisp Code Defining a Search Function

01 (defun find-goodf0 ()
02   "Find the next word good for F0."
03   (interactive)
04   (re-search-forward goodf0-regexp)
05 )
06 (defconst goodf0-regexp "%P:[aiueomnylbdg]+$"
07   "Regular expression for recognizing words good for F0")
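The same search can be run over a whole file at once outside the editor. The Python sketch below uses the character class of (31) but adds an anchor at the start of the line, which is an addition not present in the original expression; the function name and the sample text are invented for illustration.

```python
import re

# Character class taken from (31); the leading ^ is an added assumption
# that headword fields begin a line.
GOODF0 = re.compile(r"^%P:[aiueomnylbdg]+$", re.MULTILINE)


def good_f0_words(text):
    """Return the headwords made up only of the listed sounds."""
    # Drop the three-character "%P:" tag from each match.
    return [m.group(0)[3:] for m in GOODF0.finditer(text)]


SAMPLE = "%P:duni\n%G:moose\n\n%P:tes\n%G:knife\n\n%P:yun\n%G:land\n"
found = good_f0_words(SAMPLE)
```

The headword tes is rejected because t is not in the class, while the other two headwords are returned.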

7. How and Why it Got to be That Way

This database system came about for a number of reasons. It originally ran on what by current standards were very slow machines, under DOS. I had no funding with which to buy specialized software or to hire other people to do part of the work or provide technical support. The better, faster machines to which I had access, though initially not at home or in the field, ran UNIX, so compatibility with UNIX systems was desirable. Since I was thoroughly familiar with awk, make, and other UNIX utilities and had extensive programming experience, and versions of these were available for DOS at little or no cost, it made sense to use them.

A second factor was the limitations of available software. Most databases were not designed for linguistic work and were not very flexible. They often required fields to be of fixed length, could not generate output in appropriate formats, and lacked the ability to sort in sufficiently sophisticated ways. I wrote msort because there was no truly comparable program. The standard UNIX sorting utility lacks many of the capabilities of msort, as do most if not all other stand-alone sorting programs of which I am aware. Sorting of comparable sophistication was and is available only within some databases, and even then, to my knowledge, not all of msort’s capabilities are available.

At the time, the only turnkey lexical database system of which I was aware was the Summer Institute of Linguistics’ Shoebox program. Shoebox is in some ways not unlike my own system but provides a more structured interface and a higher degree of integrity checking. However, Shoebox did not have the sorting capability I needed and could not easily be made to generate output in the format I desired.

The remaining approach was to use Robert Hsu’s Lexware system, a collection of programs together with a file format that has been used to produce quite a few dictionaries. Hsu has made great contributions both to the production of individual dictionaries and to thinking about the automated production of dictionaries,¹⁰ but his system was not quite what I wanted. For one thing, it is not a turnkey system, but requires the user to rely on Hsu to adapt his programs to their needs. Using the UNIX utilities I could just as easily have complete control and flexibility.

Yet another factor was that I did not originally set out to generate dictionaries. I initially created the database as a way of recording data I had collected and being able to search it more efficiently than I could by reading through my handwritten notes. Beyond this I had no specific plans for the database.

Finally, at the outset I did not know very much about the language, and therefore did not know exactly what sorts of information would need to be separated and what structure the database ought to have. It was important to avoid commitments to details of database structure that I might later regret.

8. Discussion

All in all, this system has served my needs well. It cost nothing, was not difficult to set up, and has proved to be easy to modify and extend. The work necessary to create it may have been greater than it would have been with an existing database, but that is not clear, since few databases were well adapted to the same purposes. Now that the system is set up, though, this disadvantage is not so great. I have found it to be generally quite easy to modify and extend the existing system. Numerous changes have been made in the format of printed dictionaries without difficulty and various indices and new displays of information added. Adapting this dictionary system to Flathead and Kootenai was a simple matter. Similarly, adapting the software that generates TeX to generate HTML instead was quite simple. Once the framework is in place, many changes are easy to make.¹¹

A more significant disadvantage is that it does not provide the same degree of integrity control as do most standard databases. That is, it is relatively easy for a record to become ill-formed or for records to be deleted accidentally. As I am a skilled user of the system, the integrity checks and backups provided are probably sufficient.

10 Hsu (1994) is a valuable discussion derived from his extensive experience in this area.

11 AWK has some limitations as a programming language. Were I to set up such a system again from scratch, I might choose a different programming language. Many people now use PERL (Wall & Schwartz 1991) for similar purposes. I have considered this, but to my taste PERL is not cleanly structured and suffers from kitchen sinkism. Should I decide to use a language other than AWK, it will probably be Python (Lutz 1996).


For a single expert user such as myself, probably the greatest limitation at present is the need to create and maintain relations, such as that between records for forms and example sentences, manually.

Generally speaking, this system is quite adequate for my own use and for use by similarly skilled users working one at a time. The system is not well adapted to use by more than one person or by inexpert users. Inexpert users would benefit from a system that provides more elaborate integrity checks, with an interface that carefully controls the ways in which they can modify the database. A forms-based interface would also make it easier for inexpert users to enter information. The system is not well adapted to use by multiple users because it does not provide file or record locking, different levels of access by different users, or automatic recording of what modifications are made by whom.


References

Aho, Alfred V., Kernighan, Brian W., and Peter J. Weinberger (1988) The AWK Programming Language. Reading, Massachusetts: Addison-Wesley.

Amith, Jonathan (2002a) Diccionario del idioma Nahuatl de Oapan y Ameyaltepec. Philadelphia: the author.

Amith, Jonathan (2002b) Dictionary of Nahuatl of Oapan and Ameyaltepec. Philadelphia: the author.

Feldman, S. I. (1986) “Make — A Program for Maintaining Computer Programs,” Bell Laboratories Technical Report.

Hsu, Robert (1994) Methods of Language Data Processing. Ms., University of Hawaii.

Knuth, Donald E. (1984) The TeXbook. Reading, Massachusetts: Addison-Wesley.

Levin, Beth (1993) English Verb Classes and Alternations: A Preliminary Investigation. Chicago: The University of Chicago Press.

Lutz, Mark (1996) Programming Python. Cambridge: O’Reilly and Associates.

Nathan, David and Peter Austin (1992) “Finderlists, Computer-Generated, for Bilingual Dictionaries,” International Journal of Lexicography 5.1.

Poser, William J. (2000) MSORT Reference Manual. The current version may be downloaded from http://www.ling.upenn.edu/~wjposer/.

Stallman, Richard M. (1984) “EMACS: The Extensible, Customizable, Self-Documenting Display Editor,” Interactive Programming Environments, edited by D. R. Barstow, H. E. Shrobe, & E. Sandewall (New York: McGraw-Hill), pp. 300-325.

Thomason, Sarah G. (1994) A Dictionary of Montana Salish. Ms., University of Pittsburgh and Confederated Salish/Kootenay Tribes.

Wall, Larry and Randal L. Schwartz (1991) Programming perl. Cambridge: O’Reilly and Associates.