![Page 1: Finite-State Methods in Natural Language Processing Lauri Karttunen LSA 2005 Summer Institute July 27, 2005](https://reader036.vdocuments.net/reader036/viewer/2022070306/551804d555034693228b4d6e/html5/thumbnails/1.jpg)
Finite-State Methods in Natural Language Processing
Lauri Karttunen
LSA 2005 Summer Institute
July 27, 2005
Course Outline
July 18: Intro to computational morphology; XFST
Readings:
Lauri Karttunen, “Finite-State Constraints”, The Last Phonological Rule, J. Goldsmith (ed.), pages 173-194, University of Chicago Press, 1993.
Karttunen and Beesley, “25 Years of Finite-State Morphology”
Chapter 1: “Gentle Introduction” (B&K)
July 20: Regular expressions; More on XFST
Readings:
Chapter 2: “Systematic Introduction”
Chapter 3: “The XFST interface”
July 25: More on XFST: Date Parser; Concatenative morphotactics: The LEXC language
Readings:
Chapter 4. “The LEXC Language”
July 27: Constraining non-local dependencies: Flag Diacritics; Complex morphotactics and alternations: Finnish Numerals
Readings:
Chapter 5. “Flag Diacritics”
August 1: Non-concatenative morphotactics
Reduplication, interdigitation
Realizational morphology
Readings:
Chapter 8. “Non-Concatenative Morphotactics”
Gregory T. Stump. Inflectional Morphology. A Theory of Paradigm Structure. Cambridge U. Press. 2001. (An excerpt)
Lauri Karttunen, “Computing with Realizational Morphology”, Lecture Notes in Computer Science, Volume 2588, Alexander Gelbukh (ed.), 205-216, Springer Verlag. 2003.
August 3: Optimality theory
Readings:
Paul Kiparsky, “Finnish Noun Inflection”, Generative Approaches to Finnic and Saami Linguistics, Diane Nelson and Satu Manninen (eds.), pp. 109-161, CSLI Publications, 2003.
Nine Elenbaas and René Kager. "Ternary rhythm and the lapse constraint". Phonology 16. 273-329.
Syllabification revisited
define MarkNonDiphthongs [ [. .] -> "." ||
    [HighV | MidV] _ LowV,                      # i.a, e.a
    LowV _ MidV,                                # a.e
    i _ [MidV - e],                             # i.o, i.ä
    u _ [MidV - o],                             # u.e
    y _ [MidV - ö],                             # y.e
    $V i _ e,                                   # poiki.en
    $V u _ o,                                   #
    $V y _ ö,                                   #
    $V [MidV | LowV] _ [u|y] C [C|.#.] ];       # oike.us
define Syllabify [ C* V+ C* @-> ... "." || _ C V ];
regex FinnWords .o. MarkNonDiphthongs .o. Syllabify;
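The Syllabify rule can be approximated outside xfst. The following is a minimal Python sketch, not the xfst implementation: it assumes lowercase input, treats every character that is neither a vowel nor "." as a consonant, and relies on greedy regular-expression matching to imitate the longest-match behaviour of `@->`. The names `syllabify` and `SYLLABLE` are illustrative only.

```python
import re

VOWELS = "aeiouyäö"
C = f"[^{VOWELS}.]"  # consonant: anything that is not a vowel or a boundary mark
V = f"[{VOWELS}]"

# Mirror of  C* V+ C* @-> ... "." || _ C V :
# take the longest C* V+ C* chunk that is directly followed by C V.
SYLLABLE = re.compile(f"({C}*{V}+{C}*)(?={C}{V})")


def syllabify(word: str) -> str:
    """Insert "." after every chunk matched by SYLLABLE."""
    return SYLLABLE.sub(r"\1.", word)


if __name__ == "__main__":
    for w in ["kalastelet", "strukturalismi", "poiki.en"]:
        print(w, "->", syllabify(w))
```

Boundaries already inserted by a step such as MarkNonDiphthongs (the "." in poiki.en) are left in place, because "." is excluded from the consonant class.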
Constraints
[Figure: network diagram of the Esperanto lexicon; arc labels include ge, hund, bon, eg, et, o, a, j, n; the tags MF+, +Fem and +Pl are marked on the paths the constraint refers to]
MF%+ => _ ~$[%+Fem] %+Pl ;
Constraining by composition
xfst[0]: read lexc < adj-noun-tags.lexc
Root...2, Nouns...2, NounRoots...4, Nmf...5, ....
Building lexicon...Minimizing...Done!
2.7 Kb. 45 states, 70 arcs, Circular.
xfst[1]: up gehundino
MF+hund+Noun+Fem+Sg
xfst[1]: regex "MF+" => _ ~$["+Fem"] "+Pl" ;
1.2 Kb, 2 states, 7 arcs, Circular
xfst[2]: compose
3.2 Kb, 61 states, 89 arcs, Circular
xfst[1]: up gehundino
xfst[1]: *** Not accepted ***
Fewer words, bigger network.
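To make the effect of the restriction concrete, here is a small Python sketch that checks the same condition on a list of symbols: every MF+ must be followed, somewhere later, by +Pl with no +Fem in between. It only illustrates what the constraint accepts; the function name is hypothetical and nothing here reproduces how xfst compiles or composes networks.

```python
from typing import List


def satisfies_mf_restriction(symbols: List[str]) -> bool:
    """Check  "MF+" => _ ~$["+Fem"] "+Pl"  on a sequence of symbols."""
    for i, sym in enumerate(symbols):
        if sym != "MF+":
            continue
        licensed = False
        for later in symbols[i + 1:]:
            if later == "+Fem":  # a +Fem before any +Pl blocks the context
                break
            if later == "+Pl":   # found +Pl with no +Fem in between
                licensed = True
                break
        if not licensed:
            return False
    return True


if __name__ == "__main__":
    print(satisfies_mf_restriction(["MF+", "hund", "+Noun", "+Fem", "+Sg"]))
    print(satisfies_mf_restriction(["MF+", "hund", "+Noun", "+Pl"]))
```

The first call corresponds to the analysis of gehundino that the composed network no longer accepts.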
Esperanto with Flags
Multichar_Symbols
+Noun +Adj +Nsuff +ASuff +Nize +Pl +Sg +Acc MF+ +Aug +Dim +Fem Op+ Neg+ @U.MF.Yes@ @U.MF.No@

LEXICON Root
Nouns ;
Adjectives ;

LEXICON Nouns
NounRoots ;
@U.MF.Yes@ Ge ;

LEXICON Ge
MF+:ge NounRoots ;

LEXICON NounRoots
bird Nmf ;
hund Nmf ;
kat Nmf ;

LEXICON Nmf
+Noun:0 AugDimFem ;

LEXICON AugDimFem
@U.MF.No@ Fem ;
+Dim:et AugDimFem ;
+Aug:eg AugDimFem ;
Nend ;
Adjend ;

LEXICON Fem
+Fem:in AugDimFem ;
Constraining by flags
xfst[0]: read lexc < esperanto-flags.lexc
xfst[1]: up gehundino
xfst[1]:
xfst[1]: down MF+hund+Noun+Fem+NSuff+Sg
xfst[1]:
xfst[1]: set obey-flags off
variable obey-flags = off
xfst[1]: up gehundino
xfst[1]: MF+hund+Noun+Fem+NSuff+Sg
xfst[1]: set show-flags on
variable show-flags = on
xfst[1]: down [email protected]@[email protected]@[email protected]@
Flags in the sigma
xfst[1]: print sigma
MF+ Neg+ Op+ a b c d e f g h i j k l m n o r
t u v +ASuff +Acc +Adj +Aug +Dim +Fem +Nsuff
+Nize +Noun +Pl +Sg @U.MF.No@ @U.MF.Yes@
Size: 35
@U.MF.Yes@: UNIFY feature 'MF' with value 'Yes'
@U.MF.No@: UNIFY feature 'MF' with value 'No'
2 flag diacritics
Eliminating flags
xfst[1]: eliminate flag MF
3.2 Kb. 61 states 89 arcs, Circular
Size: 35
xfst[1]: print sigma
MF+ Neg+ Op+ a b c d e f g h i j k l m n o r t u v
+ASuff +Acc +Adj +Aug +Dim +Fem +NSuff +Nize +Noun +Pl +Sg
Size: 33
The eliminate flag command composes the network with constraint networks that have the same effect as the flag diacritics that are removed.
Flag Diacritics
Special symbols for encoding features, that is, attribute-value pairs.
Checked at runtime to avoid the cost of compiling them into the structure of the network
If a check fails, the path is abandoned.
Attributes and Values
Epsilon arcs with feature constraints.
@U.Feature.Value@ Unify ‘Feature’ with ‘Value’ if possible.
@C.Feature@ Set ‘Feature’ to the unspecified value.
Rules
There can be any number of attributes.
An attribute can have any number of values.
If the value of an attribute is unspecified, it unifies successfully with any given value and is set to that value.
If the value of an attribute is specified, it unifies only with the given value.
Actions: Unify, Positive Set
@U.Feature.Value@ Unify Value with the current setting of Feature, if possible. Otherwise fail.
@P.Feature.Value@ Set Feature to Value regardless of the current setting. Always succeeds.
More Actions: Negative Set, Clear
@N.Feature.Value@ Set Feature to the complement of Value regardless of the current setting. Always succeeds.
@C.Feature@ Make Feature be unspecified. Always succeeds.
More Actions: Require
@R.Feature.Value@ Succeed if Feature is set to Value. Otherwise fail.
@R.Feature@ Succeed if Feature has been set to some value. Otherwise fail.
More Actions: Equality
@E.Feature1.Feature2@ Succeed if Feature1 has the same value as Feature2. Otherwise fail.
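The actions listed on these slides can be read as a small interpreter that walks a path and keeps a table of feature settings. The Python sketch below is one such reading, not the xfst runtime. Several points are assumptions not stated on the slides: a negatively set feature unifies with any value other than the excluded one, @R.Feature.Value@ succeeds only on a positive setting of exactly Value, and @E treats two unspecified features as equal. All names (`apply_flag`, `path_ok`) are hypothetical.

```python
import re
from typing import Dict, Iterable, Optional, Tuple

FLAG = re.compile(r"^@([UPNCRE])\.([^.@]+)(?:\.([^.@]+))?@$")

# A setting is (is_positive, value); a missing key means "unspecified".
Setting = Tuple[bool, str]
State = Dict[str, Setting]


def apply_flag(state: State, symbol: str) -> Optional[State]:
    """Return the updated state, or None if the flag check fails."""
    m = FLAG.match(symbol)
    if m is None:                      # ordinary symbol: no effect on features
        return state
    action, feature, arg = m.groups()
    new = dict(state)
    current = state.get(feature)
    if action == "P":                  # positive set, always succeeds
        new[feature] = (True, arg)
    elif action == "N":                # negative set, always succeeds
        new[feature] = (False, arg)
    elif action == "C":                # clear, always succeeds
        new.pop(feature, None)
    elif action == "U":                # unify
        if current is not None:
            positive, value = current
            if positive and value != arg:
                return None
            if not positive and value == arg:   # assumption, see lead-in
                return None
        new[feature] = (True, arg)
    elif action == "R":                # require
        if current is None:
            return None
        if arg is not None and current != (True, arg):
            return None
    elif action == "E":                # equality of two features
        if state.get(feature) != state.get(arg):
            return None
    return new


def path_ok(symbols: Iterable[str]) -> bool:
    """True if every flag on the path succeeds; the path is abandoned otherwise."""
    state: Optional[State] = {}
    for sym in symbols:
        state = apply_flag(state, sym)
        if state is None:
            return False
    return True


if __name__ == "__main__":
    gehundino = ["@U.MF.Yes@", "g", "e", "h", "u", "n", "d", "@U.MF.No@", "i", "n", "o"]
    print(path_ok(gehundino))          # the MF clash blocks this path
```

With this reading, the path for gehundino fails at @U.MF.No@ because MF was already unified with Yes, which matches the empty result of `up gehundino` when flags are obeyed.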
Eliminating flags
The constraints on "@U.FEATURE.VALUE@" have the form
~[?* PROHIBIT_FLAGS ~$[ALLOW_FLAGS] SELF ?*]
Constraint for eliminating @U.MF.No@:
~[?* ["@U.MF.Yes@"] # prohibit
~$["@P.MF.No@" | "@C.MF@"] # allow
"@U.MF.No@"
?*]
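The constraint pattern above bans any path on which a prohibited flag is later followed by the flag being eliminated, unless an allowing flag intervenes. A minimal Python sketch of that test on a list of symbols follows; the function name and argument names are hypothetical, and the sketch checks paths one at a time rather than composing networks as `eliminate flag` does.

```python
from typing import List, Set


def violates(path: List[str], prohibit: Set[str], allow: Set[str], self_flag: str) -> bool:
    """True if path matches  ?* PROHIBIT_FLAGS ~$[ALLOW_FLAGS] SELF ?*  ."""
    for i, sym in enumerate(path):
        if sym not in prohibit:
            continue
        for later in path[i + 1:]:
            if later in allow:        # an allowing flag resets the feature
                break
            if later == self_flag:    # SELF reached with no allowing flag in between
                return True
    return False


if __name__ == "__main__":
    prohibit = {"@U.MF.Yes@"}
    allow = {"@P.MF.No@", "@C.MF@"}
    path = ["@U.MF.Yes@", "g", "e", "hund", "@U.MF.No@", "in", "o"]
    print(violates(path, prohibit, allow, "@U.MF.No@"))
```

Paths for which `violates` returns True are the ones the constraint network removes.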
Finnish Numerals
Numbers and Numerals
The mapping from integers 0, 1, 2, 3 … to the corresponding numerals one, two, three… is a regular relation.
Some languages have a very simple numeral system, some are more complicated: seventy-three, soixante-treize, drei-und-siebzig
We can compile transducers that map between the numbers and the corresponding numerals.
Number-to-Numeral transducer
Generation: 105 → hundred five, hundred and five, one hundred and five
Analysis: hundred five → 105
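The point of the slide is that the mapping is a relation: one number may correspond to several numerals, and each numeral maps back to one number. The Python sketch below models this for English numbers 1 to 199 with a plain set-valued function and its inverse table. It is an illustration under assumptions (the range, the word lists, and the choice of variants are mine), not a compiled transducer; it also produces "one hundred five", which the slide does not list.

```python
from typing import Dict, Set

ONES = ["", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]
TEENS = ["ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
         "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy", "eighty", "ninety"]


def below_100(n: int) -> str:
    """Numeral for 1 <= n <= 99."""
    if n < 10:
        return ONES[n]
    if n < 20:
        return TEENS[n - 10]
    tens, ones = divmod(n, 10)
    return TENS[tens] if ones == 0 else f"{TENS[tens]}-{ONES[ones]}"


def generate(n: int) -> Set[str]:
    """All numeral variants for 1 <= n <= 199 (generation direction)."""
    if not 1 <= n <= 199:
        raise ValueError("this sketch only covers 1..199")
    if n < 100:
        return {below_100(n)}
    heads = ["hundred", "one hundred"]
    rest = n - 100
    if rest == 0:
        return set(heads)
    return {f"{h}{link}{below_100(rest)}" for h in heads for link in (" ", " and ")}


# Inverse table (analysis direction), built by enumerating the relation.
ANALYZE: Dict[str, int] = {numeral: n for n in range(1, 200) for numeral in generate(n)}


def analyze(numeral: str) -> int:
    return ANALYZE[numeral]


if __name__ == "__main__":
    print(sorted(generate(105)))
    print(analyze("hundred five"))
```

A transducer gives both directions from one network; the dictionary inversion here only works because the range is small and finite.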
The Goal Ahead: Finnish
Analysis
sadanviiden
105+Sg+Gen
hundred and five (Sg Gen)
Generation
28+Ord+Pl+Gen
kahdensienkymmenensienkahdeksansien
twenty-eighth (Pl Gen)
Finnish Numerals
Compound numerals written as one word
2 • 1000 + 5 • 100 + 3 • 10 + 1 = 2531
kaksituhattaviisisataakolmekymmentäyksi
Express ordinality, number, and case
sata+Sg+Nom (100) : sata        sata+Ord+Sg+Nom (100th) : sadas
sata+Sg+Gen (100) : sadan       sata+Ord+Sg+Gen (100th) : sadannen
sata+Pl+Gen (100) : satojen     sata+Ord+Pl+Gen (100th) : sadansien
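The decomposition 2 • 1000 + 5 • 100 + 3 • 10 + 1 can be turned into the one-word numeral directly for the simplest case, the singular nominative cardinal. The Python sketch below does only that, for 1 to 9999; the word forms for the teens and the range limit are my additions, and inflected or ordinal forms (the actual subject of these slides) are out of scope. The function name is hypothetical.

```python
DIGITS = ["", "yksi", "kaksi", "kolme", "neljä", "viisi",
          "kuusi", "seitsemän", "kahdeksan", "yhdeksän"]


def finnish_cardinal(n: int) -> str:
    """Singular nominative cardinal numeral, written as one word, for 1..9999."""
    if not 1 <= n <= 9999:
        raise ValueError("this sketch only covers 1..9999")
    thousands, rest = divmod(n, 1000)
    hundreds, rest = divmod(rest, 100)
    tens, ones = divmod(rest, 10)
    parts = []
    if thousands == 1:
        parts.append("tuhat")
    elif thousands > 1:
        parts.append(DIGITS[thousands] + "tuhatta")    # multiplied unit is partitive
    if hundreds == 1:
        parts.append("sata")
    elif hundreds > 1:
        parts.append(DIGITS[hundreds] + "sataa")
    if tens == 1:
        # 10 is kymmenen; 11..19 are formed with -toista
        parts.append("kymmenen" if ones == 0 else DIGITS[ones] + "toista")
    else:
        if tens > 1:
            parts.append(DIGITS[tens] + "kymmentä")
        if ones:
            parts.append(DIGITS[ones])
    return "".join(parts)


if __name__ == "__main__":
    print(finnish_cardinal(2531))
```

For 2531 this yields the form shown on the slide.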
Singular vs. Plural
Numerals generally occur with singular nouns
kaksi+Sg+Gen kenkä+Sg+Gen
kahden kengän omistaja
(owner of two shoes)
Sets and public events may be in plural
kaksi+Pl+Gen kenkä+Pl+Gen
kaksien kenkien omistaja
(owner of two pairs of shoes)
kolme+Ord+Pl+Nom olympialainen+Pl+Nom
kolmannet olympialaiset
(third olympic games)
yksi+Pl+Nom hää+Pl+Nom
yhdet häät
(one wedding)
Morphotactics
All parts of compound numerals agree in all respects
two thousand five hundred (2500)
kaksi+Sg+Gen tuhat+Sg+Gen viisi+Sg+Gen sata+Sg+Gen
kahden tuhannen viiden sadan
two ten eighth (28th)
kaksi+Ord+Pl+Gen kymmenen+Ord+Pl+Gen kahdeksan+Ord+Pl+Gen
kahde-ns-i-en kymmene-ns-i-en kahdeksa-ns-i-en
Singular nominative is exceptional
Numeral with a noun
kaksi+Gen kenkä+Gen
kahden kengän (two shoes)
kaksi+Nom kenkä+Part
kaksi kenkää (two shoes)
Compound numeral
kaksi+Gen tuhat+Gen viisi+Gen sata+Gen kolme+Gen (2503)
kahden tuhannen viiden sadan kolmen
kaksi+Nom tuhat+Part viisi+Nom sata+Part kolme+Nom (2503)
(kaksi • tuhatta) + (viisi • sataa) + kolme
Morphological Alternations
Semiregular stem alternations
yksi+Sg+Nom : yksi (one)
yksi+Sg+Ess : yhtenä
yksi+Sg+Gen : yhden
yksi+Sg+Part : yhtä
yksi+Pl+Gen : yksien
Irregular stem alternations
yksi+Ord+Sg+Nom : ensimmäinen (first)
Regular suffix alternations
Vowel harmony
kolme+Sg+Part : kolmea vs. neljä+Sg+Part : neljää
Illative vowel
kolme+Sg+Ill : kolmeen vs. neljä+Sg+Ill : neljään
Partitive t
yksi+Sg+Part : yhtä vs. neljä+Sg+Part : neljää
Solution for Finnish
Numbers/Finnish Transducer: Maps a number with morphological tags into an inflected Finnish numeral. Encodes morphotactic constraints.
.o.
lexc source lexicon: Looping lexicon with all the forms of all Finnish single numerals concatenated in all possible ways. Composed with morphophonological rules.
Example
Numbers/Finnish Transducer
2 5 +Ord +Pl +Gen
kaksi +Ord +Pl +Gen kymmenen +Ord +Pl +Gen viisi +Ord +Pl +Gen
.o.
lexc source lexicon
kaksi +Pl +Nom kymmenen +Part VIISI +Ord +Gen
kahdet kymmentä viidennen (ungrammatical)
kaksi +Ord +Pl +Gen kymmenen +Ord +Pl +Gen viisi +Ord +Pl +Gen
kahdensien kymmenensien viidensien
Sublexicon for One
LEXICON Yksi
YKSI+Sg:yksi Nom;                    ! singular nominative
YKSI+Sg:yhde WeakGrade;              ! weak stem (most cases)
YKSI+Sg:yhte StrongGrade;            ! strong stem (essive, ill.)
YKSI+Sg:yht Par;                     ! partitive stem
YKSI:yks PlStem1;                    ! plural stem
YKSI+Ord1+Sg:ensimmäinen Nom;        ! singular nominative
YKSI+Ord1+Sg:ensimmäise AnyGrade;    ! weak/strong stem
YKSI+Ord1+Sg:ensimmäis Par;          ! partitive stem
YKSI+Ord+Sg:yhdes Nom;               ! singular nominative
YKSI+Ord+Sg:yhdenne WeakGrade;       ! weak stem
YKSI+Ord+Sg:yhdente StrongGrade;     ! strong stem
YKSI+Ord+Sg:yhdet Par;               ! partitive stem
YKSI+Ord:yhdens PlStem1;             ! plural stem
Some sublexicons
LEXICON WeakGrade
SgGen; ! Singular Genitive
PlNom; ! Plural Nominative
InvarWeak; ! Invariant (plural and singular) cases
LEXICON InvarWeak
+Tra:ksi Next; ! Translative “into”
+Ine:ssA Next; ! Inessive “in”
+Ela:ltA Next; ! Elative “from” (inside)
+Ade:llA Next; ! Adessive “on”
+Abl:ltA Next; ! Ablative “from” (outside)
+All:lle Next; ! Allative “onto”
+Abe:ttA Next; ! Abessive “without”
Sample paths for Two
kaksi+Sg+Nom : kaksi
kaksi+Sg+Gen : kahde n
kaksi+Sg+Ess : kahte na
kaksi+Sg+Par : kah TA
kaksi+Pl+Gen : kaks i en
kaksi+Pl+Ill : kaks i Vn
kaksi+Ord+Sg+Nom : kahde s
kaksi+Ord1+Sg+Nom : toinen
kaksi+Ord+Sg+Ill : kahde nte Vn
kaksi+Ord1+Sg+Ill : toise Vn
Morphophonological rules
define BackV [a | o | u];
define FrontV [ä | ö | y];
define Vow [BackV | FrontV | i | e];
define VHarmony [A -> a || BackV ~$[FrontV] _
                 .o.
                 A -> ä];
define IllativeV [V -> a || a (h) _ ,
                  V -> e || e (h) _ , … ]
define PartitiveT [T -> 0 || \Vow Vow _ ];
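The effect of these three rules on lexical strings such as kolmeTA or neljäVn can be traced with a few string operations. The Python sketch below is a rough imitation, not a translation of the xfst rules: the order of application is my choice, the illative rule is generalised to every vowel (the slide elides the remaining cases with "…"), and the final step that realises a surviving T as t is an assumption, since no such rule appears on the slide. Function names are hypothetical.

```python
import re

BACK = "aou"
FRONT = "äöy"
VOWELS = BACK + FRONT + "ie"


def partitive_t(s: str) -> str:
    """T -> 0 || \\Vow Vow _ : drop T after a non-vowel followed by a vowel."""
    return re.sub(rf"(?<=[^{VOWELS}][{VOWELS}])T", "", s)


def vowel_harmony(s: str) -> str:
    """A -> a if the nearest preceding harmonic vowel is back, otherwise A -> ä."""
    out = []
    last = None                      # "back", "front", or None if none seen yet
    for ch in s:
        if ch in BACK:
            last = "back"
        elif ch in FRONT:
            last = "front"
        out.append(("a" if last == "back" else "ä") if ch == "A" else ch)
    return "".join(out)


def illative_v(s: str) -> str:
    """V copies the preceding vowel, optionally across an h."""
    return re.sub(rf"([{VOWELS}])(h?)V", r"\1\2\1", s)


def apply_rules(s: str) -> str:
    s = partitive_t(s)
    s = vowel_harmony(s)
    s = illative_v(s)
    return s.replace("T", "t")       # assumption: remaining T surfaces as t


if __name__ == "__main__":
    for form in ["kolmeTA", "neljäTA", "kahTA", "kolmeVn", "neljäVn"]:
        print(form, "->", apply_rules(form))
```

The inputs correspond to the intermediate forms in the sample paths, with the spaces between morphemes removed.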
Example again
Numbers/Finnish Transducer
2 5 +Ord +Pl +Gen
KAKSI +Ord +Pl +Gen KYMMENEN +Ord +Pl +Gen VIISI +Ord +Pl +Gen
.o.
lexc source lexicon
.o.
morpho-phonological rules
KAKSI +Pl +Nom KYMMENEN +Part VIISI +Ord +Gen (ungrammatical)
kahdet kymmentä viidennen
KAKSI +Ord +Pl +Gen KYMMENEN +Ord +Pl +Gen VIISI +Ord +Pl +Gen
kahdensien kymmenensien viidensien
Remaining problems
Special ordinals for yksi (one), kaksi (two)
ensimmäinen (1st) vs. kahdeskymmenesyhdes (21st)
Compose the lexicon with an appropriate filter to eliminate unwanted variants.
No internal tags
2+Sg+Gen00+Sg+Gen
Delete them: 0 <- Tag || _ $[\Tag Tag+] .#. ;
Singular nominative as partitive in compounds
%+Nom -> %+Par // %+Sg %+Nom ~$Tag %+Sg _ ;
Ordinal/Plural/Case agreement
Flag diacritics!
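The tag-deletion rule removes every tag that still has a non-tag symbol followed by a tag somewhere to its right, so only the final run of tags survives. A minimal Python sketch of that behaviour on the example string follows; the tag inventory in `TAGS` is a small hypothetical subset taken from tags seen on these slides, and the tokenizer is a stand-in for the multicharacter symbols of the network.

```python
import re
from typing import List

# Hypothetical tag inventory; longer tags first so "+Ord1" wins over "+Ord".
TAGS = ["+Ord1", "+Ord", "+Sg", "+Pl", "+Nom", "+Gen", "+Par"]
TOKEN = re.compile(
    "|".join(re.escape(t) for t in sorted(TAGS, key=len, reverse=True)) + "|."
)


def tokenize(s: str) -> List[str]:
    return TOKEN.findall(s)


def is_tag(tok: str) -> bool:
    return tok in TAGS


def strip_internal_tags(s: str) -> str:
    """Mirror of  0 <- Tag || _ $[\\Tag Tag+] .#.  on a tokenized string."""
    toks = tokenize(s)
    kept = []
    for i, tok in enumerate(toks):
        rest = toks[i + 1:]
        # Right context: some non-tag symbol immediately followed by a tag.
        internal = any(not is_tag(a) and is_tag(b) for a, b in zip(rest, rest[1:]))
        if is_tag(tok) and internal:
            continue
        kept.append(tok)
    return "".join(kept)


if __name__ == "__main__":
    print(strip_internal_tags("2+Sg+Gen00+Sg+Gen"))
```

On the slide's example this leaves the digits and the final tags only.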
Flags for Finnish numerals
@U.Type.Card@ @U.Type.Ord@
@U.Number.Sg@ @U.Number.Pl@
@U.Case.Nom@ @U.Case.Gen@ @U.Case.Par@ @U.Case.Tra@
@U.Case.Ess@ @U.Case.Abe@ @U.Case.Ine@ @U.Case.Ela@
@U.Case.Ill@ @U.Case.Ade@ @U.Case.Abl@ @U.Case.All@
@U.Case.Com@ @U.Case.Ins@
3 00 +Sg +Gen
@U.Type.Card@ @U.Num.Sg@ @U.Case.Gen@ @U.Type.Card@ @U.Num.Sg@ @U.Case.Gen@
kolmen sadan
300+Sg+Gen
kolmensadan
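Because every part of the compound carries the same three unification flags, agreement reduces to checking that no attribute receives two different values along the path. The Python sketch below shows that check on feature dictionaries; the representation of each part as a dictionary and the function name are illustrative assumptions, not how the network stores flags.

```python
from typing import Dict, List, Optional


def unify_parts(parts: List[Dict[str, str]]) -> Optional[Dict[str, str]]:
    """Unify the features of all parts; return None on the first clash."""
    state: Dict[str, str] = {}
    for features in parts:
        for attr, value in features.items():
            # setdefault stores the value if the attribute is unspecified.
            if state.setdefault(attr, value) != value:
                return None
    return state


if __name__ == "__main__":
    kolmen = {"Type": "Card", "Number": "Sg", "Case": "Gen"}
    sadan = {"Type": "Card", "Number": "Sg", "Case": "Gen"}
    print(unify_parts([kolmen, sadan]))

    # The ungrammatical mix from the earlier example: kahdet kymmentä viidennen
    kahdet = {"Type": "Card", "Number": "Pl", "Case": "Nom"}
    kymmenta = {"Type": "Card", "Number": "Sg", "Case": "Par"}
    viidennen = {"Type": "Ord", "Number": "Sg", "Case": "Gen"}
    print(unify_parts([kahdet, kymmenta, viidennen]))
```

The second call fails on the first mismatching attribute, which is the path-level effect the @U flags have at runtime.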
Conclusion
Mapping from numbers to numerals can be done in a simple and elegant way even for languages with complex morphology.
Necessary for text-to-speech applications.
Tervetuloa kahdensienkymmenensienkahdeksansien olympialaisten avajaisiin!
Welcome to the opening ceremonies of the 28th Olympic Games!
Demo!