Tips and Tricks … with INTEX/NOOJ
Tamás Váradi
Institute for Linguistics Research
Hungarian Academy of Sciences
Max Silberztein
University of Franche-Comté
Outline
● Why should INTEX/NOOJ be a tool of choice?
● raising language awareness
● studying linguistics
  – lexical analysis
    ● morphology
      – paradigms
      – word formation
    ● automatic lexical acquisition
  – syntax
    ● local grammars
  – semantic tagging
List of useful features
● instant lexical lookup
● linguistically sophisticated lexicon
● intuitive graphical interface
● fast, robust, finite-state technology
● corpus, lexicon, grammar handled uniformly
● instant confirmation from corpus
● can be used at different levels of competence
  ● simple corpus query tool
  ● grammar development environment
  ● research tool for NLP projects
Morphology I - Inflection
Paradigms are handled in the form of finite-state transducers (FSTs)
Morphology I - Inflection
Stem variants are processed with operations on strings
L = move left, erasing one character
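The string operation above can be illustrated with a minimal Python sketch. It is only a simplified model, not the INTEX/NOOJ implementation: it assumes that each `L` in a command deletes the last character of the current form and that every other character is appended as-is. The function name `apply_suffix_command` and the sample commands are hypothetical.

```python
def apply_suffix_command(stem: str, command: str) -> str:
    """Apply a simplified inflection command to a stem.

    Assumption (not taken from the slides): an uppercase 'L' erases the
    last character of the current form; any other character is appended.
    Suffix material is therefore expected to be lowercase.
    """
    form = list(stem)
    for ch in command:
        if ch == "L":
            if form:          # ignore 'L' when nothing is left to erase
                form.pop()
        else:
            form.append(ch)
    return "".join(form)


if __name__ == "__main__":
    print(apply_suffix_command("city", "Lies"))    # cities
    print(apply_suffix_command("knife", "LLves"))  # knives
```

A command such as `Lies` thus encodes both the stem change (drop the final letter) and the ending to attach, so one paradigm can cover every stem that behaves the same way.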
Morphology II - Derivation
● All the forms derived from the root ‘fran-’
● Ideal to learn and experiment with morphological segmentation
Automatic lexical extraction
Store any sequence of letters that is followed by -ize or -ify in the variable $Root
Produce the lexical entry:
wordform: $Root+$Suf,
lemma: $Root
part of speech: V
synsem: +V
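The extraction rule above can be approximated outside the tool with a regular expression. The sketch below is an illustration in Python, not the graph shown on the slide: the pattern, the function name `extract_entry` and the dictionary layout are assumptions, with the named groups `Root` and `Suf` standing in for the variables $Root and $Suf.

```python
import re

# Hypothetical stand-in for the graph: any sequence of letters goes into
# Root, and the following -ize or -ify goes into Suf.
PATTERN = re.compile(r"^(?P<Root>[a-z]+)(?P<Suf>ize|ify)$")


def extract_entry(wordform: str):
    """Return a lexical entry for a word in -ize/-ify, or None if no match."""
    match = PATTERN.match(wordform.lower())
    if match is None:
        return None
    root, suf = match.group("Root"), match.group("Suf")
    return {
        "wordform": root + suf,  # $Root+$Suf
        "lemma": root,           # $Root
        "pos": "V",
        "synsem": "+V",
    }


if __name__ == "__main__":
    for word in ["modernize", "simplify", "table", "size"]:
        print(word, extract_entry(word))
```

Note that a purely formal pattern like this also accepts words such as "size" (root "s"), which is why checking the root against the lexicon is a useful further step.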
Lexical constraints
Check whether the string stored in $Root is in the lexicon as an A with the feature +Nation
Produce the lexical entry:
wordform: $Root+$Suf,
lemma: $Root
part of speech: V
synsem: +V
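A lexical constraint of this kind can be sketched in Python as a lookup that must succeed before the entry is produced. Everything here is illustrative: the toy `LEXICON`, its three entries, and the function names are assumptions made for the example, not data or API from INTEX/NOOJ.

```python
import re

# Hypothetical toy lexicon: lemma -> (part of speech, set of features).
LEXICON = {
    "american": ("A", {"Nation"}),
    "modern": ("A", set()),
    "organ": ("N", set()),
}

# Same assumed pattern as for plain extraction: letters, then -ize or -ify.
PATTERN = re.compile(r"^(?P<Root>[a-z]+)(?P<Suf>ize|ify)$")


def satisfies_constraint(root: str, pos: str = "A", feature: str = "Nation") -> bool:
    """True if root is listed with the given part of speech and feature."""
    entry = LEXICON.get(root)
    return entry is not None and entry[0] == pos and feature in entry[1]


def extract_constrained(wordform: str):
    """Produce the lexical entry only when $Root passes the lexicon check."""
    match = PATTERN.match(wordform.lower())
    if match is None or not satisfies_constraint(match.group("Root")):
        return None
    root, suf = match.group("Root"), match.group("Suf")
    return {"wordform": root + suf, "lemma": root, "pos": "V", "synsem": "+V"}


if __name__ == "__main__":
    for word in ["Americanize", "modernize", "organize"]:
        print(word, extract_constrained(word))
```

With this toy lexicon only "Americanize" yields an entry: "modern" is an adjective without +Nation and "organ" is a noun, so both are rejected.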
Syntax
● Grammars are defined as graphs that rely on information stored in the lexicon (minimally lemma and POS)
Instant feedback from corpus
Labelled bracketing
● hit strings may be tagged (merge mode)
  ● [NP a soft, slow step NP]
● or replaced with bracketing
  ● [NP NP]
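The difference between the two modes can be shown with a small Python sketch. It assumes the hit has already been located (here by a plain substring search rather than by a grammar), and the carrier sentence and the function name `annotate` are hypothetical; only the bracket notation comes from the slide.

```python
def annotate(text: str, start: int, end: int, label: str = "NP", mode: str = "merge") -> str:
    """Bracket the hit text[start:end] in merge mode or replace mode."""
    opening, closing = f"[{label}", f"{label}]"
    if mode == "merge":
        # Keep the matched string and wrap it in labelled brackets.
        middle = f"{opening} {text[start:end]} {closing}"
    elif mode == "replace":
        # Drop the matched string and keep only the brackets.
        middle = f"{opening} {closing}"
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return text[:start] + middle + text[end:]


if __name__ == "__main__":
    sentence = "He heard a soft, slow step behind him."  # invented example sentence
    hit = "a soft, slow step"
    start = sentence.index(hit)
    end = start + len(hit)
    print(annotate(sentence, start, end, mode="merge"))
    print(annotate(sentence, start, end, mode="replace"))
```

Merge mode keeps the text readable as an annotated corpus, while replace mode reduces each hit to its category, which can help when looking at the larger structure around it.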
Disambiguation
● Very – adjective or adverb?
Recursion – embedded graphs
An exercise in semantic tagging
● Expressions of time
Finally, not for the faint-hearted …
● the big picture
Conclusions
● Teaching linguistic analysis by doing it
● INTEX/NooJ is [det THE] technology to use
honestly…
All welcome to have a go at it
Thank you for your attention!