Natural Language Processing (overview)
courses.cs.washington.edu/courses/cse415/06wi/notes/BenderLecture.pdf
Lecture outline
- What is NLP?
- Applications & Approaches
- Resources used in NLP
- NLP subtasks; Ambiguity
- Evaluation in NLP
- Precision grammar engineering
- NLP around UW
What is NLP?
- Processing language by computers
- Distinct from speech processing
- Not necessarily linguistically motivated
Applications (1/3)
- Linguistic research
- Grammar checking/spell checking
- Computer assisted language learning (CALL)
- Assistive & augmentative communication (AAC)
Applications (2/3)
- Machine translation, machine assisted translation
- Information retrieval
- Information extraction
-- Monolingual & multilingual
Applications (3/3)
- HCI
- Natural language database access
- UI navigation
- Automated customer service
- Games
- Other?
Approaches
- Knowledge engineering
- Machine learning
- Hybrid
Resources
- Dictionaries (monolingual, bilingual)
- Corpora
- Annotated corpora
-- Tagged corpora (POS, word sense, ...)
-- Treebanks
-- Aligned bilingual/multilingual corpora
Useful for...
- Supervised learning
- Gold standard/evaluation
- Unsupervised/semi-supervised learning of the next layer of linguistic structure
- Linguistic hypothesis testing
Sources of Resources
- LDC: Linguistic Data Consortium
- ELDA: Evaluations and Language resources Distribution Agency
- Rosetta: All Languages Archive
NLP subtasks (1/3)
- Language identification
- Part of speech tagging
- Word sense disambiguation
- Named entity recognition
- NP/other phrase detection
NLP subtasks (2/3)
- Stemming/morphological analysis
- Segmentation (documents to sentences, sentences to words)
- Sentence, phrase, word alignment (of bitext)
NLP subtasks (3/3)
- Parsing (string to tree; string to semantics)
- Generation (semantics to string)
- Reference resolution
- Speech act recognition
- Dialogue planning
- Others?
Ambiguity
- Natural language wasn’t designed to be processed by computers.
- Ambiguity (local and global) at every level of structure
- Potentially want to return multiple analyses... while also being able to rank them
Ambiguity examples
- Word boundary: Dungeon of Spit
- Part of speech: read, record, talk
- Morphological analysis:
-- kayaking, singing, sing, anything
-- walks, unwrappable
More ambiguity examples
- Syntax:
-- Kim is our local unicode expert.
-- Have that report on my desk by Friday.
- Semantics: Every cat chased some dog.
- Speech act: Can you pass the salt?
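The scope ambiguity in "Every cat chased some dog" can be made concrete with a small model check. This is only an illustrative sketch: the toy model (`cats`, `dogs`, `chased`) and both function names are assumptions introduced here, not part of the lecture.

```python
# Hypothetical toy model: two cats, two dogs, and who chased whom.
cats = {"c1", "c2"}
dogs = {"d1", "d2"}
chased = {("c1", "d1"), ("c2", "d2")}


def every_cat_some_dog(cats, dogs, chased):
    """Reading 1: each cat chased some dog, possibly a different one per cat."""
    return all(any((c, d) in chased for d in dogs) for c in cats)


def some_dog_every_cat(cats, dogs, chased):
    """Reading 2: there is one particular dog that every cat chased."""
    return any(all((c, d) in chased for c in cats) for d in dogs)


print(every_cat_some_dog(cats, dogs, chased))  # True
print(some_dog_every_cat(cats, dogs, chased))  # False
```

In this model the two readings come apart, which is why a system may need to return both analyses and rank them.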
Still more ambiguity examples
- Reference resolution: The police denied the protesters a permit because they feared/advocated violence.
- String realization:
-- Kim gave the dog a bone.
-- Kim gave a bone to the dog.
- Addressee recognition
Evaluation
- Requires:
-- Test set with gold standard answers for comparison
-- Metric(s) of comparison
-- Baseline strategy to compare against
- All three of these can be non-obvious in NLP
Evaluation
- Validation: Does my system behave the way I think it behaves?
- Regression testing: What did I break today?
- Experimental results: How does my system compare to other systems?
Humans are expensive
- Evaluation processes should be automated wherever possible:
-- Speed
-- Cost
-- Integration into development cycle
Easy case: POS tagging
- Gold standard: A corpus with POS tags annotated (by humans)
- Evaluation metric: Precision (number of correct tags / total tags)
- Baseline: Random assignment of possible tags for each lexical item
- Wrinkle: Count performance on unambiguous items?
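The POS tagging setup above can be sketched in a few lines. Everything concrete here is an assumption made for illustration: the toy `gold` data, the `lexicon` of possible tags, the `system_tags` output, and the function names. The `skip_unambiguous` flag shows the "wrinkle" of whether words with only one possible tag should count.

```python
import random

# Hypothetical gold standard: (word, human-annotated tag) pairs.
gold = [("the", "D"), ("record", "N"), ("can", "V"), ("talk", "V")]
# Hypothetical lexicon listing the possible tags for each lexical item.
lexicon = {"the": ["D"], "record": ["N", "V"], "can": ["N", "V"], "talk": ["N", "V"]}


def accuracy(predicted, gold, lexicon=None, skip_unambiguous=False):
    """Number of correct tags / total tags, optionally ignoring one-tag words."""
    pairs = list(zip(predicted, gold))
    if skip_unambiguous:
        pairs = [(p, (w, t)) for p, (w, t) in pairs if len(lexicon[w]) > 1]
    if not pairs:
        return 0.0
    correct = sum(1 for p, (_, t) in pairs if p == t)
    return correct / len(pairs)


def random_baseline(words, lexicon, rng):
    """Baseline: pick one of the possible tags for each word at random."""
    return [rng.choice(lexicon[w]) for w in words]


words = [w for w, _ in gold]
system_tags = ["D", "N", "N", "V"]  # hypothetical system output
baseline_tags = random_baseline(words, lexicon, random.Random(0))

print(accuracy(system_tags, gold))                                   # 0.75
print(accuracy(system_tags, gold, lexicon, skip_unambiguous=True))   # 2 of 3 correct
print(accuracy(baseline_tags, gold))                                 # depends on the seed
```

Dropping unambiguous items lowers the score here because the easy word "the" no longer counts, which is the effect the wrinkle points at.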
Harder case: Parsing
- How to create a gold standard?
- Sources of variation:
-- Genuine structural ambiguity (usually, but not always, resolved in context)
-- Different styles of representation/different linguistic theories
Examples
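[Parse tree figure for "Kim saw the students with the telescope". Node labels in extraction order: S NP Kim VP V saw NP D the NOM N students PP P with NP D the N telescope]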
Examples
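[Parse tree figure for "Kim saw the students with the telescope". Node labels in extraction order: S NP Kim VP VP V saw NP D the N students PP P with NP D the N telescope]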
Examples
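[Parse tree figure for "Kim saw the students with the telescope". Node labels in extraction order: S NP NOM Kim VP V saw NP D the NOM NOM N students PP P with NP D the NOM N telescope]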
Examples
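[Parse tree figure for "Kim saw the students with the telescope". Node labels in extraction order: S NP Kim VP V saw NP D the N students PP P with NP D the N telescope]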
Harder case: Parsing
- In practice, the most common gold standard is the Penn Treebank
- 1 million words of hand-parsed Wall Street Journal text + 1 million words of hand-parsed Brown corpus
- More or less internally consistent; not consistent with any particular linguistic theory
Harder case: Parsing
- Evaluation metric?
- How many sentences got exactly the gold standard tree
- A more sophisticated solution is PARSEVAL (there are others)
PARSEVAL
- Labeled precision: $\frac{\text{correct constituents in candidate parse}}{\text{constituents in candidate parse}}$
- Labeled recall: $\frac{\text{correct constituents in candidate parse}}{\text{constituents in gold standard parse}}$
- Crossing brackets: (A (B C)) vs. ((A B) C)
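A rough sketch of the PARSEVAL measures follows, using the two attachments of "Kim saw the students with the telescope" as gold and candidate. The nested-tuple tree encoding, the function names, and the choice of which tree counts as gold are assumptions for illustration. As a simplification, part-of-speech nodes directly over a word are not counted, and constituents are held in sets rather than multisets.

```python
def constituents(tree, start=0):
    """Return (set of (label, start, end) spans, end position) for a nested-tuple tree."""
    label, children = tree[0], tree[1:]
    if len(children) == 1 and isinstance(children[0], str):
        return set(), start + 1  # POS node over a single word: not counted
    spans, pos = set(), start
    for child in children:
        child_spans, pos = constituents(child, pos)
        spans |= child_spans
    spans.add((label, start, pos))
    return spans, pos


def parseval(candidate, gold):
    """Labeled precision, labeled recall, and number of crossing candidate brackets."""
    cand, _ = constituents(candidate)
    ref, _ = constituents(gold)
    correct = cand & ref
    crossing = sum(
        1 for (_, s, e) in cand
        if any(s < gs < e < ge or gs < s < ge < e for (_, gs, ge) in ref)
    )
    return len(correct) / len(cand), len(correct) / len(ref), crossing


pp = ("PP", ("P", "with"), ("NP", ("D", "the"), ("N", "telescope")))
the_students = ("NP", ("D", "the"), ("N", "students"))
kim = ("NP", ("N", "Kim"))

# Hypothetical gold: the PP attaches to the noun phrase.
gold = ("S", kim, ("VP", ("V", "saw"), ("NP", the_students, pp)))
# Hypothetical candidate: the PP attaches to the verb phrase.
candidate = ("S", kim, ("VP", ("VP", ("V", "saw"), the_students), pp))

precision, recall, crossing = parseval(candidate, gold)
print(precision, recall, crossing)  # 6/7, 6/7, and 1 crossing bracket
```

The candidate's inner VP spanning "saw the students" crosses the gold NP spanning "the students with the telescope", which is the (A (B C)) vs. ((A B) C) pattern.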
Harder case: Parsing
- What would be a sensible baseline?
- Randomly choosing among all possible structures assigned by the grammar
- Comparison to other existing systems
Even harder case: MT
- (NB: Human evaluation is particularly expensive in this case.)
- What should be the gold standard?
- Are all things that differ from the gold standard necessarily wrong?
- More so than with parsing?
MT and BLEU
- Papineni et al 2002: Bleu: a Method for Automatic Evaluation of Machine Translation
- A good translation will have a distribution of n-grams similar to other good translations
BLEU
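- Modified n-gram precision:
$$p_n = \frac{\sum_{C \in \{\text{Candidates}\}} \sum_{n\text{-gram} \in C} \text{Count}_{\text{clip}}(n\text{-gram})}{\sum_{C' \in \{\text{Candidates}\}} \sum_{n\text{-gram}' \in C'} \text{Count}(n\text{-gram}')}$$
- Geometric mean of n-gram precisions for different N, plus a brevity penalty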
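The two ingredients above, clipped n-gram precision and the brevity penalty, can be sketched as follows. This is a minimal illustration and not a reference implementation: the function names and data layout (token lists, with a list of references per candidate) are assumptions, weights are uniform, no smoothing is applied, and a non-empty candidate corpus is assumed.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidates, references, n):
    """Clipped n-gram precision over a corpus of candidate translations."""
    clipped, total = 0, 0
    for cand, refs in zip(candidates, references):
        cand_counts = Counter(ngrams(cand, n))
        max_ref = Counter()  # highest count of each n-gram in any single reference
        for ref in refs:
            for gram, count in Counter(ngrams(ref, n)).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped += sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
        total += sum(cand_counts.values())
    return clipped / total if total else 0.0


def bleu(candidates, references, max_n=4):
    """Geometric mean of p_1..p_max_n times the brevity penalty."""
    precisions = [modified_precision(candidates, references, n) for n in range(1, max_n + 1)]
    if min(precisions) == 0:
        return 0.0
    log_mean = sum(math.log(p) for p in precisions) / max_n
    c = sum(len(cand) for cand in candidates)
    # Effective reference length: the reference closest in length to each candidate.
    r = sum(min((abs(len(ref) - len(cand)), len(ref)) for ref in refs)[1]
            for cand, refs in zip(candidates, references))
    brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
    return brevity_penalty * math.exp(log_mean)


refs = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
repeated = "the the the the the the the".split()
print(modified_precision([repeated], [refs], 1))  # 2/7: "the" is clipped to 2 occurrences
print(bleu([refs[0]], [refs]))                    # 1.0 for an exact match with a reference
```

Clipping is what stops the degenerate candidate from scoring 7/7 on unigrams, and the brevity penalty stops very short candidates from winning on precision alone.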
Evaluation in NLP summary
- It’s always important
- The nature of the tasks often makes it hard to define a gold standard and evaluation metric
- Gold standards can also be expensive
Natural language syntax & semantics
- Constituent structure
- Mapping of linear string to predicate-argument structure (word order, case, agreement)
- Long distance dependencies
-- What did Kim think Pat said Chris saw?
- Idioms, collocations
Formal/‘Generative’ Grammars
- Characterize a set of strings (phrases and sentences)
- These strings should correspond to those that native speakers find acceptable
- Assign one or more syntactic structures to each string
- Assign one or more semantic structures to each string
Formal/‘Generative’ Grammars
No complete generative grammar has ever been written for any language
Precision Computational Grammars
- Knowledge engineering of formal grammars, for:
-- Parsing: assigning syntactic structure and semantic representation to strings
-- Generation: assigning surface strings to semantic representations
Hurdles
- Efficient processing (Oepen et al 2002)
- Ambiguity resolution (Baldridge & Osborne 2003, Toutanova et al 2005, Riezler et al 2002)
- Domain portability (Baldwin et al 2005)
- Lexical acquisition (Baldwin & Bond 2003, Baldwin 2005)
- Extragrammatical/ungrammatical input (Baldwin et al 2005)
- Scaling to many languages
The Grammar Matrix: Overview
- Motivation
- HPSG
- Semantic representations
- Cross-linguistic core
- Modules
Matrix: Motivation
- English Resource Grammar: 140,000 lines of code (25,000 exclusive of lexicon)
- ~3000 types
- 16+ person-years of effort
- Much of that is useful in other languages
- Reduces the cost of developing new grammars
Matrix: Motivation
- Hypothesis testing (monolingual and cross-linguistic)
-- Interdependencies between analyses
-- Adequacy of analyses for naturally occurring text
Matrix: Motivation
- Promote consistent semantic representations
-- Reuse downstream technology in NLU applications while changing languages
-- Transfer-based (symbolic or stochastic MT)
HPSG
- Head-Driven Phrase Structure Grammar (Pollard & Sag 1994)
- Mildly context-sensitive (Joshi et al 1991)
- Typed feature structures
- Declarative, order-independent, constraint-based formalism
An HPSG consists of
- A collection of feature-structure descriptions for phrase structure rules and lexical entries
- Organized into a type hierarchy, with supertypes encoding appropriate features and constraints inherited by subtypes
- All rules and entries contain both syntactic and semantic information
An HPSG is used
- By a parser to assign structures and semantic representations to strings
- By a generator to assign structures and strings to semantic representations
- Rules, entries, and structures are DAGs, with type names labeling the nodes
- Constraints on rules and entries are combined via unification
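To give a feel for how constraints combine via unification, here is a deliberately simplified sketch. It treats feature structures as nested Python dicts with atomic string values; it leaves out the type hierarchy and the structure sharing (reentrancy) that real HPSG feature structures rely on. The function name `unify` and the example structures are assumptions made for illustration.

```python
def unify(a, b):
    """Unify two feature structures (nested dicts with atomic values).

    Returns the combined structure, or None if the constraints conflict.
    """
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for feature, value in b.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None  # conflicting values somewhere below this feature
                result[feature] = merged
            else:
                result[feature] = value
        return result
    return a if a == b else None  # atomic values must match exactly


# Hypothetical constraint from a rule: the subject must be headed by a noun.
rule = {"HEAD": "verb", "SUBJ": {"HEAD": "noun"}}
# Hypothetical constraint from a lexical entry: the subject is nominative.
entry = {"SUBJ": {"CASE": "nom"}, "COMPS": "empty"}

print(unify(rule, entry))
# {'HEAD': 'verb', 'SUBJ': {'HEAD': 'noun', 'CASE': 'nom'}, 'COMPS': 'empty'}
print(unify(rule, {"HEAD": "noun"}))  # None: 'verb' and 'noun' do not unify
```

Because unification only adds compatible information, the order in which constraints are combined does not change the result, which is what makes the formalism declarative and order-independent.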
Example rule type
head-subj-phrase: binary-headed-phrase & head-compositional &
[ SUBJ 〈 〉,
COMPS [1],
HEAD-DTR [ SUBJ 〈 [2] 〉, COMPS [1] ],
NON-HEAD-DTR [2] ]
Example rule type
head-final: binary-headed-phrase &
[ HEAD-DTR [1],
NON-HEAD-DTR [2],
ARGS 〈 [2], [1] 〉 ]
subj-head: head-subj-phrase & head-final
Example parse
[ HEAD verb, SUBJ 〈 〉, COMPS 〈 〉 ]
-- [1] [ HEAD noun, SPR 〈 〉, COMPS 〈 〉 ] Kim
-- [ HEAD verb, SUBJ 〈 [1] 〉, COMPS 〈 〉 ] danced
Semantic Representations
- Not going for an interlingua
- Not representing connection to world knowledge
- Not representing lexical semantics (the meaning of life is life’)
- Making explicit the relationships among parts of a sentence
Semantic Representations
- Kim gave a book to Sandy
- give(e,x,y,z), name(x,‘Kim’), book(y), name(z,‘Sandy’), past(e)
Semantic Representations
- Sandy was given a book by Kim.
- A book was given to Sandy by Kim.
- Kim continues to give books to Sandy.
- This is the book that Kim gave Sandy.
- Which book did Kim give Sandy?
- Which book do people often seem to forget that Pat knew Kim gave to Sandy?
- This book was difficult for Kim to give to Sandy.
Semantic representations
- Languages may still differ:
- Lexical predicates
-- Japanese: kore, sore, are
- Grammaticized tense/aspect, discourse status
- Ways of saying
-- make a wish, center divider
Matrix Architecture
- Cross-linguistic core encoding language universals
- Set of mutually-compatible ‘modules’ encoding recurring, but non-universal patterns
- Rapid prototyping of precision grammars
- Ongoing development through Ling 567
NLP around UW
Professional MA in Computational Linguistics http://www.compling.washington.edu
Computational Linguistics Lab http://depts.washington.edu/uwcl
Turing Center http://turing.cs.washington.edu
SSLI Lab http://ssli.ee.washington.edu
MS/UW Symposium in Computational Linguistics http://depts.washington.edu/uwcl/msuw/symposium.html
iSchool, Med School, ...