Deep Grammar Error Detection and Automated Lexical Acquisition
TRANSCRIPT
Background & Motivation Grammar Error Detection Automated Lexical Acquisition Summary
Deep Grammar Error Detection and Automated Lexical Acquisition
Steps towards Wide-Coverage Open Text Processing
Department of Computational Linguistics, Saarland University
IGK Colloquium, 17th Nov. 2005
Outline
1 Background and Motivation
   Deep Processing: State-of-the-Art
   Coverage of Deep Processing
2 Grammar Error Detection
   Previous Work on Grammar Error Detection
   Error Mining
3 Automated Lexical Acquisition
   Previous Work on Lexical Acquisition
   Statistical Lexical Type Predictor
What is deep processing?
Deep processing means maximally exploiting grammatical knowledge for language processing.
Focus on linguistic precision and semantic modelling
Grammar-centric approach
The opposite of deep is not statistical but shallow.
Why do we need deep processing?
Explicit model of grammaticality
Ability to capture subtle linguistic interactions
Semantics
Problems with deep processing
Efficiency
   Detailed language modelling creates a large search space.
   Alleviated by efficient parsing algorithms and better hardware
Specificity
   Linguistically sound vs. application-interesting
   Ranking of the results is necessary.
Robustness/Coverage
   Strict grammaticality metric
   Insufficient coverage of the grammar
   Dynamic nature of language
Robustness and specificity
Robustness and specificity are a pair of dual problems.
Grammar Engineering
   Overgeneration → specificity
   Undergeneration → robustness
Application
   Ranked output
   High coverage over noisy inputs
For deep grammars, a balance point should be found to maximize linguistic accuracy.
Robustness and specificity should come with an extra mechanism.
Coverage problem of deep processing
Road-testing the ERG over the BNC [Baldwin et al., 2004]
Test on 20,000 strings from the BNC
Full lexical span for only 32%
Among these:
   57% are parsed (overall coverage 57% × 32% ≈ 18%)
   83% of the parses are correct
   40% of parsing failures are caused by missing lexical entries
   39% of parsing failures are caused by missing constructions
The focus of this talk
Deep grammar error detection
   The lexical coverage is a major problem for deep processing.
Automated deep lexical acquisition
Symbolic approach
Inductive Logic Programming
   Background ∧ Hypothesis ⊨ Evidence
ILP-based grammar extension [Cussens and Pulman, 2000]
   After a failed parse, abduction is used to find needed edges which, if they existed, would allow a complete parse of the sentence. Linguistic constraints are applied to restrict the generation of implausible edges.
Problems
   The generated rules are either too general or too specific.
Error Mining
[van Noord, 2004]
Large hand-crafted grammars are error-prone.
Manual detection of errors is time-consuming.
Validations based on small test suites are not reliable.
Parsing failures are a good indication of (under-generating) errors.
Parsability
Definition
R(w_i ... w_j) = C(w_i ... w_j | OK) / C(w_i ... w_j)

If the parsability of a particular word sequence is (much) lower than average, it indicates that something is wrong.
Parsabilities can be calculated efficiently for a large corpus with suffix arrays and perfect hashing.
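The parsability score can be sketched in a few lines of Python; the suffix arrays and perfect hashing mentioned above are only needed to make this scale to millions of sentences, so a plain dictionary of counts is used here, over a toy corpus:

```python
from collections import Counter

def ngrams(tokens, max_n=3):
    """All n-grams of length 1..max_n in a token sequence."""
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            yield tuple(tokens[i:i + n])

def parsabilities(sentences, parsed_ok, max_n=3):
    """R(w_i..w_j) = C(w_i..w_j | OK) / C(w_i..w_j).

    sentences: list of token lists; parsed_ok: parallel list of booleans
    saying whether the grammar produced a full parse of that sentence.
    """
    total, ok = Counter(), Counter()
    for tokens, success in zip(sentences, parsed_ok):
        for g in set(ngrams(tokens, max_n)):  # count each n-gram once per sentence
            total[g] += 1
            if success:
                ok[g] += 1
    return {g: ok[g] / total[g] for g in total}

# Toy corpus: "the poor" never occurs in a successfully parsed sentence.
sents = [["the", "poor", "left"], ["the", "dog", "left"], ["a", "dog", "left"]]
okays = [False, True, True]
R = parsabilities(sents, okays)
# R[("the", "poor")] == 0.0, while R[("dog",)] == 1.0
```

An n-gram whose R is far below the corpus average is then a candidate error location.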
Error mining experiment of ERG with BNC
1.8M sentences (21.2M words) with only ASCII characters and no more than 20 words each
Running best-only parsing with PET took less than 2 days on elf

Status               Num. of Sentences   Percentage
Parsed               301,503             16.74%
No lexical span      1,260,404           69.97%
No parse found       239,272             13.28%
Edge limit reached   96                  0.01%
Error analysis
           Number   Percentage
uni-gram   2,336    10.52%
bi-gram    15,183   68.36%
tri-gram   4,349    19.58%

Table: N-grams with R < 0.1

N-gram          Count
weed            59
the poor        49
a fight         113
in connection   85
as always       84
peered at       28
the World Cup   57

Table: Examples
Pin down the errors

1.8M sent. → full lex span: 541K sent. → 22K low-parsability N-grams → bi/tri-grams → lexical errors / construction errors
Detecting lexical error
Missing lexical span
Low-parsability unigrams
Language-dependent heuristics, e.g. low-parsability bigrams starting with a determiner, like "the poor" or "a fight"
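A minimal sketch of these heuristics; the determiner list and the threshold are illustrative choices, not the exact ones used in the experiment:

```python
def likely_lexical_errors(scored_ngrams, determiners=("the", "a", "an"), threshold=0.1):
    """Flag words likely to have a missing/wrong lexical entry:
    low-parsability unigrams, plus bigrams that start with a
    determiner (the following word is then the suspect entry)."""
    suspects = set()
    for gram, r in scored_ngrams.items():
        if r >= threshold:
            continue
        if len(gram) == 1:
            suspects.add(gram[0])
        elif len(gram) == 2 and gram[0] in determiners:
            suspects.add(gram[1])
    return suspects

# Toy parsability scores, echoing the example n-grams above.
scores = {("weed",): 0.02, ("the", "poor"): 0.05, ("a", "fight"): 0.04,
          ("peered", "at"): 0.08, ("the", "dog"): 0.9}
# -> {"weed", "poor", "fight"}; "peered at" is left for construction-error analysis
```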
Unification-based approach
[Erbach, 1990, Barg and Walther, 1998, Fouvry, 2003]
Use underspecified lexical entries to parse the whole sentence
Generate lexical entries afterwards by collecting information from the full parse
An example of how the unification-based approach works

the kangaroo jumps

Lexical entries:
   "the":      [STEM <"the">, HEAD det]
   "kangaroo": [STEM <"kangaroo">]   (underspecified entry for the unknown word)
   "jumps":    [STEM <"jumps">,
                HEAD verb [PERSON [2] 3rd, NUM [3] sg],
                SUBJ <[HEAD noun [PERSON [2], NUM [3]]]>]

Phrase rules:
   head-specifier: [HEAD [5], SPR <>,
                    HEAD-DTR [HEAD [5], SPR <[1]>], NON-HEAD-DTR [1]]
   head-subject:   [SUBJ <>,
                    HEAD-DTR [SUBJ <[4]>], NON-HEAD-DTR [4]]

Parsing the sentence unifies the underspecified entry for "kangaroo" with the constraints the rules and the verb impose on the subject (HEAD noun, PERSON 3rd, NUM sg), from which a new lexical entry can be generated.
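The collect-and-generate idea behind the example above can be sketched with feature structures as nested dictionaries; the `unify` helper and the flat PERSON/NUM encoding are simplifications for illustration:

```python
def unify(a, b):
    """Unify two feature structures (nested dicts / atoms); None on clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        out = dict(a)
        for key, value in b.items():
            if key in out:
                sub = unify(out[key], value)
                if sub is None:       # feature clash somewhere below
                    return None
                out[key] = sub
            else:
                out[key] = value
        return out
    return a if a == b else None      # atoms must match exactly

# Underspecified entry for the unknown word "kangaroo": only a stem.
kangaroo = {"STEM": "kangaroo"}
# Constraint that "jumps" (via the head-subject rule) places on its subject.
subj_constraint = {"HEAD": {"CAT": "noun", "PERSON": "3rd", "NUM": "sg"}}

learned = unify(kangaroo, subj_constraint)
# -> a candidate lexical entry: a 3rd-person singular noun "kangaroo"
```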
Problems with unification-based approaches
Generated lexical entries might be:
   too general: overgeneration
   too specific: undergeneration
Computational complexity increases significantly with underspecified lexical entries, especially when two unknowns occur next to each other.
Statistical approach
[Baldwin, 2005]
Based on a set of lexical types
Treat lexical acquisition as a classification task
Generalize the acquisition model over various secondary language resources:
   POS tagger
   Chunker
   Treebank
   Dependency parser
   Lexical ontology
Importing lexicon from a semantic lexical ontology
Assumption
   There is a strong correlation between the semantic and syntactic similarity of words. [Levin, 1993]
Fact
   Over 90% of the synsets in WordNet (2.0) share at least one lexical type among all included words.
Importing lexicon from WordNet
[Baldwin, 2005]
Construct semantic neighbours (all synonyms, direct hyponyms, direct hypernyms)
Take a majority vote across the lexical types of the semantic neighbours

Improvement
   Voting is weighted and must exceed a threshold.
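A sketch of the weighted voting step; the neighbour weights, the toy neighbour map and the mini-lexicon are invented for illustration (a real implementation would query WordNet for the synonyms and direct hyper-/hyponyms):

```python
from collections import Counter

def predict_types(word, neighbours, lexicon, threshold=0.5):
    """Weighted majority vote over the lexical types of a word's
    semantic neighbours; only types whose share of the total vote
    exceeds `threshold` are proposed."""
    votes = Counter()
    for neighbour, weight in neighbours.get(word, []):
        for lextype in lexicon.get(neighbour, []):
            votes[lextype] += weight
    total = sum(votes.values())
    if total == 0:                    # no neighbour is in the lexicon
        return []
    return [t for t, v in votes.items() if v / total > threshold]

# Toy data: synonyms weighted 1.0, direct hyper-/hyponyms 0.5 (weights assumed).
neighbours = {"wallaby": [("kangaroo", 1.0), ("marsupial", 0.5), ("mammal", 0.5)]}
lexicon = {"kangaroo": ["n_intr_le"], "marsupial": ["n_intr_le"],
           "mammal": ["n_intr_le", "n_mass_le"]}
# -> ["n_intr_le"]  (2.0 of 2.5 total votes, above the 0.5 threshold)
```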
Importing lexicon from WordNet
Results

[Figure: "Importing Lexicon from WordNet", showing F-scores (roughly 0.25-0.75) for each category: Noun, Verb, Adj, Adv, Overall]

The sparse ERG lexicon (as compared to WordNet) makes the voting less reliable.
Maximum entropy model based lexical type predictor
p(t, c) = exp(Σ_i θ_i f_i(t, c)) / Σ_{t′∈T} exp(Σ_i θ_i f_i(t′, c))

A statistical classifier that predicts, for each occurrence of an unknown word or missing lexical entry:
   Input: features from the context
   Output: atomic lexical types
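The formula can be sketched directly in code; the two features, their weights and the context encoding below are invented for illustration:

```python
import math

def me_probability(t, c, T, features, theta):
    """p(t, c) = exp(sum_i theta_i f_i(t, c)) / sum_{t' in T} exp(sum_i theta_i f_i(t', c))"""
    def score(t_):
        return sum(theta[i] * f(t_, c) for i, f in enumerate(features))
    z = sum(math.exp(score(t_)) for t_ in T)   # normalisation over all lexical types
    return math.exp(score(t)) / z

# Two toy binary features over a context c (a dict of observations).
features = [
    lambda t, c: 1.0 if t.startswith("n_") and c["suffix"] == "s" else 0.0,
    lambda t, c: 1.0 if t.startswith("v_") and c["prev_tag"] == "noun" else 0.0,
]
theta = [1.2, 0.8]
T = ["n_intr_le", "v_np_trans_le"]
c = {"suffix": "s", "prev_tag": "noun"}
p = me_probability("n_intr_le", c, T, features, theta)
# p > 0.5: the noun type wins, since its feature carries the larger weight
```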
Atomic lexical types
The lexical information is encoded in atomic lexical types.
Attribute-value structures can be decomposed into atomic lexical types: a type t with a disjunctive feature value [F a | b] is split into two atomic subtypes, t_a and t_b.
Baseline models
Select the majority lexical type for each POS
POS    Majority Lexical Type
noun   n_intr_le
verb   v_np_trans_le
adj.   adj_intrans_le
adv.   adv_int_vp_le

General-purpose POS taggers trained with lexical types: TnT, MXPOST
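The majority baseline is straightforward to sketch; the training triples (and the type n_plur_le) are invented for illustration:

```python
from collections import Counter, defaultdict

def train_majority_baseline(tagged_words):
    """tagged_words: (word, pos, lexical_type) triples from the treebank.
    Returns a map POS -> most frequent lexical type for that POS."""
    by_pos = defaultdict(Counter)
    for _, pos, lextype in tagged_words:
        by_pos[pos][lextype] += 1
    return {pos: counts.most_common(1)[0][0] for pos, counts in by_pos.items()}

# Toy training data; unknown words are then assigned the majority type of their POS.
train = [("dog", "noun", "n_intr_le"), ("idea", "noun", "n_intr_le"),
         ("scissors", "noun", "n_plur_le"), ("sees", "verb", "v_np_trans_le")]
baseline = train_majority_baseline(train)
# baseline["noun"] == "n_intr_le"
```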
Basic features
Prefix/suffix of the word
Context words and their lexical types

Model      Precision
Baseline   30.7%
TnT        40.4%
MXPOST     40.2%
ME basic   50.0%
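A sketch of the basic feature extraction; the affix length, the window size and the context lexical types (d_the_le, v_intr_le) are illustrative assumptions:

```python
def basic_features(tokens, i, context_types, max_affix=3):
    """Feature strings for the word at position i: prefixes/suffixes of
    the word itself, plus surrounding words and their lexical types."""
    w = tokens[i]
    feats = [f"prefix={w[:n]}" for n in range(1, min(max_affix, len(w)) + 1)]
    feats += [f"suffix={w[-n:]}" for n in range(1, min(max_affix, len(w)) + 1)]
    for offset in (-2, -1, 1, 2):             # a two-word window on each side
        j = i + offset
        if 0 <= j < len(tokens):
            feats.append(f"word[{offset}]={tokens[j]}")
            if j in context_types:            # types of known context words
                feats.append(f"type[{offset}]={context_types[j]}")
    return feats

feats = basic_features(["the", "kangaroo", "jumps"], 1,
                       {0: "d_the_le", 2: "v_intr_le"})
# includes "prefix=kan", "suffix=roo", "word[-1]=the", "type[1]=v_intr_le"
```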
Partial parsing results
[Figure: a partial parse chart over positions 0-4 with edges a, b, c, d]

Model      Precision
Baseline   30.7%
TnT        40.4%
MXPOST     40.2%
ME basic   50.0%
ME +pp     50.5%
Turning to the disambiguation model
Generate top-n candidate entries for the unknown word
Parse the sentence with the candidate entries
Use the disambiguation model to select the best parse
Pick the corresponding entry
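The steps above can be sketched as a loop, with stand-ins for the deep parser and the parse disambiguation model (both hypothetical here):

```python
def acquire_entry(sentence, unknown, candidate_types, parse, rank):
    """Parse the sentence once per candidate lexical type for the unknown
    word and keep the entry behind the best-ranked parse.

    `parse(sentence, entry)` and `rank(parse_result)` are stand-ins for
    the deep parser and the disambiguation model."""
    best_entry, best_score = None, float("-inf")
    for lextype in candidate_types:
        entry = (unknown, lextype)
        result = parse(sentence, entry)
        if result is None:            # this candidate entry yields no parse
            continue
        score = rank(result)
        if score > best_score:
            best_entry, best_score = entry, score
    return best_entry

# Toy stand-ins: only the noun entry parses, so it is selected.
parse = lambda s, e: {"entry": e} if e[1].startswith("n_") else None
rank = lambda r: 1.0
chosen = acquire_entry("the kangaroo jumps", "kangaroo",
                       ["v_np_trans_le", "n_intr_le"], parse, rank)
# chosen == ("kangaroo", "n_intr_le")
```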
Experiment
Results

[Figure: "DLA with LinGO ERG", showing precision (roughly 30-65%) for the baseline, tagger, ME(-pp), ME(+pp) and +disambiguation models]
What has been done?
Error-mining-based lexical error detection
   The experiment with the ERG and the BNC shows that a major part of parsing failures is due to missing lexical entries.
   Some heuristic rules are used to discover missing lexical entries.
Statistical lexical acquisition
   A maximum entropy based lexical type prediction model is designed and evaluated with various feature templates.
   A lexical ontology based acquisition method is tried.
   A disambiguation model is incorporated to enhance robustness.
Baldwin, T. (2005). Bootstrapping deep lexical resources: Resources for courses. In Proceedings of the ACL-SIGLEX Workshop on Deep Lexical Acquisition, pages 67–76, Ann Arbor, Michigan. Association for Computational Linguistics.
Baldwin, T., Bender, E. M., Flickinger, D., Kim, A., and Oepen, S. (2004). Road-testing the English Resource Grammar over the British National Corpus. In Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC 2004), Lisbon, Portugal.
Barg, P. and Walther, M. (1998). Processing unknown words in HPSG. In Proceedings of the 36th Conference of the ACL and the 17th International Conference on Computational Linguistics, Montreal, Quebec, Canada.
Cussens, J. and Pulman, S. (2000). Incorporating linguistic constraints into Inductive Logic Programming. In Proceedings of the Fourth Conference on Computational Natural Language Learning and of the Second Learning Language in Logic Workshop.
Erbach, G. (1990). Syntactic processing of unknown words. IWBS Report 131, IBM, Stuttgart.
Fouvry, F. (2003). Lexicon acquisition with a large-coverage unification-based grammar. In Companion to the 10th Conference of EACL, pages 87–90, ACL, Budapest, Hungary.
Levin, B. (1993). English Verb Classes and Alternations. University of Chicago Press, Chicago, USA.
van Noord, G. (2004). Error mining for wide-coverage grammar engineering. In Proceedings of the 42nd Meeting of the Association for Computational Linguistics (ACL'04), Main Volume, pages 446–453, Barcelona, Spain.