![Page 1: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/1.jpg)
Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models
Yuval Marton
Ph.D. Dissertation Defense
Department of Linguistics
University of Maryland
![Page 2: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/2.jpg)
Dissertation Theme
• Hybrid knowledge/corpus-based statistical NLP models using fine-grained linguistic soft constraints
  – Syntactic (parsing) in statistical machine translation
  – Semantic (words) in word-pair similarity tasks
  – Semantic (phrases) in statistical machine translation
• Unified corpus-based model with soft linguistic constraints
Yuval Marton, Dissertation Defense
![Page 3: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/3.jpg)
Pure vs. Hybrid Models
• Pure models
  – Corpus-based, data-driven, distributional, statistical
    • Statistical machine translation
    • Distributional profiles (context vectors)
  – Manually crafted linguistic knowledge (rules, word grouping by concept), theory-driven
    • Rule-based / syntax-driven machine translation
    • WordNet/thesaurus-based semantic similarity measures
• Hybrid models
  – Here: bias data-driven models with linguistic constraints
![Page 4: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/4.jpg)
Hard and Soft Constraints
• Hard constraints
  – {0, 1}; in/out
  – Decrease search space
  – Theory-driven
  – Faster, slimmer
• Soft constraints
  – [0, 1]; fuzzy
  – Only bias the model
  – Data-driven: let patterns emerge
[Figure: two diagrams of the universe of candidates, one cut by a hard constraint and one shaded by a soft constraint]
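The contrast between the two kinds of constraint can be sketched in a few lines of Python. This is only an illustration under assumed names: `hard_filter`, `soft_rerank`, the candidate dictionaries, and all scores and weights are hypothetical and do not come from the dissertation.

```python
def hard_filter(candidates, allowed):
    """Hard constraint: candidates that violate it leave the search space."""
    return [c for c in candidates if allowed(c["text"])]


def soft_rerank(candidates, feature, weight):
    """Soft constraint: every candidate stays; the feature only shifts scores."""
    rescored = [
        dict(c, score=c["score"] + weight * feature(c["text"]))
        for c in candidates
    ]
    return sorted(rescored, key=lambda c: c["score"], reverse=True)


# Hypothetical candidates with hypothetical model scores.
candidates = [{"text": "a", "score": 1.0}, {"text": "b", "score": 0.9}]
prefers_b = lambda text: 1.0 if text == "b" else 0.0

only_b = hard_filter(candidates, lambda text: text == "b")
biased = soft_rerank(candidates, prefers_b, weight=0.2)   # "b" overtakes "a"
nudged = soft_rerank(candidates, prefers_b, weight=0.05)  # "a" still wins
```

With a small weight the constraint-incompatible candidate can still win, which is the sense in which a soft constraint only biases the model.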
![Page 5: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/5.jpg)
Fine-Grained Soft Linguistic Constraints
• Fine granularity is a big deal
  – Soft syntactic constraints in SMT
    • Chiang 2005 vs. Marton and Resnik 2008
    • Negative results → positive results
  – Soft semantic constraints in word-pair similarity ranking
    • Mohammad and Hirst 2006 vs. Marton, Mohammad and Resnik 2009
    • Positive results → better results
  – Soft semantic constraints in paraphrase generation for SMT
    • Callison-Burch et al. 2006 vs. Marton, Callison-Burch & Resnik 2009
![Page 6: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/6.jpg)
Road Map
✓ Hybrid models with soft constraints
  – Pure and hybrid models
  – Hard and soft constraints
  – Fine-grained
• Soft syntactic constraints
  – In statistical machine translation
• Soft semantic constraints
  – In word-pair similarity tasks
  – In paraphrasing for statistical machine translation
• Unified model
![Page 7: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/7.jpg)
Statistical Machine Translation: Hiero
• Chiang 2005, 2007
• Weighted synchronous CFG
  – Unnamed non-terminals: X → <e, f>, e.g., X → <今年 X1, X1 this year>
• Translation model features, e.g., ϕ3 = log p(e|f)
• Log-linear model, plus a rule penalty feature and “glue” rules
• These trees are not necessarily “syntactic”!
  – Not syntactic in the linguistic sense
[Figure: example derivation with the phrases 的竞选 “Election”, 投票, 在初选 “voted in the primaries”]
![Page 8: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/8.jpg)
Previous (Coarse) Soft Syntactic Constraints
• X → X1 speech ||| X1 discurso
  – What should be the span of X1?
• Chiang’s (2005) constituency feature
  – Reward the rule’s score if the rule’s source side matches a constituent span
  – Constituency-incompatible emergent patterns can still ‘win’ (in spite of no reward)
  – Good idea, but a negative result
![Page 9: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/9.jpg)
New (Fine-Grained) Soft Syntactic Constraints
• Separate weighted feature for each constituent, e.g.:
  – NP-only: (NP=)
  – VP-only: (VP=)
![Page 10: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/10.jpg)
New Constraint Conditions
• VP-only, revisited:
  – We saw VP-match (VP=): reward exact match of a VP sub-tree span
  – We can also incur a penalty for crossing constituent boundaries, e.g., VP-cross (VP+)
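The match (=) and cross-boundary (+) conditions can be illustrated with a small Python sketch. It assumes half-open token spans and a parse given as a list of `(label, start, end)` triples; the function names and the exact definition of "crossing" (partial overlap, with neither span containing the other) are assumptions made for this illustration, not the dissertation's implementation.

```python
def crosses(span, constituent):
    """True if the two half-open spans overlap without one containing the other."""
    (i, j), (k, l) = span, constituent
    return (i < k < j < l) or (k < i < l < j)


def syntactic_features(span, constituents):
    """Fire 'LABEL=' on an exact span match and 'LABEL+' on a boundary crossing."""
    feats = {}
    for label, start, end in constituents:
        if (start, end) == span:
            feats[label + "="] = 1
        elif crosses(span, (start, end)):
            feats[label + "+"] = 1
    return feats


# Hypothetical parse of "he voted in the primaries" (tokens 0..4).
parse = [("NP", 0, 1), ("VP", 1, 5), ("PP", 2, 5), ("NP", 3, 5)]

vp_match = syntactic_features((1, 5), parse)  # "voted in the primaries"
vp_cross = syntactic_features((0, 2), parse)  # "he voted"
```

In a decoder, each fired feature would be multiplied by its own tuned weight, so a reward for VP= and a penalty for VP+ can be learned independently.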
![Page 11: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/11.jpg)
Constraint (Feature) Space
• {NP, VP, IP, CP, …} × {match =, cross-boundary +}
• Basic translation models:
  – For each feature, add it (and only it) to the default feature set, assigning it a separate weight.
• Feature “combo” translation models:
  – NP2 (double feature): add both NP= and NP+, with a separate weight for each
  – NP_ (conflated feature): ties the weights of NP= and NP+
  – XP=, XP+, XP2, XP_: conflate all labels that correspond to “standard” X-bar Theory XP constituents in each condition.
  – All-labels= (Chiang’s), All-labels+, All-labels_, All-labels2
![Page 12: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/12.jpg)
Chinese-English Results
• Replicated Chiang 2005 constituency feature (negative result)
• NP=, QP+, VP+: up to 0.74 BLEU points better.
• XP+, IP2, all-labels_, VP2, NP_: up to 1.65 BLEU points better.
• Validated on the NIST MT08 test set
[Results table not recovered. Legend: BLEU score, higher = better; *, **: significantly better than baseline; +, ++: better than Chiang-05 (replicated)]
![Page 13: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/13.jpg)
Arabic-English Results
• New result for Chiang’s constituency feature (MT06, MT08)
• PP+, AdvP=: up to 1.40 BLEU better than Chiang’s and baseline.
• AP2, AdvP2: up to 1.94 better.
• Validated on the NIST MT08 test set
[Results table not recovered. Legend: *, **: significantly better than baseline; +, ++: better than Chiang-05]
![Page 14: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/14.jpg)
PP+ Example: Arabic MT06
Source ... (PP (IN ب) (NP (NP (NN تعىىن) (NP (NN مندوب) (NP (NNP سورىا) (NNP لدى)))) (DT ال) (NP (NN امم) (NP (NN ال) (JJ متحدة))))))) …
Gloss …(PP (IN in) (NP (NP (NN appointment) (NP (NN representative) (NP (NNP syria) (NNP to)))) (DT the) (NP (NN nations) (NP (NN the) (JJ united))))))) …
Reference [the third decree ordered] the appointment of the syrian representative to the united nations …
Baseline … to appoint syria to the united nations representative …
PP+ … to appoint a representative of syria to the united nations …
![Page 15: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/15.jpg)
Arabic-English Results – MIRA
• Chiang, Marton and Resnik (2008)
• Previous problem of feature selection solved here
![Page 16: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/16.jpg)
Road Map
✓ Hybrid models with soft constraints
  – Pure and hybrid models
  – Hard and soft constraints
  – Fine-grained
✓ Soft syntactic constraints
  – In statistical machine translation
• Soft semantic constraints
  – In word-pair similarity tasks
  – In paraphrasing for statistical machine translation
• Unified model
![Page 17: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/17.jpg)
Semantic Models
• Forget Frege, alternative worlds, <e,t>, …
• To model the meaning of words, we can use
  – “Pure” models
    • Knowledge-based: manually crafted linguistic resources (dictionary, thesaurus, taxonomies, WordNet)
    • Usage-based: machine-generated distributional profiles (containing word co-occurrence-based information)
  – Hybrid models
    • Bias distributional profiles with soft semantic constraints
      – As we just saw with soft syntactic constraints
      – E.g., use thesaurus “concepts” as word senses, with which to alter co-occurrence counts in distributional profiles
![Page 18: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/18.jpg)
Word-Based Distributional Profiles (DPs)
• Distributional Hypothesis (Harris 1940; Firth 1957)
  – DP (context vector) of “bank”: which words “bank” occurs next to
    • Strength of association
      – Counts, PMI, TF/IDF-based, log-likelihood ratios, …
• Vector similarity (cosine, L1, L2, …)
[Figure: DP vectors of “bank” and “tenure” over the context words linguist, money, river, teller, water, …, separated by an angle α]
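A word-based DP with raw co-occurrence counts and cosine similarity can be sketched as follows. The window size, the function names, and the toy sentence are assumptions for illustration; a real system would typically replace raw counts with one of the association measures listed above (PMI, log-likelihood ratios, and so on).

```python
import math
from collections import Counter


def distributional_profile(tokens, target, window=2):
    """Count the words that occur within `window` positions of `target`."""
    dp = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                dp[tokens[j]] += 1
    return dp


def cosine(u, v):
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(count * v[word] for word, count in u.items() if word in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus (hypothetical).
tokens = "the bank teller counted money at the bank".split()
bank_dp = distributional_profile(tokens, "bank", window=1)
teller_dp = distributional_profile(tokens, "teller", window=1)
similarity = cosine(bank_dp, teller_dp)
```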
![Page 19: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/19.jpg)
Taxonomies and Groupings
• WordNet
  – Synsets
  – Classical relations (“is-a”)
  – Arc distance
  – “The tennis problem”
• Thesaurus
  – Flat lists of related words
  – Potentially coarse
  – Implicit relations, potentially non-classical
[Figure: is-a hierarchy: Professor is-a Academic job is-a job; CEO is-a Industry job is-a job]
![Page 20: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/20.jpg)
Concept-Based Distributional Profiles
Mohammad & Hirst (2006) – Macquarie Thesaurus
• Word-based DP
• Concept-based DP
  – Approximate senses
  – Aggregated
  – Coarse
• “bank” is listed under several concepts
• DP for each sense
[Figure: the word-based DP of “bank” next to the concept-based DPs of RIVER (bank, boat, wave, …) and FIN.INST (bank, dollar, deposit, …), each over the context words linguist, money, river, teller, water, …]
![Page 21: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/21.jpg)
Concept-Based Distributional Profiles
Mohammad & Hirst (2006) – Macquarie Thesaurus
• How similar are “bank” and “wave”?
• Compare all pairs of senses
  – FIN.INST, PHYSICS
  – FIN.INST, RIVER
  – RIVER, PHYSICS
  – RIVER, RIVER
• Return the closest sense pair
• Problem: bank = wave ??
[Figure: “bank” listed under RIVER (bank, boat, wave, …) and FIN.INST (bank, dollar, deposit, …); “wave” listed under RIVER and PHYSICS (amp., wave, freq., …)]
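The closest-sense-pair comparison, and the problem it creates, can be sketched in Python. The sense inventory and the concept-based DP counts below are invented for illustration, and `closest_sense_pair` is a hypothetical name; only the procedure (compare all sense pairs, return the closest) follows the slide.

```python
import math
from itertools import product


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def closest_sense_pair(word_a, word_b, senses, concept_dps):
    """Score every pair of concepts the two words are listed under; keep the best."""
    scored = [
        (cosine(concept_dps[sa], concept_dps[sb]), sa, sb)
        for sa, sb in product(senses[word_a], senses[word_b])
    ]
    return max(scored)


# Hypothetical thesaurus listing and concept-based DPs.
senses = {"bank": ["FIN.INST", "RIVER"], "wave": ["PHYSICS", "RIVER"]}
concept_dps = {
    "FIN.INST": {"money": 9.0, "teller": 4.0},
    "RIVER": {"river": 5.0, "water": 7.0},
    "PHYSICS": {"water": 1.0, "frequency": 6.0},
}

best = closest_sense_pair("bank", "wave", senses, concept_dps)
```

Because both words are listed under RIVER, the best pair is (RIVER, RIVER), which compares a concept DP with itself and makes "bank" and "wave" look (near-)identical. That is the coarseness problem the slide points out.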
![Page 22: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/22.jpg)
New: Word/Concept Hybrid Model (Word Sense DP)
• Given the word’s word-based DP and concept-based DPs:
• Bias the DP of “bank” towards the DP of RIVER
• Create bank_FIN.INST similarly, etc.
[Figure: the word-based DP of “bank” combined with the concept-based DP of RIVER (bank, boat, wave, …) to give the word-sense DP bank_RIVER, each over the context words linguist, money, river, teller, water, …]
![Page 23: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/23.jpg)
Fine-Grained Soft Semantic Constraints
• Hybrid models: best of all: fine-grained, sense-aware, widely applicable
  – bank_FIN.INST ≠ bank_RIVER ≠ wave_RIVER !
• Two hybrid flavors:
  – Hybrid-filtered
  – Hybrid-proportional
Pros and cons:

| | Word-based DP | Concept-based DP |
|---|---|---|
| Word senses | Smear senses | Sense-aware |
| Relations | Co-occurrence | Semantic relatedness |
| Target granularity | Word level (fine) | Aggregated (coarse) |
| Applicability (vocab) | Wide | Limited |
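The slides name the two hybrid flavors without defining them, so the following Python sketch is one plausible reading and should be treated as an assumption throughout. Here "filtered" keeps a word's counts only for context words the concept also co-occurs with, and "proportional" scales each count by the concept's share of that context word among the concepts the word is listed under. All function names and counts are hypothetical.

```python
def hybrid_filtered(word_dp, concept_dp):
    """Assumed reading: keep word counts only where the concept DP is non-zero."""
    return {w: c for w, c in word_dp.items() if concept_dp.get(w, 0) > 0}


def hybrid_proportional(word_dp, concept_dps, sense):
    """Assumed reading: scale each count by this sense's share of the context word."""
    result = {}
    for w, c in word_dp.items():
        total = sum(dp.get(w, 0) for dp in concept_dps.values())
        share = concept_dps[sense].get(w, 0) / total if total else 0.0
        if share > 0:
            result[w] = c * share
    return result


# Hypothetical counts for "bank" and the two concepts it is listed under.
bank_dp = {"money": 8, "river": 4, "water": 2, "teller": 6}
concept_dps = {
    "RIVER": {"river": 30, "water": 50, "money": 10},
    "FIN.INST": {"money": 90, "teller": 40},
}

bank_river_filtered = hybrid_filtered(bank_dp, concept_dps["RIVER"])
bank_river_proportional = hybrid_proportional(bank_dp, concept_dps, "RIVER")
```

Either way the result is a DP at word granularity (fine) that is also sense-aware, which is the combination the pros-and-cons table above is arguing for.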
![Page 24: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/24.jpg)
Evaluation: Word-Pair Similarity Task
• Give each word pair a similarity score
  – Rooster – voyage: 0.12
  – Coast – shore: 0.93
• Same part-of-speech pairs
  – Noun-noun (Rubenstein & Goodenough, 1965; Finkelstein et al. 2002)
  – Verb-verb (Resnik & Diab, 2000)
• Result: list of pairs ordered by similarity
• Evaluation metric: Spearman rank correlation
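The evaluation metric can be illustrated with a self-contained Spearman rank correlation, computed as the Pearson correlation of (tie-averaged) ranks. The first two human scores below are the ones on the slide; the third pair's score and all model scores are invented for the example.

```python
import math


def ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson correlation between the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy) if sx and sy else 0.0


# rooster-voyage, coast-shore (from the slide), plus one hypothetical pair.
human = [0.12, 0.93, 0.50]
model = [0.20, 0.80, 0.60]  # hypothetical model scores
rho = spearman(human, model)
```

Only the ordering of the pairs matters, so a model whose scores are on a different scale from the human judgments is not penalized as long as it ranks the pairs the same way.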
![Page 25: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/25.jpg)
Word-Pair Similarity Results
![Page 26: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/26.jpg)
Road Map
✓ Hybrid models with soft constraints
  – Pure and hybrid models
  – Hard and soft constraints
  – Fine-grained
✓ Soft syntactic constraints
  – In statistical machine translation
• Soft semantic constraints
  – In word-pair similarity tasks
  – In paraphrasing for statistical machine translation
• Unified model
![Page 27: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/27.jpg)
Words → Phrases
• Extend the word-based semantic similarity measures to “phrases”
  – she declined to provide any other information …
  – police refused to provide any other details …
• So far: see if y is similar to x
• Now: find y’s similar to x
• Can solve other problems now!
  – Use these extended phrasal DPs to find good paraphrases of unknown “phrases” in machine translation models
[Figure: DP of the word “bank” (money, teller, …) next to the DP of the phrase “to provide any other” (information, declined, details, …)]
![Page 28: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/28.jpg)
Coverage Problem in Statistical Machine Translation
• Trained on parallel text
• Every new test document contains some “phrases” unknown to the model
[Figure: Spanish–English parallel training text; a Spanish test set containing phrases with no known translation (??)]
![Page 29: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/29.jpg)
Previous Solution: Pivoting
• Use other parallel texts to increase coverage
• Drawback: parallel text is a limited resource!
[Figure: French–Spanish and German–Spanish parallel texts supply paraphrases (Spanish’, Spanish’’, Spanish’’’) for test-set phrases missing from the Spanish–English training text]
![Page 30: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/30.jpg)
New Solution: Monolingually-Derived Paraphrases
• Use monolingual text to increase coverage
• Resources available in abundance!
[Figure: monolingual Spanish text supplies paraphrases (Spanish’, Spanish’’, Spanish’’’, …) for test-set phrases missing from the Spanish–English training text, matched by distributional similarity (angle α)]
![Page 31: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/31.jpg)
Find Paraphrases
• Gather all contexts L _ R for the phrase “to provide any other”:
• What else appears between L _ R?

| Left context (L) | __ | Right context (R) |
|---|---|---|
| declined | to provide any other | details |
| refused | to provide any other | information |
| unable | to provide any other | details |
| failed | to provide any other | explanation |
| … | to provide any other | … |
![Page 32: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/32.jpg)
Find Paraphrases
• Gather all contexts L _ R for the phrase “to provide any other”:
• What else appears between L _ R?
• Measure distributional similarity to each candidate, e.g., “to provide any other” -- “to give further”

| Left context (L) | __ | Right context (R) |
|---|---|---|
| declined | to give further | details |
| refused | to provide any | information |
| unable | to reveal any | details |
| failed | to provide further | explanation |
| … | to provide any other | … |
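The context-gathering step can be sketched in Python. This is a simplified illustration, not the system described in the slides: it assumes one-word left and right contexts, counts shared contexts as a crude stand-in for the distributional similarity score, and uses hypothetical function names and a toy corpus.

```python
from collections import Counter


def contexts(tokens, phrase):
    """Collect (left word, right word) contexts around each occurrence of phrase."""
    phrase = tuple(phrase)
    n = len(phrase)
    found = Counter()
    for i in range(1, len(tokens) - n):
        if tuple(tokens[i:i + n]) == phrase:
            found[(tokens[i - 1], tokens[i + n])] += 1
    return found


def paraphrase_candidates(tokens, phrase, max_len=4):
    """Find other token sequences that fill the same L _ R slots as phrase."""
    phrase = tuple(phrase)
    slots = contexts(tokens, phrase)
    candidates = Counter()
    for i, left in enumerate(tokens):
        for n in range(1, max_len + 1):
            end = i + 1 + n
            if end >= len(tokens):
                break
            filler = tuple(tokens[i + 1:end])
            if filler != phrase and (left, tokens[end]) in slots:
                candidates[filler] += 1
    return candidates


# Toy monolingual corpus (hypothetical).
corpus = (
    "police declined to provide any other details . "
    "he declined to give further details . "
    "she refused to provide any other information ."
).split()

found = paraphrase_candidates(corpus, "to provide any other".split())
```

On this toy corpus the only filler sharing a context with the target phrase is "to give further". A fuller version would build a DP for each candidate and rank candidates by a vector similarity such as cosine.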
![Page 33: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/33.jpg)
Paraphrase Examples (Phrases)
![Page 34: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/34.jpg)
Paraphrase Examples (Unigrams)
![Page 35: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/35.jpg)
Paraphrase Feature Model
• Analogous to Callison-Burch et al. (2006)
• Evidence reinforcement: if f has more than one paraphrase f_i, aggregate the score with a “quasi-online updating”:
  $asim_i = asim_{i-1} + (1 - asim_{i-1}) \cdot \mathrm{sim}(f_i, f)$, where $asim_0 = 0$
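The quasi-online update is short enough to write out directly. The function name and the example similarity values are illustrative; the recurrence itself is the one on the slide.

```python
def aggregate_similarity(similarities):
    """Apply asim_i = asim_{i-1} + (1 - asim_{i-1}) * sim(f_i, f), with asim_0 = 0."""
    asim = 0.0
    for sim in similarities:
        asim = asim + (1.0 - asim) * sim
    return asim


# Two paraphrases of the same phrase, each with a hypothetical similarity of 0.5.
combined = aggregate_similarity([0.5, 0.5])
```

For similarities in [0, 1], each extra paraphrase can only raise the aggregate, and the result stays within [0, 1]. Unrolling the recurrence gives 1 − ∏(1 − sim(f_i, f)), so the aggregate does not depend on the order in which paraphrases are seen.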
![Page 36: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/36.jpg)
English to Chinese Results
• 29k-line subset created to emulate a low-density language setting
[Results table not recovered. Legend: *: better than baseline; +: better than non-hybrid counterpart]
![Page 37: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/37.jpg)
English-Chinese Translation Examples
![Page 38: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/38.jpg)
Spanish to English
![Page 39: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/39.jpg)
Comparison with Corpus Size & Pivoting
![Page 40: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/40.jpg)
Road Map
✓ Hybrid models with soft constraints
  – Pure and hybrid models
  – Hard and soft constraints
  – Fine-grained
✓ Soft syntactic constraints
  – In statistical machine translation
✓ Soft semantic constraints
  – In word-pair similarity tasks
  – In paraphrasing for statistical machine translation
• Unified model
![Page 41: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/41.jpg)
Unified Model
• Soft linguistic constraints in a log-linear model
  – Syntactic
  – Semantic
  – …
• $\sum_i \lambda_i h_i(x)$
• Constraints = add more $\lambda_j h_j(x)$ terms to the sum:
  $\sum_i \lambda_i h_i(x) + \sum_j \lambda_j h_j(x)$
  – $h_i$: features / constraints
  – $\lambda_i$: weight / importance of feature $i$
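A log-linear score with an added soft constraint can be sketched as a weighted sum over named features. All feature names, values, and weights below are hypothetical; the only thing taken from the slide is the form of the sum, with constraints entering as extra weighted terms.

```python
def loglinear_score(features, weights):
    """Weighted sum of feature values: sum_i lambda_i * h_i(x)."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


# Hypothetical baseline features for one candidate x.
base_features = {"log_p_e_given_f": -1.2, "log_p_lm": -3.4}
weights = {"log_p_e_given_f": 1.0, "log_p_lm": 0.5, "VP=": 0.3}

# Adding a soft constraint means adding one more term to the same sum.
with_constraint = dict(base_features, **{"VP=": 1.0})

baseline = loglinear_score(base_features, weights)
biased = loglinear_score(with_constraint, weights)
```

Because the constraint is just another term, its weight can be tuned together with the other weights, and a candidate that violates it is only scored lower, never removed.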
![Page 42: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/42.jpg)
Unified Model (Soft Syntactic Constraints)
• Straightforward: if $\sum_i \lambda_i \phi_i(f,e)$ is a translation model, bias it syntactically, e.g., as follows:
  $\sum_i \lambda_i \phi_i(f,e) + \lambda_j \phi_j(f,e)$
  where $\phi_j(f,e) = 1$ if the source-language word sequence $f$ is a VP, and $0$ otherwise.
![Page 43: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/43.jpg)
Unified Model (Soft Semantic Constraints)
• Semantic distance of word e in sense s from word e’ in sense s’: cos(e_s, e’_s’)
[Equation not recoverable from the extraction. Its surviving pieces: the terms fWord(e, w_i), fWord(e’, w_i), fSense(e, s, w_i) / Z_C and fSense(e’, s’, w_i) / Z_C over context words w_i, which the slide groups into a part equal to K cosWord(e, e’), a part equal to cosSense(e_s, e’_s’), and cross-terms.]
![Page 44: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/44.jpg)
Main Contributions
• Unified corpus-based model with soft linguistic constraints
• Fine-grained linguistic soft constraints:
  – Syntactic (parsing) in statistical machine translation: in state-of-the-art end-to-end phrase-based SMT systems
  – Semantic (words) in word-pair similarity tasks
  – Semantic (phrases) in statistical machine translation: in state-of-the-art end-to-end phrase-based SMT systems; distributional paraphrase generation; evidence reinforcement component
![Page 45: Fine-Grained Linguistic Soft Constraints on Statistical Natural Language Processing Models](https://reader036.vdocuments.net/reader036/viewer/2022081513/56816420550346895dd5e081/html5/thumbnails/45.jpg)
Thanks to…
• Defense Committee:
  – Philip Resnik, Chair/Advisor
  – Amy Weinberg, Advisor
  – William Idsardi, Member
  – Chris Callison-Burch, Special Member (JHU)
  – Bonnie Dorr, Dean's Representative
• Ling Chair:
  – Norbert Hornstein
• Ling Cohort:
  – Ellen … Lau
  – Phil Monahan
  – Eri Takahashi
  – Rebecca McKeown
  – Chizuru Nakao
• CLIP Lab
  – David Chiang, Smara Muresan, Hendra Setiawan, Adam Lopez, Chris Dyer, Asad Sayeed, Vlad Eidelman, Zhongqiang Huang, Denis Filimonov, and many others!