Natural Language Processing
Course material on natural language processing.
Natural Language Processing (Xử lý ngôn ngữ tự nhiên)
Lê Thanh Hương
Department of Information Systems
School of Information and Communication Technology, Hanoi University of Science and Technology
Email: huonglt-fit@mail.hut.edu.vn
Course objectives
- Understand the basic principles and approaches of natural language processing (NLP)
- Learn techniques and tools that can be used to build systems that understand text or converse with people
- Gain some insight into the open problems of NLP
References
- Christopher Manning and Hinrich Schütze. 1999. Foundations of Statistical Natural Language Processing. The MIT Press.
- Dan Jurafsky and James Martin. 2000. Speech and Language Processing. Prentice Hall.
- James Allen. 1994. Natural Language Understanding. The Benjamin/Cummings Publishing Company Inc.
General information
- Assessment
  - Midterm: 30%
    - Attendance: 10%
    - Class participation: 10%
    - Course project: 10%
  - Final: written exam, 70%
- Requirements for sitting the final exam:
  - Absent from fewer than 30% of the attendance checks
  - All midterm score components present
- Course project:
  - Write an essay or implement a program in groups (
NLP = converting sound into meaning
Natural language is central to human intelligence

What is NLP?
- Goal: understand many languages
- It is not simply string processing or keyword matching
ELIZA
- http://en.wikipedia.org/wiki/ELIZA
- http://www.cs.bham.ac.uk/research/projects/cogaff/eliza/

Talking with Eliza (E: Eliza, H: human)
E: how are you today?
H: not so good
E: why do you say not so good?
H: i suffer from allergies?
E: that's interesting
H: i come out in spots
E: i'm not sure i know what you're talking about
H: spots on my face
E: your face
H: that's right
E: is it really that right?
H: you're a program, aren't you?
E: does it please you to believe i am a program, am not i?
H: it sure does
E: i see
H: yup
E: a while back you talk about your face
H: what about your face?
E: what is it you really want to know?
H: are you my kind of girl?
E: perhaps in your fantasies i am your kind of girl
H: in my fantasies you are blonde
E: what makes you think i am blonde?
H: i don't know
E: don't you really know?
H: no
E: are you saying no just to be negative?
H: no
E: you are being a bit negative
H: no i'm not
E: how long have you been not?
Why study NLP?
- To study how people identify words
- To study how people analyze sentences
- To study how people learn a language
- To study how language evolves
Topics in NLP
- Levels of analysis: syntax, semantics, discourse, pragmatics, ...
- Subtasks: part-of-speech tagging, parsing, word sense disambiguation, discourse structure analysis, ...
- Algorithms and methods: corpus-based, knowledge-based, ...
- Applications: information extraction, information retrieval, machine translation, question answering, natural language understanding, ...
Levels of analysis
- Morphology: how words are built; the prefixes and suffixes of words
- Syntax: the grammatical and structural relations between words and phrases
- Semantics: the meaning of words, phrases, and expressions
- Discourse: the relations between ideas or sentences
- Pragmatics: the purpose of an utterance; how language is used in communication
- World knowledge: knowledge about the world, including implicit knowledge
Morphology
English: an inflectional, polysyllabic language
- kick, kicks, kicked, kicking
- sit, sits, sat, sitting
- murder, murders
But it is not always a matter of adding or removing an ending:
- gorge (v: to stuff oneself; n: what has been eaten, a ravine), gorgeous (rực rỡ)
- arm (cánh tay), army (quân đội)
Vietnamese: a non-inflectional, monosyllabic language, so words must be segmented first
Word segmentation
- A sentence may have n possible segmentations, but only one of them is correct
- Simple solution: starting at the current position, take the longest sequence of syllables that is found in the dictionary
- Problem: overlapping words
  - Học sinh | học sinh | học.
  - Học sinh | học | sinh học.
- Remedy: enumerate all possible segmentations and design a method that selects the best one
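The greedy "longest dictionary match" idea can be sketched as follows. This is a minimal illustration, not the course's implementation: the function name `longest_match`, the `max_len` limit, and the four-entry toy lexicon are assumptions made for the example.

```python
def longest_match(syllables, lexicon, max_len=4):
    """Greedy left-to-right segmentation by longest dictionary match."""
    words, i = [], 0
    while i < len(syllables):
        # Try the longest candidate first; fall back to a single syllable.
        for j in range(min(len(syllables), i + max_len), i, -1):
            candidate = " ".join(syllables[i:j])
            if candidate in lexicon or j == i + 1:
                words.append(candidate)
                i = j
                break
    return words


# Toy lexicon (an assumption for illustration only).
LEXICON = {"học", "sinh", "học sinh", "sinh học"}
segmented = longest_match("học sinh học sinh học".split(), LEXICON)
print(" | ".join(segmented))
```

On this input the greedy strategy commits to "học sinh | học sinh | học" and never considers "học sinh | học | sinh học", which is exactly the overlap problem described above.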
Part-of-speech tagging
The boy threw a ball to the brown dog.
- The/DT boy/NN threw/VBD a/DT ball/NN to/IN the/DT brown/JJ dog/NN ./.

Tag   Meaning
DT    determiner
NN    noun, singular or mass
VBD   verb, past tense
IN    preposition
JJ    adjective
.     sentence-final punctuation
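Part-of-speech tagging
Con ngựa đá con ngựa đá.
- Con ngựa/DT đá/ĐgT con ngựa/DT đá/TT .
  (DT: noun, ĐgT: verb, TT: adjective; "The horse kicks the stone horse.")
Ông già đi nhanh quá.
- Ông/ĐaT già/TT đi/phụ_từ nhanh/TT quá/trạng_từ .
- Ông già/DT đi/ĐgT nhanh/TT quá/trạng_từ .
  (ĐaT: pronoun, phụ_từ: adjunct, trạng_từ: adverb)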
Grammar: structural ambiguity (part of speech)
Time flies like an arrow.
- Time // flies like an arrow. (flies = VBZ, like = comparative preposition, IN)
- Time flies // like an arrow. (flies = NNS, like = VBP)

Grammar: structural ambiguity (part of speech)
- Ông già // đi nhanh quá.
- Ông // già đi nhanh quá.
Grammar: structural ambiguity (attachment)
I saw the man on the hill with a telescope.
[Figure: three alternative parse trees for this sentence over the constituents NP V NP PP PP, differing in whether the prepositional phrases "on the hill" and "with a telescope" attach under S, VP, or NP.]
But grammar does not tell us much
- Colorless green ideas sleep furiously. [Chomsky]
- fire match arson hotel
- plastic cat food can cover
Semantics: ambiguity at the lexical level
- I walked to the bank ...
  - of the river.
  - to get money.
- The bug in the room ...
  - was planted by spies.
  - flew out the window.
- I work for John Hancock ...
  - and he is a good boss.
  - which is a good company.
Discourse: coreference
President John F. Kennedy was assassinated.
The president was shot yesterday.
Relatives said that John was a good father.
JFK was the youngest president in history.
His family will bury him tomorrow.
Friends of the Massachusetts native will hold a candlelight service in Mr. Kennedy's home town.
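Pragmatics
What do you infer from what I say? How do you react?

Rules of conversation
- Bạn ơi mấy giờ rồi? (Excuse me, what time is it?)
- Anh đưa cho em lọ muối được không? (Could you pass me the salt?)

Utterances that carry an act
- Tôi cá với bạn 500.000 là đội Việt Nam sẽ thắng. (I bet you 500,000 that the Vietnamese team will win.)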
World knowledge
Mai đi ăn tối. Cô ấy gọi món bít tết. Cô ấy để lại tiền boa và về nhà.
(Mai went out for dinner. She ordered steak. She left a tip and went home.)
- What did Mai eat for dinner?
- Who brought the dinner to Mai?
- Who cooked the steak?
- Did Mai pay?
Knowledge of language: what do we know about this sentence?
- Words must appear in a particular order:
  a. *Chị kem ăn.  b. Chị ăn kem. (She eats ice cream.)
- The constituents of the sentence:
  chị = subject; ăn kem = predicate
- Who does what to whom:
  agent (chị), action (ăn), object (kem)
Other issues
- The two sentences "Mai says she ate ice cream" and "Mai denies she ate ice cream" are logically inconsistent with each other
- Sentences and the world: knowing whether a sentence is true or false; it may be true only in some specific situations
- "I drank espresso this morning, but Mai is intelligent" is not coherent
Hidden knowledge
1. I want to solve the problem
   - I wanna solve the problem
2. I understand these students
   - These students I understand
   - I want these students to solve the problem
   - These students I want [x] to solve the problem
   - [x] = these students
Characteristics of language
- Some things can be memorized:
  - Singing → sing + ing; Bringing → bring + ing
  - Duckling → ?? duckl + ing
  - We need to know that "duckl" is not a word
- But we cannot memorize everything, because there is too much

Besides memory, what do we need?
The English plural:
- toy + s → toyz ; add z
- book + s → books ; add s
- church + s → churchiz ; add iz
- box + s → boxiz ; add iz
We need a system of rules to generate and handle these cases
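A rule system for the plural ending can be sketched in a few lines. This is only a rough illustration: the real rule depends on the final sound of the noun, while the sketch below approximates sounds by spelling, and the names `plural_suffix`, `SIBILANT_ENDINGS`, and `VOICELESS_ENDINGS` are assumptions introduced for the example.

```python
# Spelling-based approximation of the final sound (an assumption; a real
# system would consult a pronunciation lexicon).
SIBILANT_ENDINGS = ("s", "z", "x", "sh", "ch")
VOICELESS_ENDINGS = ("p", "t", "k", "f")


def plural_suffix(noun):
    """Return the pronounced plural ending: 'iz', 's', or 'z'."""
    word = noun.lower()
    if word.endswith(SIBILANT_ENDINGS):
        return "iz"   # church -> churchiz, box -> boxiz
    if word.endswith(VOICELESS_ENDINGS):
        return "s"    # book -> books
    return "z"        # toy -> toyz


for noun in ("toy", "book", "church", "box"):
    print(noun, "+", plural_suffix(noun))
```

Three ordered rules cover the four examples above, which is the point of the slide: a small rule set generalizes where memorizing every form does not.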
Analysis = mapping the surface form to its internal representation
- What makes NLP hard: there is no one-to-one correspondence between the surface form and any representation
- We need to know the data structures and algorithms that perform the mapping, even though combinatorial explosion can occur at any processing stage
Analyzing an LSAT / (former) GRE question
- Six sculptures C, D, E, F, G, H are to be exhibited in rooms 1, 2, and 3 of an exhibition.
- Sculptures C and E may not be in the same room.
- Sculptures D and G must be in the same room.
- If sculptures E and F are in the same room, no other sculpture may be in that room.
- At least one sculpture must be exhibited in each room, and no more than three sculptures in any room.
- If sculpture D is exhibited in room 3 and sculptures E and F in room 1, which of the following statements can be true?
  A. Sculpture C is in room 1
  B. Sculpture H is in room 1
  C. Sculpture G is in room 2
  D. Sculptures C and H are in the same room
  E. Sculptures G and F are in the same room
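Once the constraints are extracted from the text, the puzzle itself is a small search problem. The sketch below is an illustration under that assumption: it hard-codes the five constraints (the language-understanding step is done by hand) and enumerates all room assignments; the names `valid`, `solutions`, and `OPTIONS` are introduced here and are not part of the source.

```python
from itertools import product

SCULPTURES = "CDEFGH"
ROOMS = (1, 2, 3)


def valid(assign):
    """Check one complete sculpture-to-room assignment against the rules."""
    rooms = {r: [s for s in SCULPTURES if assign[s] == r] for r in ROOMS}
    if assign["C"] == assign["E"]:          # C and E not together
        return False
    if assign["D"] != assign["G"]:          # D and G together
        return False
    if assign["E"] == assign["F"] and len(rooms[assign["E"]]) != 2:
        return False                        # E+F share a room only alone
    return all(1 <= len(members) <= 3 for members in rooms.values())


def solutions(fixed):
    """Yield every valid assignment that extends the fixed placements."""
    free = [s for s in SCULPTURES if s not in fixed]
    for combo in product(ROOMS, repeat=len(free)):
        assign = {**fixed, **dict(zip(free, combo))}
        if valid(assign):
            yield assign


found = list(solutions({"D": 3, "E": 1, "F": 1}))
OPTIONS = {
    "A": lambda a: a["C"] == 1,
    "B": lambda a: a["H"] == 1,
    "C": lambda a: a["G"] == 2,
    "D": lambda a: a["C"] == a["H"],
    "E": lambda a: a["G"] == a["F"],
}
possible = sorted(k for k, check in OPTIONS.items() if any(check(a) for a in found))
print(possible)
```

The search is trivial; the hard part, and the reason the question appears in an NLP course, is turning the English (or Vietnamese) wording into constraints like those in `valid`.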
Coreference resolution
U: Where in Mountain View is A Bug's Life playing?
S: A Bug's Life is playing at the Summit theater.
U: When is it playing there?
S: It is playing at 2pm, 5pm, and 8pm.
U: I'd like 1 adult and 2 children for the first show. How much would that cost?
- Knowledge sources:
  - Domain knowledge
  - Discourse knowledge
  - World knowledge
Why is NLP hard?
Natural language is:
- Ambiguous at every level
- Complex and vague
- Dependent on reasoning about the world
Solutions
- What tools do we need?
  - Knowledge of language
  - Knowledge of the world
  - A way to combine these kinds of knowledge
- A promising solution: probabilistic models built from data
  - P("maison" → "house") is high
  - P("L'avocat général" → "the general avocado") is low
Review of NLP tasks: morphological analysis
- Input: a character string
- Output: pairs (stem, morphological tag)
- Issues:
  - Combining the components that make up a word
  - Types of morphology (inflection, derivation, compounding)
  - Example: quotations ~ quote/V + -ation (der. V->N) + NNS
Parsing
- Input: a sequence of (word/part-of-speech) pairs
- Output: the grammatical structure of the sentence, with labeled nodes (word, part of speech, grammatical role)
- Issues:
  - The relation between words, parts of speech, and sentence structure
  - Use of syntactic labels (subject, predicate, object, ...)
- Example: Tôi/ĐaT nhìn thấy/ĐgT Mai/DT → ((Tôi/ĐaT)CN ((nhìn thấy/ĐgT) (Mai/DT)OBJ)VN)C
Semantics
- Input: the grammatical structure of the sentence
- Output: the semantic structure of the sentence
- Issues:
  - Relations between entities such as subject, object, agent, effect, and others
- Example:
  ((Học sinh/DT)CN ((học/ĐgT sinh học/DT)ĐgN)VN)C → (Học sinh/DT)Sbj (học/ĐgT)action (sinh học/DT)Obj
Applications of NLP
- Hard: speech processing, machine translation, information extraction, natural-language dialog interfaces, question answering
- Applications in use today: spelling correction, text classification, ...
Information extraction
[Figure: a job posting page (labels "Martin Baker, a person" and "Genomics job") mapped onto an employer's job posting form.]

Information extraction
October 14, 2002, 4:00 a.m. PT
For years, Microsoft Corporation CEO Bill Gates railed against the economic philosophy of open-source software with Orwellian fervor, denouncing its communal licensing as a "cancer" that stifled technological innovation.
Today, Microsoft claims to "love" the open-source concept, by which software code is made public to encourage improvement and development by outside programmers. Gates himself says Microsoft will gladly disclose its crown jewels--the coveted code behind the Windows operating system--to select customers.
"We can be open source. We love the concept of shared source," said Bill Veghte, a Microsoft VP. "That's a super-important shift for us in terms of code access."
Richard Stallman, founder of the Free Software Foundation, countered saying

Result of information extraction (IE):
NAME              TITLE     ORGANIZATION
Bill Gates        CEO       Microsoft
Bill Veghte       VP        Microsoft
Richard Stallman  founder   Free Soft..

[Figure: Newsinessence [Radev & al. 01]]
[Figure: Google News [02]]
Vietnamese Word Segmentation (Tách từ tiếng Việt)
Word segmentation
- Goal: identify the boundaries of the words in a sentence.
- It is an important processing step for NLP systems, especially for isolating languages such as Chinese, Japanese, Thai, and Vietnamese.
- In isolating languages, a word may consist of one or several syllables.
- The core problem of word segmentation is therefore resolving ambiguity in word boundaries.
Vocabulary
- Vietnamese is a non-inflectional language
- The Vietnamese dictionary (Vietlex) has more than 40,000 words, of which:
  - 81.55% of syllables are words in their own right (single words)
  - 15.69% of dictionary entries are single words
  - 70.72% are compounds of 2 syllables
  - 13.59% are compounds of 3 or more syllables
  - 1.04% are compounds of more than 4 syllables

Length (syllables)   Count    %
1                    6,303    15.69
2                    28,416   70.72
3                    2,259    5.62
4                    2,784    6.93
5 or more            419      1.04
Total                40,181   100
Table 1. Word length measured in syllables
Vietnamese word formation rules
- Single words: one syllable is used as one word.
  - Examples: tôi, bác, người, cây, hoa, đi, chạy, vì, đã, à, nhỉ, nhé...
- Compound words: syllables are combined, and the syllables are related to one another in meaning.
  - Coordinate compounds: the components are semantically equal.
    - Examples: chợ búa, bếp núc
  - Subordinate compounds: one component depends on the other. The subordinate component classifies, specializes, and adds nuance to the main component.
    - Examples: tàu hoả, đường sắt, xấu bụng, tốt mã, ngay đơ, thẳng tắp, sưng vù...

Vietnamese word formation rules
- Reduplicative words: the components share repeated phonetic material, repeated with variation. A word repeated whole also yields a reduplicative word.
- Word variants: treated as temporary altered forms or "spoken" forms of a word.
  - Shortening a long word into a shorter one
    - ki-lô-gam → ki lô / kí lô
  - Temporarily breaking up the structure of a word and redistributing its components around other elements inserted from outside the word. Examples:
    - khổ sở + lo → lo khổ lo sở
    - ngặt nghẽo + cười → cười ngặt cười nghẽo
    - danh lợi + ham chuộng → ham danh chuộng lợi
Vietnamese word formation rules
- Multi-word expressions (e.g. "bởi vì") are also treated as one word
- Proper names: names of people and places are treated as one lexical unit
- Regular patterns: numbers, times
Approaches
- Dictionary-based approach
- Statistical approach
- A combination of the two

Methods
- Longest Matching
- Transformation-based Learning (TBL)
- Weighted Finite State Transducer (WFST)
- Maximum Entropy (ME)
- Machine learning with Hidden Markov Models (HMM)
- Machine learning with Support Vector Machines (SVM)
- Combinations of the methods above
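Dictionary-based approach
- Building the dictionary
  - Each entry stores the word, its part of speech, and its sense category
  - Organized to use little memory and to make lookup convenient
  - Dictionary encoding: the part of speech and sense category are byte values stored as a single character.
    - Example: noun - 112 → p, ... - 115 → s

Dictionary-based approach
- Entries are paged by the first two letters of the word, in ascending order. Within each page, the words are again sorted alphabetically.
[Figure: a dictionary index whose pages (ba, bà, ..., xe) point to sorted word lists such as "bao", "bà ngoại", "bài tập", "xe cộ", "xe đạp".]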
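The paging scheme can be sketched with a dictionary of sorted lists. This is a minimal illustration, not the layout used in the course software: `build_pages` and `lookup` are hypothetical names, and the five-word list is a toy assumption.

```python
from bisect import bisect_left
from collections import defaultdict


def build_pages(words):
    """Group words into pages keyed by their first two letters."""
    pages = defaultdict(list)
    for word in words:
        pages[word[:2]].append(word)
    for key in pages:
        pages[key].sort()                  # alphabetical order inside a page
    return dict(sorted(pages.items()))     # pages in ascending key order


def lookup(pages, word):
    """Binary-search the single page that could contain the word."""
    page = pages.get(word[:2], [])
    pos = bisect_left(page, word)
    return pos < len(page) and page[pos] == word


pages = build_pages(["xe", "ba", "bao", "ban", "xem"])
print(pages)
print(lookup(pages, "bao"), lookup(pages, "bay"))
```

Only one page is searched per lookup, so the cost depends on the page size rather than on the size of the whole dictionary.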
Finding words in the dictionary
- Maximum word length? 3? 4? 5?
- Problem: fixed expressions, e.g. "ông chẳng bà chuộc", cannot be handled
- Remedy: return all compound words in the dictionary that match the beginning of the input string
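Finding words in the dictionary
Input: Nếu như máy nghỉ thì ta về
Word positions: 0 1 2 3 4 5 6 7
- This gives the following table:
  [Table of candidate words by start and end position: not recoverable from the source.]
- Notation:
  - LT: conjunction; DT: noun
  - ĐgT: verb; ĐaT: pronoun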
Resolving ambiguity
- Take every possible segmentation; if parsing it yields a well-formed tree, that segmentation is the correct one.

Hybrid approach (2008)
- Combines finite-state automata + regular expressions + longest matching + statistics (to resolve ambiguity)
Regular expressions
- A regular expression is a pattern that is compared against a string
- Special characters:
  - * : any string of characters, including the empty string
  - x : at least one character
  - + : the string in parentheses occurs at least once
- Examples:
  - Email: x@x(.x)+
  - dir *.txt
  - *John -> John, Ajohn, Decker John
- Regular expressions are used especially often in:
  - Parsing
  - Validating data
  - String processing
  - Extracting data and generating reports
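The email pattern x@x(.x)+ can be written with Python's `re` module. This is a sketch under an assumption: the slide's "x" (at least one character) is narrowed here to letters, digits, underscore, and hyphen, and the names `X`, `EMAIL`, and `is_email` are introduced for the example.

```python
import re

# "x" from the slide: one or more characters (narrowed here by assumption).
X = r"[A-Za-z0-9_-]+"
# x@x(.x)+ : a name, "@", a name, then one or more ".name" groups.
EMAIL = re.compile(rf"{X}@{X}(?:\.{X})+")


def is_email(text):
    """True if the whole string matches the x@x(.x)+ pattern."""
    return EMAIL.fullmatch(text) is not None


print(is_email("huonglt-fit@mail.hut.edu.vn"))
print(is_email("user@localhost"))
```

The second call fails because the pattern requires at least one ".x" group after the host name. Real email validation is considerably more permissive than this toy pattern.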
Finite automata
- The class of regular languages is recognized by abstract machines called finite automata.
  - Deterministic Finite Automata (DFA)
  - Nondeterministic Finite Automata (NFA)
  - Nondeterministic finite automata with empty transitions (ε-NFA)
An informal introduction to finite automata
- A basic problem for an automaton is to decide whether a string w belongs to a language L.
- The input string is processed sequentially, one symbol at a time, from left to right.
- While it runs, the automaton must remember information about what it has already processed.
Example of a finite automaton
L = {w ∈ {0, 1}* | w ends with the substring 10}.
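A DFA for this language needs three states. The sketch below is one possible construction; the state names q0, q1, q2 are assumptions chosen for the example (the slide's own diagram is not recoverable).

```python
# q0: no useful suffix seen; q1: input so far ends in "1"; q2: ends in "10".
TRANSITIONS = {
    ("q0", "0"): "q0", ("q0", "1"): "q1",
    ("q1", "0"): "q2", ("q1", "1"): "q1",
    ("q2", "0"): "q0", ("q2", "1"): "q1",
}
ACCEPTING = {"q2"}


def accepts(w):
    """Run the DFA over w, one symbol at a time, left to right."""
    state = "q0"
    for symbol in w:
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING


for w in ("10", "0110", "101", ""):
    print(repr(w), accepts(w))
```

The state is the only memory the automaton has: it records just enough of the processed input (the last one or two symbols) to decide membership.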
Finite automaton for English words
[Figure: automaton diagram not recoverable from the source.]
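A simple segmentation method
- Detect common patterns such as proper names, abbreviations, numbers, dates, email addresses, and URLs using regular expressions
- The system selects the longest syllable sequence, starting at the current position, that is found in the dictionary, and prefers the segmentation with the fewest words
- Limitation: it may return an incorrect segmentation.
- Remedy: enumerate all segmentations and use a strategy to choose the best one.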
Choosing a segmentation
- Represent a phrase as a sequence of syllables s1 s2 ... sn
- The most frequent ambiguity involves three consecutive syllables s1 s2 s3 in which both s1s2 and s2s3 are words.
- Represent a phrase as a linear directed graph G = (V, E), V = {v0, v1, ..., vn, vn+1}
- If the syllables si+1, si+2, ..., sj form a word, then G has an edge (vi, vj)
- The candidate segmentations are the shortest paths from v0 to vn+1
Algorithm
Algorithm 1. Build the graph for the sequence s1 s2 ... sn
    V ← ∅;
    for i = 0 to n + 1 do
        V ← V ∪ {vi};
    end for
    for i = 0 to n do
        for j = i to n do
            if (accept(AW, si ... sj)) then
                E ← E ∪ {(vi, vj+1)};
            end if
        end for
    end for
    return G = (V, E);
accept(A, s): automaton A accepts the input string s
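The graph construction and the shortest-path enumeration can be sketched as follows. This is an illustration under stated assumptions, not the published implementation: a Python set stands in for the automaton AW, nodes are numbered 0..n as positions between syllables (a simplification of the slide's v0..vn+1), and `build_graph`, `shortest_segmentations`, and `max_len` are hypothetical names.

```python
from collections import deque


def build_graph(syllables, lexicon, max_len=4):
    """Edge (i, j) exists when syllables[i:j] form a word in the lexicon."""
    n = len(syllables)
    edges = {i: [] for i in range(n + 1)}
    for i in range(n):
        for j in range(i + 1, min(n, i + max_len) + 1):
            if " ".join(syllables[i:j]) in lexicon:
                edges[i].append(j)
    return edges


def shortest_segmentations(syllables, lexicon):
    """Return every segmentation that uses the minimum number of words."""
    n = len(syllables)
    edges = build_graph(syllables, lexicon)
    dist, queue = {0: 0}, deque([0])
    while queue:                            # breadth-first distances from node 0
        u = queue.popleft()
        for v in edges[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    results = []

    def walk(u, words):
        if u == n:
            results.append(words)
            return
        for v in edges[u]:
            if dist.get(v) == dist[u] + 1:  # stay on shortest paths only
                walk(v, words + [" ".join(syllables[u:v])])

    if n in dist:
        walk(0, [])
    return results


LEXICON = {"học", "sinh", "học sinh", "sinh học"}   # toy lexicon (assumption)
paths = shortest_segmentations("học sinh học sinh học".split(), LEXICON)
for p in paths:
    print(" | ".join(p))
```

For this input several three-word segmentations tie for shortest, which is why a further ranking step (the statistical model on the next slides) is needed.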
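Resolving ambiguity
- Probability of a string s = w_1 w_2 ... w_m:
  $$P(s) = \prod_{i=1}^{m} P(w_i \mid w_1^{i-1}) \approx \prod_{i=1}^{m} P(w_i \mid w_{i-n+1}^{i-1})$$
- $P(w_i \mid w_1^{i-1})$: the probability of $w_i$ given the $i-1$ preceding syllables
- n = 2: bigram; n = 3: trigram

Resolving ambiguity
- With n = 2, $P(w_i \mid w_{i-1})$ is estimated by maximum likelihood (ML):
  $$P_{ML}(w_i \mid w_{i-1}) = \frac{c(w_{i-1} w_i)}{c(w_{i-1})}$$
- c(s): the number of times the string s occurs; N: the total number of words in the training set
- Because the training data is smaller than the full space of possible data, many estimates are P ≈ 0
- Remedy: use a smoothing technique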
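Smoothing
$$P(w_i \mid w_{i-1}) = \lambda_1 P_{ML}(w_i \mid w_{i-1}) + \lambda_2 P_{ML}(w_i)$$
with $\lambda_1 + \lambda_2 = 1$ and $\lambda_1, \lambda_2 \ge 0$, where $P_{ML}(w_i) = c(w_i)/N$.
- For a test set T = {s_1, s_2, ..., s_n}, the probability P(T) of the test set is:
  $$P(T) = \prod_{i=1}^{n} P(s_i)$$
- Entropy of the text:
  $$H(T) = -\frac{1}{N_T} \sum_{i=1}^{n} \log_2 P(s_i)$$
  with $N_T$ the number of words in T
- Entropy is inversely related to the average probability of a segmentation for the sentences in the test text.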
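Linear interpolation of bigram and unigram maximum-likelihood estimates can be sketched as below. It is a toy illustration: the four-token corpus, the weights 0.7 and 0.3, and the names `train` and `p_interp` are assumptions, and unigram counts are used as the bigram denominator for simplicity.

```python
from collections import Counter


def train(tokens):
    """Collect unigram counts, bigram counts, and the corpus size N."""
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    return unigrams, bigrams, len(tokens)


def p_interp(w_prev, w, unigrams, bigrams, total, lam1=0.7, lam2=0.3):
    """lam1 * P_ML(w | w_prev) + lam2 * P_ML(w), with lam1 + lam2 = 1."""
    p_uni = unigrams[w] / total
    p_bi = bigrams[(w_prev, w)] / unigrams[w_prev] if unigrams[w_prev] else 0.0
    return lam1 * p_bi + lam2 * p_uni


uni, bi, n_tokens = train("a b a c".split())
print(p_interp("a", "b", uni, bi, n_tokens))   # seen bigram
print(p_interp("b", "c", uni, bi, n_tokens))   # unseen bigram, still non-zero
```

The unseen bigram (b, c) gets a non-zero probability from the unigram term, which is the purpose of smoothing.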
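Determining λ1 and λ2
- From the sample data, define $C(w_{i-1}, w_i)$ as the number of times the pair $(w_{i-1}, w_i)$ occurs in the sample. We need to choose λ1 and λ2 to maximize
  $$L(\lambda_1, \lambda_2) = \sum_{w_{i-1}, w_i} C(w_{i-1}, w_i) \log_2 P(w_i \mid w_{i-1})$$
  with $\lambda_1 + \lambda_2 = 1$ and $\lambda_1, \lambda_2 \ge 0$

Algorithm
[Figure: algorithm listing not recoverable from the source.]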
Results
- Data set: 1,264 articles from the newspaper Tuổi Trẻ, containing 507,358 words
- With ε = 0.03, the values converge after 4 iterations
- Precision = number of words the system identifies correctly / total number of words the system identifies = 95%
Part-of-Speech Tagging (Gán nhãn từ loại)
Definition
- Part-of-speech (POS) tagging: each word in a sentence is assigned the tag of its corresponding part of speech
- Input: a segmented text + a tagset
- Output: the most accurate tag assignment
Tagging makes text analysis easier
Why tag?
- Easy to do: it can be done with many different methods
  - Methods that use context can give good results
  - Although it ought to be done through full text analysis
- Applications:
  - Text-to-speech: record - N: [reko:d], V: [riko:d]; lead - N: [led], V: [li:d]
  - Preprocessing for parsing. A parser tags better, but at a higher cost
  - Speech recognition, parsing, retrieval, etc.
- Easy to evaluate (how many tags were assigned correctly?)
English parts of speech
- Closed classes (function words): a fixed number of members
  - Prepositions: on, under, over, ...
  - Particles: abroad, about, around, before, in, instead, since, without, ...
  - Articles: a, an, the
  - Conjunctions: and, or, but, that, ...
  - Pronouns: you, me, I, your, what, who, ...
  - Auxiliary verbs: can, will, may, should, ...
- Open classes: new words can be added
Open classes in English
- nouns
  - proper nouns: IBM, Colorado
  - common nouns
    - count nouns: book, ticket
    - mass nouns: snow, salt
- verbs
  - auxiliaries
  - ...
- adjectives
  - Color: red, white
  - Age: old, young
  - Value: good, bad
- adverbs
  - Degree adverbs: extremely, very, somewhat
  - Manner adverbs: slowly, delicately
  - Temporal adverbs: yesterday, Monday
  - Locative adverbs: home, here, downhill
Tagsets for English
- Brown corpus: 87 tags
- Three commonly used tagsets:
  - Small: 45 tags, Penn Treebank (next slide)
  - Medium: 61 tags, British National Corpus
  - Large: 146 tags, C7

[Figure: Penn Treebank tagset table, not recoverable from the source.]
Example sentences:
- I know that blocks the sun.
- He always books the violin concert tickets early.
- He says that book is interesting.
Penn Treebank: example
- The grand jury commented on a number of other topics.
  The/DT grand/JJ jury/NN commented/VBD on/IN a/DT number/NN of/IN other/JJ topics/NNS ./.

What makes POS tagging difficult?
Handling ambiguity
POS tagging methods
- Probability-based: maximum probability, hidden Markov model (HMM)
  Pr(Det-N) > Pr(Det-Det)
- Rule-based
  If ... Then ...
Approaches
- HMM: use all the available information and guess
- Constraint grammar: do not guess; only eliminate the wrong possibilities
- Transformation-based: guess first, then possibly change the guess
Probability-based tagging
Given a sentence or a string of words, assign the most likely part-of-speech tags to the words in the string.
How it is done:
- Hidden Markov model (HMM): choose the tag that maximizes
  P(word | tag) × P(tag | previous n tags)
  The/DT grand/JJ jury/NN commented/VBD on/IN a/DT number/NN of/IN other/JJ topics/NNS ./.
  P(jury | NN) = 1/2
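Example: HMMs
[Figure: HMM example not recoverable from the source.]
Perform supervised learning, then run inference to determine the tags

HMM tagging
- Bigram HMM formula: choose the most likely tag $t_i$ for $w_i$ given $t_{i-1}$ and $w_i$:
  $$t_i = \arg\max_j P(t_j \mid t_{i-1}, w_i) \quad (1)$$
- HMM simplifying assumption: the tagging problem can be solved using only the word and the tags next to it:
  $$t_i = \arg\max_j P(t_j \mid t_{i-1}) \, P(w_i \mid t_j) \quad (2)$$
  - $P(t_j \mid t_{i-1})$: tag sequence probability (tags that occur together)
  - $P(w_i \mid t_j)$: probability that the word occurs with tag $t_j$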
Example
1. Secretariat/NNP is/VBZ expected/VBN to/TO race/VB tomorrow/NN
2. People/NNS continue/VBP to/TO inquire/VB the/DT reason/NN for/IN the/DT race/NN for/IN outer/JJ space/NN
- This cannot be decided just by counting words in the corpus (and normalizing)
- We want a verb to follow TO more often than a noun does (to race, to walk). But a noun can also follow TO (run to school)
Suppose we have all the tags except the one for "race"
- Look only at the preceding word (bigram):
  to/TO race/???   NN or VB?
  the/DT race/???
- Apply (2):
  $$t_i = \arg\max_j P(t_j \mid t_{i-1}) \, P(w_i \mid t_j)$$
- Choose the tag with the larger of the two probabilities:
  P(VB|TO) P(race|VB) or P(NN|TO) P(race|NN)
  where P(race|VB) is the probability that a word is "race" given that its tag is VB.
Computing the probabilities
Consider P(VB|TO) and P(NN|TO)
- From the Brown corpus:
  P(NN|TO) = 0.021    P(race|NN) = 0.00041
  P(VB|TO) = 0.340    P(race|VB) = 0.00003
- P(VB|TO) P(race|VB) = 0.340 × 0.00003 ≈ 0.00001
- P(NN|TO) P(race|NN) = 0.021 × 0.00041 ≈ 0.000009
"race" should therefore be tagged as a verb when it follows TO
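The comparison above is a two-candidate argmax of equation (2). A minimal sketch, using only the four Brown-corpus figures quoted on the slide; the table and function names (`TRANSITION`, `EMISSION`, `best_tag`) are assumptions made for the example.

```python
# Probabilities quoted on the slide (Brown corpus).
TRANSITION = {("TO", "VB"): 0.340, ("TO", "NN"): 0.021}        # P(tag | previous tag)
EMISSION = {("race", "VB"): 0.00003, ("race", "NN"): 0.00041}  # P(word | tag)


def best_tag(prev_tag, word, candidates):
    """Pick the candidate tag maximizing P(tag | prev_tag) * P(word | tag)."""
    return max(candidates,
               key=lambda t: TRANSITION[(prev_tag, t)] * EMISSION[(word, t)])


for tag in ("VB", "NN"):
    print(tag, TRANSITION[("TO", tag)] * EMISSION[("race", tag)])
print(best_tag("TO", "race", ["NN", "VB"]))
```

Although "race" is far more often a noun, the strong preference for a verb after TO outweighs the emission probability.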
Exercise
- I know that blocks the sun.
- He always books the violin concert tickets early.
- He says that book is interesting.
- I know that block blocks the sun.

Answers
- I/PP know/VBP that/WDT blocks/VBZ the/DT sun/NN.
- He/PP always/RB books/VBZ the/DT violin/NN concert/NN tickets/NNS early/RB.
- He/PP says/VBZ that/WDT book/NN is/VBZ interesting/JJ.
- I/PP know/VBP that/DT block/NN blocks/NNS?VBZ? the/DT sun/NN.
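The full model
- We need to find the best tag sequence for the whole string
- Given a word string W, compute the most probable tag sequence T = t_1, t_2, ..., t_n, that is,
  $$\hat{T} = \arg\max_T P(T \mid W) = \arg\max_T \frac{P(T)\,P(W \mid T)}{P(W)} = \arg\max_T P(T)\,P(W \mid T)$$
  (Bayes' rule; P(W) is the same for every T)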
Expanding with the chain rule
P(A,B) = P(A|B)P(B) = P(B|A)P(A)
P(A,B,C) = P(B,C|A)P(A) = P(C|A,B)P(B|A)P(A) = P(A)P(B|A)P(C|A,B)
P(A,B,C,D,...) = P(A)P(B|A)P(C|A,B)P(D|A,B,C)...
$$P(T)\,P(W \mid T) = \prod_{i=1}^{n} P(w_i \mid w_1 t_1 \ldots w_{i-1} t_{i-1} t_i)\; P(t_i \mid w_1 t_1 \ldots w_{i-1} t_{i-1})$$
The first factor is the word probability; the second is the tag history.
The trigram assumption
- The probability of a word depends only on its own tag:
  $$P(w_i \mid w_1 t_1 \ldots w_{i-1} t_{i-1} t_i) = P(w_i \mid t_i)$$
- The tag history is approximated by the two most recent tags (trigram: the two most recent tags + the current tag):
  $$P(t_i \mid w_1 t_1 \ldots w_{i-1} t_{i-1}) = P(t_i \mid t_{i-2} t_{i-1})$$
Substituting into the formula
$$P(T)\,P(W \mid T) = P(t_1)\,P(t_2 \mid t_1) \prod_{i=3}^{n} P(t_i \mid t_{i-2} t_{i-1}) \left[ \prod_{i=1}^{n} P(w_i \mid t_i) \right]$$
Estimating the probabilities
- Use relative frequencies from the corpus to estimate the probabilities:
  $$P(t_i \mid t_{i-2} t_{i-1}) = \frac{c(t_{i-2} t_{i-1} t_i)}{c(t_{i-2} t_{i-1})}$$
  $$P(w_i \mid t_i) = \frac{c(w_i, t_i)}{c(t_i)}$$
The problem
We need to solve
$$\hat{T} = \arg\max_T P(T)\,P(W \mid T)$$
We can now compute every product P(T)P(W|T)
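Example
[Figure: a tag lattice for the words "the dog saw ice-cream" with candidate tags DT, NNS, VB, and VBP, and numeric scores on the edges (1, 30, 52, 60, 75).]
Finding the best path = finding the path with the highest score, where the score of a tag sequence is
$$P(t_1)\,P(t_2 \mid t_1) \prod_{i=3}^{n} P(t_i \mid t_{i-2} t_{i-1}) \left[ \prod_{i=1}^{n} P(w_i \mid t_i) \right]$$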
How to find the highest-scoring path
- Use a best-first style search (A*):
  1. At each step, keep the k best values. Each of the k values corresponds to one possible combination of tags for all the words seen so far
  2. When the next word is tagged, recompute the probabilities. Go back to step 1
- Advantage: fast (there is no need to examine every tag combination, only the k most promising ones)
- Disadvantage: it may not return the best result, only an acceptable one
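The "keep the k best partial sequences" procedure is commonly called beam search. The sketch below applies it to a bigram tag model (a simplification of the trigram model above); `beam_tag`, the start symbol `<s>`, and the probabilities of 1.0 for the start transition and for "to" are assumptions added for illustration, while the other four figures are those quoted from the Brown corpus.

```python
import math


def beam_tag(words, tags, trans, emit, k=2):
    """Keep the k best partial tag sequences after each word."""
    beam = [(0.0, [])]                      # (log-probability, tags so far)
    for word in words:
        candidates = []
        for logp, seq in beam:
            prev = seq[-1] if seq else "<s>"
            for tag in tags:
                p = trans.get((prev, tag), 0.0) * emit.get((word, tag), 0.0)
                if p > 0.0:
                    candidates.append((logp + math.log(p), seq + [tag]))
        # Step 1 of the slide: keep only the k most promising combinations.
        beam = sorted(candidates, key=lambda c: c[0], reverse=True)[:k]
    return beam[0][1] if beam else None


TRANS = {("<s>", "TO"): 1.0, ("TO", "VB"): 0.340, ("TO", "NN"): 0.021}
EMIT = {("to", "TO"): 1.0, ("race", "VB"): 0.00003, ("race", "NN"): 0.00041}
result = beam_tag(["to", "race"], ["TO", "NN", "VB"], TRANS, EMIT, k=2)
print(result)
```

With a small k the search is fast but can discard a prefix that would have led to the best full sequence, which is the trade-off stated above; an exact alternative is dynamic programming over the lattice (the Viterbi algorithm).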
Accuracy
- > 96%
- The simplest method? 90%
  - Tag each word with its most frequent part of speech
  - Tag unknown words as nouns
- Humans: 97% +/- 3%; with discussion: 100%
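The simplest method can be sketched directly. The three-sentence training set is a toy assumption, as are the names `train_baseline` and `tag_baseline`; the roughly 90% figure on the slide refers to large corpora, not to this example.

```python
from collections import Counter, defaultdict


def train_baseline(tagged_sentences):
    """Map each word to the tag it carries most often in the training data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word.lower()][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}


def tag_baseline(words, model, unknown_tag="NN"):
    """Most-frequent-tag lookup; unknown words default to noun."""
    return [(w, model.get(w.lower(), unknown_tag)) for w in words]


TRAIN = [
    [("the", "DT"), ("race", "NN")],
    [("to", "TO"), ("race", "VB")],
    [("the", "DT"), ("race", "NN")],
]
model = train_baseline(TRAIN)
print(tag_baseline(["to", "race", "tomorrow"], model))
```

The baseline tags "race" as NN even after "to", because it ignores context; this is the kind of error that the HMM and the transformation rules below are meant to correct.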
Approach 2: transformation-based tagging
Transformation-based Learning (TBL):
- Combines the rule-based and the probabilistic approaches: machine learning is used to refine the tags over several passes
- Tag with the most general rules first, then with narrower rules that change some of the tags, and so on
Transformation-based painting
[Figure sequence: seven slides illustrating transformation-based painting; images not recoverable from the source.]

Example with TBL
1. Tag each word with its most frequent tag (usually about 90% accurate). From the Brown corpus:
   P(NN|race) = 0.98
   P(VB|race) = 0.02
2. expected/VBN to/TO race/NN tomorrow/NN
   the/DT race/NN for/IN outer/JJ space/NN
3. Apply a transformation rule:
   Replace NN with VB when the previous tag is TO
   pos:'NN'>'VB' ← pos:'TO'@[-1]
   to/TO race/VB
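Applying one transformation rule to an already-tagged sentence can be sketched as follows. `apply_rule` is a hypothetical helper written for this example; it checks the previous tag as it was before the rule was applied, which is one of several possible application orders.

```python
def apply_rule(tagged, from_tag, to_tag, prev_tag):
    """Change from_tag to to_tag wherever the preceding tag is prev_tag."""
    result = []
    for i, (word, tag) in enumerate(tagged):
        if tag == from_tag and i > 0 and tagged[i - 1][1] == prev_tag:
            tag = to_tag
        result.append((word, tag))
    return result


first = [("expected", "VBN"), ("to", "TO"), ("race", "NN"), ("tomorrow", "NN")]
second = [("the", "DT"), ("race", "NN"), ("for", "IN"), ("outer", "JJ"), ("space", "NN")]

# Rule from the slide: NN -> VB when the previous tag is TO.
print(apply_rule(first, "NN", "VB", "TO"))
print(apply_rule(second, "NN", "VB", "TO"))
```

Only "race" after "to" changes; "the race" keeps its NN tag because the rule's context does not match.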
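POS tagging rules
[Figures: two slides of POS tagging rules; images not recoverable from the source.]

Learning transformation rules in a TBL system
[Figure: diagram of the TBL learning loop; not recoverable from the source.]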
The corpora
- Training set:            w0 w1 w2 w3 w4 w5 w6 w7 w8 w9 w10
- Current corpus (CC 1):   dt vb nn dt vb kn dt vb ab dt vb
- Reference corpus:        dt nn vb dt nn kn dt jj kn dt nn
Templates for POS tagging rules
- In TBL, only rules that fit a template are learned.
- Example: the rules
  tag:'VB'>'NN' ← tag:'DT'@[-1].
  tag:'NN'>'VB' ← tag:'DT'@[-1].
  fit the template
  tag:A>B ← tag:C@[-1].
- A template can be written with anonymous variables:
  tag:_>_ ← tag:_@[-1].
Score, accuracy, threshold
- Score of a rule:
  score(R) = |pos(R)| - |neg(R)|
- Accuracy:
  accuracy(R) = |pos(R)| / (|pos(R)| + |neg(R)|)
- Threshold: the level that the accuracy of a rule must exceed for the rule to be selected.
- In TBL, the accuracy threshold is usually < 0.5.
Generating and scoring candidate rule 1
- Template = tag:_>_ ← tag:_@[-1]
- R1 = tag:vb>nn ← tag:dt@[-1]
- pos(R1) = 3
- neg(R1) = 1
- score(R1) = pos(R1) - neg(R1) = 3 - 1 = 2

Generating and scoring candidate rule 2
- Template = tag:_>_ ← tag:_@[-1]
- R2 = tag:nn>vb ← tag:vb@[-1]
- pos(R2) = 1
- neg(R2) = 0
- score(R2) = pos(R2) - neg(R2) = 1 - 0 = 1
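The two scores can be reproduced from the current and reference corpora shown earlier. In this sketch pos counts the sites where the rule produces the reference tag and neg counts the sites where it applies but does not; that reading is an assumption chosen because it reproduces the slide's numbers. `score_rule` is a hypothetical helper.

```python
CURRENT = "dt vb nn dt vb kn dt vb ab dt vb".split()
REFERENCE = "dt nn vb dt nn kn dt jj kn dt nn".split()


def score_rule(old, new, prev, current, reference):
    """Score the rule tag:old>new <- tag:prev@[-1] against the reference."""
    pos = neg = 0
    for i in range(1, len(current)):
        if current[i] == old and current[i - 1] == prev:   # rule applies here
            if reference[i] == new:
                pos += 1
            else:
                neg += 1
    return pos, neg, pos - neg


r1 = score_rule("vb", "nn", "dt", CURRENT, REFERENCE)
r2 = score_rule("nn", "vb", "vb", CURRENT, REFERENCE)
print("R1 (pos, neg, score):", r1)
print("R2 (pos, neg, score):", r2)
```

R1 applies at four positions and matches the reference at three of them, so it outranks R2, which applies only once.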
Choosing the best rule
- Current ranking of the candidate rules:
  R1 = tag:vb>nn ← tag:dt@[-1]   Score = 2
  R2 = tag:nn>vb ← tag:vb@[-1]   Score = 1
- If the score threshold is <= 2, choose R1
- Otherwise, if the score threshold is > 2, stop
Optimizing the choice of the best rule
- Reduce redundant rules: generate only candidate rules that match at least one instance in the training set.
- Incremental evaluation:
  - Keep track of the best candidate rules
  - Skip rules that match fewer instances than the score of the best rule
Greedy best-first search
Evaluation function:
h(n) = estimated cost of the cheapest path from the state at node n to the goal state
Advantages of TBL
- Rules can be written by hand
- Rules are easy to understand and logical
- Easy to implement
- Can run very fast (though a fast implementation is complex)
Error analysis: what is difficult for a POS tagger
Common errors (> 4%)
- NN (common noun) vs. NNP (proper noun) vs. JJ (adjective): hard to tell apart, and the distinction is especially important in information extraction
- RP (particle) vs. RB (adverb) vs. IN (preposition): all of these can appear directly after a verb
- VBD vs. VBN vs. JJ: distinguishing past tense, past participle, and adjective (raced vs. was raced vs. the out raced horse)
The best way to handle unknown words
- Use 3 inflectional endings (-ed, -s, -ing); 32 derivational endings (-ion, etc.); capitalization; hyphenation
- More generally:
  - Morphological analysis
  - Machine learning approaches
POS tagging for Vietnamese
A segmented Vietnamese sentence:
Qua những lần từ Sài_Gòn về Quảng_Ngãi kiểm_tra công_việc , Sophie và Jane thường trò_chuyện với Mai , cảm_nhận ngọn_lửa_sống và niềm_tin mãnh_liệt từ người phụ_nữ VN này .

The same sentence after POS tagging:
[The tag markup and the tag legend were lost in extraction; only the words survive.]
Processing steps
- Word segmentation
- A priori tagging (assign each word every part-of-speech tag it can take).
  - For a new word, use a default tag or assign it the set of all tags. For inflectional languages, rely on the morphology of the word
- Deciding the final tagging (removing ambiguity)
  - based on grammar rules
  - based on probabilities
  - using neural networks
  - hybrid systems that combine probability computation with grammatical constraints
  - multi-layer tagging
Data for tagging
- Linguistic resources:
  - A lexicon
  - A tagged corpus, possibly accompanied by hand-built grammar rules
  - An untagged corpus accompanied by linguistic information such as the tagset
  - An untagged corpus, with a tagset built automatically through statistical computation
Difficulties in POS tagging for Vietnamese
- Language-specific characteristics
- The lack of standard corpora such as Brown or Penn Treebank, which makes results hard to evaluate
Approach 1
[Đinh Điền] Dien Dinh and Kiem Hoang. POS-tagger for English-Vietnamese bilingual corpus. HLT-NAACL Workshop on Building and Using Parallel Texts: Data Driven Machine Translation and Beyond, 2003.
- Transfer and map part-of-speech information from English, because:
  - POS tagging for English reaches high accuracy (> 97%)
  - word alignment methods between language pairs have recently been successful
[Đinh Điền]
- Build an English-Vietnamese bilingual corpus of about 5 million words (English and Vietnamese together).
- Tag the English side using Transformation-based Learning (TBL) [Brill 1995]
- Align the two languages (accuracy about 87%) and project the English tags onto Vietnamese.
- The result is corrected by hand and used as training data for a Vietnamese POS tagger.
[Đinh Điền]
- Advantage:
  - avoids tagging by hand by reusing part-of-speech information from another language.
- Disadvantages:
  - English and Vietnamese differ in word formation, word order, and the grammatical function of words in a sentence, which makes alignment difficult
  - Errors accumulate over two stages: (a) tagging the English and (b) aligning the two languages
  - A tagset transferred directly from English is not representative of Vietnamese parts of speech
Approach 2
- [Nguyen Huyen, Vu Luong] Thi Minh Huyen Nguyen, Laurent Romary, and Xuan Luong Vu. A Case Study in POS Tagging of Vietnamese Texts. The 10th annual conference TALN 2003.
- Based on the linguistic properties of Vietnamese.
- Builds a tagset for Vietnamese on top of a fairly general description standard for Western European languages, so that the tagset is modular with two layers:
  - kernel layer: the most general specification, shared across languages
  - private layer: extensions and details for a specific language, based on the properties of that language
[Nguyen Huyen, Vu Luong]
- Kernel layer: noun (N), verb (V), adjective (A), pronoun (P), determiner (D), adverb (R), adposition (S), conjunction (C), numeral (M), interjection (I), and residual (X, e.g. foreign words, ...).
- Private layer: developed per category above, e.g. countable/uncountable for nouns, masculine/feminine for pronouns, etc.
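Approach 3
- [Phương] Nguyễn Thị Minh Huyền, Vũ Xuân Lương, Lê Hồng Phương. Sử dụng bộ gán nhãn từ loại xác suất QTAG cho văn bản tiếng Việt (Using the probabilistic POS tagger QTAG for Vietnamese text). Proceedings of the ICT.rda'03 workshop.
- Works on a window of 3 words, after adding 2 dummy words at the beginning and end of the text.
- The tag assigned to a word as it leaves the window is the final result.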
Tagging procedure [Phương]
1. Read the next token
2. Look the token up in the dictionary
3. If it is not found, assign it every possible tag
4. For each possible tag:
   a. compute Pw = P(tag | token)
   b. compute Pc = P(tag | t1, t2), where t1 and t2 are the tags of the two words preceding the token
   c. compute Pw,c = Pw * Pc, combining the two probabilities
5. Repeat the computation for the two other tags in the window
After each recomputation (3 times per word), the resulting probabilities are combined to give the overall probability of the tag assigned to the word.
[Phương]
- Split the tagged corpus into two sets: a training set and a test set
- Tag the text portions automatically
- Compare the output with the reference data
- Training time with 32,000 words: about 30 s
[Phương]
- A tagged sentence (most of the tag markup was lost in extraction):
  hồi lên <w pos="Nn">sáu</w> , có lần tôi nhìn thấy một bức tranh tuyệt đẹp
- Tag legend: Nc - countable noun, Vto - directional transitive verb, Nn - numeral noun, Vs - existential verb, Nu - unit noun, Pp - personal pronoun, Jt - temporal adjunct, Vt - transitive verb, Nt - classifier noun, Jd - degree adjunct, Aa - qualitative adjective.

[Phương]
- The sentence as it appears in the reference corpus:
  hồi lên <w pos="Nn">sáu</w> , có lần tôi nhìn thấy một bức tranh tuyệt đẹp
- The sentence as tagged by the program:
  hồi ... nhìn thấy một bức tranh tuyệt đẹp
[Phương]
- Results:
  - about 94% (9 lexical tags and 10 tags for symbols)
  - about 85% (48 lexical tags and 10 tags for symbols)
  - Without the lexicon (using only the reference tagged corpus), the results drop to about 80% and about 60% respectively.
Approach 4
- Phan Xuân Hiếu:
  - Based on Maximum Entropy (MaxEnt) and Conditional Random Fields (CRFs), which are widely applied to labeling the elements of sequence data.
  - Training data: the Viet Treebank corpus, with more than 10,000 Vietnamese sentences POS-tagged by linguists.

[Hiếu]
[Figure: learning a POS tagging model.]
Feature selection
- ... thường trò_chuyện với Mai ...
- To determine the part of speech of "trò_chuyện", the features are:
  - Which part of speech does the word "trò_chuyện" itself usually take in the Viet Treebank data?
  - Which part-of-speech tag does "trò_chuyện" usually have in the dictionary? Is it a verb?
  - What does the word "thường", immediately before "trò_chuyện", suggest?
  - What does the word "với", following "trò_chuyện", suggest? Does it suggest that the word just before it is a verb?
  - What does the combination of the two words "với Mai" suggest? Perhaps that the preceding word (trò_chuyện) should be a verb?
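Context for feature extraction
[Figures: two slides showing the context windows used for feature extraction; not recoverable from the source.]

Tagging results with MaxEnt and CRFs
[Figure: results table not recoverable from the source.]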
Vietnamese part-of-speech tagset
idPOS  symbolPOS  vnPOS                  enPOS
1      N          danh từ                noun
2      V          động từ                verb
3      A          tính từ                adjective
4      M          số từ                  numeral
5      P          đại từ                 pronoun
6      R          phụ từ                 adverb
7      O          giới từ                preposition
8      C          liên từ                conjunction
9      I          trợ từ                 auxiliary word
10     E          cảm từ                 emotivity word
11     Xy*        từ tắt                 abbreviation
12     S          yếu tố từ (bất, vô)    component stem
13     U          không xác định         undetermined
* Abbreviations carry a double tag: X = the part of speech of the abbreviation; y = the abbreviation marker. Examples: GDP-Ny; HIV-Ny.
Vietnamese sub-POS tagset

idPOS  idSubPOS  symbolPOS  vnPOS                  enPOS
1      1         Np         danh từ riêng          proper noun
1      2         Nc         danh từ đơn thể        countable noun
1      3         Ng         danh từ tổng thể       collective noun
1      4         Na         danh từ trừu tượng     abstract noun
1      5         Ns         danh từ chỉ loại       classifier noun
1      6         Nu         danh từ đơn vị         unit noun
1      7         Nq         danh từ chỉ lượng      quantity noun
2      8         Vi         động từ nội động       intransitive verb
2      9         Vt         động từ ngoại động     transitive verb
2      10        Vs         động từ trạng thái     state verb
2      11        Vm         động từ tình thái      modal verb
2      12        Vr         động từ quan hệ        relative verb
3      13        Ap         tính từ tính chất      property adjective
3      14        Ar         tính từ quan hệ        relative adjective
3      15        Ao         tính từ tượng thanh    onomatopoetic adjective
3      16        Ai         tính từ tượng hình     pictographic adjective
4      17        Mc         số từ số lượng         cardinal numeral
4      18        Mo         số từ thứ tự           ordinal numeral
5      19        Pp         đại từ xưng hô         personal pronoun
5      20        Pd         đại từ chỉ định        demonstrative pronoun
5      21        Pq         đại từ số lượng        quantity pronoun
5      22        Pi         đại từ nghi vấn        interrogative pronoun
6      23        R          phụ từ                 adverb
7      24        O          giới từ                preposition
8      25        C          liên từ                conjunction
9      26        I          trợ từ                 auxiliary word
10     27        E          cảm từ                 emotivity word
11     28        Xy         từ tắt                 abbreviation
12     29        S          yếu tố từ (bất, vô)    component stem
13     30        U          không xác định         undetermined
-
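Syntactic Parsing (Phân tích cú pháp)
Lê Thanh Hương, Department of Information Systems
Institute of Information and Communication Technology (Viện CNTT&TT), Hanoi University of Science and Technology (ĐHBKHN)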
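The parsing problem
[Figure: a grammar and an input sentence are fed to the parser, which outputs a parse tree; the output is scored against sample parse trees to compute accuracy.]
Current parsers achieve high accuracy (Eisner, Collins, Charniak, etc.).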
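The notion of a grammar
• Parse the sentence "Bò vàng gặm cỏ non" ("The yellow cow grazes on young grass").
• Parse tree: [figure]
• Rule set:
  • C → CN VN
  • CN → DN
  • VN → ĐgN
  • ĐgN → ĐgT DN
  • DN → DT TT
• Symbols: C = sentence (câu), CN = subject (chủ ngữ), VN = predicate (vị ngữ), DN = noun phrase (danh ngữ), ĐgN = verb phrase (động ngữ), ĐgT = verb (động từ), DT = noun (danh từ), TT = adjective (tính từ).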
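Grammar
• A generative grammar is a system $G = (T, N, S, R)$, where
  • $T$ (terminal): the set of terminal symbols
  • $N$ (non-terminal): the set of nonterminal symbols
  • $S$ (start): the start symbol
  • $R$ (rule): the set of rules, $R = \{\alpha \to \beta \mid \alpha, \beta \in (T \cup N)^{*}\}$
  • $\alpha \to \beta$ is called a production rule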
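Chomsky normal form
• Every context-free language that does not contain $\varepsilon$ can be generated by a grammar in which every production has the form $A \to BC$ or $A \to a$, with $A, B, C \in N$ and $a \in T$.
• Example: find the Chomsky normal form of the grammar $G$ with $T = \{a, b\}$, $N = \{S, A, B\}$ and $R$ as follows:
  • $S \to bA \mid aB$
  • $A \to bAA \mid aS \mid a$
  • $B \to aBB \mid bS \mid b$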
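One possible answer to the exercise, worked as a sketch. The fresh nonterminals $C_a$, $C_b$, $D_1$, $D_2$ are names chosen here for illustration and do not come from the slides: each terminal inside a long right-hand side is replaced by a nonterminal that rewrites to it, and each three-symbol right-hand side is split in two.

```latex
\begin{align*}
S   &\to C_b A \mid C_a B \\
A   &\to C_b D_1 \mid C_a S \mid a \\
B   &\to C_a D_2 \mid C_b S \mid b \\
D_1 &\to A A, \qquad D_2 \to B B \\
C_a &\to a,   \qquad C_b \to b
\end{align*}
```

Every production now has the form $A \to BC$ or $A \to a$, and the grammar generates the same language as $G$.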
Review of grammars
• Grammar: a set of rewrite rules.
• Terminal symbols: symbols that cannot be decomposed any further.
• Nonterminal symbols: symbols that can be decomposed.
• Consider the grammar G:
  S → NP VP
  NP → John, garbage
  VP → laughed, walks
• G can generate the following sentences:
  John laughed. John walks. Garbage laughed. Garbage walks.
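The four sentences can be enumerated mechanically. The following is a minimal sketch that assumes the grammar is written as a Python dict and contains no recursion; the names `GRAMMAR` and `generate` are illustrative.

```python
from itertools import product

GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["John"], ["garbage"]],
    "VP": [["laughed"], ["walks"]],
}


def generate(symbol, grammar):
    """Return every word sequence derivable from `symbol` (non-recursive grammars only)."""
    if symbol not in grammar:              # a terminal stands for itself
        return [[symbol]]
    results = []
    for rhs in grammar[symbol]:            # try every rule for this nonterminal
        parts = [generate(s, grammar) for s in rhs]
        for combo in product(*parts):      # combine the expansions of each child
            results.append([w for part in combo for w in part])
    return results


sentences = [" ".join(words) for words in generate("S", GRAMMAR)]
# sentences == ["John laughed", "John walks", "garbage laughed", "garbage walks"]
```

The grammar happily produces "garbage laughed": it checks structure only, not meaning.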
-
Syntactic structure
A parse tree represents the syntactic structure of a sentence, e.g. "Bò vàng gặm cỏ non":

C
├─ CN
│  └─ DN
│     ├─ DT  Bò
│     └─ TT  vàng
└─ VN
   └─ ĐgN
      ├─ ĐgT  gặm
      └─ DN
         ├─ DT  cỏ
         └─ TT  non
Applications of parsing
Machine translation (Alshawi 1996, Wu 1997, ...): English → tree operations → Vietnamese
Speech recognition using parsing (Chelba et al. 1998): "Put the file in the folder." vs. "Put the file and the folder."
Grammar checking (Microsoft)
Information extraction (Hobbs 1996): text corpus (NY Times) → database ← query
Context-free grammar (CFG), also called phrase-structure grammar
• $G = \langle T, N, P, S, R \rangle$
• $T$: the set of terminal symbols (terminals)
• $N$: the set of nonterminal symbols (non-terminals)
• $P$: the preterminal symbols (preterminals), which rewrite directly to terminals, $P \subset N$
• $S$: the start symbol
• $R$: rules $X \to \gamma$, where $X$ is a nonterminal and $\gamma$ is a (possibly empty) string of terminals and nonterminals
• The grammar $G$ generates the language $L$
• Recognizer: returns yes or no
• Parser: returns the set of parse trees
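Comparison with context-sensitive grammars
• Phrase-structure (unrestricted) grammar: $\alpha \to \beta$, with $\alpha \in V^{+}$, $\beta \in V^{*}$
• Context-sensitive grammar: $r = \alpha \to \beta$, with $\alpha \in V^{+}$, $\beta \in V^{*}$, of the form $\alpha_1 A \alpha_2 \to \alpha_1 \beta \alpha_2$
• Context-free grammar: $A \to \alpha$, with $A \in N$, $\alpha \in V^{*} = (T \cup N)^{*}$
• Regular grammar: $A \to aB$, $A \to Ba$, $A \to a$, with $A, B \in N$, $a \in T$
[Figure: nested grammar classes, from innermost to outermost: regular (VPCQ), context-free (VPPNC), context-sensitive (VPCNC), phrase-structure (VPNC)]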
-
Applying the grammar rules
• S → NP VP → DT NNS VBD → The children slept
• S → NP VP → DT NNS VBD NP → DT NNS VBD DT NN → The children ate the cake
Recursive phrase structure
[Figure]
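Grammars for natural language are ambiguous
• Ambiguity: the PP can attach at two points (to the VP or to the NP).
• Example, with word positions: 0 John 1 saw 2 snow 3 on 4 the 5 campus 6
[Figure: parse tree S → NP VP for this sentence, in which the PP "on the campus" can attach either to the VP or to the NP "snow"]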
Top-down parsing
• Goal-driven.
• Start with a list of symbols to expand (S, NP, VP, ...).
• Rewrite the goals in the goal set by:
  • finding a rule whose left-hand side matches the goal to expand;
  • expanding the goal with the rule's right-hand side and trying to match it against the input sentence.
• If a goal can be rewritten in several ways, one rule must be chosen (a search problem).
• Either breadth-first search or depth-first search can be used.
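The procedure above can be sketched as a depth-first recognizer with backtracking. This is a minimal illustration under stated assumptions: the toy grammar is invented for the example and has no left-recursive rules, and the function name `top_down` is hypothetical.

```python
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "Noun"], ["Name"]],
    "VP": [["V", "NP"]],
    "Det": [["the"]],
    "Noun": [["pizza"], ["guy"]],
    "Name": [["John"]],
    "V": [["ate"]],
}


def top_down(goals, words, grammar):
    """Depth-first search: expand the leftmost goal until the goals match the words."""
    if not goals:
        return not words                   # success only if all input is consumed
    first, rest = goals[0], goals[1:]
    if first not in grammar:               # terminal: must match the next input word
        return bool(words) and words[0] == first and top_down(rest, words[1:], grammar)
    # nonterminal: try each rule whose left-hand side is the goal, in order
    return any(top_down(rhs + rest, words, grammar) for rhs in grammar[first])


accepted = top_down(["S"], "John ate the pizza".split(), GRAMMAR)
rejected = top_down(["S"], "John the ate pizza".split(), GRAMMAR)
```

Python's call stack does the backtracking here: when one rule fails, `any` moves on to the next alternative. A breadth-first variant would keep a queue of (goals, remaining words) pairs instead.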
Difficulties with top-down parsing
• Left-recursive rules.
• Top-down parsing is at a great disadvantage when many rules share the same left-hand side:
  S → NP X1, S → NP X2, S → NP X600, S → VP Y1
• Much wasted work: it expands every node that can be analyzed top-down.
• Top-down parsing works well when a suitable grammar-driven control strategy is available.
• Top-down parsing is ill-suited to expanding preterminals into terminals. In practice, a bottom-up method is usually used for this step.
• Repeated work: wherever identical structures occur.
Bottom-up parsing
• Data-driven.
• Initialize the goal set with the string to be parsed.
• If a sequence in the goal set matches the right-hand side of a rule, replace it with the rule's left-hand side.
• Stop when the goal set = {S}.
• If the right-hand sides of several rules match sequences in the goal set, a rule must be chosen (a search problem).
• Either breadth-first search or depth-first search can be used.
[Figure: S → NP VP]
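A minimal sketch of the bottom-up procedure, written as a depth-first search over reductions. The rule list is a toy grammar assumed for this example, and `bottom_up` is a hypothetical name. The `seen` set merely avoids exploring the same goal set twice.

```python
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "Noun")),
    ("NP", ("Name",)),
    ("VP", ("V", "NP")),
    ("Det", ("the",)),
    ("Noun", ("pizza",)),
    ("Name", ("John",)),
    ("V", ("ate",)),
]


def bottom_up(symbols, rules, seen=None):
    """Depth-first search over reductions; True if `symbols` reduces to ('S',)."""
    seen = set() if seen is None else seen
    if symbols == ("S",):
        return True
    if symbols in seen:                    # this goal set was already explored
        return False
    seen.add(symbols)
    for lhs, rhs in rules:
        n = len(rhs)
        for i in range(len(symbols) - n + 1):
            if symbols[i:i + n] == rhs:    # right-hand side found: reduce it
                reduced = symbols[:i] + (lhs,) + symbols[i + n:]
                if bottom_up(reduced, rules, seen):
                    return True
    return False


ok = bottom_up(tuple("John ate the pizza".split()), RULES)
bad = bottom_up(tuple("ate John".split()), RULES)
```

Even on this small input many reduction orders lead to the same goal sets, which is the repeated work that chart-based methods such as CKY avoid.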
-
Difficulties with bottom-up parsing
• Inefficient when there is a lot of lexical ambiguity.
• Repeated work: whenever there are shared substructures.
• Both top-down (LL) and bottom-up (LR) parsers have complexity exponential in the sentence length.
CKY algorithm (recognizer)
• Input: a string of n words
• Output: yes/no
• Structure: an n × n table (chart table)
  • rows numbered 0 to n-1
  • columns numbered 1 to n
  • cell [i,j] lists all syntactic labels spanning positions i to j

CKY algorithm (bottom-up)
  for i := 1 to n
      add every part of speech of word i to [i-1, i]
  for width := 2 to n
      for start := 0 to n-width
          end := start + width
          for mid := start+1 to end-1
              for each label X in [start, mid]
                  for each label Y in [mid, end]
                      for each way of combining X and Y (if any)
                          add the resulting label to [start, end] if it is not already there
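The pseudocode translates almost line for line into Python. The sketch below makes two assumptions that are not in the slides: the grammar is given as a word-to-tags lexicon plus a table of binary rules keyed by the pair (X, Y), and the unary rules of the example grammar (CN → DN, VN → ĐgN) are folded away so that DN and ĐgN combine directly into C.

```python
from collections import defaultdict

LEXICON = {"Bò": {"DT"}, "vàng": {"TT"}, "gặm": {"ĐgT"}, "cỏ": {"DT"}, "non": {"TT"}}
BINARY = {("DT", "TT"): {"DN"}, ("ĐgT", "DN"): {"ĐgN"}, ("DN", "ĐgN"): {"C"}}


def cky_recognize(words, lexicon, binary_rules, root="C"):
    """Fill the chart bottom-up; return (accepted, chart)."""
    n = len(words)
    chart = defaultdict(set)               # chart[(i, j)] = labels spanning i..j
    for i, word in enumerate(words, start=1):
        chart[(i - 1, i)] |= lexicon.get(word, set())
    for width in range(2, n + 1):
        for begin in range(0, n - width + 1):
            end = begin + width
            for mid in range(begin + 1, end):
                for x in chart[(begin, mid)]:
                    for y in chart[(mid, end)]:
                        # sets ignore duplicates, so a label is added only once
                        chart[(begin, end)] |= binary_rules.get((x, y), set())
    return root in chart[(0, n)], chart


accepted, chart = cky_recognize("Bò vàng gặm cỏ non".split(), LEXICON, BINARY)
```

The three nested loops over `width`, `begin` and `mid` give a running time cubic in the sentence length, in contrast to the exponential behaviour of plain top-down or bottom-up search.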
Example: "Bò vàng gặm cỏ non"

      1     2         3      4     5
0     DT    CN, DN                 C
1           TT
2                     ĐgT          VN, ĐgN
3                            DT    DN
4                                  TT
Context-free grammar
1. Start → S
2. S → NP VP
3. NP → Det Noun
4. NP → Name
5. NP → Name PP
6. PP → Prep NP
7. VP → V NP
8. VP → V NP PP
9. V → ate
10. Name → John
11. Name → ice-cream, snow
12. Noun → ice-cream, pizza
13. Noun → table, guy, campus
14. Det → the
15. Prep → on
Combination rule
• Cell[i,j] contains the label X if
  • there is a rule X → Y Z, and
  • Cell[i,k] contains the label Y and Cell[k,j] contains the label Z, for some k between i and j.
• Example: NP → DT [0,1] NN [1,2]
-
CKY must use binary rules
• Convert VP → V NP PP into:
  8.a. VP → V Arguments
  8.b. Arguments → NP PP
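The same conversion can be applied mechanically to any rule with more than two right-hand symbols. In this sketch the fresh nonterminal names (`X1`, `X2`, ...) are an assumption; the slide uses the more readable name `Arguments` for the same purpose.

```python
def binarize(rules):
    """Split every rule with more than two right-hand symbols into binary rules."""
    out = []
    counter = 0
    for lhs, rhs in rules:
        rhs = list(rhs)
        while len(rhs) > 2:
            counter += 1
            new_sym = f"X{counter}"        # hypothetical fresh nonterminal name
            out.append((lhs, (rhs[0], new_sym)))
            lhs, rhs = new_sym, rhs[1:]    # the rest of the rule hangs under new_sym
        out.append((lhs, tuple(rhs)))
    return out


binary = binarize([("VP", ("V", "NP", "PP")), ("PP", ("Prep", "NP"))])
# binary == [("VP", ("V", "X1")), ("X1", ("NP", "PP")), ("PP", ("Prep", "NP"))]
```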
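CKY chart
Input: 0 The 1 guy 2 ate 3 the 4 ice-cream 5 on 6 the 7 table 8
Initial chart (one part-of-speech label per word):
  [0,1] DT   [1,2] NN   [2,3] VBD   [3,4] DT   [4,5] NN   [5,6] IN   [6,7] DT   [7,8] NN

Applying the combination step
  [0,2] NP is added (from DT [0,1] and NN [1,2]).

Ambiguity!
  Rules used: 5. NP → NN PP; 8.a. VP → V Arguments; 8.b. Arguments → NP PP
  Row 0: [0,1] DT, [0,2] NP, [0,8] S
  Row 1: [1,2] NN
  Row 2: [2,3] VBD, [2,8] VP
  Row 3: [3,4] DT, [3,5] NP, [3,8] NP, Args
  Row 4: [4,5] NN
  Row 5: [5,6] IN, [5,8] PP
  Row 6: [6,7] DT, [6,8] NP
  Row 7: [7,8] NN
  Cell [3,8] holds two labels (NP and Args): the PP can be part of the NP or a separate argument of the verb.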
Earley algorithm (top-down)
• Find constituents and partial constituents from the input.
• A → B C . D E is a partial constituent.
• Proceed incrementally from left to right.
[Figure: combining the partial constituent A → B C . D E with a completed D yields A → B C D . E]

Example grammar
  ROOT → S       NP → Papa
  S → NP VP      N → caviar
  NP → Det N     N → spoon
  NP → NP PP     V → ate
  VP → VP PP     P → with
  VP → V NP      Det → the
  PP → P NP      Det → a
-
Recursive Descent (recursive)
Grammar:
  ROOT → S       VP → VP PP     NP → Papa     V → ate
  S → NP VP      VP → V NP      N → caviar    P → with
  NP → Det N     PP → P NP      N → spoon     Det → the
  NP → NP PP                                  Det → a
Input: 0 Papa 1 ate 2 the 3 caviar 4 with 5 a 6 spoon 7

• 0 ROOT → . S 0
• 0 S → . NP VP 0
  • 0 NP → . Papa 0
  • 0 NP → Papa . 1
• 0 S → NP . VP 1
[Figure: goal stack, with ROOT → S, S → NP VP and NP → Papa expanded and VP still pending]
Recursive Descent
Same grammar and input, with VP → VP PP tried first.
• 0 S → NP . VP 1
  • 1 VP → . VP PP 1
    • 1 VP → . VP PP 1
      • 1 VP → . VP PP 1
        • 1 VP → . VP PP 1 ... stack overflowed
[Figure: goal stack growing without bound as VP → VP PP is expanded again and again]
Recursive Descent
Same grammar and input, now with VP → V NP listed before VP → VP PP.
• 0 ROOT → . S 0
• 0 S → . NP VP 0
  • 0 NP → . Papa 0
  • 0 NP → Papa . 1
• 0 S → NP . VP 1
  • 1 VP → . V NP 1      after the dot = nonterminal: recursively look for this symbol (predict)
    • 1 V → . ate 1      after the dot = terminal: look for it in the input (scan)
    • 1 V → ate . 2      after the dot = nothing: the parent's child is complete (attach)
  • 1 VP → V . NP 2      predict (next child)
    • 2 NP → . ... 2     parsing continues, and eventually
    • 2 NP → ... . 7     the parent's NP child is complete (attach)
  • 1 VP → V NP . 7      attach
• 0 S → NP VP . 7        attach
Recursive Descent
Same grammar and input (VP → V NP before VP → VP PP).
• 0 ROOT → . S 0
• 0 S → . NP VP 0
  • 0 NP → . Papa 0
  • 0 NP → Papa . 1
• 0 S → NP . VP 1
  • 1 VP → . V NP 1
    • 1 V → . ate 1
    • 1 V → ate . 2
  • 1 VP → V . NP 2
    • 2 NP → . ... 2
    • 2 NP → ... . 7
  • 1 VP → V NP . 7
• 0 S → NP VP . 7
Implemented with function calls: S() calls NP() and VP(); VP is expanded recursively.
The parser must still backtrack to try another VP rule.
Recursive Descent
Same grammar and input (VP → V NP before VP → VP PP).
• 0 ROOT → . S 0
• 0 S → . NP VP 0
  • 0 NP → . Papa 0
  • 0 NP → Papa . 1
• 0 S → NP . VP 1
  • 1 VP → . V NP 1
    • 1 V → . ate 1
    • 1 V → ate . 2
  • 1 VP → V . NP 2
    • 2 NP → . ... 2     parsing continues, and eventually
    • 2 NP → ... . 4     the NP may in fact span 2 to 4
  • this choice also requires backtracking
  • 1 VP → . VP PP 1
Recursive Descent
Same grammar and input (VP → V NP before VP → VP PP).
• 0 ROOT → . S 0
• 0 S → . NP VP 0
  • 0 NP → . Papa 0
  • 0 NP → Papa . 1
• 0 S → NP . VP 1
  • 1 VP → . VP PP 1
    • 1 VP → . VP PP 1
      • 1 VP → . VP PP 1 ... stack overflowed
• Reordering the rules does not solve the problem.
• The rule set must be changed to eliminate left recursion.
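One standard way to change the rule set is the textbook transformation for immediate left recursion: A → A α | β becomes A → β A', A' → α A' | ε. The slides do not spell this step out, so the sketch below is an illustration only; the function name and the primed nonterminal name are assumptions.

```python
def eliminate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A a | b as A -> b A', A' -> a A' | (empty)."""
    new_nt = nonterminal + "'"             # hypothetical name for the new symbol
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    others = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:                      # nothing to do
        return {nonterminal: alternatives}
    return {
        nonterminal: [alt + [new_nt] for alt in others],
        new_nt: [alt + [new_nt] for alt in recursive] + [[]],   # [] is the empty rule
    }


rules = eliminate_left_recursion("VP", [["VP", "PP"], ["V", "NP"]])
# rules == {"VP": [["V", "NP", "VP'"]], "VP'": [["PP", "VP'"], []]}
```

The rewritten grammar accepts the same strings, but the trees change shape (the PPs now hang to the right), which is one reason to prefer an algorithm that copes with left recursion directly.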
-
Earley algorithm
• The Earley algorithm resembles the recursive algorithm above, but it solves the left-recursion problem.
• It uses a parse table like the CKY algorithm to store the information already found: dynamic programming.

Operations of the algorithm
• Process the part after the dot recursively:
  • If it is a word, scan the input to see whether it matches.
  • If it is a nonterminal, predict the ways it could be matched (the number of predictions can be reduced by looking ahead k symbols in the input and using only the rules compatible with those k symbols).
  • If it is empty, a constituent has been completed; attach it to the relevant places.
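The three operations can be sketched as a compact recognizer. This is an illustration under stated assumptions rather than the lecture's implementation: an item is stored as a tuple (start, left-hand side, right-hand side, dot position), the grammar has no empty rules, no lookahead is used (k = 0), and the function name is hypothetical. The grammar is the Papa example.

```python
GRAMMAR = {
    "ROOT": [("S",)],
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP"), ("Papa",)],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
    "N": [("caviar",), ("spoon",)],
    "V": [("ate",)],
    "P": [("with",)],
    "Det": [("the",), ("a",)],
}


def earley_recognize(words, grammar, root="ROOT"):
    """Return True if `words` can be derived from `root` (grammar without empty rules)."""
    # chart[j] holds the items that end at position j
    chart = [[] for _ in range(len(words) + 1)]

    def add(j, item):
        if item not in chart[j]:           # never add a duplicate item
            chart[j].append(item)

    for rhs in grammar[root]:
        add(0, (0, root, rhs, 0))
    for j in range(len(words) + 1):
        k = 0
        while k < len(chart[j]):           # the column may grow while it is processed
            start, lhs, rhs, dot = chart[j][k]
            k += 1
            if dot == len(rhs):                              # attach
                for s2, l2, r2, d2 in list(chart[start]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        add(j, (s2, l2, r2, d2 + 1))
            elif rhs[dot] in grammar:                        # predict
                for new_rhs in grammar[rhs[dot]]:
                    add(j, (j, rhs[dot], new_rhs, 0))
            elif j < len(words) and words[j] == rhs[dot]:    # scan
                add(j + 1, (start, lhs, rhs, dot + 1))
    return any(s == 0 and l == root and d == len(r) for s, l, r, d in chart[-1])


full = earley_recognize("Papa ate the caviar with a spoon".split(), GRAMMAR)
partial = earley_recognize("Papa ate the".split(), GRAMMAR)
```

The duplicate check in `add` is what tames left recursion: predicting NP → . NP PP a second time adds nothing, so the column stops growing.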
Column 0 (before any word is read). Each item is written as "start position, rule", with the dot marking how much of the rule has been found.
• Initialize: 0 ROOT → . S, corresponding to (0, ROOT → . S).
• Predict the rules whose left-hand side is S: 0 S → . NP VP, i.e. (0, S → . NP VP).
• Predict the rules whose left-hand side is NP (3 rules match): 0 NP → . Det N, 0 NP → . NP PP, 0 NP → . Papa.
• Predict the rules whose left-hand side is Det (2 rules): 0 Det → . the, 0 Det → . a.
• Predict the rules whose left-hand side is NP: this was already done in an earlier step, so it is not repeated. Note that this repeated prediction arises from the left-recursive rule.
-
Column 1 (after reading 0 Papa 1)
• scan: the word matches the input, giving 0 NP → Papa .
• scan: 0 Det → . the does not match.
• scan: 0 Det → . a does not match.
• attach the newly built NP (starting at 0) to the relevant items (the incomplete items ending at 0 that have NP after the dot): 0 S → NP . VP and 0 NP → NP . PP.
• predict: 1 VP → . V NP, 1 VP → . VP PP.
• predict: 1 PP → . P NP.
• predict: 1 V → . ate.
• predict: nothing new is added.
• predict: 1 P → . with.
predict
Column 2 (after reading 1 ate 2)
• scan: success, giving 1 V → ate .
• scan: 1 P → . with does not match.
• attach: 1 VP → V . NP.
• predict: 2 NP → . Det N, 2 NP → . NP PP, 2 NP → . Papa.
• predict: 2 Det → . the, 2 Det → . a (the following steps are similar).
• predict: nothing new is added.
• scan: this time it fails, because Papa is not the next word.
Column 3 (after reading 2 the 3)
• scan: success, giving 2 Det → the .
• attach: 2 NP → Det . N.
• predict: 3 N → . caviar, 3 N → . spoon.
Column 4 (after reading 3 caviar 4)
• scan: 3 N → caviar .
• attach: 2 NP → Det N .
• attach: 1 VP → V NP . and 2 NP → NP . PP.
• attach: 0 S → NP VP . and 1 VP → VP . PP.
• predict: 4 PP → . P NP.
• attach: 0 ROOT → S . (the prefix "Papa ate the caviar" is already a complete sentence).
• predict: 4 P → . with.
-
Column 5 (after reading 4 with 5)
• scan: 4 P → with .
• attach: 4 PP → P . NP.
• predict: 5 NP → . Det N, 5 NP → . NP PP, 5 NP → . Papa.
• predict: 5 Det → . the, 5 Det → . a.
• The remaining steps in this column add nothing new.
Column 6 (after reading 5 a 6)
• scan: 5 Det → a .
• attach: 5 NP → Det . N.
• predict: 6 N → . caviar, 6 N → . spoon.
ate 2 the 3 caviar 4 with 5 a 6 spoon 7 . 1 V ate . 2 Det the . 3 N caviar . 4 P with . 5 Det a . 6 N spoon .P 1 VP V . NP 2 NP Det . N 2 NP Det N . 4 PP P . NP 5 NP Det . NPP 2 NP . Det N 3 N . caviar 1 VP V NP . 5 NP . Det N 6 N . caviarP 2 NP . NP PP 3 N . spoon 2 NP NP . PP 5 NP . NP PP 6 N . spoonPP 2 NP . Papa 0 S NP VP . 5 NP . PapaP 2 D t th 1 VP VP PP 5 D t th
84
P 2 Det . the 1 VP VP . PP 5 Det . the2 Det . a 4 PP . P NP 5 Det . a
0 ROOT S .4 P . with
-
ate 2 the 3 caviar 4 with 5 a 6 spoon 7 . 1 V ate . 2 Det the . 3 N caviar . 4 P with . 5 Det a . 6 N spoon .P 1 VP V . NP 2 NP Det . N 2 NP Det N . 4 PP P . NP 5 NP Det . N 5 NP Det N .PP 2 NP . Det N 3 N . caviar 1 VP V NP . 5 NP . Det N 6 N . caviarP 2 NP . NP PP 3 N . spoon 2 NP NP . PP 5 NP . NP PP 6 N . spoonPP 2 NP . Papa 0 S NP VP . 5 NP . PapaP 2 D t th 1 VP VP PP 5 D t th
85
P 2 Det . the 1 VP VP . PP 5 Det . the2 Det . a 4 PP . P NP 5 Det . a
0 ROOT S .4 P . with
ate 2 the 3 caviar 4 with 5 a 6 spoon 7 . 1 V ate . 2 Det the . 3 N caviar . 4 P with . 5 Det a . 6 N spoon .P 1 VP V . NP 2 NP Det . N 2 NP Det N . 4 PP P . NP 5 NP Det . N 5 NP Det N .PP 2 NP . Det N 3 N . caviar 1 VP V NP . 5 NP . Det N 6 N . caviar 4 PP P NP .P 2 NP . NP PP 3 N . spoon 2 NP NP . PP 5 NP . NP PP 6 N . spoon 5 NP NP . PPPP 2 NP . Papa 0 S NP VP . 5 NP . PapaP 2 D t th 1 VP VP PP 5 D t th
86
P 2 Det . the 1 VP VP . PP 5 Det . the2 Det . a 4 PP . P NP 5 Det . a
0 ROOT S .4 P . with
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 70 ROOT . S 0 NP Papa . 1 V ate . 2 Det the . 3 N caviar . 6 N spoon .0 S . NP VP 0 S NP . VP 1 VP V . NP 2 NP Det . N 2 NP Det N . 5 NP Det N .0 NP . Det N 0 NP NP . PP 2 NP . Det N 3 N . caviar 1 VP V NP . 4 PP P NP .0 NP . NP PP 1 VP . V NP 2 NP . NP PP 3 N . spoon 2 NP NP . PP 5 NP NP . PP0 NP . Papa 1 VP . VP PP 2 NP . Papa 0 S NP VP . 2 NP NP PP .0 D t th 1 PP P NP 2 D t th 1 VP VP PP 1 VP VP PP
87
0 Det . the 1 PP . P NP 2 Det . the 1 VP VP . PP 1 VP VP PP .0 Det . a 1 V . ate 2 Det . a 4 PP . P NP
1 P . with 0 ROOT S .4 P . with
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 70 ROOT . S 0 NP Papa . 1 V ate . 2 Det the . 3 N caviar . 6 N spoon .0 S . NP VP 0 S NP . VP 1 VP V . NP 2 NP Det . N 2 NP Det N . 5 NP Det N .0 NP . Det N 0 NP NP . PP 2 NP . Det N 3 N . caviar 1 VP V NP . 4 PP P NP .0 NP . NP PP 1 VP . V NP 2 NP . NP PP 3 N . spoon 2 NP NP . PP 5 NP NP . PP0 NP . Papa 1 VP . VP PP 2 NP . Papa 0 S NP VP . 2 NP NP PP .0 D t th 1 PP P NP 2 D t th 1 VP VP PP 1 VP VP PP
88
0 Det . the 1 PP . P NP 2 Det . the 1 VP VP . PP 1 VP VP PP .0 Det . a 1 V . ate 2 Det . a 4 PP . P NP 7 PP . P NP
1 P . with 0 ROOT S .4 P . with
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 70 ROOT . S 0 NP Papa . 1 V ate . 2 Det the . 3 N caviar . 6 N spoon .0 S . NP VP 0 S NP . VP 1 VP V . NP 2 NP Det . N 2 NP Det N . 5 NP Det N .0 NP . Det N 0 NP NP . PP 2 NP . Det N 3 N . caviar 1 VP V NP . 4 PP P NP .0 NP . NP PP 1 VP . V NP 2 NP . NP PP 3 N . spoon 2 NP NP . PP 5 NP NP . PP0 NP . Papa 1 VP . VP PP 2 NP . Papa 0 S NP VP . 2 NP NP PP .0 D t th 1 PP P NP 2 D t th 1 VP VP PP 1 VP VP PP
89
0 Det . the 1 PP . P NP 2 Det . the 1 VP VP . PP 1 VP VP PP .0 Det . a 1 V . ate 2 Det . a 4 PP . P NP 7 PP . P NP
1 P . with 0 ROOT S . 1 VP V NP .4 P . with 2 NP NP . PP
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 70 ROOT . S 0 NP Papa . 1 V ate . 2 Det the . 3 N caviar . 6 N spoon .0 S . NP VP 0 S NP . VP 1 VP V . NP 2 NP Det . N 2 NP Det N . 5 NP Det N .0 NP . Det N 0 NP NP . PP 2 NP . Det N 3 N . caviar 1 VP V NP . 4 PP P NP .0 NP . NP PP 1 VP . V NP 2 NP . NP PP 3 N . spoon 2 NP NP . PP 5 NP NP . PP0 NP . Papa 1 VP . VP PP 2 NP . Papa 0 S NP VP . 2 NP NP PP .0 D t th 1 PP P NP 2 D t th 1 VP VP PP 1 VP VP PP
90
0 Det . the 1 PP . P NP 2 Det . the 1 VP VP . PP 1 VP VP PP .0 Det . a 1 V . ate 2 Det . a 4 PP . P NP 7 PP . P NP
1 P . with 0 ROOT S . 1 VP V NP .4 P . with 2 NP NP . PP
0 S NP VP .1 VP VP . PP
-
0 Papa 1 ate 2 the 3 caviar 4 with a spoon 70 ROOT . S 0 NP Papa . 1 V ate . 2 Det the . 3 N caviar . 6 N spoon .0 S . NP VP 0 S NP . VP 1 VP V . NP 2 NP Det . N 2 NP Det N . 5 NP Det N .0 NP . Det N 0 NP NP . PP 2 NP . Det N 3 N . caviar 1 VP V NP . 4 PP P NP .0 NP . NP PP 1 VP . V NP 2 NP . NP PP 3 N . spoon 2 NP NP . PP 5 NP NP . PP0 NP . Papa 1 VP . VP PP 2 NP . Papa 0 S NP VP . 2 NP NP PP .0 D t th 1 PP P NP 2 D t th 1 VP VP PP 1 VP VP PP
91
0 Det . the 1 PP . P NP 2 Det . the 1 VP VP . PP 1 VP VP PP .0 Det . a 1 V . ate 2 Det . a 4 PP . P NP 7 PP . P NP
1 P . with 0 ROOT S . 1 VP V NP .4 P . with 2 NP NP . PP
0 S NP VP .1 VP VP . PP7 P . with
Earley chart for the input: 0 Papa 1 ate 2 the 3 caviar 4 with a spoon 7
Each item is written "start LHS → RHS", with "." marking the dot position.

Column 0:
  0 ROOT → . S
  0 S → . NP VP
  0 NP → . Det N
  0 NP → . NP PP
  0 NP → . Papa
  0 Det → . the
  0 Det → . a
Column 1 (after "Papa"):
  0 NP → Papa .
  0 S → NP . VP
  0 NP → NP . PP
  1 VP → . V NP
  1 VP → . VP PP
  1 PP → . P NP
  1 V → . ate
  1 P → . with
Column 2 (after "ate"):
  1 V → ate .
  1 VP → V . NP
  2 NP → . Det N
  2 NP → . NP PP
  2 NP → . Papa
  2 Det → . the
  2 Det → . a
Column 3 (after "the"):
  2 Det → the .
  2 NP → Det . N
  3 N → . caviar
  3 N → . spoon
Column 4 (after "caviar"):
  3 N → caviar .
  2 NP → Det N .
  1 VP → V NP .
  2 NP → NP . PP
  0 S → NP VP .
  1 VP → VP . PP
  4 PP → . P NP
  0 ROOT → S .
  4 P → . with
Column 7 (after "spoon"; columns 5 and 6 are not shown):
  6 N → spoon .
  5 NP → Det N .
  4 PP → P NP .
  5 NP → NP . PP
  2 NP → NP PP .
  1 VP → VP PP .
  7 PP → . P NP
  1 VP → V NP .
  2 NP → NP . PP
  0 S → NP VP .
  1 VP → VP . PP
  7 P → . with
  0 ROOT → S .
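The chart can be reproduced with a short recognizer. The following is a minimal sketch, not the lecture's own implementation: the grammar is read off the chart items, the names `Item`, `GRAMMAR`, `earley`, and `accepts` are illustrative, and the code assumes a grammar without empty rules.

```python
from collections import namedtuple

# An Earley item: the rule lhs -> rhs, the dot position, and its start column.
Item = namedtuple("Item", "start lhs rhs dot")

# Toy grammar read off the chart (an assumption, not given explicitly in the slides).
GRAMMAR = {
    "ROOT": [("S",)],
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP"), ("Papa",)],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
    "Det": [("the",), ("a",)],
    "N": [("caviar",), ("spoon",)],
    "V": [("ate",)],
    "P": [("with",)],
}

def earley(words, grammar=GRAMMAR, root="ROOT"):
    """Return the chart: chart[j] lists the items that end at position j."""
    n = len(words)
    chart = [[] for _ in range(n + 1)]
    seen = [set() for _ in range(n + 1)]

    def add(col, item):
        # Duplicate suppression is what makes left recursion harmless.
        if item not in seen[col]:
            seen[col].add(item)
            chart[col].append(item)

    for rhs in grammar[root]:
        add(0, Item(0, root, rhs, 0))

    for j in range(n + 1):
        k = 0
        while k < len(chart[j]):              # the column grows while we scan it
            item = chart[j][k]
            k += 1
            if item.dot < len(item.rhs):
                nxt = item.rhs[item.dot]
                if nxt in grammar:            # predict
                    for rhs in grammar[nxt]:
                        add(j, Item(j, nxt, rhs, 0))
                elif j < n and words[j] == nxt:   # scan
                    add(j + 1, item._replace(dot=item.dot + 1))
            else:                             # attach (complete)
                for cust in chart[item.start]:
                    if cust.dot < len(cust.rhs) and cust.rhs[cust.dot] == item.lhs:
                        add(j, cust._replace(dot=cust.dot + 1))
    return chart

def accepts(words, grammar=GRAMMAR, root="ROOT"):
    """True if a completed root item spans the whole input."""
    chart = earley(words, grammar, root)
    return any(it.lhs == root and it.start == 0 and it.dot == len(it.rhs)
               for it in chart[-1])

chart = earley("Papa ate the caviar with a spoon".split())
```

After running it, `chart[4]` contains the item `1 VP → VP . PP` and `chart[7]` contains `0 ROOT → S .`, as in the chart above.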
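Problem with top-down parsing: left recursion
- With a left-recursive rule such as VP → VP PP, a top-down parser keeps attaching new copies of the rule to the tree before it has seen any PP.
- It would have to guess in advance how many PPs the input contains.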
But the Earley algorithm is OK!
- The left-recursive rule is predicted as the item 1 VP → . VP PP (column 1).
- Attaching the VP "ate the caviar" (VP → V NP) to it gives 1 VP → VP . PP (column 4).
- The item in column 1 can be reused: attaching the VP "ate the caviar with a spoon" gives 1 VP → VP . PP (column 7).
- It can be reused again for a further PP, "in his bed": the VP is completed as 1 VP → VP PP . and attached once more, giving 1 VP → VP . PP (column 10).
Recovering the parse tree
Use a simple queue-based algorithm over the useful items:
- An item in the final state set is useful.
- If an item [s, i] in state set j is useful and s = qr, then the items [q, i] in state set k and [r, k] in state set j that it was built from are also useful (i ≤ k ≤ j).

  mark every item in state set S_n of the form [Start → S ., 0]
  for j = n downto 0 do
    for i = 0 to j do
      for each marked item [s, i] in state set j do
        for k = i to j do
          if [q, i] ∈ S_k and [r, k] ∈ S_j and s = qr then
            mark [q, i] and [r, k]

[s, i]: an item with dotted rule s and back pointer i.
Advantages
- The Earley algorithm performs some top-down filtering: every item (state, or triple) added to a state set must be compatible with what has been derived to its left.
  Example: $S \Rightarrow^{*} w_i\,\alpha$, where $w_i$ is the part of the sentence scanned so far.
Disadvantages
- Explicit representation of rules: wastes time building them.
- Filters on the left context but not on the right.
Lookahead filtering for a nonterminal A:
  $\mathrm{FIRST}(A) = \{x \mid A \Rightarrow^{*} x\,\alpha\}$, where $x$ is a single token
  e.g., FIRST(S) = {who, did, the, ...}
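As an illustration of the FIRST sets used for lookahead filtering, here is a minimal sketch that computes them by fixed-point iteration. The toy grammar and the name `first_sets` are assumptions made for this example, and the code assumes there are no empty rules.

```python
# Hypothetical toy grammar (not from the slides): nonterminal -> list of right-hand sides.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("Det", "N"), ("NP", "PP"), ("Papa",)],
    "VP": [("V", "NP"), ("VP", "PP")],
    "PP": [("P", "NP")],
    "Det": [("the",), ("a",)],
    "N": [("caviar",), ("spoon",)],
    "V": [("ate",)],
    "P": [("with",)],
}

def first_sets(grammar):
    """FIRST(A) = set of tokens that can begin a string derived from A.

    Assumes no empty right-hand sides, so only the first symbol matters.
    """
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                      # iterate until nothing new is added
        changed = False
        for lhs, expansions in grammar.items():
            for rhs in expansions:
                head = rhs[0]
                new = first[head] if head in grammar else {head}
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    return first

FIRST = first_sets(GRAMMAR)
```

A parser can then skip predicting a nonterminal A whenever the next input token is not in `FIRST[A]`.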
Other methods
- Other methods correspond to different ways of finding spans.
- A span X[i, j] is a span labeled X covering the input from i to j.
Example:
  John ate ice cream on the table
  0 John 1 ate 2 ice-cream 3 on 4 the 5 table 6
  PP[3,6]; S[0,6]
- Represent the search space as an and-or tree.
- Disjuncts (or) = the different analyses.
- Conjuncts (and) = the right-hand side of a rule, e.g. the right-hand side of S is NP VP.
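Parsing as search
0 the 1 guy 2 saw 3 ice-cream 4 on 5 the 6 hill 7
[Figure: and-or tree over spans of the sentence, rooted at S(0, 7). Nodes shown include NP(0, 2) with Det(0, 1) and Noun(1, 2) for "the guy"; VP(2, 7) with V(2, 3) for "saw" and NP(3, 7); NP(3, 4) and Name(3, 4) for "ice-cream"; PP(4, 7) with Prep(4, 5) for "on"; and NP(5, 7) with Det(5, 6) and Noun(6, 7) for "the hill". Alternative branches such as NP(0, 1) with Name(0, 1) and V(1, 2) are also shown.]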
Left-corner parsing
- Look bottom-up for the first symbol (the left corner) of a constituent, then parse the rest top-down.
- Aims to combine the best features of top-down and bottom-up parsing.
Rules:
  S → NP VP
  NP → the Noun
  VP → ate NP
[Figure: tree S → NP VP; step 1 finds the left corner "the" bottom-up, step 2 predicts the remaining Noun, then "ate".]
This method works well for languages whose key constituent comes first, such as English. German, Dutch, and Japanese are languages whose key constituent comes last.
Probabilistic Parsing
Lê Thanh Hương
Department of Information Systems
School of Information and Communication Technology, Hanoi University of Science and Technology
Email: [email protected]
How do we choose the correct tree?
- Example: I saw a man with a telescope.
- As the number of rules grows, so does the potential for ambiguity.
- NYU rule set: the Apple Pie parser has 20,000-30,000 rules for English.
- Choosing which rules to apply to V DT NN PP:
  (1) VP → V NP PP
      NP → DT NN
  (2) VP → V NP
      NP → DT NN PP
Word association (bigram probabilities)
Example:
- Eat ice-cream (high frequency)
- Eat John (low, except on Survivor)
Disadvantages:
- P(John decided to bake a) has a high probability.
- Consider the bigram approximation:
  $P(w_1 w_2 w_3) = P(w_3 \mid w_1 w_2)\,P(w_2 \mid w_1)\,P(w_1) \approx P(w_3 \mid w_2)\,P(w_2 \mid w_1)\,P(w_1)$
  This assumption is too strong: the subject can determine the object of the sentence, as in
  Clinton admires honesty
  → use grammatical structure to stop the propagation.
- Consider "Fred watered his mother's small garden." What influence does the word "garden" have?
  - Pr(garden | mother's small) is low → the trigram model does poorly.
  - Pr(garden | X is the head of the object of the verb "to water") is higher → use bigrams + grammatical relations.
Word association (bigram probabilities)
- A verb takes only certain kinds of objects → verb-with-obj, verb-without-obj.
- Compatibility between subject and object:
  John admires honesty
  Honesty admires John ???
Disadvantage: the grammar grows in size.
- One year of Wall Street Journal articles: 47,219 sentences, average length 23 words, annotated by hand; only 4.7% (2,232 sentences) have the same syntactic structure.
→ We cannot rely on finding a correct syntactic structure for a whole sentence; we have to build a set of small grammar patterns.
Example
  This apple pie looks good and is a real treat
  DT   NN    NN  VBX   JJ   CC  VBX DT JJ  NN
- Rule 1 covers "This apple pie" (NP), Rule 2 covers "a real treat" (NP), and Rule 3 covers the whole sentence (S).
Rules:
  1. NP → DT NN NN
  2. NP → DT JJ NN
  3. S → NP VBX JJ CC VBX NP
- Group (NNS, NN) as NX; (NNP, NNPS) as NPX; (VBP, VBZ, VBD) as VBX.
- Choose rules according to their frequency.
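Computing probabilities
For X = NP and Y = DT JJ NN, the rule X → Y occurs 1470 times and NP occurs 9711 times:
  $\Pr(X \to Y) = \dfrac{1470}{9711}$, reported on the slide as 0.1532
(The quotient itself evaluates to about 0.1514; 0.1532 is the value used in the example that follows.)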
Computing Pr
Sentence: The big guy ate the apple pie
Tree: S → NP VP; NP → DT JJ NN (The big guy); VP → VBX NP (ate); NP → DT JJ NN (the apple pie)
Rule probabilities:
  S → NP VP        0.35
  NP → DT JJ NN    0.1532
  VP → VBX NP      0.302

  Step  Rule applied      Chain Pr
  1     S → NP VP         0.35
  2     NP → DT JJ NN     0.1532 × 0.35 = 0.0536
  3     VP → VBX NP       0.302 × 0.0536 = 0.0162
  4     NP → DT JJ NN     0.1532 × 0.0162 = 0.0025
Pr = 0.0025
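The chain of multiplications above can be checked with a few lines of code. This is a small sketch only: the nested-tuple tree encoding and the names `RULE_PROB`, `TREE`, and `tree_prob` are assumptions made for illustration, with POS tags treated as leaves.

```python
from math import prod

# Rule probabilities from the slide: (lhs, rhs) -> probability.
RULE_PROB = {
    ("S", ("NP", "VP")): 0.35,
    ("NP", ("DT", "JJ", "NN")): 0.1532,
    ("VP", ("VBX", "NP")): 0.302,
}

# The tree as nested tuples (label, child, ...); POS tags are plain strings.
TREE = ("S",
        ("NP", "DT", "JJ", "NN"),
        ("VP", "VBX", ("NP", "DT", "JJ", "NN")))

def tree_prob(tree):
    """Probability of a tree = product of the probabilities of its rules."""
    if isinstance(tree, str):          # a POS tag contributes no rule here
        return 1.0
    label, *children = tree
    rhs = tuple(c if isinstance(c, str) else c[0] for c in children)
    return RULE_PROB[(label, rhs)] * prod(tree_prob(c) for c in children)

print(round(tree_prob(TREE), 4))
```

The order of multiplication does not matter, so the result agrees with the step-by-step chain in the table.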
Probabilistic context-free grammars
A probabilistic context-free grammar (PCFG) consists of the usual parts of a CFG plus rule probabilities:
- A set of terminals $\{w^k\}$, $k = 1, \dots, V$
- A set of nonterminals $\{N^i\}$, $i = 1, \dots, n$
- A start symbol $N^1$
- A set of rules $\{N^i \to \zeta^j\}$, where $\zeta^j$ is a sequence of terminals and nonterminals
- Rule probabilities such that $\forall i\ \sum_j P(N^i \to \zeta^j) = 1$
- The probability of a parse tree: $P(T) = \prod_{i=1}^{n} p(r(i))$, where $r(i)$ is the $i$-th rule applied in $T$
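Assumptions
- Place invariance: the probability of a subtree does not depend on where in the sentence its words occur:
  $\forall k,\ P(N^j_{k(k+c)} \to \zeta)$ is the same
- Context-free: the probability of a subtree does not depend on words outside the subtree:
  $P(N^j_{kl} \to \zeta \mid \text{words outside } w_k \dots w_l) = P(N^j_{kl} \to \zeta)$
- Ancestor-free: the probability of a subtree does not depend on nodes outside the subtree:
  $P(N^j_{kl} \to \zeta \mid \text{nodes outside the subtree } N^j_{kl}) = P(N^j_{kl} \to \zeta)$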
Algorithms
- CKY
- Beam search
- Agenda/chart-based search
Probabilistic CKY
- Data structures:
  - A dynamic programming array indexed by [i, j, a], holding the maximum probability that nonterminal a expands to the span i...j.
  - Backpointers (backptrs) linking to the constituents of the tree.
- Output: the maximum probability of a tree.
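Computing Pr by induction
- Base case: the input is a single word
  $\Pr(\text{tree}) = \Pr(A \to w_i)$
- Recursive case: the input is a string of words
  $A \Rightarrow^{*} w_{ij}$ if $\exists k: A \to B\,C,\ B \Rightarrow^{*} w_{ik},\ C \Rightarrow^{*} w_{kj},\ i \le k \le j$
  $p[i,j] = \max\big(p(A \to B\,C) \times p[i,k] \times p[k,j]\big)$
[Figure: A spans $w_{ij}$, with child B over i...k and child C over k...j.]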
Computing the Viterbi probability (CKY algorithm)
[Figure: worked CKY chart; only the value 0.0504 is recoverable.]
Example
  S → NP VP       0.80
  NP → Det N      0.30
  VP → V NP       0.20
  V → includes    0.05
  Det → the       0.50
  Det → a         0.40
  N → meal        0.01
  N → flight      0.02
Use the CYK algorithm to parse the input sentence:
  The flight includes a meal
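One way to work through this exercise is the following sketch of probabilistic (Viterbi) CKY. It is an illustration under stated assumptions rather than the lecture's own code: the names `LEXICON`, `BINARY`, `viterbi_cky`, and `build_tree` are made up here, positions are fenceposts (a span covers `words[i:j]`), and the input is lowercased to match the lexicon.

```python
# Grammar from the slide, split into lexical and binary rules.
LEXICON = {            # (A, word) -> P(A -> word)
    ("Det", "the"): 0.50, ("Det", "a"): 0.40,
    ("N", "meal"): 0.01, ("N", "flight"): 0.02,
    ("V", "includes"): 0.05,
}
BINARY = {             # (A, B, C) -> P(A -> B C)
    ("S", "NP", "VP"): 0.80,
    ("NP", "Det", "N"): 0.30,
    ("VP", "V", "NP"): 0.20,
}

def viterbi_cky(words):
    """best[i, j, A] = max probability of A deriving words[i:j]."""
    n = len(words)
    best, back = {}, {}
    for i, w in enumerate(words):                 # base case: single words
        for (a, word), p in LEXICON.items():
            if word == w:
                best[i, i + 1, a] = p
    for width in range(2, n + 1):                 # recursive case, bottom-up
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):             # split point
                for (a, b, c), p in BINARY.items():
                    cand = p * best.get((i, k, b), 0.0) * best.get((k, j, c), 0.0)
                    if cand > best.get((i, j, a), 0.0):
                        best[i, j, a] = cand
                        back[i, j, a] = (k, b, c)  # backpointer
    return best, back

def build_tree(back, words, i, j, a):
    """Follow backpointers to rebuild the best tree as nested tuples."""
    if (i, j, a) not in back:
        return (a, words[i])
    k, b, c = back[i, j, a]
    return (a, build_tree(back, words, i, k, b), build_tree(back, words, k, j, c))

words = "the flight includes a meal".split()
best, back = viterbi_cky(words)
tree = build_tree(back, words, 0, len(words), "S")
```

With this grammar the sentence has a single parse, so the Viterbi probability is simply the product of the eight rule probabilities used, 0.80 × 0.30 × 0.50 × 0.02 × 0.20 × 0.05 × 0.30 × 0.40 × 0.01.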
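Computing Pr
   1. S → NP VP          1.0
   2. VP → V NP PP       0.4
   3. VP → V NP          0.6
   4. NP → N             0.7
   5. NP → N PP          0.3
   6. PP → PREP N        1.0
   7. N → a_dog          0.3
   8. N → a_cat          0.5
   9. N → a_telescope    0.2
  10. V → saw            1.0
  11. PREP → with        1.0
Sentence: a_dog saw a_cat with a_telescope (N V N PREP N)
[Figure: two parse trees. In the left tree the PP attaches to the VP (rule 2); in the right tree it attaches to the NP (rule 5).]
  $P_l = 1 \times .7 \times .4 \times .3 \times .7 \times 1 \times .5 \times 1 \times 1 \times .2 = .00588$
  $P_r = 1 \times .7 \times .6 \times .3 \times .3 \times 1 \times .5 \times 1 \times 1 \times .2 = .00378$
→ $P_l$ is chosen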
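Forward and backward probabilities
[Figure: the words "The big brown fox" emitted along a state sequence, with state $X_t$ at time $t$ (positions 1, ..., t-1, t, ..., T).]
- Forward = the probability of everything above and including a particular node
- Backward = the probability of everything below a particular node
Forward probability: $\alpha_i(t) = P(w_{1(t-1)}, X_t = i)$
Backward probability: $\beta_i(t) = P(w_{tT} \mid X_t = i)$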
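Inside and outside probabilities
[Figure: a tree rooted at $N^1$ = Start over $w_1 \dots w_m$; the node $N^j$ dominates $w_p \dots w_q$. The inside region $\beta_j(p,q)$ lies below $N^j$; the outside region $\alpha_j(p,q)$ covers $w_1 \dots w_{p-1}$ and $w_{q+1} \dots w_m$.]
- $N^j_{pq}$ = the nonterminal $N^j$ spanning positions $p$ through $q$ of the string
- $\alpha_j$ = outside probability
- $\beta_j$ = inside probability
- $N^j$ covers the words $w_p \dots w_q$ if $N^j \Rightarrow^{*} w_p \dots w_q$

  $\alpha_j(p,q)\,\beta_j(p,q) = P(N^1 \Rightarrow^{*} w_{1m},\ N^j \Rightarrow^{*} w_{pq} \mid G) = P(N^1 \Rightarrow^{*} w_{1m} \mid G)\,P(N^j \Rightarrow^{*} w_{pq} \mid N^1 \Rightarrow^{*} w_{1m}, G)$
  $\alpha_j(p,q) = P(w_{1(p-1)},\ N^j_{pq},\ w_{(q+1)m} \mid G)$
  $\beta_j(p,q) = P(w_{pq} \mid N^j_{pq}, G)$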
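Computing the probability of a string
- Use the Inside algorithm, a dynamic programming algorithm based on inside probabilities:
  $P(w_{1m} \mid G) = P(N^1 \Rightarrow^{*} w_{1m} \mid G) = P(w_{1m} \mid N^1_{1m}, G) = \beta_1(1,m)$
- Base case:
  $\beta_j(k,k) = P(w_k \mid N^j_{kk}, G) = P(N^j \to w_k \mid G)$
- Induction:
  $\beta_j(p,q) = \sum_{r,s} \sum_{d=p}^{q-1} P(N^j \to N^r N^s)\,\beta_r(p,d)\,\beta_s(d+1,q)$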
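Induction
- Compute $\beta_j(p,q)$ for $p < q$, for every nonterminal $j$, working bottom-up.
- Multiply the three terms $P(N^j \to N^r N^s) \times \beta_r(p,d) \times \beta_s(d+1,q)$ and sum over $r$, $s$, and the split point $d$.
[Figure: $N^j$ expands to $N^r$ over $w_p \dots w_d$ and $N^s$ over $w_{d+1} \dots w_q$.]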
Example
   1. S → NP VP          1.0
   2. VP → V NP PP       0.4
   3. VP → V NP          0.6
   4. NP → N             0.7
   5. NP → N PP          0.3
   6. PP → PREP N        1.0
   7. N → a_dog          0.3
   8. N → a_cat          0.5
   9. N → a_telescope    0.2
  10. V → saw            1.0
  11. PREP → with        1.0
[Figure: the two parse trees of the sentence, one using VP → V NP PP (0.4) and one using VP → V NP (0.6) with NP → N PP (0.3).]
P(a_dog saw a_cat with a_telescope) =
  $1 \times .7 \times .4 \times .3 \times .7 \times 1 \times .5 \times 1 \times 1 \times .2 + \dots \times .6 \times \dots \times .3 \times \dots = .00588 + .00378 = .00966$
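The same total can be obtained with the Inside algorithm instead of enumerating trees. The sketch below is an illustration under explicit assumptions: the Inside recursion needs binary rules, so the example grammar is binarized by hand here. The helper symbol `X` (for VP → V X, X → NP PP) is hypothetical, and the unary rule NP → N is folded into lexical entries (for example P(NP → a_dog) = 0.7 × 0.3). Indices are 0-based and inclusive, mirroring $\beta_j(p,q)$.

```python
from collections import defaultdict

LEXICAL = {            # (A, word) -> probability
    ("N", "a_dog"): 0.3, ("N", "a_cat"): 0.5, ("N", "a_telescope"): 0.2,
    # NP -> N (0.7) folded into the lexicon: an assumption of this sketch
    ("NP", "a_dog"): 0.7 * 0.3, ("NP", "a_cat"): 0.7 * 0.5,
    ("NP", "a_telescope"): 0.7 * 0.2,
    ("V", "saw"): 1.0, ("PREP", "with"): 1.0,
}
BINARY = {             # (j, r, s) -> P(N^j -> N^r N^s)
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "X"): 0.4,     # VP -> V NP PP, split via hypothetical X
    ("X", "NP", "PP"): 1.0,
    ("VP", "V", "NP"): 0.6,
    ("NP", "N", "PP"): 0.3,
    ("PP", "PREP", "N"): 1.0,
}

def inside(words):
    """beta[p, q, j] = P(N^j derives words[p..q]), with q inclusive."""
    n = len(words)
    beta = defaultdict(float)
    for k, w in enumerate(words):                 # base case: beta_j(k, k)
        for (a, word), prob in LEXICAL.items():
            if word == w:
                beta[k, k, a] += prob
    for width in range(1, n):                     # induction, bottom-up
        for p in range(n - width):
            q = p + width
            for d in range(p, q):                 # split point
                for (j, r, s), prob in BINARY.items():
                    beta[p, q, j] += prob * beta[p, d, r] * beta[d + 1, q, s]
    return beta

words = "a_dog saw a_cat with a_telescope".split()
beta = inside(words)
sentence_prob = beta[0, len(words) - 1, "S"]
```

Because the Inside algorithm sums where Viterbi CKY takes a maximum, `sentence_prob` is the total over both attachments rather than the probability of the better tree.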
Beam search
- Search in a state space.
- Each state is a partial parse tree with an associated probability.
- At each step, keep only the highest-scoring items.
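The pruning step of beam search can be sketched in a few lines. This is only an illustration: the function name `prune`, the `(probability, description)` pair format, and the beam width are assumptions, and a real parser would apply the step to the partial parses competing at each stage.

```python
import heapq

def prune(states, beam_width):
    """Keep the beam_width states with the highest probability.

    states is a list of (probability, partial_parse) pairs.
    """
    return heapq.nlargest(beam_width, states, key=lambda pair: pair[0])

# Two competing analyses with the probabilities from the earlier example.
candidates = [
    (0.00378, "PP attached to NP"),
    (0.00588, "PP attached to VP"),
]
beam = prune(candidates, beam_width=1)
```

A narrow beam is fast but may discard a partial parse that would have led to the best complete tree.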
Enriching PCFGs
- A simple PCFG performs poorly because of its independence assumptions.
- Solution: add more information.
- Structural dependency:
  - The expansion of a node depends on its position in the tree (independently of its lexical content).
  - Example: enrich a node by recording information about its parent: an NP whose parent is S (NP^S) differs from an NP whose parent is VP (NP^VP).
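Parent annotation can be sketched as a simple tree transformation. The nested-tuple tree encoding, the `label^parent` naming, and the function name `annotate_parent` are assumptions made for this illustration.

```python
def annotate_parent(tree, parent=None):
    """Return a copy of the tree whose node labels also record the parent label.

    A tree is (label, child, ...); a leaf word is a plain string.
    """
    if isinstance(tree, str):            # words are left unchanged
        return tree
    label, *children = tree
    new_label = f"{label}^{parent}" if parent else label
    return (new_label, *(annotate_parent(c, label) for c in children))

tree = ("S",
        ("NP", "Papa"),
        ("VP", ("V", "ate"),
               ("NP", ("Det", "the"), ("N", "caviar"))))
annotated = annotate_parent(tree)
```

After the transformation the subject is labeled `NP^S` and the object `NP^VP`, so rule probabilities estimated from annotated trees can differ for the two positions.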
Enriching PCFGs
- Lexicalized PCFG: PLCFG (Probabilistic Lexicalized CFG; Collins 1997; Charniak 1997)
- Attach lexical items to the nodes of the rules
z Cu trc H