Multiple-Path Syntactic Analyzer

2003. 10. 8

Dae-Won Park


Introduction

By S. Kuno and A. G. Oettinger, 1962

Method of predictive syntactic analysis
- Obtains a single most probable description of the structure of an input sentence in a single left-to-right scan through the sentence
- Prediction pool: similar to a pushdown store

Difficulty of predictive analysis in handling complex sentence structures
- Natural texts contain many syntactically ambiguous sentences
- When a single-path analysis comes to a dead end, it is hard to determine which of the previous branch points caused the failure
- There is no effective method for distinguishing paths
-> it is not possible to try different paths in a systematic, loop-free sequence


Introduction

Extending the predictive approach
- Includes effective provisions for multiple analyses of syntactically ambiguous sentences
- Variable-size prediction pool consisting of one or more subpools
- Each subpool is a pushdown store (stack)

Case: the (k-1)st word has been processed
- The prediction pool contains a subpool for each sentence structure compatible with the first (k-1) words
- The topmost prediction of each subpool is tested against all the homographs of the k-th word
- After the last word has been processed, the paths are traced back
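
The per-word step described above can be sketched in Python. This is only an illustration under stated assumptions: the toy dictionary, the toy grammar, the names `analyze`, `TOY_DICTIONARY` and `TOY_GRAMMAR`, and the rule that a path is complete when its subpool is empty are all hypothetical and are not taken from the original system.

```python
from typing import Dict, List, Tuple

# Hypothetical toy data, NOT the dictionary or grammar of the original system.
TOY_DICTIONARY: Dict[str, List[str]] = {
    "DOGS": ["N"],
    "CHASE": ["V", "N"],  # a word with two homographs
    "CATS": ["N"],
}
# (topmost prediction, word class) -> alternative replacement predictions,
# each listed with the prediction that should end up on top first.
TOY_GRAMMAR: Dict[Tuple[str, str], List[Tuple[str, ...]]] = {
    ("S", "N"): [("VP",)],
    ("VP", "V"): [(), ("OBJ",)],
    ("OBJ", "N"): [()],
}

Step = Tuple[str, str]  # (prediction, word class) used at one word
Path = Tuple[Tuple[str, ...], Tuple[Step, ...]]  # (subpool stack, steps so far)


def analyze(words, dictionary=TOY_DICTIONARY, grammar=TOY_GRAMMAR, start="S"):
    """Return the step history of every path that survives all words."""
    paths: List[Path] = [((start,), ())]
    for word in words:
        next_paths: List[Path] = []
        for stack, history in paths:
            if not stack:
                continue  # no prediction left to match this word
            rest, top = stack[:-1], stack[-1]
            # Test the topmost prediction against every homograph of the word.
            for word_class in dictionary[word]:
                for replacement in grammar.get((top, word_class), []):
                    new_stack = rest + tuple(reversed(replacement))
                    next_paths.append((new_stack, history + ((top, word_class),)))
        paths = next_paths
    # Assumption: a path is complete when its subpool is empty after the last word.
    return [history for stack, history in paths if not stack]
```

Calling `analyze(["DOGS", "CHASE", "CATS"])` keeps one subpool per compatible structure at each word and returns the recorded steps of each surviving path, which plays the role of tracing back the paths after the last word.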


Background: Dictionary and Syntactic Word Classes

During processing, each word of an input sentence is looked up in a dictionary, where it is coded for membership in all of its syntactic word classes
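
A dictionary lookup of this kind might look like the following sketch. The word classes for THEY and ARE are the ones named in the sample analysis in this transcript; the dictionary layout and the function name `homographs` are assumptions made for illustration.

```python
from typing import Dict, List

# Word classes for THEY and ARE are those named in the sample analysis;
# the dictionary layout itself is an assumption.
DICTIONARY: Dict[str, List[str]] = {
    "THEY": ["PRN"],
    "ARE": ["BE1", "BE2", "BE3"],
}


def homographs(word: str) -> List[str]:
    """Return every syntactic word class coded for the word."""
    return DICTIONARY[word.upper()]
```

A word such as ARE, coded for three classes, contributes three homographs to the analysis.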


Background: Grammar Table

The grammatical matching function G of a language is defined in terms of
- a set of predictions P, a set of syntactic word classes S, and a set of syntactic role indicators R

Prediction: stands for a certain syntactic structure recognized in the language

$$G(P_i, S_j) = \{\, [(p_1^1, p_1^2, \ldots, p_1^m), (r_1)],\ [(p_2^1, p_2^2, \ldots, p_2^m), (r_2)],\ \ldots,\ [(p_q^1, p_q^2, \ldots, p_q^m), (r_q)] \,\}$$

[(P_i, S_j), G(P_i, S_j)] in G: a rule of the grammar
Example argument pair: (P_i, S_j) = (SENTENCE, PRN), where PRN = personal pronoun in the nominative case
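
One way to picture the grammar table is as a mapping from an argument pair to its list of subrules. The sketch below is hypothetical: only the argument pair (SENTENCE, PRN) comes from the slide, while the subrule contents (including the names `PERIOD` and `SUBJECT`), the type names, and the function `match` are placeholders.

```python
from typing import Dict, List, NamedTuple, Tuple


class Subrule(NamedTuple):
    predictions: Tuple[str, ...]  # (p^1, ..., p^m): the new predictions
    role: str                     # r: the syntactic role indicator


GrammarTable = Dict[Tuple[str, str], List[Subrule]]

# The argument pair (SENTENCE, PRN) is from the slide; the subrule contents
# below are placeholders, not the entries of the original table.
G: GrammarTable = {
    ("SENTENCE", "PRN"): [
        Subrule(("PREDICATE", "PERIOD"), "SUBJECT"),
        Subrule(("ADJECTIVE CLAUSE", "PREDICATE", "PERIOD"), "SUBJECT"),
    ],
}


def match(prediction: str, word_class: str, table: GrammarTable = G) -> List[Subrule]:
    """Return G(P_i, S_j); an empty list means the pair has no rule."""
    return table.get((prediction, word_class), [])
```

An argument pair without an entry returns an empty list, which corresponds to a path that cannot be continued.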


Analysis of a sentence (1/4)

Procedure for analysis
Sample sentence: THEY ARE FLYING PLANES

Init
- Store the prediction SENTENCE

First word
- The prediction is paired with the syntactic word class (PRN) of the first word (THEY)
- The initial prediction SENTENCE is replaced by eight new predictions, found by looking up the grammar table (Table 2)

Second word
- Three syntactic word classes (BE1, BE2, BE3) are assigned to ARE
- Each is coupled with the topmost prediction of each of the eight subpools, giving 24 argument pairs: (PREDICATE, BE1), (PREDICATE, BE2), ..., (ADJECTIVE CLAUSE, BE1), ..., (COMMA, BE1), ...
- All subpools except those with PREDICATE on top are discarded
- All subrules for PREDICATE have PREDICATE VERB (Table 3)
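
The count of 24 argument pairs for the second word is simply eight topmost predictions times three word classes. A small sketch follows; only PREDICATE, ADJECTIVE CLAUSE and COMMA are named on the slide, so the remaining five topmost predictions are shown as `UNSPECIFIED-n` placeholders, and the assumption that exactly one subpool has PREDICATE on top is made purely for illustration.

```python
from itertools import product

# Only the first three topmost predictions are named on the slide;
# the other five are placeholders.
tops = ["PREDICATE", "ADJECTIVE CLAUSE", "COMMA"] + [
    f"UNSPECIFIED-{i}" for i in range(4, 9)
]
classes_of_are = ["BE1", "BE2", "BE3"]

# Every topmost prediction is coupled with every homograph of ARE.
argument_pairs = list(product(tops, classes_of_are))

# Per the slide, only pairs whose prediction is PREDICATE lead anywhere.
surviving = [pair for pair in argument_pairs if pair[0] == "PREDICATE"]

print(len(argument_pairs))  # 24
print(surviving)
```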


Analysis of a sentence (2/4)


Analysis of a sentence (3/4)

Three analyses of the sample sentence (THEY ARE FLYING PLANES): Table 4
- First: THEY refers to planes
- Second: THEY refers to people
- Third: not acceptable (such as "The facts are smoking kills")
  - semantically correct
  - ill-formed
  - acceptable form: The facts are: smoking kills


Analysis of a sentence (4/4)


Program & Conclusion

System: IBM 7090
- Limited core memory (within 32,000 words)

A path is determined in part by the choice of a single homograph S_k for each word position k (k = 1, 2, ..., n), where n is the number of words in the sentence

Total number of distinct selections, when the k-th position has h_k homographs:

$$N = \prod_{k=1}^{n} h_k$$

Running time: 12 minutes for the analysis of a 35-word sentence
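
The number of distinct homograph selections is the product of the homograph counts over all word positions, which can be checked with a short sketch. The function name `count_selections` is an assumption; in the first example, the counts 1 (THEY) and 3 (ARE) follow the sample analysis, while the counts for FLYING and PLANES are hypothetical.

```python
from math import prod
from typing import Sequence


def count_selections(homograph_counts: Sequence[int]) -> int:
    """Product of the number of homographs at each word position."""
    return prod(homograph_counts)


# THEY = 1 and ARE = 3 follow the sample analysis;
# the counts for FLYING and PLANES are hypothetical.
print(count_selections([1, 3, 2, 2]))  # 12

# Hypothetical 35-word sentence with two homographs per word.
print(count_selections([2] * 35))  # 34359738368
```

Because the count is a product, it grows exponentially with sentence length, which helps explain why a long sentence is expensive to analyze on a machine with limited core memory.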