

May 2006 CLINT-LN Parsing 1

Computational Linguistics Introduction

Parsing with

Context Free Grammars


Chomsky Hierarchy


Weak Equivalence

• A grammar should generate all and only sentences in the language under investigation.

• Let H be the language under investigation and G be the grammar we are developing.

• The grammar should generate all sentences in the language, i.e. for any s in H, s is also in L(G).

• The grammar should generate only sentences in the language, i.e. for any s in L(G), s is also in H.


All and Only

[Figure: a single region labelled H = L(G)]


Overgeneration

[Figure: region H contained inside a larger region L(G)]


Overgeneration

• Basic Problem: L(G) is larger than H

• There are sentences generated by the grammar that are not in H.

• The “only” constraint is violated.

• The grammar is too weak.

• Example: a grammar which ignores number and gender


Undergeneration

[Figure: region L(G) contained inside a larger region H]


Undergeneration

• Basic Problem: H is larger than L(G)

• There are sentences in H that are not generated by the grammar.

• The “all” constraint is violated.

• The grammar is too strong.

• Examples (for H = natural language):

– a grammar which lacks recursion

– a finite state grammar
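
The "all" and "only" conditions can be checked mechanically on finite samples. The following is a minimal sketch, assuming H and L(G) are given as small Python sets of sentences; the sentences themselves are made up for illustration.

```python
def compare(H, LG):
    """Return (overgenerated, undergenerated) for finite samples of H and L(G)."""
    overgenerated = LG - H    # in L(G) but not in H: violates "only"
    undergenerated = H - LG   # in H but not in L(G): violates "all"
    return overgenerated, undergenerated


# Hypothetical samples, not taken from the slides.
H = {"the dog barks", "the dogs bark"}
LG = {"the dog barks", "the dog bark"}

over, under = compare(H, LG)
print(sorted(over))   # ['the dog bark']
print(sorted(under))  # ['the dogs bark']
```

A grammar meets the "all and only" requirement on the sample exactly when both sets come back empty.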


Weak and Strong Equivalence

• A grammar/lexicon G generates a characteristic language L(G)

• Grammars G1 and G2 are said to be weakly equivalent if L(G1) = L(G2)

• A grammar G also assigns one or more phrase structures to any s in L(G)

• Weakly equivalent grammars G1 and G2 are said to be strongly equivalent if in addition they assign identical phrase structures to any s in L(G1).


Weak Equivalence

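G1:

A → a

A → aA

G2:

A → a

A → Aa

Both grammars generate the same strings (a, aa, aaa, ...), so they are weakly equivalent. G1 builds right-branching structures and G2 left-branching ones, so they are not strongly equivalent.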
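
The two grammars on this slide can be compared by brute force. This is a sketch only: it calls the right-recursive grammar G1 and the left-recursive one G2, enumerates strings up to an arbitrary length bound, and assumes no rule has an empty right-hand side. The helper names are invented for the example.

```python
def language(rules, start, max_len):
    """All terminal strings of length <= max_len derivable from start.

    Assumes every right-hand side is non-empty, so sentential forms never
    shrink and forms longer than max_len can be discarded.
    """
    seen, frontier, result = set(), [(start,)], set()
    while frontier:
        form = frontier.pop()
        if form in seen or len(form) > max_len:
            continue
        seen.add(form)
        # Position of the leftmost nonterminal, if any.
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:
            result.add(" ".join(form))
            continue
        for rhs in rules[form[idx]]:
            frontier.append(form[:idx] + tuple(rhs) + form[idx + 1:])
    return result


G1 = {"A": [["a"], ["a", "A"]]}   # A -> a | a A
G2 = {"A": [["a"], ["A", "a"]]}   # A -> a | A a


def tree_g1(n):
    """Bracketed structure G1 assigns to a string of n a's (right-branching)."""
    return "[A a]" if n == 1 else f"[A a {tree_g1(n - 1)}]"


def tree_g2(n):
    """Bracketed structure G2 assigns to a string of n a's (left-branching)."""
    return "[A a]" if n == 1 else f"[A {tree_g2(n - 1)} a]"


print(language(G1, "A", 4) == language(G2, "A", 4))  # True
print(tree_g1(3))  # [A a [A a [A a]]]
print(tree_g2(3))  # [A [A [A a] a] a]
```

Equal string sets up to the bound are consistent with weak equivalence; the differing bracketings show why the grammars are not strongly equivalent.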


Appropriate Structure

• The structure assigned by the grammar should be appropriate.

• The structure should

– be understandable

– allow us to make generalisations

– reflect the underlying meaning of the sentence.


Ambiguity

• A grammar is ambiguous if it assigns two or more structures to the same sentence.

• The grammar should not generate too many possible structures for the same sentence.

• There is a tradeoff between ambiguity and clarity: too much detail can obscure the design principles.

• Too little detail means that the grammar is undercommitted.


Limitations of CF Grammars

• Simple CF grammars tend to overgenerate.

• The only mechanism available to control overgeneration is to invent new categories.

• Proliferation of categories soon becomes intractable. Problems include

– Size of grammar

– Understandability of grammar


Criteria for Evaluating Grammars

• Does it undergenerate?

• Does it overgenerate?

• Does it assign appropriate structures to sentences it generates?

• Is it simple to understand? How many rules are there?

• Does it contain generalisations or special cases?

• How ambiguous is it? How many structures for a given sentence?


CF Phrase Structure Rules

s → np vp

np → d N

vp → V

vp → V np

(4 rules)

• Nice grammar, but it overgenerates.

• Solution: invent more categories nps, nppl, vps, vppl etc.


CF Phrase Structure Rules with Number Agreement

s -> nps vps

s -> nppl vppl

nps -> DS NS

nppl -> DPL NPL

vps -> VS

vps -> VS nps

vps -> VS nppl

vppl -> VPPL

vppl -> VPPL nps

vppl -> VPPL nppl

(10 rules)
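
To see what the extra categories buy, the 4-rule and 10-rule grammars can be enumerated side by side. This is a sketch under assumptions: the lexical entries (this, these, dog, dogs, barks, bark) are hypothetical, and only sentences of up to three words are generated.

```python
def language(rules, start, max_len):
    """All terminal strings of length <= max_len derivable from start.

    Assumes no rule has an empty right-hand side.
    """
    seen, frontier, result = set(), [(start,)], set()
    while frontier:
        form = frontier.pop()
        if form in seen or len(form) > max_len:
            continue
        seen.add(form)
        idx = next((i for i, s in enumerate(form) if s in rules), None)
        if idx is None:                      # only words left
            result.add(" ".join(form))
            continue
        for rhs in rules[form[idx]]:         # expand leftmost nonterminal
            frontier.append(form[:idx] + tuple(rhs) + form[idx + 1:])
    return result


# The 4-rule grammar plus a hypothetical lexicon.
FLAT = {
    "s": [["np", "vp"]],
    "np": [["d", "N"]],
    "vp": [["V"], ["V", "np"]],
    "d": [["this"], ["these"]],
    "N": [["dog"], ["dogs"]],
    "V": [["barks"], ["bark"]],
}

# The 10-rule grammar with number-specific categories, same hypothetical words.
AGREE = {
    "s": [["nps", "vps"], ["nppl", "vppl"]],
    "nps": [["DS", "NS"]],
    "nppl": [["DPL", "NPL"]],
    "vps": [["VS"], ["VS", "nps"], ["VS", "nppl"]],
    "vppl": [["VPPL"], ["VPPL", "nps"], ["VPPL", "nppl"]],
    "DS": [["this"]], "DPL": [["these"]],
    "NS": [["dog"]], "NPL": [["dogs"]],
    "VS": [["barks"]], "VPPL": [["bark"]],
}

flat = language(FLAT, "s", 3)
agree = language(AGREE, "s", 3)
print(len(flat))       # 8
print(sorted(agree))   # ['these dogs bark', 'this dog barks']
```

With this lexicon the 4-rule grammar produces eight three-word strings, six of which violate agreement (for example "these dog barks"); the 10-rule grammar keeps only the two agreeing ones, at the cost of more than doubling the rule count.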


Constraints and Information Structures

• PATR2 handles this problem by augmenting CF rules with constraints between constituents.

• Basic idea is that each constituent of a CF rule is associated with an information structure

• We then express constraints between information structures.


Example of a PATR rule with Number Constraints

Rule

s -> np vp

<np num> = <vp num>

<s num> = <np num>


Example of a Grammar with Number Constraints

s -> np vp

<np num> = <vp num>

<s num> = <np num>

np -> D N

<np num> = <D num>

<D num> = <N num>

vp -> V

<vp num> = <V num>
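
The effect of these constraints can be imitated with a very small unification step. The sketch below is not PATR itself: it hard-codes the three rules for "D N V" sentences, treats num as an atomic value, and uses a hypothetical lexicon in which "the" leaves num unspecified.

```python
FAIL = object()  # marker for a failed unification


def unify(a, b):
    """Unify two atomic feature values; None means 'unspecified'."""
    if a is None:
        return b
    if b is None:
        return a
    return a if a == b else FAIL


# Hypothetical lexicon: word -> (category, num).
LEXICON = {
    "this": ("D", "sg"), "these": ("D", "pl"), "the": ("D", None),
    "dog": ("N", "sg"), "dogs": ("N", "pl"),
    "barks": ("V", "sg"), "bark": ("V", "pl"),
}


def accepts(words):
    """Recognise 'D N V' sentences using the three rules and their num constraints."""
    if len(words) != 3:
        return False
    entries = [LEXICON.get(w) for w in words]
    if None in entries:                       # unknown word
        return False
    (d_cat, d_num), (n_cat, n_num), (v_cat, v_num) = entries
    if (d_cat, n_cat, v_cat) != ("D", "N", "V"):
        return False
    # np -> D N : <np num> = <D num>, <D num> = <N num>
    np_num = unify(d_num, n_num)
    if np_num is FAIL:
        return False
    # vp -> V : <vp num> = <V num>
    vp_num = v_num
    # s -> np vp : <np num> = <vp num>, <s num> = <np num>
    s_num = unify(np_num, vp_num)
    return s_num is not FAIL


print(accepts("the dogs bark".split()))    # True
print(accepts("the dog bark".split()))     # False
print(accepts("these dog barks".split()))  # False
```

The point of the design is that one rule per construction suffices: agreement lives in the feature values rather than in duplicated categories.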


Summary

• Pure CFGs become unwieldy when we try to constrain them to incorporate, for example, agreement information

• PATR2 deals with this problem by associating information structures and constraints with each rule constituent.

• Information structures are often referred to as F-structures.


Grammar versus Parsing

• A grammar is a description of a language.

• A grammar abstractly associates structures with all and only the strings of the language it describes.

• A parser is an implementation of an algorithm that actually discovers the structures assigned by a grammar to a sentence.

• Typically there may be several different parsing algorithms for achieving this.

– Top down strategy

– Bottom up strategy


Parse Tree

A valid parse tree for a grammar G is a tree

– whose root is the start symbol for G

– whose interior nodes are nonterminals of G

– in which the children of a node T (from left to right) correspond to the symbols on the right hand side of some production for T in G

– whose leaf nodes are terminal symbols of G.

• Every sentence generated by a grammar has a corresponding parse tree.

• Every valid parse tree exactly covers a sentence generated by the grammar.
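
The definition translates fairly directly into a checker. This is a sketch under assumptions: trees are nested tuples of the form (label, child, ...), words are plain strings, and the toy grammar for "book that flight" is written out here for illustration.

```python
# Toy grammar (an assumption for this example): label -> list of right-hand sides.
RULES = {
    "S": [["VP"]],
    "VP": [["V", "NP"]],
    "NP": [["Det", "Nom"]],
    "Nom": [["N"]],
    "V": [["book"]],
    "Det": [["that"]],
    "N": [["flight"]],
}


def node_ok(node, rules):
    """Check one node and everything below it."""
    if isinstance(node, str):          # leaf: must be a terminal symbol
        return node not in rules
    label, *children = node
    # Children, left to right, must spell out some production for label.
    rhs = [c if isinstance(c, str) else c[0] for c in children]
    return rhs in rules.get(label, []) and all(node_ok(c, rules) for c in children)


def is_valid(tree, rules, start="S"):
    """True if tree is a valid parse tree: root is the start symbol, all nodes licensed."""
    return isinstance(tree, tuple) and tree[0] == start and node_ok(tree, rules)


def leaves(node):
    """The sentence a tree covers, read off its leaves from left to right."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1:] for w in leaves(child)]


good = ("S", ("VP", ("V", "book"),
              ("NP", ("Det", "that"), ("Nom", ("N", "flight")))))
bad = ("S", ("NP", ("Det", "that"), ("Nom", ("N", "flight"))))

print(is_valid(good, RULES), leaves(good))  # True ['book', 'that', 'flight']
print(is_valid(bad, RULES))                 # False
```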


Parsing Problem

• Given grammar G and sentence A, find all valid parse trees for G that exactly cover A.

[S [VP [V book] [NP [Det that] [Nom [N flight]]]]]


Soundness and Completeness

• A parser is sound if every parse tree it returns is valid.

• A parser is complete for grammar G if for all s ∈ L(G)

– it terminates

– it produces the corresponding parse tree

• For many purposes, we settle for sound but incomplete parsers


Top Down

• A top down parser tries to build a tree from the root node S down to the leaves by replacing nodes that have non-terminal labels with the RHS of corresponding grammar rules.

• Nodes with pre-terminal (word class) labels are compared to input words.
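
A backtracking recursive-descent parser is one simple way to realise this strategy. The sketch below is illustrative rather than the course's own implementation: the grammar and lexicon are assumptions, lexical entries are written as ordinary rules, and the code would loop forever on left-recursive rules.

```python
# Illustrative grammar (an assumption); words are symbols with no rules.
RULES = {
    "S": [["NP", "VP"], ["VP"]],
    "NP": [["Det", "Nom"]],
    "Nom": [["N"]],
    "VP": [["V"], ["V", "NP"]],
    "Det": [["that"]],
    "N": [["flight"], ["book"]],
    "V": [["book"]],
}


def parse(symbol, words, i, rules):
    """Yield every (tree, j) such that symbol derives words[i:j]."""
    if symbol not in rules:                  # a word: compare with the input
        if i < len(words) and words[i] == symbol:
            yield symbol, i + 1
        return
    for rhs in rules[symbol]:                # replace the node with each RHS in turn
        for children, j in expand(rhs, words, i, rules):
            yield (symbol, *children), j


def expand(rhs, words, i, rules):
    """Yield (children, j) for every way the symbols in rhs can cover words[i:j]."""
    if not rhs:
        yield (), i
        return
    for first, j in parse(rhs[0], words, i, rules):
        for rest, k in expand(rhs[1:], words, j, rules):
            yield (first, *rest), k


def top_down(words, rules, start="S"):
    """All parse trees rooted in start that cover the whole input."""
    return [tree for tree, j in parse(start, words, 0, rules) if j == len(words)]


trees = top_down("book that flight".split(), RULES)
print(len(trees))  # 1
print(trees[0])
```

Note that the parser first tries S → NP VP and only abandons it when "book" fails to match a determiner, which is the kind of wasted top-down work discussed later in the deck.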


Top Down Search Space

[Figure: top down search space with the start node and the goal node marked]


Bottom Up

• Each state is a forest of trees.

• Start node is a forest of nodes labelled with pre-terminal categories (word classes derived from lexicon)

• Transformations look for places where RHS of rules can fit.

• Any such place is replaced with a node labelled with LHS of rule.
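
The search described above can be sketched as an exhaustive exploration of forests. The grammar and lexicon are assumptions made for the example, and the code assumes there are no cyclic unary rules (such as A → B together with B → A), since those would let forests grow without bound.

```python
from itertools import product

# Illustrative grammar: (LHS, RHS) pairs, plus a lexicon of word classes.
RULES = [
    ("S", ("NP", "VP")), ("S", ("VP",)),
    ("NP", ("Det", "Nom")), ("Nom", ("N",)),
    ("VP", ("V",)), ("VP", ("V", "NP")),
]
LEXICON = {"book": ["N", "V"], "that": ["Det"], "flight": ["N"]}


def bottom_up(words, rules, lexicon, start="S"):
    """Return every tree labelled start that covers all the words."""
    # Start states: one forest of pre-terminal nodes per choice of word classes.
    agenda = [tuple((cat, word) for cat, word in zip(cats, words))
              for cats in product(*(lexicon[w] for w in words))]
    seen, parses = set(), []
    while agenda:
        forest = agenda.pop()
        if forest in seen:
            continue
        seen.add(forest)
        if len(forest) == 1 and forest[0][0] == start:
            parses.append(forest[0])
        for lhs, rhs in rules:
            n = len(rhs)
            for i in range(len(forest) - n + 1):
                # Does the RHS fit the root labels at positions i .. i+n-1?
                if tuple(tree[0] for tree in forest[i:i + n]) == rhs:
                    node = (lhs, *forest[i:i + n])
                    agenda.append(forest[:i] + (node,) + forest[i + n:])
    return parses


parses = bottom_up("book that flight".split(), RULES, LEXICON)
print(len(parses))  # 1
print(parses[0])
```

Many of the forests visited here (for example those that treat "book" as a noun) can never lead to an S node, which illustrates the cost of purely data-driven search.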


Bottom Up Search Space



Top Down vs Bottom Up: General

• Top down

– For: Never wastes time exploring trees that cannot be derived from S

– Against: Can generate trees that are not consistent with the input

• Bottom up

– For: Never wastes time building trees that cannot lead to input text segments.

– Against: Can generate subtrees that can never lead to an S node.


Top Down Parsing - Remarks

• Top-down parsers do well if there is useful grammar driven control: search can be directed by the grammar.

• Left recursive rules can cause problems.

• A top-down parser will do badly if there are many different rules for the same LHS. Suppose there are 600 rules for S, 599 of which start with NP and one of which starts with V, and the sentence starts with V.

• Top-down is unsuitable for rewriting parts of speech (preterminals) with words (terminals). In practice that is always done bottom-up as lexical lookup.

• Useless work: expands things that are possible top-down but not there.

• Repeated work: anywhere there is common substructure


Bottom Up Parsing - Remarks

• Empty categories: termination problem unless rewriting of empty constituents is somehow restricted (but then it’s generally incomplete)

• Inefficient when there is great lexical ambiguity (grammar driven control might help here)

• Conversely, it is data-directed: it attempts to parse the words that are there.

• Both TD (LL) and BU (LR) parsers can do work exponential in the sentence length on NLP problems

• Useless work: locally possible, but globally impossible.

• Repeated work: anywhere there is common substructure


Development of a Concrete Strategy

• Combine best features of both top down and bottom up strategies.

– Top down, grammar directed control.

– Bottom up filtering.

• Examination of alternatives in parallel uses too much memory.

• Depth first strategy using agenda-based control.
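
One possible reading of this strategy is a top-down recogniser driven by a LIFO agenda, with a left-corner table acting as the bottom-up filter. The sketch below is an interpretation, not the course's actual algorithm: the grammar, lexicon and function names are assumptions, and the code assumes no empty rules and no left recursion.

```python
# Illustrative grammar and lexicon (assumptions for this example).
RULES = {
    "S": [["NP", "VP"], ["VP"]],
    "NP": [["Det", "Nom"]],
    "Nom": [["N"]],
    "VP": [["V"], ["V", "NP"]],
}
LEXICON = {"book": ["N", "V"], "that": ["Det"], "flight": ["N"]}


def left_corners(rules, lexicon):
    """Map each symbol to the set of word classes that can begin it."""
    preterminals = {cat for cats in lexicon.values() for cat in cats}
    table = {p: {p} for p in preterminals}
    for lhs in rules:
        table.setdefault(lhs, set())
    changed = True
    while changed:                            # iterate to a fixed point
        changed = False
        for lhs, alternatives in rules.items():
            for rhs in alternatives:
                new = table.get(rhs[0], set()) - table[lhs]
                if new:
                    table[lhs] |= new
                    changed = True
    return table


def recognise(words, rules, lexicon, start="S"):
    """Depth-first, agenda-based top-down recognition with a bottom-up filter."""
    table = left_corners(rules, lexicon)
    agenda = [((start,), 0)]                  # (symbols still to find, input position)
    while agenda:
        goals, i = agenda.pop()               # LIFO agenda gives depth-first search
        if not goals:
            if i == len(words):
                return True
            continue
        if i == len(words):                   # goals left but no input left
            continue
        first, rest = goals[0], goals[1:]
        cats = set(lexicon.get(words[i], []))
        if first in rules:
            for rhs in rules[first]:
                # Bottom-up filter: expand only if the next word can start this RHS.
                if table.get(rhs[0], set()) & cats:
                    agenda.append((tuple(rhs) + rest, i))
        elif first in cats:                   # word class matches the next word
            agenda.append((rest, i + 1))
    return False


print(recognise("book that flight".split(), RULES, LEXICON))  # True
print(recognise("that book".split(), RULES, LEXICON))         # False
```

With the filter in place, S → NP VP is never expanded for "book that flight", because "book" cannot begin an NP in this grammar. Keeping a single agenda rather than exploring alternatives in parallel means only the pending goal stacks are held in memory.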