Recent Developments in Natural Language Parsing
Giorgio Satta
University of Padua
Venice, May 31st, 2016
Giorgio Satta University of Padua NL Parsing
Summary
Part I
Introduction to natural language parsing
A little bit of history
Part II
Dependency grammar
Abstract meaning representation
Parsing
The term parsing derives from the Latin expression pars orationis (lit. part of speech), meaning the analysis of sentence components and their grammatical relations
Example: Rolls-Royce said it expects its U.S. sales to remain steady
1. Rolls-Royce   proper name   subject of 2
2. said          verb          main
3. it            pronoun       subject of 4
4. expects       verb          subordinate of 2
...
Parsing
In computer science, parsing refers to any process of recognition of an object on the basis of a formal grammar (e.g., compiler theory, syntactic pattern matching)
The importance of parsing stems from the fact that you cannot extract meaning without the support of syntax
Example:
What is the value of (13 + 5) ∗ 7 ?
What is the value of ( ) ∗ + 13 5 7 ? (lexicographic order)
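The contrast between the two questions can be made concrete with a small evaluator. The following is a minimal sketch, assuming a toy expression grammar with only numbers, `+`, `*` and parentheses (the function names are illustrative, not from the talk). The well-formed expression gets a value; the same tokens in lexicographic order are rejected because no structure can be assigned to them.

```python
import re

def tokenize(text):
    """Split an arithmetic expression into numbers, operators and parentheses."""
    return re.findall(r"\d+|[()+*]", text)

def parse_expr(tokens, i):
    """expr -> term ('+' term)*; returns (value, next position)."""
    value, i = parse_term(tokens, i)
    while i < len(tokens) and tokens[i] == "+":
        rhs, i = parse_term(tokens, i + 1)
        value += rhs
    return value, i

def parse_term(tokens, i):
    """term -> factor ('*' factor)*; returns (value, next position)."""
    value, i = parse_factor(tokens, i)
    while i < len(tokens) and tokens[i] == "*":
        rhs, i = parse_factor(tokens, i + 1)
        value *= rhs
    return value, i

def parse_factor(tokens, i):
    """factor -> number | '(' expr ')'; returns (value, next position)."""
    if i >= len(tokens):
        raise SyntaxError("unexpected end of input")
    tok = tokens[i]
    if tok.isdigit():
        return int(tok), i + 1
    if tok == "(":
        value, i = parse_expr(tokens, i + 1)
        if i >= len(tokens) or tokens[i] != ")":
            raise SyntaxError("expected ')'")
        return value, i + 1
    raise SyntaxError(f"unexpected token {tok!r}")

def evaluate(text):
    """Parse and evaluate the whole string, rejecting trailing tokens."""
    tokens = tokenize(text)
    value, i = parse_expr(tokens, 0)
    if i != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[i]!r}")
    return value

print(evaluate("(13 + 5) * 7"))  # 126
```

Calling `evaluate("( ) * + 13 5 7")` raises `SyntaxError`: the tokens are the same, but without syntactic structure there is no value to compute.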
Parsing Applications
In natural language (NL) processing, the parser is not a stand-alone application; rather, it is used as a component of some end-to-end system
Most popular systems exploiting parsing:
Automatic speech understanding
Information extraction
Intelligent personal assistant
Machine translation
Question answering
Text summarization
Parsing Applications
1968: HAL 9000 is a sentient computer appearing in Stanley Kubrick’s 2001: A Space Odyssey
Parsing Applications
2011: IBM Watson is a question answering system which outperformed its human opponents and former winners on the quiz show Jeopardy!
Parsing Overview
NL parsing is strongly rooted in
Generative linguistics
Formal language and automata theory
Computer algorithms
Machine learning
Generative Linguistics
In 1957, American scientist Noam Chomsky advocated for a formalized theory of linguistic structure, based on mathematically defined models
The idea started the field of generative linguistics, which revolutionized the scientific study of language
An Early Generative Model
Phrase-structure grammars (from Chomsky) are a generative model that strongly influenced NL parsing
Example: Rule system for a fragment of English (the operator ‘|’ denotes alternatives)
S → NP VP
NP → NP PP | Det N | N
VP → VP PP | V NP
PP → P NP
N → chocolate | I | fork | strawberries
V → eat
Det → a
P → with
An Early Generative Model
Example: Phrase structure
[S
  [NP [N I]]
  [VP
    [V eat]
    [NP
      [NP [N strawberries]]
      [PP [P with] [NP [N chocolate]]]]]]
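The rule system and the example phrase structure can be checked against each other mechanically. The sketch below assumes an encoding chosen only for illustration: the grammar as a Python dictionary from left-hand sides to right-hand sides, and trees as nested tuples. `licensed` and `words` are hypothetical helper names.

```python
# Rule system for the fragment of English, one entry per left-hand side.
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("NP", "PP"), ("Det", "N"), ("N",)],
    "VP": [("VP", "PP"), ("V", "NP")],
    "PP": [("P", "NP")],
    "N": [("chocolate",), ("I",), ("fork",), ("strawberries",)],
    "V": [("eat",)],
    "Det": [("a",)],
    "P": [("with",)],
}

def label(node):
    """Label of a node: the word itself, or the first element of a tuple."""
    return node if isinstance(node, str) else node[0]

def licensed(tree):
    """True if every local tree (parent plus children) matches a grammar rule."""
    if isinstance(tree, str):  # a word: nothing to check
        return True
    parent, *children = tree
    rhs = tuple(label(child) for child in children)
    return rhs in GRAMMAR.get(parent, []) and all(licensed(c) for c in children)

def words(tree):
    """Left-to-right sequence of words at the leaves of the tree."""
    if isinstance(tree, str):
        return [tree]
    return [w for child in tree[1:] for w in words(child)]

# The phrase structure of the example, as nested tuples.
TREE = ("S",
        ("NP", ("N", "I")),
        ("VP", ("V", "eat"),
               ("NP", ("NP", ("N", "strawberries")),
                      ("PP", ("P", "with"), ("NP", ("N", "chocolate"))))))

print(licensed(TREE), " ".join(words(TREE)))
```

A tree that uses a rule outside the system, such as S → VP NP, is reported as not licensed.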
An Early Generative Model
Example: Underlying grammatical relations
[S
  [NP [N I]]
  [VP
    [V eat]
    [NP
      [NP [N strawberries]]
      [PP [P with] [NP [N chocolate]]]]]]
[Figure: the tree is annotated with arcs for the grammatical relations sbj, obj, and mod]
An Early Generative Model
Example: Long distance syntactic movement
[CP
  [NPi What]
  [C′
    [C do]
    [S
      [NP [N I]]
      [VP
        [V eat]
        [NP
          [NP ti]
          [PP [P with] [NP [N chocolate]]]]]]]]
[Figure: an arc labeled mov links the trace ti to NPi]
Ambiguity
In contrast with programming languages, NL is highly ambiguous
The number of possible syntactic interpretations of a sentence with n words can grow exponentially with n
Lexical, semantic and pragmatic knowledge is needed to rule out undesired/unlikely interpretations (example next)
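The growth in the number of interpretations can be observed on the English fragment from the phrase-structure example. The sketch below assumes that grammar (re-encoded here as a Python dictionary) and counts the trees for a sentence by memoized recursion over spans. `count_parses` is an illustrative name, and the approach relies on the grammar having no empty rules and no cycles of unary rules.

```python
from functools import lru_cache

GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("NP", "PP"), ("Det", "N"), ("N",)],
    "VP": [("VP", "PP"), ("V", "NP")],
    "PP": [("P", "NP")],
    "N": [("chocolate",), ("I",), ("fork",), ("strawberries",)],
    "V": [("eat",)],
    "Det": [("a",)],
    "P": [("with",)],
}

def count_parses(words, start="S"):
    """Number of distinct trees rooted in `start` that yield `words`."""
    words = tuple(words)

    @lru_cache(maxsize=None)
    def count(symbol, i, j):
        # Trees rooted in `symbol` spanning words[i:j].
        if symbol not in GRAMMAR:  # a terminal must match exactly one word
            return int(j == i + 1 and words[i] == symbol)
        total = 0
        for rhs in GRAMMAR[symbol]:
            if len(rhs) == 1:      # unary rule: same span
                total += count(rhs[0], i, j)
            else:                  # binary rule: try every split point
                left, right = rhs
                for k in range(i + 1, j):
                    total += count(left, i, k) * count(right, k, j)
        return total

    return count(start, 0, len(words))

for sentence in ["I eat strawberries",
                 "I eat strawberries with chocolate",
                 "I eat strawberries with chocolate with a fork"]:
    print(count_parses(sentence.split()), sentence)
```

Each additional prepositional phrase multiplies the attachment choices: the three sentences have 1, 2 and 5 trees respectively.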
Ambiguity
[S
  [NP [N I]]
  [VP
    [V eat]
    [NP
      [NP [N strawberries]]
      [PP [P with] [NP [N chocolate]]]]]]
Ambiguity
[S
  [NP [N I]]
  [VP
    [VP [V eat] [NP [N strawberries]]]
    [PP [P with] [NP [Det a] [N fork]]]]]
Ambiguity
?? [S
  [NP [N I]]
  [VP
    [V eat]
    [NP
      [NP [N strawberries]]
      [PP [P with] [NP [Det a] [N fork]]]]]]
Ambiguity
?? [S
  [NP [N I]]
  [VP
    [VP [V eat] [NP [N strawberries]]]
    [PP [P with] [NP [N chocolate]]]]]
Parsing Algorithms
Chart parsing, credited to Martin Kay, is an algorithm for context-free grammar parsing that became very popular in the 1980s
Although independently discovered, it is very similar to an algorithm developed by Jay Earley in 1968
The algorithm uses dynamic programming to cope efficiently with syntactic ambiguity, running in time O(n^3) and space O(n^2), where n is the number of words in the input
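A chart of dotted items can be sketched in a few lines. The code below is a simplified Earley-style recognizer, not a reconstruction of Kay's or Earley's original formulation: it assumes the English fragment grammar from the phrase-structure example, assumes there are no empty rules, and makes no attempt to reach the stated time and space bounds. The name `earley_recognize` is illustrative.

```python
GRAMMAR = {
    "S": [("NP", "VP")],
    "NP": [("NP", "PP"), ("Det", "N"), ("N",)],
    "VP": [("VP", "PP"), ("V", "NP")],
    "PP": [("P", "NP")],
    "N": [("chocolate",), ("I",), ("fork",), ("strawberries",)],
    "V": [("eat",)],
    "Det": [("a",)],
    "P": [("with",)],
}

def earley_recognize(words, grammar, start="S"):
    """Return True if `words` can be derived from `start`."""
    n = len(words)
    # chart[j] holds items (lhs, rhs, dot, origin):
    # rhs[:dot] has been recognized over words[origin:j].
    chart = [set() for _ in range(n + 1)]
    for rhs in grammar[start]:
        chart[0].add((start, rhs, 0, 0))
    for j in range(n + 1):
        agenda = list(chart[j])
        while agenda:
            lhs, rhs, dot, origin = agenda.pop()
            new_items = []
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in grammar:                  # predict
                    new_items = [(nxt, r, 0, j) for r in grammar[nxt]]
                elif j < n and words[j] == nxt:     # scan
                    chart[j + 1].add((lhs, rhs, dot + 1, origin))
            else:                                   # complete
                new_items = [(l, r, d + 1, o)
                             for (l, r, d, o) in chart[origin]
                             if d < len(r) and r[d] == lhs]
            for item in new_items:
                if item not in chart[j]:
                    chart[j].add(item)
                    agenda.append(item)
    return any((start, rhs, len(rhs), 0) in chart[n] for rhs in grammar[start])

print(earley_recognize("I eat strawberries with chocolate".split(), GRAMMAR))
```

Because each item is stored once per position, the two attachments of the prepositional phrase share all their common sub-analyses in the chart instead of being enumerated separately.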
Parsing Algorithms
Example: Chart parsing
Input positions: 0 I 1 eat 2 strb 3 . . .
Position 0: [S → • NP VP] [NP → • NP PP] [NP → • det N] [NP → • N] [N → • I] ...
Position 1: [N → I •] [NP → N •] [S → NP • VP] [VP → • VP PP] [VP → • V NP] [V → • eat]
Position 2: [V → eat •] [VP → V • NP] [NP → • NP PP] [NP → • det N] [NP → • N] [N → • strb]
Position 3: [S → NP VP •] [VP → V NP •] [N → strb •] [NP → N •] [NP → NP • PP] [PP → • P NP] [P → • with]
Parsing Algorithms
Example: Parse forest
[Figure: the chart of the previous example, over input positions 0 I 1 eat 2 strb 3 . . ., shown as a parse forest]
Parsing Algorithms
Chart parsing and many other parsing algorithms using dynamic programming can be seen as
Special constructions that ‘intersect’ a context-free grammar and a finite-state automaton
Simulations of specific push-down automata
Turning Point
70s and 80s: most NL parsers were based on hand-written rules developed by linguists
High cost
Poor system coverage
Ad hoc evaluation
Turning Point
“Every time I fire a linguist, the performance of the speech recognizer goes up” (Frederick Jelinek, IBM)
In contrast to mainstream research, IBM developed its own statistical, data-centric approach, leading to the Penn Treebank Project (1989-1992): the construction of a large bank of linguistic trees, developed at the University of Pennsylvania
Turning Point
Starting with the 90s
Surge of empirical approaches in all areas of NL processing
Many linguistically annotated corpora created
Nowadays parsing is viewed as a structured prediction problem: the input sentence has to be assigned a structured object (syntactic tree) from an infinite space
Supervised machine learning techniques used to train parsers
Lexicalized Context-Free Grammars
Lexicalized context-free grammars are a more fine-grained syntactic model than phrase-structure grammars
Accounts for lexical use
If used with scores (e.g., probabilities), very effective in disambiguation
Basic idea: enrich grammar symbols with lexical heads
NP[strawberry] → NP[strawberry] PP[chocolate]
Lexicalized Context-Free Grammars
[S[eat]
  [NP[I] [N[I] I]]
  [VP[eat]
    [V[eat] eat]
    [NP[strb]
      [NP[strb] [N[strb] strawberries]]
      [PP[choc] [P[with] with] [NP[choc] [N[choc] chocolate]]]]]]
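The annotation of grammar symbols with lexical heads can be sketched as a bottom-up pass over a plain phrase-structure tree. The head table below is an assumption read off the annotated tree in this example (for instance, a PP takes its head from its NP child); it is not a head-percolation scheme given in the talk. Trees are nested tuples and `lexicalize` is an illustrative name.

```python
# Hypothetical head table: for each parent label, the child labels to try,
# in order of preference, when choosing the child that supplies the head word.
HEAD_CHILD = {
    "S": ["VP"],
    "VP": ["VP", "V"],
    "NP": ["NP", "N"],
    "PP": ["NP"],
}

def lexicalize(tree):
    """Return (annotated tree, head word) for a tree given as nested tuples."""
    label, *children = tree
    if len(children) == 1 and isinstance(children[0], str):  # preterminal
        word = children[0]
        return (f"{label}[{word}]", word), word
    annotated = [lexicalize(child) for child in children]
    head = None
    for candidate in HEAD_CHILD[label]:
        for child, (_, word) in zip(children, annotated):
            if child[0] == candidate:
                head = word
                break
        if head is not None:
            break
    new_children = [subtree for subtree, _ in annotated]
    return (f"{label}[{head}]", *new_children), head

TREE = ("S",
        ("NP", ("N", "I")),
        ("VP", ("V", "eat"),
               ("NP", ("NP", ("N", "strawberries")),
                      ("PP", ("P", "with"), ("NP", ("N", "chocolate"))))))

annotated_tree, head_word = lexicalize(TREE)
print(annotated_tree[0])  # S[eat]
```

Every rule instance in the annotated tree now pairs two head words, such as strawberries and chocolate in the NP that contains the PP, which is the information a scored model can use for disambiguation.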
Lexicalized Context-Free Grammars
A naïve parsing algorithm for lexicalized context-free grammars runs in time O(n^5), where n is the number of words in the input
Using advanced dynamic programming, parsing can be done in time O(n^3) and space O(n^2)
More Advanced Formalisms
In 1985, Stuart Shieber showed that natural language is not context-free
This boosted the investigation of more powerful formalisms
Tree-adjoining grammars
Combinatory categorial grammars
Linear context-free rewriting systems
More Advanced Formalisms
[Figure: schematic tree fragments with nodes S, A, and B, in three panels labeled CFG, TAG, and LCFRS]
Summary
Part I
• Introduction to natural language parsing
• A little bit of history
Part II
Dependency grammar
• Abstract meaning representation
Dependency Grammars
Dependency grammars can be traced back to the work of French linguist Lucien Tesnière (1893-1954)
Very good balance between linguistic expressivity, annotation cost, and processing efficiency
10K to 102K words per second with greedy parsers
50++ languages covered
Universal dependencies project (ongoing)
Dependency Tree
In a dependency tree, clausal structure is determined by a binary relation, called dependency, between pairs of words called head and dependent
root Rolls-Royce said it expects its U.S. sales to remain steady .
[Figure: dependency arcs over this sentence, labeled root, nsubj, ccomp, punc, nsubj, xcomp, nn, poss, aux, nsubj, acomp]
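A common way to store such a structure is an array giving, for each word, the position of its head. The sketch below uses that encoding to check that a head assignment forms a tree hanging from the artificial root. The head positions listed for the example sentence are an assumption made for illustration (one plausible reading of the arc labels), and `is_dependency_tree` is a hypothetical name.

```python
def is_dependency_tree(heads):
    """heads[i] is the position of the head of token i, for i = 1..n.

    Position 0 is the artificial root; heads[0] is unused.
    Returns True if every token reaches the root by following head links,
    that is, if the head relation has no cycles.
    """
    n = len(heads) - 1
    for token in range(1, n + 1):
        seen = set()
        node = token
        while node != 0:
            if node in seen or not (0 <= heads[node] <= n):
                return False
            seen.add(node)
            node = heads[node]
    return True

TOKENS = ["root", "Rolls-Royce", "said", "it", "expects", "its",
          "U.S.", "sales", "to", "remain", "steady", "."]
# Assumed head positions: said <- root, Rolls-Royce <- said, it <- expects, ...
HEADS = [None, 2, 0, 4, 2, 7, 7, 9, 9, 4, 9, 2]

print(is_dependency_tree(HEADS))
for dependent in range(1, len(TOKENS)):
    print(TOKENS[HEADS[dependent]], "->", TOKENS[dependent])
```

Since each token stores exactly one head, the single-head property holds by construction; only reachability from the root needs to be checked.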
Projectivity
A node is projective if it generates a substring of the input string
Example: node 6 dominates substring [6, 9]
1 root  2 Mr.  3 Tomash  4 will  5 remain  6 as  7 a  8 director  9 emeritus  10 .
[Figure: dependency arcs over these tokens, labeled root, sbj, nmod, vc, pp, nmod, nmod, np, punc]
Projectivity
A node is non-projective if it is not projective
Example: node 3 dominates substrings [2, 3] and [6, 8]
1 root  2 A  3 hearing  4 is  5 scheduled  6 on  7 the  8 issue  9 today  10 .
[Figure: dependency arcs over these tokens, labeled root, sbj, nmod, vc, tmp, pp, np, nmod, punc]
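Projectivity of a node can be tested by collecting the positions it dominates and checking whether they form a single interval. The sketch below uses the position numbering of the two examples (1 is root). The head assignments are assumptions, chosen to agree with the substrings stated in the examples; the function names are illustrative.

```python
def yield_of(heads, node):
    """Sorted positions dominated by `node`, including the node itself."""
    children = {}
    for dependent, head in heads.items():
        children.setdefault(head, []).append(dependent)
    seen, todo = set(), [node]
    while todo:
        current = todo.pop()
        seen.add(current)
        todo.extend(children.get(current, []))
    return sorted(seen)

def intervals(positions):
    """Group sorted positions into maximal runs (start, end)."""
    runs = []
    for p in positions:
        if runs and p == runs[-1][1] + 1:
            runs[-1][1] = p
        else:
            runs.append([p, p])
    return [tuple(run) for run in runs]

def is_projective(heads, node):
    """A node is projective if it dominates a single substring."""
    return len(intervals(yield_of(heads, node))) == 1

# Assumed heads (dependent -> head) for "Mr. Tomash will remain as a director emeritus ."
HEADS_A = {2: 3, 3: 4, 4: 1, 5: 4, 6: 4, 7: 8, 8: 6, 9: 8, 10: 4}
# Assumed heads for "A hearing is scheduled on the issue today ."
HEADS_B = {2: 3, 3: 4, 4: 1, 5: 4, 6: 3, 7: 8, 8: 6, 9: 5, 10: 4}

print(intervals(yield_of(HEADS_A, 6)))  # [(6, 9)]
print(intervals(yield_of(HEADS_B, 3)))  # [(2, 3), (6, 8)]
```

Node 6 of the first sentence is projective, while node 3 of the second sentence dominates two separate substrings and is therefore non-projective.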
Accuracy (Greedy Parsing)
Labeled (LAS) and unlabeled (UAS) attachment scores on the CoNLL 2007 dataset (including punctuation) and on the Penn Treebank (excluding punctuation)
      parser  Arabic  Basque  Catalan  Chinese  Czech  English  Greek  Hungarian  Italian  Turkish  PTB
UAS   std     81.39   75.37   90.32    85.17    78.90  85.69    79.90  77.67      82.98    77.04    89.86
UAS   dyn     82.56   74.39   90.95    85.65    81.01  87.70    81.85  78.72      84.37    77.21    90.92
UAS   spine   84.54   75.82   91.92    86.72    81.19  89.37    81.78  77.48      85.38    78.61    91.77
LAS   std     71.93   65.64   84.90    80.35    71.39  84.60    72.25  67.66      78.77    65.9     87.56
LAS   dyn     72.89   65.27   85.82    81.28    72.92  86.79    74.22  69.57      80.25    66.71    88.72
LAS   spine   74.54   66.91   86.83    82.38    72.72  88.44    74.04  68.76      81.50    68.06    89.53
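For reference, the two scores can be computed as follows. This is a minimal sketch under the usual definitions: UAS is the percentage of tokens assigned the correct head, LAS the percentage assigned both the correct head and the correct label. The four-token data below is a made-up toy example, not taken from the datasets in the table, and `attachment_scores` is an illustrative name.

```python
def attachment_scores(gold, predicted):
    """Return (UAS, LAS) in percent.

    `gold` and `predicted` are lists with one (head, label) pair per token.
    """
    if len(gold) != len(predicted) or not gold:
        raise ValueError("need two non-empty analyses of the same length")
    correct_heads = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    correct_both = sum(g == p for g, p in zip(gold, predicted))
    n = len(gold)
    return 100.0 * correct_heads / n, 100.0 * correct_both / n

# Toy data (hypothetical): the third token has a wrong head,
# the fourth has the right head but a wrong label.
gold = [(2, "nsubj"), (0, "root"), (2, "obj"), (2, "punc")]
predicted = [(2, "nsubj"), (0, "root"), (4, "obj"), (2, "dep")]

uas, las = attachment_scores(gold, predicted)
print(f"UAS {uas:.2f}  LAS {las:.2f}")  # UAS 75.00  LAS 50.00
```

Whether punctuation tokens are included in the lists changes the scores, which is why the evaluation setting is stated separately for each dataset.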
Arc-Standard Parser
The arc-standard parser is very similar to a shift-reduce push-down automaton
Internal states are represented by a very large feature vector (fortunately, also very sparse)
Transition relation inferred from data (not fully observed)
Arc-Standard Parser
Example (each row shows a configuration and the transition applied to it; dependents already attached to a stack node are shown in parentheses after it):
Trans  Stack                                Buffer
sh     (empty)                              root Mr. · · ·
sh     root                                 Mr. Tomash · · ·
sh     root Mr.                             Tomash will · · ·
la     root Mr. Tomash                      will remain · · ·
sh     root Tomash(Mr.)                     will remain · · ·
la     root Tomash(Mr.) will                remain as · · ·
sh     root will(Tomash(Mr.))               remain as · · ·
ra     root will(Tomash(Mr.)) remain        as a · · ·
sh     root will(Tomash(Mr.), remain)       as a · · ·
sh     root will(Tomash(Mr.), remain) as    a director · · ·
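The three transitions can be written down directly as operations on a stack and a buffer. The sketch below assumes the usual arc-standard definitions: sh moves the first buffer word onto the stack, la makes the second-topmost stack word a dependent of the topmost one, and ra makes the topmost stack word a dependent of the one below it. It replays the transition sequence of the example. In an actual parser the next transition would be predicted by a trained model from the feature vector; here the sequence is simply given. `arc_standard` is an illustrative name.

```python
def arc_standard(words, transitions):
    """Apply sh/la/ra transitions; return (stack, buffer, arcs).

    Arcs are (head, dependent) pairs, in the order they were created.
    """
    stack, buffer, arcs = [], list(words), []
    for t in transitions:
        if t == "sh":
            if not buffer:
                raise ValueError("sh needs a non-empty buffer")
            stack.append(buffer.pop(0))
        elif t in ("la", "ra"):
            if len(stack) < 2:
                raise ValueError(f"{t} needs two words on the stack")
            top = stack.pop()
            below = stack.pop()
            if t == "la":          # left arc: top is the head of below
                arcs.append((top, below))
                stack.append(top)
            else:                  # right arc: below is the head of top
                arcs.append((below, top))
                stack.append(below)
        else:
            raise ValueError(f"unknown transition {t!r}")
    return stack, buffer, arcs

WORDS = ["root", "Mr.", "Tomash", "will", "remain",
         "as", "a", "director", "emeritus", "."]
SEQUENCE = ["sh", "sh", "sh", "la", "sh", "la", "sh", "ra", "sh", "sh"]

stack, buffer, arcs = arc_standard(WORDS, SEQUENCE)
print(stack)   # ['root', 'will', 'as', 'a']
print(arcs)    # [('Tomash', 'Mr.'), ('will', 'Tomash'), ('will', 'remain')]
```

Each word is shifted once and removed from the stack at most once, so the number of transitions is linear in the length of the sentence.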
Summary
Part I
• Introduction to natural language parsing
• A little bit of history
Part II
• Dependency grammar
Abstract meaning representation
Semantic Parsing
Syntax: Single parsing task rather than separate tasks such as base noun identification, prepositional phrase attachment, trace recovery, verb-argument dependencies, etc.
Semantics: Separate annotations for named entities, co-reference, semantic relations, discourse connectives, temporal entities, etc.
We lack a sembank of (sentence, logical meaning) pairs
Semantic Parsing
Abstract meaning representations (AMRs) are directed acyclic graphs representing abstract concepts, predicates and semantic relations realised by natural language sentences
Sentences with the same basic meaning are assigned the same AMR
He described her as a genius
His description of her: genius
She was a genius, according to his description
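One way to make the shared representation concrete is to store the graph as a list of triples. The encoding below is an assumption made for illustration: the concept and role names (describe-01, ARG0, ARG1, ARG2) follow common AMR conventions and are not taken from the slides. `is_acyclic` checks the directed acyclic property with Kahn's algorithm.

```python
# Assumed AMR for the three sentences, as (source, relation, target) triples.
# "instance" triples attach a concept to a variable; the others are edges.
AMR = [
    ("d", "instance", "describe-01"),
    ("h", "instance", "he"),
    ("s", "instance", "she"),
    ("g", "instance", "genius"),
    ("d", "ARG0", "h"),
    ("d", "ARG1", "s"),
    ("d", "ARG2", "g"),
]

def is_acyclic(triples):
    """True if the edges between variables form a directed acyclic graph."""
    variables = {s for s, r, _ in triples if r == "instance"}
    edges = [(s, t) for s, r, t in triples
             if r != "instance" and t in variables]
    indegree = {v: 0 for v in variables}
    for _, target in edges:
        indegree[target] += 1
    ready = [v for v, degree in indegree.items() if degree == 0]
    visited = 0
    while ready:                       # repeatedly remove a node with no
        node = ready.pop()             # incoming edges (Kahn's algorithm)
        visited += 1
        for source, target in edges:
            if source == node:
                indegree[target] -= 1
                if indegree[target] == 0:
                    ready.append(target)
    return visited == len(variables)

print(is_acyclic(AMR))
```

The graph abstracts away from word order and part of speech, which is why the verbal, nominal and copular variants of the sentence can all map to it.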
Abstract Meaning Representation
‘Then the little prince flashed back at me, with a kind of resentfulness: I don’t believe you!’
Abstract Meaning Representation
Formalisms being explored for AMR parsing
Directed acyclic graph (DAG) finite automata
Hyperedge replacement grammars
Long term research program:
incorporate syntax and semantics into parsing
switch from conditional to joint models