earley-parsing.pdf
TRANSCRIPT
-
7/28/2019 Earley-Parsing.pdf
1/27
11-711 Algorithms for NLP
The Earley Parsing Algorithm
Reading:
Jay Earley,
An Efficient Context-Free Parsing Algorithm
Communications of the ACM, vol. 13 (2), pp. 94-102
-
The Earley Parsing Algorithm
General Principles:
A clever hybrid Bottom-Up and Top-Down approach
Bottom-Up parsing completely guided by Top-Down predictions
Maintains sets of dotted grammar rules that:
Reflect what the parser has seen so far
Explicitly predict the rules and constituents that will combine
into a complete parse
Similar to Chart Parsing - partial analyses can be shared
Time Complexity O(n^3), but better on particular sub-classes
Developed prior to Chart Parsing, the first efficient parsing
algorithm for general context-free grammars.
-
The Earley Parsing Method
Main Data Structure: The state (or item)
A state is a dotted rule and a starting position:
[X → α • β, j]
The algorithm maintains sets of states, one set for each
position in the input string (starting from 0)
We denote the set for position i by S_i
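As a concrete illustration (my own sketch, not from the slides), a state can be represented in Python as a dotted rule plus its starting position; the names below are assumptions for illustration:

```python
from collections import namedtuple

# An Earley state/item: rule lhs -> rhs, a dot position into rhs,
# and the input position where this constituent started.
Item = namedtuple("Item", ["lhs", "rhs", "dot", "start"])

# The state [S -> NP . VP, 0]: an S begun at position 0 whose NP
# has already been recognized and whose VP is still predicted.
item = Item("S", ("NP", "VP"), 1, 0)
print(item.rhs[item.dot])  # the symbol expected next: VP
```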
-
The Earley Parsing Algorithm
Three Main Operations:
Predictor: If state [X → α • Y β, j] is in S_i, then for every
rule of the form Y → γ, add to S_i the state
[Y → • γ, i]
Completer: If state [X → γ •, j] is in S_i, then for every state
in S_j of the form [Y → α • X β, k], add to S_i
the state [Y → α X • β, k]
Scanner: If state [X → α • a β, j] is in S_i (a a terminal) and the next
input word is x_{i+1} = a, then add to S_{i+1} the state
[X → α a • β, j]
-
The Earley Recognition Algorithm
Simplified version with no lookaheads and for grammars
without epsilon-rules
Assumes the input is a string of grammar terminal symbols
We extend the grammar with a new rule S' → S $
The algorithm sequentially constructs the sets S_i
for 0 ≤ i ≤ n+1
We initialize the set S_0 with S_0 = { [S' → • S $, 0] }
-
The Earley Recognition Algorithm
The Main Algorithm: parsing input x = x_1 x_2 ... x_n $
1. S_0 = { [S' → • S $, 0] }
2. For 0 ≤ i ≤ n do:
Process each item s in S_i in order by applying to it the single
applicable operation among:
(a) Predictor (adds new items to S_i)
(b) Completer (adds new items to S_i)
(c) Scanner (adds new items to S_{i+1})
3. If S_{i+1} is empty, Reject the input
4. If i = n and S_{n+1} = { [S' → S $ •, 0] },
then Accept the input
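A minimal Python sketch of this recognizer (my own code, not from the slides): it assumes an epsilon-free grammar given as (lhs, rhs-tuple) pairs, augments it with S' → S $, and applies the single applicable operation to each item, exactly as the simplified algorithm above does. The demo uses the example grammar from these slides.

```python
from collections import namedtuple

# An item [X -> alpha . beta, j]: rule, dot position, start position j.
Item = namedtuple("Item", ["lhs", "rhs", "dot", "start"])

def earley_recognize(grammar, nonterminals, words, start="S"):
    """Earley recognizer for an epsilon-free grammar given as a list
    of (lhs, rhs) rules. Extends the grammar with S' -> S $."""
    words = list(words) + ["$"]            # append end-of-input marker
    n = len(words)
    sets = [[] for _ in range(n + 1)]      # one state set per position
    def add(i, item):
        if item not in sets[i]:
            sets[i].append(item)
    add(0, Item("S'", (start, "$"), 0, 0))
    for i in range(n + 1):
        for it in sets[i]:                 # sets[i] may grow as we scan it
            if it.dot < len(it.rhs):
                sym = it.rhs[it.dot]
                if sym in nonterminals:    # Predictor: adds to sets[i]
                    for lhs, rhs in grammar:
                        if lhs == sym:
                            add(i, Item(lhs, rhs, 0, i))
                elif i < n and words[i] == sym:  # Scanner: adds to sets[i+1]
                    add(i + 1, it._replace(dot=it.dot + 1))
            else:                          # Completer: adds to sets[i]
                for prev in sets[it.start]:
                    if prev.dot < len(prev.rhs) and prev.rhs[prev.dot] == it.lhs:
                        add(i, prev._replace(dot=prev.dot + 1))
    # Accept iff [S' -> S $ ., 0] is in the final set
    return Item("S'", (start, "$"), 2, 0) in sets[n]

GRAMMAR = [
    ("S",  ("NP", "VP")),
    ("NP", ("art", "adj", "n")),
    ("NP", ("art", "n")),
    ("NP", ("adj", "n")),
    ("VP", ("aux", "VP")),
    ("VP", ("v", "NP")),
]
print(earley_recognize(GRAMMAR, {"S", "NP", "VP"},
                       "art adj n aux v art n".split()))  # True
```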
-
Earley Recognition - Example
The Grammar:
(1) S  → NP VP
(2) NP → art adj n
(3) NP → art n
(4) NP → adj n
(5) VP → aux VP
(6) VP → v NP
The original input: The large can can hold the water
POS assigned input: x = art adj n aux v art n
Parser input: x = art adj n aux v art n $
-
Earley Recognition - Example
The input: x = art adj n aux v art n $
0: [S' → • S $, 0]
   [S → • NP VP, 0]
   [NP → • art adj n, 0]
   [NP → • art n, 0]
   [NP → • adj n, 0]
1: [NP → art • adj n, 0]
   [NP → art • n, 0]
-
Earley Recognition - Example
The input: x = art adj n aux v art n $
1: [NP → art • adj n, 0]
   [NP → art • n, 0]
2: [NP → art adj • n, 0]
-
Earley Recognition - Example
The input: x = art adj n aux v art n $
2: [NP → art adj • n, 0]
3: [NP → art adj n •, 0]
-
Earley Recognition - Example
The input: x = art adj n aux v art n $
3: [NP → art adj n •, 0]
   [S → NP • VP, 0]
   [VP → • aux VP, 3]
   [VP → • v NP, 3]
4: [VP → aux • VP, 3]
-
Earley Recognition - Example
The input: x = art adj n aux v art n $
4: [VP → aux • VP, 3]
   [VP → • aux VP, 4]
   [VP → • v NP, 4]
5: [VP → v • NP, 4]
-
Earley Recognition - Example
The input: x = art adj n aux v art n $
5: [VP → v • NP, 4]
   [NP → • art adj n, 5]
   [NP → • art n, 5]
   [NP → • adj n, 5]
6: [NP → art • adj n, 5]
   [NP → art • n, 5]
-
Earley Recognition - Example
The input: x = art adj n aux v art n $
6: [NP → art • adj n, 5]
   [NP → art • n, 5]
7: [NP → art n •, 5]
-
Earley Recognition - Example
The input: x = art adj n aux v art n $
7: [NP → art n •, 5]
   [VP → v NP •, 4]
   [VP → aux VP •, 3]
   [S → NP VP •, 0]
   [S' → S • $, 0]
8: [S' → S $ •, 0]
-
Time Complexity of Earley Algorithm
Algorithm iterates for each word of the input (i.e. O(n) iterations)
How many items can be created and processed in S_i?
Each item in S_i has the form [X → α • β, j], with 0 ≤ j ≤ i
Thus O(n) items
The Scanner and Predictor operations on an item each require
constant time
The Completer operation on an item adds items of the form
[Y → α X • β, k] to S_i, with 0 ≤ k ≤ i, so it may require up
to O(n) time for each processed item
Time required for each iteration (S_i) is thus O(n^2)
Time bound on the entire algorithm is therefore O(n^3)
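The counting argument above can be summarized in one line (notation mine):

```latex
T(n) \;=\; \sum_{i=0}^{n} \underbrace{O(n)}_{\text{items in } S_i} \cdot \underbrace{O(n)}_{\text{Completer work per item}} \;=\; (n+1) \cdot O(n^2) \;=\; O(n^3)
```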
-
Time Complexity of Earley Algorithm
Special Cases:
Completer is the operation that may require O(n^2) time in
iteration i
For unambiguous grammars, Earley shows that the Completer
operation will require at most O(n) time
Thus the time complexity for unambiguous grammars is O(n^2)
For some grammars, the number of items in each S_i is
bounded by a constant
These are called bounded-state grammars and include even
some ambiguous grammars.
For bounded-state grammars, the time complexity of the
algorithm is linear - O(n)
-
Parsing with an Earley Parser
As usual, we need to keep back-pointers to the constituents
that we combine together when we complete a rule
Each item must be extended to have the form
[X → α_1 ... α_k • β, j], where the α_i carry pointers to the
already found RHS sub-constituents
At the end - reconstruct the parse from the back-pointers
To maintain efficiency - we must do ambiguity packing
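A sketch of this extension in Python (my own code, not from the slides): each item carries its completed sub-constituents directly rather than numbered pointers, and ambiguity packing is omitted, so this is only a faithful illustration for unambiguous inputs like the running example.

```python
from collections import namedtuple

# An item extended with kids: the sub-constituents found so far.
Item = namedtuple("Item", ["lhs", "rhs", "dot", "start", "kids"])
Tree = namedtuple("Tree", ["label", "kids"])

def earley_parse(grammar, nonterminals, words, start="S"):
    """Earley parser for an epsilon-free grammar; a completed
    S' -> S $ item yields the parse tree for S."""
    words = list(words) + ["$"]
    n = len(words)
    sets = [[] for _ in range(n + 1)]
    def add(i, item):
        if item not in sets[i]:
            sets[i].append(item)
    add(0, Item("S'", (start, "$"), 0, 0, ()))
    for i in range(n + 1):
        for it in sets[i]:
            if it.dot < len(it.rhs):
                sym = it.rhs[it.dot]
                if sym in nonterminals:          # Predictor
                    for lhs, rhs in grammar:
                        if lhs == sym:
                            add(i, Item(lhs, rhs, 0, i, ()))
                elif i < n and words[i] == sym:  # Scanner: attach the word
                    add(i + 1, it._replace(dot=it.dot + 1,
                                           kids=it.kids + (Tree(sym, ()),)))
            else:                                # Completer: attach sub-tree
                done = Tree(it.lhs, it.kids)
                for prev in sets[it.start]:
                    if prev.dot < len(prev.rhs) and prev.rhs[prev.dot] == it.lhs:
                        add(i, prev._replace(dot=prev.dot + 1,
                                             kids=prev.kids + (done,)))
    for it in sets[n]:                           # completed S' -> S $ .
        if it.lhs == "S'" and it.dot == len(it.rhs):
            return it.kids[0]                    # the S sub-tree
    return None

GRAMMAR = [
    ("S",  ("NP", "VP")),
    ("NP", ("art", "adj", "n")),
    ("NP", ("art", "n")),
    ("NP", ("adj", "n")),
    ("VP", ("aux", "VP")),
    ("VP", ("v", "NP")),
]
tree = earley_parse(GRAMMAR, {"S", "NP", "VP"},
                    "art adj n aux v art n".split())
```

On the example, `tree` is the constituent S(NP(art adj n), VP(aux, VP(v, NP(art n)))) built by the trace on the following slides.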
-
Earley Parsing - Example
The input: x = art adj n aux v art n $
-
Earley Parsing - Example
The input: x = art adj n aux v art n $
0: [S' → • S $, 0]
   [S → • NP VP, 0]
   [NP → • art adj n, 0]
   [NP → • art n, 0]
   [NP → • adj n, 0]
1: [NP → art_1 • adj n, 0]
   [NP → art_1 • n, 0]
   1: art
-
Earley Parsing - Example
The input: x = art adj n aux v art n $
1: [NP → art_1 • adj n, 0]
   [NP → art_1 • n, 0]
2: [NP → art_1 adj_2 • n, 0]
   2: adj
-
Earley Parsing - Example
The input: x = art adj n aux v art n $
2: [NP → art_1 adj_2 • n, 0]
3: [NP_4 → art_1 adj_2 n_3 •, 0]
   3: n
   4: NP → art_1 adj_2 n_3
-
Earley Parsing - Example
The input: x = art adj n aux v art n $
3: [NP_4 → art_1 adj_2 n_3 •, 0]
   [S → NP_4 • VP, 0]
   [VP → • aux VP, 3]
   [VP → • v NP, 3]
4: [VP → aux_5 • VP, 3]
   5: aux
-
Earley Parsing - Example
The input: x = art adj n aux v art n $
4: [VP → aux_5 • VP, 3]
   [VP → • aux VP, 4]
   [VP → • v NP, 4]
5: [VP → v_6 • NP, 4]
   6: v
-
Earley Parsing - Example
The input: x = art adj n aux v art n $
5: [VP → v_6 • NP, 4]
   [NP → • art adj n, 5]
   [NP → • art n, 5]
   [NP → • adj n, 5]
6: [NP → art_7 • adj n, 5]
   [NP → art_7 • n, 5]
   7: art
-
Earley Parsing - Example
The input: x = art adj n aux v art n $
6: [NP → art_7 • adj n, 5]
   [NP → art_7 • n, 5]
7: [NP_9 → art_7 n_8 •, 5]
   8: n
   9: NP → art_7 n_8
-
Earley Parsing - Example
The input: x = art adj n aux v art n $
7: [NP_9 → art_7 n_8 •, 5]
   [VP_10 → v_6 NP_9 •, 4]
   10: VP → v_6 NP_9
   [VP_11 → aux_5 VP_10 •, 3]
   11: VP → aux_5 VP_10
   [S_12 → NP_4 VP_11 •, 0]
   12: S → NP_4 VP_11
   [S' → S_12 • $, 0]
8: [S' → S_12 $ •, 0]