greed%is%good%if%randomized:%new%inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf ·...

67
Greed is Good if Randomized: New Inference for Dependency Parsing 1 Yuan Zhang CSAIL, MIT Joint work with Tao Lei, Regina Barzilay, and Tommi Jaakkola

Upload: others

Post on 18-Jul-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Greed  is  Good  if  Randomized:  New  Inference  for  Dependency  Parsing  

1  

Yuan Zhang CSAIL, MIT

Joint work with Tao Lei,

Regina Barzilay, and Tommi Jaakkola

Page 2: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Inference vs. Scoring

2  

Inference  

Scoring  Func.on  

Approximate  

Exact  

Limited   Expressive  

Page 3: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Inference vs. Scoring

3  

Inference  

Scoring  Func.on  

Approximate  

Exact  

Limited   Expressive  

Minimum  Spanning  Tree  

Page 4: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Inference vs. Scoring

4  

Inference  

Scoring  Func.on  

Approximate  

Exact  

Limited   Expressive  

Minimum  Spanning  Tree  

Reranking  

•  Reranking:  incorporate  arbitrary  features  

Page 5: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Inference vs. Scoring

5  

Inference  

Scoring  Func.on  

Approximate  

Exact  

Limited   Expressive  

Minimum  Spanning  Tree  

Reranking  

Dual  DecomposiKon  

•  Reranking:  incorporate  arbitrary  features  •  Dual  DecomposiKon:  search  in  full  space  

Page 6: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Parsing Complexity

•  High-­‐order  parsing  is  NP-­‐hard  (McDonald  et  al.,  2006)  •  Hypothesis:  parsing  is  easy  on  average  •  Many  NP-­‐hard  problems  are  easy  on  average  

- MAX-­‐SAT  (Resende  et  al.,  1997)  - Set  cover  (Hochbaum,  1982)  

6  

Page 7: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Parsing Complexity

•  High-­‐order  parsing  is  NP-­‐hard  (McDonald  et  al.,  2006)  •  Hypothesis:  parsing  is  easy  on  average  •  Many  NP-­‐hard  problems  are  easy  on  average  

- MAX-­‐SAT  (Resende  et  al.,  1997)  - Set  cover  (Hochbaum,  1982)  

7  

We  show  •  Analysis  on  average  parsing  complexity  •  A  simple  inference  algorithm  based  on  the  analysis  

Page 8: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Our Approach

8  

Our            Approach  

Inference  

Scoring  Func.on  

Approximate  

Exact  

Limited   Expressive  

Minimum  Spanning  Tree  

Reranking  

Dual  DecomposiKon  

•  Reranking:  incorporate  arbitrary  features  •  Dual  DecomposiKon:  search  in  full  space  

Page 9: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Core Idea

•  Climb  to  the  opKmal  tree  in  a  few  small  greedy  steps  

9  

1)  Randomly  sample  a  dependency  tree  2)  Greedily  improve  the  tree  one  edge  at  a  Kme  

3)  Repeat  (2)  unKl  converge  

Randomized  Hill-­‐climbing  

For  k  =  1  to  K  

Select  the  tree  with  the  highest  score    

Page 10: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Core Idea

•  Climb  to  the  opKmal  tree  in  a  few  small  greedy  steps  

10  

1)  Randomly  sample  a  dependency  tree  2)  Greedily  improve  the  tree  one  edge  at  a  Kme  

3)  Repeat  (2)  unKl  converge  

Randomized  Hill-­‐climbing  

That’s  it!  

For  k  =  1  to  K  

Select  the  tree  with  the  highest  score    

Page 11: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

It Works!

11  

89.44%  

88.73%  

Our  Full  

Turbo  

Parsing  Performance  on  CoNLL  Dataset  

Dual  Decomposi;on  

Page 12: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

12  

“  I  ate  an  apple  today”  

Example

Page 13: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

13  

today

apple

I

ate

an

ROOT  

“  I  ate  an  apple  today”  

Initial treeExample

Page 14: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

14  

today

apple

I

ate

an

ROOT  

today apple I

ate

an

ROOT  

“  I  ate  an  apple  today”  Target tree

Initial treeExample

Page 15: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

15  

today

apple

I

ate

an

ROOT  

today apple I

ate

an

ROOT  

“  I  ate  an  apple  today”  

I

an

ate

today

apple

Target tree

Initial treeExample

Page 16: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

16  

today

apple

I

ate

an

ROOT  

today apple I

ate

an

ROOT  

“  I  ate  an  apple  today”  

I

an

ate

today

apple

Target tree

Example

Page 17: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

17  

today

apple

I

ate

an

ROOT  

today apple I

ate

an

ROOT  

“  I  ate  an  apple  today”  

I

an

ate

today

apple

Target tree

Example

Page 18: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

18  

today

apple

I

ate

ROOT  

today apple I

ate

an

ROOT  

“  I  ate  an  apple  today”  

I

an

ate

today

apple

an

Target tree

Example

Page 19: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

19  

today

apple

I

ate

ROOT  

today apple I

ate

an

ROOT  

“  I  ate  an  apple  today”  

I

an

ate

today

apple

an

Target tree

Example

Page 20: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

20  

today

apple

ROOT  

today apple I

ate

an

ROOT  

“  I  ate  an  apple  today”  

I

an

ate

today

apple

an I

ate

Target tree

Example

Page 21: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

21  

today

apple

ROOT  

today apple I

ate

an

ROOT  

“  I  ate  an  apple  today”  

I

an

ate

today

apple

an I

ate

Target tree

Example

Page 22: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

22  

today

apple

ROOT  

today apple I

ate

an

ROOT  

“  I  ate  an  apple  today”  

I

an

ate

today

apple

an I

ate

Target tree

Example

Page 23: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

23  

today

apple

ROOT  

today apple I

ate

an

ROOT  

“  I  ate  an  apple  today”  

I

an

ate

today

apple

an I

ate

Target tree

Example

Page 24: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

24  

today

ROOT  

today apple I

ate

an

ROOT  

“  I  ate  an  apple  today”  

I

an

ate

today

apple

I

ate

apple

an

Target tree

Example

Page 25: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

25  

today

ROOT  

today apple I

ate

an

ROOT  

“  I  ate  an  apple  today”  

I

ate

apple

an

Target tree

Example

Page 26: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

26  

Reachability:  transforming  any  tree  to  any  other  tree  

•  maintaining  the  structure  a  valid  tree  at  any  point  

•  using  as  few  as  d  steps  (d  :  head  differences/hamming  distance)  

……  today

apple

I

ate

an

ROOT  

today apple I

ate

an

ROOT  

y(0) y(T )

Why Greedy Has a Chance to Work

Page 27: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

27  

……  today

apple

I

ate

an

ROOT  

today apple I

ate

an

ROOT  

y(0) y(T )increase  S(x, y(t ) )

Greedy Hill-climbing

Page 28: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

28  

……  today

apple

I

ate

an

ROOT  

today apple I

ate

an

ROOT  

y(0) y(T )increase  S(x, y(t ) )

Greedy Hill-climbing

Arbitrary  features  in  the  scoring  func;on  

Page 29: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

29  

……  today

apple

I

ate

an

ROOT  

today apple I

ate

an

ROOT  

y(0) y(T )

tree  y

score  S

local  opKmum  

global  opKmum  

increase  S(x, y(t ) )

Challenge: Local Optimum

Page 30: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

tree  y

score  S

Hill-climbing with Restarts

30  Overcome  local  opKma  via  restarts  

……  today

apple

I

ate

an

ROOT  

today apple I

ate

an

ROOT  

Page 31: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

31  

Hill-climbing with Restarts

Random  iniKalizaKon  (e.g.  uniform)  

……  max  

……  

Overcome  local  opKma  via  restarts  

……  today

apple

I

ate

an

ROOT  

today apple I

ate

an

ROOT  

y(0)

y(0)

y(0)

y(T )

y(T )

y(T )

Hill-­‐climbing  

Page 32: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Learning Algorithm

•  Follow  common  max-­‐margin  framework  

32  

∀y ∈ T (x) S(x, y) ≥ S(x, y)+ | y− y |−ξ

§           is  the  gold  tree  y

Page 33: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Learning Algorithm

•  Follow  common  max-­‐margin  framework  

•  Adopt  passive-­‐aggressive  online  learning  framework  (Crammer  et  al.  2006)    

•  Decode  with  our  randomized  greedy  algorithm    

33  

∀y ∈ T (x) S(x, y) ≥ S(x, y)+ | y− y |−ξ

§           is  the  gold  tree  y

Page 34: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Analysis

34  

Page 35: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Analysis

35  

TheoreKcal   Empirical  

First-­‐order  

Page 36: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Analysis

36  

TheoreKcal   Empirical  

First-­‐order  

High-­‐order   ?

Page 37: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Analysis

37  

TheoreKcal   Empirical  

First-­‐order  

High-­‐order   ?

Page 38: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

38  

Search Space Complexity: First-order

10  words  

Page 39: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

39  

Search Space Complexity: First-order

10  words  

≈  2  billion  trees  

Page 40: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

40  

Search Space Complexity: First-order

10  words  

≈  2  billion  trees  

<  512  local  opKma  

Page 41: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

41  

Search Space Complexity: First-order

Theorem:  For  any  first-­‐order  scoring  funcKon:  

•  there  are  at  most  2n-­‐1  locally  opKmal  trees  

•  this  upper  bound  is  .ght  

Page 42: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

42  

Search Space Complexity: First-order

Theorem:  For  any  first-­‐order  scoring  funcKon:  

•  there  are  at  most  2n-­‐1  locally  opKmal  trees  

•  this  upper  bound  is  .ght  

2n-­‐1  is  sKll  a  lot,  but  it  is  the  worst  case  

Page 43: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

43  

Search Space Complexity: First-order

Theorem:  For  any  first-­‐order  scoring  funcKon:  

•  there  are  at  most  2n-­‐1  locally  opKmal  trees  

•  this  upper  bound  is  .ght  

2n-­‐1  is  sKll  a  lot,  but  it  is  the  worst  case  

What  about  the  average  case?  

Page 44: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

44  

Algorithm for Counting Local Optima

root

Johnsaw

Mary

10

9 20

30

30

0

9

Page 45: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

45  

Algorithm for Counting Local Optima

The  method  is  based  on  Chu-­‐Liu-­‐Edmonds  algorithm    

root

Johnsaw

Mary

10

9 20

30

30

0

9

•  Select  the  best  heads  independently  

Page 46: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

46  

Algorithm for Counting Local Optima

root

Johnsaw

Mary

10

9 20

30

30

0

9

•  Contract  the  cycle  and  recursively  count  the  local  opKma  Ø     Any  local  opKmum  exactly  reassigns  one  edge  in  the  cycle  

root

Johnsaw

Mary9 30

9 root

Johnsaw

Mary

10

30

9

+  

Page 47: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

47  

Empirical Results: First-order

How  many  local  opKma  in  real  data?  

#  Op.ma  on  English  Dataset  

%  sentences  

Page 48: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

48  

Empirical Results: First-order

50%  

21  

How  many  local  opKma  in  real  data?  

#  Op.ma  on  English  Dataset  

%  sentences  

Page 49: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

49  

Empirical Results: First-order

50%   70%  

21  121  

How  many  local  opKma  in  real  data?  

#  Op.ma  on  English  Dataset  

%  sentences  

Page 50: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

50  

Empirical Results: First-order

50%   70%   90%  

21  121  

2000  

How  many  local  opKma  in  real  data?  

#  Op.ma  on  English  Dataset  

%  sentences  

Page 51: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

51  

Empirical Results: First-order

Does  the  hill-­‐climbing  find  the  argmax?  

Finding  Global  Op.mum  on  English  

Page 52: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

52  

Empirical Results: First-order

Does  the  hill-­‐climbing  find  the  argmax?  

Len.  ≤  15  100%  

Len.  >  15  99.3%  

Easy  search  space  leads  to  successful  decoding  

Finding  Global  Op.mum  on  English  

Page 53: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Empirical Results: High-order

%  Cer.ficate  94.5%  

Dual  decomposiKon  (Koo  et  al.,  2010)  

SDD  =  SHC  99.8%  

Given  a  cerKficate  by  DD  

Comparison  on  English  Given  DD  Cert.  

Does  the  hill-­‐climbing  find  the  argmax?  

Page 54: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Empirical Results: High-order

Overall  Comparison  on  English  

Does  the  hill-­‐climbing  find  the  argmax?  

Page 55: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Empirical Results: High-order

SDD  =  SHC  98.7%  

Overall  Comparison  on  English  

Does  the  hill-­‐climbing  find  the  argmax?  

Page 56: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Empirical Results: High-order

SDD  =  SHC  98.7%  

SDD  <  SHC  1.0%  

SDD  >  SHC  0.3%  

Overall  Comparison  on  English  

Does  the  hill-­‐climbing  find  the  argmax?  

Page 57: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Experimental Setup

57  

Implementa;on  

§  AdapKve  restarKng  strategy  with    

Datasets  §  14  languages  in  CoNLL  2006  &  2008  shared  tasks  

Features  

§  Up  to  3rd-­‐order  (three  arcs)  features  used  in  MST/Turbo  parsers  

§  Global  features  used  in  re-­‐ranking  

K = 300

Page 58: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Baselines and Evaluation Measure

58  

Baselines:  §  Turbo  Parser:  Dual  DecomposiKon  with  3rd-­‐order  

features  (MarKns  et  al.,  2013)  

§  Sampling-­‐based  Parser:  MCMC  sampling  with  global  features  (Zhang  et  al.,  2014)  

Evalua;on  Measure:  

§  Unlabeled  Amachment  Score  (UAS),  without  punctuaKons  

Page 59: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Comparing with Baselines

59  

89.24%  

89.23%  

88.66%  

88.73%  

Our  Full  

Sampling-­‐based  

Our  3rd  

Turbo  (DD)  

Our  Full  w/o  Tensor  

Sampling-­‐based  (MCMC)  

Our  Full  w/o  Tensor  

Page 60: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Comparing with Baselines

60  

89.24%  

89.23%  

88.66%  

88.73%  

Our  Full  

Sampling-­‐based  

Our  3rd  

Turbo  (DD)  

Our  Full  w/o  Tensor  

Sampling-­‐based  (MCMC)  

Page 61: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Comparing with Baselines

61  

89.24%  

89.23%  

88.66%  

88.73%  

Our  Full  

Sampling-­‐based  

Our  3rd  

Turbo  (DD)  

Our  Full  w/o  Tensor  

Sampling-­‐based  (MCMC)  

Global  Features  

Page 62: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Impact of Initialization

62  

88.0   88.1  

84.0  

85.0  

86.0  

87.0  

88.0  

89.0  

Uniform   Rnd-­‐1st  

UAS

(%)  

Page 63: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Impact of Restarts

63  

85.4  

88.1  

84.0  

85.0  

86.0  

87.0  

88.0  

89.0  

No  Restart   300  Restarts  

UAS

(%)  

Page 64: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Convergence Property

64  

•  Score  normalized  by  the  highest  score  in  3000  restarts    

0 100 200 300 400 500

0.994

0.996

0.998

1

# Restarts

Scor

e

Len ≤ 15Len > 15

Convergence  Analysis  on  English  

Page 65: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Trade-off between Speed and Performance

65  

Decoding  Speed  on  English  

2 4 6 8 10x 10−3

88

90

92

94

Sec/Tok

UAS

3rd−order ModelFull Model

Fast   Slow  

Page 66: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Conclusion

•  Analysis:  we  invesKgate  average  case  complexity  of  parsing    •  Algorithm:  we  introduce  a  simple  randomized  greedy  inference  algorithm  

 

 Source  code  available  at:  hSps://github.com/taolei87/RBGParser  

66  

Page 67: Greed%is%Good%if%Randomized:%New%Inference% …people.csail.mit.edu/yuanzh/papers/greed.pdf · Greed%is%Good%if%Randomized:%New%Inference% for%Dependency%Parsing% 1 Yuan Zhang CSAIL,

Thank  You!  

67