easy and hard constraint ranking in ot

36
Easy and Hard Constraint Ranking in OT Jason Eisner U. of Rochester August 6, 2000 – SIGPHON - Luxembourg

Upload: samuru

Post on 25-Feb-2016

64 views

Category:

Documents


2 download

DESCRIPTION

Easy and Hard Constraint Ranking in OT. Jason Eisner U. of Rochester August 6, 2000 – SIGPHON - Luxembourg. Outline. The Constraint Ranking problem Making fast ranking faster Extension: Considering all competitors How hard is OT generation? Making slow ranking slower. C 1. C 2. C 4. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Easy and Hard  Constraint Ranking in OT

Easy and Hard Constraint Ranking in OT

Jason EisnerU. of Rochester

August 6, 2000 – SIGPHON - Luxembourg

Page 2: Easy and Hard  Constraint Ranking in OT

Outline The Constraint Ranking problem Making fast ranking faster Extension: Considering all

competitors How hard is OT generation? Making slow ranking slower

Page 3: Easy and Hard  Constraint Ranking in OT

The Ranking Problem

Constraint

Ranker

finitepositive

data <C3, C1, C2, C5, C4>or “fail”

Find grammar consistent with data(or just determine whether one exists)

How efficient can this be? Different from Gold learnability Proposed by Tesar & Smolensky

C2

C1 C

3

C4

C5n constraints

m items

Page 4: Easy and Hard  Constraint Ranking in OT

What Is Each Input Datum?

A pairwise ranking g > h An attested form g An attested set G

1 grammatical element - learner doesn’t know which! Captures uncertainty about the representation or

underlying form of the speaker’s utterance Today we’ll assume learner does know underlying

gazebo

{ ga(zé.bo),

(ga.zé)bo }

Possibilities from Tesar & Smolensky

Page 5: Easy and Hard  Constraint Ranking in OT

Key Results

A pairwise ranking g > h An attested form g An attested set G

1 grammatical element - learner doesn’t know which!

Captures uncertainty about the representation or underlying form of the speaker’s utterance

Today we’ll assume learner does know underlyinggazeb

o{ ga(zé.bo

), (ga.zé)bo

}

linear time in n coNP-hard2-complete

even with m=1

Page 6: Easy and Hard  Constraint Ranking in OT

Outline The Constraint Ranking problem Making fast ranking faster Extension: Considering all

competitors How hard is OT generation? Making slow ranking slower

Page 7: Easy and Hard  Constraint Ranking in OT

Pairwise Rankings: g > h

Must eliminate h before C1 or C2 makes it winC4 or C5 » C1C4 or C5 » C2

Satisfying these is necessary and sufficient

C1 C2 C3 C4 C5g

h

favor h favor g

constraints not ranked yet

Page 8: Easy and Hard  Constraint Ranking in OT

More Pairwise Rankings …

g > h g’ > h’ g’’ > h’’C4 or C5 »

C1C2 » C1

C4 or C5 » C2

C1 or C3 or C5 » C2

C2 » C3

C2 » C4

C1 or C3 or C5 » C4We’ll now use Recursive Constraint Demotion (RCD)(Tesar & Smolensky - easy greedy algorithm)

evidence from more pairs

Page 9: Easy and Hard  Constraint Ranking in OT

1

2 4

5

3

g > h g’ > h’ g’’ > h’’C4 or C5 » C1 C2 » C1C4 or C5 » C2 C1 or C3 or C5 » C2

C2 » C3C2 » C4 C1 or C3 or C5 » C4

Page 10: Easy and Hard  Constraint Ranking in OT

1

2 4

5

3

g > h g’ > h’ g’’ > h’’C4 or C5 » C1 C2 » C1C4 or C5 » C2 C1 or C3 or C5 » C2

C2 » C3C2 » C4 C1 or C3 or C5 » C4

Needn’t be dominated by anyone

Page 11: Easy and Hard  Constraint Ranking in OT

1

4

3

g > h g’ > h’ g’’ > h’’C4 or C5 » C1 C2 » C1C4 or C5 » C2 C1 or C3 or C5 » C2

C2 » C3C2 » C4 C1 or C3 or C5 » C4

2

5

Page 12: Easy and Hard  Constraint Ranking in OT

1

4

3

g > h g’ > h’ g’’ > h’’C4 or C5 » C1 C2 » C1C4 or C5 » C2 C1 or C3 or C5 » C2

C2 » C3C2 » C4 C1 or C3 or C5 » C4

25 »

Page 13: Easy and Hard  Constraint Ranking in OT

1

3

g > h g’ > h’ g’’ > h’’C4 or C5 » C1 C2 » C1C4 or C5 » C2 C1 or C3 or C5 » C2

C2 » C3C2 » C4 C1 or C3 or C5 » C4

25 » » 4

Page 14: Easy and Hard  Constraint Ranking in OT

Recursive Constraint Demotiong > h g’ > h’ g’’ > h’’

C4 or C5 » C1 C2 » C1C4 or C5 » C2 C1 or C3 or C5 » C2

C2 » C3C2 » C4 C1 or C3 or C5 » C4

How to find undominated constraint at each step?

T&S simply search: O(mn) per search O(mn2) But we can do better:

Abstraction: Topological sort of a hypergraph Ordinary topological sort is linear-time; same

here!

Page 15: Easy and Hard  Constraint Ranking in OT

shrink representation of hypergraph

1

2 4

5

3

g > h g’ > h’ g’’ > h’’C4 or C5 » C1 C2 » C1C4 or C5 » C2 C1 or C3 or C5 » C2

C2 » C3C2 » C4 C1 or C3 or C5 » C4

n=nodesM=edges mn

0

2

2

2

1

maintain count of parents

5

Page 16: Easy and Hard  Constraint Ranking in OT

1

2 4

3

g > h g’ > h’ g’’ > h’’C4 or C5 » C1 C2 » C1C4 or C5 » C2 C1 or C3 or C5 » C2

C2 » C3C2 » C4 C1 or C3 or C5 » C4

1

1

0

1

maintain count of parents

n=nodesM=edges mn

Delete that structure in time proportional to its sizeMaintain list of red nodes: find next in time O(1)Total time: O(M+n), down from O(Mn)

Page 17: Easy and Hard  Constraint Ranking in OT

Comparison: Constraint Demotion

Tesar & Smolensky 1996 Formerly same speed, but now RCD is faster

Advantage: CD maintains a full ranking at all times Can be run online (memoryless) This eventually converges; but not a conservative strategy

Current grammar is often inconsistent with past data To make it conservative:

On each new datum, rerank from scratch using all data (memorized)

Might as well use faster RCD for this Modifying the previous ranking is no faster, in worst case

Page 18: Easy and Hard  Constraint Ranking in OT

Outline The Constraint Ranking problem Making fast ranking faster Extension: Considering all

competitors How hard is OT generation? Making slow ranking slower

Page 19: Easy and Hard  Constraint Ranking in OT

New Problem Observed data: g, g’, … Must beat or tie all competitors

(Not enough to ensure g > h, g’ > h’ …)

Just use RCD? Try to divide g’s competitors h into equiv.

classes But can get exponentially many classes Hence exponentially many blue nodes

Page 20: Easy and Hard  Constraint Ranking in OT

But Greedy Algorithm Still Works Preserves spirit of RCD Greedily extend grammar 1 constraint at a time No compilation into hypergraph

14 325 »

chosen so far remaining

25 » 1»

25 » 3»

25 » 4»

check these partial grammars:pick one making g, g’, … optimal(maybe with ties to be broken later)

Page 21: Easy and Hard  Constraint Ranking in OT

But Greedy Algorithm Still Works Preserves spirit of RCD Greedily extend grammar 1 constraint at a time No compilation into hypergraph But must run OT generation mn2 times

To pick each of n constraints, check m forms under n grammars

We’ll see that this is hard … T&S’s solution also runs OT generation mn2 times

Error-Driven Constraint Demotion For n2 CD passes, for m forms, find (profile of) optimal

competitor That requires more info from generation - we’ll return to

this!

Page 22: Easy and Hard  Constraint Ranking in OT

Continuous Algorithms Simulated annealing

Boersma 1997: Gradual Learning Algorithm Constraint ranking is stochastic, with real-valued bias &

variance Maximum likelihood

Johnson 2000: Generalized Iterative Scaling (maxent) Constraint weights instead of strict ranking

Deal with noise and free variation! How many iterations to convergence?

Page 23: Easy and Hard  Constraint Ranking in OT

Outline The Constraint Ranking problem Making fast ranking faster Extension: Considering all

competitors How hard is OT generation? Making slow ranking slower

Page 24: Easy and Hard  Constraint Ranking in OT

Complexity Classes: Boolean

P

NPx(x)

coNPx(x)

Dp

2 =NPNP

xy(x,y)

2 =PNP

polytime w/ oracle: NP subroutinesrun in unit

time

X-hard X-complete = hardest in X

Page 25: Easy and Hard  Constraint Ranking in OT

Complexity Classes: Integer Integer-valued functions have classes too

FP (like P) Turing-machine polytime OptP (like NP x(x) ) min f(x) FPNP (like PNP = 2 )

Note: OptP-complete FPNP-complete Can ask Boolean questions about output of an OptP-

complete function; often yields complete decision problems

Page 26: Easy and Hard  Constraint Ranking in OT

OptP-complete Functions Traveling Salesperson

Minimum cost for touring a graph? Minimum Satisfying Assignment

Minimum bitstring b1 b2 … bn satisfying (b1, b2, … bn), a Boolean formula?

Optimal violation profile in OT! Given underlying form Given grammar of bounded finite-state constraints Clearly in OptP: min f(x) where f computes violation

profile As hard as Minimum Satisfying Assignment

Page 27: Easy and Hard  Constraint Ranking in OT

Hardness Proof Given formula (b1, b2, … bn) Need minimum satisfier b1b2 … bn (or 11…1 if unsat) Reduce to finding minimum violation profile

Let OT candidates be bitstrings b1b2 … bn

Let constraint C() be satisfied if (b1, b2, … bn)

C() C(b1) C(b2) C(b3)000 only

satisfiers survive

past here

0 0 0001 0 0 1010 0 1 0… …

Page 28: Easy and Hard  Constraint Ranking in OT

Subtlety in the Proof Turning into a DFA for C() might blow it

up exponentially - so not poly reduction! Luckily, we’re allowed to assume is in

CNF: = D1^ D2 ^ … Dm

C(D1) … C(Dm)

C(b1) C(b2) C(b3)

000 equivalent to C();

only satisfiers survive past here

0 0 0001 0 0 1010 0 1 0… …

Page 29: Easy and Hard  Constraint Ranking in OT

Another Subtlety Must ensure that if there is no satisfying

assignment, 11…1 wins Modify each C(Di) so that 11…1 satisfies it At worst, this doubles the size of the DFA

C(D1) … C(Dm)

C(b1) C(b2) C(b3)

000 equivalent to C();

only satisfiers survive past here

0 0 0001 0 0 1010 0 1 0… …

Page 30: Easy and Hard  Constraint Ranking in OT

Associated Decision Problems

OptVal FPNP-complete

OptVal < k? NP-completeOptVal = k? Dp-complete

Last bit of OptVal?

2 -complete

Is g optimal? coNP-complete

Is some g G optimal?

2 -complete

EDCD

RCD (mult.competitors)

Page 31: Easy and Hard  Constraint Ranking in OT

Is some g G optimal? Problem is in 2 =PNP:

OptVal < k? is in NP So binary search for OptVal via NP oracle Then ask oracle: g G with profile OptVal?

Completeness: Given , we built grammar making the MSA optimal 2-complete problem: Is final bit of MSA zero? Reduction: Is some g in {0,1}n-10 optimal? Notice that {0,1}n-10 is a natural attested set

gazebo { ga(zé.bo), (ga.zé)bo

}

Page 32: Easy and Hard  Constraint Ranking in OT

Outline The Constraint Ranking problem Making fast ranking faster Extension: Considering all

competitors How hard is OT generation? Making slow ranking slower

Page 33: Easy and Hard  Constraint Ranking in OT

Ranking With Attested Forms

Complexity of ranking? If restricted to 1 form: coNP-complete

no worse than checking correctness of ranking!

General lower bound: coNP-hard General upper bound: 2 =PNP

because RCD solves with O(mn2) many checks

Page 34: Easy and Hard  Constraint Ranking in OT

Ranking With Attested Sets Problem is in 2 xy(x,y)

(ranking, g G) h : g > h In fact 2-complete!

Proof by reduction from QSAT2 b1,…br c1,…cs (b1,…br, c1,…cs)

Few natural problems in this category Some learning problems that get positive and

negative evidence OT only has implicit negative evidence: no

other form can do better than the attested form

Page 35: Easy and Hard  Constraint Ranking in OT

Conclusions Easy ranking easier than known Hard ranking harder than known Adding bits of realism quickly drives

complexity of ranking through the roof Optimization adds a quantifier:

generation

ranking w/ uncertainty

derivational FP NP-complete

NP-complete

OT OptP-complete

coNP-hard, in 2

2-complete

Page 36: Easy and Hard  Constraint Ranking in OT

Open Questions Rescue OT by restricting something? Effect of relaxing restrictions?

Unbounded violations Non-finite-state constraints Non-poly-bounded candidates Uncertainty about underlying form

Parameterized analysis (Wareham 1998) Should exploit structure of Con

huge (linear time is too long!) but universal