600.325/425 Declarative Methods - J. Eisner - Constraints on Strings

Upload: myra-roberts

Post on 03-Jan-2016

Page 1: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

600.325/425 Declarative Methods - J. Eisner

1

Constraints on Strings

Page 2: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

What’s a constraint, again?

Unary constraint: a set of allowed values.
Binary constraint: a set of allowed value pairs.

[Figure: a unary constraint on X and a binary constraint on X and Y; axis values 0 to 5.]

Infinite sets? Sure … infinite subsets of (pairs of) integers, reals, …

How about soft constraints?

Page 3: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

What’s a constraint on strings?

Hard constraint: Does string S match pattern P? (Is it in the set?)
    P is a description of a set of strings.
    Like a constraint … how?
    S is a variable whose domain is the set of all strings!
    So P can be regarded as a unary constraint: let’s write P(S).

Soft constraint: How well does string S fit pattern P?
    A function mapping each string to a score / weight / cost.
    Like a soft constraint …
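
A minimal Python sketch of the two readings above, assuming the pattern is given as an ordinary regular expression. The names `satisfies` and `cost` are hypothetical, and the particular cost function (characters trimmed from the end until a match is found) is only an illustrative assumption, not something taken from the slides.

```python
import re

# Example pattern P; any regular expression would do here.
P = re.compile(r"a*b(c|d)*")

def satisfies(pattern, s):
    """Hard unary constraint P(S): True iff the whole string s is in the set."""
    return pattern.fullmatch(s) is not None

def cost(pattern, s):
    """Toy soft constraint (an assumption for illustration): 0 if s matches,
    otherwise the number of trailing characters that must be trimmed from s
    before the remainder matches; infinity if no prefix matches."""
    for trimmed in range(len(s) + 1):
        if pattern.fullmatch(s[:len(s) - trimmed]):
            return trimmed
    return float("inf")
```

For instance, `satisfies(P, "aabcd")` holds, while `cost(P, "abca")` charges for the single trailing character that breaks the match.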

Page 4: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

What is a pattern?

What operations would you expect for combining these string constraints?

If P is a pattern, then so is ~P.
    ~P matches exactly the strings that P doesn’t.
If P and Q are both patterns, then so is P & Q.
If P and Q are both patterns, then so is P | Q.

Wow, we can build up boolean formulas! Does this allow us to encode SAT? How?
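
One possible way to see the SAT connection, sketched in Python under stated assumptions: a truth assignment to n variables is encoded as a bit string of length n, and variable i becomes the pattern "the i-th character is 1". The helper names (`lit`, `neg`, `conj`, `disj`, `var`) are hypothetical, and satisfiability is checked here by brute-force enumeration rather than by any finite-state method.

```python
import re
from itertools import product

def lit(regex):
    """A pattern is modeled as a predicate on strings."""
    compiled = re.compile(regex)
    return lambda s: compiled.fullmatch(s) is not None

def neg(p):            # ~P
    return lambda s: not p(s)

def conj(p, q):        # P & Q
    return lambda s: p(s) and q(s)

def disj(p, q):        # P | Q
    return lambda s: p(s) or q(s)

def var(i, n):
    """Pattern for 'variable i is true': bit strings of length n with a 1 at position i."""
    return lit("[01]{%d}1[01]{%d}" % (i, n - i - 1))

# Formula (x0 | ~x1) & (x1 | x2) & ~x0 over n = 3 variables.
n = 3
formula = conj(disj(var(0, n), neg(var(1, n))),
               conj(disj(var(1, n), var(2, n)), neg(var(0, n))))

# The formula is satisfiable iff some length-n bit string matches the pattern.
models = ["".join(bits) for bits in product("01", repeat=n) if formula("".join(bits))]
```

The design choice is that the formula's structure is copied directly into pattern operators, so asking whether the combined pattern matches any string is the same question as asking whether the formula has a satisfying assignment.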

Page 5: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

More about the relation to constraints

By building complicated patterns from simple ones, we are building up complicated constraints!

That is also allowed in ECLiPSe:

    alldiff3(X,Y,Z) :- X #\= Y, Y #\= Z, X #\= Z.

    between(X,Y,Z) :- X #< Y, Y #< Z.   % either this
    between(X,Y,Z) :- X #> Y, Y #> Z.   % ... or this

    % the same constraint written as a single clause:
    between(X,Y,Z) :- (X #< Y, Y #< Z) or (X #> Y, Y #> Z).

Now we can use “alldiff3” and “between” as new constraints.

Hang on, patterns are only unary constraints. Generalize?

Page 6: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

What is a pattern?

Binary constraint (relation): What are all the possible translations of string S?
    A description of a set of string pairs (S,T).
    Like a binary constraint: let’s write P(S,T).
    We can also do n-ary constraints more generally, but most current solvers don’t allow them.

Fuzzy case: How strongly is string S related to each T? Which one is it most strongly related to?

Ok, so what’s new here? Why does it matter that they’re string variables?

Page 7: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Some Pattern Operators

    ~    complementation          ~P
    &    intersection             P & Q
    |    union                    P | Q
         concatenation            PQ
    *    iteration (0 or more)    P*
    +    iteration (1 or more)    P+
    -    difference               P - Q
    \    char complement          \P   (equiv. to ? - P)

Which of these can be treated as syntactic sugar? That is, which of these can we get rid of?

Page 8: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

More Pattern Operators

    .x.   crossproduct               P .x. Q
    .o.   composition                P .o. Q
    .u    upper (input) language     P.u    “domain”
    .l    lower (output) language    P.l    “range”

Page 9: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

The language of “regular expressions”

A variable S has infinitely many possible values if its type is “string” or “real”.
    So to specify a constraint on S, it is not enough to list possible values.
    Language for simple constraints on reals: linear equations.
    Language for simple constraints on strings: regular expressions.

Regular expression language
    You probably know the standard form of regular expressions.
    A standard regexp is a unary constraint (“X must match a*b(c|d)*”).
    Basic operators: union “|”, concatenation, closure “*”.

But the language has been extended in various ways:
    soft constraints (specifies costs)
    binary constraints (over pairs of string variables)
    n-ary constraints (over n string variables)

Page 10: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Regular expressions = finite-state automata

1. Given a regexp that specifies a constraint, you can build an FSA that efficiently determines whether a given string satisfies the constraint.

2. Given an FSA, you can find an equivalent regexp.

So the “compiled” form of the little language can be converted back to the source form.

Conclusion: Anything you can do with regexps, you can do with FSAs, and vice-versa.

Page 11: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Given a regular expression …

1. Make a parse tree for it

2. Build up the FSA from the bottom up

Example: (ab|c)*(bb*a)

[Figure: parse tree for (ab|c)*(bb*a), with leaves a, b, c, b, b, a and internal nodes labeled concat, union, closure.]
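
The bottom-up step can be sketched in Python as follows. This is a Thompson-style construction with epsilon arcs, chosen here only for illustration; the slides do not say which construction is used, and the names `Frag`, `sym`, `concat`, `union`, `closure` and `accepts` are hypothetical. Each parse-tree node builds a small machine from the machines of its children.

```python
import itertools

_ids = itertools.count()          # fresh state numbers

class Frag:
    """An NFA fragment with exactly one start state and one final state."""
    def __init__(self):
        self.start, self.final = next(_ids), next(_ids)
        self.arcs = []            # (source, label, target); label None means epsilon

def sym(c):
    f = Frag()
    f.arcs = [(f.start, c, f.final)]
    return f

def concat(a, b):
    f = Frag()
    f.arcs = a.arcs + b.arcs + [(f.start, None, a.start),
                                (a.final, None, b.start),
                                (b.final, None, f.final)]
    return f

def union(a, b):
    f = Frag()
    f.arcs = a.arcs + b.arcs + [(f.start, None, a.start), (f.start, None, b.start),
                                (a.final, None, f.final), (b.final, None, f.final)]
    return f

def closure(a):
    f = Frag()
    f.arcs = a.arcs + [(f.start, None, a.start), (a.final, None, a.start),
                       (a.final, None, f.final), (f.start, None, f.final)]
    return f

def accepts(f, s):
    """Simulate the NFA on s, tracking the set of reachable states."""
    def eps_close(states):
        stack, seen = list(states), set(states)
        while stack:
            q = stack.pop()
            for src, label, dst in f.arcs:
                if src == q and label is None and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen
    current = eps_close({f.start})
    for c in s:
        current = eps_close({dst for src, label, dst in f.arcs
                             if src in current and label == c})
    return f.final in current

# (ab|c)*(bb*a), built from the leaves upward.
machine = concat(
    closure(union(concat(sym("a"), sym("b")), sym("c"))),
    concat(sym("b"), concat(closure(sym("b")), sym("a"))),
)
```

The resulting machine is nondeterministic; determinizing and minimizing it would be separate steps.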

Page 12: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Concatenation (of soft constraints)

[Figure: two weighted FSAs and the weighted FSA for their concatenation.]

example thanks to M. Mohri

Page 13: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Union

[Figure: two weighted FSAs (+) and the weighted FSA for their union.]

example thanks to M. Mohri

Page 14: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Union

[Figure: two weighted FSAs (+) and the weighted FSA for their union, which adds epsilon arcs labeled eps/0, eps/0.3, eps/0.8.]

example thanks to M. Mohri

Page 15: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Closure (also illustrates binary constraints)

[Figure: a weighted machine and the machine for its closure (*), which adds a new start state 4.]

Why add new start state 4? Why not just make state 0 final?

example thanks to M. Mohri

Page 16: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Complementation

M represents a constraint on strings. We’d like to represent ~M (i.e., a constraint that says that the string must not be accepted by M).

Just change M’s final states to non-final and vice-versa.

This only works if every string takes you to exactly one state in M (final or non-final). So M must be both deterministic and complete. Any M can be put in this form.

example thanks to M. Mohri
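
A small Python sketch of the final-state flip, assuming an unweighted machine that is already deterministic and complete. The dictionary layout and the example machine (strings over {a, b} that end in "ab") are hypothetical choices made for illustration.

```python
def complement(dfa):
    """Flip final and non-final states of a deterministic, complete FSA."""
    return {**dfa, "finals": dfa["states"] - dfa["finals"]}

def accepts(dfa, s):
    q = dfa["start"]
    for c in s:
        q = dfa["delta"][q, c]    # always defined, because the DFA is complete
    return q in dfa["finals"]

# Example M: strings over {a, b} that end in "ab".
M = {
    "states": {0, 1, 2},
    "start": 0,
    "finals": {2},
    "delta": {(0, "a"): 1, (0, "b"): 0,
              (1, "a"): 1, (1, "b"): 2,
              (2, "a"): 1, (2, "b"): 0},
}

not_M = complement(M)
```

If `delta` were missing an entry (an incomplete machine), some strings would be rejected by both M and the flipped machine, which is why completeness is required.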

Page 17: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

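Intersection

[Figure: first machine (states 0, 1, 2) with arcs fat/0.5, pig/0.3, eats/0, sleeps/0.6 and final state 2/0.8
 & second machine (states 0, 1, 2) with arcs fat/0.2, pig/0.4, eats/0.6, sleeps/1.3 and final state 2/0.5
 = product machine with states 0,0; 0,1; 1,1; 2,0/0.8; 2,2/1.3 and arcs fat/0.7, pig/0.7, eats/0.6, sleeps/1.9.]

example adapted from M. Mohri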

Page 18: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Intersection

[Figure: the same two machines and their product machine, with matching paths highlighted.]

Paths 0012 and 0110 both accept fat pig eats.
So must the new machine: along path 0,0 0,1 1,1 2,0.

example adapted from M. Mohri

Page 19: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Intersection

[Figure: the same construction, highlighting the fat arcs: fat/0.5 and fat/0.2 combine into fat/0.7 from state 0,0 to state 0,1.]

Paths 00 and 01 both accept fat.
So must the new machine: along path 0,0 0,1.

Page 20: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Intersection

[Figure: the same construction, highlighting the pig arcs: pig/0.3 and pig/0.4 combine into pig/0.7 from state 0,1 to state 1,1.]

Paths 01 and 11 both accept pig.
So must the new machine: along path 0,1 1,1.

Page 21: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Intersection

[Figure: the same construction, highlighting the sleeps arcs: sleeps/0.6 and sleeps/1.3 combine into sleeps/1.9 from state 1,1 to final state 2,2/1.3.]

Paths 12 and 12 both accept sleeps.
So must the new machine: along path 1,1 2,2.

Page 22: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Intersection

[Figure: the same construction, highlighting the eats arcs: eats/0 and eats/0.6 combine into eats/0.6, completing the product machine.]

Page 23: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Intersection

Why is intersection guaranteed to terminate?

How big a machine might be produced by intersection?
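
The construction walked through on the preceding slides can be sketched in Python as below. The arc weights are read off the slides; the dictionary layout, the function name `intersect`, and the assumption that state 0 of the second machine is final with weight 0 (inferred from the product's final weight 0.8 at state 2,0) are choices made for this sketch. Weights are treated as costs that add.

```python
from collections import deque

def intersect(A, B):
    """Product construction for deterministic weighted FSAs; costs add."""
    start = (A["start"], B["start"])
    arcs, finals = {}, {}
    queue, seen = deque([start]), {start}
    while queue:
        qa, qb = state = queue.popleft()
        if qa in A["finals"] and qb in B["finals"]:
            finals[state] = A["finals"][qa] + B["finals"][qb]
        for (src_a, word), (dst_a, w_a) in A["arcs"].items():
            if src_a != qa or (qb, word) not in B["arcs"]:
                continue                      # both machines must read the word
            dst_b, w_b = B["arcs"][qb, word]
            nxt = (dst_a, dst_b)
            arcs[state, word] = (nxt, w_a + w_b)
            if nxt not in seen:               # each state pair is explored once
                seen.add(nxt)
                queue.append(nxt)
    return {"start": start, "arcs": arcs, "finals": finals}

# arcs: (state, word) -> (next state, cost); finals: state -> final cost
A = {"start": 0, "finals": {2: 0.8},
     "arcs": {(0, "fat"): (0, 0.5), (0, "pig"): (1, 0.3),
              (1, "eats"): (2, 0.0), (1, "sleeps"): (2, 0.6)}}
B = {"start": 0, "finals": {0: 0.0, 2: 0.5},   # final weight of state 0 is assumed
     "arcs": {(0, "fat"): (1, 0.2), (1, "pig"): (1, 0.4),
              (1, "eats"): (0, 0.6), (1, "sleeps"): (2, 1.3)}}

C = intersect(A, B)
```

Because `seen` can never hold more than |A| × |B| state pairs, the loop must terminate, and that product is also the worst-case size of the result. Only pairs reachable from the start pair are built, which is why this example yields five states rather than nine.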

Page 24: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Given a regular expression …

1. Make a parse tree for it

2. Build up the FSA from the bottom up

Example: (ab|c)*(bb*a)

[Figure: parse tree for (ab|c)*(bb*a), with leaves a, b, c, b, b, a and internal nodes labeled concat, union, closure.]

Page 25: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Given an FSA …

Find a regular expression describing all paths from initial state 1 to final state 5.

[Figure: FSA with states 1 to 5; start state 1, final state 5.]

Paths from 1 to 5:

e12 ((e23 e33* e35) | e24 e45)

Page 26: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Given an FSA …

Find a regular expression describing all paths from initial state 1 to final state 5.

[Figure: FSA with states 1 to 5; start state 1, final state 5.]

Paths from 1 to 5:

Earlier answer: e12 ((e23 e33* e35) | e24 e45)

Answer now: e12 ((e23 e33* (e35 | e34 e45)) | e24 e45)

Page 27: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Given an FSA …

Find a regular expression describing all paths from initial state 1 to final state 5.

[Figure: FSA with states 1 to 5; start state 1, final state 5.]

Paths from 1 to 5:

Earlier answer: e12 ((e23 e33* (e35 | e34 e45)) | e24 e45)

Answer now: e12 ( (e23 (e33 | e34 e43)* (e35 | e34 e45)) | (e24 (e43 e33* e34)* (e45 | e43 e35)) )

Page 28: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Given an FSA …

Find a regular expression describing all paths from initial state 1 to final state 5.

[Figure: a more densely connected FSA with states 1 to 5; start state 1, final state 5.]

Paths from 1 to 5:

???

Page 29: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Let’s do a simpler variant first …

Does there exist any path from initial state 1 to final state 5?

[Figure: graph with vertices 1 to 5; start vertex 1.]

If there’s a way to get from 1 to 3 and from 3 to 5, then there’s a way to get from 1 to 5.

More generally, transitive closure problem: For each A, B, does there exist any path from A to B?

slide thanks to R. Tamassia & M. Goodrich (modified)

Page 30: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Let’s do a simpler variant first …

Does there exist any path from initial state 1 to final state 5?

If there’s a way to get from 1 to 3 and from 3 to 5, then there’s a way to get from 1 to 5.

Hmm … should I look for a 1→3 path first in hopes of using it to build a 1→5 path? Or vice-versa?

More generally, transitive closure problem: For each A, B, does there exist any path from A to B?

[Figure: graph with vertices 1 to 5, plus the same chain drawn with two vertex numberings: 1 2 3 5 and 1 2 5 3.]

Page 31: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Let’s do a simpler variant first …

If there’s a way to get from 1 to 3 and from 3 to 5, then there’s a way to get from 1 to 5.

Hmm … should I look for a 1→3 path first in hopes of using it to build a 1→5 path? Or vice-versa?

[Figure: the same chain drawn with two vertex numberings: 1 2 3 5 and 1 2 5 3.]

Option #1: Gradually build up longer paths (length-1, length-2, length-3 …). How do we deal with cycles?

Option #2 (less obvious): Gradually allow paths of higher and higher order, where a path’s order is the number of the highest vertex that the path goes through.

Both have O(n^3) runtime. But option #2 allows more flexible handling of cycles. We’ll need that when we return to our FSA problem.

Page 32: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Floyd-Warshall transitive closure algorithm

If there’s a way to get from 1 to 3 and from 3 to 5, then there’s a way to get from 1 to 5.

Hmm … should I look for a 1→3 path first in hopes of using it to build a 1→5 path? Or vice-versa?

[Figure: the chain drawn with vertex numbering 1 2 5 3.]

Option #2 (less obvious): Gradually allow paths of higher and higher order, where a path’s order is the number of the highest vertex that the path goes through.

What are the paths of order 0?
What are the paths of order 1?
What are the paths of order 2?
How big can a path’s order be?
What are the paths of order 5?

Page 33: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Floyd-Warshall transitive closure algorithm

If there’s a way to get from 1 to 3 and from 3 to 5, then there’s a way to get from 1 to 5.

Option #2 (less obvious): Gradually allow paths of higher and higher order, where a path’s order is the number of the highest vertex that the path goes through.

Definition: $p^k_{ij}$ = true iff there is an i→j path of order at most k.

1. Define $p^0$: For each i,j, set $p^0_{ij}$ = true iff there is an i→j edge.
2. For k=1, 2, …, n, define $p^k$:

[Figure: graph with vertices 1 to 5; start vertex 1.]

Page 34: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Floyd-Warshall transitive closure algorithm

If there’s a way to get from 1 to 3 and from 3 to 5, then there’s a way to get from 1 to 5.

Option #2 (less obvious): Gradually allow paths of higher and higher order, where a path’s order is the number of the highest vertex that the path goes through.

Definition: $p^k_{ij}$ = true iff there is an i→j path of order at most k.

1. Define $p^0$: For each i,j, set $p^0_{ij}$ = true iff there is an i→j edge.
2. For k=1, 2, …, n, define $p^k$:
   For each i,j, set $p^k_{ij} = p^{k-1}_{ij} \lor (p^{k-1}_{ik} \land p^{k-1}_{kj})$
3. Return $p^n$ (e.g., what is $p^n_{1n}$?)

[Figure: an i→j path through vertex k. The i→k part uses only vertices numbered 1,…,k-1, and so does the k→j part. The combined path is new, but still uses only vertices numbered 1,…,k.]

parts of slide thanks to R. Tamassia & M. Goodrich
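
The three steps above can be sketched in Python as follows. The function name `transitive_closure` is hypothetical, and the table is updated in place instead of keeping a separate copy for each k, which is a common simplification. The example edge list is taken from the arcs named in the earlier path expression e12 ((e23 e33* e35) | e24 e45).

```python
def transitive_closure(n, edges):
    """Return p where p[i][j] is True iff there is a path from i to j.
    Vertices are numbered 1..n; row and column 0 are unused."""
    p = [[False] * (n + 1) for _ in range(n + 1)]
    for i, j in edges:
        p[i][j] = True                        # step 1: p^0 holds the direct edges
    for k in range(1, n + 1):                 # step 2: allow k as an intermediate vertex
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                p[i][j] = p[i][j] or (p[i][k] and p[k][j])
    return p                                  # step 3: this is p^n

edges = [(1, 2), (2, 3), (3, 3), (3, 5), (2, 4), (4, 5)]
reach = transitive_closure(5, edges)
```

The three nested loops over n vertices give the O(n^3) runtime. Note that `reach[i][i]` is True only when i lies on a cycle, because paths here have at least one edge.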

Page 35: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Floyd-Warshall Example

[Figure: example graph on vertices v1 to v7.]

slide thanks to R. Tamassia & M. Goodrich (modified)

Page 36: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Floyd-Warshall: k=1 (computes $p^1$ from $p^0$)

[Figure: example graph on vertices v1 to v7 after this step.]

slide thanks to R. Tamassia & M. Goodrich (modified)

Page 37: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Floyd-Warshall: k=2 (computes $p^2$ from $p^1$)

[Figure: example graph on vertices v1 to v7 after this step.]

slide thanks to R. Tamassia & M. Goodrich (modified)

Page 38: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Floyd-Warshall: k=3 (computes $p^3$ from $p^2$)

[Figure: example graph on vertices v1 to v7 after this step.]

slide thanks to R. Tamassia & M. Goodrich (modified)

Page 39: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Floyd-Warshall: k=4 (computes $p^4$ from $p^3$)

[Figure: example graph on vertices v1 to v7 after this step.]

slide thanks to R. Tamassia & M. Goodrich (modified)

Page 40: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Floyd-Warshall: k=5 (computes $p^5$ from $p^4$)

[Figure: example graph on vertices v1 to v7 after this step.]

slide thanks to R. Tamassia & M. Goodrich (modified)

Page 41: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Floyd-Warshall: k=6 (computes $p^6$ from $p^5$)

[Figure: example graph on vertices v1 to v7 after this step.]

slide thanks to R. Tamassia & M. Goodrich (modified)

Page 42: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Floyd-Warshall: k=7 (computes $p^7$ from $p^6$)

[Figure: example graph on vertices v1 to v7 after this step.]

slide thanks to R. Tamassia & M. Goodrich (modified)

Page 43: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Regular expression version (Kleene/Tarjan)

Find a regular expression describing all paths from initial state 1 to final state 5.

[Figure: FSA with states 1 to 5; start state 1, final state 5.]

Paths from 1 to 5:

Earlier answer: e12 ((e23 e33* (e35 | e34 e45)) | e24 e45)

Answer now: e12 ( (e23 (e33 | e34 e43)* (e35 | e34 e45)) | (e24 (e43 e33* e34)* (e45 | e43 e35)) )

Page 44: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Regular expression version (Kleene/Tarjan)

Find a regular expression describing all paths from initial state 1 to final state 5.

[Figure: a more densely connected FSA with states 1 to 5; start state 1, final state 5.]

Paths from 1 to 5:

???

Page 45: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Regular expression version (Kleene/Tarjan)

If there’s a way to get from 1 to 3 and from 3 to 5, then there’s a way to get from 1 to 5.

Definition: $p^k_{ij}$ = regular expression describing all i→j paths that have order at most k.

1. Define $p^0$: For each i,j, set $p^0_{ij} = e_{ij}$ if that edge exists, else $\emptyset$.
2. For k=1, 2, …, n, define $p^k$:
   For each i,j, set $p^k_{ij} = p^{k-1}_{ij} \mid (p^{k-1}_{ik} \; (p^{k-1}_{kk})^{*} \; p^{k-1}_{kj})$
   (a regexp using all three of union, concat, closure!)
3. Return $p^n$ (e.g., what is $p^n_{1n}$?)

[Figure: an i→j path through vertex k. The i→k part uses only vertices numbered 1,…,k-1, and so does the k→j part. The combined path is new, but still uses only vertices numbered 1,…,k.]

parts of slide thanks to R. Tamassia & M. Goodrich
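
A Python sketch of the same recurrence, building regexps as plain strings. The names `kleene` and `matches` are hypothetical, `None` stands for the empty set, and no simplification of the resulting expressions is attempted. The checker assumes that arc names contain no regexp metacharacters and all have the same length, so that removing spaces does not make them ambiguous.

```python
import re

def kleene(n, edges):
    """edges maps (i, j) to an arc name, for vertices 1..n.
    Returns p where p[i][j] is a regexp string describing all i->j paths,
    or None if there are none."""
    p = [[edges.get((i, j)) for j in range(n + 1)] for i in range(n + 1)]
    for k in range(1, n + 1):
        prev = [row[:] for row in p]                  # this is p^{k-1}
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                a, loop, b = prev[i][k], prev[k][k], prev[k][j]
                if a is None or b is None:
                    continue                          # no new path through k
                parts = [a] + ([f"({loop})*"] if loop is not None else []) + [b]
                through_k = " ".join(parts)           # concat, with closure of the k->k loop
                p[i][j] = (through_k if prev[i][j] is None
                           else f"({prev[i][j]}|{through_k})")   # union
    return p

def matches(regexp, path):
    """Check a space-separated arc sequence against a regexp from kleene()."""
    return re.fullmatch(regexp.replace(" ", ""), path.replace(" ", "")) is not None

edges = {(1, 2): "e12", (2, 3): "e23", (3, 3): "e33",
         (3, 5): "e35", (2, 4): "e24", (4, 5): "e45"}
table = kleene(5, edges)
```

With these edges, `table[1][5]` describes the same paths as the hand-written expression e12 ((e23 e33* e35) | e24 e45), although the string the code produces is less compact.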

Page 46: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Regular expression version (Kleene/Tarjan)

What if the arcs have labels?

[Figure: FSA with states 1 to 5 (start state 1, final state 5), with arcs labeled a, b, c.]

Paths from 1 to 5:

Earlier answer: e12 ((e23 e33* (e35 | e34 e45)) | e24 e45)

Answer now: e12 ( (e23 (e33 | e34 e43)* (e35 | e34 e45)) | (e24 (e43 e33* e34)* (e45 | e43 e35)) )

Page 47: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Regular expression version (Kleene/Tarjan)

What if the arcs have labels? Just substitute them in:

[Figure: FSA with states 1 to 5 (start state 1, final state 5), with arcs labeled a, b, c; each arc name in the path expressions is replaced by that arc’s label.]

Paths from 1 to 5:

Earlier answer: e12 ((e23 e33* (e35 | e34 e45)) | e24 e45)

Answer now: e12 ( (e23 (e33 | e34 e43)* (e35 | e34 e45)) | (e24 (e43 e33* e34)* (e45 | e43 e35)) )

Page 48: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Regular languages as points in a high-dimensional space

    Regexp           Formal power series
    abc              abc
    abc:2            2abc   (weighted)
    ab|ac            ab + ac
    a(b|c)           ab + ac
    a(b|(c:2))       ab + 2ac
    ab*c             ac + abc + abbc + abbbc + …
    a(b:2)*c         ac + 2abc + 4abbc + 8abbbc + …

Instead of dimensions x^2, y^2, xy, etc., every possible string is a dimension, and its coefficient is the coordinate (often 0).

Page 49: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Regular languages as points in a high-dimensional space

Suppose P, Q are two regular languages represented as these “formal power series.”

What is the sum P+Q? Union! (We double-count …)

What is the product PQ? Concatenation!

What is the Hadamard product P ⊙ Q? (i.e., the dot product before you sum: x ⊙ y = (x1y1, x2y2, …)) Intersection!

What is 1/(1-P)? * closure!

Could we use these techniques to classify strings using kernel SVMs?
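
These correspondences can be tried out with a small Python sketch in which a power series is a dictionary from strings to coefficients. The function names are hypothetical. Since 1/(1-P) has infinitely many terms, `star` truncates at a maximum string length and assumes P gives no weight to the empty string.

```python
def add(P, Q):
    """P + Q: union (a string in both series has its coefficients added)."""
    out = dict(P)
    for s, c in Q.items():
        out[s] = out.get(s, 0) + c
    return out

def mul(P, Q):
    """PQ: concatenation (every string of P followed by every string of Q)."""
    out = {}
    for s, c in P.items():
        for t, d in Q.items():
            out[s + t] = out.get(s + t, 0) + c * d
    return out

def hadamard(P, Q):
    """Coordinate-wise product: intersection."""
    return {s: c * Q[s] for s, c in P.items() if s in Q}

def star(P, max_len):
    """1/(1-P) = 1 + P + PP + ..., keeping only strings of length <= max_len."""
    if "" in P:
        raise ValueError("this sketch assumes P has no empty-string term")
    total, power = {"": 1}, {"": 1}
    while power:
        power = {s: c for s, c in mul(power, P).items() if len(s) <= max_len}
        total = add(total, power)
    return total

# a(b:2)*c, truncated to at most three b's
series = mul(mul({"a": 1}, star({"b": 2}, 3)), {"c": 1})
```

Here `series` reproduces the first four terms of the last table row, ac + 2abc + 4abbc + 8abbbc.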

Page 50: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

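Function from strings to ...

                  Acceptors (FSAs)     Transducers (FSTs)
    Unweighted    {false, true}        strings
    Weighted      numbers              (string, num) pairs

[Figure: an example machine in each cell. Unweighted acceptor: arcs a, c, ε. Unweighted transducer: arcs a:x, c:z, ε:y. Weighted acceptor: arcs a/.5, c/.7, ε/.5 and weight .3. Weighted transducer: arcs a:x/.5, c:z/.7, ε:y/.5 and weight .3.]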

Page 51: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Sample functions

                  Acceptors (FSAs)             Transducers (FSTs)
                  {false, true}                strings
    Unweighted    Grammatical?                 Markup
                                               Correction
                                               Translation

                  numbers                      (string, num) pairs
    Weighted      How grammatical?             Good markups
                  Better, how likely?          Good corrections
                                               Good translations

Page 52: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Sample data, encoded same way

                  Acceptors (FSAs)             Transducers (FSTs)
                  {false, true}                strings
    Unweighted    Input string                 Bilingual corpus
                  Corpus                       Bilingual lexicon
                  Dictionary                   Database (WordNet)

                  numbers                      (string, num) pairs
    Weighted      Input lattice                Prob. bilingual lexicon
                  Reweighted corpus            Weighted database
                  Weighted dictionary

[Figure: small example machines, including the string b a n a n a drawn as a chain of arcs.]

Page 53: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Some Applications

Prediction, classification, generation of text
More generally, “filling in the blanks” (probabilistic reconstruction of hidden data)
    Speech recognition
    Machine translation, OCR, other noisy-channel models
    Sequence alignment / Edit distance / Computational biology
    Text normalization, segmentation, categorization
    Information extraction
    Stochastic phonology/morphology, including lexicon
    Tagging, chunking, finite-state parsing
    Syntactic transformations (smoothing PCFG rulesets)

Page 54: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Finite-state “programming”

Programming Langs:
    Function → (programmer) → Source code → (compiler) → Object code → (optimizer) → Better object code

Finite-State World:
    Function on strings → (programmer) → Regular expression, e.g., a?c* → (regexp compiler) → Finite-state machine → (determinization, minimization, pruning) → Better object code

[Figure: the finite-state machine for a?c*, with arcs labeled a and c.]

Page 55: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Finite-state “programming”

    Programming Langs                              Finite-State World
    Function composition                           FST/WFST composition
    Function inversion (available in Prolog)       FST inversion
    Higher-order functions ...                     Finite-state operators ...
    Small modular cooperating functions            Small modular regexps,
      (structured programming)                       combined via operators

Page 56: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Finite-state “programming”

More features you wish other languages had!

    Programming Langs      Finite-State World
    Parallelism            Apply to set of strings
    Nondeterminism         Nondeterministic FST
    Stochasticity          Prob.-weighted arcs

Page 57: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Finite-State Operations

Projection GIVES YOU marginal distribution

    p(x) = domain( p(x,y) )
    p(y) = range( p(x,y) )

[Figure: an arc a : b / 0.3, shown once for domain and once for range.]

Page 58: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Finite-State Operations

Probabilistic union GIVES YOU mixture model

    $p(x) +_{0.3} q(x) = 0.3\,p(x) + 0.7\,q(x)$

[Figure: a new start state with an arc of weight 0.3 into the machine for p(x) and an arc of weight 0.7 into the machine for q(x).]

Page 59: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

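Finite-State Operations

Probabilistic union GIVES YOU mixture model

    $p(x) +_{\lambda} q(x) = \lambda\,p(x) + (1-\lambda)\,q(x)$

[Figure: a new start state with an arc of weight λ into the machine for p(x) and an arc of weight 1-λ into the machine for q(x).]

Learn the mixture parameter λ!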

Page 60: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Finite-State Operations

Composition GIVES YOU chain rule

    p(x|z) = p(x|y) o p(y|z)
    p(x,z) = p(x|y) o p(y|z) o z

The most popular statistical FSM operation

Cross-product construction

Page 61: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Finite-State Operations

Concatenation, probabilistic closure HANDLE unsegmented text

    Concatenation: p(x) q(x)
    Probabilistic closure: $p(x)\,{*}_{0.3}$

[Figure: machines for p(x) and q(x) glued end to end, and a closure machine that loops through p(x) with arcs weighted 0.3 and 0.7.]

Just glue together machines for the different segments, and let them figure out how to align with the text

Page 62: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Finite-State Operations

Directed replacement MODELS noise or postprocessing

    p(x, noisy y) = p(x,y) o D

D: noise model defined by dir. replacement

Resulting machine compensates for noise or postprocessing

Page 63: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Finite-State Operations

Intersection GIVES YOU product models
    e.g., exponential / maxent, perceptron, Naïve Bayes, …

    p(x)*q(x) = p(x) & q(x)

    pNB(y | x) ∝ p(y) & p(A(x)|y) & p(B(x)|y) & …

Cross-product construction (like composition)

Need a normalization op too: computes $\sum_x f(x)$, the “pathsum” or “partition function”

Page 64: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Finite-State Operations

Conditionalization (new operation)

    p(y | x) = condit( p(x,y) )

Construction: reciprocal(determinize(domain( p(x,y) ))) o p(x,y)
    (not possible for all weighted FSAs)

Resulting machine can be composed with other distributions: p(y | x) * q(x)

Page 65: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Other Useful Finite-State Constructions

Complete graphs YIELD n-gram models
Other graphs YIELD fancy language models (skips, caching, etc.)

Compilation from other formalism to FSM:
    Wordlist (cf. trie), pronunciation dictionary ...
    Speech hypothesis lattice
    Decision tree (Sproat & Riley)
    Weighted rewrite rules (Mohri & Sproat)
    TBL or probabilistic TBL (Roche & Schabes)
    PCFG (approximation!) (e.g., Mohri & Nederhof)
    Optimality theory grammars (e.g., Eisner)
    Logical description of set (Vaillette; Klarlund)

Page 66: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Regular Expression Calculus as a Programming Language

Programming Langs:
    Function → (programmer) → Source code → (compiler) → Object code → (optimizer) → Better object code

Finite-State World:
    Function on strings → (programmer) → Regular expression, e.g., a?c* → (regexp compiler) → Finite-state machine → (determinization, minimization, pruning) → Better object code

[Figure: the finite-state machine for a?c*, with arcs labeled a and c.]

Page 67: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

Regular Expression Calculus as a Modelling Language

Oops! Statistical FSMs still done “in assembly language”!
    Build machines by manipulating arcs and states
    For training,
        get the weights by some exogenous procedure and patch them onto arcs
        you may need extra training data for this
        you may need to devise and implement a new variant of EM

Would rather build models declaratively

    ((a *_{.7} b) +_{.5} (a b *_{.6})) repl_{.9}((a:(b +_{.3} ε))*, L, R)

Page 68: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

A Simple Example: Segmentation

tapirseatgrass
    tapirs eat grass?
    tapir seat grass?
    tap irse at grass?
    ...

Strategy: build a finite-state model of p(spaced text, spaceless text)

Then maximize p(???, tapirseatgrass)

Start with a distribution p(English word): a machine D (for dictionary)

Construct p(spaced text): (D space) *_{0.99} D

Compose with p(spaceless | spaced): ((¬space) + (space:ε))*
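
The maximization can be illustrated without any finite-state toolkit by a small dynamic program in Python. This is a stand-in for composing the machines and finding the best path, not the method of the slides. The function name `segment`, the parameter `p_continue` (playing the role of the 0.99 closure weight), and the word probabilities are all made up for this sketch.

```python
import math

def segment(text, word_prob, p_continue=0.99):
    """Return the most probable spaced text whose space-free form is `text`,
    or None if no sequence of dictionary words spells it out."""
    n = len(text)
    # best[i] = (log probability, start of last word) for the prefix text[:i]
    best = [(-math.inf, None)] * (n + 1)
    best[0] = (0.0, None)
    for end in range(1, n + 1):
        for start in range(end):
            word = text[start:end]
            if word not in word_prob or best[start][0] == -math.inf:
                continue
            score = best[start][0] + math.log(word_prob[word])
            if start > 0:
                score += math.log(p_continue)   # one more "D space" iteration
            if score > best[end][0]:
                best[end] = (score, start)
    if best[n][1] is None:
        return None
    words, i = [], n
    while i > 0:                                # follow back-pointers
        start = best[i][1]
        words.append(text[start:i])
        i = start
    return " ".join(reversed(words))

# Toy dictionary D with invented probabilities.
D = {"tapir": 0.001, "tapirs": 0.0005, "eat": 0.01, "seat": 0.002,
     "grass": 0.004, "tap": 0.002, "at": 0.02}
```

The constant factor for stopping the closure is the same for every segmentation, so it is left out of the score. With this toy dictionary, "tap irse at grass" is ruled out because "irse" has no probability, which is the problem the novel-word model for D is meant to address.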

Page 69: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings

A Simple Example: Segmentation

Strategy: build a finite-state model of p(spaced text, spaceless text)
    (train on spaced or spaceless text)

Then maximize p(???, tapirseatgrass)

Start with a distribution p(English word): a machine D (for dictionary)

Construct p(spaced text): (D space) *_{0.99} D

Compose with p(spaceless | spaced): ((¬space) + (space:ε))*

D should include novel words:
    D = KnownWord +_{0.99} (Letter *_{0.85} Suffix)
    Could improve to consider letter n-grams, morphology ...

Noisy channel could do more than just delete spaces:
    Vowel deletion (Semitic); OCR garbling (cl → d, ri → n, rn → m ...)

Page 70: 600.325/425 Declarative Methods - J. Eisner1 Constraints on Strings
