Finite-state automata 3 Morphology Day 14 LING 681.02 Computational Linguistics Harry Howard Tulane University


Page 1:

Finite-state automata 3

Morphology
Day 14

LING 681.02
Computational Linguistics

Harry Howard
Tulane University

Page 2:

25-Sept-2009 LING 681.02, Prof. Howard, Tulane University

Course organization

http://www.tulane.edu/~ling/NLP/

NLTK is installed on the computers in this room!

How would you like to use the Provost's $150?

Page 3:

SLP §2.2 Finite-state automata

2.2.6 Recognition as search

Page 4:

Non-deterministic recognition: Search

In a non-deterministic FSA, there is at least one path through the machine that leads to an accept state for every string in the language defined by the machine.

There is no path through the machine that leads to an accept state for a string not in the language.

But not every path through the machine for an accepted string leads to an accept state.

Page 5:

Non-deterministic recognition

So success in non-deterministic recognition occurs when a path is found through the machine that ends in an accept state.

Failure occurs when all of the possible paths for a given string lead to failure.

Page 6:

Back to the example

b a a a ! $

q0 q1 q2 q2 q3 q4
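
The state sequence above can be checked mechanically. The following is a minimal sketch, not the textbook's code: the transition table is an assumption reconstructed from the state sequence on this slide (a self-loop on q2 for the symbol a), and the names DELTA, ACCEPT and is_accepting_path are illustrative only.

```python
# Transition table for the non-deterministic machine.
# ASSUMPTION: reconstructed from the state sequence q0 q1 q2 q2 q3 q4.
# Each (state, symbol) pair maps to the SET of possible next states.
DELTA = {
    ("q0", "b"): {"q1"},
    ("q1", "a"): {"q2"},
    ("q2", "a"): {"q2", "q3"},  # the non-deterministic choice
    ("q3", "!"): {"q4"},
}
ACCEPT = {"q4"}


def is_accepting_path(tape, states):
    """Return True if `states` is a legal path over `tape` that ends in an accept state."""
    # A path over n symbols visits n + 1 states and must start in q0.
    if len(states) != len(tape) + 1 or states[0] != "q0":
        return False
    for i, symbol in enumerate(tape):
        # Every step must follow an arc that exists in the table.
        if states[i + 1] not in DELTA.get((states[i], symbol), set()):
            return False
    return states[-1] in ACCEPT


# The path from the slide succeeds.
print(is_accepting_path("baaa!", ["q0", "q1", "q2", "q2", "q3", "q4"]))  # True
# Leaving q2 one step too early gets stuck in q3 on an "a".
print(is_accepting_path("baaa!", ["q0", "q1", "q2", "q3", "q3", "q4"]))  # False
```

The second call shows why search is needed: the string is in the language, but that particular path through the machine does not reach the accept state.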

Page 7:

Example

[Figure: search trace over the tape b a a a !, with numbered steps 1 to 6. The machine states shown, in order, are q0, q1, q2, q2, q2, q3, q4; an X marks a failed path after the last q2.]

Page 8:

Summary

States in the search space are pairings of tape positions and states in the machine.

By keeping track of as yet unexplored states, a recognizer can systematically explore all the paths through the machine given an input.
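
A minimal sketch of such a recognizer is given below. It is not the textbook's ND-RECOGNIZE, only an illustration under stated assumptions: the transition table is the sheeptalk machine reconstructed from the earlier example, the machine has no epsilon transitions, and the names DELTA, START, ACCEPT and nd_recognize are made up for this sketch. Each item on the agenda is a search state, that is, a pair of a machine state and a tape position.

```python
# ASSUMPTION: sheeptalk machine reconstructed from the earlier example.
DELTA = {
    ("q0", "b"): {"q1"},
    ("q1", "a"): {"q2"},
    ("q2", "a"): {"q2", "q3"},  # the non-deterministic choice
    ("q3", "!"): {"q4"},
}
START = "q0"
ACCEPT = {"q4"}


def nd_recognize(tape):
    """Return True if some path through the machine accepts `tape`."""
    # The agenda holds the search states that are still unexplored.
    agenda = [(START, 0)]
    while agenda:
        state, pos = agenda.pop()  # take the most recently added search state
        if pos == len(tape):
            if state in ACCEPT:
                return True        # one successful path is enough
            continue               # end of tape in a non-accept state: dead end
        # Generate every search state reachable by reading the next symbol.
        for nxt in sorted(DELTA.get((state, tape[pos]), ())):
            agenda.append((nxt, pos + 1))
    # The agenda is empty: every possible path has failed.
    return False


print(nd_recognize("baaa!"))  # True
print(nd_recognize("ba!"))    # False
```

Because the agenda here is a Python list used with append and pop, the sketch explores depth-first; the choice of agenda discipline does not change which strings are accepted.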

Page 9:

Keeping track

But how do you keep track?

Depth-first / last in, first out (LIFO) / stack
Unexplored states are added to the front of the agenda, so the most recently added state is explored next.

Breadth-first / first in, first out (FIFO) / queue
Unexplored states are added to the back of the agenda, so the state that has waited longest (the one at the front) is explored next.
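
The two agenda disciplines can be sketched with Python's collections.deque. This is an illustration only: it assumes that the next state to explore is always taken from the front of the agenda, and that three states are added one after another before any is explored (a real search interleaves adding and exploring). The state names are borrowed from the diagrams on the following slides; the function name exploration_order is hypothetical.

```python
from collections import deque


def exploration_order(new_states, strategy):
    """Add states to an agenda one by one, then report the order they are explored in."""
    agenda = deque()
    for state in new_states:
        if strategy == "depth-first":
            agenda.appendleft(state)  # LIFO: new states go to the front
        else:
            agenda.append(state)      # FIFO: new states go to the back
    order = []
    while agenda:
        order.append(agenda.popleft())  # always explore from the front
    return order


added = ["q2", "q12", "q27"]  # oldest first
print(exploration_order(added, "depth-first"))    # ['q27', 'q12', 'q2']
print(exploration_order(added, "breadth-first"))  # ['q2', 'q12', 'q27']
```

With the stack discipline the newest state comes out first; with the queue discipline the oldest state comes out first.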

Page 10:

Depth-first/LIFO/stack

[Figure: diagram of the states q2, q18, q12, q41, q27, q50, q31; the stack is shown holding q2, q12, q27.]

Page 11:

Breadth-first/FIFO/queue

[Figure: diagram of the states q2, q18, q12, q41, q27, q50, q31; the queue is shown holding q2, q12, q27.]

Page 12:

SLP §2.2 Finite-state automata

2.2.7 Comparison

Page 13:

Equivalence

Non-deterministic machines can be converted to deterministic ones with a fairly simple construction.

That means that they have the same power: non-deterministic machines are not more powerful than deterministic ones in terms of the languages they can accept.
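
The usual conversion is the subset (powerset) construction, in which each state of the deterministic machine stands for a set of states of the non-deterministic one. The sketch below is one way to write it, under stated assumptions: the input is the sheeptalk machine reconstructed from the earlier example, there are no epsilon transitions, and all names (determinize, dfa_accepts, ALPHABET and so on) are illustrative.

```python
from collections import deque

# ASSUMPTION: sheeptalk machine reconstructed from the earlier example.
DELTA = {
    ("q0", "b"): {"q1"},
    ("q1", "a"): {"q2"},
    ("q2", "a"): {"q2", "q3"},
    ("q3", "!"): {"q4"},
}
START, ACCEPT, ALPHABET = "q0", {"q4"}, "ba!"


def determinize(delta, start, accept, alphabet):
    """Subset construction for an NFA without epsilon transitions."""
    start_set = frozenset([start])
    dfa_delta, dfa_accept = {}, set()
    seen, todo = {start_set}, deque([start_set])
    while todo:
        current = todo.popleft()
        if current & accept:             # contains an NFA accept state
            dfa_accept.add(current)
        for symbol in alphabet:
            # Union of everything reachable from any member of `current`.
            target = frozenset(
                nxt
                for state in current
                for nxt in delta.get((state, symbol), ())
            )
            if not target:
                continue                 # no arc: a missing arc means rejection
            dfa_delta[(current, symbol)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    return dfa_delta, start_set, dfa_accept


def dfa_accepts(tape, dfa_delta, start_set, dfa_accept):
    """Run the deterministic machine: exactly one path, no search needed."""
    current = start_set
    for symbol in tape:
        current = dfa_delta.get((current, symbol))
        if current is None:
            return False
    return current in dfa_accept


dfa = determinize(DELTA, START, ACCEPT, ALPHABET)
print(dfa_accepts("baaa!", *dfa))  # True
print(dfa_accepts("ba!", *dfa))    # False
```

For this machine the construction reaches five deterministic states: {q0}, {q1}, {q2}, {q2, q3} and {q4}. In the worst case the number of subsets can grow exponentially in the number of original states.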

Page 14:

Why bother?

Non-determinism doesn’t get us more formal power and it causes headaches, so why bother?

More natural (understandable) solutions.

Page 15:

SLP §3 Words and transducers

Intro

Page 16:

Concepts and terminology

orthography: study of spelling
morphology: study of word composition
parsing: to build a structured representation of a word or sentence
surface or input form: input to this process
productive: a process that applies without limitations

Can all forms be stored in advance?

Page 17:

Concepts and terminology

morpheme: the minimal meaning-bearing unit in a language
stem: the main unit
affix: additional units
prefix: a unit that precedes the main one
suffix: a unit that follows the main one
circumfix: a unit that surrounds the main one
infix: a unit that is inserted within the main one
agglutinative: a language in which the main unit can have many additional units

Page 18:

Concepts and terminology

inflection: Combining an affix with a stem does not change the part of speech of the stem.
derivation: Combining an affix with a stem DOES change the part of speech of the stem.
compounding: Combining multiple stems.
cliticization: Combining a stem with a phonologically reduced stem.

Page 19:

SLP §3 Words and transducers

§3.1 Survey of (mostly) English morphology

Page 20:

Inflectional morphology

stem     -s        -ing       preterite   past part.
walk     walks     walking    walked      walked
try      tries     trying     tried       tried
map      maps      mapping    mapped      mapped
eat      eats      eating     ate         eaten
catch    catches   catching   caught      caught
be       is        being      was         been
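
The table mixes regular forms that follow spelling rules (tries, mapping, catches) with irregular forms that have to be listed (ate, caught, was). A rough sketch of that split is given below. The spelling rules are assumptions that cover only what the table needs, not a full account of English orthography: the consonant-doubling test, for instance, is wrong for many longer stems such as visit, and silent-e deletion is left out. All names (IRREGULAR, inflect and so on) are illustrative.

```python
VOWELS = "aeiou"

# Irregular forms are simply stored; these entries come from the table.
IRREGULAR = {
    "eat": {"preterite": "ate", "past_part": "eaten"},
    "catch": {"preterite": "caught", "past_part": "caught"},
    "be": {"s": "is", "preterite": "was", "past_part": "been"},
}


def doubles_final_consonant(stem):
    """Rough test: stem ends consonant-vowel-consonant (map -> mapp-)."""
    return (
        len(stem) >= 3
        and stem[-1] not in VOWELS + "wxy"
        and stem[-2] in VOWELS
        and stem[-3] not in VOWELS
    )


def regular_forms(stem):
    """Build the four inflected forms by rule only."""
    y_after_consonant = (
        len(stem) > 1 and stem.endswith("y") and stem[-2] not in VOWELS
    )
    if stem.endswith(("s", "x", "z", "ch", "sh")):
        s_form = stem + "es"            # catch -> catches
    elif y_after_consonant:
        s_form = stem[:-1] + "ies"      # try -> tries
    else:
        s_form = stem + "s"
    if y_after_consonant:
        ed_form = stem[:-1] + "ied"     # try -> tried
    elif doubles_final_consonant(stem):
        ed_form = stem + stem[-1] + "ed"  # map -> mapped
    else:
        ed_form = stem + "ed"
    if doubles_final_consonant(stem):
        ing_form = stem + stem[-1] + "ing"  # map -> mapping
    else:
        ing_form = stem + "ing"
    return {"s": s_form, "ing": ing_form,
            "preterite": ed_form, "past_part": ed_form}


def inflect(stem):
    """Regular forms by rule, overridden by any stored irregular forms."""
    forms = regular_forms(stem)
    forms.update(IRREGULAR.get(stem, {}))
    return forms


for stem in ["walk", "try", "map", "eat", "catch", "be"]:
    print(stem, inflect(stem))
```

The design mirrors the question raised earlier about whether all forms can be stored in advance: productive patterns are generated by rule, and only the exceptions are listed.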

Page 21:

Next time

P4

SLP §3.2ff