Morphological learning as principled argument

Lars G. Johnsen, University of Bergen, Norway

Post on 18-Dec-2015

Maybe in order to understand mankind, we have to look at the word itself: "Mankind". Basically, it's made up of two separate words - "mank" and "ind". What do these words mean? It's a mystery, and that's why so is mankind.

Jack Handey

There are reasons for positing a word structure

There are at least three conditions on structuring a word w into x.y

x is a stem and y is a suffix

y selects x

x and y are relevant for the distribution of w

An argument for x being a stem carries over to an argument that y is a suffix

If x is a stem then x has meaning

stem(x) → meaning(x)

word(x) → meaning(x)

Being a stem is translated into being a word

Pr( stem(x)| w=x.y) ~ Pr(word(x)| w=x.y)

$$\frac{|\{\, z \mid W(z.y) \wedge W(z) \,\}|}{|\{\, z \mid W(z.y) \,\}|}$$
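The proportion can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `suffix_counts` and the toy lexicon are assumptions made for the example, and W is read as "is an attested word". For a suffix y, a stem z counts as positive when both z.y and z are words, and as negative when only z.y is.

```python
def suffix_counts(suffix, lexicon):
    """Count stems z with z+suffix attested, split by whether z is itself a word.

    Returns (positive, negative): positive stems are words on their own,
    negative stems are not.
    """
    positive = negative = 0
    for word in lexicon:
        # Only consider words that end in the suffix and leave a non-empty stem.
        if word.endswith(suffix) and len(word) > len(suffix):
            stem = word[:-len(suffix)]
            if stem in lexicon:
                positive += 1
            else:
                negative += 1
    return positive, negative


# Hypothetical toy lexicon, for illustration only (not data from the slides).
TOY_LEXICON = {
    "fond", "fondness", "hard", "hardness",
    "ready", "readiness", "easy", "easiness",
}

pos, neg = suffix_counts("ness", TOY_LEXICON)
print(pos, neg, pos / (pos + neg))
```

With this toy lexicon, "fond" and "hard" are words while "readi" and "easi" are not, so the proportion for "ness" is one half. A realistic word list would be needed to obtain counts like those reported in the tables.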

A beta distribution is used for assigning a probability based on the proportion

beta(positive, negative)
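As a hedged reading of beta(positive, negative): the Ratio, SD and Prob columns reported in the slides are consistent with Ratio being the mean of a Beta(Pos, Neg) distribution, SD its standard deviation, and Prob the mean minus one standard deviation. The slides do not state this, so the sketch below is an inference from the numbers; the function name `beta_score` is an assumption.

```python
import math


def beta_score(positive, negative):
    """Mean, standard deviation and (mean - sd) of Beta(positive, negative).

    Assumption: the slides' Ratio, SD and Prob columns correspond to these
    three values, expressed as rounded percentages.
    """
    a, b = positive, negative
    mean = a / (a + b)
    sd = math.sqrt(a * b / ((a + b) ** 2 * (a + b + 1)))
    return mean, sd, mean - sd


def as_percent(values):
    """Round each value to a whole percentage, as in the slide tables."""
    return tuple(round(100 * v) for v in values)


# Counts taken from the slides: "less" (368, 18) and "ites" (78, 102).
print(as_percent(beta_score(368, 18)))   # (95, 1, 94)
print(as_percent(beta_score(78, 102)))   # (43, 4, 40)
```

Subtracting the standard deviation penalises suffixes with little evidence: "house" and "ship" have the same rounded ratio, but the smaller counts for "house" give it a lower score.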

The top ten list

Morph   Ratio  SD  Prob   Pos  Neg
less       95   1    94   368   18
'          93   1    92  1723  133
's         92   0    92  9783  857
ship       91   2    89   167   16
like       91   2    89   140   14
house      91   3    88    75    7
'll        91   3    88   105   11
head       88   4    85    61    8
fish       88   4    84    66    9
stone      87   4    83    66   10

Analyzing easiness

easi     ness    78
eas      iness   46
easines  s       42
easin    ess      6
easine   ss       4

Analyzing termites

Suffix  Ratio  SD  Prob    Pos    Neg
s          42   0    42  18098  25001
ites       43   4    40     78    102
es         23   0    23   2094   6925
tes        19   1    17    211    927

The measure of meaning captures the first condition: x is a stem and y is a suffix

Selectional relation is treated as the predictive power of the stem and suffix

easiness

easi → easi.er, easi.ly
ness → readi.ness, fond.ness, hard.ness

eas → eas.ier, eas.ily, eas.ter, eas.ton
iness → read.iness

Combining the endings from the stem and the starts from the suffix results in a collection of possible words

The first hypothesis is easi.ness

easi → .er, .ly
ness → readi., fond., hard.

readi.er, readi.ly, fond.er, fond.ly, hard.er, hard.ly

5 positive, 1 negative: approx. 90%

The second hypothesis is eas.iness

eas → .ier, .ily, .ter, .ton
iness → read.

read.ier, read.ily, read.ter, read.ton

1 positive, 3 negative: 25%

easi.ness is best on both accounts and is the preferred analysis
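The two hypotheses above can be replayed with a short sketch. It is an illustration under stated assumptions, not the author's code: `selection_counts` is a hypothetical helper, and the toy lexicon is constructed so that the counts match the slides. In particular, the slides do not say which candidate was unattested; leaving out "readier" is an assumption that happens to be consistent with both hypotheses.

```python
def selection_counts(word, stem, lexicon):
    """Cross the stem's other endings with the suffix's other starts.

    Returns (positive, negative): how many of the combined candidates
    are attested in the lexicon and how many are not.
    """
    suffix = word[len(stem):]
    others = lexicon - {word}  # the analysed word must not predict itself
    endings = sorted(w[len(stem):] for w in others
                     if w.startswith(stem) and len(w) > len(stem))
    starts = sorted(w[:-len(suffix)] for w in others
                    if w.endswith(suffix) and len(w) > len(suffix))
    candidates = [s + e for s in starts for e in endings]
    positive = sum(c in lexicon for c in candidates)
    return positive, len(candidates) - positive


# Hypothetical toy lexicon built from the words shown on the slides.
# Assumption: "readier" is the unattested candidate.
TOY_LEXICON = {
    "easiness", "easier", "easily", "easter", "easton",
    "readiness", "fondness", "hardness",
    "fonder", "fondly", "harder", "hardly", "readily",
}

print(selection_counts("easiness", "easi", TOY_LEXICON))  # (5, 1)
print(selection_counts("easiness", "eas", TOY_LEXICON))   # (1, 3)
```

The raw proportions are 5/6 for easi.ness and 1/4 for eas.iness. The slides report approximately 90% for the first, so the author's scoring presumably involves more than a plain proportion; the ranking of the two hypotheses is the same either way.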
