
Formal Learning Theory: an Introduction

Roberto Bonato

roberto.bonato@labri.fr

LaBRI

Université de Bordeaux I

Università degli Studi di Verona


Contents

Basic notions of Learnability Theory

Gold's model of grammatical inference: identification in the limit

Some representative results of (un)learnability

Learning Categorial Grammars: the RG algorithm


The poverty of the stimulus paradox

How comes it that human beings, whose contacts with the world are brief and personal and limited, are nevertheless able to know as much as they do know?

Bertrand Russell, quoted by Chomsky (1975)

Its theoretical linguistics version

The learning paradox: how do children learn the syntax of their mother tongue (and rather quickly!) given that:

Natural language syntax is very complicated

Not so many examples are provided

Negative examples are of no use


Chomsky's solution

Universal Grammar, defined by Principles

(Binary) parameters that specify any given human language (example: SVO vs. SOV languages)

Categorial analogy: universal rules; only the types in the lexicon are language-specific

... MORE on this later


What is Learnability Theory?

Learnability refers to a set of mathematical models of how a human language can be acquired

Learnability is a constraint on Universal Grammar: the class of human languages must be learnable


Why should we care?

Mathematical precision is a good thing!

Learnability Theory can suggest different theories of Universal Grammar:

If one can show that some theory of UG can result in an unlearnable array of possible languages, that theory must be changed.

We can use learnability to constrain the observed set of languages, not just UG.


The acquisition framework

Innateness

Positive evidence

Learning = setting the parameters of a Universal Grammar


Innateness

A grammar is a finite specification of a language.

Innateness holds that the learner can only acquire certain kinds of grammars and not others.

Some types of language would therefore be impossible.

Positive evidence

In general, children do not learn from correction:

R. Brown and C. Hanlon, "Derivational Complexity and the Order of Acquisition of Child Speech", 1970

Effectively, the input to the learner only includes grammatical sentences:

Steven Pinker, The Language Instinct. Harper, 2000

The learning "algorithm"

The learner has a set of possible grammars to choose from.

The learner is presented with some finite set of sentences.

What grammar does the learner choose?

[Figure: the Chomsky hierarchy, RL ⊂ CFL ⊂ CSL ⊂ REL, with the question "Human Languages?"]


Let us play a game...

I think of a certain set of numbers, e.g. {x : x ≥ 10 and x is even}, and you have to guess it

I'll provide you with an infinite number of clues of the form "the number x belongs to the set", one at a time

After each clue, you make a guess

I will never tell you whether you're right or not

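To make the game concrete, here is a minimal Python sketch (an illustration, not from the slides) for one guessable class of sets, the multiples of a fixed m. The strategy "guess the gcd of all clues seen so far" eventually stabilizes on the right answer, even though the learner is never told it has won; the stream `clues` and the strategy are assumptions of this sketch.

```python
from math import gcd
from itertools import islice

def clues(m=6):
    """An infinite stream of clues enumerating {m, 2m, 3m, ...} in some order."""
    yield 4 * m                  # clues may arrive in any order
    yield 6 * m
    k = 1
    while True:                  # ... and with repetitions
        yield k * m
        k += 1

def play(stream, rounds=8):
    guess = None
    for x in islice(stream, rounds):
        # Guess the smallest class consistent with the clues: multiples of gcd.
        guess = x if guess is None else gcd(guess, x)
        print(f"clue {x:3d} -> guess: multiples of {guess}")

play(clues())                    # the guess reaches 6 by the third clue and never changes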

Some questions

What should count as winning this game?

What happens if I am allowed to select the set of all positive integers?


Who are the players?

SCIENTIFIC INDUCTION: Nature vs. Scientists

FIRST LANGUAGE ACQUISITION: Adults vs. Child


Learning in Gold's Framework

the learner is provided with an infinite stream of examples: s1, s2, …, si, …;

at each step i the learner makes a guess Gi compatible with the examples seen thus far;

the process is infinite:

s1, s2, s3, …, sn, …
G1, G2, G3, …, Gn, …

learning is successful when there is a certain point (even if we don't know which!) after which the guess made by the learner doesn't change and is correct

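A minimal Python sketch (not from the slides) of this protocol for one classically learnable class, the class of all finite languages: the learner's guess is always "the set of examples seen so far", and on any enumeration of a finite target the guess stabilizes and is correct after finitely many steps. The names `learner` and `presentation` are hypothetical.

```python
from itertools import islice

def learner(stream):
    """After each example s_i, emit the guess G_i = set of examples seen so far.

    For the class of all *finite* languages this identifies in the limit:
    once every element of the target has appeared, the guess never
    changes again and equals the target.
    """
    seen = set()
    for s in stream:
        seen.add(s)
        yield frozenset(seen)

def presentation():
    """An infinite enumeration (with repetitions) of the target {2, 3, 5}."""
    yield from [2, 3, 2, 5]
    while True:
        yield 3

guesses = list(islice(learner(presentation()), 8))
print(guesses[-1])        # frozenset({2, 3, 5}) -- stable from the 4th guess on
```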

Grammatical Inference

[Figure: the grammatical-inference setting. Ω is the hypothesis space of grammars and S the space of samples; a target grammar G ∈ Ω determines a language L(G) ⊆ S, sentences l1, l2, …, li are drawn from L(G), and the learning function φ maps each finite sample to a hypothesis G1, G2, …, Gi in Ω.]

More Formally...

Let ⟨Ω, S, L⟩ be a grammar system

let φ : finite sequences of sentences of S → Ω

let ⟨si⟩i∈ℕ = ⟨s0, s1, s2, …⟩ be an infinite sequence of sentences from S

let Gi = φ(⟨s0, …, si⟩)

φ converges to G on ⟨si⟩i∈ℕ if there exists an n ∈ ℕ such that for all i ≥ n, Gi = φ(⟨s0, …, si⟩) is defined and L(Gi) = L(G)


Towards Learnability

Convergence is about a function and a grammar

Learnability is about a (learning) function and classes of grammars

Learnability in the limit

Let ⟨Ω, S, L⟩ be given, and G ⊆ Ω. A learning function φ learns G if:

for every language L ∈ L(G)

for every infinite sequence ⟨si⟩i∈ℕ which enumerates the elements of L (i.e. {si | i ∈ ℕ} = L)

there exists some G ∈ G such that L(G) = L and φ converges to G on ⟨si⟩i∈ℕ


Initial Pessimism

Gold, 1967: A class G of grammars is not learnable if L(G) contains all finite languages and at least one infinite language.

just like regular languages!

and context-free grammars!

and many others...


More Generally: Limit Points

A class L of languages has a limit point if there exists an infinite sequence ⟨Ln⟩n∈ℕ of languages in L such that

L0 ⊂ L1 ⊂ … ⊂ Ln ⊂ …

AND there exists another language L in L such that L = ⋃n∈ℕ Ln

[Figure: the nested chain L0 ⊂ L1 ⊂ … ⊂ Ln ⊂ … inside its union L.]

A class whose languages contain a limit point is not learnable from positive data (see "Summing up" below).

A Renewed Interest

Gold, 1967 (pessimist!): neither regular nor context-free grammars are identifiable in the limit from positive examples.

Angluin, 1980: "pattern" languages are learnable

Shinohara, 1990: more non-trivial classes are learnable; k-rigid context-sensitive grammars are learnable!

Kanazawa, 1998: rigid and k-valued classical categorial grammars are learnable, both from structures and from strings (but that's NP-hard)


Pattern Languages

Σ = {a, b, c, …} is any finite alphabet

Var = {x1, x2, x3, …} is a set of variables

Σ ∩ Var = ∅

a pattern p over Σ is an element of (Σ ∪ Var)+

L(p) = {w | w is obtained from p by replacing variables with non-empty constant strings}

example: L(axbx) = {awbw | w ∈ Σ+}
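As a concrete illustration (not from the slides), here is a small Python sketch deciding membership w ∈ L(p) by backtracking over non-empty bindings for the variables; repeated variables must receive the same string, as in the pattern axbx. The encoding of patterns as tuples is an assumption of this sketch.

```python
def matches(pattern, w, env=None):
    """Decide whether w is in L(pattern).

    pattern is a tuple of symbols: single-letter strings are constants,
    longer names ('x1', 'x2', ...) are variables. Every occurrence of a
    variable must be replaced by the same non-empty string.
    """
    if env is None:
        env = {}
    if not pattern:
        return w == ""
    head, rest = pattern[0], pattern[1:]
    if len(head) == 1:                       # constant: match literally
        return w.startswith(head) and matches(rest, w[1:], env)
    if head in env:                          # bound variable: must repeat
        v = env[head]
        return w.startswith(v) and matches(rest, w[len(v):], env)
    for k in range(1, len(w) + 1):           # try binding each non-empty prefix
        env[head] = w[:k]
        if matches(rest, w[k:], env):
            return True
        del env[head]                        # backtrack
    return False

axbx = ("a", "x1", "b", "x1")
print(matches(axbx, "acbc"))     # True:  x1 = "c"
print(matches(axbx, "aabbab"))   # True:  x1 = "ab"
print(matches(axbx, "ab"))       # False: x1 must be non-empty
```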

Finite Elasticity

A class L of languages is said to have infinite elasticity if there exists an infinite sequence ⟨sn⟩n∈ℕ of sentences and an infinite sequence ⟨Ln⟩n∈ℕ of languages in L such that for all n ∈ ℕ:

sn ∉ Ln

{s0, …, sn} ⊆ Ln+1

A class L of languages is said to have finite elasticity if it doesn't have infinite elasticity
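A quick sanity check (not from the slides): taking sn = n and Ln = {0, …, n−1} witnesses that the class of all finite languages has infinite elasticity. Since that class is nonetheless learnable (see the learner sketch above), finite elasticity is a sufficient condition for learnability, not a necessary one.

```python
# Witnesses for the infinite elasticity of the class of all finite
# languages: s_n = n and L_n = {0, ..., n-1}.
def L(n):
    return set(range(n))

for n in range(100):
    assert n not in L(n)                        # s_n not in L_n
    assert set(range(n + 1)) <= L(n + 1)        # {s_0, ..., s_n} subset of L_{n+1}
print("both infinite-elasticity conditions hold for n = 0..99")
```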

A Theorem by Angluin (1979)

Any class G with finite elasticity is inferable from positive data

The class of pattern languages has finite elasticity, so...

The class of pattern languages is inferable from positive data


Summing up...

L(G) has a limit point ⇒ G is unlearnable ⇒ L(G) has infinite elasticity

L(G) has finite elasticity ⇒ G is learnable

Classical Categorial Grammars

CCGs = typed words + composition rules

G:

loves ↦ (np\s)/np, np\(s/np)
John ↦ np
Mary ↦ np
runs ↦ np\s

Composition rules:

A, A\B ⇒ B [\E]
B/A, A ⇒ B [/E]

[Derivations: "John runs" — runs: np\s combines with John: np by \E to yield s. "John likes Mary" — likes: (np\s)/np combines with Mary: np by /E to yield np\s, which combines with John: np by \E to yield s.]
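A minimal Python sketch (not from the slides) of the two elimination rules plus a brute-force reduction check over adjacent types. Types are atoms or triples, with A\B encoded as (A, '\\', B) and B/A as (B, '/', A); the lexicon follows the slide's derivations, and the helper names are assumptions.

```python
def backward(a, f):
    """\\E: from A and A\\B (argument on the left), derive B."""
    if isinstance(f, tuple) and f[1] == "\\" and f[0] == a:
        return f[2]
    return None

def forward(f, a):
    """/E: from B/A and A (argument on the right), derive B."""
    if isinstance(f, tuple) and f[1] == "/" and f[2] == a:
        return f[0]
    return None

def derives_s(types):
    """Reduce a sequence of types to s by trying /E and \\E on every
    adjacent pair (exhaustive search; fine for toy inputs)."""
    if types == ["s"]:
        return True
    for i in range(len(types) - 1):
        l, r = types[i], types[i + 1]
        for res in (forward(l, r), backward(l, r)):
            if res is not None and derives_s(types[:i] + [res] + types[i + 2:]):
                return True
    return False

lex = {
    "John": "np",
    "Mary": "np",
    "runs": ("np", "\\", "s"),
    "likes": (("np", "\\", "s"), "/", "np"),
}
print(derives_s([lex[w] for w in "John runs".split()]))        # True
print(derives_s([lex[w] for w in "John likes Mary".split()]))  # True
print(derives_s([lex[w] for w in "runs John".split()]))        # False
```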

The RG Algorithm (Buszkowski 1989)

Input: finite sets of functor-argument structures

Output: a rigid categorial grammar that generates them

[Figure: D = two functor-argument structures — "(a man) swims", with (a man) formed by /E and the whole combined by \E, and "(a fish) (swims fast)", with (a fish) formed by /E, (swims fast) formed by \E, and the whole combined by \E.]


RG runs

Assign a type to each node of the structures:

Assign s to each distinct root node

Assign distinct variables to argument nodes

Compute types for the functor nodes

[Worked example on D: both root nodes get s; the argument nodes get fresh variables — (a man): x1, man: x2, (a fish): x3, fish: x4, swims (in "swims fast"): x5 — and the functor types are then computed: a: x1/x2 and x3/x4, swims: x1\s, fast: x5\(x3\s).]

RG: Collecting Types

GF(D):

a ↦ x1/x2, x3/x4
fast ↦ x5\(x3\s)
fish ↦ x4
man ↦ x2
swims ↦ x1\s, x5

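A Python sketch of the typing pass (an illustration, not Buszkowski's original formulation): structures are nested triples marked /E (functor left) or \E (functor right); the root gets s, each argument node a fresh variable, and each functor node the computed type. On the two structures D from the slides it reproduces GF(D) exactly.

```python
from itertools import count

fresh = count(1)

def assign(node, t, lexicon):
    """Propagate type t down a functor-argument structure.

    node is a word (leaf) or a triple (op, left, right) with op '/E'
    (functor on the left) or '\\E' (functor on the right).
    """
    if isinstance(node, str):                    # leaf: record word -> type
        lexicon.setdefault(node, []).append(t)
        return
    op, left, right = node
    x = f"x{next(fresh)}"                        # fresh variable for the argument
    if op == "/E":
        assign(right, x, lexicon)                # argument node
        assign(left, (t, "/", x), lexicon)       # functor gets t/x
    else:
        assign(left, x, lexicon)                 # argument node
        assign(right, (x, "\\", t), lexicon)     # functor gets x\t

def GF(structures):
    """Type every root as s and collect the general-form lexicon."""
    lexicon = {}
    for T in structures:
        assign(T, "s", lexicon)
    return lexicon

# "(a man) swims" and "(a fish) (swims fast)" from the slides.
D = [("\\E", ("/E", "a", "man"), "swims"),
     ("\\E", ("/E", "a", "fish"), ("\\E", "swims", "fast"))]
print(GF(D))
# {'man': ['x2'], 'a': [('x1', '/', 'x2'), ('x3', '/', 'x4')],
#  'swims': [('x1', '\\', 's'), 'x5'], 'fish': ['x4'],
#  'fast': [('x5', '\\', ('x3', '\\', 's'))]}
```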

RG: Unifying Types

a ↦ x1/x2, x3/x4 ⇒ {x3 ↦ x1, x4 ↦ x2}

swims ↦ x1\s, x5 ⇒ {x5 ↦ x1\s}

σ = {x3 ↦ x1, x4 ↦ x2, x5 ↦ x1\s}

RG(D) = σ[GF(D)]:

a ↦ x1/x2
fast ↦ (x1\s)\(x1\s)
fish ↦ x2
man ↦ x2
swims ↦ x1\s
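And a matching sketch of the unification step (a simplification: the occurs check is omitted): all types collected for a word are unified, the resulting substitution σ is applied, and the rigid lexicon RG(D) = σ[GF(D)] falls out. Run on the slide's GF(D), inlined below for self-containment, it yields exactly the slide's result.

```python
def is_var(t):
    return isinstance(t, str) and t.startswith("x")

def walk(t, s):
    while is_var(t) and t in s:
        t = s[t]
    return t

def unify(t1, t2, s):
    """Extend substitution s so that t1 and t2 become equal (no occurs check)."""
    t1, t2 = walk(t1, s), walk(t2, s)
    if t1 == t2:
        return s
    if is_var(t2):
        s[t2] = t1
        return s
    if is_var(t1):
        s[t1] = t2
        return s
    if isinstance(t1, tuple) and isinstance(t2, tuple) and t1[1] == t2[1]:
        s = unify(t1[0], t2[0], s)
        return None if s is None else unify(t1[2], t2[2], s)
    return None                      # clash: no rigid grammar exists

def subst(t, s):
    t = walk(t, s)
    return (subst(t[0], s), t[1], subst(t[2], s)) if isinstance(t, tuple) else t

def RG(lexicon):
    """Unify every word's types; return the rigid lexicon, or None on failure."""
    s = {}
    for types in lexicon.values():
        for t in types[1:]:
            s = unify(types[0], t, s)
            if s is None:
                return None
    return {w: subst(ts[0], s) for w, ts in lexicon.items()}

GF_D = {"a": [("x1", "/", "x2"), ("x3", "/", "x4")],
        "fast": [("x5", "\\", ("x3", "\\", "s"))],
        "fish": ["x4"], "man": ["x2"],
        "swims": [("x1", "\\", "s"), "x5"]}
print(RG(GF_D))
# {'a': ('x1', '/', 'x2'), 'fast': (('x1', '\\', 's'), '\\', ('x1', '\\', 's')),
#  'fish': 'x2', 'man': 'x2', 'swims': ('x1', '\\', 's')}
```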

Properties of RG

φRG(⟨T0, …, Tn⟩) ⊒ RG({T0, …, Tn})

φRG learns G_rigid from structures

φRG is incremental

φRG can be implemented to run in linear time


Isn't there a contradiction?

Regular languages are not learnable from positive data

Context-free languages are not learnable from positive data

Rigid classical categorial languages are learnable, but they are "transversal" to Chomsky's hierarchy

[Figure: Venn diagram — RCCL cuts across the boundary between RL and CFL in the Chomsky hierarchy.]


Other results

Extensions to k-valued classical categorial grammars converge (Kanazawa 1998)

k-valued categorial grammars are even learnable from strings (Kanazawa 1998), but that's NP-hard (Costa-Florencio 2002)

Rigid Lambek grammars are learnable from structures (Bonato 2000)

Rigid Lambek grammars are not learnable from strings (Le Nir and Foret 2002)

Some classes of regular tree languages are learnable (Marion and Besombes 2001)


Open Issues

Learning WHAT? (CFG, Categorial, Minimalist, …)

Learning FROM WHAT? (sentences, structures, skeletons, …)

Learning HOW? (Gold, PAC, …)

