a coalgebraic view on context-free languages and streamshzantema/strsemjw.pdf · introduction...

28
Introduction Regular expressions and languages Context-free languages Generalizing context-freeness Conclusions and further work A Coalgebraic View on Context-Free Languages and Streams Joost Winter Centrum Wiskunde & Informatica May 10, 2011 Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Upload: others

Post on 23-Aug-2020

7 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

A Coalgebraic View on Context-Free Languagesand Streams

Joost Winter

Centrum Wiskunde & Informatica

May 10, 2011

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 2: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

OverviewFormal power series, formal languages, and streamsF-CoalgebrasCoalgebras representing formal power seriesHomomorphisms and bisimulationsFinal coalgebras

Overview

I The coalgebraic picture of regular languages and expressions,and likewise that of rational streams and power series, iswell-known.

I It is interesting to see how this work can be extended to, infirst instance, context-free languages.

I In this presentation, a coalgebraic treatment of context-freelanguages through systems of behavioural differentialequations is given.

I This definition format can be generalized to arbitrary formalpower series (in noncommuting variables), including streams,yielding a notion of context-free power series and streams.

I Some examples of streams that are found to be ‘context-free’in this sense are given.

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 3: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

OverviewFormal power series, formal languages, and streamsF-CoalgebrasCoalgebras representing formal power seriesHomomorphisms and bisimulationsFinal coalgebras

Formal power series, formal languages, and streams

I Given a finite set A called the alphabet, and a semiring R, aformal power series on A with coefficients in R is a function

A∗ → R.

I When R is the Boolean semiring {0, 1} (with 1 + 1 = 1),formal power series on A with coefficients in R correspond toformal languages over the alphabet A.

I When A is a singleton set, formal power series on A withcoefficients on R correspond to streams over R.

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 4: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

OverviewFormal power series, formal languages, and streamsF-CoalgebrasCoalgebras representing formal power seriesHomomorphisms and bisimulationsFinal coalgebras

F-Coalgebras

Given a functor F , an F -coalgebra consists of a tuple (X , f ):

I X is a set, the carrier set.

I f is a function from X to FX .

Diagrammatically:

X

FX

f

?

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 5: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

OverviewFormal power series, formal languages, and streamsF-CoalgebrasCoalgebras representing formal power seriesHomomorphisms and bisimulationsFinal coalgebras

Coalgebras representing formal power series (1)

In the this talk, we will be concerned with coalgebras over functorsof the type R × (−)A. But what does this mean?

I R is some semiring.

I × is the cartesian product.

I A is a finite set called the alphabet.

I (−) is a placeholder for the carrier set.

I XA denotes the function space from A to X .

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 6: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

OverviewFormal power series, formal languages, and streamsF-CoalgebrasCoalgebras representing formal power seriesHomomorphisms and bisimulationsFinal coalgebras

Coalgebras representing formal power series (2)

So: a coalgebra (X , f ) over the functor R × (−)A consists of a setX and a function f that maps every x ∈ X to an elementf (x) ∈ R × XA.In this talk, we will use the following notation:

I Given x ∈ X , o(x) (called the output value of x) will be thefirst component of f (x).

I Given x ∈ X , xa (called the a-derivative of x) will be thesecond component of f (x), applied to a.

So: for every x , o(x) is an element of the semiring R, and forevery x and a, xa is an element of X again.When (in the case of streams) the alphabet is a singleton set {a},we usually write x ′ instead of xa.

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 7: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

OverviewFormal power series, formal languages, and streamsF-CoalgebrasCoalgebras representing formal power seriesHomomorphisms and bisimulationsFinal coalgebras

Coalgebras representing formal power series (3)

We can extend the notion of derivatives from alphabet symbols(i.e. elements of A) to words (i.e. elements of A∗) inductively:

I xλ = x

I xa·w = (xa)w .

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 8: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

OverviewFormal power series, formal languages, and streamsF-CoalgebrasCoalgebras representing formal power seriesHomomorphisms and bisimulationsFinal coalgebras

Homomorphisms and bisimulations (1)

Given two R × (−)A-coalgebras (X , f ) and (Y , g), a functionh : X → Y is a homomorphism if the following hold:

1. For every x ∈ X , o(x) = o(h(x)).

2. For every x ∈ X and a ∈ A, h(xa) = (h(x))a.

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 9: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

OverviewFormal power series, formal languages, and streamsF-CoalgebrasCoalgebras representing formal power seriesHomomorphisms and bisimulationsFinal coalgebras

Homomorphisms and bisimulations (2)

Given two R × (−)A-coalgebras (X , f ) and (Y , g), a relationR ⊆ X × Y is a bisimulation if the following hold:

1. If (x , y) ∈ R, then o(x) = o(y).

2. If (x , y) ∈ R, then for all a ∈ A, (xa, ya) ∈ R.

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 10: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

OverviewFormal power series, formal languages, and streamsF-CoalgebrasCoalgebras representing formal power seriesHomomorphisms and bisimulationsFinal coalgebras

Final coalgebras (1)

Consider the 2× (−)A-coalgebra (L, l) defined as follows:

I L is the set of all languages on the alphabet A.I For any L ∈ L:

I o(L) is 1 iff the empty word is in L.I Fa = {w | a · w ∈ L}.

This is a final coalgebra: for every 2× (−)A-coalgebra (X , f ),there is a unique homomorphism h from (X , f ) to (L, l).Given a 2× (−)A-coalgebra (X , f ), and an element x ∈ X , we letJxK denote the value of x under this unique homomorphism.

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 11: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

OverviewFormal power series, formal languages, and streamsF-CoalgebrasCoalgebras representing formal power seriesHomomorphisms and bisimulationsFinal coalgebras

Final coalgebras (2)

Generalizing the picture from the previous slide to arbitrarysemirings R, the final coalgebras will sets of formal power series:

I S is the set of all formal power series on A with coefficients inR, i.e. the function space from A∗ to R.

I For any F ∈ S:I o(F ) = F (λ).I Fa = G : G (x) = F (a · x).

Again, we let JxK denote the value of x under the finalhomomorphism.

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 12: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

OverviewFormal power series, formal languages, and streamsF-CoalgebrasCoalgebras representing formal power seriesHomomorphisms and bisimulationsFinal coalgebras

Languages and streams

The presentation given above is a generalization of both coalgebrasrepresenting languages and coalgebras representing streams:

I When we choose the Boolean semiring {0, 1}, we getcoalgbras representing languages over the alphabet A.

I When we choose a singleton alphabet A, we get coalgebrasrepresenting streams over the semiring R.

The next few slides will be about coalgebras for the functorD := 2× (−)A of formal languages.

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 13: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

The coalgebra of regular expressionsKleene’s theorem, coalgebraically

The coalgebra of regular expressions

The set E of regular expressions over a finite alphabet A and thesemiring (2,+, ·, 0, 1) can be defined as follows:

t ::= a ∈ A | r ∈ 2 | t + t | t · t | t∗

We can assign a D-coalgebra structure to this set of regularexpressions by specifying the output values and derivatives for eachexpression, giving us a D-coalgebra (E , e):

t o(t) tar ∈ 2 r 0b ∈ A 0 if b = a then 1 else 0u + v o(u) + o(v) ua + vau · v o(u) · o(v) ua · v + o(u) · vau∗ 1 ua · u∗

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 14: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

The coalgebra of regular expressionsKleene’s theorem, coalgebraically

Kleene’s theorem, coalgebraically

I For any language L ∈ L, we define the subcoalgebra generatedby L as

〈L〉 := {Lw |w ∈ A∗}

It is easy to see that this indeed generates a subcoalgebra:given any K ∈ 〈L〉, it is easy to see that for every a ∈ A, alsoKa ∈ 〈L〉. In other words, 〈L〉 is closed under takingderivatives to alphabet symbols.

I Kleene’s theorem, coalgebraically (Rutten, 1998): For anyL ∈ L, 〈L〉 is finite iff there is a regular expression t such thatL = JtK.

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 15: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

Introduction: context-free grammars and languagesSystems of equationsFrom CFGs to systems of equationsFrom systems of equations to CFGs

Introduction: context-free grammars and languages

I The ‘next step up’ from regular expressions and languages,and finite automata, in the Chomsky hierarchy, are thecontext-free languages and grammars, and pushdownautomata.

I We will present a format of coinductively defined systems ofequations: it turns out that these systems of equationschracterize precisely the context-free languages.

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 16: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

Introduction: context-free grammars and languagesSystems of equationsFrom CFGs to systems of equationsFrom systems of equations to CFGs

Systems of equations (1)

We will use terms t specified as follows:

t ::= a ∈ A | x ∈ X | r ∈ 2 | t + t | t · t

where X is a finite set of variables, and A, as before, is a finitealphabet. Given X , we let TX denote the set of terms over X .A well-formed system of equations, for a set of variables X ,consists of:

1. For every x ∈ X , exactly one equation of the form o(x) = v ,where v ∈ {0, 1}.

2. For every x ∈ X and a ∈ A, exactly one equation of the formxa = t, where t ∈ TX .

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 17: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

Introduction: context-free grammars and languagesSystems of equationsFrom CFGs to systems of equationsFrom systems of equations to CFGs

Systems of equations (2)

Alternatively, we can consider a well-formed system of equations asa mapping

f : X → 2× TXA

We can extend such a mapping f to the D-coalgebra (TX , f̄ )generated by (X , f ) as follows:

t o(t) tax ∈ X o(x) xa (as specified by f )r ∈ 2 r 0b ∈ A 0 if b = a then 1 else 0u + v o(u) + o(v) ua + vau · v o(u) · o(v) ua · v + o(u) · va

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 18: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

Introduction: context-free grammars and languagesSystems of equationsFrom CFGs to systems of equationsFrom systems of equations to CFGs

Systems of equations (3)

I This construction can be summarized diagrammatically:

X ⊂

i- TX

J K- L

2× TXA

f

?-

2× LA

l

?

I Proposition: A language L is context-free iff there is awell-formed system of equations (X , f ) and an x ∈ X , suchthat JxK = L w.r.t. the coalgebra (TX , f̄ ) generated by it.

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 19: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

Introduction: context-free grammars and languagesSystems of equationsFrom CFGs to systems of equationsFrom systems of equations to CFGs

From CFGs to systems of equations (1)

I We say a context-free grammar is in weak Greibach normalform, if every production rule has a right hand side eitherequal to the empty word λ, or of the form a · t.

I As the name implies, this is a weakening of the more familiarGreibach normal form. As a direct result, every CFG can berepresented in weak Greibach normal form.

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 20: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

Introduction: context-free grammars and languagesSystems of equationsFrom CFGs to systems of equationsFrom systems of equations to CFGs

From CFGs to systems of equations (2)

We transform a CFG G in weak Greibach normal form into asystem of equations as follows:

I We let the set X of variables be equal to the set ofnonterminals in the grammar.

I Given a x ∈ X , we set o(x) = 1 iff the grammar contains aproduction rule x → λ.

I Given a x ∈ X and an a ∈ A, we set

xa =∑{w | x → a · w}

Given an initial symbol x0 ∈ X , we now have o((x0)w ) = 1 (and,hence, w ∈ Jx0K) iff w is in the language generated by G .

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 21: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

Introduction: context-free grammars and languagesSystems of equationsFrom CFGs to systems of equationsFrom systems of equations to CFGs

From systems of equations to CFGs

Conversely, given a system of equations, we can construct a CFGin weak Greibach normal form:

I We first transform the system of equations to a new,equivalent, system, in which all derivatives are in disjunctivenormal form, and do not contain any superfluous 0s or 1s.

I Derivatives in this new system are disjunctions of sequences ofalphabet symbols and variables.

I We let the grammar include a rule x → λ whenever o(x) = 1.

I We let the grammar include a rule x → a · w , whenever w is asequence of alphabet symbols and variables occurring as adisjunct in xa.

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 22: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

Context-free streams

Generalizing context-freeness

I Note that most of the definitions on the earlier slides do notnecessarily require the underlying semiring to be the Booleansemiring!

I Taking the definition of regular expressions and applying it toarbitrary semirings, we obtain rational streams and powerseries, which are well-known.

I Furthermore, we can easily generalize the notion above, ofcontext-free languages, to context-free streams andcontext-free power series.

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 23: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

Context-free streams

Context-free streams (1)

When we take the semiring of natural numbers as underlyingsemiring, the Catalan numbers

1, 1, 2, 5, 14, 42, 132, 429, 1430, . . .

are context-free, and are generated by the following system ofequations:

o(x) = 1 x ′ = x · x

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 24: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

Context-free streams

Context-free streams (2)

When we take the Boolean field (with 1 + 1 = 0) as underlyingsemiring, the Prouhet-Thue-Morse sequence

10 01 0110 01101001 . . .

is context-free, and is generated by the following system ofequations:

o(x) = 1 x ′ = y

o(y) = 0 y ′ = z

o(z) = 0 z ′ = x + y + z + z · wo(w) = 1 w ′ = y · w

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 25: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

Context-free streams

Context-free streams (3)

Still using the Boolean field as underlying semiring, thecontext-free system of equations

o(x) = 1 x ′ = y

o(y) = 1 y ′ = z

o(z) = 0 z ′ = w

o(w) = 1 w ′ = v

o(v) = 1 v ′ = v + w + v · x + x · x

gives us the paperfolding sequence, or Dragon curve sequence. . .

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 26: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

Context-free streams

Context-free streams (4)

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 27: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

Conclusions and further work

I There is a very neat coalgebraic representation of regularexpressions, and Kleene’s theorem can be expressed succinctlyin a coalgebraic fashion.

I We have extended this work towards context-free languagesand grammars, and provided a coalgebraic characterizationusing systems of equations.

I The first steps towards a generalization to other functors ofthe type R × (−)A havs been made, and there are some neatexamples of context-free streams.

I Future work: further investigate this notion of ‘generalizedcontext-freeness’ of power series, and see how this relates toother, existing notions.

Joost Winter A Coalgebraic View on Context-Free Languages and Streams

Page 28: A Coalgebraic View on Context-Free Languages and Streamshzantema/strsemjw.pdf · Introduction Regular expressions and languages Context-free languages Generalizing context-freeness

IntroductionRegular expressions and languages

Context-free languagesGeneralizing context-freenessConclusions and further work

Bibliography

[Jacobs/Rutten, 1997] Bart Jacobs, Jan Rutten, A Tutorial on(Co)Algebras and (Co)Induction

[Rutten, 1998] Jan Rutten, Automata and Coinduction (AnExercise in Coalgebra)

[Rutten, 2005] Jan Rutten, A Coinductive Calculus of Streams

[Silva, 2010] Alexandra Silva, Kleene Coalgebra

[Winter/Bonsangue/Rutten, 2011] Joost Winter, MarcelloBonsangue, Jan Rutten, Context-free Languages,Coalgebraically

Joost Winter A Coalgebraic View on Context-Free Languages and Streams