automata and formal languages - tum...1.3. nondeterministic automata we write p u qif there exists a...

68
Automata and Formal Languages Lecture notes WS 2013/2014 Manfred Kufleitner [email protected] January 28, 2014

Upload: others

Post on 30-Dec-2019

5 views

Category:

Documents


0 download

TRANSCRIPT

Automata and Formal LanguagesLecture notes WS 2013/2014

Manfred Kufleitner

[email protected]

January 28, 2014

Contents

I. Finite Words 5

1. Rational Sets 61.1. Rational expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61.2. Closure properties of rational sets . . . . . . . . . . . . . . . . . . . . . . . . . 71.3. Nondeterministic automata . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81.4. Conversion of rational expressions into automata . . . . . . . . . . . . . . . . 101.5. Conversion of automata into rational expressions . . . . . . . . . . . . . . . . 121.6. Removal of epsilon-transitions . . . . . . . . . . . . . . . . . . . . . . . . . . . 141.7. Equivalence of rational expressions and nondeterministic automata . . . . . . 171.8. Semilinear subsets of commutative monoids . . . . . . . . . . . . . . . . . . . 17

2. Recognizable Sets 192.1. Closure properties of recognizable sets . . . . . . . . . . . . . . . . . . . . . . 202.2. Syntactic monoids . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212.3. Deterministic automata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222.4. Minimal automata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232.5. Transition monoids and Cayley automata . . . . . . . . . . . . . . . . . . . . 242.6. The Myhill-Nerode Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . 262.7. Learning Recognizable Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3. Regular Languages 333.1. McKnight’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333.2. The powerset construction and Kleene’s Theorem . . . . . . . . . . . . . . . . 333.3. Nondeterministic automata and Boolean matrices . . . . . . . . . . . . . . . . 353.4. The relation between rational and recognizable sets . . . . . . . . . . . . . . . 37

4. Algorithmic Properties of Automata 424.1. Boolean operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 434.2. Homomorphisms and inverse homomorphisms . . . . . . . . . . . . . . . . . . 454.3. Residuals and quotients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 464.4. Decision problems for automata . . . . . . . . . . . . . . . . . . . . . . . . . . 494.5. Minimization algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

3

Part I.

Finite Words

5

1. Rational Sets

Let M be a monoid. The rational subsets of M are denoted by RAT(M). Their inductivedefinition is as follows.

• Every finite subset of M is in RAT(M).

• If K,L ∈ RAT(M), then K ∪ L ∈ RAT(M) and KL ∈ RAT(M).

• If K,L ∈ RAT(M), then KL ∈ RAT(M).

• If L ∈ RAT(M), then L∗ ∈ RAT(M).

This means that RAT(M) is the closure of the finite subsets of M under union, product, andKleene star. Note that ∅∗ = {1}.

Example 1.1. Consider the monoid M = N×N×N with componentwise addition. The subsetL = { (`,m, n) | 2(`+m) = n, m ≥ 2 } ⊆M is rational since

L = {(0, 2, 4)} {(0, 1, 2), (1, 0, 2)}∗ .

We note that if the operation inM is addition, then the productKL forK,L ⊆M is sometimesdenoted by K+L. In the above case, L could also be written as {(0, 2, 4)}+{(0, 1, 2), (1, 0, 2)}∗.

3

Proposition 1.1. Let M be a monoid. Then every rational subset of M is contained in afinitely generated submonoid of M .

Proof. The proof is by induction on the number of operations used for defining L ∈ RAT(M).If L = {u1, . . . , un}, then L is contained in {u1, . . . , un}∗. Let now K ⊆ {a1, . . . , am}∗ andL ⊆ {b1, . . . , bn}∗. Then both K ∪ L and KL are contained in {a1, . . . , am, b1, . . . , bn}∗,and L∗ is contained in {b1, . . . , bn}∗.

Example 1.2. If Σ is an infinite set, then the free monoid M = Σ∗ is not finitely generated. ByProposition 1.1 we see that in this case we have M 6∈ RAT(M). More generally, a monoid Mis finitely generated if and only if M ∈ RAT(M). 3

1.1. Rational expressions

Let Σ be an arbitrary set. The rational expressions over Σ are purely syntactic object. Theirdefinition is as follows:

• The empty set ∅ and the symbols a ∈ Σ are rational expressions over Σ.

• If s and t are rational expressions over Σ, then so is (s | t).• If s and t are rational expressions over Σ, then so is (st).

• If s is a rational expressions over Σ, then so is s∗.

The size of a rational expression is its length as a word over the alphabet Σ together withthe symbols ‘∅’, ‘ | ’, backets ‘(’ and ‘)’, and the star operator ‘∗’. The operator ‘ | ’ is calledchoice. Whenever possible, the bracketing is omitted and the usual order of operations isapplied (star before concatenation, before choice).

Remark 1.1. The choice operator is often written as +. We use | in order to avoid confusionwith the operation in commutative monoids. 3

6

1.2. Closure properties of rational sets

If Σ is a subset of a monoid M , then every rational expressions s defines a subset L(s) ⊆M .For rational expressions s and t we set

L(∅) = ∅L(a) = {a}

L(s | t) = L(s) ∪ L(t)

L(st) = L(s)L(t)

L(s∗) = L(s)∗,

where L(s) and L(t) are defined inductively. By definition, rational expressions only definerational sets. If Σ is a generating set, then the converse also holds. In particular, rationalexpressions over Σ can be used as finite presentations of all rational sets.

Proposition 1.2. Let M be a monoid and let Σ be a generating set. For every subset L ⊆Mthe following conditions are equivalent:

(a) L ∈ RAT(M).

(b) There exists a rational expression s over Σ such that L(s) = L.

Proof. (a)⇒ (b): Since union (product, Kleene star, respectively) is covered by the operationchoice (concatenation, star, respectively), it suffices to show that there exists a rationalexpression for every finite set. We have ∅ = L(∅). Thus by closure under union it suffices toshow that every singleton set {u} can be defined by a rational expression. Let u = a1 · · · anfor n ≥ 0 and ai ∈ Σ. We proceed by induction over n. If n = 0, then u = 1 and we have{1} = ∅∗ = L(∅∗). Let now n > 0. By induction there exists a rational expression s withL(s) = {a1 · · · an−1}. Then san is the desired rational expression for {u}.

(b)⇒ (a): Both ∅ and {a} are finite sets, and RAT(M) is closed under union, product,and Kleene star. Thus every rational expression over Σ defines a rational subset of M .

The set {1} containing only the neutral element is defined by the regular expression ∅∗.Frequently, one allows an additional symbol 1 in rational expressions as a shortcut for ∅∗, i.e.,the semantics of the rational expression 1 is L(1) = {1}. The idea is that over free monoids,ε is the neutral element in which case we have L(1) = {ε}.

1.2. Closure properties of rational sets

By definition, rational sets are closed under union, product, and Kleene star. The followingproposition shows that the homomorphic image of a rational set is again rational.

Proposition 1.3. Let M and N be monoids, and let ψ : M → N be a homomorphism.If L ∈ RAT(M), then ψ(L) ∈ RAT(N). Moreover, if ψ is surjective, then for everyK ∈ RAT(N) there exists L ∈ RAT(M) with ψ(L) = K.

Proof. By Proposition 1.2 it suffices to show that for every rational expression s over Mthere exists a rational expression s over N with L(s) = ψ(L(s)). We set ∅ = ∅ and a = ψ(a)for a ∈M . For choice, concatenation and star we define

s | t = s | tst = st

s∗ = s∗.

7

1. Rational Sets

The atomic expressions s satisfy L(s) = ψ(L(s)). For the other operations, induction on thesize yields

L( s | t) = L(s | t) = L(s) ∪ L(t) = ψ(L(s)

)∪ ψ(L(t)

)= ψ

(L(s) ∪ L(t)

)= ψ

(L(s | t)

)L( st) = L(st) = L(s)L(t) = ψ

(L(s)

)ψ(L(t)

)= ψ

(L(s)L(t)

)= ψ

(L(st)

)L( s∗ ) = L(s∗) = L(s)∗ = ψ

(L(s)

)∗= ψ

(L(s)∗

)= ψ

(L(s∗)

).

This shows that a rational expression for ψ(L(s)) can be obtained by replacing the atomicterms by their images under ψ and not changing the rest of the expression.

For the second part of the statement, when given a rational expression s over N , then wereplace every atom b ∈ N by some arbitrary preimage b ∈ ψ−1(b). This yields an expressions with ψ(L(s)) = L(s).

Note that if the homomorphism ψ : M → N is surjective and K ∈ RAT(N), thenProposition 1.3 does not yield an expression for ψ−1(K); it just constructs some rationalpreimage of K. The following example shows that ψ−1(K) might be not rational.

Example 1.3. If M is not finitely generated, then by Proposition 1.1 we have M 6∈ RAT(M).When considering the trivial homomorphism ψ : M → {1} with ψ(u) = 1 for all u ∈ M ,then M = ψ−1({1}) is the inverse homomorphic image of the rational set {1}. In particular,rational sets are not closed under inverse homomorphisms. 3

Example 1.4. Let Σ be an infinite set and let M = Σ∗ ∪ {0} be the free monoid over Σtogether with a zero element 0, i.e., we have u0 = 0u = 0 for all u ∈ M . Sets of theform u−1L = { v ∈M | uv ∈ L } and Lu−1 = { v ∈M | vu ∈ L } for some u ∈ M are calledresiduals of L ⊆ M . We have {0} ∈ RAT(M) whereas the residual 0−1 {0} = M is notrational. This shows that rational sets are not closed under residuals. 3

We will see in Section 3.4 that rational sets are neither closed under intersection and norunder complement.

1.3. Nondeterministic automata

Automata are a simple machine model for defining sets. They come in two flavors, deterministicand nondeterministic. In general, finite nondeterministic automata can define more sets thanfinite deterministic automata. It will turn out that finite nondeterministic automata canaccept exactly the rational sets. A nondeterministic M -automaton A = (Q, δ, I, F ) consistsof

• a set of states Q,

• a transition relation δ ⊆ Q×M ×Q,

• a set of initial states I,

• and a set of final states F .

The elements of δ are called transitions. If (p, u, q) is a transition, then u is called its label.The notion of a run is the main tool for defining the set accepted by A. A run of theautomaton A is a sequence

r = q1u1q2 · · ·unqn+1

with (qi, ui, qi) ∈ δ for all 1 ≤ i ≤ n. We say that r is a run from q1 to qn+1 on the elementu = u1 · · ·un. If n = 0, then r is a run on the neutral element. One way of how to thinkabout runs on u ∈M is that the automaton A decorates a factorization of u with states suchthat locally the transition relation holds. The run r is accepting if q1 ∈ I and qn+1 ∈ F . Thesubset of M accepted by A is

L(A) = {u ∈M | there exists an accepting run of A on u } .

8

1.3. Nondeterministic automata

We write p u q if there exists a run from state p to state q on the element u ∈M . Using thisnotation, we have L(A) =

{u ∈M

∣∣ ∃i ∈ I ∃f ∈ F : i u f}

. If p u q and q v q′, thenp uv q′.

Automata can be visualized as edge-labeled graphs. The states of the automaton are thevertices and transition define the edges. Moreover, initial states are marked by an ingoingarrow (from nowhere), and final states have a double circle. Transitions of the form (q, u, q)are called loops.

initial state final state

p qu

transition (p, u, q)

q u

loop (q, u, q)

Example 1.5. Let Σ = {a, b}. Consider the nondeterministic Σ∗-automaton A = (Q, δ, I, F )with states Q = { q1, q2, q3, q4 }, initial and final states I = F = {q1, q3} and transitionrelation

δ = { (q1, a, q2), (q2, a, q2), (q2, a, q3), (q3, b, q4), (q4, b, q4), (q4, b, q1) } .

The automaton A can be visualized as follows:

q1

q2

q3

q4

a

a

a

b

b

b

We will see that it is not suprising that L(A) = L((aaa∗ | bbb∗)∗) can be written as a rationalexpression. We have q2

aaa q2 and q2aaa q3. An accepting run of A on aaabbbaa is

q1 a q2 a q2 a q3 b q4 b q4 b q1 a q2 a q3 .

Runs can be obtained from the visualization of an automaton by chasing paths. 3

Two nondeterministic M -automata are equivalent if they accept the same subset. Anondeterministic M -automaton is deterministic if |I| = 1 and if for every state p ∈ Q andevery element u ∈ M there exists a unique state q ∈ Q with p u q. This means for everystate p ∈ Q and every element u ∈ M , all runs on u starting at p end in a unique state q.This does not mean that there is only one run on u from p to q.

Example 1.6. We consider the integers Z with addition. The following nondeterministicZ-automaton is deterministic.

9

1. Rational Sets

q0

q1

q2

q3

q4

2

2

2

2

2

−1

−1

−1

−1−1

It accepts the subset 5Z = { 5k | k ∈ Z }. The following are some of the accepting runs of0 ∈ Z:

q0

q0 2 q1 (−1) q3 (−1) q0

q0 (−1) q2 2 q3 (−1) q0

q0 2 q1 (−1) q3 2 q4 (−1) q1 (−1) q3 (−1) q0

In particular, the accepting run of 0 is not unique. In fact, there are infinitely many of themsince every presentation of 0 as a sum of 2’s and (−1)’s yields another run. 3

1.4. Conversion of rational expressions into automata

There are many ways for translating rational expressions into automata. The one we usehere is due to Kenneth Lane Thompson who is well-known for his contributions to the unixoperating system. This method is therefore called the Thompson construction.

Theorem 1.4. Let M be a monoid. For every rational expression s over M there exists anondeterministic M -automaton As with O(|s|) transitions such that L(As) = L(s).

Proof. Let 1 ∈M be the neutral element. For every rational expression s we construct anautomaton As with the following properties

• L(As) = L(s).

• As has a unique initial state is, and is has no incoming transitions. Incoming transitionsof a state q are transitions of the form (p, u, q).

• As has a unique final state fs, and fs has no outgoing transitions. Outgoing transitionsof a state p are transitions of the form (p, u, q).

The finite subset L = {u1, . . . , un } of M is accepted by the following two-state automaton:

i f

u1

un

...

This covers the cases s = ∅, s = 1, and s = a. For the remaining operations we assume thatwe have already constructed automata At and At′ . The automaton As for choice s = t | t′ is:

10

1.4. Conversion of rational expressions into automata

At

At′

it, it′ ft, ft′

This means, we take the disjoint union of At and At′ , except that we identity it with it′

(which yields the initial state is of As) and ft with ft′ (which yields the final state fs). Theautomaton As for concatenation s = tt′ is:

At

At′

it ft, it′ ft′

This means we take the union of At and At′ in which we identify the states ft and it′ , otherthan that the union is disjoint. The initial state of As is it and the final state is ft′ .

For s = t∗ the construction is as follows:

Atit ftis fs

1 1

1

1

In all cases we have L(As) = L(s).

Remark 1.2. It is tempting to optimize the construction for the star operator by identifyingis with it and fs with ft. This destroys the invariant that is has no incoming transitions andfs has no outgoing transitions. But without this presumption, none of the constructions forchoice, concatenation, and star is correct. For instance, consider monoid M = {a, b}∗ andthe rational expression s = (a∗b)∗. The resulting automaton would be

a b

1

1

1

1

We have L(s) = {1} ∪ {a, b}∗ b but the above automaton also accepts the word a 6∈ L(s). 3

We note that the Thompson construction does not depend on M . This means that forevery rational expression s over Σ it constructs a nondeterministic automaton As with labelsin Σ ∪ {1} such that over every monoid M with Σ ⊆M , the expression s and the automatonAs define the same subset.

11

1. Rational Sets

Example 1.7. Consider the regular expression s = b(c | a∗)∗ over Σ = {a, b, c}. Then theThompson construction yields the following automaton As:

b 1 1 a 1 1

1

1

c

1

1

We will use this automaton as a running example. 3

1.5. Conversion of automata into rational expressions

For translating nondeterministic automata into rational expressions we use a slightly moregeneral automaton model as an intermediate step. An M-automaton with rational labelsA = (Q, δ, I, F ) satisfies δ ⊆ Q× RAT(M)×Q. By using singleton sets as labels we obtainthe usual nondeterministic M -automata. In order to have finite presentations, the rationallabels are frequently given as rational expressions. The automaton A is finite if δ is finite.Note that if δ is finite, then, for a transition (p, L, q) ∈ δ, the set L can still be infinite. Asequence q1u1q2 · · ·unqn+1 is a run of A if for every i ∈ {1, . . . , n} there exists a transition(qi, Li, qi+1) ∈ δ with ui ∈ Li. The following procedure for translating automata into rationalexpressions is called state elimination.

Theorem 1.5. Let M be a monoid. For every finite M -automaton A with rational labels wehave L(A) ∈ RAT(M).

Proof. Let A = (Q, δ, I, F ) with Q = {q2, . . . , qn}. Since rational sets are closed under union,we can assume that for any pair of states qi, qj there exists at most one transition (qi, Lij , qj) ∈δ. If there is no transition (qi, Lij , qj), then we set Lij = ∅. We introduce a new unique initialstate q0 and a new unique final state q1 by setting Qn = { q0, . . . , qn } = Q ∪ {q0, q1} and

δn = δ ∪ { (q0, {1} , i) | i ∈ I } ∪ { (f, {1} , q1) | f ∈ F } .

The automaton Bn = (Qn, δn, {q0} , {q1}) is equivalent to A. Note that q0 has no incom-ing transitions and that q1 has no outgoing transitions. If n > 1, then we construct anequivalent automaton Bn−1 = (Qn−1, δn−1, {q0} , {q1}) with n states as follows. We setQn−1 = {q0, . . . , qn−1} and

δn−1 = { (qi, Lij ∪ LinL∗nnLnj , qj) | 0 ≤ i, j < n }

This means that we replace every transition (qi, Lij , qj) ∈ δn by (qi, Lij ∪LinL∗nnLnj , qj). Thiscan be visualized as follows:

qi qn qjLin Lnj

Lnn

Lij

In the automaton Bn

qi qjLij ∪ LinL∗nnLnj

In the automaton Bn−1

12

1.5. Conversion of automata into rational expressions

All labels in the automaton Bn−1 are rational, and Bn−1 is equivalent to Bn. Note that allincoming transitions of q0 and all outgoing transitions of q1 have label ∅. By repeated stateelimination we obtain the equivalent automaton B1. If (q0, L01, q1) ∈ δ1, then L(B1) = L01 ∈RAT(M); otherwise we have L(B1) = ∅ ∈ RAT(M).

Remember ∅∗ = {1}. Note that when giving all labels Lij in the state eliminationalgorithm as rational expressions sij , then the above construction does not depend on M .If A is a nondeterministic M -automaton with n states and labels in Σ, then the stateelimination algorithm constructs a rational expression of size |Σ| 2O(n). At the beginning,each language Lij has size at most |Σ|n. Then while eliminating one state, the size of thelongest expression at most quadruples (plus some constant for choice, star, and brackets).This yields |Σ|n2O(n) = |Σ| 2O(n) as an upper bound on the size.

Example 1.8. Consider the following automaton A from Example 1.7.

q0 q6 q2 q5 q4 q3 q1b 1 1 a 1 1

1

1 | c1

1

Since q0 has no incoming transitions and q1 has no outgoing transitions, we do not have toadd additions states. We always apply obvious simplifications involving concatenation with 1.After elimination of q6 we obtain:

q0 q2 q5 q4 q3 q1b 1 a 1 1

1

1 | c1

b

Eliminating q5 yields:

q0 q2 q4 q3 q1b a

a

1 1

1 | c

1

b

After elimination of q4 we have:

13

1. Rational Sets

q0 q2 q3 q1b 1 | c | aa∗ 1

1

b

By elimination of q3 we obtain:

q0 q2 q1b 1 | c | aa∗

1 | c | aa∗

b

Finally eliminating q2 yields:

q0 q1b | b(1 | c | aa∗)∗(1 | c | aa∗)

Therefore, a rational expression for L(A) is b | b(1 | c | aa∗)∗(1 | c | aa∗). Remember that weconstructed A from the expression b(c | a∗)∗ using the Thompson construction. 3

1.6. Removal of epsilon-transitions

Let Σ be a generating set of the monoid M and let A = (Q, δ, I, F ) be a nondeterministicM -automaton. We say that A is a letter-by-letter automaton for Σ if all labels are in Σ, i.e.,the transition relation satisfies δ ⊆ Q×Σ×Q. Every transition with label u = a1 · · · an withn ≥ 1 and ai ∈ Σ can easily be split into n transitions. Avoiding transitions with label u = 1is slightly more involved. Since ε is the neutral element of the free monoid, the followingprocedure is called removal of ε-transitions.

Theorem 1.6. Let M be a monoid with generating set Σ. For every finite nondeterministicM -automaton A there exists an equivalent finite letter-by-letter automaton for Σ with onlyone initial state.

Proof. Let A = (Q, δ, I, F ). By omitting all unreachable states we can assume that Q is finite.As long as there are transitions (p, u, q) with label u ∈M \ (Σ ∪ {1}) we write u = a1 · · · anwith ai ∈ Σ and we replace (p, u, q) by

p q2 q3 · · · qn qa1 a2 an

where q2, . . . , qn are new states. After this step, all labels are in Σ ∪ {1} and the acceptedset did not change. Let q0 be a new state. We add the transitions { (q0, 1, i) | i ∈ I } and wechoose q0 as the only initial state. Next, we make the 1-transitions transitive, i.e., wheneverthere are two consecutive transitions (p, 1, q) and (q, 1, r), then we add the transition (p, 1, r).One can think of this procedure as introducing shortcuts. Let

F ′ = F ∪ { p ∈ Q | (p, 1, q) is a transition with q ∈ F } .

Using F ′ as final sets yields the equivalent automaton B′ = (Q′, δ′, {q0} , F ′). Next, for everytransition (p, 1, q) ∈ δ′ and every generator a ∈ Σ we add the transitions{

(p, a, q′)∣∣ (q, a, q′) ∈ δ′

}.

This defines the equivalent automaton B′′ = (Q′, δ′′, {q0} , F ′).

14

1.6. Removal of epsilon-transitions

p q q′1 a

In the automaton B′

p q q′1 a

a

In the automaton B′′

Finally, let B be obtained from B′′ by removing all transitions with label 1. It remains toshow L(B) = L(B′′). The inclusion L(B) ⊆ L(B′′) is trivial since every accepting run of Bis also an accepting run of B′′. Let now r = q0u1q1 · · ·unqn be an accepting run of B′′ onu = u1 · · ·un ∈ L(B′′) with as few 1-transitions as possible. We can assume u 6= 1 sinceotherwise r = q0 is an accepting run of B, as desired. By construction of F ′ we have un 6= 1.If ui = 1, then i < n and ui+1 6= 1 since the 1-transitions are transitive. Now, by constructionof δ′′ there exists a transition (qi−1, ui+1, qi+1). Replacing the segment qi−1uiqiui+1qi+1 ofthe run r by qi−1ui+1qi+1 we obtain an accepting run on u with fewer 1-transitions. This is acontradiction to the choice of r. Therefore, r uses no 1-transitions, i.e., r is an accepting runof B. This shows L(B′′) ⊆ L(B).

We summarize the algorithm for the removal of ε-transitions. Moverover, we give a roughestimate of the running time of each step.

(a) Replace labels u ∈M \ (Σ ∪ {1}) by several transitions with labels in Σ. The runningtime of this step depends on the number of transitions required for this substitution.

(b) Add a unique initial state. To this end, |I| transitions have to be drawn.

(c) Make the 1-transitions transitive. There are cubic algorithms with respect to thenumber of states for computing transitive closures (e.g. Warshall’s algorithm).

(d) All states which can reach a final state using a 1-transition also become final states.Depending on the implementation of the automaton, this step is linear in the numberof transitions.

(e) For transitions (p, 1, q) and (q, a, r) with a ∈ Σ we introduce a shortcut (p, a, r).Depending on the implementation of the automaton, this step is quadratic in thenumber of transitions.

(f) All 1-transitions are removed. This step is linear in the number of transitions.

(g) As an optimization, one can omit unreachable states in the resulting automaton.

If all labels are in Σ ∪ {1}, then the algorithm can be implemented with a running timeof O(|Q|2 |δ|). This is mainly due to the complexity of step (e) since an upper bound forthe number of 1-transitions after step (c) is |Q|2. Since we can assume that all states arereachable we have O(|Q|3) ⊆ O(|Q|2 |δ|) which covers the running time of step (c). Also notethat if all labels are in Σ ∪ {1}, then, except for the initial state, no other new states arerequired by the construction.

Example 1.9. Let M = {a, b}∗, let Σ = {a, b}, and consider the following automaton fromExample 1.7 in which we replaced c by ab, i.e., the language accepted by the automaton isdefined by the rational expression b(ab | a∗)∗.

b ε ε a ε εε

ε

ab

ε

ε

15

1. Rational Sets

The automaton already has a unique initial state. We proceed by splitting the transitionwith label ab into two transitions. This yields the following automaton.

b ε ε a ε εε

ε

a b

ε

ε

In the next step, we make the ε-transitions transitive by successively adding new ε-transitionsas shortcuts for every two consecutive ε-transitions. We thereby omit ε-loops.

b ε ε a ε εε

ε

a b

ε

ε

ε

ε

ε

ε

ε

ε

Now, every state which can reach a final state using an ε-transition becomes a final stateitself.

b ε ε a ε εε

ε

a b

ε

ε

ε

ε

ε

ε

ε

ε

We combine the next two steps. We introduce new transitions with labels from Σ and weremove all ε-transitions.

b a

a b

a

a

aa

a

aa

16

1.7. Equivalence of rational expressions and nondeterministic automata

Finally, by omitting unreachable states (and after some rearrangement) we obtain the followingautomaton.

bb

aa

a

a

a a

The above automaton is letter-by-letter for {a, b} and it has only one initial state. 3

Remark 1.3. By adapting the algorithm for removal of ε-transitions one can also constructletter-by-letter automata with only one final state; but, in general, these automata then havemore than one initial state. Not for every rational set there exists a letter-by-letter automatawith only one initial state and only one final state. For instance, let Σ = {a, b} and M = Σ∗,and consider the language L = {ε, a, b}. Assume there were a letter-by-letter automaton Awith L(A) = L such that A has only one initial state q0 and only one final state q1. Then,since ε ∈ L, we have q0 = q1. From a, b ∈ L we conclude that there are accepting runs for thewords a and b, i.e., there are transitions (q0, a, q0) and (q0, b, q0). Combining these two runsyields an accepting run for every word in Σ∗. This is a contradiction to L(A) = L. Thereforesuch an automaton A for the language L does not exist. 3

1.7. Equivalence of rational expressions and nondeterministicautomaton models

We combine the findings in the previous sections for obtaining the following characterizationsof rational sets.

Theorem 1.7. Let M be a monoid with generating set Σ and let L ⊆ M . The followingconditions are equivalent:

(a) L ∈ RAT(M), i.e., the set L is rational.

(b) There exists a rational expression s over Σ such that L(s) = L.

(c) L is accepted by a finite nondeterministic M -automaton.

(d) L is accepted by a finite letter-by-letter M -automaton for Σ with a single initial state.

(e) L is accepted by a finite M -automaton with rational labels.

Proof. The equivalence of (a) and (b) is Proposition 1.2. (b)⇒ (c): This follows fromthe Thompson construction in Theorem 1.4. (c)⇒ (d): The removal of ε-transitions inTheorem 1.6 shows this implication. (d)⇒ (e): Trivial. (e)⇒ (a): This translation is achievedby state elimination, see Theorem 1.5.

In Figure 1.1 we summarize the conversions between the different descriptions of rationalsets. The numbers after the name of the construction indicate the maximal blow-up in size.For automata we count the number of states. For better readability the given bounds weassume that the set of generators Σ is fixed.

1.8. Semilinear subsets of commutative monoids

A monoid M is commutative if uv = vu for all u, v ∈ M . The operation in commutativemonoids is usually written as addition. This means we write u + v instead of uv, and

17

1. Rational Sets

rational expressionover Σ

nondeterministic automatonwith labels in Σ ∪ {1}

letter-by-letterautomaton for Σ

Thompson construction, O(n)

state elimination, 2O(n)

stateelim

ination, 2 O(n)

rem

oval

ofε-

tran

sitions

, n

triv

ial

Figure 1.1.: Transformation between rational expressions and nondeterministic automata.

the neutral element is denoted by 0. Similarly, for subsets K,L ⊆ M we write K + L ={u+ v | u ∈ K, v ∈ L } instead of concatenation KL. Typical commutative monoids are Nkand Zk with componentwise addition. Let M be an arbitrary commutative monoid, letu ∈M , and let n ∈ N. Then the term nu denotes the n-fold sum of u, i.e.,

nu = u+ · · ·+ u︸ ︷︷ ︸n terms

In particular, 0u = 0 is the neutral element of M . A subset L ⊆ M is linear if L =u0 + {u1, . . . , uk}∗ for k ≥ 0 and u0, . . . , uk ∈ M . We have u ∈ L if and only if there existn1, . . . , nk ∈ N with

u = u0 + n1u1 + · · ·+ nkuk.

A subset L ⊆M is semilinear if L is a finite union of linear sets. Every semilinear set L canbe written as L =

⋃`j=1(vj + C∗j ) for vi ∈M and finite subsets Ci ⊆M . We show that the

semilinear and the rational subsets coincide.

Theorem 1.8. Let M be a commutative monoid and let L ⊆M . The subset L is semilinearif and only if L ∈ RAT(M).

Proof. By definition, every semilinear set is rational. Every finite subset of M is semilinear;and semilinear sets are closed under union. For the remaining inclusion it therefore suffices toshow that semilinear sets are closed under addition and Kleene star. Let K =

⋃ki=1(ui +B∗i )

and L =⋃`j=1(vj + C∗j ) for ui, vj ∈M and for finite subsets Bi, Cj ⊆M . We have

K + L =k⋃i=1

⋃j=1

((ui + vj) + (Bi ∪ Cj)∗

).

This shows that K +L is semilinear. For all subsets X,Y ⊆M we have (X ∪Y )∗ = X∗+Y ∗.It therefore remains to show that the Kleene star of a linear set yields a semilinear set. Wehave (

u0 + {u1, . . . , uk}∗)∗

= {0} ∪(u0 + {u0, u1, . . . , uk}∗

).

Thus the Kleene star of the linear set u0 + {u1, . . . , uk}∗ is semilinear.

18

2. Recognizable Sets

Throughout this chapter, the letters M and N denote monoids. Let ϕ : M → N be ahomomorphism. The homomorphism ϕ recognizes a subset L ⊆ M if ϕ−1(ϕ(L)) = L. Wealso say that a monoid N recognizes L ⊆ M if there exists a homomorphism ϕ : M → Nwhich recognizes L. A subset L ⊆ M is recognizable if it is recognized by a finite monoid.The class of all recognizable subsets of M is denoted by REC(M). For any homomorphismϕ : M → N and any subset L ⊆M we have L ⊆ ϕ−1(ϕ(L)) since

u ∈ L ⇒ ϕ(u) ∈ ϕ(L).

Thus, if ϕ−1(ϕ(L)) ⊆ L, then ϕ−1(ϕ(L)) = L and the following equivalence holds:

u ∈ L ⇔ ϕ(u) ∈ ϕ(L).

This means that for checking whether or not u ∈ L holds, it suffices to test ϕ(u) ∈ ϕ(L);and the latter can be considerably easier. For example, if N is a finite monoid (i.e., if L isrecognizable), then this is just a lookup in a finite table. Moreover, the computation of ϕ(u)can be easily parallelized if u is given as a sequence of generators u = a1 · · · an:

a1 a2 a3 a4 · · · an

ϕ(a1) ϕ(a2) ϕ(a3) ϕ(a4) · · · ϕ(an)

ϕ(a1)ϕ(a2) ϕ(a3)ϕ(a4)

ϕ(a1) · · ·ϕ(a4)

ϕ(a1) · · ·ϕ(an)

ϕ(u)

=

With this scheme the parallel computation of h(u) with n processors requires log n steps, itcomputes n images of letters under ϕ, and it evaluates n− 1 products in N .

The following lemma shows that recognition can also be defined in terms of accepting setsP ⊆ N . In particular, this gives a way for presenting recognizable subsets of M using thefollowing three ingredients: a finite monoid N , a homomorphism ϕ : M → N , and a subsetP ⊆ N .

Lemma 2.1. Let ϕ : M → N be a homomorphism and let L ⊆M . Then ϕ recognizes L ifand only if there exists P ⊆ N with L = ϕ−1(P ).

Proof. For the implication from left to right we can set P = ϕ(L). Let now L = ϕ−1(P ).Then ϕ−1(ϕ(L)) = ϕ−1(ϕ(ϕ−1(P ))) ⊆ ϕ−1(P ) = L.

If ϕ : M → N is a homomorphism, then ϕ(M) is a submonoid of N . Therefore, by replacingN by ϕ(M) one can assume that ϕ is surjective. If a surjective homomorphism ϕ : M → Nrecognizes L ⊆M , then P = ϕ(L) is the only accepting set for L.

19

2. Recognizable Sets

2.1. Closure properties of recognizable sets

If M is finite, then every subset of M is recognizable since it is recognized by the identitymapping id : M →M . For groups, also the converse statement is true.

Proposition 2.2. A group G is finite if and only if {1} ∈ REC(G).

Proof. It suffices to show the implication from right to left. Let N be a finite monoid andlet ϕ : G → N be a homomorphism recognizing {1}. We can assume that ϕ is surjectiveand hence, N is a group. Suppose ϕ(u) = ϕ(v). Then ϕ(uv−1) = ϕ(u)ϕ(v)−1 = 1 and thusuv−1 ∈ ϕ−1(1). Since ϕ−1(1) = {1} we conclude uv−1 = 1 and thus u = v. Therefore, ϕ isinjective, which shows that G and N are isomorphic. In particular, G is finite.

If M is finitely generated, then only countably many subsets of M are recognizable. Eventhough, quite few subsets of infinite monoids are recognizable, this class provides a nicecollection of closure properties. Let L ⊆ M . A left residual of L is a set of the formu−1L = { v ∈M | uv ∈ L } for u ∈ M . Symmetrically, a right residual of L is Lu−1 ={ v ∈M | vu ∈ L }. The following theorem summarizes some important closure properties ofrecognizable subsets.

Theorem 2.3. Let M and M ′ be monoids and let ψ : M ′ →M be a homomorphism. Thefollowing properties hold:

(a) REC(M) is closed under finite union, finite intersection, and complement.

(b) REC(M) is closed under left and right residuals.

(c) If L ∈ REC(M), then ψ−1(L) ∈ REC(M ′), i.e., the recognizable sets are closed underinverse homomorphisms.

Moreover, for effective presentations of M , M ′ and ψ, all of the above closure properties areeffective.

Proof. (a): If a homomorphism ϕ recognizes L ⊆M , then it also recognizes M \L. The trivialmonoid {1} recognizes both M and ∅. This covers the empty intersection and the emptyunion. Consider homomorphism ϕi : M → Ni with Li = ϕ−1i (Pi). Let ϕ : M → N1×N2 withϕ(u) = (ϕ1(u), ϕ2(u)). Then L1∩L2 = ϕ−1(P1×P2) and L1∪L2 = ϕ−1((P1×N2)∪(N1×P2)).

(b): Let u ∈ M and let ϕ : M → N be a homomorphism with L = ϕ−1(P ) for P ⊆ N .The subset P ′ = ϕ(u)−1P = {x ∈ N | ϕ(u)x ∈ P } satisfies

ϕ(v) ∈ P ′ ⇔ ϕ(u)ϕ(v) ∈ P ⇔ ϕ(uv) ∈ P ⇔ uv ∈ L ⇔ v ∈ u−1L.

This shows u−1L = ϕ−1(P ′). Right residuals are symmetric.(c): Let K = ψ−1(L). If ϕ : M → N recognizes L, then ϕ◦ψ : M ′ → N recognizes K since

u ∈ K ⇔ ψ(u) ∈ L ⇔ ϕ(ψ(u)) ∈ ϕ(L).

We now show that recognizable sets are not closed under homomorphic images.

Example 2.1. Let ψ : {a, b}∗ → Z be the homomorphism defined by a 7→ 1 and b 7→ −1. Wehave {ε} ∈ REC({a, b}∗) but ψ({ε}) = {0} is not in REC(Z) by Proposition 2.2. 3

The following example of Shmuel Winograd shows that, in general, recognizable sets areneither closed under concatenation nor under Kleene-star.

Example 2.2. We extend the addition of Z to M = Z ∪ {e, a} by setting e+m = m for allm ∈M , a+ a = 0 and a+ k = k for all k ∈ Z. Now, M forms a commutative monoid withneutral element e.

We claim that L ∈ REC(M) implies L ∩ Z ∈ REC(Z). Suppose ϕ : M → N is ahomomorphism recognizing L ⊆ M , and let ϕ : Z→ N be the restriction of ϕ to Z. ThenF = ϕ(L) satisfies ϕ−1(F ) = ϕ−1(F ) ∩ Z = L ∩ Z. This proves the claim.

20

2.2. Syntactic monoids

Next we show {a} ∈ REC(M). Let N = {e, a, z} be a commutative monoid with neutralelement e with zero element z and with a+ a = z. Let ϕ : M → N be the homomorphismdefined by e 7→ e, a 7→ a, and k 7→ z for all k ∈ Z. It satisfies ϕ−1(a) = {a}. The productof {a} with itself is {a} + {a} = {0} and its Kleene-star is {a}∗ = {e, a, 0}. Neither {0}nor {e, a, 0} is in REC(M), since in both cases the intersection with Z yields {0} which byProposition 2.2 is not in REC(Z). 3

2.2. Syntactic monoids

The syntactic monoid Synt(L) of a subset L ⊆ M is the unique minimal monoid whichrecognizes L. It is naturally equipped with a homomorphism ϕL : M → Synt(L) whichrecognizes L, the so-called syntactic homomorphism. In order to formally define the syntacticmonoid of L, we first introduce the syntactic congruence ≡L over M . We set u ≡L v if, forall x, y ∈M , the following equivalence holds:

xuy ∈ L ⇔ xvy ∈ L.

One can think of (x, y) as a context, and then u ≡L v means that u and v have the samebehaviour with respect to all contexts. In particular, ≡L forms an equivalence relation andthe equivalence class of u is denoted by [u]. Suppose u ≡L u′ and v ≡L v′. For all x, y ∈Mwe have

xuvy ∈ L ⇔ xu′vy ∈ L ⇔ xu′v′y ∈ L,

where the first equivalence uses the context (x, vy) and the second equivalence uses thecontext (xu′, y). This shows uv ≡L u′v′ and thus ≡L is a congruence. The quotientM/≡L = { [u] | u ∈M } is called the syntactic monoid of L, denoted by Synt(L). It is theset of all equivalence classes and the operation on Synt(L) is defined by [u][v] = [uv]. Thisis well-defined since ≡L is a congruence. The natural projection ϕL : M → Synt(L) withϕL(u) = [u] forms a homomorphism, the syntactic homomorphism of L. It recognizes L since

[u] ∈ ϕL(L) ⇔ u ≡ v for some v ∈ L ⇔ u ∈ L.

The second equivalence uses the context (1, 1). The following theorem shows that Synt(L)indeed is the minimal monoid recognizing L.

Theorem 2.4. Let M and N be monoids, and let ϕ : M → N be a homomorphism recognizingL ⊆M . Then ϕ(u) 7→ [u] defines a surjective homomorphism from the submonoid ϕ(M) ofN onto Synt(L). In particular, the following diagram commutes:

M ϕ(M)

Synt(L)

ϕ

ϕ(u) 7→ [u]ϕL

Proof. We first show that ϕ(u) 7→ [u] is well-defined. Suppose ϕ(u) = ϕ(v). For all x, y ∈Mwe have

xuy ∈ L ⇔ ϕ(xuy) ∈ L ⇔ ϕ(xvy) ∈ L ⇔ xvy ∈ L.

Thus [u] = [v] as desired. Trivially, ϕ(M)→ Synt(L), ϕ(u) 7→ [u] is surjective, and it formsa homomorphism since 1 = ϕ(1) 7→ [1] and ϕ(u)ϕ(v) = ϕ(uv) 7→ [uv] = [u][v].

The above theorem says that Synt(L) is the homomorphic image of a submonoid of everymonoid which recognizes L. If N is a finite monoid, then |Synt(L)| ≤ |N |. Therefore, Synt(L)is the unique minimal monoid which recognizes L.

21

2. Recognizable Sets

Suppose ϕ : M → N is a surjective homomorphism which recognizes L ⊆M . Let P = ϕ(L).The syntactic congruence of P satsifies

u ≡L v ⇔(∀x, y ∈M : xuy ∈ L⇔ xvy ∈ L

)⇔(∀x, y ∈M : ϕ(xuy) ∈ ϕ(L)⇔ ϕ(xvy) ∈ ϕ(L)

)⇔(∀x′, y′ ∈ N : x′ϕ(u)y′ ∈ P ⇔ x′ϕ(v)y′ ∈ P

)⇔ ϕ(u) ≡P ϕ(v).

Here, the second equivalence holds since ϕ recognizes L, and the third equivalence holdssince ϕ is surjective and since P = ϕ(L). This shows that the syntactic monoids Synt(L)and Synt(P ) are identical. Note that Synt(L) is a quotient of M and Synt(P ) is a quotientof N . In particular, the homomorphism N → Synt(L), ϕ(u) 7→ [u] given by Theorem 2.4 isthe syntactic homomorphism of P . As a consequence, there exists a monoid M and a subsetL ⊆M such that N = Synt(L) if and only if there exists P ⊆ N such that N is the syntacticmonoid of P .

2.3. Deterministic automata

There is a tight connection between homomorphisms and deterministic automata. In contrastto nondeterministic M -automata, the transition relation of a deterministic automaton definesa function Q×M → Q; and the easiest way for introducing deterministic M -automata (orjust M -automata for short) is to rely on this function. The main technical advantage ofthis approach is that we do not have to deal with runs of the automaton. An M -automatonA = (Q, ·, q0, F ) consists of a set of states Q, of an initial state q0 ∈ Q, of a set of final statesF ⊆ Q, and of a transition function · : Q×M → Q satisfying

q · 1 = q

(q · u) · v = q · (uv)

for all q ∈ Q and all u, v ∈ M . The subset accepted by A is L(A) = {u ∈M | q0 · u ∈ F }.Remember that the subset accepted by a nondeterministic automaton is defined slightly moretechnical since it relies on the notion of a run. Suppose Σ ⊆M is a generating set. Then wecan define the transition relation

δ = { (p, a, p · a) ∈ Q× Σ×Q | p ∈ Q, a ∈ Σ } .

The transition function · can be retrieved from δ since q · u is the unique state p such thatthere exists a run qa1q1 · · · an−1qn−1anp in the (a priori) nondeterministic M -automaton(Q, δ, {q0} , F ) with u = a1 · · · an, ai ∈ Σ. In particular, M -automata can be graphicallyrepresented in the same way as nondeterministic M -automata. An M -automaton A =(Q, ·, q0, F ) is finite if it can be represented by a finite transition relation δ. If Q is finite, wesay that the set of states of A is finite. If M is finitely generated and the set of states of A isfinite, then A is finite. Conversely, if A is finite, then

{ a ∈M | (p, a, q) ∈ δ for some p, q ∈ Q }

is a finite generating set of M and, moreover, we can replace the set of states Q by the finiteset

{q0} ∪ { p, q ∈ Q | (p, a, q) ∈ δ for some a ∈M } .

Frequently, we assume that all states are reachable, i.e., for all q ∈ Q there exists u ∈ Mwith q0 · u = q (otherwise we can replace Q by the reachable states).

22

2.4. Minimal automata

Example 2.3. The set {1,−1, } generates the integers Z with addition as operation. TheZ-automaton A = ({ 0, 1, 2, 3, 4, 5 } , ·, 0, { 1, 4 }) with q · k = (q + k) mod 6 has the followinggraphical presentation:

0

1 2

3

45

1

1

1

1

1

1

−1

−1

−1

−1

−1

−1

It accepts the set L(A) = { k + 6` | k ∈ {1, 4} , ` ∈ Z }. 3

2.4. Minimal automata

The minimal automaton of a subset L ⊆ M is the unique smallest automaton acceptingL. It can be seen a one-sided version of the syntactic monoid. Let L(u) = u−1L ={ y ∈M | uy ∈ L }. Note that L(1) = L. We have L(u) = L(v) if and only if for all y ∈Mthe following equivalence holds:

uy ∈ L ⇔ vy ∈ L.

The minimal automaton AL = (QL, ·, q0L, FL) of L has states QL = {L(u) | u ∈M }, itsinitial state is q0L = L(1) = L, its set of final states is FL = {L(u) | 1 ∈ L(u) }, and thetransition function is defined by L(u) · v = L(uv). Suppose L(u) = L(u′). Then for all y ∈Mwe have

y ∈ L(uv) ⇔ uvy ∈ L ⇔ vy ∈ L(u) = L(u′) ⇔ u′vy ∈ L ⇔ y ∈ L(u′v).

Thus L(uv) = L(u′v) and the transition function of AL is well-defined. Note that L(u) · 1 =L(u) and (L(u) · v) ·w = L(uvw) = L(u) · (vw), i.e., the mapping · indeed defines a transitionfunction. The minimal automaton AL accepts the set L(AL) = L since L(1) · u = L(u) and

u ∈ L ⇔ 1 ∈ L(u).

Example 2.4. We consider the subset L = { k + 6` | k ∈ { 1, 4 } , ` ∈ Z } from Example 2.3.Adding or subtracting multiples of 3 does not change membership in L. It therefore sufficesto determine L(0), L(1), and L(2):

L(0) = 1 + 3Z = { 1 + 3k | k ∈ Z }L(1) = 3ZL(2) = −1 + 3Z

Thus, the minimal automaton AL of L is given by

L(0)

L(1)

L(2)

1

1

1

−1

−1

−1

23

2. Recognizable Sets

The initial state is L(0) and not L(1) since 0 is the neutral element of (Z,+). Also notethat, for example, the transition L(0) · 1 yields the set L(0)− 1 = L(1) since L(0)− 1 is theresidual of L(0) by 1. 3

The following theorem shows that the minimal automaton of L is the unique minimalautomaton which accepts L.

Theorem 2.5. Let M be a monoid and let A = (Q, ·, q0, F ) be an M-automaton withL(A) = L. Then q0 · u 7→ L(u) defines a surjective mapping from the reachable states of Aonto the states QL of the minimal automaton.

Proof. If p = q0 · u, then L(u) = { v ∈M | p · v ∈ F }. Since { v ∈M | p · v ∈ F } does notdepend on u, the mapping q0 · u 7→ L(u) is well-defined. Some preimage of L(u) is the stateq0 · u.

For an M -automaton A = (Q, ·, q0, F ) and a state p ∈ Q one can define the set L(p) ={ v ∈M | p · v ∈ F }. It is the subset of M accepted by the M -automaton Ap = (Q, ·, p, F )with initial state p. The proof of Theorem 2.5 shows that for every reachable state p of A,the set L(p) is a state of the minimal automaton. The Myhill-Nerode equivalence relation ≡Aon Q is defined by p ≡A q if L(p) = L(q). The minimal automaton for L(A) can be obtainedby identifying Myhill-Nerode equivalent states. This fact is often used by minimizationalgorithms. Moreover, an M -automaton A = (Q, ·, q0, F ) is minimal if and only if thefollowing two properties hold:

• All states in Q are reachable, i.e., we have q0 ·M = Q.

• For any two distinct states states p, q ∈ Q we have L(p) 6= L(q).

2.5. Transition monoids and Cayley automata

In this section we show that every M -automaton A can be transformed into a homomorphismϕA : M → TA which recognizes L(A). Conversely, for every homomorphism ϕ : M → Nand every subset P ⊆ N one can construct an M -automaton Aϕ such that L(Aϕ) = ϕ−1(P ).Moreover, both translations preserve finiteness in the sense that the automaton has finitelymany states if and only if the monoid N (resp. TA) is finite.

Let A = (Q, ·, q0, F ) be an M -automaton. Every element u ∈ M defines a mappingδu : Q → Q by setting δu(q) = q · u. When defining the product of two such functionsas δuδv = δuv = δv ◦ δu, then TA = { δu | u ∈M } forms a monoid, the transition monoidof A. It is a submonoid of the set of all functions Q→ Q. In particular, if Q is finite, thenthe transition monoid of A is finite. The mapping ϕA : M → TA defined by ϕA(u) = δurecognizes L(A):

u ∈ L(A) ⇔ q0 · u ∈ F ⇔ δu(q0) ∈ F ⇔ δu ∈ { δv | δv(q0) ∈ F } .

This shows that L(A) is the inverse image of { δv | δv(q0) ∈ F } ⊆ TA under the homomorphismϕA. Note that both the initial state and the final states are encoded in the accepting set forϕA.

Theorem 2.6. Let L ⊆M , let AL be its minimal M -automaton, and let TL be the transitionmonoid of AL. Then TL → Synt(L), δu 7→ [u] defines an isomorphism between TL and thesyntactic monoid Synt(L).

Proof. By Theorem 2.4 the mapping δu 7→ [u] is a surjective homomorphism. It remains toshow injectivity. Suppose u ≡L v. For x ∈M we have δu(L(x)) = L(x) · u = L(xu). Thus forall y ∈M we have

y ∈ δu(L(x)) = L(xu) ⇔ xuy ∈ L ⇔ xvy ∈ L ⇔ y ∈ L(xv) = δv(L(x))

and thus δu(L(x)) = δv(L(x)). Since this holds for all x ∈M we conclude δu = δv.

24

2.5. Transition monoids and Cayley automata

If A is a finite automaton with n states, then its transition monoid has at most nn elements.The following example of Holzer and Konig shows that this bound is tight [?].

Example 2.5. We define an automaton with states Q = {1, . . . , n}. We use the notation

τ =

(1 2 · · · n

τ(1) τ(2) · · · τ(n)

)to define a mapping τ : Q→ Q. Such mappings are also called transformations on Q. Let

α =

(1 2 · · · n− 1 n2 3 · · · n 1

),

β =

(1 2 3 · · · n2 1 3 · · · n

),

γ =

(1 2 3 · · · n2 2 3 · · · n

).

Every permutation on Q is generated by the two permutations α and β. The mapping γallows to identify two elements. Thus the set of all transformations on Q is generated by{α, β, γ}. Let A = (Q, ·, 1, {1}) be the {a, b, c}∗-automaton defined by i ·a = α(i), i · b = β(i),and i · c = γ(i) for i ∈ Q. Its graphic representations is

1

23

n-1n

················

a, b, c

aa

aa

a

b

c

b, c

b, c

b, c

The automaton A is minimal since all states are reachable, and for states i 6= j in Q wehave i · an−i+1 = 1 and j · an−i+1 6= 1. The transition monoid of A has nn elements becausethere are exactly nn transformations on Q. By Theorem 2.6, the transition monoid of A isthe syntactic monoid of L(A). In particular, by Theorem 2.4 there is no smaller monoidrecognizing L(A). 3

The step from homomorphisms to automata is straightforward. Suppose ϕ : M → N is ahomomorphism and P is a subset of N . The Cayley automaton of ϕ and P is Aϕ = (N, ·, 1, P )with transition function n · u = nϕ(u) for n ∈ N and u ∈M . We have

u ∈ L(Aϕ) ⇔ ϕ(u) = 1 · u ∈ P,

i.e., the M -automaton Aϕ accepts ϕ−1(P ). In particular, for L = ϕ−1(P ) it accepts thesubset L ⊆ M . If ϕ : M → Synt(L) is the syntactic homomorphism of L ⊆ M , then ingeneral the Cayley automaton Aϕ is not the minimal automaton of L. This means that if wefirst construct a homomorphism ϕ for some given automaton A and afterwards translate ϕinto an automaton Aϕ, then in general we have A 6= Aϕ.

Example 2.6. Let M = {a, b}∗ and let L = (ab)∗. The minimal automaton of L is:

25

2. Recognizable Sets

q0 q1

q2

a

bb a

a, b

We have L(q0) = (ab)∗, L(q1) = b(ab)∗ and L(q2) = {a, b}∗. The state q2 is called a sink statesince q2 · u = q2 for all u ∈ {a, b}∗. The syntactic monoid of L can be represented by shortestwords in their respective ≡L-classes. Using this notation we have Synt(L) = { 1, a, b, ab, ba, 0 }.As usual, 1 is the neutral element, and the element 0 satisfies 0x = x0 = 0 for all x ∈ Synt(L).We have 0 = aa = bb. The remaining entries of the multiplication table are given bythe rules aba = a and bab = b since aba ≡L a and bab ≡L b. For instance, we haveab · ba = a · bb · a = a · 0 · a = 0 or ab · ab = aba · b = a · b = ab. Suppose ϕ : {a, b}∗ → Synt(L)is the syntactic homomorphism of L, i.e., we have a 7→ a and b 7→ b. Then L = ϕ−1({1, ab})and the Cayley automaton Aϕ is:

1

a ab

0

bab

a

b

b

b

a

a

a b

ab

a, b

In particular Aϕ is not the minimal automaton of L. 3

On the other hand, let ϕ : M → N be a surjective homomorphism and let Aϕ be its Cayleyautomaton. Then the transition monoid of Aϕ is N since the mapping δu 7→ ϕ(u) is anisomorphism between the transtition monoid of Aϕ and N . This can be seen as follows. Itis well-defined because δu = δv implies ϕ(u) = δu(1) = δv(1) = ϕ(v); it injective becauseϕ(u) = ϕ(v) implies δu(n) = n · u = nϕ(u) = nϕ(v) = n · v = δv(n) for all n ∈ N , i.e., wehave δu = δv. The mapping δu 7→ ϕ(u) trivially defines a homomorphism and it is surjectivebecause ϕ is surjective.

Remark 2.1. Let N be a monoid and let Σ ⊆ N be a generating set. Then the Cayley graphof N is GN = (N,E) with Σ-labeled edges { (n, a, n · a) | n ∈ N, a ∈ Σ }. Sometimes GN iscalled the right Cayley graph of N , whereas the edges { (n, a, a · n) | n ∈ N, a ∈ Σ } definethe left Cayley graph. If id : N → N is the identity mapping, then (after forgetting aboutinitial and final states) the Cayley automaton Aid can be identified with the Cayley graph GN .If we only draw Σ-transitions, then the graph structure of Aid is GN . 3

2.6. The Myhill-Nerode Theorem

We are now ready to prove that recognition by finite monoids and acceptance by deterministicautomata with finitely many states define the same subsets. Moreover, one can restate thisproperty in terms of minimal automata and syntactic monoids. This following result is knownas the Myhill-Nerode Theorem.

26

2.7. Learning Recognizable Sets

Theorem 2.7 (Myhill, Nerode). Let M be monoid and let L ⊆M . The following conditionsare equivalent:

(a) L ∈ REC(M), i.e., L is recognized by a finite monoid.

(b) L is accepted by an M -automaton with finitely many states.

(c) The minimal automaton of L has finitely many states.

(d) The syntactic monoid of L is finite.

Proof. (a)⇒ (b): If the homomorphism ϕ : M → N for some finite monoid N recognizes L,then the Cayley automaton Aϕ has finitely many states and it accepts L. (b)⇒ (c): ByTheorem 2.5, the minimal automaton of L is the smallest automaton recognizing L. (c)⇒ (d):By Theorem 2.6, the syntactic monoid is the transition monoid of the minimal automaton.(d)⇒ (a): The syntactic monoid Synt(L) recognizes L.

As a corollary we obtain the following characterization of subsets accepted by finiteM -automata.

Corollary 2.8. Let M be monoid and let L ⊆M . The following conditions are equivalent:

(a) M is finitely generated and L ∈ REC(M).

(b) L is accepted by a finite M -automaton.

Proof. (a)⇒ (b): Suppose M is generated by a finite set Σ. By the Myhill-Nerode Theorem,the set L is accepted by an M -automaton A = (Q, ·, q0, F ) with finitely many states. Byrepresenting the transition function · by δ = { (p, a, p · a) | p ∈ Q, a ∈ Σ } we see that A isfinite.

(b)⇒ (a): If L is accepted by a finite M -automaton A = (Q, δ, q0, F ), then by replacing Qby the reachable states q0 ·M we obtain an automaton for L with finitely many states. Bythe Myhill-Nerode Theorem, we have L ∈ REC(M). Moreover, the finite set

Σ = { a ∈M | (p, a, q) ∈ δ for some p, q ∈ Q }

generates M (since, for instance, δ defines the transition function q0 · u for all u ∈M).

Figure 2.1 summarizes the transformations between automata and monoids; the terms nand nn indicate the maximal blow-up relative to the number of states and the size of themonoid.

automatonmonoid

minimal automatonsyntactic monoid

transition monoid, nn

Cayley automaton, n

transition monoid, nn

Cayleyautomaton, n

Theorem 2.5trivialTheorem 2.4 trivial

Figure 2.1.: Transformation between automata and monoids.

2.7. Learning Recognizable Sets

We present Angluin’s L∗-learning algorithm [?]. The setting is as follows. Let M be generatedby Σ. A teacher knows a recognizable set L ⊆M and a learner wants to learn the minimal

27

2. Recognizable Sets

automaton of L. The learner knows M and Σ and he can ask two kinds of questions to theteacher:

Membership queries: For u ∈ M the learner can ask whether u ∈ L. The answer of theteacher is either yes or no.

Equivalence queries: For an M -automaton A the learner can ask whether L(A) = L. Theanswer of the teacher is either yes or he provides a element u = a1 · · · ak with ai ∈ Σ inthe symmetric difference of L(A) and L.

Let n be the number of states of the minimal automaton of L. A naive approach for thelearner could be enumerating all M -automata and using each of them for an equivalencequery; eventually, the teacher will say yes. This approach is exponential in n due to the largenumber of automata with at most n states. Angluin’s L∗-algorithm yields an algorithm usinga polynomial number of queries (at least if the teacher provides counter-examples of smalllength in his answers to equivalence queries). For elements u, v ∈M we define u ≈L v if forall y ∈M we have

uy ∈ L ⇔ vy ∈ L.The relation ≈L defines an equivalence relation on M and its classes are the states of theminimal automaton. The learner tries to approximate ≈L. A set of extensions is a subsetE ⊆M with 1 ∈ E. For a set of extensions E we set u ∼E v if for all e ∈ E we have

ue ∈ L ⇔ ve ∈ L.

Note that u ≈L v implies u ∼E v for all sets of extensions E. If u ∼E v, then

u ∈ L ⇔ v ∈ L

since 1 ∈ E. A subset S ⊆ M is a set of samples if 1 ∈ S and S prefix-closed. A subsetS ⊆M is prefix-closed if every element u ∈ S can be written as u = a1 · · · ak for ai ∈ Σ suchthat { a1 · · · ai | 0 ≤ i ≤ k } ⊆ S. A pair P = (S,E) consisting of samples S and extensions Eis complete if for all elements s ∈ S and for all generators a ∈ Σ there exists t ∈ S with

sa ∼E t.

The pair P is consistent if for all elements s, t ∈ S and for all generators a ∈ Σ we have

s ∼E t ⇒ sa ∼E ta.

For a set of extensions E and an element u ∈ M we define [u]E = { v ∈M | u ∼E v }. IfP = (S,E) is both complete and consistent, then we obtain an automatonAP = (QP , ·, qP , FP )with

QP = { [s]E | s ∈ S } ,[s]E · a = [sa]E for s ∈ S,

qP = [1]E ,

FP = { [s]E | s ∈ S ∩ L } .

If s ∈ S and a ∈ Σ, then [sa]E ∈ QP since P is complete. If [s]E = [t]E for s, t ∈ S, thenby consistency of P we obtain [sa]E = [ta]E . This shows that the transition function iswell-defined.

Lemma 2.9. If P = (S,E) is both complete and consistent, then L(AP ) ∩ S = L ∩ S.

Proof. Let u ∈ S. Since S is prefix-closed we can write u = a1 · · · ak with ai ∈ Σ such thata1 · · · ai ∈ S for all 0 ≤ i ≤ k. Therefore, there exists a run

[1]Ea1 [a1]E

a2 [a1a2]Ea3 · · · ak−1 [a1 · · · ak−1]E

ak [a1 · · · ak]E

in AP . This run is accepting if and only u ∈ L.

28

2.7. Learning Recognizable Sets

We can consider the states QP = { [u]E | u ∈ S } also for pairs P = (S,E) which are notcomplete or not consistent. If P = (S,E) and P ′ = (S′, E) for S ⊆ S′, then QP ⊆ QP ′ . IfP = (S,E) and P ′ = (S,E′) for E ⊆ E′, then for every [s]E′ ∈ QP ′ with s ∈ S we have[s]E ∈ QP and [s]E′ ⊆ [s]E , i.e., QP ′ refines QP . This shows that by increasing S or E, thenumber of states never decreases.

Lemma 2.10. Let P = (S,E) be complete and consistent, and let u = a1 · · · ak withai ∈ Σ be an element in the symmetric difference of L(AP ) and L. Then P ′ = (S′, E) withS′ = S ∪ { a1 · · · ai | 1 ≤ i ≤ k } is not consistent.

Proof. Suppose P ′ were consistent. For all i ∈ {1, . . . , k} there exists si ∈ S with

[1]E · (a1 · · · ai) = [si]E .

In particular, we have siai ∼E si+1. By induction on i, consistency of P ′ yields si ∼E a1 · · · ai.In particular sk ∼E u. This leads to the following contradiction:

u ∈ L(AP ) ⇔ sk ∈ L(AP ) since [1]E · u = [1]E · sk⇔ sk ∈ L by Lemma 2.9

⇔ u ∈ L by sk ∼E u and 1 ∈ E⇔ u 6∈ L(AP ) by choice of u.

Therefore, P ′ is not consistent.

The algorithm

Angluin’s learning algorithm now proceeds as follows. We start with some arbitrary pairP = (S,E) of samples and extensions; for instance we could use S = E = {1}. Then wesuccessively replace P by the pair P ′ defined by the following procedure:

The pair P is not complete: Let s ∈ S and a ∈ Σ with [sa]E 6∈ QP . Then we set P ′ = (S′, E)with S′ = S ∪ {sa}. This yields |QP ′ | > |QP | since [sa]E ∈ QP ′ .

The pair P is not consistent: Let s, t ∈ S and a ∈ Σ with s ∼E t and sa 6∼E ta. Thereexists e ∈ E with sae ∈ L ⇔ tae 6∈ L. Then we set P ′ = (S,E′) with E′ = E ∪ {ae}.This yields |QP ′ | > |QP | since [s]E′ 6= [t]E′ .

The pair P is complete and consistent, but L(AP ) 6= L: Let u = a1 · · · ak with ai ∈ Σ bean element in the symmetric difference of L(AP ) and L. Then we set P ′ = (S′, E)with S′ = S ∪ { a1 · · · ai | 1 ≤ i ≤ k }. By Lemma 2.10, the pair P ′ is not consistent.Therefore, the set of states increases in the next iteration.

Note that we preserve the invariant that the set of samples S is prefix-closed. It is sufficientfor the learner to represent the restriction of the relation ∼E to the set S ∪ SΣ.

The number of queries

The learner asks membership queries for all elements se with s ∈ S ∪ SΣ and e ∈ E. Thissuffices for checking whether the current pair P is complete and consistent and for computingthe automaton AP . In the case that the current pair is both complete and consistent, heasks an equivalence query.

We assume that the learner starts with the pair ({1} , {1}). Throughout the number ofstates |QP | is bounded by n. Since increasing the extensions also increases the numberof states, we have |E| ≤ n. If all counter-examples u = a1 · · · ak of the teacher satisfyk ≤ m, then after adding at most m+ 1 elements to S, the number of states increases. Thus|S| ≤ (m+ 1)n. This shows that the learner asks at most(

(m+ 1)n+ |Σ| (m+ 1)n)n ∈ O(|Σ|mn2)

29

2. Recognizable Sets

membership queries. If the teacher provides counter-examples of minimal length, then byCorollary 4.6 we will see that m ≤ |QP |+n−2 ≤ 2n−2. In particular, the algorithm requiresat most O(|Σ|n3) membership queries. The number of equivalence queries is at most n sinceevery equivalence query causes an increase in the number of states.

Example 2.7. We apply Angluin’s algorithm for learning the language L = b {ab, a}∗ ⊆ Σ∗

for Σ = {a, b}, see Example 1.7. We represent the restriction of the relation ∼E to the setS ∪ SΣ as a table. The top rows correspond to S, the bottom rows correspond to SΣ \ S,and the columns correspond to E. At coordinate (s, e) for s ∈ S ∪ SΣ and e ∈ E we write 1if se ∈ E and 0 otherwise. Two elements in S ∪ SΣ are ∼E-equivalent if and only if theyhave identical 0-1 rows. The initial table for the pair P = ({ε} , {ε}) is as follows.

ε

ε 0

a 0b 1

The pair P is not complete since the 0-1-row for b in the lower half does not occur in theupper half. We thus add b to S which yields the following situation.

ε

ε 0b 1

a 0ba 1bb 0

Now, the pair is complete and consistent and the learner asks an equivalence query with thefollowing automaton.

[ε] [b]b

b

a

a

Since the automaton is not correct, the teacher returns the counter-example ab. Incorporatingab and its prefix a into S yields the following table.

ε

ε 0a 0b 1ab 0

aa 0ba 1bb 0aba 0abb 0

The current pair P is not consistent since ε and a are ∼E-equivalent (they have the same0-1-row) but ε · b 6∼E a · b. Therefore, we add b to E.

30

2.7. Learning Recognizable Sets

ε b

ε 0 1a 0 0b 1 0ab 0 0

aa 0 0ba 1 1bb 0 0aba 0 0abb 0 0

This table is not complete since the 0-1-row of ba does not occur in the upper half. Wetherefore add ba to S.

ε b

ε 0 1a 0 0b 1 0ab 0 0ba 1 1

aa 0 0bb 0 0aba 0 0abb 0 0baa 1 1bab 1 0

This table is both complete and consistent. Therefore, the learner uses the following automatonfor an equivalence query.

[ε] [b] [ba]

[a]

ba

b

a

ab

a, b

The teacher answers yes since this automaton accepts L. 3

Learning Boolean functions with few terms

Let B = {1, 0} be the Boolean lattice. As usual, 1 is for true and 0 is for false. A Booleanfunction is a mapping of the form f : Bn → B for some n ∈ N. Boolean formulas are oneway of defining Boolean functions. We assume to have a set of variables x1, . . . , xn. Aliteral is a variable or the negation of a variable. A term is a conjunction of literals, but itnever contains both a variable and its negation. We consider formulas ϕ which are Booleancombinations of terms t1, . . . , tk. For b1, . . . , bn ∈ B we let ϕ(b1, . . . , bn) be the value obtainedwhen interpreting the variable xi by bi. This way, every formula ϕ defines a Boolean functionfϕ : b1 · · · bn 7→ ϕ(b1, . . . , bn). Suppose the teacher knows a Boolean function fϕ for someBoolean combination ϕ of terms t1, . . . , tk and the learner wants to learn the function fϕ.The learner can ask two kinds of queries.

• He can ask the teacher for the value fϕ(b1, . . . , bn) for bi ∈ B.

31

2. Recognizable Sets

• Or he can ask whether g = fϕ for some Boolean function g. We do not care about thepresentation of g at this point. If g 6= fϕ, then the teacher provides b1 · · · bn ∈ Bn withg(b1, . . . , bn) 6= fϕ(b1, . . . , bn).

We present Kushilevitz’s reduction to Angluin’s L∗-algorithm [?]. To this end, we representg : Bn → B as the language

Lg = { b1 · · · bn ∈ Bn | g(b1, . . . , bn) = 1 } ⊆ B∗.

If g = fϕ, then we write Lϕ instead of Lfϕ . Now, the learner wants to learn the minimalautomaton of Lϕ. This uniquely defines fϕ. Membership queries and equivalence queriesimmediately translate into queries of the above type. Note that the answer to membershipqueries for u ∈ B∗ \ Bn is always no and that the learner can provide counter-examples forhimself if the current language is not contained in Bn.

Next, we give an automaton Aϕ = (Q, ·, q0, F ) for the language Lϕ. Let Q be the disjointunion {0, . . . , n} × 2{1,...,k} and {qsink}. The semantics of a state (i, R) ∈ Q is that theautomaton has read i bits and that the terms in R are not yet false. The state qsink forprocessing more than n bits. The initial state is q0 = (0, {1, . . . , k}) and the final states Fare those pairs (n,R) such ϕ is true when setting ti = 1 if i ∈ R. The transition function isdefined by

(i− 1, R) · a =

{(i, R \ { j | tj contains the literal xi }

)if a = 0(

i, R \ { j | tj contains the negation of xi })

if a = 1

and qsink ·a = qsink for all a ∈ B. The automaton Aϕ accepts Lϕ and it has (n+1)2k+1 states.By Angluin’s L∗-algorithm, the number queries is polynomial in n and 2k. In particular, ifk ∈ O(log n), then the resulting algorithm requires only polynomially many queries.

32

3. Regular Languages

Kleene’s Theorem states that, for free monoids over finite alphabets, the rational and therecognizable languages coincide. It is therefore common to refer to this language class as theregular languages. Rational sets have some closure properties and recognizable sets have otherclosure properties, see Table 3.1. An important consequence of Kleene’s Theorem is thatregular languages have both the closure properties of rationals sets and the closure propertiesof recognizable sets.

3.1. McKnight’s Theorem

Essentially, McKnight’s Theorem gives the first half of Kleene’s Theorem. Every finitedeterministic M -automaton is also a finite nondeterministic M -automaton. This does notimmediately show that all recognizable sets are rational since the automaton model forrecognizability relies on a slightly different notion of finiteness, namely on a finite numberof states. On the other hand, finite nondeterministic automata have a finite number oftransitions. We have already seen that for finitely generated monoids these two aspects offiniteness coincide. This observations yields the following theorem.

Theorem 3.1 (McKnight). Let M be a monoid. Then M is finitely generated if and only ifREC(M) ⊆ RAT(M).

Proof. Suppose M is finitely generated. If L ∈ REC(M), then by Corollary 2.8 the set L isaccepted by a finite M -automaton. Theorem 1.7 then yields L ∈ RAT(M).

For the other inclusion let REC(M) ⊆ RAT(M). Since M ∈ REC(M), this yieldsM ∈ RAT(M). Now, by Proposition 1.1 the monoid M is finitely generated.

3.2. The powerset construction and Kleene’s Theorem

We will give two different proofs of Kleene’s Theorem. The first proof relies on the so-calledpowerset construction for nondeterministic automata whereas the second proof uses animplementation of nondeterministic automata in terms of Boolean matrices. This secondproof will be given in the next section.

Here, we show that for every nondeterministic word automaton there exists an equivalentdeterministic automaton, the so-called powerset construction. Let Σ be an alphabet and letA = (Q, δ, I, F ) be a nondeterministic Σ∗-automaton with δ ⊆ Q× Σ×Q. We introduce atransition function on the powerset 2Q of Q as follows. For P ⊆ Q and a ∈ Σ we define

P · a = { q ∈ Q | ∃p ∈ P : (p, a, q) ∈ δ } .

That is, P · a consists of the states which are reachable from P using an a-transition. Thetransition function is extended to words by defining P · au = (P · a) · u for a ∈ Σ and u ∈ Σ∗.The powerset construction of A is P(A) = (Q′, ·, I, F ′) with states

Q′ = {P ⊆ Q | ∃u ∈ Σ∗ : I · u = P }

and final states F ′ = {P ∈ Q′ | P ∩ F 6= ∅ }. This means that the states of P(A) are thereachable subsets of Q and the final states are those which contain some final state of A.

33

3. Regular Languages

Proposition 3.2. Let Σ be an alphabet and let A = (Q, δ, I, F ) be a nondeterministic Σ∗-automaton with δ ⊆ Q × Σ × Q. Then the powerset construction P(A) is a deterministicΣ∗-automaton with L(P(A)) = L(A).

Proof. Let P(A) = (Q′, ·, I, F ′). Since the first letter in every nonempty word is unique, thetransition function · : Q′×Σ∗ → Q′ is well-defined. Moreover, P ·ε = P and P ·uv = (P ·u) ·v.Thus P(A) is a Σ∗-automaton. By induction on the length of u we show

I · u ={q ∈ Q

∣∣ ∃q0 ∈ I : q0u q

}.

For u = ε, the claim is true. Let now u = u′a for u′ ∈ Σ∗ and a ∈ Σ. By induction, I · u′ hasthe desired form. This yields

I · u = (I · u′) · a ={q′ ∈ Q

∣∣∣ ∃q0 ∈ I : q0u′ q′

}· a

={q ∈ Q

∣∣∣ ∃q0 ∈ I ∃q′ ∈ Q : q0u′ q′ and (q′, a, q) ∈ δ

}={q ∈ Q

∣∣ ∃q0 ∈ I : q0u q

}.

Therefore I · u ∈ F ′ if and only if A has an accepting run on u. This shows that P(A) and Aaccept the same language.

The proof of Proposition 3.2 would not work for arbitrary monoids M generated by Σsince the factorization of u ∈M \ {1} in terms of u = u′a with u′ ∈M and a ∈ Σ might notbe unique. Also note that we did not require that Σ are A are finite. If A has n states, thenits powerset construction P(A) has at most 2n states.

Theorem 3.3 (Kleene). For every finite alphabet Σ we have REC(Σ∗) = RAT(Σ∗).

Proof. The inclusion REC(Σ∗) ⊆ RAT(Σ∗) follows from McKnight’s Theorem 3.1. Considera language L ∈ RAT(Σ∗). By Theorem 1.7 there exists a finite letter-by-letter Σ∗-automatonA = (Q, δ, {q0} , F ) with L(A) = L. The powerset construction P(A) has finitely many states,and it satisfies L(P(A)) = L by Proposition 3.2. By the Myhill-Nerode Theorem 2.7 weconclude L ∈ REC(Σ∗).

Example 3.1. Let Σ = {a, b}. The automaton on the left is the Σ∗-automaton from Exam-ple 1.9. On the right is its powerset construction.

1 2

3

4

5

bb

aa

a

a

a a

{1}

{2}

{3, 5}

{4}

b

a

b

a

b

ba

a, b

a

Note that the empty set ∅ also occurs as a state in the powerset construction. Otherwise thepowerset construction would not be complete. 3

Example 3.2. Let Σ = {a, b} and consider the language L = Σ∗aΣn−1. The followingnondeterministic automaton with n+ 1 states accepts L.

0 1 2 3 · · · n-1 n

a, b

a a, b a, b a, b a, b a, b

34

3.3. Nondeterministic automata and Boolean matrices

The powerset construction of the above automaton has 2n states. For any word u = an · · · a1with ai ∈ Σ we have 0 u j for j ∈ {1, . . . , n} if and only if aj = a. Therefore every wordof length n uniquely determines a subset P ⊆ {1, . . . , n}. Thus the powerset constructionhas at least 2n states (given by the sets P ∪ {0}). Since 0 is contained in all states of thepowerset construction, the powerset construction cannot have more than 2n states.

Consider two distinct states P, P ′ ⊆ Q of the powerset construction. Without loss ofgenerality, there exists j ∈ P \ P ′. We have 1 ≤ j ≤ n. Thus n = j · bn−j ∈ P · bn−j , butn 6∈ P ′ · bn−j . This shows that the powerset construction yields the minimal automaton ofthe language L. 3

The above example shows that in the worst case, the powerset construction of an n-stateautomaton can have at least 2n−1 states. We refer to Example 4.6 for a proof the bound 2n

on the powerset construction of an n-state automaton is tight.

3.3. Nondeterministic automata and Boolean matrices

By the Myhill-Nerode Theorem 2.7 we have two equivalent characterizations of the recognizablelanguages, finite monoids and deterministic automata with finitely many states. The powersetconstruction translates a nondeterministic automaton with n states into a deterministicdeterministic one with up to 2n states. The transition monoid of a deterministic n-stateautomaton can have up to nn elements. Therefore, when starting with a nondeterministicn-state automaton, using the powerset construction, and building its transition monoid, thenthe resulting upper bound for the size of this monoid is (2n)2

n= 2n2

n. In this section we

show that, instead of this doubly exponential bound, the singly exponential bound 2n2

issufficient.

There are two different views on the following result. First, it can be seen as a generalizationof transition monoids to nondeterministic automata. And second, it can be interpreted as animplementation of nondeterministic automata using Boolean matrices.

Theorem 3.4. Let Σ be an alphabet. If L ⊆ Σ∗ is accepted by a nondeterministic Σ∗-automaton A = (Q, δ, I, F ) with |Q| = n and δ ⊆ Q × Σ × Q, then L is recognized by amonoid with at most 2n

2elements.

Proof. We assume Q = { 1, . . . , n }. Let B = {1, 0} denote the Boolean lattice. The n × nBoolean matrices are denoted by Bn×n. There are 2n

2Boolean matrices. The entry of the

matrix A ∈ Bn×n in row i and column j is written as Aij . For every letter a ∈ Σ we definethe matrix Aa ∈ Bn×n by

Aaij = 1 ⇔ (i, a, j) ∈ δ.

This yields the homomorphism ψ : Σ∗ → Bn×n with

ψ(a1 · · · an) = Aa1 · · ·Aan

for letters ai ∈ Σ. The term Aa1 · · ·Aan denotes the matrix product of Aa1 , . . . , Aan in whichwe use the bitwise OR (denoted by ∨) for addition, and the bitwise AND (denoted by ∧) formultiplication. Suppose ψ(u) = B. We claim that

Bij = 1 ⇔ i u j.

The proof of the claim is by induction on the length of u. If u = ε, then B is the identitymatrix and the claim is true since A has no ε-transitions. Let now u = u′a for a ∈ Σ. Byinduction, we can assume that the matrix B′ = ψ(u′) has the desired property for u′. We

35

3. Regular Languages

have B = B′Aa and thus

Bij = 1 ⇔n∨k=1

(B′ik ∧Aakj

)⇔ i u

′k and (k, a, j) ∈ δ for some k ∈ {1, . . . , n}

⇔ i u j.

This proves the claim. We conclude L = ψ−1(P ) for

P ={B ∈ Bn×n

∣∣ ∃i ∈ I ∃j ∈ F : Bij = 1}.

This shows that ψ recognizes the language L.

Several remarks regarding the proof of Theorem 3.4 are in order. First, note that whenreplacing Σ∗ by some arbitrary Σ-generated monoid M , then ψ would not be well-defined ingeneral: An element u ∈M could be written as both u = a1 · · · an and u = b1 · · · bm whichmight result in two different matrices for u. Second, instead of considering all Boolean matricesone usually obtains considerably smaller monoids by only using the image ψ(Σ∗) ⊆ Bn×n. Thethird remark is that Boolean matrices can be used as a way of implementing nondeterministicautomata; instead of handling the automaton as a labeled graph one could alternatively relyon the matrices {Aa ∈ Bn×n | a ∈ Σ } together with the initial states I and the final states F .

Example 3.3. Let Σ = {a, b}. We again consider the Σ∗-automaton from Example 1.9.

1 2

3

4

5

bb

aa

a

a

a a

We have

Aa =

0 0 0 0 00 0 1 0 10 0 1 0 10 0 1 0 10 0 0 0 0

Ab =

0 1 0 0 00 0 0 0 00 0 0 0 00 0 0 0 00 0 0 1 0

By matrix multiplication one obtains for instance the following matrices.

AbAa =

0 0 1 0 10 0 0 0 00 0 0 0 00 0 0 0 00 0 1 0 1

AaAb =

0 0 0 0 00 0 0 1 00 0 0 1 00 0 0 1 00 0 0 0 0

AbAaAaAb =

0 0 0 1 00 0 0 0 00 0 0 0 00 0 0 0 00 0 0 1 0

Note that the matrix B = AbAaAaAb was obtained by multiplication of the two matricesAbAa and AaAb. The entry 1 of the matrix B in row 1, column 4 shows that baab is acceptedby the automaton since it indicates a run from the initial state 1 to the final state 4. 3

The following example shows that the bound 2n2

in Theorem 3.4 is tight.

Example 3.4. Let Σ = {a, b, c, d}. We extend the automaton in Example 2.5 by somenondeterministic action d. This leads to the following n-state automaton A.

36

3.4. The relation between rational and recognizable sets

1

23

n-1n

················

a, b, c, d

aa

aa

a

b

d

c, d

b, c, d

b, c, d

b, c, d

Let L = L(A). The automaton A generates all binary relations R ⊆ {1, . . . , n}2; that is, forevery such relation R there exists a word uR ∈ Σ∗ such that i

uR j if and only if (i, j) ∈ R.There are 2n

2binary relations on {1, . . . , n}2. Therefore, the monoid M constructed in the

proof of Theorem 3.4 has size 2n2. For showing that this bound cannot be improved, we

verify that the syntactic congruence ≡L of the language L satisfies uR 6≡L uS for all differentrelations R,S ⊆ {1, . . . , n}2. Without loss of generality, we can assume (i, j) ∈ R \ S. Thenai−1uRa

j ∈ L but ai−1uS aj 6∈ L. Hence uR 6≡L uS as desired. 3

Remark 3.1. An immediate consequence of Theorem 3.4 is another proof of Kleene’s Theo-rem 3.3, which states that REC(Σ∗) = RAT(Σ∗) for finite alphabets Σ. The inclusion fromleft to right is McKnight’s Theorem 3.1. Let now L ∈ RAT(Σ∗). By Theorem 1.7 there existsa finite letter-by-letter automaton accepting L. By Theorem 3.4, the language L is recognizedby a finite monoid, i.e., we have L ∈ REC(Σ∗). 3

3.4. The relation between rational and recognizable sets

We have more tools at hand for proving that sets are not recognizable than for proving thatsets are not rational. For showing that a given set is not recognizable it suffices to show thatits syntactic monoid is infinite. Kleene’s Theorem therefore also yields an additional methodfor showing that some given subset (of a finitely generated free monoid) is not rational, simplyby showing that it is not recognizable. We apply this technique in the next example forshowing that, in general, rationals sets are not closed under intersection.

Example 3.5. Consider the monoid M = {a, b}∗× c∗. We define the following rational subsetsof M :

K1 = {(a, c), (b, ε)}∗

K2 = {(a, ε), (b, c)}∗ .

The set K1 consists of the elements (u, cn) such that the word u ∈ {a, b}∗ has exactly noccurrences of the letter a. Similarly, K2 contains those elements (u, cn) such that u hasexactly n occurrences of the letter b. We want to show that K1 ∩K2 is not rational. Bycontradiction, we assume that K1 ∩K2 is rational. We have

K1 ∩K2 = { (u, cn) | u has n occurrences of the letter a and n occurrences of b } .

Consider the homomorphism ψ : M → {a, b}∗ defined by ψ(u, cn) = u. Since rational sets areclosed under homomorphic images (Proposition 1.3), the language L = ψ(K1 ∩K2) ⊆ {a, b}∗is rational. Note that

L = {u ∈ {a, b}∗ | u has the same number of a’s as b’s } .

37

3. Regular Languages

rational sets recognizable sets regular languages

union: " (by definition) " (Theorem 2.3) "

intersection: # (Example 3.5) " (Theorem 2.3) "

complement: # (Example 3.5) " (Theorem 2.3) "

residuals: # (Example 1.4) " (Theorem 2.3) "

inverse homomorphisms: # (Example 1.3) " (Theorem 2.3) "

homomorphisms: " (Proposition 1.3) # (Example 2.1) "

product: " (by definition) # (Example 2.2) "

Kleene star: " (by definition) # (Example 2.2) "

Table 3.1.: Closure properties of rational sets, recognizable sets, and regular languages.

The syntactic monoid of L is infinite. For instance, we have am 6≡L an for m 6= n sinceambm ∈ L and anbm 6∈ L. This shows that L is not recognizable, and by Kleene’s Theorem Lis not rational. This is a contradiction. Therefore, the assumption is wrong and K1 ∩K2 isnot rational. In particular, RAT(M) is not closed under intersection. Moreover, RAT(M)cannot be closed under complement since otherwise by DeMorgan’s law it would also beclosed under intersection (by definition RAT(M) is closed unter union). 3

We have seen in Example 1.3 that in general RAT(M) is not closed under inverse homo-morphism. The example used non-finitely generated monoids. The following example showsthat RAT(M) is not closed under inverse homomorphisms even if M is finitely generated.

Example 3.6. Let M = N× N with componentwise addition and consider the set {(1, 1)}∗ ={ (n, n) | n ∈ N }. The mapping ψ : {a, b}∗ → N with ψ(a) = (1, 0) and ψ(b) = (0, 1) definesa homomorphism. We have ψ−1(L) = {u ∈ {a, b}∗ | u has the same number of a’s as b’s }which is not rational (as we have seen in Example 3.5. 3

Table 3.1 summarizes the closure properties of rational sets, recognizable sets, and regularlanguages. By Kleene’s Theorem, the regular languages have the closure properties of boththe rational and the recognizable sets.

For regular languages we can combine Figures 1.1 and 2.1 into Figure 3.1. The terms afterthe names of the constructions indicate the maximal blop-up relative to the number of states,the size of expressions, and the size of monoids. As in Figure 1.1, we assume that the set ofgenerators Σ is fixed.

Recognizable and rational sets in direct products

Suppose M = M1 ×M2. We are interested in the relation between REC(M) and REC(Mi)for i ∈ {1, 2}. As it turns out, REC(M) is fully determined by REC(M1) and REC(M2).

Theorem 3.5 (Mezei). Let M1 and M2 be monoids. Then L ∈ REC(M1 ×M2) if and onlyif L is a finite union of sets of the form K1 ×K2 for K1 ∈ REC(M1) and K2 ∈ REC(M2).

Proof. ⇒: Let ϕ : M1×M2 → N be a homomorphism to a finite monoid N which recognizesthe set L. We define homomorphisms ψ1 : M1 → N and ψ2 : M2 → N by ψ1(m1) = ϕ(m1, 1)and ψ2(m2) = ϕ(1,m2). This yields the homomorphism ψ : M1 × M2 → N × N withψ(m1,m2) = (ψ1(m1), ψ2(m2)). The homomorphism ψ recognizes L since the set P ={ (n1, n2) ∈ N ×N | n1n2 ∈ ϕ(L) } satisfies

ψ−1(P ) = { (m1,m2) | ψ(m1,m2) ∈ P }= { (m1,m2) | ψ1(m1)ψ2(m2) ∈ ϕ(L) }= { (m1,m2) | ϕ(m1, 1)ϕ(1,m2) ∈ ϕ(L) }= { (m1,m2) | ϕ(m1,m2) ∈ ϕ(L) } = ϕ−1

(ϕ(L)

)= L.

38

3.4. The relation between rational and recognizable sets

rational expressionover Σ

nondeterministic automatonwith labels in Σ ∪ {1}

letter-by-letterautomaton for Σ

Thompson construction, O(n)

state elimination, 2O(n)

state elimination, 2 O

(n)

rem

oval

of

ε-tr

ansit

ions,ntri

vial

automatonmonoid

minimal automatonsyntactic monoid

transition monoid, nn

Cayley automaton, n

transition monoid, nn

Cayleyautomaton, n

Theorem 2.5trivialTheorem 2.4 trivial

powerset construction, 2 n

trivialBoo

lean

mat

rices

, 2n2

Figure 3.1.: Transformations between descriptions of regular languages.

39

3. Regular Languages

We conclude

L = ψ−1(ψ(L)

)=

⋃(n1,n2)∈ψ(L)

ψ−1(n1, n2) =⋃

(n1,n2)∈ψ(L)

ψ−11 (n1)× ψ−12 (n2)

showing that L has the desired form.

⇐: Let L = K1 ×K2, and for i ∈ {1, 2} let ϕi : Mi → Ni to a finite monoid Ni whichrecognizes Ki. We define ψ : M1 ×M2 → N1 ×N2 by ψ(m1,m2) = (ϕ1(m1), ϕ2(m2)). Then

ψ−1(ψ(L)

)= ψ−1

(ϕ1(K1)× ϕ2(K2)

)= ϕ−11

(ϕ1(K1)

)× ϕ−12

(ϕ2(K2)

)= K1 ×K2 = L.

Thus L ∈ REC(M1 ×M2). The implication from right to left now follows from the fact thatREC(M1 ×M2) is closed under union.

The following example shows that the implication from left to right in Mezei’s Theorem 3.5becomes false when replacing REC by RAT.

Example 3.7. Consider M = a∗ × b∗. The subset L = (a, b)∗ = { (an, bn) | n ≥ 0 } is rational.Assume L can be written as a finite union of sets of the form K1 ×K2 with K1 ∈ RAT(a∗)and K2 ∈ RAT(b∗). By Kleene’s Theorem, the languages Ki are recognizable and by Mezei’sTheorem L is recognizable. This is a contradiction since the syntactic monoid of L is infinite(and isomorphic to Z). Therefore, L has no presentation of the above form. 3

The implication from right to left in Mezei’s Theorem can be extended to rational sets.

Proposition 3.6. Let M1 and M2 be monoids. If K1 ∈ RAT(M1) and K2 ∈ RAT (M2),then K1 ×K2 ∈ RAT(M1 ×M2).

Proof. Let si be a rational expression for Ki. We replace every atom a ∈ M1 in theexpression s1 by (a, 1). This yields the expression t1. Symmetrically, we replace every atomb ∈ M2 in the expression s2 by (1, b) which yields t2. We have u ∈ L(s1) if and only if(u, 1) ∈ L(t1) ⊆ M1 ×M2. Similarly, v ∈ L(s2) if and only if (1, v) ∈ L(t2). Thus theconcatenation t1t2 defines K1 ×K2.

Even though REC(M) is not closed under product in general, the following consequence ofMezei’s Theorem shows that direct products preserve this closure property.

Proposition 3.7. Let M1 and M2 be monoids such that both REC(M1) and REC(M2) areclosed under product. Then REC(M1 ×M2) is closed under product, too.

Proof. Let K,L ∈ REC(M1 ×M2). By Mezei’s Theorem, we can write K =⋃kK1,k ×K2,k

and L =⋃` L1,` × L2,` for languages Ki,k, Li,` ∈ REC(Mi) and each of the unions is finite.

Since REC(Mi) is closed under product, we have Ki,kLi,` ∈ REC(Mi). Again by Mezei’sTheorem we see that the products (K1,k ×K2,k)(L1,` × L2,`) = K1,kL1,` ×K2,kL2,` are inREC(M1 ×M2). This yields

KL =⋃k,`

(K1,kL1,` ×K2,kL2,`

)∈ REC(M1 ×M2).

In particular, by Kleene’s Theorem and Proposition 3.7, for finite alphabets Σi we see thatREC(Σ∗1 × · · · × Σ∗n) is closed under product. We note that if both REC(M1) and REC(M2)are closed under Kleene star, then REC(M1 ×M2) does not have to be closed under Kleenestar. For example, we have {(a, b)} ∈ REC(a∗ × b∗) but (a, b)∗ is not recognizable.

40

3.4. The relation between rational and recognizable sets

Intersections of rational and reconizable sets

We have seen in Example 3.5 that rational sets are not closed under intersection. Under thisaspect, it is surprising that the intersection of a rational set and a recognizable set is rationalagain.

Proposition 3.8. Let M be a monoid, let K ∈ RAT(M), and let L ∈ REC(M). ThenK ∩ L ∈ RAT(M).

Proof. By Proposition 1.1, the set K is contained in a finitely generated submonoid M ′. LetM ′ ⊆M be generated by the finite set Σ. The inclusion Σ ⊆M ′ induces a homomorphismψ : Σ∗ → M (with ψ(a) = a for all a ∈ Σ). Note that Σ∗ denotes the free monoid over Σ.Since the restriction ψ : Σ∗ →M ′ is surjective, by Proposition 1.3 there exists K ′ ∈ RAT(Σ∗)with ψ(K ′) = K. Let L′ = ψ−1(L). Theorem 2.3 shows L′ ∈ REC(Σ∗). By Kleene’s Theorem,both K ′ and L′ are regular. Hence K ′ ∩ L′ ∈ RAT(Σ∗), and by Proposition 1.3 we obtainψ(K ′ ∩ L′) ∈ RAT(M). We obtain

ψ(K ′ ∩ L′) = ψ(K ′ ∩ ψ−1(L)

)= ψ

(K ′ ∩ ψ−1(L ∩M ′)

)= ψ(K ′) ∩ (L ∩M ′)= K ∩ L ∩M ′ = K ∩ L.

The third equality holds since the restriction ψ : Σ∗ →M ′ is surjective.

41

4. Algorithmic Properties of Automata

Regular languages have a great number of important closure properties. One reason forthe success of the regular languages is that these closure properties are effective. Thismeans that for given automata one can actually compute an automaton for the union (resp.intersection, complement, residual, homomorphism, inverse homomorphism, concatenation,Kleene star). In addition, for many decision problems such as emptiness, universality,inclusion, or equivalence there exist efficient algorithms when the input languages are givenas deterministic automata.

As operations in an automaton (Q, δ, I, F ) with δ ⊆ Q× Σ×Q we consider membershipqueries of the form “Is state q in F ?” or “Is (p, a, q) a transition in δ?”. Moreover, we assumethat we can efficiently enumerate all states in Q, I, or F as well as all transitions in δ. In thecase of deterministic automata, the computation of the transition function q · a is a typicaloperation.

In some situations it is important that automata are “deadlock-free”. This is formalizedby the following notion. An automaton A = (Q, δ, I, F ) with δ ⊆ Q× Σ×Q is complete iffor every state p ∈ Q and every letter a ∈ Σ there exists a transition of the form (p, a, q) ∈ δ,i.e., we always have an outgoing transition for every letter. An automaton can always bemade complete by adding a new state and redirecting all missing transtions to this new state.By definition, deterministic automata are complete.

Example 4.1. Let A be the following {a, b}∗-automaton for L = b(ab)∗.

q0

q1 q2

q5

q4q3

a

b

a

b

ab

ab

It is not complete since, for instance, there is no outgoing a-transition in state q1. An equivalentcomplete automaton can be obtained by adding an new sink state ⊥ and introducing additionaltransitions as depicted below.

⊥q0

q1 q2

q5

q4q3

a

b

a

b

a b

ab

a b

b a

a, b

This complete automaton contains several “unproductive” states which do not contribute toaccepting runs. In particular the state ⊥ is unproductive in this sense. 3

42

4.1. Boolean operations

The counterpart of being complete is being trim. A state which is not reachable from aninitial state or which cannot reach a final state cannot occur on an accepting run. Thusremoving such a state does not change the accepted language. This leads to the followingdefinition. A Σ∗-automaton A = (Q, δ, I, F ) is trim if for every state p there exist wordsu, v ∈ Σ∗, an initial state i ∈ I and a final state f ∈ F such that i u p v f . Everyautomaton can be trimmed by removing all states which either are unreachable from aninitial state or which cannot reach a final state.

Example 4.2. The automaton A in Example 4.1 is not trim. When trimming A (or equivalentlyits complete counterpart) we obtain the following automaton.

q0 q3 q4

b

a

b

Note that this automaton is not complete. 3

The automaton A in Example 4.1 is neither complete nor trim. Of course there also existautomata which are both complete and trim at the same time. The simplest example is theone state automaton for the language L = Σ∗.

4.1. Boolean operations

Complementing automata

Complementing an automaton over Σ∗ is the computation of the following function.

Input: A finite Σ∗-automaton A.

Output: A finite automaton B with L(B) = Σ∗ \ L(A).

Suppose the given automaton A = (Q, ·, q0, F ) is deterministic. Its complement automa-ton A is defined as

A = (Q, ·, q0, Q \ F ),

and we have L(A) = Σ∗ \L(A). Note that this construction works for deterministic automataover arbitrary monoids. Depending on the implementation of A, computing the complementautomaton usually is linear (or even constant with an adequate data structure for sets ofstates; for instance, one could use an additional bit indicating whether or not to invert theoutput of a membership query).

In general, RAT(M) is not closed under complement. Therefore, this is a strong indicationthat when A is nondeterministic, then applying Kleene’s Theorem in some form cannot reallybe avoided. In particular, for complementing a nondeterministic automaton over Σ∗, wecompute the complement automaton of the powerset construction of A. The following exampleshows that simply interchanging final and non-final states does not work for nondeterministicautomata.

Example 4.3. Consider the following two nondeterministic a∗-automata.

a

a

A :a

a

B :

The automaton B is obtain from A by interchanging final and non-final states. We haveL(A) = a∗a and L(B) = a∗. In particular, L(A) is a subset of L(B) whereas the complementis a∗ \ L(A) = {ε}. 3

43

4. Algorithmic Properties of Automata

Complementing the powerset construction can yield a deterministic automaton of expo-nential size. The following example is due to Holzer and Kutrib [?]; it shows that even fornondeterministic outputs, one cannot avoid the exponential blow-up.

Example 4.4. Let Σ = {a, b}. We show that for every n ≥ 1 there exists a language L ⊆ Σ∗

accepted by a nondeterministic automaton with n+ 2 states such that every letter-by-letterautomaton for the complement of L has at least 2n − 1 states. Let L = Σ∗aΣn−1bΣ∗. Thelanguage L is accepted by the following nondeterministic automaton with n+ 2 states.

0 1 2 · · · n-1 n n+1

Σ

a Σ Σ Σ Σ b

Σ

Let B = (Q, δ, I, F ) be a nondeterministic automaton with δ ⊆ Q× Σ×Q accepting Σ∗ \ L.We consider words in K = Σn \ {bn}. For every word u ∈ K we have uu ∈ L(B). Thus forevery word u ∈ K there exist states q0,u ∈ I, qu ∈ Q, and q1,u ∈ F such that q0,u

u quu q1,u.

Suppose qu = qv for two different words u, v ∈ K. By looking at the first difference betweenu and v, we see that there exists a position i such that one of the following holds:

(a) Either the i-th letter in u is a and the i-th letter of v is b,

(b) or the i-th letter in u is b and the i-th letter of v is a.

We can assume that (a) holds (since otherwise we interchange u and v). In particular, uv ∈ L.We have q0,u

u qu = qvv q1,v and thus uv ∈ L(B). This is a contradiction. Therefore, for

different words u, v ∈ K we have qu 6= qv, which yields

2n − 1 = |K| = |{ qu ∈ Q | u ∈ K }| ≤ |Q| .

This also shows that the minimal deterministic automaton of L has at least 2n − 1 statessince the minimal automaton of L and the minimal automaton of its complement Σ∗ \L onlydiffer in the final states. 3

The union operation for automata

We consider the following problem for automata over Σ∗.

Input: Two finite Σ∗-automata A1 and A2.

Output: A finite automaton B with L(B) = L(A1) ∪ L(A2).

Let A1 = (Q1, δ1, I1, F1) and A2 = (Q2, δ2, I2, F2) with Q1 ∩Q2 = ∅. There are two differentconstructions for the union of L(A1) and L(A2). The first construction is the union automaton

A1 ∪ A2 = (Q1 ∪Q2, δ1 ∪ δ2, I1 ∪ I2, F1 ∪ F2).

We trivially have L(A1 ∪A2) = L(A1)∪L(A2). With |Q1|+ |Q1| states, the size of the unionautomaton is linear. Also note that the union automaton construction works for automataover arbitrary monoids. The union automaton A1 ∪ A2 is nondeterministic even if A1 andA2 are deterministic.

Next, we consider product automata. This construction yields bigger automata but itpreserves determinism. Moreover, by an appropriate choice of the final states it can be usedfor other Boolean operations, too. If δi ⊆ Qi × Σ×Qi for i ∈ {1, 2}, then one can also usethe product automaton

A1 ×A2 = (Q1 ×Q2, δ, I1 × I2, F )

where((p1, p2), a, (q1, q2)

)∈ δ if both (p1, a, q1) ∈ δ1 and (p2, a, q2) ∈ δ2. It is often convenient

to write a state (p, q) ∈ Q1 ×Q2 as[pq

]. There exists a run[

p1q1

]a1[p2q2

]a2[p3q3

]a3 · · · an−1

[pnqn

]an[pn+1

qn+1

]

44

4.2. Homomorphisms and inverse homomorphisms

in the product automaton A1 ×A2 if and only if there exist runs

p1a1 p2

a2 p3a3 · · · an−1 pn

an pn+1 in A1,

q1a1 q2

a2 q3a3 · · · an−1 qn

an qn+1 in A2.

For the union L(A1) ∪ L(A2) of complete automata A1 and A2, we can use the productautomaton construction with final states F = (F1 ×Q2) ∪ (Q1 × F2). If A1 = (Q1, ·, q01, F1)and A2 = (Q2, ·, q02, F2) are given as deterministic automata, then the transistion function· : Q1 × Q2 × Σ∗ → Q1 × Q2 of the product automaton is defined componentwise by(p, q) · u = (p · u, q · u), and the initial state is (q01, q02). The product automaton constructionover nondeterministic automata only works for free monoids whereas over deterministicautomata, one can apply this construction for arbitrary monoids. The size of the productautomaton is quadratic. As usual, it suffices to consider reachable states, only. This canresult in smaller automata.

Intersections of regular languages

There exist monoids M such that the rational subsets of M are not closed under intersection,see Example 3.5. Therefore, we only consider the intersection of regular languages. Theintersection problem is the following.

Input: Two finite Σ∗-automata A1 and A2.

Output: A finite automaton B with L(B) = L(A1) ∩ L(A2).

Let Ai = (Qi, δi, Ii, Fi) with δi ⊆ Qi×Σ×Qi. Let B be the product automaton A1×A2 withfinal states F1 × F2. Then L(B) accepts L(A1) ∩ L(A2), even if A1 and A2 are not complete.Note that in the case of deterministic automata A1 and A2, the product automaton A1 ×A2

can accept any Boolean combination of L(A1) and L(A2), depending on the choice of thefinal states.

4.2. Homomorphisms and inverse homomorphisms

Homomorphic images

The computation of homomorphic images of automata is the following problem.

Input: A finite Σ∗-automaton A and a homomorphism ψ : Σ∗ → Γ∗.

Output: A finite Γ∗-automaton B with L(B) = ψ(L(A)).

As usual, ψ(L) = {ψ(u) | u ∈ L }. We mimic the proof of Proposition 1.3. For a nondeter-ministic automaton A = (Q, δ, I, F ) we set B = (Q, δ′, I, F ) with

δ′ ={ (p, ψ(u), q

) ∣∣ (p, u, q) ∈ δ}.

Now, if q1u1q2 · · ·unqn+1 is a run of A on u if and only if q1 ψ(u1) q2 · · ·ψ(un) qn+1 is a runof B on ψ(u). Thus L(B) = ψ(L(A)) as desired. If δ ⊆ Q × Σ × Q, then, by replacing asingle transition by many as in the proof of Theorem 1.6, we obtain an automaton B withtransition relation δ ⊆ Q× Γ×Q. If n = |Q| and m = max |ψ(a)|a ∈ Σ, then B has at mostnm states. If A is deterministic, then B does not have to be deterministic. Moreover, even ifψ(a) ∈ Γ for all a ∈ Σ, the minimal automaton of ψ(L(A)) for a deterministice automaton Acan have exponential size.

Example 4.5. Let Γ = {a, b}, let Σ = {a, b, c}, and consider the following deterministicΣ∗-automaton A with n+ 2 states.

45

4. Algorithmic Properties of Automata

0 1 2 3 · · · n-1 n n+1

b, c

a Σ Σ Σ Σ Σ Σ

Σ

Let ψ : Σ∗ → Γ∗ be the homomorphism defined by ψ(a) = ψ(c) = a and ψ(b) = b. Thenψ(L(A)) ⊆ {a, b}∗ is the language from Example 3.2. Since its minimal deterministicautomaton has 2n states, this shows that one cannot avoid the exponential blow-up whencomputing deterministic automata for homomorphic images, even if the input automaton isdeterministic. 3

Inverse homomorphisms

We consider the following problem.

Input: A finite Σ∗-automaton A and a homomorphism ψ : Γ∗ → Σ∗.

Output: A finite Γ∗-automaton B with L(B) = ψ−1(L(A)).

As usual, ψ−1(L) = {u | ψ(u) ∈ L }. Let A = (Q, δ, I, F ) be a nondeterministic automatonwith δ ⊆ Q× Σ×Q. We set B = (Q, δ′, I, F ) with

δ′ =

{(p, a, q) ∈ Q× Γ×Q

∣∣∣∣ p ψ(a)q in A

}.

Let u = a1 · · · an with ai ∈ Γ. There exists a run q1a1q2 · · · anqn+1 in B if and only if we have

q1ψ(a1)

q2ψ(a2) · · · qn

ψ(a2)qn+1

in A. The latter condition is equivalent to q1ψ(u)

qn+1 in A. Note that the implicationfrom right to left of this equivalence would not hold if we had labels of length greater than 1;moreover, this implication does not hold for automata over arbitrary monoids. We concludethat L(B) = ψ−1(L(A)) as desired.

If A = (Q, ·, q0, F ) is a deterministic Σ∗-automaton, then translating the above constructionfrom transition relations to transition functions yields the Γ∗-automaton B = (Q, ◦, q0, F )with q ◦ u = q · ψ(u). In particular, B is deterministic, too.

4.3. Residuals and quotients

We have already seen in Theorem 2.3 that regular languages are closed under residuals.Quotients are generalizations of residuals. In this section we give constructions on automatafor computing residuals and quotients.

Residuals

The languages u−1L = { v ∈ Σ∗ | uv ∈ L } is called the left residual of L ⊆ Σ∗ by the wordu ∈ Σ∗. Symmetrically, Lu−1 = { v ∈ Σ∗ | vu ∈ L } is a right residual. We say that K is aresidual of L if it is either a left or a right residual. Computing residuals is the followingproblem.

Input: A finite Σ∗-automaton A and a word u ∈ Σ∗.

Output: A finite Σ∗-automaton B with L(B) = u−1L(A) in the case of left residuals (orwith L(B) = L(A)u−1 in the case of right residuals).

We first consider left residuals. If A = (Q, δ, I, F ) is a nondeterministic automaton withδ ⊆ Q× Σ×Q, we set B = (Q, δ, I ′, F ) with

I ′ ={q ∈ Q

∣∣ ∃p ∈ I : p u q}.

46

4.3. Residuals and quotients

Let r ∈ Q and v ∈ Σ∗. Then the following conditions are equivalent:

• There exists q ∈ I ′ with q v r.

• There exist q ∈ I ′ and p ∈ I with p u q v r.

• There exists p ∈ I with p uv r.

This shows v ∈ L(B) if and only if uv ∈ L(A) and thus L(B) = u−1L(A). If A = (Q, ·, q0, F )is deterministic, then the above constructions yields the automaton B = (Q, ·, q′0, F ) withinitial state q′0 = q0 · u.

Next we consider right residuals. Suppose A = (Q, δ, I, F ) is a nondeterministic automatonwith δ ⊆ Q× Σ×Q. In this case we define B = (Q, δ, I, F ′) with

F ′ ={p ∈ Q

∣∣ ∃q ∈ F : p u q}.

By a similar reasoning as for left residuals we see that L(B) = L(A)u−1. In summary, theconstructions for residuals are obtained by adapting initial and final states; moreover, if A isdeterministic, then B is deterministic, too.

Quotients

The left quotient of a language L ⊆ Σ∗ by K ⊆ Σ∗ is

K−1L = { v ∈ Σ∗ | uv ∈ L for some u ∈ K } .

Thus, quotients are generalizations of residuals. Right quotients LK−1 are defined symmet-rically, that is LK−1 = {u ∈ Σ∗ | uv ∈ L for some v ∈ K }. We say that a language L′ is aquotient of L if it is either a left or a right quotient of L. We have

K−1L =⋃u∈K

u−1L. (4.1)

Every regular language has only finitely many left residuals (and symmetrically it has only afinite number of right residuals) since all residuals are accepted by the same automaton withvarying initial states. Thus, if L is regular, then the union in Equation (4.1) is finite. Thisshows that for a regular language L and an arbitrary language K the quotient K−1L (andsymmetrically LK−1) is regular. This does not mean that we can always compute K−1Lsince the condition “u ∈ K ” in Equation (4.1) might be undecidable. As an intermediatestep between left residuals and arbitrary left quotients we consider the following problem.

Input: Finite Σ∗-automata A1 and A2.

Output: A finite Σ∗-automaton B with L(B) = L(A2)−1L(A1).

Let Ai = (Qi, δi, Ii, Fi) with δi ⊆ Qi × Σ ×Qi. Consider the product automaton A1 × A2.Using

I ′ = { q ∈ Q1 | ∃f ∈ F2 : (q, f) is reachable in A1 ×A2 }

as the initial states we obtain the automaton B = (Q1, δ1, I′, F1), i.e., B is obtained from

A1 by modifying the initial states and the product automaton A1 ×A2 was only used forcomputing the new set of initial states. Let r ∈ Q and v ∈ Σ∗. Then the following conditionsare equivalent:

• There exists q ∈ I ′ with q v r in B.

• There exists q ∈ Q1 with q v r in A1 and there exists f ∈ F2 such that (q, f) is areachable state of the product automaton A1 ×A2.

• There exist pairs of states (q, f) ∈ Q1×F2 and (i1, i2) ∈ I1× I2 and there exists a wordu ∈ Σ∗ such that (i1, i2)

u (q, f) in A1 ×A2 and q v r in A1.

• There exists u ∈ L(A2) and there exists i1 ∈ I1 with i1uv q.

47

4. Algorithmic Properties of Automata

This shows L(B) = L(A2)−1L(A1). If A1 is deterministic, then L(B) can be nondeter-

ministic due to multiple initial states. Thus, for computing a deterministic automaton forL(A2)

−1L(A1) one might need to apply the powerset construction which for an n-stateautomaton A1 yields a deterministic automaton for the quotient L(A2)−1L(A1) with at most2n − 1 states (the empty set ∅ is no state of the powerset construction of B). The followingexample shows that this bound is tight. It is a slight modification of an example of Yu,Zhuang and Salomaa [?].

Example 4.6. Let Σ = {a, b} and Q = {1, . . . , n}. We consider the following deterministicn-state Σ∗-automaton A = (Q, ·, n, {n}) for the language L =

(b∗ ∪ aΣn−1)∗.

n

12

n-2n-1

················

a

ΣΣ

ΣΣ

Σ

b

A nondeterministic automaton for the left quotient (Σ∗)−1L of L by Σ∗ is obtained from A bymaking all states initial, i.e., we consider the nondeterministic automaton B = (Q, δ,Q, {n})with transition relation δ = { (i, u, j) ∈ Q× Σ×Q | i · u = j } such that every state of B isinitial. We claim that the powerset construction P(B) of B has 2n − 1 states and that it isthe minimal automaton of (Σ∗)−1L. Consider a word u = an · · · a1 6= bn with ai ∈ Σ andsuppose Q · u = P in the powerset construction P(B). Then we have j ∈ P if and only ifaj = a. Therefore, every nonempty subset of Q is a reachable state in P(B). Consider nowtwo distinct states P, P ′ ⊆ Q of P(B). Without loss of generality there exists j ∈ P \ P ′.Then n ∈ P · an−j whereas n 6∈ P ′ · an−j . This shows that P(B) is minimal. Its states are thenonempty subsets of Q. In particular, the minimal automaton of (Σ∗)−1L has 2n − 1 states.

We also note the interpretation of B as an automaton over {a, b, c}∗ yields a powersetconstruction with 2n states since ∅ becomes a reachable state (for instance, using the word cof length 1). Therefore, the bound 2n on the number of states in the powerset construction istight. 3

Computing automata for right quotients is easier in the sense that there is no exponentialblow-up for deterministic automata. In the remainder of this section we consider the followingproblem:

Input: Finite Σ∗-automata A1 and A2.

Output: A finite Σ∗-automaton B with L(B) = L(A1)L(A2)−1.

Let Ai = (Qi, δi, Ii, Fi) with δi ⊆ Qi×Σ×Qi. Consider the product automaton A1×A2 withstates Q1 ×Q2 (even if not all states are reachable). We say that a state (p1, p2) ∈ Q1 ×Q2

is co-reachable if there exists (f1, f2) ∈ F1 × F2 and v ∈ Σ∗ such that (p1, p2)v (f1, f2) in

A1 ×A2. Using

F ′ = { q ∈ Q1 | ∃i2 ∈ I2 : (q, i2) is co-reachable in A1 ×A2 }

as the final states we obtain the automaton B = (Q1, δ1, I1, F′). As for left quotients one

can verify that L(B) = L(A1)L(A2)−1. The main difference to left quotients is that B is

deterministic whenever A1 is deterministic.

48

4.4. Decision problems for automata

Other operations

We do not discuss algorithmic aspects of the operations concatenation and Kleene star. Thereason is that the approach suggested by the Thompson construction (followed by removal ofε-transitions and possibly determinization) yields automata of asymptotically optimal sizefor these constructions, see e.g. [?].

4.4. Decision problems for automata

The word problem

For an automaton A over Σ∗ the word problem is the following.

Input: A word u = a1 · · · an with ai ∈ Σ.

Question: Is u ∈ L(A)?

If A is deterministic, then we compute the state p = q0 · u letter by letter and then we checkwhether or not p ∈ F . We have p ∈ F if and only if u ∈ L(A). In pseudocode, the algorithmis as follows.

Algorithm 1 The word problem for a deterministic automaton on input a1 · · · an1: p← q02: for i← 1, . . . , n do p← p · ai3: if p ∈ F then return true4: else return false

The algorithm also works if Σ is the generating set of an arbitrary monoid M , and A is adeterministic M -automaton. The running time of the algorithm can be estimated by O(n)operations in the automaton A.

If A = (Q, δ, I, F ) is a nondeterministic automaton with δ ⊆ Q × Σ × Q, then we canimplicitely use the powerset construction, i.e., we set P · a = { q ∈ Q | ∃p ∈ P : (p, a, q) ∈ δ }.The word problem for A is solved by computing P = I · u, and then checking P ∩ F 6= ∅.Thus the algorithm is as follows.

Algorithm 2 The word problem for a nondeterministic automaton on input a1 · · · an1: P ← I2: for i← 1, . . . , n do P ← P · ai3: if P ∩ F 6= ∅ then return true4: else return false

If |Q| = m, then this algorithm requires at most O(m2n) operations in the automaton A.We note that the above algorithm for nondeterministic automata does not work for arbitrarymonoids. There exist monoids M such that when given two sequences of generators a1 · · · anand b1 · · · bm it is undecidable whether a1 · · · an and b1 · · · bm represent the same elementin M . Note that some factorization of an element might yield an accepting path whereasanother factorization has no accepting path.

The emptiness problem

The emptiness problem is the following.

Input: A finite nondeterministic automaton A.

Question: Is L(A) = ∅?

49

4. Algorithmic Properties of Automata

For every initial state we check if there exists a path to some final state. This can be donein linear time, e.g., by depth-first search. We note that non-emptiness is essentially graphreachability which is complete for nondeterministic logarithmic space (nl), a complexity classwithin deterministic polynomial time. The nl-algorithm for the Σ∗-automatonA = (Q, δ, I, F )with δ ⊆ Q× Σ×Q can sketched as follows:

Algorithm 3 The emptiness problem for a nondeterministic automaton

1: nondeterministically guess some initial state p ∈ I2: while p 6∈ F do3: nondeterministically guess a ∈ Σ and q ∈ Q with (p, a, q) ∈ δ4: p← q

5: return “L(A) is nonempty”

For a deterministic automaton A = (Q, ·, q0, F ), the algorithm can easily be adapted:

Algorithm 4 The emptiness problem for a deterministic automaton1: p← q02: while p 6∈ F do3: nondeterministically guess a ∈ Σ4: p← p · a5: return “L(A) is nonempty”

Since only the graph structure of the automaton matters, each of the algorithms can alsobe applied to automata over arbitrary monoids.

The intersection of a list of deterministic automata

When proving pspace lower bounds for decision problems on automata, the following theoremis very useful.

Theorem 4.1. The following problem is pspace-complete.

Input: A list of deterministic {a, b}∗-automata A1, . . . ,A`.Question: Is L(A1) ∩ · · · ∩ L(A`) 6= ∅?

Proof. For membership in pspace we consider the following algorithm. The algorithm worksfor automata Ai over some arbitrary alphabet Σ. Let qi0 be the initial state of Ai and let Fibe its set of final states.

Algorithm 5 The intersection problem for a list of deterministic Σ∗-automata A1, . . . ,A`1: (p1, . . . , p`)← (q10, . . . , q

`0)

2: while (p1, . . . , p`) 6∈ F1 × · · · × F` do3: nondeterministically guess c ∈ Σ4: (p1, . . . , p`)← (p1 · c, . . . , p` · c)5: return “L(A1) ∩ · · · ∩ L(A`) 6= ∅”

For showing pspace-hardness we use a reduction from the word problem for linearlybounded Turing machines, a pspace-complete problem. A linearly bounded Turing machineis a nondeterministic Turing machine which, on its tape, only uses the space of the input.Let M be a linearly space bounded nondeterministic Turing machine with input alphabetΣ and tape alphabet Γ. Let Q be the states of M and let δ ⊆ Q× Γ×Q× Γ× {L,R} bethe transition relation. As usual, the letters L and R indicate the direction of the head.Suppose M has a unique initial state qinit and a unique accepting state qacc.

50

4.4. Decision problems for automata

Consider some input word u ∈ Σ∗ of length n− 1. A configuration of M can be writtenas c1 · · · cn with ci ∈ Γ ∪Q such that cj ∈ Q for exactly one index j, meaning that M is instate cj when reading symbol cj+1 at position j of the tape. Consider two configurationsC = c1 · · · cn and D = d1 · · · dn with C `M D. Then the validity of the letter di only dependson ci−1cici+1 and di−1, di+1 (where ci−1, ci+1, di−1, or di+1 might be empty), i.e., for verifyingC `M D at position i it suffices to remember a fixed number of elements in Γ ∪Q.

C

D

a p ba b′ q

· · · · · ·· · · · · ·

transition (p, b, q, b′, R)

C

D

a p bq a b′

· · · · · ·· · · · · ·

transition (p, b, q, b′, L)

By using some binary encoding of elements in Γ∪Q by words in {a, b}k for k ∈ O(log(|Σ|+|Q|)),we can write down configurations as elements of {a, b}m form = kn. We consider computationsas sequences of (encodings of) configurations C1, . . . , C` with Cj `M Cj+1 for all j < `. Theinteger ` is unknown. We use the concatenation C1 · · ·C` ∈ {a, b}∗ as the input for someautomata. Let Ai be a deterministic automaton which checks consitency at position i, thatis, the automaton Ai verifies Cj `M Cj+1 at position i for all j ≥ `. This can be done with

an automaton of polynomial size since it only has to remember some word in {a, b}3k frompositions i− 1, i, i+ 1 in Cj until reading these positions in Cj+1. This can be achieved byan automaton of polynomial size. Let Ainit be a deterministic automaton for encodings ofqinitu (Γ∪Q)∗ and let Aacc be a deterministic automaton for encodings of (Γ∪Q)∗ qacc (Γ∪Q)∗;both automata have polynomial size.

· · · · · ·c11 c1i c1nC1

· · · · · ·c21 c2i c2nC2

......

...

· · · · · ·cj1 cji cjnCj

· · · · · ·cj+11 cj+1

i cj+1nCj+1

......

...

· · · · · ·c`1 c`i c`nC`

A1 Ai An

Ainit

Aacc

The list of automata Ainit,A1, . . . ,An,Aacc has the following property:

u ∈ L(M) ⇔ there exists an accepting computation of M on input u

⇔ L(Ainit) ∩ L(A1) ∩ · · · ∩ L(An) ∩ L(Aacc) 6= ∅.

Therefore, deciding the emptiness of the intersection of Ainit,A1, . . . ,An,Aacc cannot beeasier than deciding whether u ∈ L(M).

A naive approach to solving the problem in Theorem 4.1 might be the application of theproduct automaton construction, but this would yield an automaton which is exponential inthe parameter `.

51

4. Algorithmic Properties of Automata

The universality problem for deterministic automata

The universality problem for automata over Σ∗ is defined as follows.

Input: A finite automaton A over Σ∗.

Question: Is L(A) = Σ∗?

An automaton A with L(A) = Σ∗ is called universal. If A is deterministic, then we cansolve the emptiness problem for the complement automaton, i.e., we check if there exists apath from the initial state to some non-final state. Therefore, the universality problem fordeterministic automata can be solved in nondeterministic logarithmic space (i.e. in nl) aswell as in deterministic linear time.

The universality problem for nondeterministic automata

Let A = (Q, δ, I, F ) with δ ⊆ Q × Σ × Q be a nondeterministic Σ∗-automaton. The non-universality problem on input A can be solved using the following nondeterministic polynomialspace (pspace) algorithm:

Algorithm 6 The non-universality problem for a nondeterministic automaton

1: P ← I2: while P ∩ F 6= ∅ do3: nondeterministically guess a ∈ Σ4: P ← P · a5: return “A is not universal”

Remember that P · a = { q ∈ Q | (p, a, q) ∈ δ for some p ∈ P }. This algorithm is nothingbut the nl-algorithm for non-emptiness of P(A), but polynomial space is required forwriting down some state of the powerset construction. Note that nondeterministic pspaceand deterministic pspace coincide by Savitch’s Theorem, see e.g. [?]. Thus universality(and not just non-universality) of nondeterministic automata is in pspace, even if thealphabet Σ is part of the input. The following Theorem is an intermediate step towardspspace-completeness of the universality problem for nondeterministic automata.

Theorem 4.2. The universality problem for finite nondeterministic automata over {a, b}∗ ispspace-complete.

Proof. We have seen in Algorithm 6 that the problem is in pspace. For showing pspace-hardness we use a reduction from problem in Theorem 4.1. In fact we use the emptinessrather than non-emptiness, but this problem is also pspace-hard since pspace is closedunder complemenation. Consider a list A1, . . . ,A` of deterministic {a, b}∗-automata. LetB = A1 ∪ · · · ∪ A` be the union automaton of the respective complement automata. Theautomaton B has polynomial size and it satisfies:

L(A1) ∩ · · · ∩ L(An) ∩ L(Aacc) = ∅⇔ L(A1) ∪ · · · ∪ L(A`) = {a, b}∗

⇔ L(B) = {a, b}∗ .

Note that, due to the union automaton construction, B is nondeterministic even though thecomplement automata Ai are deterministic.

We note that the algorithm above and the hardness result in Theorem 4.2 immediatelyyields pspace-completeness of the universality problem for nondeterministic automata evenif the alphabet is part of the input. This result is surprising since for deterministic automatathe problem is in nl and it is known that nl is a proper subset of pspace (we have

52

4.4. Decision problems for automata

nl ⊆ p ⊆ np ⊆ pspace, but to date only the inclusion nl ( pspace is known to bestrict). In particular, the universality problem for nondeterministic automata is provablymore difficult than the universality problem for deterministic automata.

The inclusion problem

The inclusion problem for automata over Σ∗ is defined as follows.

Input: Two finite automata A and B over Σ∗.

Question: Is L(A) ⊆ L(B)?

Suppose A and B have m and n states, respectively. By using the deterministic single-state automaton for Σ∗ as A, we see that the inclusion problem cannot be easier than theuniversality problem. If B is deterministic, then one can check for emptiness of L(A) ∩ L(B)on the product automaton A × B. This can be done using O(mn) automaton operations.If B is nondeterministic, then the problem is pspace-hard by Theorem 4.2 and the abovereduction. The following nondeterministic pspace-algorithm for non-inclusion shows thatthe problem is pspace-complete if B is allowed to be nondeterministic. Let A = (Q, δ, I, F )and let B = (Q′, δ′, I ′, F ′) such that all labels in δ and δ′ are in Σ.

Algorithm 7 The non-inclusion problem for nondeterministic automata

1: (P, P ′)← (I, I ′)2: while P ∩ F = ∅ or P ′ ∩ F ′ 6= ∅ do3: nondeterministically guess a ∈ Σ4: (P, P ′)← (P · a, P ′ · a)

5: return “L(A) 6⊆ L(B)”

Since pspace is closed under complement, this also yields an algorithm for inclusion (andnot just “non-inclusion”).

The equivalence problem

The equivalence problem for automata over Σ∗ is the following.

Input: Two finite automata A and B over Σ∗.

Question: Is L(A) = L(B)?

One can think of the equivalence problem as both directions L(A) ⊆ L(B) and L(B) ⊆ L(A).In particular, the equivalence problem is at most twice as difficult as the inclusion problem.Moreover, in the case that at least one of the automata is nondeterministic, Theorem 4.2shows that equivalence is pspace-complete; for completeness, we explicitely give the pspace-algorithm for inequivalence on automata A = (Q, δ, I, F ) and B = (Q′, δ′, I ′, F ′).

Algorithm 8 The inequivalence problem for nondeterministic automata

1: (P, P ′)← (I, I ′)2: while P ∩ F = ∅ ⇔ P ′ ∩ F ′ = ∅ do3: nondeterministically guess a ∈ Σ4: (P, P ′)← (P · a, P ′ · a)

5: return “L(A) 6= L(B)”

Naturally, the same approach leads to an nl-algorithm for deterministic automata; andthis algorithm can easily be implemented as a reachability problem on the product automatonA × B. One just hast to answer the question, whether there exists a reachable state in(F ×Q′ \ F ′) ∪ (Q \ F × F ′). If A has m states and B has n states, then the running timewould be O(mn), i.e. quadratic.

53

4. Algorithmic Properties of Automata

The Hopcroft-Karp equivalence test

In this section, we consider the “almost linear” equivalence test by Hopcroft and Karp fordeterministic automata [?]. It relies on two operations on sets: Union and Find. The idea isthat the algorithm starts with the partition where every element forms a singleton, and duringthe run of the algorithm partition classes are successively united. Then Find(x) = Find(y)holds if and only if, in the current situation, x and y belong to the same class. The operationUnion(x, y) unites the classes of x and y. Suppose A = (Q, ·, q0, F ) and B = (Q′, ·, q′0, F ′)are the given deterministic automata. Let Q = Q ∪ Q′ be the disjoint union of the statesets. The algorithm starts with the partition where every element has its own class. Thealgorithm either returns that the automata are equivalent, or it computes a word u witheither u ∈ L(A) \ L(B) or u ∈ L(B) \ L(A). Let L be a list of elements (p, p′, u) with p ∈ Q,p′ ∈ Q′ and u ∈ Σ∗.

Algorithm 9 Hopcroft-Karp equivalence test for deterministic automata

1: L ← {(q0, q′0, ε)}2: while L 6= ∅ do3: Choose (p, p′, u) ∈ L and remove this triple from L4: if Find(p) 6= Find(p′) then5: if (p, p′) ∈ (F ×Q′ \ F ′) ∪ (Q \ F × F ′) then6: return u7: else8: Union(p, p′)9: for all a ∈ Σ do add (p · a, p′ · a, ua) to L

10: return “L(A) = L(B)”

Note that the algorithm executes Union(p, p′) only for states with Find(p) 6= Find(p′).An important invariant is that all triples (p, p′, u) which at some point occur in L satisfyp = q0 · u and p′ = q′0 · u. Hence, if the algorithm returns u ∈ Σ∗, then we have u ∈(L(A) \ L(B)

)∪(L(B) \ L(A)

)and thus L(A) 6= L(B). For the correctness of the above

algorithm it therefore remains to prove the following proposition.

Proposition 4.3. If the Hopcroft-Karp equivalence test returns “L(A) = L(B)”, then theautomata A and B are equivalent.

Proof. We write Findi(p) = Findi(p′), if we have Find(p) = Find(p′) after the i-th iteration

of the while-loop. Similarly, Find(p) = Find(p′) says that this query is true at the end of thealgorithm.

Claim 1. If Find(p) = Find(p′), then Find(p · a) = Find(p′ · a) for all a ∈ Σ.

Proof of Claim 1: Suppose the contrary. Let i be minimal such that for some (p, p′, a) ∈Q×Q′×Σ we have Findi(p) = Findi(p

′) and Find(p · a) 6= Find(p′ · a). Note that i > 0 sinceotherwise p = p′ and this would contradict Find(p · a) 6= Find(p′ · a). Since Findi−1(p) 6=Findi−1(p

′), we have Findi(p) = Findi(p′) due to some instruction Union(r, s) in the i-th

iteration. We assume Findi−1(r) = Findi−1(p) and Findi−1(s) = Findi−1(p′) since otherwise

we may interchange the roles of r and s. After the operation Union(r, s) we add some element(r · a, s · a, ua) to the list L, and for all elements (q, q′, u) on the list L we eventually haveFind(q) = Find(q′). This shows Find(r · a) = Find(s · a). By choice of i we have

Find(p · a) = Find(r · a) = Find(s · a) = Find(p′ · a),

contradicting the assumption. This concludes the proof of Claim 1.

As usual, let L(q) = {u ∈ Σ∗ | q · u is a final state }. This allows us to describe an impor-tant property of the state partition computed by the algorithm.

54

4.4. Decision problems for automata

Claim 2. If Find(p) = Find(p′), then L(p) = L(p′).

Proof of Claim 2: For all states p, p′ with Find(p) = Find(p′) we show u ∈ L(p) ⇔ u ∈ L(p′)by induction on |u|. If u = ε, then this is true since we never unite final and non-final states.Let now u = au′ with a ∈ Σ and consider the states q = p · a and q′ = p′ · a. Claim 1 yieldsFind(q) = Find(q′), and by induction we see that u′ ∈ L(q) ⇔ u′ ∈ L(q′). Thus

au′ ∈ L(p) ⇔ u′ ∈ L(q) ⇔ u′ ∈ L(q′) ⇔ au′ ∈ L(p′).

This concludes the proof of Claim 2.

Since (q0, q′0, ε) is an element in the list L, we eventually have Find(q0) = Find(q′0). By

Claim 2 we obtain L(A) = L(q0) = L(q′0) = L(B) as desired.

The following proposition yields an upper bound on the number of Union and Findoperations. Suppose |Q| = m and |Q′| = n.

Proposition 4.4. The algorithm executes at most

(a) m+ n− 1 Union operations, and

(b) 1 + |Σ| (m+ n− 1) Find comparisons.

Moreover, if it returns a word u ∈ Σ∗ after k Union operations, then |u| ≤ k.

Proof. Property (a) is trivial since we only unite disjoint subsets. Property (b): Every Findcomparison is due to some triple in the list L. After every Union operations, we add |Σ|triples to L. Together with the initial element (q0, q

′0, ε) in the list L, this yields the desired

bound. For the last statement, note that every Union operation increases the length of thelongest word in the list L at most by 1.

Proposition 4.4 immediately yields the following corollary.

Corollary 4.5. The Hopcroft-Karp equivalence test requires at most O(|Σ| (m+ n)) Union-Find operations.

Another consequence of Proposition 4.4 is a linear bound on the length of a shortest witnessfor the inequivalence of automata. Moreover, this bound does not depend on the size of thealphabet.

Corollary 4.6. Let A and B be deterministic Σ∗-automata with m and n states, respectively.If L(A) 6= L(B), then there exists a word u ∈ Σ∗ of length at most m + n − 2 with u ∈(L(A) \ L(B)

)∪(L(B) \ L(A)

).

Proof. If L(A) 6= L(B), then by Proposition 4.3 the Hopcroft-Karp equivalence test returns aword u ∈

(L(A) \ L(B)

)∪(L(B) \ L(A)

). This can only happen before the (m− n+ 1)-th

Union operation (since after m−n+ 1 Union operations, all states have the same Find-value).Proposition 4.4 yields the desired bound.

The following example shows that the bound in Corollary 4.6 is tight.

Example 4.7. Let m ≤ n be two positive integers. First suppose m < n. Then we considerthe following two deterministic a∗-automata A and B with m+ 2 and n states, respectively.The automaton A is:

0 1 · · · m -1 m m+1a a a a a

a

It accepts the singleton language {am}. The automaton B is given by:

55

4. Algorithmic Properties of Automata

n

1

n -1

m -1

m

m+1

· · · · ·

· · · · ·

a

a a

a

a

aa

a

We have L(B) ={am+kn

∣∣ k ≥ 0}

. In particular L(A) ( L(B) and am+n is the shortestword in L(B) \ L(A). In the case m = n, we also choose the state 0 of the automaton A isfinal, i.e., we then have L(A) = {ε, an} and L(B) = (an)∗. 3

In the remainder of this section, we briefly sketch one possible implementation of theoperations Union and Find. Every class of the partition is represented by a tree in which thenodes have a pointer to their parents. Moreover, at every root we remember the number ofnodes in the tree. The operation Find(x) returns the root of the tree containing x. All wehave to do in order to compute Find(x) is to chase the pointers from nodes to their parent,starting with node x. For instance, in the following example, Find(x) would return r.

r

• •

• •

• •

• •

• • •

• x

• •

• •

• •

Let size(x) denote the number of nodes in the tree of x. Remember that we store size(x)at the node Find(x). Consider elements x, y with Find(x) = r 6= s = Find(y) and supposesize(x) ≥ size(y). Then for the operation Union(x, y) we introduce a pointer from s to r,i.e., we attach the smaller tree to the bigger one. In particular, after this operation we haveFind(y) = r.

Let height(x) denote the length of a longest path in the tree of x. We claim that whenstarting with singleton sets for all elements and then executing successive Union operations,then at any time 2height(x) ≤ size(x). This is true at the beginning since singletons have height0. Let now size(x) ≥ size(y) and consider the operation Union(x, y). Let height′ and size′

denote the respective values after the execution of this operation. If height′(x) = height(x),then there is nothing to show. Let now height′(x) > height(x). Then height′(x) = height(y)+1. Since size(x) ≥ size(y), we have

2height′(x) = 2 · 2height(y) ≤ 2 · size(y) ≤ size(x) + size(y) = size′(x),

as desired.Let n be the number of elements. The above estimate shows that (at any time) the running

time of both operations Union(x, y) and Find(x) is in O(log n).

56

4.5. Minimization algorithms

There is a big improvement to the above algorithm which is called path compression. Theidea is that after every execution of the procedure Find, we redirect the pointers of allnodes we see during this execution directly to the root. This is done by chasing the patha second time. The effect is that the subsequent Find operations for some elements havinga predecessor on this path will be faster. An amortized analysis shows that this yields theinverse Ackermann function as an asymptotic running time for both Union and Find [?].Since for all reasonable values, this function is less than 5, this coins the term “almost linear”in the running time of the Hopcroft-Karp equivalence test.

Identity testing

Identity testing addresses the question whether two given automata are identical, i.e., canone automaton be obtained from the other by renaming the states. For two deterministicΣ∗-automata with n states each, this can be solved in time O(|Σ|n): We can assume thatΣ = {a1, . . . , am}. Now, we number the states from 1 to n using a depth-first traversal; duringthis traversal, we always visit outgoing ai-transition before considering ai+1-transitions. Thenfor the two states p, q with number i in both automata, we have to check whether the numberof p · a and of q · a is the same for all a ∈ Σ, and whether p is final if and only if q is final.

We note that this algorithm would not work for non-deterministic automata since fordifferent outgoing a-transitions there is no uniform tie-breaker which one to process first ina depth-first traversal. If all transitions have the same label all states are both initial andfinal, then identity testing is nothing but graph isomorphism, a problem with an unresolvedcomplexity to date.

The main application of identity testing for deterministic automata is the use of minimizationalgorithms for equivalence tests. This can be done as follows. Given two deterministicautomata A and B, we minimize both A and B and then we check whether the resultingautomata are identical. The complexity of this procedure is dominated by the complexity ofthe minimization algorithm in use. In particular, minimizing automata cannot be faster thanequivalence testing.

4.5. Minimization algorithms

The problem of minimization is the following. Given a finite deterministic automaton Awe want to compute the minimal automaton accepting L(A). As it turns out, many of theknown algorithms can also be applied in the general setting of M -automata.

Moore’s algorithm

Let M be an arbitrary monoid, let A = (Q, ·, q0, F ) be an M -automaton, and let L = L(A).We can assume that all states are reachable, i.e., we have Q = q0 ·M . For a state p ∈ Qwe define L(p) = { v ∈M | p · v ∈ F }. For p ∈ Q we set [p] = { q ∈ Q | L(p) = L(q) }. Wedefine the automaton BA with states { [p] | p ∈ Q }. The initial state of BA is [q0] and thefinal states are { [f ] | f ∈ F }. Note that L(p) = L(q) implies L(p ·u) = L(q ·u) for all u ∈M .Therefore, the transition function [p] · u = [p · u] is well-defined. Remember that the states ofthe minimal automaton AL are the sets L(u) = { v ∈M | uv ∈ L }. For every u ∈M we have

v ∈ L(q0 · u) ⇔ q0 · uv ∈ F ⇔ uv ∈ L ⇔ v ∈ L(u)

and thus L(q0 · u) = L(u). In particular, BA is the the minimal automaton of L; the onlydifference is that we use different names for the states. In order to minimize A, it remainsto identify those states p, q ∈ Q with L(p) = L(q). This can be done with the followingalgorithm. Let M be generated by Σ.

57

4. Algorithmic Properties of Automata

(a) We start with the complete graph with vertices Q and edges E = { {p, q} | p 6= q ∈ Q }.(b) We mark all edges {p, q} ∈ E with p ∈ F and q 6∈ F .

(c) As long as there exist marked edges in E, we repeat the following procedure. We choosesome marked edge {p′, q′} ∈ E. Then we mark all unmarked edges {p, q} ∈ E with{p′, q′} = {p · a, q · a} for some a ∈ Σ. Afterwards we remove the edge {p′, q′} from E.

Note that all marked edges are eventually removed.

Lemma 4.7. We have L(p) = L(q) if and only if {p, q} is an edge in the remaining graph.

Proof. ⇒: We show that whenever an edge {p, q} is marked, then we have L(p) 6= L(q). If{p, q} is marked in step (b), then 1 ∈ L(p) \L(q). Let now {p, q} be marked in step (c). Thenthere exists a previously marked edge {p′, q′} and a generator a ∈ Σ with {p · a, q · a} = {p′, q′}.By induction we have L(p′) 6= L(q′). W.l.o.g. there exits v ∈ L(p · a) \ L(q · a); otherwise weinterchange the roles of p and q. It follows av ∈ L(p) \ L(q) and L(p) 6= L(q).

⇐: We say that an element v ∈M is a witness, if v ∈ L(p) \ L(q) for some edge {p, q} inthe remaining graph. By contradiction, we assume that there exists a witness. Let n ≥ 0be minimal such that v = a1 · · · an for some generators ai ∈ Σ is a witness. Then thereexists an edge {p, q} in the remaining graph with v ∈ L(p) \ L(q). The case n = 0 is notpossible since then v = 1 and {p, q} would have been marked in step (b). Thus n > 0 anda2 · · · an ∈ L(p · a1) \ L(q · a1). By minimality of n, the edge {p · a1, q · a1} was removed.Before removal it was marked, and therefore in step (c) the edge {p, q} was marked, too.Since marked edges are eventually removed, {p, q} is not an edge in the remaining graph.This is a contradiction, showing that there exist no witnesses. This shows that every edge{p, q} with L(p) 6= L(q) is eventually removed.

One can think of the above algorithm as a traversal of the following graph. The set ofvertices is { {p, q} | p 6= q ∈ Q }, and for every generator a ∈ Σ and every vertex {p, q} thereexists an edge from {p · a, q · a} to {p, q}. If Q and Σ are finite, then this graph has O(|Q|2 |Σ|)edges. Therefore the running time of the above algorithm is O(|Q|2 |Σ|).

When implementing automata, one usually does not want to have sets [p] as states. Instead,in every class [p] one chooses some state as a representative. For any state p ∈ Q let p be therepresentative of [p]. The transition function is then given by p · a = p · a. With O(|Q| |Σ|),the running time of this step is negligible.

We note that we do not have to prove that the resulting automaton indeed is an M -automaton since it coincides with the minimal automaton of L which is already known to bean M -automaton.

Example 4.8. In Example 3.1 we constructed the following deterministic {a, b}∗-automatonfor the language L = b {ab, b}∗.

1

2

5

3

4

b

a

b

a

b

ba

a, b

a

We apply Moore’s minization algorithm to this automaton. The procedure starts with thecomplete graph on the set of states:

58

4.5. Minimization algorithms

5

4

32

1

In the beginning we mark all edges between final and non-final states.

5

4

32

1

In the next step, we (arbitrarily) choose the marked edge {4, 5} which causes us to markedges {2, 3} and {3, 4} since (2 · b, 3 · b) = (5, 4) and (3 · b, 4 · b) = (4, 5). Then we remove theedge {4, 5}. This leads to the following graph:

5

4

32

1

In the next iteration, we choose {2, 5} and we mark the edge {1, 5} since (1 · b, 5 · b) = (2, 5);then we remove the edge {2, 5}.

5

4

32

1

In the remaining iterations, all marked edges are removed without causing additional markededges.

5

4

32

1

Finally, the minimal automaton is obtained by combining the states 2 and 4 into a singlestate.

1

2,4

5

3b

a

b

b

a

a, b

a

Note that only the “future” behavior of states counts for minimization (even though Moore’salgorithm looks for predecessor edges). The states 2 and 4 have the same outgoing transitionswhich makes them identical in some rather obvious way; on the other hand the two states donot share the same incoming edges. 3

59

4. Algorithmic Properties of Automata

Hopcroft’s algorithm

We present Hopcroft’s algorithm for minimizing a Σ∗-automatonA = (Q, ·, q0, F ) with n states.Even though the algorithm works for deterministic automata over arbitrary Σ-generatedmonoids, for simplicity we only present it for finitely generated free monoids. The runningtime of Hopcroft’s minimization algorithm is in O(|Σ|n log n); to date, no asymptoticallyfaster algorithm is known. The algorithm computes the Myhill-Nerode equivalence relation≡A. Remember that reachable states p, q ∈ Q satisfy p ≡A q if L(p) = L(q), i.e., if we have

p · u ∈ F ⇔ q · u ∈ F

for all u ∈ Σ∗. Let [p] = { q ∈ Q | p ≡A q } be the equivalence class of p. Representing ≡A asa subset of Q×Q is quadratic. Instead, it is more efficient to use the partition { [p] | p ∈ Q }which can be stored in space O(n). In order to avoid unnecessary case distinctions, wefrequently identify { ∅, P1, . . . , Pk } with {P1, . . . , Pk }. For P = {P1, . . . , Pk } and R ={R1, . . . , R` } with Pi, Rj ⊆ Q we define

P ∧R = {Pi ∩Rj | 1 ≤ i ≤ k, 1 ≤ j ≤ `, Pi ∩Rj 6= ∅ } .

If P and R are partitions, then so is P ∧R. A partition R refines P if for all R ∈ R thereexists P ∈ P with R ⊆ P , i.e., if R∧ P = R.

Lemma 4.8. If P = P1 ∪P2 for P ⊆ Q, then {P1, Q \ P1}∧ {P2, Q \ P2} refines {P,Q \ P}.

Proof. This immediately follows from P1, P2 ⊆ P and (Q \ P1) ∩ (Q \ P2) = Q \ P .

A set S splits a set P if ∅ 6= P ∩ S 6= P . The split of P by S is (P1, P2) with {P1, P2} ={P ∩ S, P \ S} and |P1| ≤ |P2|, i.e., P1 is the smaller one of the sets P∩S and P \S; if both setshave the same size, then the choice is arbitrary. For S ⊆ Q let S · a−1 = { q ∈ Q | q · a ∈ S }be the states which have an outgoing a-transition to some state in S. For a partition P of Q,a subset S ⊆ Q and a letter a ∈ Σ we define

(S, a) | P ={S · a−1, Q \ S · a−1

}∧ P.

Another view on (S, a) | P is as follows. The pair (S, a) defines the set S · a−1 which is usedfor splitting each class of P. Note that Q \ (S · a−1) = (Q \ S) · a−1 and thus Q \ S · a−1 iswell-defined. Let A be a Σ∗-automaton such that all states are reachable. Then Algorithm 10is Hopcroft’s minimization algorithm on input A.

Algorithm 10 Hopcroft’s algorithm for minimizing a Σ∗-automaton A = (Q, ·, q0, F )

1: P ← {F,Q \ F}2: L ← { (F, a) | a ∈ Σ }3: while L 6= ∅ do4: Remove some pair (S, a) from L5: for all P ∈ P which are split by S · a−1 do6: Let (P1, P2) be the split of P by S · a−1 . P = P1 ∪ P2, |P1| ≤ |P2|7: P ← (P \ {P}) ∪ {P1, P2} . Replace P by P1, P2

8: for all b ∈ Σ do9: if (P, b) ∈ L then

10: L ← (L \ {(P, b)}) ∪ {(P1, b), (P2, b)} . Replace (P, b) by (P1, b), (P2, b)11: else12: L ← L ∪ {(P1, b)}13: return PP

←(S,a

)|P

;U

pd

ateL

As indicated, the instructions in lines 5 to 12 refine P by (S, a) | P and update the list L.At some point, the current partition cannot be refined any further, and then the instruction in

60

4.5. Minimization algorithms

line 4 empties the list L. Thus the while-loop terminates on all inputs. After every iterationof the while-loop one can consider the partition one obtains when using all pairs in L forrefining P . An important property of the resulting partitions is given by the following lemma.

Lemma 4.9. Let Pi be the partition P after the i-th iteration of the while loop; in particularP0 = {F,Q \ F}. Let Li be the list L after the i-th iteration, and let Ri =

∧(S,a)∈Li(S, a) | Pi.

Then Ri refines Ri−1 for all i ≥ 1.

Proof. The instruction in line 4 removes a pair (S, a) from L, but then P is refined by thispair. The instruction in line 12 only increases the list L. Therefore it suffices to showthat updates of L using line 10 yield a refinement of the partition

∧(S,a)∈L(S, a) | P for the

current values of L and P. Let L′ = { (S1, a1), . . . , (Sj , aj), (P, b) } be the list L before theexecution of line 10 and let L′′ = { (S1, a1), . . . , (Sj , aj), (P1, b), (P2, b) } be the list after it.

Let R =∧ji=1(Si, ai) | P, let R′ =

∧(S,a)∈L′(S, a) | P, and let R′′ =

∧(S,a)∈L′′(S, a) | P. Then

R′ ={P · b−1, Q \ P · b−1

}∧R,

R′′ ={P1 · b−1, Q \ P1 · b−1

}∧{P2 · b−1, Q \ P2 · b−1

}∧R.

The partition{P1 · b−1, Q \ P1 · b−1

}∧{P2 · b−1, Q \ P2 · b−1

}refines

{P · b−1, Q \ P · b−1

}by Lemma 4.8 because P = P1 ∪ P2 (and thus P · a−1 = P1 · a−1 ∪ P2 · a−1). Therefore, R′′refines R′ as desired.

Note that if the algorithm uses k iterations, then it returns the partition Rk. Another viewon this partition is as follows. Let (Si, ai) be the pair removed from L in line 4 in the i-thiteration of the while-loop and suppose the algorithm runs the while-loop k times. Hopcroft’salgorithm then computes the partition{

Sk · a−1k , Q \ Sk · a−1k}∧ · · · ∧

{S1 · a−11 , Q \ S1 · a−11

}∧ {F,Q \ F} .

In particular, the order of the application of the pairs (Si, ai) does not matter for any fixed setof pairs (S1, a1), . . . , (Sk, ak). On the other hand, different choices for (S, a) in line 4 of thealgorithm can yield different entries in the list L and therefore different intermediate partitionscan occur. For instance, L could be implemented either as a stack or as a queue. The followingproposition shows that, independent of the implementation, Hopcroft’s algorithm alwayscomputes the partition defining the minimal automaton.

Proposition 4.10. On input A, Hopcroft’s algorithm computes the Myhill-Nerode equivalencerelation for A.

Proof. An important invariant of the algorithm is that before and after every iteration of thewhile-loop, all pairs (S, a) ∈ L satisfy S ∈ P . For the correctness, we show that the partitioncomputed by the algorithm represents the Myhill-Nerode equivalence relation ≡A. We firstshow that ≡A refines the partition of the algorithm. Since ≡A refines the initial partition{F,Q \ F}, it suffices to prove the following claim: If ≡A refines P, then ≡A also refines(S, a) | P for all (S, a) ∈ P × Σ. Let p ≡A q. Then we have p, q ∈ P for some P ∈ P and

p ∈ S · a−1 ⇔ p · a ∈ S ⇔ q · a ∈ S ⇔ q ∈ S · a−1,

where the second equivalence relies on p · a ≡A q · a and S ∈ P. This shows that p and q arein the same class of the partition (S, a) | P.

Next we show that Myhill-Nerode inequivalent states p, q are separated by the algorithm.Consider states p, q with p · u ∈ F ⇔ q · u 6∈ F for some word u ∈ Σ∗. By induction on u weshow that p and q end up in different sets of the final partition. For u = ε this is true sincethe final partition refines the initial partition {F,Q \ F}. Let now p · a 6≡A q · a for a ∈ Σ

61

4. Algorithmic Properties of Automata

such that p · a, q · a are in different classes of the final partition. We show that p, q end up indifferent classes, too. Let Pi denote the partition P computed by the algorithm after the i-thiteration of the while-loop; for i = 0 we obtain the initial partition. Similarly, let Li be thelist L after the i-th iteration. After some iteration i ≥ 0 of the while-loop we have:

• p · a ∈ P and q · a ∈ P ′ for disjoint classes P, P ′ ∈ P, and

• (P, a) ∈ L or (P ′, a) ∈ L.

This holds at the beginning (i.e. i = 0) or immediately after separating p · a from q · a.Without loss of generality, let (P, a) ∈ L. Since p ∈ P · a−1 and q 6∈ P · a−1, the states p, qare in different classes of the partition (P, a) | P. Thus, by Lemma 4.9, the final partitionseparates p and q.

Next, we analyze the running time of Hopcroft’s algorithm. The presentation is based onthe textbook by Beauquier, Berstel, and Chretienne [?], see also [?]. We assume that thepartition P can be implemented such that the following properties hold:

• Accessing the class of a state q ∈ Q is in O(1).

• The size of a class P ∈ P can be determined in O(1).

• Enumerating the elements of a class P ∈ P is in O(|P |).• Adding and removing an element to/from a class is in O(1).

We describe how to compute P ← (S, a) | P in O(∣∣S · a−1∣∣). To this end, the classes of P in

the for-loop in line 5 of Hopcroft’s algorithm are processed simultaneously. First, for eachq ∈ S · a−1 we do the following:

• We mark the class P ∈ P with q ∈ P .

• We increment a counter nP for the class P .

• We add q to a list RP .

Of course, we initially have nP = 0 and RP = ∅ for all marked classes P . In a second phase,we consider the marked classes P . If nP < |P |, then for every state q ∈ RP we remove q fromP (which yields P \RP as a class), and we include the class RP in the partition P.

For the implementation of the list L, we require that one can add and remove pairs inconstant time. Moreover, one can check whether or not (S, a) ∈ L in constant time. Naturally,we represent the class S by a reference instead of copying the corresponding class in thepartition.

Let (S1, a1), . . . , (Sk, ak) be the pairs removed from L in line 4. Then the above imple-mentation yields a running time of O(

∑ki=1

∣∣Si · a−1i ∣∣) for Hopcroft’s algorithm. Thus thefollowing proposition yields the desired bound of O(|Σ|n log n).

Proposition 4.11. We havek∑i=1

∣∣Si · a−1i ∣∣ ≤ |Σ|n log n.

Proof. We fix some letter a ∈ Σ. We say that (S, a) is a q-splitter if q ∈ S. At any time, forevery state q there is at most one q-splitter (S, a) in the list L. When a q-splitter (S, a) isremoved from L using the instruction in line 4 and later (not necessarily during the sameiteration) a q-splitter (P1, a) is added in line 12, then 2 |P1| ≤ |S|: In the setting of thealgorithm, we have P ⊆ S since P is currently the class of P which contains q and S haspreviously been a class of P containing q; thus 2 |P1| ≤ |P1|+ |P2| = |P | ≤ |S|. We concludethat at most log n pairs (Si, ai) form a q-splitter with ai = a. Note that the size of a q-splittercould decrease during its stay in the list L by the instruction in line 10, but this does notaffect the argument. For every state p ∈ Q we have

p ∈ S · a−1 ⇔ p · a ∈ S ⇔ (S, a) is a p · a-splitter.

62

4.5. Minimization algorithms

There are at most log n many pairs (Si, ai) which form a p ·a-splitter with ai = a. By varyingthe letter a, we see that every state p is considered in at most |Σ| log n sets Si · a−1i . Takingthe sum over all n states yields the desired bound.

A slight optimization of Hopcroft’s algorithm is to replace the initialization of the list L inline 2 by the following piece of code:

if |F | ≤ |Q \ F | thenL ← { (F, a) | a ∈ Σ }

elseL ← { (Q \ F, a) | a ∈ Σ }

That is, we use the class Q \ F instead of F whenever there are more final than non-finalstates. This neither affects the correctness of the algorithm nor its asymptotic running time,but it slightly improves the constants hidden in the notation O(|Σ|n log n).

There are several ways of implementing the list L and the partition P with the desiredcomplexities, see e.g. [?]. For instance one could use a doubly linked list for representinga class P ∈ P together with a counter for |P |. Every state has a reference to the positionwithin its class. The list L could also be organized as a doubly linked list. Every partitionclass P has an array of up to |Σ| references to elements of L, the entry for (P, a) is eithernull or it points to its position in L.

Example 4.9. We revisit the deterministic {a, b}∗-automaton A for L = b {ab, b}∗ fromExample 3.1.

1

2

5

3

4

b

a

b

a

b

ba

a, b

a

Hopcroft’s minimization algorithm on A starts with the partition P = { {1, 5} , {2, 3, 4} }. Asmentioned above, in this case it is slightly better to initialize L using the non-final states, i.e.,at the beginning we have L = { ({1, 5} , a), ({1, 5} , b) }. Then a possible run of the algorithmcould be:

1. In the first iteration we remove the pair ({1, 5} , a) from L. Note that {1, 5} · a−1 ={1, 5} ∈ P . Thus this does not affect the partition P . The list is now L = { ({1, 5} , b) }.

2. In the second iteration we remove the pair ({1, 5} , b) from L. We have {1, 5} · b−1 ={2, 4, 5}. This leads to the partition P = { {1} , {5} , {2, 4} , {3} } and to the listL = { ({1} , a), ({1} , b), ({3} , a), ({3} , b) } after the second iteration.

3. The remaining iterations do not affect P.

Thus, not surprisingly, Hopcroft’s algorithm computes the same minimal automaton asMoore’s algorithm in Example 4.8. 3

Remark 4.1. One can consider Hopcroft’s algorithm as a particular way of implementingMoore’s algorithm. In this setting, Hopcroft’s algorithm combines two advantages. First, byusing partitions it relies on a highly compressed representation of the graph used in Moore’salgorithm. And second, when removing an element from a class P it deletes many edges fromthe graph in only one step. 3

63

4. Algorithmic Properties of Automata

Brzozowski’s algorithm

While Moore’s algorithm and Hopcroft’s algorithm work for deterministic automata overarbitrary monoids, the following algorithm due to Brzozowski only works for automata overfree monoids Σ∗. One advantage of Brzozowski’s algorithm is that it can also be appliedif the input automaton is nondeterministic; as a drawback, it can construct intermediateautomata of exponential size even if the input is deterministic.

The reversal of a word u = a1 · · · an with ai ∈ Σ is uρ = an · · · a1. The reversal isobtained by reading the word from right to left. An obvious but crucial property of thereversal is (uρ)ρ = u for all u ∈ Σ∗, i.e., reveral is an involution. The reversal of a languageL ⊆ Σ∗ is Lρ = {uρ | u ∈ L }. Let A = (Q, δ, I, F ) be a nondeterministic Σ∗-automaton withδ ⊆ Q × Σ × Q. The reversal of A is Aρ = (Q, δρ, F, I) with δρ = { (q, a, p) | (p, a, q) ∈ δ },i.e., we reverse all transitions and we interchange initial and final states. Now, a word u ∈ Σ∗

has an accepting run in A if and only if uρ has an accepting run in Aρ. Thus L(Aρ) = L(A)ρ.If A is deterministic, then Aρ might be nondeterministic.

Theorem 4.12. Let B = (Q, ·, q0, F ) be a deterministic Σ∗-automaton such that all statesare reachable. Then P(Bρ) is the minimal Σ∗-automaton of L(B)ρ.

Proof. Remember that the powerset construction P(Bρ) only contains the reachable subsetsof Q, the initial state is F and P ⊆ Q is final if q0 ∈ P . We have L(P(Bρ)) = L(Bρ) = L(B)ρ.Thus it remains to show that P(Bρ) is minimal.

Claim. For any state P ⊆ Q of P(Bρ) we have q0 · u ∈ P if and only if uρ ∈ L(P ).

Proof of the claim: Let q = q0 · u in B and let P ′ = P · uρ in P(Bρ). We have q uρq0 in Bρ.

Therefore, q ∈ P imlies q0 ∈ P ′ and thus uρ ∈ L(P ). Conversely, if q0 ∈ P ′, then there exists

a state q′ ∈ P with q′ uρq0. Since B is deterministic, we have q′ = q0 · u = q; in particular,

q ∈ P . This concludes the proof of the claim.

Consider two different states P, P ′ ⊆ Q. Let q ∈ P \ P ′. Since all states in B arereachable, there exists a word u ∈ Σ∗ with q0 · u = q in B. By the above claim we haveuρ ∈ L(P ) \ L(P ′).

Let A = (Q, δ, I, F ) be a nondeterministic Σ∗-automaton with δ ⊆ Q×Σ×Q. If we applyTheorem 4.12 with B = P(Aρ), we see that P

(P(Aρ)ρ

)is the minimal automaton of L(A).

This means that “reverse-determinize-reverse-determinize” yields the minimal automaton.Even though there are two powerset constructions involved, the intermediate automata haveat most exponential size since the minimal automaton of a nondeterministic automaton is atmost exponential.

Example 4.10. We apply Brzozowski’s minimization algorithm to the automaton constructedin Example 1.9. Let A be the following {a, b}∗-automaton for the language L = b {ab, a}∗.

1 2

3

4

5

bb

aa

a

a

a a

Its reversal Aρ is:

64

4.5. Minimization algorithms

1 2

3

4

5

bb

aa

a

a

a a

The powerset construction P(Aρ) yields:

234 15 ∅b

a

b

a

a, b

After omitting the unreachable state ∅ in P(Aρ)ρ, we obtain the following nondeterministicautomaton for L:

y xb

a

a

For better readability we used the name x for the state 15 and y for the state 234. Thepowerset constructing of P(Aρ)ρ produces the following automaton:

x y xy

ba

b

a

ab

a, b

This is the minimal deterministic automaton of L. It is identical to the automaton constructedin Example 4.8. 3

State reduction in nondeterministic automata

In general, there is no unique nondeterministic automaton with a minimal number of states.For instance, the following three a∗-automata for L = aa∗ are alle different (non-isomorphic):

a

a

a

a

a

a a

There is no one-state automaton for L. Therefore, both automata have the minimal possiblenumber of states. On the other hand non-deterministic automata can be much smaller thanthe minimal deterministic automaton, see e.g. Examples 3.2 and 4.6. Therefore, one mightbe interested in computing some nondeterministic automaton of minimal size. The followingresult shows that one cannot hope for an efficient algorithm for minimizing nondeterministicautomata.

65

4. Algorithmic Properties of Automata

Theorem 4.13. The following problem is pspace-complete.

Input: A finite nondeterministic automaton A over {a, b}∗ and an integer k ≥ 1.

Question: Is there an equivalent nondeterministic automaton with at most k states.

Proof. We show that the problem is in pspace for arbitrary alphabets Σ. Let A be anondeterministic Σ∗-automaton and let k ≥ 1. Then a pspace-algorithm can guess anondeterministic Σ∗-auotomaton B and then check whether L(A) = L(B) using Algorithm 8.

For pspace-hardness, we use a reduction from universality of nondeterministic automata,see Theorem 4.2. Let A be a nondeterministic {a, b}∗-automaton. We use the followinprocedure for deciding whether A is universal. First, we check that a, b ∈ L(A). If this isthe case, then we let A′ = A, otherwise we let A′ be some fixed automaton such that anyautomaton for L(A′) requires at least two states. Now, A is universal if and only if thereexists an equivalent 1-state automaton for A′. This is because there is only one 1-stateautomaton B with L(B) 6= ∅ having both an a- and a b-transition, and this automaton satisfiesL(B) = {a, b}∗.

Next, we describe an efficient heuristic for reducing the number of states in a nondetermin-istic automaton. Let A = (Q, δ, I, F ) be a nondeterministic automaton with δ ⊆ Q× Σ×Q.Then as in the case of deterministic automata, we can define L(p) as the language acceptedby the automaton (Q, δ, p, F ). For a partition P of Q we can define the quotient automatonA/P = (P, δ′, I ′, F ′) with

δ′ ={

(P, a, P ′) ∈ P × Σ× P∣∣ (p, a, p′) ∈ δ for p ∈ P , p′ ∈ P ′

},

I ′ = {P ∈ P | P ∩ I 6= ∅ } ,F ′ = {P ∈ P | P ∩ F 6= ∅ } .

In general, we have L(A) 6= L(A/P); for instance, let L(A) be non-trivial and let P = {Q}.We say that a partition P is consistent if for all classes P ∈ P and all states p, q ∈ P we haveL(p) = L(q). For consistent partitions P we have L(A) = L(A/P). A partition P is stable ifthe following two properties hold:

(a) Every class P ∈ P either only contains final states or it only contains non-final states.

(b) Whenever p a r is a transition in A and p, p′ are in the same class, then there exists astate r′ in the class of r such that p′ a r′ is a transtion.

Lemma 4.14. Every stable partition is consistent.

Proof. Let P be a stable partition, let P ∈ P be a class, let p, p′ ∈ P , and let u ∈ L(p). Wewant to show u ∈ L(p′). The proof is by induction on |u|. If u = ε, then p is a final state ofA. Thus, by property (a) in the definition of stability, p′ is a final state, too. In particular,u = ε is in L(p′).

Let now u = au′ for a ∈ Σ. Let p a r u′ q with q ∈ F be a run on u. By property (b) ofstability, there exists a state r′ in the class of r such that p′ a r′. Induction yields u′ ∈ L(r′).Thus u = au′ ∈ L(p′) as desired.

Remark 4.2. Let A = (Q, ·, q0, F ) be a deterministic Σ∗-automaton. An equivalence relation≡ on Q is a congruence if for all p, q ∈ Q and all letters a ∈ Σ the following two implicationshold:

p ≡ q ⇒ (p ∈ F ⇔ q ∈ F ) ,

p ≡ q ⇒ p · a ≡ q · a.

A partition of Q is stable if and only if it induces a congruence. For instance, both theidentity relation and the Myhill-Nerode equivalence form congruences. Moreover, the Myhill-Nerode congruence is the coarsest congruence, i.e., all other congruences on Q refine theMyhill-Nerode congruence. 3

66

4.5. Minimization algorithms

For S ⊆ Q and a ∈ Σ let S · a−1 = { p ∈ Q | δ(p, a) ∈ S } be the states which havean a-transition to S. For a partition P, a subset S ⊆ Q and a letter a ∈ Σ we de-fine (S, a) | P =

{S · a−1, Q \

(S · a−1

)}∧ P as the coarsest partition which refines both{

S · a−1, Q \(S · a−1

)}and P . If P 6= (S, a) | P , then we say that (S, a) splits P . A partition

P of Q is stable if and only if the following two properties hold:

(a) P refines {F,Q \ F}.(b) For all S ∈ P and all a ∈ Σ, the pair (S, a) does not split P.

This immediately leads to the following algorithm for computing a stable partition:

(a) Start with the partition P = {F,Q \ F}.(b) As long as there exists S ∈ P and a ∈ Σ such that (S, a) splits P , replace P by (S, a) | P .

By Lemma 4.14, the resulting partition P of the algorithm satisfies L(A) = L(A/P). Wenote that one cannot apply Hopcroft’s algorithm in this setting since in general we haveQ \

(S · a−1

)6= (Q \ S) · a−1. On the other hand, if A is deterministic, then the above

algorithm computes the minimal automaton of L(A). The two main differences in the settingof nondeterministic automata are:

• We do not find all pairs of states (p, q) with L(p) = L(q).

• We could combine two states p and q even if L(p) 6= L(q) .

Example 4.11. Consider the following nondeterministic automaton for aa∗:

q0

q1

q2

q3

a

a

a

a

The above algorithm would compute the identity relation, even though the non-stablepartition {F,Q \ F} would yield an equivalent smaller automaton. The algorithm splitsq0 and q1 even though we have L(q0) = L(q1), and we could combine q2 and q3 despite ofL(q2) 6= L(q3). For S = {q3} we have S · a−1 = {q1} and (Q \ S) · a−1 = {q0, q1, q2}; inparticular Q \

(S · a−1

)6= (Q \ S) · a−1. Also note that the above algorithm computes a

nondeterministic automaton which is larger than the minimal automaton of aa∗. 3

Example 4.12. We apply the above state reduction algorithm to the automaton constructedin Example 1.9. Let A be the following {a, b}∗-automaton for the language L = b {ab, a}∗.

1 2

3

4

5

bb

aa

a

a

a a

The algorithm starts with the partition P = { {1, 5} , {2, 3, 4} }. We have

{1, 5} · a−1 = {2, 3, 4}{1, 5} · b−1 = ∅

{2, 3, 4} · a−1 = {2, 3, 4}{1, 5} · b−1 = {1, 5}

67

4. Algorithmic Properties of Automata

and therefore, P is stable. The resulting automaton is:

15 234

b

aa

In this case, the algorithm actually computes an equivalent nondeterministic automaton ofminimal size. It is smaller than the corresponding minimal (deterministic) automaton. 3

68