the cayley-hamilton theorem for finite...

Radu Grosu

SUNY at Stony Brook

The Cayley-Hamilton Theorem

For Finite Automata

How did I get interested in this topic?

• Hybrid Systems Computation and Control:

- convergence between control and automata theory.

• Hybrid Automata: an outcome of this convergence

- modeling formalism for systems exhibiting both

discrete and continuous behavior,

- successfully used to model and analyze embedded

and biological systems.

Convergence of Theories

Lack of Common Foundation for HA

• Mode dynamics:

- Linear system (LS)

• Mode switching:

- Finite automaton (FA)

• Different techniques:

- LS reduction

- FA minimization

Stimulated

U

Vs v

U

Vv

E

Vv

0 /R

V tv

R

x Ax Bu

v

v V

Cx

/ di ts

vo

ltag

e(m

v)

time(ms)

• LS & FA taught separately: No common foundation!

• Finite automata can be conveniently regarded as

time invariant linear systems over semimodules:

- linear systems techniques generalize to automata

• Examples of such techniques include:

- linear transformations of automata,

- minimization and determinization of automata as

observability and reachability reductions

- Z-transform of automata to compute associated

regular expression through Gaussian elimination.

Main Conjecture

Minimal DFA are Not Minimal NFA(Arnold, Dicky and Nivat’s Example)

x1

a

bx3

x4

x2

c

b

c

x1

ax2

x3a

b

c

L = a (b* + c*)

c a

x1

x2

x3

b

a b

a

x4

x5c

c

b

abx2

x3

b

cx4

x5a

a

c

b

c

x1

Minimal NFA: How are they Related?(Arnold, Dicky and Nivat’s Example)

L = ab+ac + ba+bc + ca+cb

No homomorphism of either automaton onto the other.

c a

x1

x2

x3

b

a b

a

x4

x8c

c

b

abx5

x6

b

cx7

x8a

a

c

b

c

x1

Minimal NFA: How are they Related?(Arnold, Dicky and Nivat’s Example)

Carrez’s solution: Take both in a terminal NFA.

Is this the best one can do?

No! One can use use linear (similarity) transformations.

Observability Reduction HSCC’09(Arnold, Dicky and Nivat’s Example)

Define linear transformation x t = x tT:

T

x1

x2

x3

x4

x5

x1

1 0 0 0 0

x2

0 1 1 0 0

x3

0 1 0 1 0

x4

0 0 1 1 0

x5

0 0 0 0 1

A = [AT]T (T1

AT)

x0

t = x0

t T

C = [C]T (T1C)

abx23

x24

b

cx34

x5a

a

c

b

c

x1

Ac a

x1

x2

x3

b

a b

a

x4

x5c

c

bA

Reachability Reduction HSCC’09(Arnold, Dicky and Nivat’s Example)

Define linear transformation x t = x tT:

T

x1

x2

x3

x4

x5

x1

1 0 0 0 0

x2

0 1 1 0 0

x3

0 1 0 1 0

x4

0 0 1 1 0

x5

0 0 0 0 1

At = [A

tT]

T (T1

AtT)

x0

t = x0

t T

C = [C]T (T1C)

abx2

x3

b

cx4

x5a

a

c

b

c

x1

Ac a

x1

x23

x24

b

a b

a

x34

x5c

c

bA

Observability and minimization

Finite Automata as Linear Systems

Consider a finite automaton M = (X,,,S,F) with:

- finite set of states X, finite input alphabet ,

- transition relation X X,

- starting and final sets of states S,F X


Consider a finite automaton M = (X,,,S,F) with:

- finite set of states X, finite input alphabet ,

- transition relation X X,

- starting and final sets of states S,F X

Let X denote row and column indices. Then:

- defines a matrix A,

- S and F define corresponding vectors


Now define the linear system LM

= [S,A,C]:

x

t(n+1) = x

t(n)A, x

0= S

yt(n) = x

t(n)C, C = F



= [S,A,C]:

x

t(n+1) = x

t(n)A, x

0= S

yt(n) = x

t(n)C, C = F

Example: consider following automaton:

x3 x2x1

a

ab

b

A = 0 a b0 a 00 0 b

x0 =

00

C = 0

Semimodule of Languages

(*) is an idempotent semiring (quantale):

- ((*),+,0) is a commutative idempotent monoid (union),

- ((*),,1) is a monoid (concatenation),

- multiplication distributes over addition,

- 0 is an annihilator: 0 a = 0

((*))

n is a semimodule over scalars in (*):

- r(x+y) = rx + ry, (r+s)x = rx + sx, (rs)x = r(sx),

- 1x = x, 0x = 0

Note: No additive and multiplicative inverses!

Observability

Let L = [S,A,C]. Observe its output upto n-1:

[y(0) y(1) ... y(n-1)] = x0

t[C AC ... A

n-1C] = x

0

tO (1)

If L operates on a vector space:

- L is observable if: x0 is uniquely determined by (1),

- Observability matrix O: has rank n,

- n-outputs suffice: AnC = s

1A

n-1C + s

2A

n-2C + ... + s

nC

If L operates on a semimodule:

- L is observable if: x0 is uniquely determined by (1)

Observability

Let L = [S,A,C]. Observe its output upto n-1:

[y(0) y(1) ... y(n-1)] = x0

t[C AC ... A

n-1C] = x

0

tO (1)





1A

n-1C + s

2A

n-2C + ... + s

nC

(Cayley-Hamilton Theorem)



Observability

Let L = [S,A,C]. Observe the output upto n-1:

[y(0) y(1) ... y(n-1)] = x0

t[C AC ... A

n-1C] = x

0

tO (1)





1A

n-1C + s

2A

n-2C + ... + s

nC



The Cayley-Hamilton Theorem

( An = s

1An-1 + s

2An-2 + ... + s

nI )

Permutations are bijections of {1,...,n}:

- Example: = {(1,2),(2,3),(3,4),(4,1),(5,7),(6,6),(7,5)}

The graph G() of a permutation :

- G() decomposes into: elementary cycles,

The sign of a permutation:

- Pos/Neg: even/odd number of even length cycles,

- Pn

/ P

n

: all positive/negative permutations.

Permutations


- Example: = {(1,2),(2,3),(3,4),(4,1),(5,7),(6,6),(7,5)}


- G() decomposes into: elementary cycles

The sign of a permutation:

- Pos/Neg: even/odd number of even length cycles

- Pn

/ P

n

: all positive/negative permutations.

Permutations

34

21

75 6


- Example: = {(1,2),(2,3),(3,4),(4,1),(5,7),(6,6),(7,5)}


- G() decomposes into: elementary cycles

The sign of a permutation :

- Pos/Neg: even/odd number of even length cycles

- Pn

/ P

n

: all positive/negative permutations

Permutations

34

21

75 6

Eigenvalues in Vector Spaces

The eigenvalues of a square matrix A:

- Eigenvector equation: xtA = x

ts

The characteristic equation of A:

- The characteristic polynomial: cpA(s) = |sI-A|

- The characteristic equation: cpA(s) = 0

The determinant of A:

- The determinant: |A| = (A)Pn

- (A)Pn

,

- Permutation application: (A) = A(i,(i))i1

n

eigenvalueeigenvector

Matrix-Eigenspaces in Vector Spaces


- Eigenvector equation: xt(sI-A) = 0






- (A)Pn

,

- Permutation application: (A) = A(i,(i))i1

n

Matrix-Eigenspaces in Vector Spaces


- Eigenvector equation: xt(sI-A) = 0






- (A)Pn

,

- Weight of a permutation: (A) = A(i,(i))i1

n

A satisfies its characteristic equation: cpA

(A) = 0

A =

0 a12

0

a21

0 a23

a31

0 a33

sI-A =

s -a12

0

-a21

s -a23

-a31

0 s-a33

|sI-A| = s3 - a

33s

2 - a

12a

21s + a

12a

21 a

33- a

12a

23a

31 = 0

s3 + a

12a

21 a

33 = a

33s

2 + a

12a

21s + a

12a

23a

31

A3 + a

12a

21 a

33I = a

33A

2 + a

12a

21A + a

12a

23a

31I

3

21

a12

a31 a23

a33

a21

The Cayley-Hamilton Theorem (CHT)

A =

0 a12

0

a21

0 a23

a31

0 a33

sI-A =

s -a12

0

-a21

s -a23

-a31

0 s-a33

|sI-A| = s3 - a

33s

2 - a

12a

21s + a

12a

21 a

33- a

12a

23a

31 = 0

s3 + a

12a

21 a

33 = a

33s

2 + a

12a

21s + a

12a

23a

31

A3 + a

12a

21 a

33I = a

33A

2 + a

12a

21A + a

12a

23a

31I

3

21

a12

a31 a23

a33

a21



(A) = 0

A =

0 a12

0

a21

0 a23

a31

0 a33

sI-A =

s -a12

0

-a21

s -a23

-a31

0 s-a33

|sI-A| = s3 - a

33s

2 - a

12a

21s + a

12a

21 a

33- a

12a

23a

31 = 0

s3 + a

12a

21 a

33 = a

33s

2 + a

12a

21s + a

12a

23a

31

A3 + a

12a

21 a

33I = a

33A

2 + a

12a

21A + a

12a

23a

31I

3

21

a12

a31 a23

a33

a21

cyclecycle cycle cycle cycle



(A) = 0


(A) = 0

Implicit assumptions in CHT:

- Subtraction is available

- Multiplication is commutative

Does CHT hold in semirings?

- Subtraction not indispensible (Rutherford, Straubing)

- Commutativity still problematic




(A) = 0

Implicit assumptions in CHT:

- Subtraction is available

- Multiplication is commutative

Does CHT hold in semirings?

- Subtraction not indispensible (Rutherford, Straubing)

- Commutativity problematic

CHT in Commutative Semirings(Straubing’s Proof)

Lift original semiring to the semiring of paths:

- Matrix A is lifted to a matrix GA

of paths

A =

0 a12

0

a21

0 a23

a31

0 a33

GA

=

0 (1,2) 0

(2,1) 0 (2,3)

(3,1) 0 (3,3)



of paths

- Permutation cycles lifted cyclic paths

= {(1,2),(2,1)}

= (1,2)(2,1)




of paths


Prove CHT in the semiring of paths:

G

A

n-q

Pq

q0

n

= G

A

n-q

Pq

q0

n

(CHT holds?)




of paths



- Show bijection between pos/neg products

G

A

0

P3

= G

A

2

P1

3

21

(1,2)

(2,1)

(2,3)(3,1)

(3,3)3

21

(1,2)

(2,1)

(2,3)(3,1)

(3,3) (3,3)(1,2)(2,1) (3,3)(1,2)(2,1)




of paths



- Show bijection between pos/neg products

Port results back to the original semiring:

- Apply products: (A)

- Path application: (1...

n)(A) = A(

1)...A(

n)


CHT in Idempotent Semirings


- Matrix A: order in paths important

- Permutation cycles: rotations are distinct





= {(1,2),(2,1)}

= (1,2)(2,1) 0 0

0 (2,1)(1,2) 0

0 0 0





- Products Gn-| |

: cycles to be properly inserted






- Products Gn-| |


Gn-| | =

Gn-| | + G

Gn-| |-1 +...+ Gn-| |






- Products Gn-||


Port results back to the original semiring:

- Apply products: Gn-||

(A)


Theorem: Gn =

Pq

n

q1

n

GA

n-||

Proof:

LHS RHS: Let LHS

- Pidgeon-hole: has at least one cycle

in s

- Structural:

is a simple cycle of length k

- Remove in : [s/

] is in G

n-||

- Shuffle-product: Gn-|| reinserts

RHS LHS: Let RHS

- No wrong path: The shuffle is sound

- Idempotence: Takes care of multiple copies


Theorem: Gn =

Pq

n

q1

n

GA

n-||

Proof:

LHS RHS: Let LHS

- Pidgeon-hole: has at least one cycle

in s

- Structural:

is also a simple cycle

- Remove in : [s/

] is in G

n-||

- Shuffle-product: Gn-|| reinserts

RHS LHS: Let RHS

- No wrong path: The shuffle is sound

- Idempotence: Takes care of multiple copies


Define: (i, i) =

if (i, i) = 0

0 if (i, i) =

Theorem: classic CHT can be derived by using:

- Gn-|| = G

n-|| + G

n-||

- application of CHT to G

n-|| and G

n-||

Matrix CHT: can be regarded as a constructive

version of the pumping lemma.




= [S,A,C]:

x

t(n+1) = x

t (n)A, x0

= S()

yt (n) = x

t (n)C, C = F()

Example: consider following automaton:

x3 x2x1

a

ab

b

0

0 1 0 1

A(a) = 0 1 0 , x ( ) = 0

0 0 0 0

0 0 1 0

A(b) = 0 0 0 , C( ) = 1

0 0 1 1

L1

A = A(a)a + A(b)b

Observability

t n-1 t

0 0

0

[y(0) y(1) ... y(n-1)] = x C AC ... A C] = x (1)

Let L = [S,A,C] be an n-state automaton. It's output:

[ O

L is observable if x is uniquely determin

Exampl t

ed by (

e: obserh v

1).

le abi i

n

1

2

3

1

b bb b ba aε a a a

x

A C

0 1 1 1 0 0 1 O =1 1 0 1 0 0 0

1 0 1 0

x

x 0 0 1

ty matrix of O L is:

x3 x2x1

a

ab

b

L1

the cayley-hamilton theorem for finite...

Documents