Radu Grosu
SUNY at Stony Brook
The Cayley-Hamilton Theorem
For Finite Automata
How did I get interested in this topic?
• Hybrid Systems Computation and Control:
- convergence between control and automata theory.
• Hybrid Automata: an outcome of this convergence
- modeling formalism for systems exhibiting both
discrete and continuous behavior,
- successfully used to model and analyze embedded
and biological systems.
Convergence of Theories
Lack of Common Foundation for HA
• Mode dynamics:
- Linear system (LS)
• Mode switching:
- Finite automaton (FA)
• Different techniques:
- LS reduction
- FA minimization
[Figure: hybrid-automaton mode "Stimulated" with linear mode dynamics $\dot{x} = Ax + Bu$, $v = Cx$, shown next to a plot of voltage (mV) against time (ms)]
• LS & FA taught separately: No common foundation!
• Finite automata can be conveniently regarded as
time invariant linear systems over semimodules:
- linear systems techniques generalize to automata
• Examples of such techniques include:
- linear transformations of automata,
- minimization and determinization of automata as
observability and reachability reductions
- Z-transform of automata to compute associated
regular expression through Gaussian elimination.
Main Conjecture
Minimal DFA are Not Minimal NFA (Arnold, Dicky and Nivat’s Example)
L = a (b* + c*)
[Figure: state diagrams over the alphabet {a, b, c}: a four-state automaton (x1 to x4), a three-state automaton (x1 to x3), and two five-state automata (x1 to x5)]
Minimal NFA: How are they Related? (Arnold, Dicky and Nivat’s Example)
L = ab+ac + ba+bc + ca+cb
No homomorphism of either automaton onto the other.
[Figure: state diagrams of the two minimal NFAs for L]
Carrez’s solution: Take both in a terminal NFA.
Is this the best one can do?
No! One can use linear (similarity) transformations.
Observability Reduction HSCC’09 (Arnold, Dicky and Nivat’s Example)
Define linear transformation $\bar{x}^t = x^t T$:
T    x1  x2  x3  x4  x5
x1   1   0   0   0   0
x2   0   1   1   0   0
x3   0   1   0   1   0
x4   0   0   1   1   0
x5   0   0   0   0   1
$\bar{A} = [AT]_T$ (that is, $T^{-1}AT$)
$\bar{x}_0^t = x_0^t T$
$\bar{C} = [C]_T$ (that is, $T^{-1}C$)
[Figure: two state diagrams labelled $A$ and $\bar{A}$, one with states x1, x2, x3, x4, x5 and one with states x1, x23, x24, x34, x5]
Reachability Reduction HSCC’09 (Arnold, Dicky and Nivat’s Example)
Define linear transformation $\bar{x}^t = x^t T$:
T    x1  x2  x3  x4  x5
x1   1   0   0   0   0
x2   0   1   1   0   0
x3   0   1   0   1   0
x4   0   0   1   1   0
x5   0   0   0   0   1
$\bar{A}^t = [A^tT]_T$ (that is, $T^{-1}A^tT$)
$\bar{x}_0^t = x_0^t T$
$\bar{C} = [C]_T$ (that is, $T^{-1}C$)
[Figure: two state diagrams labelled $A$ and $\bar{A}$, one with states x1, x2, x3, x4, x5 and one with states x1, x23, x24, x34, x5]
Observability and minimization
Finite Automata as Linear Systems
Consider a finite automaton $M = (X, \Sigma, \delta, S, F)$ with:
- finite set of states X, finite input alphabet $\Sigma$,
- transition relation $\delta \subseteq X \times \Sigma \times X$,
- starting and final sets of states $S, F \subseteq X$
Let X denote row and column indices. Then:
- $\delta$ defines a matrix A,
- S and F define corresponding vectors
Finite Automata as Linear Systems
Now define the linear system $L_M = [S, A, C]$:
$x^t(n+1) = x^t(n)A$, $x_0 = S$
$y^t(n) = x^t(n)C$, $C = F$
Example: consider the following automaton:
[Figure: automaton with states x1, x2, x3 and transitions labelled a, a, b, b]
$A = \begin{pmatrix} 0 & a & b \\ 0 & a & 0 \\ 0 & 0 & b \end{pmatrix}$, $x_0 = \begin{pmatrix} \varepsilon \\ 0 \\ 0 \end{pmatrix}$, $C = \begin{pmatrix} 0 \\ \varepsilon \\ \varepsilon \end{pmatrix}$
Semimodule of Languages
$\wp(\Sigma^*)$ is an idempotent semiring (quantale):
- $(\wp(\Sigma^*), +, 0)$ is a commutative idempotent monoid (union),
- $(\wp(\Sigma^*), \cdot, 1)$ is a monoid (concatenation),
- multiplication distributes over addition,
- 0 is an annihilator: $0 \cdot a = 0$
$(\wp(\Sigma^*))^n$ is a semimodule over scalars in $\wp(\Sigma^*)$:
- r(x+y) = rx + ry, (r+s)x = rx + sx, (rs)x = r(sx),
- 1x = x, 0x = 0
Note: No additive and multiplicative inverses!
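The semiring laws listed above can be tried out on small finite languages. The following is a minimal sketch, assuming languages are represented as Python frozensets of strings; the names `ZERO`, `ONE`, `add` and `mul` are illustrative and do not come from the talk.

```python
from itertools import product

ZERO = frozenset()        # empty language: additive identity and annihilator
ONE = frozenset({""})     # {epsilon}: multiplicative identity


def add(x, y):
    """Semiring addition: union of languages."""
    return x | y


def mul(x, y):
    """Semiring multiplication: pairwise concatenation of words."""
    return frozenset(u + v for u, v in product(x, y))


a = frozenset({"a"})
b = frozenset({"b"})
c = frozenset({"ab", "c"})

idempotent = add(a, a) == a
annihilates = mul(ZERO, a) == ZERO and mul(a, ZERO) == ZERO
distributes = mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
commutes = mul(a, b) == mul(b, a)   # False: concatenation is not commutative
```

Addition is idempotent and multiplication is not commutative, which is exactly the situation in which the classical proof of the Cayley-Hamilton theorem is unavailable.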
Observability
Let L = [S,A,C]. Observe its output up to n-1:
$[y(0)\ y(1)\ \ldots\ y(n-1)] = x_0^t\,[C\ AC\ \ldots\ A^{n-1}C] = x_0^t O$   (1)
If L operates on a vector space:
- L is observable if: $x_0$ is uniquely determined by (1),
- Observability matrix O: has rank n,
- n outputs suffice: $A^nC = s_1A^{n-1}C + s_2A^{n-2}C + \ldots + s_nC$ (Cayley-Hamilton Theorem)
If L operates on a semimodule:
- L is observable if: $x_0$ is uniquely determined by (1)
The Cayley-Hamilton Theorem
( $A^n = s_1A^{n-1} + s_2A^{n-2} + \ldots + s_nI$ )
Permutations
Permutations are bijections of {1,...,n}:
- Example: $\pi$ = {(1,2),(2,3),(3,4),(4,1),(5,7),(6,6),(7,5)}
The graph $G(\pi)$ of a permutation $\pi$:
- $G(\pi)$ decomposes into: elementary cycles
The sign of a permutation $\pi$:
- Pos/Neg: even/odd number of even length cycles
- $P_n^+$ / $P_n^-$: all positive/negative permutations
[Figure: graph of the example permutation, with the cycles 1, 2, 3, 4 and 5, 7 and the fixed point 6]
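The cycle decomposition and the sign rule above can be checked on the example permutation. This is a small sketch, assuming the permutation is stored as a Python dict from each element to its image; the function names are illustrative.

```python
def cycles(perm):
    """Decompose a permutation (dict: element -> image) into its cycles."""
    seen, result = set(), []
    for start in sorted(perm):
        if start in seen:
            continue
        cycle, i = [], start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = perm[i]
        result.append(cycle)
    return result


def is_positive(perm):
    """Positive iff the number of even-length cycles is even."""
    even_length = sum(1 for c in cycles(perm) if len(c) % 2 == 0)
    return even_length % 2 == 0


# The example from the slide, written as element -> image.
pi = {1: 2, 2: 3, 3: 4, 4: 1, 5: 7, 6: 6, 7: 5}
pi_cycles = cycles(pi)          # [[1, 2, 3, 4], [5, 7], [6]]
pi_positive = is_positive(pi)   # True: two even-length cycles
```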
Eigenvalues in Vector Spaces
The eigenvalues of a square matrix A:
- Eigenvector equation: $x^tA = x^ts$ (eigenvector $x$, eigenvalue $s$), equivalently $x^t(sI-A) = 0$
The characteristic equation of A:
- The characteristic polynomial: $cp_A(s) = |sI-A|$
- The characteristic equation: $cp_A(s) = 0$
The determinant of A:
- The determinant: $|A| = \sum_{\pi \in P_n^+} \pi(A) - \sum_{\pi \in P_n^-} \pi(A)$
- Weight of a permutation: $\pi(A) = \prod_{i=1}^{n} A(i,\pi(i))$
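The permutation expansion of the determinant can be evaluated directly for a small matrix. The sketch below keeps the positive and negative sums separate, since only their difference needs subtraction. The integer matrix is an arbitrary illustration and the helper names are assumptions, not part of the talk.

```python
from itertools import permutations
from math import prod


def is_positive(perm):
    """perm[i] is the image of i; positive iff it has an even number of even-length cycles."""
    seen, even_cycles = set(), 0
    for start in range(len(perm)):
        length, i = 0, start
        while i not in seen:
            seen.add(i)
            i = perm[i]
            length += 1
        if length > 0 and length % 2 == 0:
            even_cycles += 1
    return even_cycles % 2 == 0


def weight(perm, a):
    """Weight of a permutation: product of the entries A(i, perm(i))."""
    return prod(a[i][perm[i]] for i in range(len(perm)))


def signed_sums(a):
    """Return (sum over positive permutations, sum over negative permutations)."""
    perms = list(permutations(range(len(a))))
    pos = sum(weight(p, a) for p in perms if is_positive(p))
    neg = sum(weight(p, a) for p in perms if not is_positive(p))
    return pos, neg


A = [[1, 2, 0],
     [3, 1, 1],
     [0, 1, 2]]
pos, neg = signed_sums(A)   # (2, 13)
det = pos - neg             # -11
```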
The Cayley-Hamilton Theorem (CHT)
A satisfies its characteristic equation: $cp_A(A) = 0$
$A = \begin{pmatrix} 0 & a_{12} & 0 \\ a_{21} & 0 & a_{23} \\ a_{31} & 0 & a_{33} \end{pmatrix}$
$sI-A = \begin{pmatrix} s & -a_{12} & 0 \\ -a_{21} & s & -a_{23} \\ -a_{31} & 0 & s-a_{33} \end{pmatrix}$
$|sI-A| = s^3 - a_{33}s^2 - a_{12}a_{21}s + a_{12}a_{21}a_{33} - a_{12}a_{23}a_{31} = 0$
$s^3 + a_{12}a_{21}a_{33} = a_{33}s^2 + a_{12}a_{21}s + a_{12}a_{23}a_{31}$
$A^3 + a_{12}a_{21}a_{33}I = a_{33}A^2 + a_{12}a_{21}A + a_{12}a_{23}a_{31}I$
[Figure: graph of A on the vertices 1, 2, 3 with edges $a_{12}$, $a_{21}$, $a_{23}$, $a_{31}$, $a_{33}$; the coefficient products in the equations are annotated as cycles of this graph]
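The subtraction-free form $A^3 + a_{12}a_{21}a_{33}I = a_{33}A^2 + a_{12}a_{21}A + a_{12}a_{23}a_{31}I$ can be checked numerically. The sketch below uses ordinary integer arithmetic; the concrete values chosen for the entries are arbitrary and not taken from the talk.

```python
def mat_mul(x, y):
    """Ordinary matrix product of two square integer matrices."""
    n = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(x, y):
    """Entrywise sum of two matrices."""
    return [[p + q for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def scale(s, m):
    """Multiply every entry of m by the scalar s."""
    return [[s * v for v in row] for row in m]


# Arbitrary illustrative values for the non-zero entries.
a12, a21, a23, a31, a33 = 2, 3, 5, 7, 11
A = [[0, a12, 0],
     [a21, 0, a23],
     [a31, 0, a33]]
I = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

A2 = mat_mul(A, A)
A3 = mat_mul(A2, A)

# Both sides contain only sums and products, no subtraction.
lhs = mat_add(A3, scale(a12 * a21 * a33, I))
rhs = mat_add(mat_add(scale(a33, A2), scale(a12 * a21, A)),
              scale(a12 * a23 * a31, I))
cht_holds = lhs == rhs
```

Because neither side uses subtraction, the same identity can be stated in a semiring.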
Implicit assumptions in CHT:
- Subtraction is available
- Multiplication is commutative
Does CHT hold in semirings?
- Subtraction not indispensable (Rutherford, Straubing)
- Commutativity still problematic
CHT in Commutative Semirings (Straubing’s Proof)
Lift original semiring to the semiring of paths:
- Matrix A is lifted to a matrix $G_A$ of paths:
$A = \begin{pmatrix} 0 & a_{12} & 0 \\ a_{21} & 0 & a_{23} \\ a_{31} & 0 & a_{33} \end{pmatrix}$, $G_A = \begin{pmatrix} 0 & (1,2) & 0 \\ (2,1) & 0 & (2,3) \\ (3,1) & 0 & (3,3) \end{pmatrix}$
- Permutation cycles are lifted to cyclic paths: $\pi = \{(1,2),(2,1)\}$ becomes $\bar{\pi} = (1,2)(2,1)$
Prove CHT in the semiring of paths:
- $\sum_{q=0}^{n} G_A^{n-q} P_q^{+} = \sum_{q=0}^{n} G_A^{n-q} P_q^{-}$ (CHT holds?)
- Show bijection between pos/neg products: for n = 3, the product (3,3)(1,2)(2,1) occurs in both $G_A^{0}P_3$ and $G_A^{2}P_1$
[Figure: two copies of the graph $G_A$ on the vertices 1, 2, 3 with edges (1,2), (2,1), (2,3), (3,1), (3,3)]
Port results back to the original semiring:
- Apply products: $\bar{\pi}(A)$
- Path application: $(\sigma_1 \ldots \sigma_n)(A) = A(\sigma_1) \cdots A(\sigma_n)$
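As a concrete instance of CHT in a commutative semiring without subtraction, the identity $A^3 + a_{12}a_{21}a_{33}I = a_{33}A^2 + a_{12}a_{21}A + a_{12}a_{23}a_{31}I$ for the 3x3 example matrix can be evaluated over the max-plus semiring, where "addition" is max, "multiplication" is +, the zero is minus infinity and the one is 0. The choice of max-plus and the numeric entries are assumptions made for illustration; they do not come from the talk.

```python
NEG_INF = float("-inf")   # the semiring zero


def mat_mul(x, y):
    """Max-plus matrix product: entry (i, j) is max over k of x[i][k] + y[k][j]."""
    n = len(x)
    return [[max(x[i][k] + y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_add(x, y):
    """Max-plus matrix sum: entrywise maximum."""
    return [[max(p, q) for p, q in zip(rx, ry)] for rx, ry in zip(x, y)]


def scale(s, m):
    """Max-plus scalar multiple: add s to every entry."""
    return [[s + v for v in row] for row in m]


def identity(n):
    """Max-plus identity matrix: 0 on the diagonal, -inf elsewhere."""
    return [[0 if i == j else NEG_INF for j in range(n)] for i in range(n)]


a12, a21, a23, a31, a33 = 1, 2, 3, 4, 5
A = [[NEG_INF, a12, NEG_INF],
     [a21, NEG_INF, a23],
     [a31, NEG_INF, a33]]
I = identity(3)

A2 = mat_mul(A, A)
A3 = mat_mul(A2, A)

# Products of scalars become ordinary sums in max-plus.
lhs = mat_add(A3, scale(a12 + a21 + a33, I))
rhs = mat_add(mat_add(scale(a33, A2), scale(a12 + a21, A)),
              scale(a12 + a23 + a31, I))
cht_holds = lhs == rhs
```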
CHT in Idempotent Semirings
Lift original semiring to the semiring of paths:
- Matrix A: order in paths important
- Permutation cycles: rotations are distinct
$\pi = \{(1,2),(2,1)\}$, $\bar{\pi} = \begin{pmatrix} (1,2)(2,1) & 0 & 0 \\ 0 & (2,1)(1,2) & 0 \\ 0 & 0 & 0 \end{pmatrix}$
Prove CHT in the semiring of paths:
- Products $\bar{\pi} \sqcup\!\sqcup G^{n-|\pi|}$: cycles to be properly inserted
$\bar{\pi} \sqcup\!\sqcup G^{n-|\pi|} = \bar{\pi}G^{n-|\pi|} + G\bar{\pi}G^{n-|\pi|-1} + \ldots + G^{n-|\pi|}\bar{\pi}$
Port results back to the original semiring:
- Apply products: $(\bar{\pi} \sqcup\!\sqcup G^{n-|\pi|})(A)$
Theorem: $G_A^n = \sum_{q=1}^{n} \sum_{\pi \in P_q} \bar{\pi} \sqcup\!\sqcup G_A^{n-|\pi|}$
Proof:
LHS $\subseteq$ RHS: Let $\sigma \in$ LHS
- Pigeonhole: $\sigma$ has at least one cycle $\bar{\pi}$ in $s$
- Structural: $\bar{\pi}$ is a simple cycle of length k
- Remove $\bar{\pi}$ in $\sigma$: $\sigma[s/\bar{\pi}]$ is in $G^{n-|\pi|}$
- Shuffle-product: $\bar{\pi} \sqcup\!\sqcup G^{n-|\pi|}$ reinserts $\bar{\pi}$
RHS $\subseteq$ LHS: Let $\sigma \in$ RHS
- No wrong path: The shuffle is sound
- Idempotence: Takes care of multiple copies
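The pigeonhole and removal steps of the proof can be illustrated on the three-vertex example graph: a path of n edges visits n+1 vertices, so some vertex repeats, and the edges between the first repetition form a simple cycle whose removal leaves a shorter path. The sketch below covers only these two steps, not the full theorem; the edge list is the one from the example matrix, and the function names are hypothetical.

```python
EDGES = [(1, 2), (2, 1), (2, 3), (3, 1), (3, 3)]   # non-zero entries of the example matrix


def paths(length):
    """All edge sequences of the given length that chain head to tail."""
    result = [[e] for e in EDGES]
    for _ in range(length - 1):
        result = [p + [e] for p in result for e in EDGES if p[-1][1] == e[0]]
    return result


def extract_simple_cycle(path):
    """Split a path into (first simple cycle, remaining path); cycle is None if absent."""
    first_seen = {path[0][0]: 0}   # vertex -> position at which it was first visited
    for k, (_, target) in enumerate(path):
        if target in first_seen:
            start = first_seen[target]
            # Edges start..k lead from the repeated vertex back to itself.
            return path[start:k + 1], path[:start] + path[k + 1:]
        first_seen[target] = k + 1
    return None, path


n = 3
splits = [extract_simple_cycle(p) for p in paths(n)]
every_path_has_cycle = all(cycle is not None for cycle, _ in splits)
example = extract_simple_cycle([(1, 2), (2, 3), (3, 3)])
# example == ([(3, 3)], [(1, 2), (2, 3)])
```

The remaining path has length n minus the cycle length, which matches the exponent $n-|\pi|$ in the theorem.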
CHT in Idempotent Semirings
Define: (i, i) =
if (i, i) = 0
0 if (i, i) =
Theorem: classic CHT can be derived by using:
- Gn-|| = G
n-|| + G
n-||
- application of CHT to G
n-|| and G
n-||
Matrix CHT: can be regarded as a constructive
version of the pumping lemma.
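Finite Automata as Linear Systems
Now define the linear system $L_M = [S, A, C]$:
$x^t(n+1) = x^t(n)A$, $x_0 = S(\varepsilon)$
$y^t(n) = x^t(n)C$, $C = F(\varepsilon)$
Example: consider the following automaton:
[Figure: automaton L1 with states x1, x2, x3 and transitions labelled a, a, b, b]
$A(a) = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{pmatrix}$, $x_0(\varepsilon) = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}$
$A(b) = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$, $C(\varepsilon) = \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}$
$A = A(a)a + A(b)b$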
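Reading a word then amounts to iterating $x^t(n+1) = x^t(n)A$ with the letter matrices and testing the result against C. A minimal sketch over the Boolean semiring, assuming the matrices A(a), A(b) and the vectors $x_0$, C of the example; the function names are illustrative.

```python
# Letter matrices and vectors of the example automaton L1.
A = {
    "a": [[0, 1, 0], [0, 1, 0], [0, 0, 0]],
    "b": [[0, 0, 1], [0, 0, 0], [0, 0, 1]],
}
X0 = [1, 0, 0]   # initial state vector: start in x1
C = [0, 1, 1]    # output vector: x2 and x3 are final


def vec_mat(x, m):
    """Row vector times matrix over the Boolean semiring (or, and)."""
    return [int(any(x[i] and m[i][j] for i in range(len(x))))
            for j in range(len(m[0]))]


def accepts(word):
    """Iterate x(n+1) = x(n) A(letter), then take the output y = x C."""
    x = X0
    for letter in word:
        x = vec_mat(x, A[letter])
    return any(xi and ci for xi, ci in zip(x, C))


results = {w: accepts(w) for w in ["", "a", "aa", "ab", "bbb"]}
# {'': False, 'a': True, 'aa': True, 'ab': False, 'bbb': True}
```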
Observability
Let L = [S,A,C] be an n-state automaton. Its output:
$[y(0)\ y(1)\ \ldots\ y(n-1)] = x_0^t\,[C\ AC\ \ldots\ A^{n-1}C] = x_0^t O$   (1)
L is observable if $x_0$ is uniquely determined by (1).
Example: the observability matrix O of L1 is:
O     ε   a   b   aa  ab  ba  bb
x1    0   1   1   1   0   0   1
x2    1   1   0   1   0   0   0
x3    1   0   1   0   0   0   1
(column blocks: C for ε; AC for a, b; $A^2C$ for aa, ab, ba, bb)
[Figure: automaton L1 with states x1, x2, x3 and transitions labelled a, a, b, b]
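The observability matrix of L1 can be recomputed column by column: the column for a word w is A(w)C, which records the states from which w leads to a final state. A minimal sketch over the Boolean semiring, assuming the letter matrices A(a), A(b) and the output vector C of the example; the helper names are illustrative.

```python
from itertools import product

A = {
    "a": [[0, 1, 0], [0, 1, 0], [0, 0, 0]],
    "b": [[0, 0, 1], [0, 0, 0], [0, 0, 1]],
}
C = [0, 1, 1]
N = 3   # number of states


def mat_vec(m, v):
    """Matrix times column vector over the Boolean semiring (or, and)."""
    return [int(any(m[i][j] and v[j] for j in range(len(v))))
            for i in range(len(m))]


def column(word):
    """Column A(w) C, built right to left: A(w1) (A(w2) (... C))."""
    v = C
    for letter in reversed(word):
        v = mat_vec(A[letter], v)
    return v


# Words of length 0 .. n-1 in the order used by the table.
words = ["".join(p) for k in range(N) for p in product("ab", repeat=k)]
columns = [column(w) for w in words]
O = [[col[i] for col in columns] for i in range(N)]   # one row per state

rows_distinct = len({tuple(row) for row in O}) == N
```

In this example the three rows are pairwise distinct, so the states x1, x2, x3 produce different outputs on words of length at most 2.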