TRANSCRIPT
An introduction to Kolmogorov complexity
(and its applications)
Laurent Bienvenu (LIAFA, CNRS & Université de Paris 7)
CIRM, Marseille, February 9, 2010
1. Kolmogorov complexity
Some truly random sequences?
Let us imagine a company/website selling DVDs, each containing a sequence of 10^9 bits, and advertised as “truly random”.
We decide to order 4 such DVDs.
1st DVD:
000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000.....
2nd DVD:
01010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010101010...
3rd DVD:
001001000011111101101010110010001000010110100011000010001101001100010011000110011000101000101110000000110111000001110011010001001010010100001001001110000010001000101001100111110011000111010000000010000010111010111010100110....
It’s almost π written in binary!
4th DVD:
111000000111000000000111000111000111000111000000111111000000000111111000000000000111000000111000000111000111111111111111000111000111000111000000000111000000111000000000111111000111000000000000111111111000000111000111111111000......
None of these sequences look “random”
A priori, all sequences of length N have the same probability to occur.
A posteriori, some of them look non-random.
How to formalize this intuition?
Leibniz’s philosophy.
G.W. Leibniz (∼ 1686):
“[...] But it is good to consider that God does nothing out of order [...] for as regards the universal order, everything conforms to it. This is so true that not only does nothing happen in the world which is absolutely irregular, but one could not even imagine such a thing.”
“[...] But when a rule is very complex, what conforms to it passes for irregular.”
The shortest description.
Idea: a non-random sequence will be regular, i.e., can be described easily.
We have to be careful with the term “description”. To wit, the so-called Berry paradox (discussed by Russell, 1927); consider:
“The smallest integer which cannot be described in fewer than one hundred words”
Kolmogorov complexity.
Chaitin, Kolmogorov (1966).
Definition. Let x be a finite binary string. We call the Kolmogorov complexity of x the quantity K(x) defined by
K(x) = the length of the shortest computer program (in binary) that generates x
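Although K itself will turn out to be uncomputable, any off-the-shelf compressor gives a computable upper bound on it: the compressed string, together with a fixed decompressor, is a program that outputs x. A minimal sketch in Python (the choice of zlib and the sample strings are just for illustration):

```python
import hashlib
import zlib

def description_length(x: bytes) -> int:
    # One computable description of x: its zlib-compressed form.
    # With a fixed decompressor attached, this is a program generating x,
    # so it upper-bounds K(x) up to an additive constant.
    return len(zlib.compress(x, 9))

# A very regular string, like the first DVDs, has a short description.
regular = b"01" * 500                      # 1000 bytes

# A pseudo-random-looking string (a SHA-256 chain) resists compression.
chunks, seed = [], b"seed"
while sum(len(c) for c in chunks) < 1000:
    seed = hashlib.sha256(seed).digest()
    chunks.append(seed)
noisy = b"".join(chunks)[:1000]            # 1000 bytes

print(description_length(regular))   # far below 1000
print(description_length(noisy))     # close to (or above) 1000
```

This only bounds K from above: a string may have a short description that no practical compressor finds.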
Making things fully formal (1).
One may argue that K(x) depends on what we mean by “program” and also on the “operating system” on which our “programs” run.
Choose a model of computation for functions {0,1}* → {0,1}* (Turing machines, RAM machines, etc.) such that there exists a universal machine U, defined by:
U(0^n 1 p) = M_n(p)
where M_n is the n-th machine.
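A toy sketch of this input convention: the prefix 0^n 1 selects the machine, and the remainder is the program handed to it. The three “machines” below are a hypothetical stand-in for a real enumeration, just to show the decoding:

```python
# Hypothetical stand-in for an enumeration M_0, M_1, M_2, ... of machines.
MACHINES = [
    lambda p: p,         # M_0: identity
    lambda p: p + p,     # M_1: doubling
    lambda p: p[::-1],   # M_2: reversal
]

def U(program: str) -> str:
    # Decode the self-delimiting header 0^n 1: count leading zeros...
    n = 0
    while n < len(program) and program[n] == "0":
        n += 1
    # ...then run the n-th machine on the remainder of the input.
    return MACHINES[n](program[n + 1:])

print(U("1" + "abc"))    # runs M_0 on "abc"
print(U("01" + "abc"))   # runs M_1 on "abc"
```

The header costs n + 1 bits, which is where the additive constant on the next slide comes from.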
Making things fully formal (2).
The machine U is additively optimal, i.e. it describes strings at least as efficiently as any other machine, up to an additive constant. Formally:
Proposition. For any given machine M, there exists a constant c_M such that for all p, x: if M(p) = x is defined, then there exists p' with |p'| ≤ |p| + c_M and U(p') = x.
(Indeed, if M is the n-th machine M_n, take p' = 0^n 1 p, so c_M = n + 1 works.)
Then, set
K(x) = min{ |p| : U(p) = x }
We can view the minimizing p as the shortest description or ideal compression of x.
Kolmogorov complexity is well-defined, up to an additive constant.
Typically, we prove results of the type
K(x) ≤ |x|/2 + O(1),
K(x) ≥ n − O(1),
etc.
Basic properties (1).
The complexity of a string x is at most its length.
Proposition. For any string x, K(x) ≤ |x| + O(1).
A very intuitive result, as one can always describe a string by giving it explicitly.
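This can be seen concretely: a program that simply quotes x verbatim has length |x| plus a fixed overhead. A sketch in Python (the print-wrapper is of course just one arbitrary choice of “explicit description”):

```python
def literal_program(x: str) -> str:
    # A program that outputs x by quoting it explicitly.
    return f"print({x!r})"

p = literal_program("010110")
print(p)                        # print('010110')
# The program is longer than x by a constant number of characters
# (the fixed cost of print('...')), which is exactly the content
# of K(x) <= |x| + O(1).
print(len(p) - len("010110"))
```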
Basic properties (2).
Proposition. For all k:
#{ x : K(x) < k } < 2^k
Indeed, there are 2^0 + 2^1 + ... + 2^{k−1} = 2^k − 1 < 2^k programs of size < k.
Corollary. For a given n and c, there is only a proportion 2^{−c} of strings of length n whose complexity is less than n − c.
Intuitive again: a string chosen at random should be close to incompressible with high probability. From this, it makes sense to call algorithmically random any string x whose complexity is close to |x|.
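The count behind the proposition is easy to check mechanically: there are exactly 2^0 + ... + 2^{k−1} = 2^k − 1 binary programs of length < k, and each describes at most one string.

```python
def programs_shorter_than(k: int) -> int:
    # Number of binary strings of length < k: 2^0 + 2^1 + ... + 2^(k-1).
    return sum(2 ** i for i in range(k))

# The geometric sum equals 2^k - 1, hence strictly fewer than 2^k
# strings can have complexity below k.
for k in range(1, 12):
    assert programs_shorter_than(k) == 2 ** k - 1

print(programs_shorter_than(10))   # 1023
```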
Basic properties (3).
Another fundamental property is that it is not possible to increase Kolmogorov complexity by algorithmic means.
Proposition. For any computable function f, there exists a constant c_f such that for all x,
K(f(x)) ≤ K(x) + c_f
This also shows that we can extend Kolmogorov complexity to any type of object which can be encoded as a binary string (integers, finite graphs, pairs of strings). The choice of encoding only affects the complexity by a constant.
Kolmogorov complexity for other objects.
For an integer m, K(m) ≤ log(m) + O(1)
For a finite graph G with n vertices, K(G) ≤ n^2 + O(1)
For a pair of objects x, y, K(x, y) ≤ K(x) + K(y) + O(log |x| + log |y|) (the log term disappears if x and y are of about the same length)
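For the graph bound, the n^2 bits are just the adjacency matrix read row by row; a hypothetical encoder/decoder pair sketching this:

```python
def encode_graph(adj: list) -> str:
    # Flatten the n x n adjacency matrix into a string of n^2 bits.
    # A fixed decoder plus these bits describes G, so K(G) <= n^2 + O(1).
    return "".join(str(bit) for row in adj for bit in row)

def decode_graph(bits: str) -> list:
    # Recover n from the length of the encoding, then rebuild the matrix.
    n = round(len(bits) ** 0.5)
    return [[int(bits[i * n + j]) for j in range(n)] for i in range(n)]

triangle = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]   # complete graph on 3 vertices
print(encode_graph(triangle))                  # 9 = 3^2 bits
```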
Basic properties (4).
The last important property is bad news: Kolmogorov complexity is not a computable function :-(
The proof is essentially Berry’s paradox! Suppose K is computable. We can then define a computable function f : N → N by
f(n) = min { m : K(m) ≥ n }
(the minimum exists because K is unbounded). By definition, for all n, K(f(n)) ≥ n. But also K(f(n)) ≤ K(n) + c_f (non-creation of complexity), and K(n) ≤ log(n) + O(1).
So we would have log(n) + O(1) ≥ n for all n. An obvious contradiction!
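The function f in the proof can be written down directly. The sketch below takes the hypothetical computable K as a parameter so that it can be run with a stand-in (the real K is uncomputable, so no such argument actually exists):

```python
def berry(n: int, K) -> int:
    # f(n) = min { m : K(m) >= n }. If K were computable, this search
    # would define a total computable function (complexities are
    # unbounded), yet its output has complexity >= n while describing
    # "berry(n)" needs only about log(n) bits -- Berry's paradox.
    m = 0
    while K(m) < n:
        m += 1
    return m

# Running it with a stand-in "complexity" measure, NOT the real K:
# here K(m) is just the bit length of m.
print(berry(3, lambda m: m.bit_length()))   # first m with bit_length >= 3
```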
Conditional complexity.
Kolmogorov complexity can be seen as an algorithmic version of entropy. As with entropy, we can define a conditional version:
K(x | y) = the length of the shortest computer program (in binary) that transforms y into x
(the formalization proceeds as before).
A fundamental result is the symmetry of information (Levin and Kolmogorov, ∼ 1970):
K(x, y) = K(x) + K(y | x) (up to a logarithmic term)
2. Randomness for infinite sequences