
Computability and Randomness

Rod Downey and Denis R. Hirschfeldt

Historical Roots

Von Mises. Around 1930, Kolmogorov and others founded the theory of probability, basing it on measure theory. Probability theory is concerned with the distribution of outcomes in sample spaces. It does not seek to give any meaning to the notion of an individual object, such as a single real number or binary string, being random, but rather studies the expected values of random variables.

Rod Downey is a professor of mathematics and the deputy head of the School of Mathematics and Statistics at Victoria University Wellington. His email address is [email protected].

Denis R. Hirschfeldt is a professor of mathematics at the University of Chicago. His email address is [email protected].

For permission to reprint this article, please contact: [email protected].

DOI: https://doi.org/10.1090/noti1905

How could a binary string representing a sequence of 𝑛 coin tosses be random, when all strings of length 𝑛 have the same probability of 2−𝑛 for a fair coin?

Less well known than the work of Kolmogorov are early attempts to answer this kind of question by providing notions of randomness for individual objects. The modern theory of algorithmic randomness realizes this goal. One way to develop this theory is based on the idea that an object is random if it passes all relevant “randomness tests.” For example, by the law of large numbers, for a random real 𝑋, we would expect the number of 1’s in the binary expansion of 𝑋 to have limiting frequency 1/2. (That is, writing 𝑋(𝑗) for the 𝑗th bit of this expansion, we would expect to have lim𝑛→∞ |{𝑗 < 𝑛 ∶ 𝑋(𝑗) = 1}|/𝑛 = 1/2.) Indeed, we would expect 𝑋 to be normal to base 2, meaning that for any binary string 𝜎 of length 𝑘, the occurrences of 𝜎 in the binary



expansion of 𝑋 should have limiting frequency 2−𝑘. Since base representation should not affect randomness, we would expect 𝑋 to be normal in this sense no matter what base it were written in, so that in base 𝑏 the limiting frequency would be 𝑏−𝑘 for a string 𝜎 of length 𝑘. Thus 𝑋 should be what is known as absolutely normal.

The idea of normality, which goes back to Borel (1909), was extended by von Mises (1919), who suggested the following definition of randomness for individual binary sequences.¹ A selection function is an increasing function 𝑓 ∶ ℕ → ℕ. We think of 𝑓(𝑖) as the 𝑖th place selected in forming a subsequence of a given sequence. (For the definition of normality above, where we consider the entire sequence, 𝑓(𝑖) = 𝑖.) Von Mises suggested that a sequence 𝑎0𝑎1… should be random if any selected subsequence 𝑎𝑓(0)𝑎𝑓(1)… is normal.
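To make the selection-function idea concrete, here is a minimal Python sketch (ours, not from the article; the particular sequence and selection functions are illustrative assumptions) that computes the frequency of 1’s along the places picked out by a selection function:

    # Toy illustration of von Mises-style selection (illustrative only).
    def selected_frequency(bits, select):
        """Frequency of 1's along bits[select(0)], bits[select(1)], ..."""
        chosen = []
        i = 0
        while select(i) < len(bits):
            chosen.append(bits[select(i)])
            i += 1
        return sum(chosen) / len(chosen) if chosen else None

    # The (very non-random) sequence 010101... has the right frequency of
    # 1's overall, but the computable selection f(i) = 2i exposes it.
    bits = [0, 1] * 5000
    print(selected_frequency(bits, lambda i: i))      # 0.5 over all places
    print(selected_frequency(bits, lambda i: 2 * i))  # 0.0 on even places

On this finite sample, the sequence passes the plain frequency test but fails the test given by 𝑓(𝑖) = 2𝑖, which is exactly the kind of failure von Mises’ definition is designed to detect.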

There is, of course, an obvious problem with this approach. For any sequence 𝑋 with infinitely many 1’s we could let 𝑓 select the positions where 1’s occur, and 𝑋 would fail the test determined by 𝑓. However, it does not seem reasonable to be able to choose the testing places after selecting an 𝑋. The question is then: What kinds of selection functions should be allowed, to capture the intuition that we ought not to be able to sample from a random sequence and get the wrong frequencies? It is reasonable to regard prediction as a computational process, and hence restrict ourselves to computable selection functions. Indeed, this suggestion was eventually made by Church (1940), though von Mises’ work predates the definition of computable function, so he did not have a good way to make his definition mathematically precise.

As we will see, von Mises’ approach had a more significant flaw, but we can build on its fundamental idea: Imagine that we are judges deciding whether a sequence 𝑋 should count as random. If 𝑋 passes all tests we can (in principle) devise given our computational power, then we should regard 𝑋 as random since, as far as we are concerned, 𝑋 has all the expected properties of a random object. We will use this intuition and the apparatus of computability and complexity theory to describe notions of algorithmic randomness.

Aside from the intrinsic interest of such an approach, it leads to useful mathematical tools. Many processes in mathematics are computable, and the expected behavior of such a process should align with the behavior obtained by providing it with an algorithmically random input. Hence, instead of having to analyze the relevant distribution and its statistics, we can simply argue about the behavior of the process on a single input.

For instance, the expected number of steps of a sorting algorithm should be the same as that for a single algorithmically random input. We could also be more fine-grained and seek to understand exactly “how much” randomness is needed for certain typical behaviors to arise. (See the section on “Some Applications.”)

¹Due to space limitations, we omit historical citations and those found in the books Downey and Hirschfeldt [9], Li and Vitanyi [18], or Nies [24] from the list of references. In some sections below, we also cite secondary sources where additional references can be found.

As we will discuss, algorithmic randomness also goes hand in hand with other parts of algorithmic information theory, such as Kolmogorov complexity, and has ties with notions such as Shannon entropy and fractal dimension.

Some basic computability theory. In the 1930s, Church, Gödel, Kleene, Post, and most famously Turing (1937) gave equivalent mathematical definitions capturing the intuitive notion of a computable function, leading to the Church-Turing Thesis, which can be taken as asserting that a function (from ℕ to ℕ, say) is computable if and only if it can be computed by a Turing machine. (This definition can easily be transferred to other objects of countable mathematics. For instance, we think of infinite binary sequences as functions ℕ → {0, 1}, and identify sets of natural numbers with their characteristic functions.) Nowadays, we can equivalently regard a function as computable if we can write code to compute it in any given general-purpose programming language (assuming the language can address unlimited memory). It has also become clear that algorithms can be treated as data, and hence that there is a universal Turing machine, i.e., there is a listing Φ0, Φ1, … of all Turing machines and a single algorithm that, on input ⟨𝑒, 𝑛⟩, computes the result Φ𝑒(𝑛) of running Φ𝑒 on input 𝑛.²

It is important to note that a Turing machine might not halt on a given input, and hence the functions computed by Turing machines are in general partial. Indeed, as Turing showed, the halting problem “Does the 𝑒th Turing machine halt on input 𝑛?” is algorithmically unsolvable. Church and Turing famously showed that Hilbert’s Entscheidungsproblem (the decision problem for first-order logic) is unsolvable, in Turing’s case by showing that the halting problem can be coded into first-order logic. Many other problems have since been shown to be algorithmically unsolvable by similar means.

We write Φ𝑒(𝑛)↓ to mean that the machine Φ𝑒 eventually halts on input 𝑛. Then ∅′ = {⟨𝑒, 𝑛⟩ ∶ Φ𝑒(𝑛)↓} is a set representing the halting problem. This set is an example of a noncomputable computably enumerable (c.e.) set, which means that the set can be listed (not necessarily in numerical order) by some algorithm.
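To see what such a listing looks like, here is a hedged Python sketch of dovetailing, with a made-up stand-in for the machines Φ𝑒 (a genuine enumeration of ∅′ would dovetail real Turing machine simulations in exactly this way):

    # Dovetailing (toy model): at stage s, run each pair (e, n) with
    # e, n <= s for one more step; every halting pair eventually appears.
    def machine(e, n):
        """Made-up stand-in for Phi_e(n): halts iff n is even."""
        if n % 2 == 0:
            for _ in range((e + 1) * (n + 1)):
                yield                    # one computation step
        else:
            while True:
                yield                    # diverges

    def enumerate_halting_pairs(stages):
        sims, halted, listing = {}, set(), []
        for s in range(stages):
            for e in range(s + 1):
                for n in range(s + 1):
                    if (e, n) in halted:
                        continue
                    gen = sims.setdefault((e, n), machine(e, n))
                    try:
                        next(gen)            # advance one step
                    except StopIteration:    # simulated machine halted
                        halted.add((e, n))
                        listing.append((e, n))
        return listing

    print(enumerate_halting_pairs(30)[:8])   # listed out of numerical order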

²The realization that such universal machines are possible helped lead to the development of modern computers. Previously, machines had been purpose-built for given tasks. In a 1947 lecture on his design for the Automated Computing Engine, Turing said, “The special machine may be called the universal machine; it works in the following quite simple manner. When we have decided what machine we wish to imitate we punch a description of it on the tape of the universal machine . . . The universal machine has only to keep looking at this description in order to find out what it should do at each stage. Thus the complexity of the machine to be imitated is concentrated in the tape and does not appear in the universal machine proper in any way. . . . [D]igital computing machines such as the ACE . . . are in fact practical versions of the universal machine.” From our contemporary point of view, it may be difficult to imagine how novel this idea was.




Another important notion is that of Turing reducibility (which we define for sets of natural numbers but is similarly defined for functions), where 𝐴 is Turing reducible to 𝐵, written as 𝐴 ≤T 𝐵, if there is an algorithm for computing 𝐴 when given access to 𝐵. That is, the algorithm is allowed access to answers to questions of the form “Is 𝑛 in 𝐵?” during its execution. This notion can be formalized using Turing machines with oracle tapes. If 𝐴 ≤T 𝐵, then we regard 𝐴 as no more complicated than 𝐵 from a computability-theoretic perspective. We also say that 𝐴 is 𝐵-computable or computable relative to 𝐵. Turing reducibility naturally leads to an equivalence relation, where 𝐴 and 𝐵 are Turing equivalent if 𝐴 ≤T 𝐵 and 𝐵 ≤T 𝐴. The (Turing) degree of 𝐴 is its equivalence class under this notion. (There are several other notions of reducibility and resulting degree structures in computability theory, but Turing reducibility is the central one.)

In general, the process of allowing access to an oracle in our algorithms is known as relativization. As in the unrelativized case, we can list the Turing machines Φ𝐵0, Φ𝐵1, … with oracle 𝐵, and let 𝐵′ = {⟨𝑒, 𝑛⟩ ∶ Φ𝐵𝑒(𝑛)↓} be the relativization of the halting problem to 𝐵. This set is called the (Turing) jump of 𝐵. The jump operation taking 𝐵 to 𝐵′ is very important in computability theory, one reason being that 𝐵′ is the most complicated set that is still c.e. relative to 𝐵, i.e., 𝐵′ is c.e. relative to 𝐵 and every set that is c.e. relative to 𝐵 is 𝐵′-computable. There are several other important classes of sets that can be defined in terms of the jump. For instance, 𝐴 is low if 𝐴′ ≤T ∅′ and high if ∅″ ≤T 𝐴′ (where ∅″ = (∅′)′). Low sets are in certain ways “close to computable,” while high ones partake of some of the power of ∅′ as an oracle. These properties are invariant under Turing equivalence, and hence are also properties of Turing degrees.

Martin-Löf randomness. As mentioned above, Church suggested that a sequence should count as algorithmically random if it is random in the sense of von Mises with selection functions restricted to the computable ones. However, in 1939, Ville showed that von Mises’ approach cannot work in its original form, no matter what countable collection of selection functions we choose. Let 𝑋 ↾ 𝑛 denote the first 𝑛 bits of the binary sequence 𝑋.

Theorem 1 (Ville (1939)). For any countable collection of selection functions, there is a sequence 𝑋 that passes all von Mises tests associated with these functions, such that for every 𝑛, there are more 0’s than 1’s in 𝑋 ↾ 𝑛.

Clearly, Ville’s sequence cannot be regarded as random in any reasonable sense.

We could try to repair von Mises’ definition by adding further tests, reflecting statistical laws beyond the law of large numbers. But which ones? Ville suggested ones reflecting the law of the iterated logarithm, which would take care of his specific example. But how could we know that further examples along these lines—i.e., sequences satisfying both von Mises’ and Ville’s tests, yet failing to have some other property we expect of random sequences—would not arise?

The situation was finally clarified in the 1960s by Martin-Löf (1966). In probability theory, “typicality” is quantified using measure theory, leading to the intuition that random objects should avoid null sets. Martin-Löf noticed that tests like von Mises’ and Ville’s can be thought of as effectively null sets. His idea was that, instead of considering specific tests based on particular statistical laws, we should consider all possible tests corresponding to some precisely defined notion of effectively null set. The restriction to such a notion gets around the problem that no sequence can avoid being in every null set.

To give Martin-Löf’s definition, we work for convenience in Cantor space 2𝜔, whose elements are infinite binary sequences. (We associate a real number with its binary expansion, thought of as a sequence, so we will also obtain a definition of algorithmic randomness for reals. The choice of base is not important. For example, all of the notions of randomness we consider are enough to ensure absolute normality.) The basic open sets of Cantor space are the ones of the form [𝜎] = {𝑋 ∈ 2𝜔 ∶ 𝑋 extends 𝜎} for 𝜎 ∈ 2<𝜔, where 2<𝜔 is the set of finite binary strings. The uniform measure 𝜆 on this space is obtained by defining 𝜆([𝜎]) = 2−|𝜎|. We say that a sequence 𝑇0, 𝑇1, … of open sets in 2𝜔 is uniformly c.e. if there is a c.e. set 𝐺 ⊆ ℕ × 2<𝜔 such that 𝑇𝑛 = ⋃{[𝜎] ∶ (𝑛, 𝜎) ∈ 𝐺}.

Definition 2. A Martin-Löf test is a sequence 𝑇0, 𝑇1, … of uniformly c.e. open sets such that 𝜆(𝑇𝑛) ≤ 2−𝑛. A sequence 𝑋 passes this test if 𝑋 ∉ ⋂𝑛 𝑇𝑛. A sequence is Martin-Löf random (ML-random) if it passes all Martin-Löf tests.

The intersection of a Martin-Löf test is our notion of effectively null set. Since there are only countably many Martin-Löf tests, and each determines a null set in the classical sense, the collection of ML-random sequences has measure 1. It can be shown that Martin-Löf tests include all the ones proposed by von Mises and Ville, in Church’s computability-theoretic versions. Indeed they include all tests that are “computably performable,” which avoids the problem of having to adaptively introduce more tests as more Ville-like sequences are found.
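The measure computations in Definition 2 are concrete enough to run. Below is a hedged Python sketch of a single toy Martin-Löf test (our illustrative choice, not one from the article): 𝑇𝑛 is generated by the single string consisting of 𝑛 + 1 zeros, so 𝜆(𝑇𝑛) = 2−(𝑛+1) ≤ 2−𝑛, and the only sequence in every 𝑇𝑛 is 000…, which this test therefore catches.

    from fractions import Fraction

    # Toy ML test: T_n is the basic open set generated by n+1 zeros.
    def generators(n):
        """Prefix-free strings whose basic open sets union to T_n."""
        return ['0' * (n + 1)]

    def measure(strings):
        """lambda of a union of basic open sets [sigma] (prefix-free)."""
        return sum(Fraction(1, 2 ** len(s)) for s in strings)

    def prefix_lands_in(prefix, n):
        """Does every extension of this finite prefix lie in T_n?"""
        return any(prefix.startswith(s) for s in generators(n))

    for n in range(5):                      # the measure bound of Def. 2
        assert measure(generators(n)) <= Fraction(1, 2 ** n)

    x = '0001' + '0' * 96                   # only three leading zeros
    print([prefix_lands_in(x, n) for n in range(5)])
    # [True, True, True, False, False]: x escapes T_3, so any sequence
    # extending x passes this particular test.

A real test is an infinite, computably enumerated family, but each stage of the enumeration looks like this finite picture.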

Martin-Löf’s effectivization of measure theory allowed him to consider the laws a random sequence should obey from an abstract point of view, leading to a mathematically robust definition. As Jack Lutz said in a talk at the 7th Conference on Computability, Complexity, and Randomness



(Cambridge, 2012), “Placing computability constraints on a nonconstructive theory like Lebesgue measure seems a priori to weaken the theory, but it may strengthen the theory for some purposes. This vision is crucial for present-day investigations of individual random sequences, dimensions of individual sequences, measure and category in complexity classes, etc.”

The three approaches. ML-randomness can be thought of as the statistician’s approach to defining algorithmic randomness, based on the intuition that random sequences should avoid having statistically rare properties. There are two other major approaches:

• The gambler’s approach: random sequences should be unpredictable.

• The coder’s approach: random sequences should not have regularities that allow us to compress the information they contain.

The gambler’s approach may be the most immediately intuitive one to the average person. It was formalized in the computability-theoretic setting by Schnorr (1971), using the idea that we should not be able to make arbitrarily much money when betting on the bits of a random sequence. The following notion is a simple special case of the notion of a martingale from probability theory. (See [9, Section 6.3.4] for further discussion of the relationship between these concepts.)

Definition 3. A martingale is a function 𝑓 ∶ 2<𝜔 → ℝ≥0 such that

𝑓(𝜎) = (𝑓(𝜎0) + 𝑓(𝜎1))/2

for all 𝜎. We say that 𝑓 succeeds on 𝑋 if lim sup𝑛→∞ 𝑓(𝑋 ↾ 𝑛) = ∞.

We think of 𝑓 as the capital we have when betting on the bits of a binary sequence according to a particular betting strategy. The displayed equation ensures that the betting is fair. Success then means that we can make arbitrarily much money when betting on 𝑋, which should not happen if 𝑋 is random. By considering martingales with varying levels of effectivity, we get various notions of algorithmic randomness, including ML-randomness itself, as it turns out.
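A martingale is easy to experiment with. The following Python sketch (an illustrative strategy of our own choosing, not from the article) bets its entire capital that the next bit is 0; the assertions check the fairness equation of Definition 3, and the capital 2𝑛 along a run of 𝑛 zeros shows success on the all-zeros sequence:

    # Strategy: always stake everything on the next bit being 0.
    def f(sigma):
        """Martingale value on the string sigma, starting from capital 1."""
        capital = 1.0
        for bit in sigma:
            capital = 2 * capital if bit == '0' else 0.0
        return capital

    # Fairness: f(sigma) = (f(sigma + '0') + f(sigma + '1')) / 2.
    for sigma in ['', '0', '01', '000']:
        assert f(sigma) == (f(sigma + '0') + f(sigma + '1')) / 2

    print([f('0' * n) for n in range(6)])   # 1, 2, 4, 8, 16, 32
    # f succeeds on 000..., so that sequence is not random in any of
    # the senses discussed below.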

For example, 𝑋 is computably random if no computable martingale succeeds on it, and polynomial-time random if no polynomial-time computable martingale succeeds on it. (For the purposes of defining these notions we can think of the martingale as rational-valued.) Schnorr (1971) showed that 𝑋 is ML-random iff no left-c.e. martingale succeeds on it, where a function 𝑓 ∶ 2<𝜔 → ℝ≥0 is left-c.e. if it is computably approximable from below, i.e., there is a computable function 𝑔 ∶ 2<𝜔 × ℕ → ℚ≥0 such that 𝑔(𝜎, 𝑛) ≤ 𝑔(𝜎, 𝑛 + 1) for all 𝜎 and 𝑛, and 𝑓(𝜎) = lim𝑛→∞ 𝑔(𝜎, 𝑛) for all 𝜎. One way to think of a left-c.e. martingale is that initially we might have no idea what to bet on some string 𝜎, but as we learn more about the universe, we might discover that 𝜎 seems more unlikely to be an initial segment of a random sequence, and are then prepared to bet more of our capital on it.

The coder’s approach builds on the idea that a random string should have no short descriptions. For example, in describing 010101… (1000 times) by the brief description “print 01 1000 times,” we are using regularities in this string to compress it. For a more complicated string, say the first 2000 bits of the binary expansion of 𝑒𝜋, the regularities may be harder to perceive, but are still there and can still lead to compression. A random string should have no such exploitable regularities (i.e., regularities that are not present in most strings), so the shortest way to describe it should be basically to write it out in full. This idea can be formalized using the well-known concept of Kolmogorov complexity. We can think of a Turing machine 𝑀 with inputs and outputs in 2<𝜔 as a description system. If 𝑀(𝜏) = 𝜎 then 𝜏 is a description of 𝜎 relative to this description system. The Kolmogorov complexity 𝐶𝑀(𝜎) of 𝜎 relative to 𝑀 is the length of the shortest 𝜏 such that 𝑀(𝜏) = 𝜎. We can then take a universal Turing machine 𝑈, which emulates any given Turing machine with at most a constant increase in the size of programs, and define the (plain) Kolmogorov complexity of 𝜎 as 𝐶(𝜎) = 𝐶𝑈(𝜎). The value of 𝐶(𝜎) depends on 𝑈, but only up to an additive constant independent of 𝜎. We think of a string as random if its Kolmogorov complexity is close to its length.
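The quantity 𝐶 is not computable, but any off-the-shelf compressor gives an upper bound on description length, which is enough to see the contrast just described. A rough Python sketch (zlib is merely a stand-in for a universal description system, not 𝐶 itself):

    import os
    import zlib

    def compressed_size(data: bytes) -> int:
        """Length of a zlib description of data: an upper bound analogue of C."""
        return len(zlib.compress(data, 9))

    regular = b'01' * 1000          # the "print 01 1000 times" example
    typical = os.urandom(2000)      # a typical (incompressible) string

    print(len(regular), compressed_size(regular))  # 2000 vs. a few dozen
    print(len(typical), compressed_size(typical))  # 2000 vs. about 2000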

For an infinite sequence 𝑋, a natural guess would be that 𝑋 should be considered random if every initial segment of 𝑋 is incompressible in this sense, i.e., if 𝐶(𝑋 ↾ 𝑛) ≥ 𝑛 − 𝑂(1). However, plain Kolmogorov complexity is not quite the right notion here, because the information in a description 𝜏 consists not only of the bits of 𝜏, but also its length, which can provide another log2 |𝜏| many bits of information. Indeed, Martin-Löf (see [18]) showed that it is not possible to have 𝐶(𝑋 ↾ 𝑛) ≥ 𝑛 − 𝑂(1): Given a long string 𝜌, we can write 𝜌 = 𝜎𝜏𝜈, where |𝜏| is the position of 𝜎 in the length-lexicographic ordering of 2<𝜔. Consider the Turing machine 𝑀 that, on input 𝜂, determines the |𝜂|th string 𝜉 in the length-lexicographic ordering of 2<𝜔 and outputs 𝜉𝜂. Then 𝑀(𝜏) = 𝜎𝜏. For any sequence 𝑋 and any 𝑘, this process allows us to compress some initial segment of 𝑋 by more than 𝑘 many bits.

There are several ways to get around this problem by modifying the definition of Kolmogorov complexity. The best-known one is to use prefix-free codes, that is, to restrict ourselves to machines 𝑀 such that if 𝑀(𝜏) is defined (i.e., if the machine eventually halts on input 𝜏) and 𝜇 is a proper extension of 𝜏, then 𝑀(𝜇) is not defined. There are universal prefix-free machines, and we can take such a machine 𝑈 and define the prefix-free Kolmogorov



complexity of 𝜎 as 𝐾(𝜎) = 𝐶𝑈(𝜎). The roots of this notion can be found in the work of Levin, Chaitin, and Schnorr, and in a certain sense—like the notion of Kolmogorov complexity more generally—even earlier in that of Solomonoff (see [9, 18]). As shown by Schnorr (see Chaitin (1975)), it is indeed the case that 𝑋 is Martin-Löf random if and only if 𝐾(𝑋 ↾ 𝑛) ≥ 𝑛 − 𝑂(1).
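The prefix-free (self-delimiting) condition is easy to meet, if wastefully: send |𝜎| in unary, then a 0, then 𝜎 itself. Here is a minimal Python sketch of such a code; no valid code word is a proper prefix of another, which is exactly the domain condition imposed on the machines above (real universal prefix-free machines use far more efficient headers, e.g., iterated length prefixes):

    def encode(sigma: str) -> str:
        """Self-delimiting code: unary length, separator 0, then the string."""
        return '1' * len(sigma) + '0' + sigma

    def decode(tau: str) -> str:
        n = tau.index('0')                  # read the unary length header
        body = tau[n + 1 : n + 1 + n]
        assert len(body) == n, "not a complete code word"
        return body

    for s in ['', '0', '10', '0110']:
        assert decode(encode(s)) == s
    print(encode('0110'))                   # 111100110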

There are other varieties of Kolmogorov complexity, but 𝐶 and 𝐾 are the main ones. For applications, it often does not matter which variety is used. The following surprising result establishes a fairly precise relationship between 𝐶 and 𝐾. Let 𝐶(1)(𝜎) = 𝐶(𝜎) and 𝐶(𝑛+1)(𝜎) = 𝐶(𝐶(𝑛)(𝜎)).

Theorem 4 (Solovay (1975)). 𝐾(𝜎) = 𝐶(𝜎) + 𝐶(2)(𝜎) ± 𝑂(𝐶(3)(𝜎)), and this result is tight in that we cannot extend it to 𝐶(4)(𝜎).

There is a vast body of research on Kolmogorov complexity and its applications. We will discuss some of these applications below; much more on the topic can be found in Li and Vitanyi [18].

Goals

There are several ways to explore the ideas introduced above. First, there are natural internal questions, such as: How do the various levels of algorithmic randomness interrelate? How do calibrations of randomness relate to the hierarchies of computability and complexity theory, and to relative computability? How should we calibrate partial randomness? Can a source of partial (algorithmic) randomness be amplified into a source that is fully random, or at least more random? The books Downey and Hirschfeldt [9] and Nies [24] cover material along these lines up to about 2010.

We can also consider applications. Mathematics has many theorems that involve “almost everywhere” behavior. Natural examples come from ergodic theory, analysis, geometric measure theory, and even combinatorics. Behavior that occurs almost everywhere should occur at sufficiently random points. Using notions from algorithmic randomness, we can explore exactly how much randomness is needed in a given case. For example, the set of reals at which an increasing function is not differentiable is null. How complicated is this null set, and hence, what level of algorithmic randomness is necessary for a real to avoid it (assuming the function is itself computable in some sense)? Is Martin-Löf randomness the right notion here?

We can also use the idea of assigning levels of randomness to individual objects to prove new theorems or give simpler proofs of known ones. Early examples of this method tended to use Kolmogorov complexity and what is called the “incompressibility method.” For instance, Chaitin (1971) (see also [17]) famously used Kolmogorov complexity to give a proof of a version of Gödel’s First Incompleteness Theorem, by showing that for any sufficiently strong, computably axiomatizable, consistent theory 𝑇, there is a number 𝑐 such that 𝑇 cannot prove that 𝐶(𝜎) > 𝑐 for any given string 𝜎 (which also follows by interpreting an earlier result of Barzdins; see [18, Section 2.7]). More recently, Kritchman and Raz [17] used these methods to give a proof of the Second Incompleteness Theorem as well.³ As we will see below, a more recent line of research has used notions of effective dimension based on partial randomness to give new proofs of classical theorems in ergodic theory and obtain new results in geometric measure theory.

Some Interactions with Computability

Halting probabilities. A first question we might ask is how to generate “natural” examples of algorithmically random reals. A classic example is Chaitin’s halting probability. Let 𝑈 be a universal prefix-free machine and let

Ω = ∑𝑈(𝜎)↓ 2−|𝜎|.

This number is the measure of the set of sequences 𝑋 such that 𝑈 halts on some initial segment of 𝑋, which we can interpret as the halting probability of 𝑈, and was shown by Chaitin (1975) to be ML-random (where, as mentioned above, we identify Ω with its binary expansion, thought of as an infinite binary sequence).
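Ω is a left-c.e. real: dovetailing 𝑈 on all programs and adding 2−|𝜎| each time a program 𝜎 is seen to halt gives an increasing sequence of rationals converging to it. The Python sketch below runs this scheme on a made-up prefix-free machine whose halting behavior we simply stipulate; for a genuine universal 𝑈 the same loop would approximate Ω𝑈 from below, though we could never tell how close we were:

    from fractions import Fraction

    # Made-up prefix-free domain: program -> stage at which it halts.
    HALTS_AT = {'0': 3, '10': 5, '110': 8}

    def omega_approximation(stages):
        approx = Fraction(0)
        for s in range(1, stages + 1):
            for prog, t in HALTS_AT.items():
                if t == s:                       # newly seen to halt
                    approx += Fraction(1, 2 ** len(prog))
                    print(f"stage {s}: {prog} halts; approximation {approx}")
        return approx

    omega_approximation(10)   # climbs to 1/2 + 1/4 + 1/8 = 7/8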

For any prefix-free machine 𝑀 in place of 𝑈 we can similarly define a halting probability. In some ways, halting probabilities are the analogs of computably enumerable sets in the theory of algorithmic randomness. Every halting probability 𝛼 is a left-c.e. real, meaning that there is a computable increasing sequence of rationals converging to it. Calude, Hertling, Khoussainov, and Wang (1998) showed that every left-c.e. real is the halting probability of some prefix-free machine.

We should perhaps write Ω𝑈 instead of Ω, to stress the dependence of its particular value on the choice of universal machine, but the fundamental properties of Ω do not depend on this choice, much as those of the halting problem do not depend on the specific choice of enumeration of Turing machines. In particular, Kucera and Slaman (2001) showed that every left-c.e. real is reducible to every Ω𝑈 up to a strong notion of reducibility known as Solovay reducibility, and hence all such Ω𝑈’s are equivalent modulo this notion. (The situation is analogous to that of versions of the halting problem, where the relevant notion is known as 1-reducibility.)

³Other recent work has explored the effect of adding axioms asserting the incompressibility of certain strings in a probabilistic way. Bienvenu, Romashchenko, Shen, Taveneaux, and Vermeeren [4] have shown that this kind of procedure does not help to prove new interesting theorems, but that the situation changes if we take into account the sizes of the proofs: randomly chosen axioms (in a sense made precise in their paper) can help to make proofs much shorter under the reasonable complexity-theoretic assumption that NP ≠ PSPACE.



Left-c.e. and right-c.e. reals (those of the form 1 − 𝛼 for a left-c.e. 𝛼) occur naturally in mathematics. Braverman and Yampolsky [7] showed that they arise in connection with Julia sets, and there is a striking example in symbolic dynamics: A 𝑑-dimensional subshift of finite type is a certain kind of collection of 𝐴-colorings of ℤ𝑑, where 𝐴 is a finite set, defined by local rules (basically saying that certain coloring patterns are illegal) invariant under the shift action

(𝑆𝑔𝑥)(ℎ) = 𝑥(ℎ + 𝑔) for 𝑔, ℎ ∈ ℤ𝑑 and 𝑥 ∈ 𝐴ℤ𝑑.

Its (topological) entropy is an important invariant measuring the asymptotic growth in the number of legal colorings of finite regions. It has been known for some time that entropies of subshifts of finite type for dimensions 𝑑 ≥ 2 are in general not computable, but the following result gives a precise characterization.

Theorem 5 (Hochman and Meyerovitch [15]). The values of entropies of subshifts of finite type over ℤ𝑑 for 𝑑 ≥ 2 are exactly the nonnegative right-c.e. reals.

Algorithmic randomness and relative computability. Solovay reducibility is stronger than Turing reducibility, so Ω can compute the halting problem ∅′. Indeed, Ω and ∅′ are Turing equivalent, and in fact Ω can be seen as a “highly compressed” version of ∅′. Other computability-theoretically powerful ML-random sequences can be obtained from the following remarkable result.

Theorem 6 (Gacs (1986), Kucera (1985)). For every 𝑋 there is an ML-random 𝑌 such that 𝑋 ≤T 𝑌.

This theorem and the Turing equivalence of Ω with ∅′ do not seem to accord with our intuition that random sets should have low “useful information.” This phenomenon can be explained by results showing that, for certain purposes, the benchmark set by ML-randomness is too low. A set 𝐴 has PA degree if it can compute a {0, 1}-valued function 𝑓 with 𝑓(𝑛) ≠ Φ𝑛(𝑛) for all 𝑛. (The reason for the name is that this property is equivalent to being able to compute a completion of Peano Arithmetic.) Such a function can be seen as a weak version of the halting problem, but while ∅′ has PA degree, there are sets of PA degree that are low, in the sense of the section on “Some basic computability theory,” and hence are far less powerful than ∅′.

Theorem 7 (Stephan (2006)). If an ML-random sequence has PA degree then it computes ∅′.

Thus there are two kinds of ML-random sequences: ones that are complicated enough to somehow “simulate” randomness, and “truly random” ones that are much weaker. It is known that the class of sequences that can compute ∅′ has measure 0, so almost all ML-random sequences are in the second class. One way to ensure that a sequence is in that class is to increase the complexity of our tests by relativizing them to noncomputable oracles. It turns out that iterates of the Turing jump are particularly natural oracles to use. Let ∅(0) = ∅ and ∅(𝑛+1) = (∅(𝑛))′. We say that 𝑋 is 𝑛-random if it passes all Martin-Löf tests relativized to ∅(𝑛−1). Thus the 1-random sequences are just the ML-random ones, while the 2-random ones are the ones that are ML-random relative to the halting problem. These sequences have low computational power in several ways. For instance, they cannot compute any noncomputable c.e. set, and in fact the following holds.

Theorem 8 (Kurtz (1981)). If 𝑋 is 2-random and 𝑌 is computable relative both to ∅′ and to 𝑋, then 𝑌 is computable.

A precise relationship between tests and the dichotomy mentioned above was established by Franklin and Ng [11].

In general, among ML-random sequences, computational power (or “useful information”) is inversely proportional to the level of randomness. The following is one of many results attesting to this heuristic.

Theorem 9 (Miller and Yu (2008)). Let 𝑋 ≤T 𝑌. If 𝑋 is ML-random and 𝑌 is 𝑛-random, then 𝑋 is also 𝑛-random.

There are many other interesting levels of algorithmic randomness. Schnorr (1971) argued that his martingale characterization of ML-randomness shows that this is an intrinsically computably enumerable rather than computable notion, and defined a notion now called Schnorr randomness, which is like the notion of computable randomness mentioned below Definition 3 but with an extra effectiveness condition on the rate of success of martingales. He also showed that 𝑋 is Schnorr random iff it passes all Martin-Löf tests 𝑇0, 𝑇1, … such that the measures 𝜆(𝑇𝑛) are uniformly computable (i.e., the function 𝑛 ↦ 𝜆(𝑇𝑛) is computable in the sense of the section on “Analysis and ergodic theory” below). It follows immediately from their definitions in terms of martingales that ML-randomness implies computable randomness, which in turn implies Schnorr randomness. It is more difficult to prove that none of these implications can be reversed. In fact, these levels of randomness are close enough that they agree for sets that are somewhat close to computable, as shown by the following result, where highness is as defined in the section on “Some basic computability theory.”

Theorem 10 (Nies, Stephan, and Terwijn (2005)). Every high Turing degree contains a set that is computably random but not ML-random and a set that is Schnorr random but not computably random. This fact is tight, however, because every nonhigh Schnorr random set is ML-random.

As we will discuss, various notions of algorithmic randomness arise naturally in applications.



Randomness-theoretic weakness. As mentioned above, 𝑋 is ML-random iff 𝐾(𝑋 ↾ 𝑛) ≥ 𝑛 − 𝑂(1), i.e., 𝑋’s initial segments have very high complexity. There are similar characterizations of other notions of algorithmic randomness, as well as of notions arising in other parts of computability theory, in terms of high initial segment complexity. For instance, Downey and Griffiths (2004) showed that 𝑋 is Schnorr random iff 𝐶𝑀(𝑋 ↾ 𝑛) ≥ 𝑛 − 𝑂(1) for every prefix-free machine 𝑀 with computable halting probability, while Kjos-Hanssen, Merkle, and Stephan (2006) showed that 𝑋 can compute a diagonally noncomputable function, that is, a function ℎ with ℎ(𝑒) ≠ Φ𝑒(𝑒) for all 𝑒, iff there is an 𝑋-computable function 𝑓 such that 𝐶(𝑋 ↾ 𝑓(𝑛)) ≥ 𝑛 for all 𝑛. But what if the initial segments of a sequence have low complexity? Such sequences have played an important role in the theory of algorithmic randomness, beginning with the following information-theoretic characterization of computability.

Theorem 11 (Chaitin (1976)). 𝐶(𝑋 ↾ 𝑛) ≤ 𝐶(𝑛) + 𝑂(1) iff 𝑋 is computable.

It is also true that if 𝑋 is computable then 𝐾(𝑋 ↾ 𝑛) ≤ 𝐾(𝑛) + 𝑂(1). Chaitin (1977) considered sequences with this property, which are now called 𝐾-trivial. He showed that every 𝐾-trivial sequence is ∅′-computable, and asked whether they are all in fact computable. Solovay (1975) answered this question by constructing a noncomputable 𝐾-trivial sequence.

The class of 𝐾-trivials has several remarkable properties. It is a naturally definable countable class, contained in the class of low sets (as defined in the section on “Some basic computability theory,” where we identify a set with its characteristic function, thought of as a sequence), but with stronger closure properties. (In technical terms, it is what is known as a Turing ideal.) Post’s problem asked whether there are computably enumerable sets that are neither computable nor Turing equivalent to the halting problem. Its solution in the 1950s by Friedberg and Muchnik introduced the somewhat complex priority method, which has played a central technical role in computability theory since then. Downey, Hirschfeldt, Nies, and Stephan (2003) showed that 𝐾-triviality can be used to give a simple priority-free solution to Post’s problem.

Most significantly, there are many natural notions of randomness-theoretic weakness that turn out to be equivalent to 𝐾-triviality.

Theorem 12 (Nies (2005), Nies and Hirschfeldt for (1) → (3)). The following are equivalent.

1. 𝐴 is 𝐾-trivial.
2. 𝐴 is computable relative to some c.e. 𝐾-trivial set.
3. 𝐴 is low for 𝐾, meaning that 𝐴 has no compression power as an oracle, i.e., that 𝐾𝐴(𝜎) ≥ 𝐾(𝜎) − 𝑂(1), where 𝐾𝐴 is the relativization of prefix-free Kolmogorov complexity to 𝐴.
4. 𝐴 is low for ML-randomness, meaning that 𝐴 does not have any derandomization power as an oracle, i.e., any ML-random set remains ML-random when this notion is relativized to 𝐴.

There are now more than a dozen other characterizations of 𝐾-triviality. Some appear in [9, 24], and several others have emerged more recently. These have been used to solve several problems in algorithmic randomness and related areas. Lowness classes have also been found for other randomness notions. For Schnorr randomness, for instance, lowness can be characterized using notions of traceability related to concepts in set theory, as first explored by Terwijn and Zambella (2001).

Some Applications

Incompressibility and information content. This article focuses on algorithmic randomness for infinite objects, but we should mention that there have been many applications of Kolmogorov complexity under the collective title of the incompressibility method, based on the observation that algorithmically random strings should exhibit typical behavior for computable processes. For example, this method can be used to give average running times for sorting, by showing that if the outcome is not what we would expect then we can compress a random input. See Li and Vitanyi [18, Chapter 6] for applications of this technique to areas as diverse as combinatorics, formal languages, compact routing, and circuit complexity, among others. Many results originally proved using Shannon entropy or related methods also have proofs using Kolmogorov complexity. For example, Messner and Thierauf [22] gave a constructive proof of the Lovasz Local Lemma using Kolmogorov complexity.

Other applications come from the observation that in some sense Kolmogorov complexity provides an “absolute” measure of the intrinsic complexity of a string. We can define a notion of conditional Kolmogorov complexity 𝐶(𝜎 ∣ 𝜏) of a string 𝜎 given another string 𝜏. Then, for example, 𝐶(𝜎 ∣ 𝜎) = 𝑂(1), and 𝜎 is “independent of 𝜏” if 𝐶(𝜎 ∣ 𝜏) = 𝐶(𝜎) ± 𝑂(1). Researchers comparing two sequences 𝜎, 𝜏 representing, say, two DNA sequences, or two phylogenetic trees, or two languages, or two pieces of music, have invented many distance metrics, such as the maximum parsimony distance on phylogenetic trees, but it is also natural to use a content-neutral measure of “information distance” like max{𝐶(𝜎 ∣ 𝜏), 𝐶(𝜏 ∣ 𝜎)}. There have been some attempts to make this work in practice for solving classification problems, though results have so far been mixed. Of course, 𝐶 is not computable, but it can be replaced in applications by measures derived from practical compression algorithms. See [18, Sections 8.3 and 8.4].
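One concrete, widely used version of this idea is the normalized compression distance of Cilibrasi and Vitanyi (compare [18, Sections 8.3 and 8.4]), in which a practical compressor stands in for 𝐶. A rough Python sketch using zlib (the example strings are our own):

    import zlib

    def C(data: bytes) -> int:
        """Compressed size: a computable stand-in for Kolmogorov complexity."""
        return len(zlib.compress(data, 9))

    def ncd(x: bytes, y: bytes) -> float:
        """Normalized compression distance between x and y."""
        cx, cy, cxy = C(x), C(y), C(x + y)
        return (cxy - min(cx, cy)) / max(cx, cy)

    a = b'the quick brown fox jumps over the lazy dog ' * 20
    b = b'the quick brown fox leaps over the lazy cat ' * 20
    c = bytes(range(256)) * 4

    print(ncd(a, b))   # small: the texts share most of their structure
    print(ncd(a, c))   # larger: little shared structure to exploit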



Effective dimensions. If 𝑋 = 𝑥₀𝑥₁… is random, then we might expect a sequence such as 𝑥₀00𝑥₁00𝑥₂00… to be “1/3-random.” Making precise sense of the idea of partial algorithmic randomness has led to significant applications. Hausdorff used work of Caratheodory on 𝑠-dimensional measures to generalize the notion of dimension to possibly nonintegral values, leading to concepts such as Hausdorff dimension and packing dimension. Much like algorithmic randomness can make sense of the idea of individual reals being random, notions of partial algorithmic randomness can be used to assign dimensions to individual reals.

The measure-theoretic approach, in which we for instance replace the uniform measure 𝜆 on 2𝜔 by a generalized notion assigning the value 2−𝑠|𝜎| to [𝜎] (where 0 < 𝑠 ≤ 1), was translated by Lutz (2000, 2003) into a notion of 𝑠-gale, where the fairness condition of a martingale is replaced by 𝑓(𝜎) = 2−𝑠(𝑓(𝜎0) + 𝑓(𝜎1)). We can view 𝑠-gales as modeling betting in a hostile environment (an idea due to Lutz), where “inflation” is acting so that not winning means that we automatically lose money. Roughly speaking, the effective fractal dimension of a sequence is then determined by the most hostile environment in which we can still make money betting on this sequence.

Mayordomo (2002) and Athreya, Hitchcock, Lutz, and Mayordomo (2007) found equivalent formulations in terms of Kolmogorov complexity, which we take as definitions. (Here it does not matter whether we use plain or prefix-free Kolmogorov complexity.)

Definition 13. Let 𝑋 ∈ 2𝜔. The effective Hausdorff dimension of 𝑋 is

dim(𝑋) = lim inf𝑛→∞ 𝐾(𝑋 ↾ 𝑛)/𝑛.

The effective packing dimension of 𝑋 is

Dim(𝑋) = lim sup𝑛→∞ 𝐾(𝑋 ↾ 𝑛)/𝑛.

It is not hard to extend these definitions to elements of ℝ𝑛, yielding effective dimensions between 0 and 𝑛. They can also be relativized to any oracle 𝐴 to obtain the effective Hausdorff and packing dimensions dim𝐴(𝑋) and Dim𝐴(𝑋) of 𝑋 relative to 𝐴.
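Definition 13 suggests a crude numerical experiment: replace 𝐾 by a practical compressor and watch the ratio of compressed size to length along prefixes. This Python sketch (illustrative only; compressor overhead makes the constants rough) dilutes a random string in the manner of the 𝑥₀00𝑥₁00𝑥₂00… example above:

    import os
    import zlib

    def ratio(bits: str) -> float:
        """Compressed bits over raw bits: a rough stand-in for K(X|n)/n."""
        return len(zlib.compress(bits.encode(), 9)) * 8 / len(bits)

    random_bits = ''.join(format(b, '08b') for b in os.urandom(1500))
    diluted = ''.join(bit + '00' for bit in random_bits)

    print(ratio(random_bits))  # near 1 bit per bit, up to overhead
    print(ratio(diluted))      # noticeably smaller (the ideal value is 1/3)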

It is of course not immediately obvious why these notions are effectivizations of Hausdorff and packing dimension, but crucial evidence of their correctness is provided by point to set principles, which allow us to express the dimensions of sets of reals in terms of the effective dimensions of their elements. The most recent and powerful of these is the following, where we denote the classical Hausdorff dimension of 𝐸 ⊆ ℝ𝑛 by dimH(𝐸), and its classical packing dimension by dimp(𝐸).

Theorem 14 (Lutz and Lutz [19]).

dimH(𝐸) = min𝐴⊆ℕ sup𝑋∈𝐸 dim𝐴(𝑋).

dimp(𝐸) = min𝐴⊆ℕ sup𝑋∈𝐸 Dim𝐴(𝑋).

For certain well-behaved sets 𝐸, relativization is actually not needed, and the classical dimension of 𝐸 is the supremum of the effective dimensions of its points. In the general case, it is of course not immediately clear that the minima mentioned in Theorem 14 should exist, but they do. Thus, for example, to prove a lower bound of 𝛼 for dimH(𝐸) it suffices to prove that, for each 𝜖 > 0 and each 𝐴, the set 𝐸 contains a point 𝑋 with dim𝐴(𝑋) > 𝛼 − 𝜖. In several applications, this argument turns out to be easier than ones directly involving classical dimension. This fact is somewhat surprising given the need to relativize to arbitrary oracles, but in practice this issue has so far turned out not to be an obstacle.

For example, Lutz and Stull [21] obtained a new lower bound on the Hausdorff dimension of generalized sets of Furstenberg type; Lutz [20] showed that a fundamental intersection formula, due in the Borel case to Kahane and Mattila, is true for arbitrary sets; and Lutz and Lutz [19] gave a new proof of the two-dimensional case (originally proved by Davies) of the well-known Kakeya conjecture, which states that, for all 𝑛 ≥ 2, if a subset of ℝ𝑛 has lines of length 1 in all directions, then it has Hausdorff dimension 𝑛.

There had been earlier applications of effective dimension, for instance in symbolic dynamics, whose iterative processes are naturally algorithmic. For example, Simpson [25] generalized a result of Furstenberg as follows. Let 𝐴 be finite and 𝐺 be either ℕ𝑑 or ℤ𝑑. A closed set 𝑋 ⊆ 𝐴𝐺 is a subshift if it is closed under the shift action of 𝐺 on 𝐴𝐺 (see the section on “Halting probabilities”).

Theorem 15 (Simpson [25]). Let 𝐴 be finite and 𝐺 be either ℕ𝑑 or ℤ𝑑. If 𝑋 ⊆ 𝐴𝐺 is a subshift then the topological entropy of 𝑋 is equal both to its classical Hausdorff dimension and to the supremum of the effective Hausdorff dimensions of its elements.

In currently unpublished work, Day has used effective methods to give a new proof of the Kolmogorov-Sinai theorem on entropies of Bernoulli shifts.

There are other applications of sequences of high effective dimension, for instance ones involving the interesting class of shift complex sequences. While initial segments of ML-random sequences have high Kolmogorov complexity, not all segments of such sequences do. Random sequences must contain arbitrarily long strings of consecutive 0’s, for example. However, for any 𝜖 > 0 there are 𝜖-shift complex



sequences 𝑌 such that for any string 𝜎 of consecutive bits of 𝑌, we have 𝐾(𝜎) ≥ (1 − 𝜖)|𝜎| − 𝑂(1). These sequences can be used to create tilings with properties such as certain kinds of pattern-avoidance, and have found uses in symbolic dynamics. See for instance Durand, Levin, and Shen (2008) and Durand, Romashchenko, and Shen [10].

Randomness amplification. Many practical algorithms use random seeds. For example, the important Polynomial Identity Testing (PIT) problem takes as input a polynomial 𝑃(𝑥1, … , 𝑥𝑛) with coefficients from a large finite field and determines whether it is identically 0. Many practical problems can be solved using a reduction to this problem. There is a natural fast algorithm to solve it randomly: Take a random sequence of values for the variables. If the polynomial is not 0 on these values, “no” is the correct answer. Otherwise, the probability that the answer is “yes” is very high. It is conjectured that PIT has a polynomial-time deterministic algorithm,⁴ but no such algorithm is known.
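Here is a hedged Python sketch of that randomized algorithm (our toy instance; the error analysis is the Schwartz-Zippel lemma: a nonzero polynomial of total degree 𝑑 vanishes at a uniformly random point of a field of size 𝑞 with probability at most 𝑑/𝑞):

    import random

    P = 2**31 - 1        # a large prime, so arithmetic is over a field

    def poly(x, y):
        """Black-box polynomial; this one is secretly identically 0."""
        return ((x + y) ** 2 - x ** 2 - 2 * x * y - y ** 2) % P

    def probably_identically_zero(f, nvars, trials=20):
        for _ in range(trials):
            point = [random.randrange(P) for _ in range(nvars)]
            if f(*point) != 0:
                return False   # a nonzero value is conclusive: f is not 0
        return True            # "yes" is correct with very high probability

    print(probably_identically_zero(poly, 2))   # True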

Thus it is important to have good sources of randomness. Some (including Turing) have believed that randomness can be obtained from physical sources, and there are now commercial devices claiming to do so. At a more theoretical level, we might ask questions such as:

1. Can a weak source of randomness always be amplified into a better one?
2. Can we in fact always recover full randomness from partial randomness?
3. Are random sources truly useful as computational resources?

In our context, we can consider precise versions of such questions by taking randomness to mean algorithmic randomness, and taking all reduction processes to be computable ones. One way to interpret the first two questions then is to think of partial randomness as having nonzero effective dimension. For example, for packing dimension, we have the following negative results.

Theorem 16 (Downey and Greenberg (2008)). There is an 𝑋 such that Dim(𝑋) = 1 and 𝑋 computes no ML-random sequence. (This 𝑋 can be built to be of minimal degree, which means that every 𝑋-computable set is either computable or has the same Turing degree as 𝑋. It is known that such an 𝑋 cannot compute an ML-random sequence.)

Theorem 17 (Conidis [8]). There is an 𝑋 such that Dim(𝑋) > 0 and 𝑋 computes no 𝑌 with Dim(𝑌) = 1.

On the other hand, we also have the following strong positive result.

Theorem 18 (Fortnow, Hitchcock, Pavan, Vinodchandran, and Wang (2006)). If 𝜖 > 0 and Dim(𝑋) > 0 then there is an 𝑋-computable 𝑌 such that Dim(𝑌) > 1 − 𝜖. (In fact, 𝑌 can be taken to be equivalent to 𝑋 via polynomial-time reductions.)

⁴This conjecture comes from the fact that PIT belongs to a complexity class known as BPP, which is widely believed to equal the complexity class P of polynomial-time solvable problems, since Impagliazzo and Wigderson showed in the late 1990s that if the well-known Satisfiability problem is as hard as generally believed, then indeed BPP = P.

For effective Hausdorff dimension, the situation is quite different. Typically, the way we obtain an 𝑋 with dim(𝑋) = 1/2, say, is to start with an ML-random sequence and somehow “mess it up,” for example by making every other bit a 0. This kind of process is reversible, in the sense that it is easy to obtain an 𝑋-computable ML-random sequence. However, Miller [23] showed that it is possible to obtain sequences of fractional effective Hausdorff dimension that permit no randomness amplification at all.

Theorem 19 (Miller [23]). There is an 𝑋 such that dim(𝑋) = 1/2 and if 𝑌 ≤T 𝑋 then dim(𝑌) ≤ 1/2.

That is, effective Hausdorff dimension cannot in general be amplified. (In this theorem, the specific value 1/2 is only an example.) Greenberg and Miller [13] also showed that there is an 𝑋 such that dim(𝑋) = 1 and 𝑋 does not compute any ML-random sequences. Interestingly, Zimand (2010) showed that for two sequences 𝑋 and 𝑌 of nonzero effective Hausdorff dimension that are in a certain technical sense sufficiently independent, 𝑋 and 𝑌 together can compute a sequence of effective Hausdorff dimension 1.

In some attractive recent work, it has been shown that there is a sense in which the intuition that every sequence of effective Hausdorff dimension 1 is close to an ML-random sequence is correct. The following is a simplified version of the full statement, which quantifies how much randomness can be extracted at the cost of altering a sequence on a set of density 0. Here 𝐴 ⊆ ℕ has (asymptotic) density 0 if lim𝑛→∞ |𝐴 ↾ 𝑛|/𝑛 = 0.

Theorem 20 (Greenberg, Miller, Shen, and Westrick [14]). If dim(𝑋) = 1 then there is an ML-random 𝑌 such that {𝑛 ∶ 𝑋(𝑛) ≠ 𝑌(𝑛)} has density 0.

The third question above is whether sources of randomness can be useful oracles. Here we are thinking in terms of complexity rather than just computability, so results such as Theorem 6 are not directly relevant. Allender and others have initiated a program to investigate the speedups that are possible when random sources are queried efficiently. Let 𝑅 be the set of all random finite binary strings for either plain or prefix-free Kolmogorov complexity (e.g., 𝑅 = {𝑥 ∶ 𝐶(𝑥) ≥ |𝑥|}). For a complexity class 𝒞, let 𝒞𝑅 denote the relativization of this class to 𝑅. So, for instance, for the class P of polynomial-time computable functions, P𝑅 is the class of functions that can be computed in polynomial time with 𝑅 as an oracle. (For references to the articles in this and the following theorem, see [1].)



Theorem 21 (Buhrman, Fortnow, Koucky, and Loff (2010); Allender, Buhrman, Koucky, van Melkebeek, and Ronneburger (2006); Allender, Buhrman, and Koucky (2006)).

1. PSPACE ⊆ P𝑅.
2. NEXP ⊆ NP𝑅.
3. BPP ⊆ P𝑅tt (where the latter is the class of functions that are reducible to 𝑅 in polynomial time via truth-table reductions, a more restrictive notion of reduction than Turing reduction).

The choice of universal machine does have some effect on efficient computations, but we can quantify over all universal machines. In the result below, 𝑈 ranges over universal prefix-free machines, and 𝑅𝐾𝑈 is the set of random strings relative to Kolmogorov complexity defined using 𝑈.

Theorem 22 (Allender, Friedman, and Gasarch (2013); Cai, Downey, Epstein, Lempp, and Miller (2014)).

1. ⋂𝑈 P𝑅𝐾𝑈tt ⊆ PSPACE.
2. ⋂𝑈 NP𝑅𝐾𝑈 ⊆ EXPSPACE.

We can also say that sufficiently random oracles will always accelerate some computations in the following sense. Say that 𝑋 is low for speed if for any computable set 𝐴 and any function 𝑡 such that 𝐴 can be computed in time 𝑡(𝑛) using 𝑋 as an oracle, there is a polynomial 𝑝 such that 𝐴 can be computed (with no oracle) in time bounded by 𝑝(𝑡(𝑛)). That is, 𝑋 does not significantly accelerate any computation of a computable set. Bayer and Slaman (see [3]) constructed noncomputable sets that are low for speed, but these cannot be very random.

Theorem 23 (Bienvenu and Downey [3]). If 𝑋 is Schnorr random, then it is not low for speed, and this fact is witnessed by an exponential-time computable set 𝐴.

Analysis and ergodic theory. Computable analysis is an area that has developed tools for thinking about computability of objects like real-valued functions by taking advantage of separability. Say that a sequence of rationals 𝑞0, 𝑞1, … converges fast to 𝑥 if |𝑥 − 𝑞𝑛| ≤ 2−𝑛 for all 𝑛. A function 𝑓 ∶ ℝ → ℝ is (Type 2) computable if there is an algorithm Φ that, for every 𝑥 ∈ ℝ and every sequence 𝑞0, 𝑞1, … that converges fast to 𝑥, if Φ is given 𝑞0, 𝑞1, … as an oracle, then it can compute a sequence that converges fast to 𝑓(𝑥). We can extend this definition to similar separable spaces. We can also relativize it, and it is then not difficult to see that a function is continuous iff it is computable relative to some oracle, basically because to define a continuous function we need only to specify its action on a countable collection of balls.
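To see Type 2 computability in action, here is a hedged Python sketch (our example, with 𝑓(𝑥) = 𝑥²): given an oracle producing rationals that converge fast to 𝑥, it emits rationals converging fast to 𝑥², using the bound |𝑥² − 𝑞²| ≤ |𝑥 − 𝑞| (2|𝑞| + 1) when |𝑥 − 𝑞| ≤ 2−𝑚 to decide how much precision to request:

    import math
    from fractions import Fraction

    def square(oracle):
        """Fast-converging name of x  ->  fast-converging name of x^2."""
        def approx(n):
            m = n + 2
            while True:
                q = oracle(m)
                # |x^2 - q^2| <= 2^-m * (2|q| + 1); good enough once <= 2^-n
                if (2 * abs(q) + 1) / Fraction(2 ** m) <= Fraction(1, 2 ** n):
                    return q * q
                m += 1
        return approx

    def sqrt2(n):
        """Oracle: a fast-converging name of sqrt(2), via integer sqrt."""
        return Fraction(math.isqrt(2 * 4 ** n), 2 ** n)

    two = square(sqrt2)
    print(float(two(10)), float(two(20)))   # within 2^-10 and 2^-20 of 2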

Mathematics is replete with results concerning almost everywhere behavior, and algorithmic randomness allows us to turn such results into “quantitative” ones like the following.

Theorem 24 (Brattka, Miller, and Nies [5], also Demuth (1975, see [5]) for (2)).

1. The reals at which every computable increasing function ℝ → ℝ is differentiable are exactly the computably random ones.
2. The reals at which every computable function ℝ → ℝ of bounded variation is differentiable are exactly the ML-random ones.

Ergodic theory is another area that has been studied from this point of view. A measure-preserving transformation 𝑇 on a probability space is ergodic if all measurable subsets 𝐸 such that 𝑇−1(𝐸) = 𝐸 have measure 1 or 0. Notice that this is an “almost everywhere” definition. We can make this setting computable (and many systems arising from physics will be computable). One way to proceed is to work in Cantor space without loss of generality, since Hoyrup and Rojas [16] showed that any computable metric space with a computable probability measure is isomorphic to this space in an effective measure-theoretic sense. Then we can specify a computable transformation 𝑇 as a computable limit of computable partial maps 𝑇𝑛 ∶ 2<𝜔 → 2<𝜔 with certain coherence conditions. We can also transfer definitions like that of ML-randomness to computable probability spaces other than Cantor space.

The following is an illustrative result. A classic theorem of Poincare is that if 𝑇 is measure-preserving, then for all 𝐸 of positive measure and almost all 𝑥, we have 𝑇𝑛(𝑥) ∈ 𝐸 for infinitely many 𝑛. For a class 𝒞 of measurable subsets, 𝑥 is a Poincare point for 𝑇 with respect to 𝒞 if for every 𝐸 ∈ 𝒞 of positive measure, 𝑇𝑛(𝑥) ∈ 𝐸 for infinitely many 𝑛. An effectively closed set is one whose complement can be specified as a computably enumerable union of basic open sets.

Theorem 25 (Bienvenu, Day, Mezhirov, and Shen [2]). Let 𝑇 be a computable ergodic transformation on a computable probability space. Every ML-random element of this space is a Poincare point for the class of effectively closed sets.

In general, the condition that the element be ML-random is not just sufficient but necessary, even in a simple case like the shift operator on Cantor space.

The non-ergodic case has also been analyzed, by Franklin and Towsner [12], who also studied the Birkhoff ergodic theorem. In these and several other cases, similar correspondences with various notions of algorithmic randomness have been found. While many theorems of ergodic theory have been analyzed in this way, including the Birkhoff, Poincare, and von Neumann ergodic theorems, some, like Furstenberg’s ergodic theorem, have yet to be understood from this point of view.

Regarding the physical interpretation of some of the work in this area, Braverman, Grigo, and Rojas [6] have



obtained results that they argue show that, while random noise makes predicting the short-term behavior of a system difficult, it may in fact allow prediction to be easier in the long term.

Normality revisited. Borel’s notion of normality, with which we began our discussion, is a very weak kind of randomness. Polynomial-time randomness implies absolute normality, and Schnorr and Stimm (1971/72) showed that a sequence is normal to a given base iff it satisfies a martingale-based notion of randomness defined using certain finite state automata, a much weaker model of computation than Turing machines. Building examples of absolutely normal numbers is another matter, as Borel already noted. While it is conjectured that 𝑒, 𝜋, and all irrational algebraic numbers such as √2 are absolutely normal, none of these have been proved to be normal to any base. In his unpublished manuscript “A note on normal numbers,” believed to have been written in 1938, Turing built a computable absolutely normal real, which is in a sense the closest we have come so far to obtaining an explicitly-described absolutely normal real. (His construction was not published until his Collected Works in 1992, and there was uncertainty as to its correctness until Becher, Figueira, and Picchi (2007) reconstructed and completed it, correcting minor errors.⁵) There is a sense in which Turing anticipated Martin-Löf’s idea of looking at a large collection of effective tests, in this case ones sufficiently strong to ensure that a real is normal for all bases, but sufficiently weak to allow some computable sequence to pass them all. He took advantage of the correlations between blocks of digits in expansions of the same real in different bases.

Turing's approach can also be thought of in terms of effective martingales, and its point of view has brought about a great deal of progress in our understanding of normality recently. For instance, Becher, Heiber, and Slaman (2013) showed that absolutely normal numbers can be constructed in low-level polynomial time, and Lutz and Mayordomo (arXiv:1611.05911) constructed them in "nearly linear" time. Much of the work along these lines has been number-theoretic, connected to various notions of well-approximability of irrational reals, such as that of a Liouville number, which is an irrational $\alpha$ such that for every natural number $n > 1$, there are $p, q \in \mathbb{N}$ for which $|\alpha - \frac{p}{q}| < q^{-n}$. For example, Becher, Heiber, and Slaman (2015) have constructed computable absolutely normal Liouville numbers. This work has also produced results in the classical theory of normal numbers, for instance by Becher, Bugeaud, and Slaman (2016).
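To see the definition in action, the classical Liouville constant $\alpha = \sum_{k \ge 1} 10^{-k!}$ is the standard example: truncating the sum after $n$ terms gives a rational $p/q$ with $q = 10^{n!}$, and the remaining tail is smaller than $q^{-n}$. The exact-arithmetic check below is our own illustration, not from the works cited above.

```python
from fractions import Fraction
from math import factorial

# Liouville's constant alpha = sum_{k>=1} 10^(-k!). Truncating after n terms
# gives p/q with q = 10^(n!), and the tail is smaller than q^(-n), which
# witnesses the defining inequality |alpha - p/q| < q^(-n).
# (We truncate the "infinite" sum at 8 terms; that only makes the
# left-hand side smaller, so the checks below remain valid.)

def partial_sum(terms: int) -> Fraction:
    return sum(Fraction(1, 10 ** factorial(k)) for k in range(1, terms + 1))

alpha = partial_sum(8)  # stand-in for the full sum

for n in range(2, 6):
    q = 10 ** factorial(n)
    approx = partial_sum(n)  # this is p/q with q = 10^(n!)
    assert abs(alpha - approx) < Fraction(1, q ** n)
    print(f"n = {n}: |alpha - p/q| < q^-{n} holds")
```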

⁵See https://www-2.dc.uba.ar/staff/becher/publications.html for references to the papers cited here and below.

References

[1] Allender E. The complexity of complexity. Computability and complexity: Springer, Cham; 2017:79–94. MR3629714
[2] Bienvenu L, Day A, Mezhirov I, Shen A. Ergodic-type characterizations of algorithmic randomness. Programs, proofs, processes: Springer, Berlin; 2010:49–58. https://doi.org/10.1007/978-3-642-13962-8_6 MR2678114
[3] Bienvenu L, Downey R. On low for speed oracles. 35th Symposium on Theoretical Aspects of Computer Science: Schloss Dagstuhl. Leibniz-Zent. Inform., Wadern; 2018:15:1–15:13. MR3779296
[4] Bienvenu L, Romashchenko A, Shen A, Taveneaux A, Vermeeren S. The axiomatic power of Kolmogorov complexity, Ann. Pure Appl. Logic, no. 9 (165):1380–1402, 2014, DOI 10.1016/j.apal.2014.04.009. MR3210074
[5] Brattka V, Miller JS, Nies A. Randomness and differentiability, Trans. Amer. Math. Soc., no. 1 (368):581–605, 2016. https://doi.org/10.1090/tran/6484. MR3413875
[6] Braverman M, Grigo A, Rojas C. Noise vs computational intractability in dynamics. Proceedings of the 3rd Innovations in Theoretical Computer Science Conference: ACM, New York; 2012:128–141. MR3388383
[7] Braverman M, Yampolsky M. Computability of Julia sets, Mosc. Math. J., no. 2 (8):185–231, 399, 2008, DOI 10.17323/1609-4514-2008-8-2-185-231 (English, with English and Russian summaries). MR2462435
[8] Conidis CJ. A real of strictly positive effective packing dimension that does not compute a real of effective packing dimension one, J. Symbolic Logic, no. 2 (77):447–474, 2012, DOI 10.2178/jsl/1333566632. MR2963016
[9] Downey RG, Hirschfeldt DR. Algorithmic randomness and complexity, Theory and Applications of Computability, Springer, New York, 2010. MR2732288
[10] Durand B, Romashchenko A, Shen A. Fixed-point tile sets and their applications, J. Comput. System Sci., no. 3 (78):731–764, 2012, DOI 10.1016/j.jcss.2011.11.001. MR2900032
[11] Franklin JNY, Ng KM. Difference randomness, Proc. Amer. Math. Soc., no. 1 (139):345–360, 2011, DOI 10.1090/S0002-9939-2010-10513-0. MR2729096
[12] Franklin JNY, Towsner H. Randomness and non-ergodic systems, Mosc. Math. J., no. 4 (14):711–744, 827, 2014, DOI 10.17323/1609-4514-2014-14-4-711-744 (English, with English and Russian summaries). MR3292047
[13] Greenberg N, Miller JS. Diagonally non-recursive functions and effective Hausdorff dimension, Bull. Lond. Math. Soc., no. 4 (43):636–654, 2011. https://doi.org/10.1112/blms/bdr003. MR2820150
[14] Greenberg N, Miller JS, Shen A, Westrick LB. Dimension 1 sequences are close to randoms, Theoret. Comput. Sci. (705):99–112, 2018, DOI 10.1016/j.tcs.2017.09.031. MR3721461
[15] Hochman M, Meyerovitch T. A characterization of the entropies of multidimensional shifts of finite type, Ann. of Math. (2), no. 3 (171):2011–2038, 2010, DOI 10.4007/annals.2010.171.2011. MR2680402
[16] Hoyrup M, Rojas C. Computability of probability measures and Martin-Löf randomness over metric spaces, Inform. and Comput., no. 7 (207):830–847, 2009, DOI 10.1016/j.ic.2008.12.009. MR2519075
[17] Kritchman S, Raz R. The surprise examination paradox and the second incompleteness theorem, Notices Amer. Math. Soc., no. 11 (57):1454–1458, 2010. MR2761888
[18] Li M, Vitanyi P. An introduction to Kolmogorov complexity and its applications, Third, Texts in Computer Science, Springer, New York, 2008. https://doi.org/10.1007/978-0-387-49820-1 MR2494387
[19] Lutz JH, Lutz N. Algorithmic information, plane Kakeya sets, and conditional dimension, ACM Trans. Comput. Theory, no. 2 (10):Art. 7, 22, 2018, DOI 10.1145/3201783. MR3811993
[20] Lutz N. Fractal intersections and products via algorithmic dimension (83):Art. No. 58, 12, 2017. MR3755351
[21] Lutz N, Stull DM. Bounding the dimension of points on a line. Theory and applications of models of computation: Springer, Cham; 2017:425–439. MR3655512
[22] Messner J, Thierauf T. A Kolmogorov complexity proof of the Lovasz local lemma for satisfiability, Theoret. Comput. Sci. (461):55–64, 2012, DOI 10.1016/j.tcs.2012.06.005. MR2988089
[23] Miller JS. Extracting information is hard: a Turing degree of non-integral effective Hausdorff dimension, Adv. Math., no. 1 (226):373–384, 2011, DOI 10.1016/j.aim.2010.06.024. MR2735764
[24] Nies A. Computability and randomness, Oxford Logic Guides, vol. 51, Oxford University Press, Oxford, 2009. MR2548883
[25] Simpson SG. Symbolic dynamics: entropy = dimension = complexity, Theory Comput. Syst., no. 3 (56):527–543, 2015, DOI 10.1007/s00224-014-9546-8. MR3334259

Rod Downey and Denis R. Hirschfeldt

Credits

Opening image by erhui1979, courtesy of Getty Images.
Photo of Rod Downey is courtesy of Robert Cross/Victoria University of Wellington.
Photo of Denis R. Hirschfeldt is courtesy of Denis R. Hirschfeldt.
