distribution of streaks of a coin flip!

This is a draft of a paper about how almost every number can be represented by a simple continued fraction of a certain sort, and the fact that this has implications for how often you will get three tails (or four heads, etc.) in a row when you flip a coin...


ϕ-Normal Numbers

Jonah Bloch-Johnson

May 11, 2008

Abstract

The sequence of quotients of the simple continued fraction representation of almost every number in [0,1] contains the natural number k with asymptotic frequency:

\[ \frac{1}{\ln 2} \ln \frac{(k+1)^2}{k(k+2)} \]

This result has a number of applications; for instance, it tells us the distribution of streaks one would get after tossing a coin an infinite number of times.

Introduction

This paper was prepared for Ming-Lun Hsieh for an undergraduate seminar. It is based on research I have been performing with David Bayer. The seminar was on number theory; whereas most number theory is concerned with properties of the natural numbers, this paper is concerned with a property of real numbers found in the interval [0, 1]. More specifically, the paper is centered around defining a property, that of being ϕ-normal, which is named as an analogy with the more familiar normal real numbers (don’t worry if you don’t know these; they are defined in the paper). The subject requires an introduction to continued fractions, Lebesgue integration, and dynamical systems, specifically a dynamical system known as the Gauss transformation, as well as an important theorem of ergodic systems, the Birkhoff Ergodic Theorem. As a result, this paper consists primarily of a condensation of material from three books:

“An Introduction to the Theory of Numbers” by G. H. Hardy and E. M. Wright (Chapter 10)

“Real and Complex Analysis” by Walter Rudin (Chapter 1)

“Introduction to Dynamical Systems” by Michael Brin and Garrett Stuck (Chapters 1 and 4)


I also used a paper found online to help prove the corollary to the Birkhoff Ergodic Theorem. My original material consists of those sections which join together the books and the fleshing out of some of the proofs, as well as the application at the end of the paper, which was inspired by ideas from David Bayer. A note on numbering conventions in this paper: if a theorem is introduced as “6 Theorem:,” then it is referred to in the text as Theorem 6.

A final note: the material in this subject touches upon two rather beautiful structures in number theory: periodic continued fractions and the Stern-Brocot tree. In class, people expressed particular interest in the theorem of Lagrange, that all continued fractions representing irrational roots of integer-coefficient quadratic equations are periodic, as well as its converse. Unfortunately, there is not room in this paper to develop the theory behind either of them, but those who want to investigate can look in chapter 10 of the Hardy and Wright book for the continued fraction material and in chapters 4 and 6 of Donald Knuth’s “Concrete Mathematics” to learn more about the Stern-Brocot tree.

1 The Gauss Transformation and Continued Fractions

1 Definition: A discrete-time dynamical system (X, f) consists of a non-empty set X and a map f : X → X. (f need not be onto.) For n ∈ N, the nth iterate of f is the n-fold composition $f^n = f \circ \cdots \circ f$; we define $f^0$ to be the identity map (also denoted Id).

2 Definition: Let {x} denote the greatest integer less than or equal to x for x ∈ R. The map ϕ : [0, 1] → [0, 1] defined by

\[ \varphi(x) = \begin{cases} \dfrac{1}{x} - \left\{\dfrac{1}{x}\right\} & \text{if } x \in (0, 1] \\ 0 & \text{if } x = 0 \end{cases} \]

is called the Gauss transformation.

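3 Remarks: a) On each interval $\left(\frac{1}{n+1}, \frac{1}{n}\right]$, n ∈ N, $\varphi(x) = \frac{1}{x} - n$. Therefore, on every such interval ϕ(x) is monotonically decreasing, continuous, and has range [0, 1).

b) ∀n ∈ N, $\varphi\left(\frac{1}{n}\right) = n - \{n\} = n - n = 0$.

c) For n ≥ 2, $\lim_{\varepsilon \to 0^+} \varphi\left(\frac{1}{n - \varepsilon}\right) = 1$, since $\frac{1}{n-\varepsilon}$ lies in $\left(\frac{1}{n}, \frac{1}{n-1}\right]$ and so $\varphi\left(\frac{1}{n-\varepsilon}\right) = (n - \varepsilon) - (n - 1) = 1 - \varepsilon$. So, based on 3 b), ϕ is discontinuous at $\frac{1}{n}$ for all n ≥ 2. (For n = 1 the point $\frac{1}{1-\varepsilon}$ lies outside [0, 1], so no discontinuity arises at x = 1.)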
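The three remarks can be checked numerically. The following is a minimal sketch, assuming exact rational arithmetic through Python's `fractions.Fraction`; the function name `gauss_map` is an illustrative choice and does not come from the paper.

```python
from fractions import Fraction
from math import floor


def gauss_map(x: Fraction) -> Fraction:
    """Gauss transformation on [0, 1]: 1/x minus its integer part, with phi(0) = 0."""
    if x == 0:
        return Fraction(0)
    inv = 1 / x
    return inv - floor(inv)


# Remark a): 2/5 lies in (1/3, 1/2], so n = 2 and phi(x) = 1/x - 2.
print(gauss_map(Fraction(2, 5)))       # 1/2

# Remark b): phi(1/n) = 0.
print(gauss_map(Fraction(1, 4)))       # 0

# Remark c): phi(1/(n - eps)) = 1 - eps, which tends to 1 as eps -> 0+.
eps = Fraction(1, 1000)
print(gauss_map(1 / (4 - eps)))        # 999/1000
```

Working with `Fraction` rather than floats avoids rounding trouble right at the discontinuities 1/n.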


When we study a dynamical system (X, f), we are mainly interested in what happens to the elements of X after repeated iterations of f. In the case of the Gauss transformation, questions of this sort are greatly simplified by knowledge of continued fractions.

4 Definition: The function of N+1 variables

\[ x = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_N}}}} \]

is called a finite continued fraction, and is often rewritten $[a_0, a_1, a_2, \ldots, a_N]$. $a_0, a_1, a_2, \ldots, a_N$ are called the partial quotients or simply quotients of the continued fraction. Here are some facts about continued fractions that follow quickly from their definition:

\[ [a_0, a_1] = a_0 + \frac{1}{a_1} \]

\[ [a_0, a_1, \ldots, a_{n-1}, a_n] = \left[a_0, a_1, \ldots, a_{n-2}, a_{n-1} + \frac{1}{a_n}\right] \tag{1} \]

\[ [a_0, a_1, \ldots, a_n] = a_0 + \frac{1}{[a_1, a_2, \ldots, a_n]} = [a_0, [a_1, a_2, \ldots, a_n]] \]

and more generally,

\[ [a_0, a_1, \ldots, a_n] = [a_0, a_1, \ldots, a_{m-1}, [a_m, a_{m+1}, \ldots, a_n]], \quad \text{for } 1 \le m < n \tag{2} \]

5 Definition: We call $[a_0, a_1, \ldots, a_n]$ (0 ≤ n ≤ N) the nth convergent to $[a_0, a_1, \ldots, a_N]$.

6 Theorem: If $p_n$ and $q_n$ are defined by

\[ p_0 = a_0, \quad p_1 = a_1 a_0 + 1, \quad p_n = a_n p_{n-1} + p_{n-2} \quad (2 \le n \le N) \tag{3} \]

\[ q_0 = 1, \quad q_1 = a_1, \quad q_n = a_n q_{n-1} + q_{n-2} \quad (2 \le n \le N) \tag{4} \]

then

\[ [a_0, a_1, \ldots, a_n] = \frac{p_n}{q_n}. \]

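Proof. If n = 0,

\[ \frac{p_0}{q_0} = \frac{a_0}{1} = [a_0]. \]

If n = 1,

\[ \frac{p_1}{q_1} = \frac{a_1 a_0 + 1}{a_1} = a_0 + \frac{1}{a_1} = [a_0, a_1]. \]

Suppose it is true for n ≤ m, where m < N. Then

\[ [a_0, a_1, \ldots, a_{m-1}, a_m] = \frac{p_m}{q_m} = \frac{a_m p_{m-1} + p_{m-2}}{a_m q_{m-1} + q_{m-2}}. \]

It is (1) that allows the recursion to propagate:

\[ \begin{aligned}
\frac{p_{m+1}}{q_{m+1}} &= [a_0, a_1, \ldots, a_{m-1}, a_m, a_{m+1}] = \left[a_0, a_1, \ldots, a_{m-1}, a_m + \frac{1}{a_{m+1}}\right] \\
&= \frac{\left(a_m + \frac{1}{a_{m+1}}\right) p_{m-1} + p_{m-2}}{\left(a_m + \frac{1}{a_{m+1}}\right) q_{m-1} + q_{m-2}}
 = \frac{a_m p_{m-1} + \frac{p_{m-1}}{a_{m+1}} + p_{m-2}}{a_m q_{m-1} + \frac{q_{m-1}}{a_{m+1}} + q_{m-2}} \\
&= \frac{a_{m+1}(a_m p_{m-1} + p_{m-2}) + p_{m-1}}{a_{m+1}(a_m q_{m-1} + q_{m-2}) + q_{m-1}}
 = \frac{a_{m+1} p_m + p_{m-1}}{a_{m+1} q_m + q_{m-1}}
\end{aligned} \]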
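As a concrete illustration of Theorem 6, here is a small sketch that computes the convergents with the recurrences (3) and (4) and compares them with a direct evaluation of the nested fraction. The function names `convergents` and `evaluate` are hypothetical helpers introduced for this example only.

```python
from fractions import Fraction


def convergents(quotients):
    """Return the list of pairs (p_n, q_n) built from recurrences (3) and (4)."""
    a = list(quotients)
    result = [(a[0], 1)]                          # p_0 = a_0, q_0 = 1
    if len(a) == 1:
        return result
    p_prev, q_prev = a[0], 1
    p, q = a[1] * a[0] + 1, a[1]                  # p_1 = a_1 a_0 + 1, q_1 = a_1
    result.append((p, q))
    for a_n in a[2:]:
        p, p_prev = a_n * p + p_prev, p           # p_n = a_n p_{n-1} + p_{n-2}
        q, q_prev = a_n * q + q_prev, q           # q_n = a_n q_{n-1} + q_{n-2}
        result.append((p, q))
    return result


def evaluate(quotients):
    """Evaluate [a_0, a_1, ..., a_n] directly from the nested definition."""
    a = list(quotients)
    value = Fraction(a[-1])
    for a_k in reversed(a[:-1]):
        value = a_k + 1 / value
    return value


a = [3, 7, 15, 1]
print(convergents(a))   # [(3, 1), (22, 7), (333, 106), (355, 113)]
print(evaluate(a))      # 355/113
```

Each pair (p_n, q_n) agrees with `evaluate(a[:n + 1])`, which is exactly the statement of the theorem for this choice of quotients.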

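7 Theorem: $p_n$ and $q_n$, as defined in Theorem 6, satisfy:

\[ p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1} \]

Proof.

\[ \begin{aligned}
p_n q_{n-1} - p_{n-1} q_n &= (a_n p_{n-1} + p_{n-2}) q_{n-1} - p_{n-1}(a_n q_{n-1} + q_{n-2}) \\
&= a_n p_{n-1} q_{n-1} + p_{n-2} q_{n-1} - a_n p_{n-1} q_{n-1} - p_{n-1} q_{n-2} \\
&= -(p_{n-1} q_{n-2} - p_{n-2} q_{n-1}).
\end{aligned} \]

By simply repeating this deduction with n − 1, n − 2, . . . , 2 in place of n, we get

\[ p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1}(p_1 q_0 - p_0 q_1) = (-1)^{n-1}\big((a_0 a_1 + 1) \cdot 1 - a_0 a_1\big) = (-1)^{n-1}. \]

Further, we have

\[ \frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}} = \frac{p_n q_{n-1}}{q_n q_{n-1}} - \frac{p_{n-1} q_n}{q_n q_{n-1}} = \frac{p_n q_{n-1} - p_{n-1} q_n}{q_n q_{n-1}} = \frac{(-1)^{n-1}}{q_n q_{n-1}} \tag{5} \]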


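8 Theorem:

\[ p_n q_{n-2} - p_{n-2} q_n = (-1)^n a_n \]

Proof.

\[ \begin{aligned}
p_n q_{n-2} - p_{n-2} q_n &= (a_n p_{n-1} + p_{n-2}) q_{n-2} - p_{n-2}(a_n q_{n-1} + q_{n-2}) \\
&= a_n p_{n-1} q_{n-2} + p_{n-2} q_{n-2} - a_n p_{n-2} q_{n-1} - p_{n-2} q_{n-2} \\
&= a_n (p_{n-1} q_{n-2} - p_{n-2} q_{n-1}) = (-1)^{n-2} a_n = (-1)^n a_n.
\end{aligned} \]

Thus, dividing by $q_n q_{n-2}$,

\[ \frac{p_n}{q_n} - \frac{p_{n-2}}{q_{n-2}} = \frac{(-1)^n a_n}{q_n q_{n-2}} \tag{6} \]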
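The two determinant-style identities of Theorems 7 and 8 are easy to spot-check on an example. The sketch below assumes the conventional seed values p_{-1} = 1 and q_{-1} = 0 (not used in the paper, but they reproduce p_1 = a_1 a_0 + 1 and q_1 = a_1 from (3) and (4)); the helper name `convergent_pairs` is hypothetical.

```python
def convergent_pairs(a):
    """Return [(p_0, q_0), (p_1, q_1), ...] for the quotients a.

    Seeds p_{-1} = 1, q_{-1} = 0 are an assumption of this sketch; with them
    the single recurrence p_n = a_n p_{n-1} + p_{n-2} covers n = 1 as well.
    """
    p_prev, q_prev = 1, 0
    p, q = a[0], 1
    pairs = [(p, q)]
    for a_n in a[1:]:
        p, p_prev = a_n * p + p_prev, p
        q, q_prev = a_n * q + q_prev, q
        pairs.append((p, q))
    return pairs


a = [2, 1, 3, 4, 2]
pq = convergent_pairs(a)
print(pq)   # [(2, 1), (3, 1), (11, 4), (47, 17), (105, 38)]

# Theorem 7: p_n q_{n-1} - p_{n-1} q_n = (-1)^(n-1) for n >= 1.
for n in range(1, len(a)):
    assert pq[n][0] * pq[n - 1][1] - pq[n - 1][0] * pq[n][1] == (-1) ** (n - 1)

# Theorem 8: p_n q_{n-2} - p_{n-2} q_n = (-1)^n a_n for n >= 2.
for n in range(2, len(a)):
    assert pq[n][0] * pq[n - 2][1] - pq[n - 2][0] * pq[n][1] == (-1) ** n * a[n]
```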

In general, we write

\[ x_n = \frac{p_n}{q_n} = \text{the } n\text{th convergent} \]

and $x = x_N$. Based on (2) and Theorem 6,

\[ x = [a_0, a_1, \ldots, a_N] = [a_0, a_1, \ldots, a_{m-1}, [a_m, a_{m+1}, \ldots, a_N]] = \frac{[a_m, a_{m+1}, \ldots, a_N]\, p_{m-1} + p_{m-2}}{[a_m, a_{m+1}, \ldots, a_N]\, q_{m-1} + q_{m-2}} \tag{7} \]

for 2 ≤ m ≤ N.

9 Theorem: Suppose that

\[ a_j > 0 \quad (1 \le j \le N). \tag{8} \]

Then

a) The even convergents $x_{2n}$ increase strictly with n, while the odd convergents $x_{2n+1}$ decrease strictly.

Proof. Based on (4) and (8), we have that every $q_n$ is positive, as is every $a_n$ where 1 ≤ n ≤ N. Hence

\[ \frac{(-1)^n a_n}{q_{n-2} q_n} \]

has the sign of $(-1)^n$. By (6),

\[ \frac{(-1)^n a_n}{q_{n-2} q_n} = \frac{p_n}{q_n} - \frac{p_{n-2}}{q_{n-2}} = x_n - x_{n-2}. \]

Thus, $x_n - x_{n-2}$ has the sign of $(-1)^n$; so if n is even, $x_n > x_{n-2}$, and if n is odd, $x_n < x_{n-2}$, ∀n > 1.


b) Every odd convergent is greater than any even convergent.

Proof. Every $q_n$ is positive, so

\[ \frac{(-1)^{n-1}}{q_n q_{n-1}} \]

has the sign of $(-1)^{n-1}$. So, by (5), $x_n - x_{n-1}$ has the sign of $(-1)^{n-1}$. Specifically, $x_{2m+1} - x_{2m}$ has the sign $(-1)^{(2m+1)-1} = (-1)^{2m} = 1$, so

\[ x_{2m+1} > x_{2m}. \tag{9} \]

Suppose Theorem 9 b) were false. Then ∃m, µ ∈ N s.t.

\[ x_{2m+1} \le x_{2\mu}. \]

If µ = m, this contradicts (9) directly. If µ < m, then by Thm 9 a), $x_{2m+1} \le x_{2\mu} < x_{2m}$, contradicting (9); if µ > m, then $x_{2\mu+1} < x_{2m+1} \le x_{2\mu}$, also contradicting (9).

c) The value of the continued fraction is greater than that of any of its even convergents and less than that of any of its odd convergents (except that it is equal to the last convergent, whether this be even or odd).

Proof. $x = x_N$ is the greatest of the even or the least of the odd convergents, in either of which case 9 a) and b) show 9 c) to be true.

10 Definition: A continued fraction $[a_0, a_1, \ldots, a_N]$ wherein $a_0 \in \mathbb{Z}$ and $a_j \in \mathbb{N}$ ∀j s.t. 1 ≤ j ≤ N is called a simple continued fraction.

Theorem 9 holds for simple continued fractions; further, from (3) and (4) we have that for a simple continued fraction, $p_n \in \mathbb{Z}$ and $q_n \in \mathbb{N}$. If $[a_0, a_1, a_2, \ldots, a_N] = \frac{p_N}{q_N} = x$, we say that x (which ∈ Q) is represented by that continued fraction.

11 Theorem: a) $q_n \ge q_{n-1}$ for n ≥ 1, with strict inequality when n > 1, and

b) $q_n \ge n$, with strict inequality when n > 3.

Proof. $q_0 = 1$, $q_1 = a_1 \ge 1 = q_0$. For n > 1, $q_{n-2} \ge 1$; $q_n = a_n q_{n-1} + q_{n-2} \ge q_{n-1} + 1$, so $q_n > q_{n-1}$, and by induction $q_n \ge n$. Finally, if n > 3, then $q_{n-2} \ge n - 2 > 1$, so $q_n \ge q_{n-1} + q_{n-2} > q_{n-1} + 1 \ge n$, and hence $q_n > n$.

12 Theorem: The convergents to a simple continued fraction are in their lowest terms.

Proof. $p_n q_{n-1} - p_{n-1} q_n = (-1)^{n-1}$. So, $d \mid p_n$ and $d \mid q_n$ → $d \mid (-1)^{n-1}$ → $d \mid 1$; thus, $\gcd(p_n, q_n) = 1$.

13 Theorem: If x is representable by a simple continued fraction with an odd number of convergents, then it is also representable by one with an even number, and vice versa.

Proof. If $a_N \ge 2$, $[a_0, a_1, \ldots, a_N] = [a_0, a_1, \ldots, a_N - 1, 1]$; if $a_N = 1$, $[a_0, a_1, \ldots, a_{N-1}, 1] = [a_0, a_1, \ldots, a_{N-2}, a_{N-1} + 1]$.

14 Definition: We call $a'_n = [a_n, a_{n+1}, \ldots, a_N]$ (0 ≤ n ≤ N) the nth complete quotient of the continued fraction $[a_0, a_1, \ldots, a_n, \ldots, a_N]$. Thus:

\[ x = a'_0, \qquad x = a_0 + \frac{1}{a'_1} \]

From (7), we have, for 2 ≤ n ≤ N,

\[ x = \frac{a'_n p_{n-1} + p_{n-2}}{a'_n q_{n-1} + q_{n-2}} \tag{10} \]

15 Theorem: $a_n = \{a'_n\}$, except $a_{N-1} = \{a'_{N-1}\} - 1$ when $a_N = 1$.

Proof. If N = 0, then $a_0 = a'_0 = \{a'_0\}$. If N > 0, then, for 0 ≤ n ≤ N − 1,

\[ a'_n = a_n + \frac{1}{a'_{n+1}}. \tag{11} \]

Lemma: $a'_{n+1} > 1$ (0 ≤ n ≤ N − 1), except that $a'_{n+1} = 1$ when n = N − 1 and $a_N = 1$.

Proof. We will prove this using backwards induction. For n = N − 1, $a'_{(N-1)+1} = a'_N = a_N$; if $a_N = 1$, this is the exceptional case $a'_N = 1$. Otherwise, since $a_N \in \mathbb{N}$, $a_N > 1$, so $a'_{(N-1)+1} > 1$.

Suppose $a'_{n+1} \ge 1$, where 1 ≤ n ≤ N − 1. Then $0 < \frac{1}{a'_{n+1}}$ and $1 \le a_n$. So, $a'_n = a_n + \frac{1}{a'_{n+1}} > 1 + 0 = 1$.

Since $a'_{n+1} > 1$ for 0 ≤ n ≤ N − 1 (except when n = N − 1 and $a_N = 1$), we have $0 < \frac{1}{a'_{n+1}} < 1$. Thus,

\[ a_n < a_n + \frac{1}{a'_{n+1}} = a'_n < a_n + 1. \tag{12} \]

So, for n ≤ N − 1, we have $a_n = \{a'_n\}$, except when n = N − 1 and $a_N = 1$, in which case $a'_{N-1} = a_{N-1} + \frac{1}{a_N} = a_{N-1} + 1$, so that $a'_{N-1} = \{a'_{N-1}\}$. Further, $a_{N-1} = a'_{N-1} - \frac{1}{a_N} = a'_{N-1} - \frac{1}{1} = a'_{N-1} - 1 = \{a'_{N-1}\} - 1$. Finally, $a_N = a'_N = \{a'_N\}$.

16 Definition: Two continued fractions are identical if they are formed by the same exact series of quotients.

17 Theorem: If two simple continued fractions $[a_0, a_1, \ldots, a_N]$ and $[b_0, b_1, \ldots, b_M]$ have the same value x, and $a_N > 1$ and $b_M > 1$, then M = N, and the fractions are identical.

Proof. By Theorem 15, $a_0 = \{a'_0\} = \{x\} = \{b'_0\} = b_0$. Suppose the first n partial quotients in the continued fractions are identical, and that $a'_n$ and $b'_n$ are the complete quotients. Then $x = [a_0, a_1, \ldots, a_{n-1}, a'_n] = [a_0, a_1, \ldots, a_{n-1}, b'_n]$.

If n = 1, $a_0 + \frac{1}{a'_1} = a_0 + \frac{1}{b'_1}$; so $a'_1 = b'_1$; so $a_1 = \{a'_1\} = \{b'_1\} = b_1$.

If n > 1, then by (10),

\[ x = \frac{a'_n p_{n-1} + p_{n-2}}{a'_n q_{n-1} + q_{n-2}} = \frac{b'_n p_{n-1} + p_{n-2}}{b'_n q_{n-1} + q_{n-2}}. \]

Cross-multiplying and expanding,

\[ \begin{aligned}
(a'_n p_{n-1} + p_{n-2})(b'_n q_{n-1} + q_{n-2}) - (b'_n p_{n-1} + p_{n-2})(a'_n q_{n-1} + q_{n-2}) &= 0, \\
a'_n b'_n p_{n-1} q_{n-1} + p_{n-2} q_{n-2} + a'_n p_{n-1} q_{n-2} + b'_n p_{n-2} q_{n-1} \qquad & \\
{} - (a'_n b'_n p_{n-1} q_{n-1} + p_{n-2} q_{n-2} + b'_n p_{n-1} q_{n-2} + a'_n p_{n-2} q_{n-1}) &= 0, \\
(a'_n - b'_n)(p_{n-1} q_{n-2} - p_{n-2} q_{n-1}) &= 0, \\
(a'_n - b'_n)(-1)^n &= 0, \\
a'_n &= b'_n.
\end{aligned} \]

Since $a_N > 1$ and $b_M > 1$, Theorem 15 gives $a_n = \{a'_n\} = \{b'_n\} = b_n$, so $a_n = b_n$. Suppose, without loss of generality, that N ≤ M. Then our argument shows that $a_n = b_n$ for n ≤ N. If N < M, then

\[ \frac{p_N}{q_N} = [a_0, a_1, \ldots, a_N] = [b_0, b_1, \ldots, b_N, b_{N+1}, \ldots, b_M] = \frac{b'_{N+1} p_N + p_{N-1}}{b'_{N+1} q_N + q_{N-1}}, \]

\[ p_N (b'_{N+1} q_N + q_{N-1}) = q_N (b'_{N+1} p_N + p_{N-1}), \]

\[ p_N q_{N-1} - p_{N-1} q_N = 0. \]

This contradicts Theorem 7. So, M = N, and the fractions are identical.


Let x be any real number, and let $a_0 = \{x\}$. Then

\[ x = a_0 + \xi_0, \quad 0 \le \xi_0 < 1. \]

If $\xi_0 \ne 0$, then we can write

\[ \frac{1}{\xi_0} = a'_1, \quad \{a'_1\} = a_1, \quad a'_1 = a_1 + \xi_1, \quad 0 \le \xi_1 < 1. \]

If $\xi_1 \ne 0$, then we can write

\[ \frac{1}{\xi_1} = a'_2, \quad \{a'_2\} = a_2, \quad a'_2 = a_2 + \xi_2, \quad 0 \le \xi_2 < 1. \]

In general, $\frac{1}{\xi_{n-1}} = a'_n > 1$, so $a_n \ge 1$ for n ≥ 1.

Thus, $x = [a_0, a'_1] = \left[a_0, a_1 + \frac{1}{a'_2}\right] = [a_0, a_1, a'_2] = [a_0, a_1, a_2, a'_3] = \ldots$, where $a_0, a_1, \ldots$ are integers, and ∀j ∈ N, $a_j \in \mathbb{N}$.

18 Definition: The system of equations

\[ \begin{aligned}
x &= a_0 + \xi_0 & (0 \le \xi_0 < 1) \\
\frac{1}{\xi_0} &= a'_1 = a_1 + \xi_1 & (0 \le \xi_1 < 1) \\
\frac{1}{\xi_1} &= a'_2 = a_2 + \xi_2 & (0 \le \xi_2 < 1) \\
&\ \ \vdots
\end{aligned} \]

is known as the continued fraction algorithm. The algorithm continues so long as $\xi_n \ne 0$. If we eventually reach a value of n, say N, for which $\xi_N = 0$, then the algorithm terminates and we have:

\[ x = [a_0, a_1, \ldots, a_N] \]

In this case, x is represented by a simple continued fraction, and the numbers $a'_n$ are the complete quotients of the continued fraction.
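For rational input the continued fraction algorithm can be run exactly. The following is a minimal sketch, assuming the input is a `fractions.Fraction`; the name `continued_fraction` is an illustrative choice, and the variable `xi` plays the role of the fractional part ξ_n above.

```python
from fractions import Fraction
from math import floor


def continued_fraction(x: Fraction) -> list:
    """Run the continued fraction algorithm on a rational x.

    Returns the quotients [a_0, a_1, ..., a_N]; the loop stops when the
    fractional part xi_n becomes 0.
    """
    quotients = []
    while True:
        a_n = floor(x)          # a_n = {a'_n}, the integer part
        quotients.append(a_n)
        xi = x - a_n            # xi_n, with 0 <= xi_n < 1
        if xi == 0:
            return quotients
        x = 1 / xi              # a'_{n+1} = 1 / xi_n


print(continued_fraction(Fraction(355, 113)))   # [3, 7, 16]
print(continued_fraction(Fraction(-7, 3)))      # [-3, 1, 2]
```

For an irrational input the loop would never terminate, so a floating-point or truncated variant would be needed there; that case is deliberately left out of this sketch.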

19 Theorem: Any rational number x can be represented by a finite simple continued fraction.

Proof. Put x into the continued fraction algorithm.

Case 1: x is an integer. Then $\xi_0 = 0$, so $x = a_0$.

Case 2: x is not an integer. Then $x = \frac{h}{k}$, h ∈ Z, k ∈ N.

Since $\frac{h}{k} = a_0 + \xi_0$, we have $h = a_0 k + \xi_0 k$. Let $k_1 = \xi_0 k$. ($k_1$ is the remainder when h is divided by k: $k_1 = \left(\frac{h}{k} - \left\{\frac{h}{k}\right\}\right) k$.)

So, $a'_1 = \frac{k}{k_1} = \{a'_1\} + \xi_1 = a_1 + \xi_1$, and $k = a_1 k_1 + \xi_1 k_1$.

Let $k_2 = \xi_1 k_1$; then $k = a_1 k_1 + k_2$, where $k_2$ is the remainder when k is divided by $k_1$. We can continue this process, where $k_1 = a_2 k_2 + k_3$, $k_2 = a_3 k_3 + k_4$, . . . , $k_{n-2} = a_{n-1} k_{n-1} + k_n$ (writing $k_0 = k$), where $a_{n-1} = \left\{\frac{k_{n-2}}{k_{n-1}}\right\}$ and $k_n$ is the remainder when $k_{n-2}$ is divided by $k_{n-1}$ (one implies the other). The continued fraction algorithm continues as long as $\xi_n \ne 0$;

\[ \frac{1}{\xi_n} = a'_{n+1} = \frac{k_n}{k_{n+1}}. \]

So, $\xi_n = \frac{k_{n+1}}{k_n}$; thus, $\xi_n = 0 \leftrightarrow k_{n+1} = 0$.

So, the continued fraction algorithm continues as long as $k_{n+1} \ne 0$. Since $k_j$ is the remainder when $k_{j-2}$ is divided by $k_{j-1}$, $k_j \ge 0$ ∀j ∈ N, so $k, k_1, k_2, \ldots$ is a sequence of non-negative integers. Since $k_{n+1} = \xi_n k_n$ with $0 \le \xi_n < 1$, $k_n > k_{n+1}$, so this sequence is strictly decreasing. Therefore, $k_{N+1} = 0$ for some N. Therefore, $\xi_N = 0$ for some N, and so the continued fraction algorithm terminates, giving us a finite simple continued fraction that represents x.

20 Remark. The system of equations used in the proof of Theorem 19 is Euclid’s algorithm, used for determining gcd(h, k).

21 Remark. Since $\xi_N = 0$, $a'_N = a_N$; also, for N ≥ 1,

\[ 0 < \frac{1}{a_N} = \frac{1}{a'_N} = \xi_{N-1} < 1; \quad a_N > 1, \text{ and so } a_N \ge 2. \]

Hence the continued fraction algorithm gives us a representation of the type that was shown to be unique in Theorem 17. For any such representation, we can always make the variation of Theorem 13. This gives us:

22 Theorem: A rational number can be expressed as a finite simple continued fraction in just two ways, one with N convergents and one with N + 1 convergents; in the former, the last partial quotient is > 1, and in the latter, it is = 1. The continued fraction algorithm always gives us the first type of representation.
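The passage between the two representations of Theorem 22 (the variation of Theorem 13) is mechanical, as the sketch below suggests. Both helper names, `evaluate` and `other_representation`, are hypothetical and introduced only for this illustration.

```python
from fractions import Fraction


def evaluate(quotients):
    """Evaluate [a_0, a_1, ..., a_N] directly from the nested definition."""
    value = Fraction(quotients[-1])
    for a_k in reversed(quotients[:-1]):
        value = a_k + 1 / value
    return value


def other_representation(quotients):
    """Switch between the two finite simple continued fractions of a rational.

    If the last quotient is 1 (and it is not the only quotient), merge it into
    its predecessor; otherwise split the last quotient a_N into a_N - 1, 1.
    """
    a = list(quotients)
    if len(a) > 1 and a[-1] == 1:
        return a[:-2] + [a[-2] + 1]
    return a[:-1] + [a[-1] - 1, 1]


short = [3, 7, 16]
long = other_representation(short)
print(long)                                   # [3, 7, 15, 1]
print(evaluate(short), evaluate(long))        # 355/113 355/113
print(other_representation(long))             # [3, 7, 16]
```

The form with last quotient greater than 1 is the one the continued fraction algorithm produces.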

What about irrational numbers? If we put one into the continued fraction algorithm and the algorithm terminated, we would have a finite simple continued fraction representing an irrational number. But it is clear that finite simple continued fractions are rational, and so this gives us a contradiction. So, we must have that the continued fraction algorithm never terminates when given an irrational number. This suggests that irrational numbers might be represented by infinite continued fractions, so let’s develop a theory for those:


23 Definition: Suppose that we have a sequence $a_0, a_1, a_2, \ldots$ where $a_0 \in \mathbb{Z}$ and $a_j \in \mathbb{N}$ ∀j ∈ N. Then ∀n ∈ N, $[a_0, a_1, \ldots, a_n]$ is a simple continued fraction, which in turn represents a rational number, let us say $x_n$. If $\lim_{n \to \infty} x_n$ exists, and we call it x, then we say that the simple continued fraction $[a_0, a_1, a_2, \ldots]$ converges to the value x and write $x = [a_0, a_1, a_2, \ldots]$.

24 Theorem: If $a_0, a_1, a_2, \ldots$ is a sequence of integers as described above ($a_0 \in \mathbb{Z}$, ∀j ∈ N $a_j \in \mathbb{N}$), then $x_n = [a_0, a_1, \ldots, a_n]$ tends to a limit x when n → ∞; or, in other words, all infinite simple continued fractions are convergent.

Proof. $x_n$ is called a convergent to $[a_0, a_1, a_2, \ldots]$ if

\[ x_n = \frac{p_n}{q_n} = [a_0, a_1, \ldots, a_n]. \]

If N ≥ n, then $x_n$ is also a convergent to $[a_0, a_1, \ldots, a_N]$. By Thm 9 a), the even convergents form an increasing sequence and the odd convergents form a decreasing one. Every even convergent is less than $x_1$, by Thm 9 b), so that the increasing sequence of even convergents is bounded above; and every odd convergent is greater than $x_0$, so that the decreasing sequence of odd convergents is bounded below. Hence, the even convergents tend to a limit $\ell_1$, and the odd convergents tend to a limit $\ell_2$, and $\ell_1 \le \ell_2$. By (5),

\[ \left| \frac{p_{2n}}{q_{2n}} - \frac{p_{2n-1}}{q_{2n-1}} \right| = \left| \frac{p_{2n} q_{2n-1} - p_{2n-1} q_{2n}}{q_{2n} q_{2n-1}} \right| = \left| \frac{(-1)^{2n-1}}{q_{2n} q_{2n-1}} \right| = \frac{1}{q_{2n} q_{2n-1}}. \]

Since $q_n \ge n$ (Theorem 11),

\[ \frac{1}{q_{2n} q_{2n-1}} \le \frac{1}{2n(2n-1)}. \]

As n → ∞, $\frac{1}{2n(2n-1)} \to 0$, so $\left| \frac{p_{2n}}{q_{2n}} - \frac{p_{2n-1}}{q_{2n-1}} \right| \to 0$.

Thus, $\ell_1 = \ell_2$. Let us define $x := \ell_1 = \ell_2$; then $[a_0, a_1, a_2, \ldots]$ converges to this x.

This proof also gives us the following theorem:

25 Theorem: An infinite simple continued fraction is less than any of its odd convergents and greater than any of its even convergents.

We call $a'_n = [a_n, a_{n+1}, \ldots]$ the nth complete quotient of the continued fraction $x = [a_0, a_1, \ldots]$. Some familiar facts follow:

\[ a'_n = \lim_{N \to \infty} [a_n, a_{n+1}, \ldots, a_N] = a_n + \lim_{N \to \infty} \frac{1}{[a_{n+1}, \ldots, a_N]} = a_n + \frac{1}{a'_{n+1}}. \]


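For instance,

\[ x = a'_0 = a_0 + \frac{1}{a'_1}. \]

Also,

\[ a'_{n+1} > a_{n+1} \ge 1, \text{ so } 0 < \frac{1}{a'_{n+1}} < 1; \]

since $a'_n > a_n$, $a_n = a'_n - \frac{1}{a'_{n+1}}$, and $a_n$ is an integer, we have:

\[ a_n = \{a'_n\} \]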

This gives us:

26 Theorem: If [a0, a1, a2, . . .] = x, then

a0 = {x}, an = {a′n}.

Also, through a parallel argument to Thm 17, we have

27 Theorem: Two infinite simple continued fractions which have the same value are identical.

This tells us that if an irrational number can be expressed as an infinite continued fraction, then its representation is unique. Consider the continued fraction algorithm; if x is irrational, the algorithm doesn’t terminate, and hence it gives us a sequence of integers $a_0, a_1, a_2, \ldots$ and numbers $a'_0, a'_1, a'_2, \ldots$, where

\[ x = [a_0, a'_1] = [a_0, a_1, a'_2] = \ldots = [a_0, a_1, a_2, \ldots, a_n, a'_{n+1}], \]

and where, by (10), $x = \dfrac{a'_{n+1} p_n + p_{n-1}}{a'_{n+1} q_n + q_{n-1}}$, so

\[ x - \frac{p_n}{q_n} = \frac{a'_{n+1} p_n + p_{n-1}}{a'_{n+1} q_n + q_{n-1}} - \frac{p_n}{q_n} = \frac{a'_{n+1} p_n q_n + p_{n-1} q_n - (a'_{n+1} p_n q_n + p_n q_{n-1})}{q_n (a'_{n+1} q_n + q_{n-1})} = \frac{(-1)^n}{q_n (a'_{n+1} q_n + q_{n-1})}; \]

so, since $a'_{n+1} > a_{n+1}$,

\[ \left| x - \frac{p_n}{q_n} \right| < \frac{1}{q_n (a_{n+1} q_n + q_{n-1})} = \frac{1}{q_n q_{n+1}} \le \frac{1}{n(n+1)}, \]

which → 0 as n → ∞.

Thus:

\[ x = \lim_{n \to \infty} \frac{p_n}{q_n} = [a_0, a_1, a_2, \ldots, a_n, \ldots] \]

So, the algorithm gives a continued fraction whose value is x. Further, this representation is unique; each irrational number can be represented in at least one way and at most one way as a simple continued fraction.


28 Theorem: Every irrational number can be expressed in exactly one way by an infinite simple continued fraction.

We have now done a lot of work on continued fractions, and have gotten some important results: by putting a real number through the continued fraction algorithm, we get a simple continued fraction equal to that number; if the real number is rational, we will get a finite simple continued fraction whose last quotient is greater than 1 (unless the number is an integer, when the fraction is just $[a_0]$), and which is the unique expression for that rational number among finite simple continued fractions with last quotient greater than 1; if the real number is irrational, we will get an infinite simple continued fraction which is the unique expression of that irrational number as a simple continued fraction.

Consider this algorithm. In general, it states that $a'_n = a_n + \xi_n$, or in other words, $a'_n = \{a'_n\} + \frac{1}{a'_{n+1}}$ whenever $\xi_n \ne 0$. So,

\[ a'_n - \{a'_n\} = \frac{1}{a'_{n+1}} \tag{13} \]

Recall the Gauss transformation:

\[ \varphi(x) = \begin{cases} \dfrac{1}{x} - \left\{\dfrac{1}{x}\right\} & \text{if } x \in (0, 1] \\ 0 & \text{if } x = 0 \end{cases} \]

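Suppose n ≥ 0 and $0 < \xi_n < 1$; then $\xi_n = \frac{1}{a'_{n+1}}$.

For n ≥ 1, $0 < \frac{1}{a'_n} \le 1$, so $\varphi\left(\frac{1}{a'_n}\right)$ is well defined, and, by (13),

\[ \varphi\left(\frac{1}{a'_n}\right) = a'_n - \{a'_n\} = \frac{1}{a'_{n+1}}. \]

Suppose that for n = N, $\xi_n = 0$; then the continued fraction algorithm terminates, and we have that for the final complete quotient $a'_N$,

\[ \varphi\left(\frac{1}{a'_N}\right) = a'_N - \{a'_N\} = 0. \]

Therefore, we have

\[ \varphi^{n-1}\left(\frac{1}{a'_1}\right) = \frac{1}{a'_n} \]

unless the continued fraction is finite and n > N, in which case $\varphi^{n-1}\left(\frac{1}{a'_1}\right) = 0$.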


Since $a_n = \{a'_n\}$,

\[ a_n = \left\{ \frac{1}{\varphi^{n-1}\left(\frac{1}{a'_1}\right)} \right\}. \tag{14} \]

Let us restrict our attention to a real number x in the interval [0, 1). The simple continued fraction expansion of x generated by the continued fraction algorithm will have to have $a_0 = \{x\} = 0$. Therefore,

\[ \text{if } x \in (0, 1), \text{ then } x = a_0 + \frac{1}{a'_1} = \frac{1}{a'_1}. \]

This and (14) give us

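29 Theorem: If x ∈ [0, 1], its continued fraction expansion as obtained through the continued fraction algorithm is:

[1] if x = 1,

\[ \left[0, \left\{\frac{1}{x}\right\}, \left\{\frac{1}{\varphi(x)}\right\}, \left\{\frac{1}{\varphi^2(x)}\right\}, \ldots, \left\{\frac{1}{\varphi^{N-1}(x)}\right\}\right] \text{ if } x \text{ is rational, and} \]

\[ \left[0, \left\{\frac{1}{x}\right\}, \left\{\frac{1}{\varphi(x)}\right\}, \left\{\frac{1}{\varphi^2(x)}\right\}, \ldots, \left\{\frac{1}{\varphi^{n-1}(x)}\right\}, \ldots\right] \text{ if } x \text{ is irrational.} \]

Further, if x is rational, and $a_N$ is the last quotient of its continued fraction expansion as obtained through the algorithm, then ∀m ≥ N, $\varphi^m(x) = 0$.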

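For the next theorem, we will need a lemma:

Lemma: For x ∈ [0, 1], ϕ(x) ∈ Q iff x ∈ Q.

Proof. If x = 0, then ϕ(x) = 0 ∈ Q. If x ∈ Q and x ≠ 0, then $x = \frac{p}{q}$ with p, q ∈ N, and

\[ \varphi(x) = \frac{q}{p} - \left\{\frac{q}{p}\right\} = \frac{q}{p} - \frac{q - (q \bmod p)}{p} = \frac{q \bmod p}{p} \in \mathbb{Q}. \]

Suppose x ∉ Q, but ϕ(x) ∈ Q. Then $\varphi(x) = \frac{a}{b}$, a, b ∈ Z, b ≠ 0, and

\[ \varphi(x) = \frac{a}{b} = \frac{1}{x} - \left\{\frac{1}{x}\right\}, \]

\[ \frac{a + b\left\{\frac{1}{x}\right\}}{b} = \frac{1}{x}, \]

\[ x = \frac{b}{a + b\left\{\frac{1}{x}\right\}}, \]

where b and $a + b\left\{\frac{1}{x}\right\}$ are nonzero integers, so x ∈ Q, giving us a contradiction.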


We get that if x is rational, ϕ(x) is rational, and if x is irrational, then ϕ(x) is irrational. Suppose x ∈ (0, 1) is rational, and suppose that its fraction generated by the continued fraction algorithm terminates after $a_N$. Then $\xi_N = 0$, so $a'_N = a_N = \{a'_N\}$; since $a'_N = \frac{1}{\varphi^{N-1}(x)}$, this says $\frac{1}{\varphi^{N-1}(x)} = \left\{\frac{1}{\varphi^{N-1}(x)}\right\}$. Further, $a'_n > a_n$ ∀n < N, and so,

\[ \frac{1}{\varphi^{n-1}(x)} > \left\{\frac{1}{\varphi^{n-1}(x)}\right\} \quad \forall n \text{ s.t. } 1 < n < N. \]

Consider then the fraction generated by putting ϕ(x) through the algorithm. Since $\varphi^{n-1}(x) = \varphi^{n-2}(\varphi(x))$, the inequality above reads $\frac{1}{\varphi^{n-2}(\varphi(x))} > \left\{\frac{1}{\varphi^{n-2}(\varphi(x))}\right\}$ ∀n s.t. 1 < n < N, or in other words, $\frac{1}{\varphi^{n-1}(\varphi(x))} > \left\{\frac{1}{\varphi^{n-1}(\varphi(x))}\right\}$ ∀n s.t. 1 ≤ n < N − 1. Thus, if $b_j$ are the partial quotients of the fraction generated by ϕ(x), with complete quotients $b'_j$ and fractional parts $\xi_j$, then $b'_n > b_n$, so $\xi_n > 0$, ∀n s.t. 1 ≤ n < N − 1.

Therefore, the fraction does not terminate anywhere before reaching $b_{N-1}$. However, we have

\[ \frac{1}{\varphi^{N-1}(x)} = \left\{\frac{1}{\varphi^{N-1}(x)}\right\}, \quad \frac{1}{\varphi^{N-2}(\varphi(x))} = \left\{\frac{1}{\varphi^{N-2}(\varphi(x))}\right\}, \quad b'_{N-1} = b_{N-1}, \quad \xi_{N-1} = 0, \]

so that the fraction terminates after $b_{N-1}$. Therefore, combining this result with Thm 29, we get that if x is rational and x ∈ [0, 1), and the simple continued fraction generated by putting x through the continued fraction algorithm ends with $a_N$ (the (N+1)th term), then the simple continued fraction generated by putting ϕ(x) through the algorithm ends with its Nth term, and looks like this:

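\[ \begin{aligned}
\varphi(x) &= \left[0, \left\{\frac{1}{\varphi(x)}\right\}, \left\{\frac{1}{\varphi(\varphi(x))}\right\}, \left\{\frac{1}{\varphi^2(\varphi(x))}\right\}, \ldots, \left\{\frac{1}{\varphi^{N-2}(\varphi(x))}\right\}\right] \\
&= \left[0, \left\{\frac{1}{\varphi(x)}\right\}, \left\{\frac{1}{\varphi^2(x)}\right\}, \left\{\frac{1}{\varphi^3(x)}\right\}, \ldots, \left\{\frac{1}{\varphi^{N-1}(x)}\right\}\right]
\end{aligned} \tag{15} \]

(remember, $x = \left[0, \left\{\frac{1}{x}\right\}, \left\{\frac{1}{\varphi(x)}\right\}, \left\{\frac{1}{\varphi^2(x)}\right\}, \ldots, \left\{\frac{1}{\varphi^{N-1}(x)}\right\}\right]$).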

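Similarly, if x is irrational and x ∈ [0, 1], then ϕ(x) is irrational, and thus:

\[ \begin{aligned}
\varphi(x) &= \left[0, \left\{\frac{1}{\varphi(x)}\right\}, \left\{\frac{1}{\varphi(\varphi(x))}\right\}, \left\{\frac{1}{\varphi^2(\varphi(x))}\right\}, \ldots\right] \\
&= \left[0, \left\{\frac{1}{\varphi(x)}\right\}, \left\{\frac{1}{\varphi^2(x)}\right\}, \left\{\frac{1}{\varphi^3(x)}\right\}, \ldots\right]
\end{aligned} \tag{16} \]

(remember that $x = \left[0, \left\{\frac{1}{x}\right\}, \left\{\frac{1}{\varphi(x)}\right\}, \left\{\frac{1}{\varphi^2(x)}\right\}, \ldots\right]$).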


30 Definition: $a_n(x)$ := the (n+1)th quotient, $a_n$, that one would get if one put x through the continued fraction algorithm.

Using this definition, (15) and (16), we get

31 Theorem: For x ∈ [0, 1): if n = 0, $a_n(\varphi(x)) = a_n(x) = 0$; if n ≥ 1, $a_n(\varphi(x)) = a_{n+1}(x)$.

In other words, to get the continued fraction of ϕ(x) from the continued fraction of x, simply remove $a_1(x)$ and shift all the entries following it over by one:

x = [0, a1(x), a2(x), a3(x), . . .]

ϕ(x) = [0, a2(x), a3(x), a4(x), . . .]

This can be repeated, so that

32 Corollary: For x ∈ [0, 1), n ≥ 1, an(ϕm(x)) = an+m(x).

Once more, in other words, to get the expansion of $\varphi^m(x)$ from that of x, remove $a_1$ through $a_m$ of x, and shift the remaining terms over:

x = [0, a1(x), a2(x), a3(x), . . . , am(x), am+1(x), am+2(x), . . .]

ϕm(x) = [0, am+1(x), am+2(x), am+3(x), . . .]

(Remember, if x is rational, and its expansion has N terms after the leading 0, then the expansion of $\varphi^m(x)$ has N − m such terms, for m ≤ N.)
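The shift behaviour of Theorem 31 and Corollary 32 can be observed directly on a rational example. This is a sketch under the assumption of exact `Fraction` arithmetic; `gauss_map` and `continued_fraction` are illustrative helper names, not notation from the paper.

```python
from fractions import Fraction
from math import floor


def gauss_map(x: Fraction) -> Fraction:
    """Gauss transformation: 1/x minus its integer part, with phi(0) = 0."""
    if x == 0:
        return Fraction(0)
    inv = 1 / x
    return inv - floor(inv)


def continued_fraction(x: Fraction) -> list:
    """Quotients [a_0, ..., a_N] produced by the continued fraction algorithm."""
    quotients = []
    while True:
        a_n = floor(x)
        quotients.append(a_n)
        if x == a_n:
            return quotients
        x = 1 / (x - a_n)


x = Fraction(30, 43)
print(continued_fraction(x))          # [0, 1, 2, 3, 4]

# Each application of phi drops the first quotient after the leading 0.
y = x
for m in range(1, 4):
    y = gauss_map(y)
    print(m, y, continued_fraction(y))
# 1 13/30 [0, 2, 3, 4]
# 2 4/13 [0, 3, 4]
# 3 1/4 [0, 4]
```

After one more application the value reaches 0, matching the remark that the expansion of a rational number loses one term per iterate.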

33 Theorem: When x ∈ [0, 1), $\varphi^{-1}(x)$, the preimage of x, consists of all continued fractions of the form $[0, n, a_1(x), a_2(x), \ldots]$, where n ∈ N.

Proof. Suppose ϕ(y) = x. Then, by Theorem 31, for n = 0, $a_0(y) = a_0(x) = 0$, and for n ≥ 1, $a_n(x) = a_{n+1}(y)$. So y is of the form $[0, z, a_1(x), a_2(x), \ldots]$. Since z is undetermined by x, it can be any number in the range of $a_1(y)$, which is N. So, y is of the form $[0, n, a_1(x), a_2(x), \ldots]$, where n ∈ N.

34 Corollary: When x ∈ [0, 1), $\varphi^{-m}(x)$, the preimage of x under $\varphi^m$, consists of all continued fractions of the form $[0, n_1, n_2, \ldots, n_m, a_1(x), a_2(x), \ldots]$, where $n_j \in \mathbb{N}$ for 1 ≤ j ≤ m.
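Since $[0, n, a_1(x), a_2(x), \ldots] = \frac{1}{n + x}$, the preimages in Theorem 33 can be listed explicitly. The sketch below, with the hypothetical helper `preimages`, generates the first few and confirms that the Gauss transformation sends each back to x.

```python
from fractions import Fraction
from math import floor


def gauss_map(x: Fraction) -> Fraction:
    """Gauss transformation: 1/x minus its integer part, with phi(0) = 0."""
    if x == 0:
        return Fraction(0)
    inv = 1 / x
    return inv - floor(inv)


def preimages(x: Fraction, count: int) -> list:
    """First `count` preimages of x in [0, 1): y_n = 1/(n + x) = [0, n, a_1(x), ...]."""
    return [1 / (n + x) for n in range(1, count + 1)]


x = Fraction(13, 30)                      # x = [0, 2, 3, 4]
ys = preimages(x, 3)
print(ys)                                 # [Fraction(30, 43), Fraction(30, 73), Fraction(30, 103)]
print([gauss_map(y) for y in ys])         # [Fraction(13, 30), Fraction(13, 30), Fraction(13, 30)]
```

Applying `preimages` repeatedly to each result would enumerate the elements of $\varphi^{-m}(x)$ described in Corollary 34.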


35 Remarks: The theory of simple continued fractions and the Gauss transformation gives us a remarkable way to look at the continuum, a way different from the standard decimal expansion. We need to develop some terminology first.

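36 Definition: The interval $\left[\frac{1}{n+1}, \frac{1}{n}\right) = \big[[0, n+1], [0, n]\big)$, n ∈ N, is called an nth platform of level 1.

An interval of the form $\left(\dfrac{1}{z + \frac{1}{n}}, \dfrac{1}{z + \frac{1}{n+1}}\right] = \big([0, z, n], [0, z, n+1]\big]$, z, n ∈ N, is called an nth platform of level 2.

The nth platforms of the mth level refer to those intervals of the form

\[ \left[ \cfrac{1}{z_1 + \cfrac{1}{z_2 + \cfrac{1}{\ddots + \cfrac{1}{z_{m-1} + \cfrac{1}{n+1}}}}},\ \cfrac{1}{z_1 + \cfrac{1}{z_2 + \cfrac{1}{\ddots + \cfrac{1}{z_{m-1} + \cfrac{1}{n}}}}} \right) = \big[[0, z_1, z_2, \ldots, z_{m-1}, n+1], [0, z_1, z_2, \ldots, z_{m-1}, n]\big), \quad z_1, z_2, \ldots, z_{m-1}, n \in \mathbb{N}, \]

if m is odd, and

\[ \left( \cfrac{1}{z_1 + \cfrac{1}{z_2 + \cfrac{1}{\ddots + \cfrac{1}{z_{m-1} + \cfrac{1}{n}}}}},\ \cfrac{1}{z_1 + \cfrac{1}{z_2 + \cfrac{1}{\ddots + \cfrac{1}{z_{m-1} + \cfrac{1}{n+1}}}}} \right] = \big([0, z_1, z_2, \ldots, z_{m-1}, n], [0, z_1, z_2, \ldots, z_{m-1}, n+1]\big], \quad z_1, z_2, \ldots, z_{m-1}, n \in \mathbb{N}, \]

if m is even.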

Each nth platform of level m contains exactly one jth platform of level m + 1 for each j ∈ N, and is in fact the union of this collection of platforms of one higher level with its own endpoint (either the left or right endpoint, depending on whether m is odd or even, respectively). Thus, we could describe uniquely each platform by listing the platforms of lower levels that it is a member of.

37 Definition: $pl_{n_1, n_2, \ldots, n_{m-1}, n_m}$ is the $n_m$th platform of level m that’s contained in the $n_{m-1}$th platform of level m − 1 that’s contained in the . . . that’s contained in the $n_2$th platform of level 2 that’s contained in the $n_1$th platform of level 1.


The sequence of numbers $a_1(x), a_2(x), a_3(x), \ldots$ associated with a number x ∈ [0, 1), being the quotients of the simple continued fraction generated by putting x through the continued fraction algorithm, gives us the address of the number x in terms of which platforms it’s on. For each $a_j(x)$, we know that x is on the $a_j$th platform of level j. More specifically, we know that $x \in pl_{a_1(x), a_2(x), \ldots, a_n(x)}$ for all n ≤ N if x is rational and $a_N$ is the last quotient of x, or for all n ∈ N if x is irrational.

Let us call the simple continued fraction generated by the continued fraction algorithm applied to a number x the continued fraction expansion of x. A number’s decimal expansion tells us its location in the decimal system, in which each level is divided into ten equal pieces. The continued fraction expansion of a number x ∈ [0, 1) tells us its location in the coordinate system of the platforms, in which each level is divided into a countably infinite number of parts of logarithmically decreasing length.

Now, it is not hard to see that $\varphi(pl_{n_1, n_2, n_3, \ldots})$ is equal to $pl_{n_2, n_3, n_4, \ldots}$. In this way, the Gauss transformation strips away levels of information about the location of a number: if we know only $\varphi^m(x)$, we don’t know which platforms of levels 1 through m the number x resides on.

One final note for now about the relationship of ϕ(x) and the platforms. Suppose we define PL to be the set of all platforms of all levels, and suppose that we define f(X), where X is a collection of sets and f is a function on the members of those sets, to be the set of images of the members of X under f. Then we have that

\[ \varphi(PL) = PL \]

and so

\[ \varphi^m(PL) = PL, \quad m \in \mathbb{N}. \]

2 The Lebesgue Integral and Ergodic Theory

I will only have space to give a skeletal version of the theory behind the Lebesgue integral; I will state many theorems without proving them.

38 Definition: A topology (X, τ) consists of a collection τ of subsets of a set X such that:

i) ∅ ∈ τ and X ∈ τ.

ii) If $V_i \in \tau$ for i = 1, . . . , n, then $V_1 \cap V_2 \cap \ldots \cap V_n \in \tau$.

iii) If $\{V_\alpha\}$ is an arbitrary collection of members of τ (finite, countable, or uncountable), then $\bigcup_\alpha V_\alpha \in \tau$.

If such a τ has been chosen for a set X, then X is called a topological space, and the members of τ are called the open sets in X.

39 Definition: A collection M of subsets of a set X is said to be a σ-algebra in X if M has the following properties:

i) X ∈ M.

ii) If A ∈ M, then X − A ∈ M.

iii) If $A = \bigcup_{n=1}^{\infty} A_n$ and if $A_n \in M$ for n = 1, 2, 3, . . . , then A ∈ M.

If such an M has been chosen for a set X, then X is called a measurable space, and the members of M are called the measurable sets in X. Further, if X is a measurable space, Y is a topological space, and f is a mapping of X into Y, then f is said to be measurable if $f^{-1}(V)$ is a measurable set in X for every open set V in Y.

40 Remarks: Another way of describing a σ-algebra is to say that it contains the empty set and that it is closed under complements and under countable unions and intersections. An example of a commonly used topology is the topology on the extended real number line [−∞, ∞] generated by taking finite intersections and arbitrary unions of open intervals (intervals of the form [−∞, a), (a, b), (a, ∞]). An example of a commonly used σ-algebra is the σ-algebra on the extended real number line [−∞, ∞] generated by taking complements, countable unions, and countable intersections of intervals (all intervals of the form [−∞, a), [−∞, a], (a, b), (a, b], [a, b), [a, b], (a, ∞], [a, ∞]).

41 Definition: Let $\{a_n\}$ be a sequence in $[-\infty, \infty]$, and put

$$b_k = \sup\{a_k, a_{k+1}, a_{k+2}, \ldots\} \quad (k = 1, 2, 3, \ldots) \tag{17}$$

where sup is the least upper bound of the sequence $a_k, a_{k+1}, a_{k+2}, \ldots$, and

$$\beta = \inf\{b_1, b_2, b_3, \ldots\} \tag{18}$$

where inf is the greatest lower bound of the sequence $b_1, b_2, b_3, \ldots$. We call $\beta$ the upper limit of $\{a_n\}$, and write

$$\beta = \limsup_{n \to \infty} a_n.$$

The following properties are easily verified: first, $b_1 \ge b_2 \ge b_3 \ge \ldots$, so that $b_k \to \beta$ as $k \to \infty$; secondly, there is a subsequence $\{a_{n_i}\}$ of $\{a_n\}$ such that $a_{n_i} \to \beta$ as $i \to \infty$, and $\beta$ is the largest number with this property.
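To make the definition of the upper limit concrete, here is a small Python sketch that computes the tail suprema $b_k$ for a finite stretch of a sequence. The helper name `tail_sups` and the example sequence $a_n = (-1)^n (1 + 1/n)$ are my own choices for illustration; in general a finite truncation only approximates the true $b_k$.

```python
import math

def tail_sups(a):
    """Return [b_1, ..., b_N] with b_k = max(a_k, ..., a_N) for a finite list a."""
    sups = []
    running = -math.inf
    for value in reversed(a):
        running = max(running, value)   # supremum of the tail seen so far
        sups.append(running)
    sups.reverse()
    return sups

# Hypothetical example sequence: a_n = (-1)^n (1 + 1/n), n = 1..N.
N = 2000
a = [(-1) ** n * (1 + 1 / n) for n in range(1, N + 1)]
b = tail_sups(a)
```

For this sequence the values in `b` decrease toward the upper limit 1, even though the sequence itself does not converge (its lower limit is -1).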

The lower limit is defined analogously: interchange sup and inf in (17) and (18). If $\{a_n\}$ converges, we have

$$\limsup_{n\to\infty} a_n = \liminf_{n\to\infty} a_n = \lim_{n\to\infty} a_n,$$

and some thought shows that, conversely, if $\limsup_{n\to\infty} a_n = \liminf_{n\to\infty} a_n$, then $a_n$ converges to that limit.

Suppose $\{f_n\}$ is a sequence of extended-real functions on a set $X$. Then $\sup_n f_n$ and $\limsup_{n\to\infty} f_n$ are the functions defined on $X$ by

$$(\sup_n f_n)(x) = \sup_n (f_n(x)),$$

$$(\limsup_{n\to\infty} f_n)(x) = \limsup_{n\to\infty} (f_n(x)).$$

42 Definition: If $E$ is a subset of $X$, then the function on $X$

$$\xi_E(x) = \begin{cases} 1 & \text{if } x \in E \\ 0 & \text{if } x \notin E \end{cases}$$

is called the characteristic function of the set $E$.

A function $s$ on a measurable space $X$ whose range consists of only finitely many points in $[0, \infty)$ is known as a simple function. Clearly, a simple function is expressible as a finite linear combination of characteristic functions. A rather beautiful theorem that we do not have the time to prove is the following:

43 Theorem: Let $f : X \to [0, \infty]$ be measurable. There exist simple measurable functions $s_n$ on $X$ such that

a) $0 \le s_1 \le s_2 \le \ldots \le f$.
b) $s_n(x) \to f(x)$ as $n \to \infty$, for every $x \in X$.
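Although the theorem is not proved here, the usual construction is easy to try out numerically: round $f$ down to the nearest multiple of $2^{-n}$ and cap the result at $n$. The sketch below is only an illustration under that assumption; the names `simple_approx` and the test function `f` are hypothetical, and the code checks values at one point rather than measurability.

```python
import math

def simple_approx(f, n):
    """Return s_n: f rounded down to a multiple of 2**-n and capped at n."""
    def s_n(x):
        value = f(x)
        if value >= n:                    # also covers value == +infinity
            return float(n)
        return math.floor(value * 2 ** n) / 2 ** n
    return s_n

# Hypothetical nonnegative test function; any other would do.
f = lambda x: x * x
x = 1.3
values = [simple_approx(f, n)(x) for n in range(1, 12)]
```

Each $s_n$ takes only finitely many values, the list `values` is nondecreasing, and once $n$ exceeds $f(x)$ the gap $f(x) - s_n(x)$ is smaller than $2^{-n}$.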

44 Definition: A measure is a function $\mu$, defined on a σ-algebra $M$, whose range is in $[0, \infty]$ and which is σ-additive, meaning that if $\{A_i\}$ is a disjoint countable collection of members of $M$, then

$$\mu\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} \mu(A_i).$$

We will also assume that there exists $A \in M$ such that $\mu(A) < \infty$. A measurable space with a measure defined on it is called a measure space.

As an example, take the σ-algebra described above on the extended real line $[-\infty, \infty]$, call it $M$, and define the following measure on it: if $I$ is an interval in $M$, let $m(I) = b - a$, where $a$ and $b$ are the left and right endpoints of $I$, respectively, regardless of whether $I$ is open, semi-open, or closed. If $X \in M$ is a countable union of disjoint intervals, define $m(X)$ as the sum of the measures of those intervals. It can be shown that $m$ is well-defined in this way.

A consequence of this is that $m(\{2\}) = m([2, 2]) = 0$. Thus, we have a non-empty set of measure 0. In fact, since measures are σ-additive, we can take the union of a countable number of such single-point intervals and get an infinite set of measure 0. We can therefore conclude that the set of rational numbers has measure 0. We can go further still: if you extend our definition of measure to cover a larger group of sets, creating what is known as the Lebesgue measure (which we will not define here), one finds that one can have an uncountable set of measure 0, namely the famous Cantor set. Further:

45 Definition: Suppose that $x \in [0, 1]$, and that the digit $b$ occurs $n_b$ times in the first $n$ places of the expansion of $x$ in base $r$. If

$$\frac{n_b}{n} \to \beta$$

when $n \to \infty$, then we say that $b$ appears with asymptotic frequency $\beta$. (This limit need not exist for a given $x$.) We say that $x$ is simply normal base $r$ if

$$\frac{n_b}{n} \to \frac{1}{r}$$

for each of the $r$ possible values of $b$. Further, we say that $x$ is normal base $r$ if all of the numbers

$$x - \{x\},\ rx - \{rx\},\ r^2 x - \{r^2 x\},\ \ldots$$

are simply normal in all of the scales

$$r, r^2, r^3, \ldots$$

This is the same as saying that in the expansion of $x$ in base $r$, every combination

$$b_1 b_2 \ldots b_k$$

of digits occurs with the right frequency: if $n_b$ is the number of times this sequence occurs in the first $n$ digits of $x$, then

$$\frac{n_b}{n} \to \frac{1}{r^k}$$

as $n \to \infty$. There is a theorem, the proof of which is once more beyond the scope of this paper, that the set of normal numbers in $[0, 1]$ has measure 1; thus, the set of numbers in $[0, 1]$ that aren't normal has measure 0. When something is true for all members of a measure space $X$ except for a set of measure 0, we say that it is true for almost every $x \in X$. Thus, almost every $x$ in $[0, 1]$ is normal. The purpose of this paper is to prove an analogous result for the continued fraction expansion of a number in $[0, 1]$.
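As an illustration of asymptotic digit frequency, the following sketch counts digits in a prefix of Champernowne's constant 0.123456789101112..., which is known to be normal in base 10. The cutoff (the concatenation of 1 through 9999) and the helper name `digit_frequencies` are my own choices; a finite prefix can of course only hint at the limit.

```python
from collections import Counter

def digit_frequencies(digits):
    """Map each digit b to n_b / n, where n = len(digits)."""
    counts = Counter(digits)
    n = len(digits)
    return {b: counts[b] / n for b in sorted(counts)}

# Prefix of Champernowne's constant: the concatenation of 1, 2, ..., 9999
# (a hypothetical choice of cutoff).
digits = "".join(str(i) for i in range(1, 10000))
freq = digit_frequencies(digits)
```

In this prefix each nonzero digit has frequency close to 1/10, while 0 is still under-represented because no leading zeros are written; the frequencies even out only as the prefix grows.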

One of the reasons that it is significant for a given set to be of measure 0 is demonstrated in the construction of the Lebesgue integral, which brings together a number of the ideas presented in this section:

46 Definition: Suppose that $(X, M)$ is a measurable space and that $\mu$ is a measure on $M$. If $s$ is a measurable simple function on $X$, we have pointed out that it is expressible as

$$s = \sum_{i=1}^{n} \alpha_i \xi_{A_i},$$

where $\alpha_1, \ldots, \alpha_n$ are the distinct values of $s$, $A_i = \{x : s(x) = \alpha_i\}$, and $\xi_{A_i}$ is the characteristic function of $A_i$. If $E \in M$, we define

$$\int_E s \, d\mu = \sum_{i=1}^{n} \alpha_i \mu(A_i \cap E).$$

(If this equation requires us to take $0 \cdot \infty$, we follow the convention $0 \cdot \infty = 0$.) If $f : X \to [0, \infty]$ is measurable, and $E \in M$, we define

$$\int_E f \, d\mu = \sup \int_E s \, d\mu, \tag{19}$$

the supremum being taken over all simple measurable functions $s$ such that $0 \le s \le f$. The left member of (19) is called the Lebesgue integral of $f$ over $E$, with respect to the measure $\mu$. Further, suppose that $f$ is a real-valued measurable function on $X$ for which

$$\int_X |f| \, d\mu < \infty.$$

We call the collection of such functions $L^1(\mu)$, also known as the Lebesgue integrable functions (with respect to $\mu$). Let $f^+ = \max\{f, 0\}$ and $f^- = -\min\{f, 0\}$; then we define

$$\int_E f \, d\mu = \int_E f^+ \, d\mu - \int_E f^- \, d\mu$$

for every measurable set $E$. One of the consequences of such a definition is that if $E$ is a set of measure 0 and $f \in L^1(\mu)$, then $\int_E f \, d\mu = 0$, and that as such, sets of measure 0 exert no influence on integrals taken over them.

Before getting to the most important theorems of this paper, we will need two more results from Lebesgue theory, which I will state without proof:

47 Fatou’s Lemma: If $f_n : X \to [0, \infty]$ is measurable, for each positive integer $n$, then

$$\int_X \left(\liminf_{n\to\infty} f_n\right) d\mu \le \liminf_{n\to\infty} \int_X f_n \, d\mu.$$

48 Lebesgue’s Dominated Convergence Theorem: Suppose $\{f_n\}$ is a sequence of real measurable functions on $X$ such that

$$f(x) = \lim_{n\to\infty} f_n(x)$$

exists for every $x \in X$. If there is a function $g \in L^1(\mu)$ such that

$$|f_n(x)| \le g(x) \quad (n = 1, 2, 3, \ldots;\ x \in X),$$

then $f \in L^1(\mu)$,

$$\lim_{n\to\infty} \int_X |f_n - f| \, d\mu = 0,$$

and

$$\lim_{n\to\infty} \int_X f_n \, d\mu = \int_X f \, d\mu.$$

49 Definition: Suppose $(X, \mathcal{A}, \mu)$ and $(Y, \mathcal{B}, \nu)$ are measure spaces. We can redefine a measurable function from $X$ to $Y$ without reference to a topology on $Y$ by saying that a function $T : X \to Y$ is measurable if the preimage of any measurable set is measurable. A measurable function $T : X \to Y$ is measure-preserving if $\mu(T^{-1}(B)) = \nu(B)$ for all $B \in \mathcal{B}$. If $T$ maps $X$ to itself, we call $T$ a measure-preserving transformation, and we say that $\mu$ is T-invariant.

If $\mu(X) = 1$, we say that $(X, \mathcal{A}, \mu)$ is a probability space. If $\mu(X)$ is finite, then we say $(X, \mathcal{A}, \mu)$ is a finite measure space, which can be rescaled to be a probability space by scaling $\mu$ by $\frac{1}{\mu(X)}$.

50 Poincaré Recurrence Theorem: Let $T$ be a measure-preserving transformation of a finite measure space $(X, \mathcal{A}, \mu)$. If $A$ is a measurable set, then for almost every $x \in A$, there are infinitely many $k \in \mathbb{N}$ such that $T^k(x) \in A$.

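Proof. Let

$$B = \{x \in A : T^k(x) \notin A \text{ for all } k \in \mathbb{N}\} = A - \bigcup_{k \in \mathbb{N}} T^{-k}(A).$$

Then $B \in \mathcal{A}$, and all the preimages $T^{-k}(B)$ are disjoint, are measurable, and have the same measure as $B$. Since $X$ has finite total measure, and

$$\mu(X) \ge \mu\left(\bigcup_{k \in \mathbb{N}} T^{-k}(B)\right) = \sum_{k=1}^{\infty} \mu(T^{-k}(B)) = \sum_{k=1}^{\infty} \mu(B),$$

it follows that $\mu(B) = 0$. Hence, almost every $x \in A$ lies in $A - B$, so that for almost every $x \in A$ there exists $n \in \mathbb{N}$ with $T^n(x) \in A$. Since $T^n(x) \in A$, there exists $n_1 \in \mathbb{N}$ with $T^{n_1}(T^n(x)) = T^{n+n_1}(x) \in A$; by similar arguments, there exist $n_2, n_3, \ldots$ such that $T^{n+n_1}(x), T^{n+n_2}(x), T^{n+n_3}(x), \ldots \in A$.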
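Recurrence is easy to watch in a simple example. The sketch below uses the rotation $T(x) = x + \alpha \bmod 1$ of $[0, 1)$, which preserves Lebesgue measure, and records the times at which an orbit returns to an interval. The rotation, the angle $\alpha = \sqrt{2} - 1$, the starting point, and the function name `return_times` are all my own illustrative choices, not part of the theorem.

```python
import math

def return_times(x, alpha, lo, hi, steps):
    """Times k in 1..steps at which T^k(x) = x + k*alpha mod 1 lies in [lo, hi)."""
    times = []
    for k in range(1, steps + 1):
        x = (x + alpha) % 1.0             # apply the rotation once
        if lo <= x < hi:
            times.append(k)
    return times

alpha = math.sqrt(2) - 1                  # hypothetical irrational rotation angle
times = return_times(0.05, alpha, 0.0, 0.1, 10_000)
```

The orbit of the starting point 0.05 keeps coming back to $A = [0, 0.1)$; over a finite run we can only observe many returns, not infinitely many, but the list `times` keeps growing as `steps` increases.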

Before proving the most important theorem of the paper, we need to prove a combinatorial lemma:

51 Definition: If $a_1, \ldots, a_m$ are real numbers and $1 \le n \le m$, we say that $a_k$ is an n-leader if $a_k + \ldots + a_{k+p-1} \ge 0$ for some $p$, $1 \le p \le n$ (for instance, nonnegative numbers are always n-leaders).

52 Lemma: For every n, 1 ≤ n ≤ m, the sum of all n-leaders is nonnegative.

Proof. If there are no n-leaders, the lemma is true. Otherwise, let $a_k$ be the first n-leader, and let $p \ge 1$ be the smallest integer for which $a_k + \ldots + a_{k+p-1} \ge 0$. If $p > 1$, then $a_k < 0$, and we have that $a_{k+1} + \ldots + a_{k+p-1} \ge 0$, and so $a_{k+1}$ is an n-leader. Repeating this argument, we get that if $k \le j \le k + p - 1$, then $a_j + \ldots + a_{k+p-1} \ge 0$, and so $a_j$ is an n-leader, and we have that for the total collected n-leaders thus far, their sum is nonnegative. Repeating the argument with the remaining sequence $a_{k+p}, \ldots, a_m$ proves the lemma.
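The lemma can be spot-checked by brute force. The following sketch finds the n-leaders of a finite integer sequence straight from Definition 51 and sums them; the function names and the random test sequences are hypothetical, and I assume that a block $a_k, \ldots, a_{k+p-1}$ must stay inside the sequence.

```python
import random

def n_leaders(a, n):
    """Indices k (0-based) with a[k] + ... + a[k+p-1] >= 0 for some 1 <= p <= n."""
    leaders = []
    for k in range(len(a)):
        partial = 0
        for p in range(1, n + 1):
            if k + p - 1 >= len(a):
                break                     # the block must stay inside the sequence
            partial += a[k + p - 1]
            if partial >= 0:
                leaders.append(k)
                break
    return leaders

def leader_sum(a, n):
    """Sum of all n-leaders of the sequence a."""
    return sum(a[k] for k in n_leaders(a, n))

random.seed(0)                            # fixed seed, purely for reproducibility
trials = [[random.randint(-5, 5) for _ in range(12)] for _ in range(500)]
```

For example, in the sequence -1, 2, -3, 1 with n = 2 the leaders are -1 (because -1 + 2 is nonnegative), 2, and 1, whose sum is 2; the entry -3 is not a leader.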

53 Birkhoff Ergodic Theorem: Let $T$ be a measure-preserving transformation in a finite measure space $(X, \mathcal{A}, \mu)$, and let $f \in L^1(\mu)$ be real-valued. Then the limit

$$\bar{f}(x) = \lim_{n\to\infty} \frac{1}{n} \sum_{k=0}^{n-1} f(T^k(x))$$

exists for almost every $x \in X$. Further, $\bar{f} \in L^1(\mu)$, is T-invariant, and satisfies

$$\int_X \bar{f}(x) \, d\mu = \int_X f(x) \, d\mu.$$

Proof. Let

$$A = \{x \in X : f(x) + f(T(x)) + \ldots + f(T^k(x)) \ge 0 \text{ for some } k \in \mathbb{N} \cup \{0\}\}.$$

Lemma (Maximal Ergodic Theorem): $\int_A f(x) \, d\mu \ge 0$.

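Proof. Let $A_n = \{x \in X : \sum_{i=0}^{k} f(T^i(x)) \ge 0 \text{ for some } k,\ 0 \le k \le n\}$. Then $A_n \subset A_{n+1}$ and $A = \bigcup_{n \in \mathbb{N}} A_n$. Define:

$$g_n(x) = \begin{cases} f(x) & \text{if } x \in A_n \\ 0 & \text{if } x \in X - A_n \end{cases}$$

Clearly, $g_n \in L^1(\mu)$, and $f(x) = \lim_{n\to\infty} g_n(x)$ for all $x \in A$. Also, $|g_{n+1}| \ge |g_n|$ and $|f| \ge |g_n|$ for all $n \in \mathbb{N}$. So, applying the Dominated Convergence Theorem, we get:

$$\int_A f \, d\mu = \lim_{n\to\infty} \int_A g_n \, d\mu = \lim_{n\to\infty} \int_{A_n} f \, d\mu.$$

Therefore, it suffices to show that $\int_{A_n} f \, d\mu \ge 0$ for each $n$.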

Fix an arbitrary $m \in \mathbb{N}$. Let $s_n(x)$ be the sum of the n-leaders in the sequence $f(x), f(T(x)), \ldots, f(T^{m+n-1}(x))$. For $k \le m + n - 1$, let $B_k \subset X$ be the set of points for which $f(T^k(x))$ is an n-leader of this sequence. From Lemma 52, we have $0 \le \int_X s_n(x) \, d\mu$. A little bit of thought shows that:

$$0 \le \int_X s_n(x) \, d\mu = \sum_{k=0}^{m+n-1} \int_{B_k} f(T^k(x)) \, d\mu \tag{20}$$

Note that for $1 \le k \le m$, $x \in B_k$ iff $T(x) \in B_{k-1}$, and further, for these $k$, $B_k = T^{-1}(B_{k-1}) = T^{-k}(B_0)$, and so

$$\int_{B_k} f(T^k(x)) \, d\mu = \int_{T^{-k}(B_0)} f(T^k(x)) \, d\mu = \int_{B_0} f(x) \, d\mu,$$

this last step due to the fact that $\mu$ is T-invariant. Thus, the first $m + 1$

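terms of (20) are equal, and since $B_0 = A_{n-1}$, we have

$$0 \le \sum_{k=0}^{m+n-1} \int_{B_k} f(T^k(x)) \, d\mu = (m+1) \int_{A_{n-1}} f(x) \, d\mu + \sum_{k=m+1}^{m+n-1} \int_{B_k} f(T^k(x)) \, d\mu \le (m+1) \int_{A_{n-1}} f(x) \, d\mu + (n-1) \int_X |f(x)| \, d\mu.$$

So,

$$\int_{A_{n-1}} f(x) \, d\mu \ge \frac{-(n-1) \int_X |f(x)| \, d\mu}{m+1}.$$

Since $m$ is arbitrary, we can make it as large as we want, showing that $\int_{A_{n-1}} f \, d\mu \ge 0$; since $n$ was arbitrary as well, $\int_{A_n} f \, d\mu \ge 0$ for each $n$.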

Continuing the proof of the Birkhoff ergodic theorem, for any $a, b \in \mathbb{R}$, $a < b$, the set

$$X(a, b) = \left\{ x \in X : \liminf_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} f(T^i(x)) < a < b < \limsup_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} f(T^i(x)) \right\}$$

is measurable and T-invariant. Applying the Maximal Ergodic Theorem for the function $f(x) - b$, we have that

$$\int_B (f(x) - b) \, d\mu \ge 0,$$

where

$$B = \{x \in X : (f(x) - b) + (f(T(x)) - b) + \ldots + (f(T^k(x)) - b) \ge 0 \text{ for some } k \in \mathbb{N} \cup \{0\}\}$$

$$= \{x \in X : (f(x) + f(T(x)) + \ldots + f(T^{k-1}(x))) - bk \ge 0 \text{ for some } k \in \mathbb{N}\}$$

$$= \left\{x \in X : \frac{1}{k}(f(x) + f(T(x)) + \ldots + f(T^{k-1}(x))) \ge b \text{ for some } k \in \mathbb{N}\right\}.$$

Therefore, we see that if $x \in X(a, b)$, then $x \in B$. If $x \in B$, but $x \notin X(a, b)$, then there are two options. Either only a finite number of the averages

$$\frac{1}{k}\sum_{i=0}^{k-1} f(T^i(x)), \quad k \in \mathbb{N},$$

are less than $a$, but by the Poincaré Recurrence Theorem, this can only be true for a set of measure 0; or

$$\frac{1}{k}\sum_{i=0}^{k-1} f(T^i(x)) = b \text{ for all } k \in \mathbb{N}, \text{ including } k = 1,$$

so $f(x) = b$.

In either case, we have

$$\int_{B - X(a,b)} (f(x) - b) \, d\mu = 0.$$

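So,

$$0 \le \int_B (f(x) - b) \, d\mu = \int_{X(a,b)} (f(x) - b) \, d\mu + \int_{B - X(a,b)} (f(x) - b) \, d\mu = \int_{X(a,b)} (f(x) - b) \, d\mu.$$

By an analogous argument, we have

$$0 \le \int_{X(a,b)} (a - f(x)) \, d\mu,$$

and therefore

$$0 \le \int_{X(a,b)} (a - f(x)) \, d\mu + \int_{X(a,b)} (f(x) - b) \, d\mu = \int_{X(a,b)} (a - b) \, d\mu.$$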

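But since $b > a$, the integrand $a - b$ is negative, so the above inequality can hold only if $X(a, b)$ has measure 0. Since $a$ and $b$ were arbitrary (and it suffices to consider the countably many rational pairs $a < b$), we have that the limit represented by $\bar{f}$ exists for almost every $x \in X$.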

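For $n \in \mathbb{N}$, let $f_n(x) = \frac{1}{n}\sum_{i=0}^{n-1} f(T^i(x))$. Define $\bar{f} : X \to \mathbb{R}$ by $\bar{f}(x) = \liminf_{n\to\infty} f_n(x)$. Then $\bar{f}$ is measurable, and $f_n$ converges for almost every $x$ to $\bar{f}$. By Fatou’s Lemma,

$$\int_X |\bar{f}| \, d\mu = \int_X \liminf_{n\to\infty} |f_n(x)| \, d\mu \le \liminf_{n\to\infty} \int_X |f_n| \, d\mu \le \liminf_{n\to\infty} \frac{1}{n}\sum_{j=0}^{n-1} \int_X |f(T^j(x))| \, d\mu = \liminf_{n\to\infty} \frac{1}{n}\sum_{j=0}^{n-1} \int_{T^{-j}(X)} |f(T^j(x))| \, d\mu.$$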

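By the T-invariance of $\mu$, we have

$$\int_X |\bar{f}| \, d\mu \le \liminf_{n\to\infty} \frac{1}{n}\sum_{j=0}^{n-1} \int_{T^{-j}(X)} |f(T^j(x))| \, d\mu = \liminf_{n\to\infty} \frac{1}{n}\sum_{j=0}^{n-1} \int_X |f(x)| \, d\mu = \liminf_{n\to\infty} \frac{1}{n} \cdot n \int_X |f(x)| \, d\mu = \liminf_{n\to\infty} \int_X |f(x)| \, d\mu = \int_X |f(x)| \, d\mu.$$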

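Therefore, $\int_X |\bar{f}| \, d\mu < \infty$, so $\bar{f} \in L^1(\mu)$. Finally, let $N$ be the set of points for which the limit defining $\bar{f}$ does not exist (so $\mu(N) = 0$). Let

$$f_n(x) = \begin{cases} \frac{1}{n}\sum_{j=0}^{n-1} f(T^j(x)) & \text{if } x \in X - N \\ 0 & \text{if } x \in N \end{cases}$$

and let

$$f_0(x) = \begin{cases} \lim_{n\to\infty} \frac{1}{n}\sum_{j=0}^{n-1} f(T^j(x)) & \text{if } x \in X - N \\ 0 & \text{if } x \in N \end{cases}$$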

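Then $f_n \to f_0$ for every $x \in X$; as has been shown, $|f| \in L^1(\mu)$; further, $|f| \ge |f_n(x)|$ for all $n \in \mathbb{N}$. So, using the Dominated Convergence Theorem, we have $f_0 \in L^1(\mu)$, and so

$$\int_X \bar{f} \, d\mu = \int_{X-N} \bar{f} \, d\mu = \int_{X-N} f_0 \, d\mu = \int_X f_0 \, d\mu = \lim_{n\to\infty} \int_X \frac{1}{n}\sum_{j=0}^{n-1} f(T^j(x)) \, d\mu = \lim_{n\to\infty} \frac{1}{n}\sum_{j=0}^{n-1} \int_X f(T^j(x)) \, d\mu$$

$$= \lim_{n\to\infty} \frac{1}{n}\sum_{j=0}^{n-1} \int_{T^{-j}(X)} f(T^j(x)) \, d\mu = \lim_{n\to\infty} \frac{1}{n}\sum_{j=0}^{n-1} \int_X f(x) \, d\mu = \lim_{n\to\infty} \frac{1}{n} \cdot n \int_X f(x) \, d\mu = \lim_{n\to\infty} \int_X f(x) \, d\mu = \int_X f(x) \, d\mu.$$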

54 Definition: Let $T$ be a measure-preserving transformation on a measure space $(X, \mathcal{A}, \mu)$. A measurable function $f : X \to \mathbb{R}$ is essentially T-invariant if $\mu(\{x \in X : f(T^n x) \ne f(x)\}) = 0$ for every $n \in \mathbb{N}$. A measurable set $A$

is essentially T-invariant if its characteristic function $\xi_A$ is essentially T-invariant. A measure-preserving transformation $T$ is ergodic if any essentially T-invariant measurable set has either measure 0 or full measure (the measure of the entire space).

55 Corollary to the Birkhoff Ergodic Theorem: If a measure-preserving transformation $T$ in a finite measure space $(X, \mathcal{A}, \mu)$ is ergodic, then for each $f \in L^1(\mu)$

$$\lim_{n\to\infty} \frac{1}{n}\sum_{k=0}^{n-1} f(T^k(x)) = \frac{1}{\mu(X)} \int_X f(x) \, d\mu \quad \text{for almost every } x \in X.$$

The converse is also true, but we shall not need it here.

Proof. If $T$ is ergodic, then for each real number $a$ the set

$$Y_a = \left\{x \in X : \limsup_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} f(T^i(x)) < a\right\}$$

is fully invariant. Therefore, it is essentially invariant, and we have that $\mu(Y_a)$ is either 0 or $\mu(X)$. Consider the function $a \mapsto \mu(Y_a)$. It is clearly monotonically increasing, and therefore takes the form

$$\mu(Y_a) = \begin{cases} 0 & \text{if } a < a_0 \\ \mu(X) & \text{if } a_0 < a \end{cases} \tag{21}$$

where $a_0 \in [-\infty, \infty]$. Let

$$N = \left\{x \in X : \limsup_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} f(T^i(x)) \ne a_0\right\}.$$

$N$ is also invariant, but $\mu(N) = \mu(X)$ is incompatible with (21); so we have that $\mu(N) = 0$, so that for almost every $x \in X$,

$$\bar{f}(x) = \lim_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} f(T^i(x)) = \limsup_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} f(T^i(x)) = a_0.$$

So we have

$$\int_X \bar{f}(x) \, d\mu = \int_X a_0 \, d\mu = \int_X f(x) \, d\mu,$$

$$\mu(X) \, a_0 = \int_X f(x) \, d\mu,$$

$$a_0 = \frac{1}{\mu(X)} \int_X f(x) \, d\mu,$$

completing the proof.
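The corollary says that, for an ergodic transformation, time averages along almost every orbit equal the space average. A quick numerical illustration, under assumptions of my own choosing: the rotation $T(x) = x + \alpha \bmod 1$ with irrational $\alpha$ is ergodic for Lebesgue measure on $[0, 1)$, and I take $f(t) = t^2$, whose integral is 1/3. The names `time_average` and `alpha` are hypothetical.

```python
import math

def time_average(f, x, alpha, n):
    """(1/n) * sum over k < n of f(T^k x) for the rotation T(x) = x + alpha mod 1."""
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = (x + alpha) % 1.0             # move one step along the orbit
    return total / n

alpha = math.sqrt(2) - 1                  # hypothetical irrational angle
average = time_average(lambda t: t * t, 0.2, alpha, 100_000)
space_average = 1 / 3                     # integral of t^2 over [0, 1)
```

With 100,000 steps the time average `average` agrees with `space_average` to several decimal places; a different starting point or irrational angle should behave similarly.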

Finally, we return to the Gauss transformation.

56 Definition: The Gauss measure $\mu$ is defined by

$$\mu(A) = \frac{1}{\ln 2} \int_A \frac{dx}{1 + x}.$$

It will not be proved here, but this measure is ϕ-invariant. It is also a probability measure. We will need a few more facts before completing our final pair of theorems. First of all, through induction, we can see that if $\frac{p_n(x)}{q_n(x)}$ is the nth convergent of the continued fraction expansion of $x$, then

$$p_n(x) \ge 2^{(n-2)/2} \quad \text{and} \quad q_n(x) \ge 2^{(n-1)/2} \quad \text{for } n \ge 2.$$
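The two unproved claims about the Gauss measure (total mass 1 and ϕ-invariance) can at least be sanity-checked on intervals. The sketch below assumes the Gauss transformation has the usual form $\phi(x) = 1/x - \lfloor 1/x \rfloor$, so that the preimage of $[0, t)$ is the union over $k$ of the intervals $(1/(k+t), 1/k]$; the function names and the truncation to finitely many branches are my own.

```python
import math

def gauss_measure(a, b):
    """Gauss measure of the interval with endpoints 0 <= a <= b <= 1."""
    return (math.log(1 + b) - math.log(1 + a)) / math.log(2)

def preimage_measure(t, terms=100_000):
    """Gauss measure of {x : phi(x) < t}, truncated to the first `terms` branches.

    Assumes phi(x) = 1/x - floor(1/x), so the preimage of [0, t) is the
    union over k of the intervals (1/(k + t), 1/k].
    """
    return sum(gauss_measure(1 / (k + t), 1 / k) for k in range(1, terms + 1))
```

Comparing `preimage_measure(t)` with `gauss_measure(0, t)` for a few values of `t` shows agreement up to the truncation error, which is consistent with invariance on these intervals but is of course not a proof.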

Secondly, define $\psi_{b_1,\ldots,b_n}$ to be

$$\psi_{b_1,\ldots,b_n}(t) = [b_1, \ldots, b_{n-1}, b_n + t].$$

In other words, $\psi_{b_1,\ldots,b_n}$ maps $[0, 1)$ onto $pl_{b_1,\ldots,b_n}$, and is thus decreasing if $n$ is odd and increasing if $n$ is even. For $x \in pl_{b_1,\ldots,b_n}$,

$$x = \psi_{b_1,\ldots,b_n}(t) = \frac{p_n(x) + t \, p_{n-1}(x)}{q_n(x) + t \, q_{n-1}(x)}. \tag{22}$$

Finally, if $\lambda$ is the default measure on $[0, 1]$, the Lebesgue measure (which agrees with the measure $m$ discussed above on objects such as intervals), it is not hard to see that $\lambda(pl_{b_1,\ldots,b_n}) = (q_n(q_n + q_{n-1}))^{-1}$.
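The length formula for a platform can be verified exactly with rational arithmetic. In the sketch below I assume that $[b_1, \ldots, b_n]$ denotes $1/(b_1 + 1/(\ldots + 1/b_n))$ and that the denominators obey the standard recurrence $q_j = b_j q_{j-1} + q_{j-2}$ with $q_{-1} = 0$, $q_0 = 1$; the function names and the digits 1, 2, 3 are hypothetical.

```python
from fractions import Fraction

def psi(digits, t):
    """Evaluate [b_1, ..., b_{n-1}, b_n + t] = 1/(b_1 + 1/(... + 1/(b_n + t)))."""
    value = Fraction(digits[-1]) + t
    for b in reversed(digits[:-1]):
        value = b + 1 / value
    return 1 / value

def denominators(digits):
    """Return (q_n, q_{n-1}) from the recurrence q_j = b_j q_{j-1} + q_{j-2}."""
    q_prev, q = 0, 1                      # q_{-1} = 0, q_0 = 1
    for b in digits:
        q_prev, q = q, b * q + q_prev
    return q, q_prev

digits = [1, 2, 3]                        # hypothetical choice of b_1, b_2, b_3
length = abs(psi(digits, Fraction(1)) - psi(digits, Fraction(0)))
q_n, q_prev = denominators(digits)
```

Here the endpoints are $\psi(0) = 7/10$ and $\psi(1) = 9/13$, so the platform has length 1/130, matching $(q_n(q_n + q_{n-1}))^{-1}$ with $q_n = 10$ and $q_{n-1} = 3$.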

57 Theorem: The Gauss transformation is ergodic for the Gauss measure µ.

Proof. For a measure $\nu$ and measurable sets $A$ and $B$ with $\nu(B) \ne 0$, let $\nu(A|B) = \nu(A \cap B)/\nu(B)$ denote the conditional measure. Fix $b_1, \ldots, b_n$, and let $pl_n = pl_{b_1,\ldots,b_n}$, $\psi_n = \psi_{b_1,\ldots,b_n}$. The length of $pl_n$ is $\pm(\psi_n(1) - \psi_n(0))$, and for $0 \le x < y \le 1$,

$$\lambda(\{z : x \le \phi^n(z) < y\} \cap pl_n) = \pm(\psi_n(y) - \psi_n(x)),$$

where the sign depends on whether $n$ is odd or even. Therefore

$$\lambda(\phi^{-n}([x, y)) \,|\, pl_n) = \frac{\psi_n(y) - \psi_n(x)}{\psi_n(1) - \psi_n(0)},$$

and, by Theorem 7 and (22),

$$\lambda(\phi^{-n}([x, y)) \,|\, pl_n) = (y - x) \cdot \frac{q_n(q_n + q_{n-1})}{(q_n + x q_{n-1})(q_n + y q_{n-1})}.$$

The second factor in the right-hand side is between 1/2 and 2. Hence

$$\frac{1}{2}\lambda([x, y)) \le \lambda(\phi^{-n}([x, y)) \,|\, pl_n) \le 2\lambda([x, y)).$$

Since the intervals $[x, y)$ generate the σ-algebra,

$$\frac{1}{2}\lambda(A) \le \lambda(\phi^{-n}(A) \,|\, pl_n) \le 2\lambda(A) \tag{23}$$

for any measurable set $A \subset [0, 1]$. Because the density of the Gauss measure $\mu$ is between $1/(2 \ln 2)$ and $1/\ln 2$,

$$\frac{1}{2 \ln 2}\lambda(A) \le \mu(A) \le \frac{1}{\ln 2}\lambda(A).$$

By (23),

$$\frac{1}{4}\mu(A) \le \mu(\phi^{-n}(A) \,|\, pl_n) \le 4\mu(A)$$

for any measurable set $A \subset [0, 1]$.

Let $A$ be a measurable ϕ-invariant set with $\mu(A) > 0$. Then $\frac{1}{4}\mu(A) \le \mu(A|pl_n)$, or, equivalently, $\frac{1}{4}\mu(pl_n) \le \mu(pl_n|A)$. Since the platforms (intervals) $pl_n$ generate the σ-algebra, $\frac{1}{4}\mu(B) \le \mu(B|A)$ for any measurable set $B$. By choosing $B = [0, 1] - A$ we obtain that $\mu(A) = 1$.

This brings us to our conclusion:

58 Definition: Suppose that $x \in [0, 1]$, and that the natural number $k$ occurs $n_k$ times in the first $n$ partial quotients of the (simple) continued fraction expansion of $x$. If

$$\frac{n_k}{n} \to \kappa$$

when $n \to \infty$, then we say that $k$ appears with asymptotic frequency $\kappa$.

59 Theorem: For almost every $x \in [0, 1]$, every integer $k \in \mathbb{N}$ appears in its (simple) continued fraction expansion (i.e. in the sequence $a_1(x), a_2(x), \ldots$) with asymptotic frequency

$$\frac{1}{\ln 2} \ln\left(\frac{(k+1)^2}{k(k+2)}\right).$$

Numbers for which Theorem 59 is true are called ϕ-normal numbers.
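To get a feel for the frequencies in Theorem 59, the following sketch tabulates the formula for the first few values of $k$ and checks numerically that the frequencies add up to (nearly) 1, as a probability distribution should. The function name `gauss_kuzmin` and the cutoff of 10,000 terms are my own choices.

```python
import math

def gauss_kuzmin(k):
    """Asymptotic frequency of the partial quotient k from Theorem 59."""
    return math.log((k + 1) ** 2 / (k * (k + 2))) / math.log(2)

# Frequencies of the partial quotients 1..5, rounded to four decimals.
table = {k: round(gauss_kuzmin(k), 4) for k in range(1, 6)}

# The sum over all k telescopes to 1; a long partial sum comes close.
partial_sum = sum(gauss_kuzmin(k) for k in range(1, 10_001))
```

The partial quotient 1 has frequency $\log_2(4/3)$, roughly 0.415, and the partial quotient 2 has frequency $\log_2(9/8)$, roughly 0.170, so small partial quotients dominate.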

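Proof. $a_n(x) = k$ iff $\xi_{pl_k}(\phi^{n-1}(x)) = 1$. By the corollary to the Birkhoff Ergodic Theorem, for almost every $x$,

$$\lim_{n\to\infty} \frac{n_k}{n} = \lim_{n\to\infty} \frac{1}{n}\sum_{i=0}^{n-1} \xi_{pl_k}(\phi^i(x)) = \int_0^1 \xi_{pl_k} \, d\mu = \mu\left(\left(\frac{1}{k+1}, \frac{1}{k}\right]\right) = \frac{1}{\ln 2} \ln\left(\frac{(k+1)^2}{k(k+2)}\right).$$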
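The theorem can be probed empirically. Since a computer cannot hold a typical real number, the sketch below uses a random binary fraction with 20,000 bits as a stand-in and computes its partial quotients exactly with the Euclidean algorithm; the early partial quotients of such a fraction coincide with those of every real number sufficiently close to it. The seed, the bit length, and the function name `partial_quotients` are assumptions of mine, and the comparison is only statistical.

```python
import math
import random
from collections import Counter

def partial_quotients(numerator, denominator):
    """Partial quotients a_1, a_2, ... of numerator/denominator in (0, 1), via Euclid."""
    quotients = []
    a, b = numerator, denominator
    while a:
        q, r = divmod(b, a)               # 1/x = b/a = q + r/a
        quotients.append(q)
        a, b = r, a
    return quotients

random.seed(1)                            # fixed seed, for reproducibility only
bits = 20_000                             # hypothetical precision of the random point
x_num = random.getrandbits(bits) | 1      # odd numerator, so x is not 0
quotients = partial_quotients(x_num, 2 ** bits)
counts = Counter(quotients)
observed = {k: counts[k] / len(quotients) for k in (1, 2, 3)}
expected = {k: math.log((k + 1) ** 2 / (k * (k + 2))) / math.log(2) for k in (1, 2, 3)}
```

The dictionaries `observed` and `expected` should agree to within a few percentage points, with several thousand partial quotients contributing to the counts.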

A quick application of this theorem. Suppose that you construct a tree in the following way. Start with a point; draw two line segments branching off, creating two new points. Out of each of those points, draw two new line segments, creating four new points. Continue this indefinitely. This is called an infinite binary tree. Arrange the tree so that each new row of $2^n$ points is arranged in a horizontal line beneath the previous row. From this perspective, it is possible to describe every point of the tree in terms of a unique sequence of lefts and rights taken from the initial point. For instance, left-right-right takes you one place and right-right-left takes you some place else. Further, each sequence of lefts and rights takes you somewhere on the tree. (The initial point is equivalent to zero lefts and zero rights.) Therefore, there is a bijection between the set of finite sequences of lefts and rights and the points of the binary tree. Further, it is not hard to see that you can rewrite each sequence of lefts and rights in terms of streaks of lefts and rights; specifically, you can rewrite all sequences in the form

$$R^{\alpha_0} L^{\alpha_1} R^{\alpha_2} L^{\alpha_3} \ldots R^{\alpha_{n-1}} L^{\alpha_n},$$

$$\alpha_j \in \mathbb{Z} \text{ for all } j; \quad \alpha_j > 0 \text{ for } 1 \le j \le n-1; \quad \alpha_0, \alpha_n \ge 0.$$
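Rewriting a path in terms of streaks is just run-length encoding. Here is a minimal sketch, assuming paths are given as strings of the characters 'L' and 'R'; the function name `streak_exponents` and the padding conventions (a leading 0 when the path starts with a left, a trailing 0 when it ends with a right) are my reading of the form above.

```python
from itertools import groupby

def streak_exponents(path):
    """Exponents (alpha_0, ..., alpha_n) with path = R^a0 L^a1 R^a2 ... L^an.

    alpha_0 is 0 when the path starts with 'L'; a trailing 0 is added when the
    path ends with 'R', so that the last exponent always belongs to an L.
    """
    runs = [(symbol, len(list(group))) for symbol, group in groupby(path)]
    exponents = [length for _, length in runs]
    if runs and runs[0][0] == "L":
        exponents.insert(0, 0)            # no initial streak of rights
    if len(exponents) % 2 == 1:
        exponents.append(0)               # no final streak of lefts
    return exponents
```

For instance, left-right-right becomes the exponents 0, 1, 2, 0 (that is, $R^0 L^1 R^2 L^0$), while right-right-left becomes 2, 1.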

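Based on this, there is a bijection between points on the tree and all positive simple continued fractions, where each point on the tree $R^{\alpha_0} L^{\alpha_1} \ldots R^{\alpha_{n-1}} L^{\alpha_n}$ is mapped to

$$\alpha_0 + \cfrac{1}{\alpha_1 + \cfrac{1}{\ddots + \cfrac{1}{\alpha_{n-1} + \cfrac{1}{\alpha_n + 1}}}}$$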

If we made a tree at each point of which was written its corresponding positive simple continued fraction, and then found the irreducible rational number representing that fraction, we would get the famous Stern-Brocot

tree. In such a tree, infinite continued fractions would represent strings of lefts and rights of infinite length.

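Alternatively, suppose that we represent each point on the binary tree with a string of the form

$$L^{\alpha_1} R^{\alpha_2} \ldots L^{\alpha_{n-1}} R^{\alpha_n},$$

$$\alpha_j \in \mathbb{Z} \text{ for all } j; \quad \alpha_j > 0 \text{ for } 2 \le j \le n-1; \quad \alpha_1, \alpha_n \ge 0.$$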

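Then we can establish a bijection between points on the tree and all numbers in the interval $(0, 1]$ by mapping $L^{\alpha_1} R^{\alpha_2} \ldots L^{\alpha_{n-1}} R^{\alpha_n}$ to

$$\cfrac{1}{\alpha_1 + \cfrac{1}{\alpha_2 + \cfrac{1}{\ddots + \cfrac{1}{\alpha_n + 1}}}}$$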

In this case, we can directly apply Theorem 59.

60 Definition: Suppose that we have a string of lefts and rights, $L^{\alpha_1} R^{\alpha_2} \ldots$, and that the natural number $k$ occurs $n_k$ times in the first $n$ numbers of the sequence $\{\alpha_1, \alpha_2, \ldots\}$. If

$$\frac{n_k}{n} \to \kappa$$

then we say that $k$ appears with asymptotic frequency $\kappa$. In that case, if we define a measure on the set of all infinite strings such that a set of strings has the measure that its corresponding set in $(0, 1]$ has, we would see that for almost every string of rights and lefts on a binary tree, every integer $k \in \mathbb{N}$ appears (as an exponent) with asymptotic frequency

$$\frac{1}{\ln 2} \ln\left(\frac{(k+1)^2}{k(k+2)}\right).$$

In other words, probabilistically speaking, for 100% of the paths along an infinite tree (with respect to the measure just defined), the distribution of streaks (i.e. the number of times one moves the same direction once in a row, twice in a row, three times in a row, ..., $k$ times in a row, etc.) will be defined by the above equation. A binary tree describes many processes in the world, for instance the flipping of a coin; tails = left, heads = right, for example. Since the infinite binary tree represents all scenarios of an infinite number of coin flips, we can see that 100% of the time, an infinite sequence of coin flips will have a distribution of streaks as defined above, and that as finite amounts

of coin flips get larger, they will tend towards this distribution. This means, for instance, that about 41.5% of streaks will be of length 1.

Finally, here are some questions I would like to investigate: what is the relationship between the set of normal numbers and the set of ϕ-normal numbers? What is the smallest measure 1 subset of [0, 1]? Are there complex analogues to simple continued fractions?

That is all–thank you for reading!
