From Apollonius to Zaremba: Local-Global Phenomena in Thin Orbits




BULLETIN (New Series) OF THE AMERICAN MATHEMATICAL SOCIETY
Volume 50, Number 2, April 2013, Pages 187–228
S 0273-0979(2013)01402-2
Article electronically published on January 18, 2013

FROM APOLLONIUS TO ZAREMBA:

LOCAL-GLOBAL PHENOMENA IN THIN ORBITS

ALEX KONTOROVICH

Abstract. We discuss a number of natural problems in arithmetic, arising in completely unrelated settings, which turn out to have a common formulation involving “thin” orbits. These include the local-global problem for integral Apollonian gaskets and Zaremba’s Conjecture on finite continued fractions with absolutely bounded partial quotients. Though these problems could have been posed by the ancient Greeks, recent progress comes from a pleasant synthesis of modern techniques from a variety of fields, including harmonic analysis, algebra, geometry, combinatorics, and dynamics. We describe the problems, partial progress, and some of the tools alluded to above.

Contents

1. Introduction
2. Zaremba’s Conjecture
3. Integral Apollonian gaskets
4. The thin Pythagorean problem
5. The circle method: tools and proofs
Acknowledgments
About the author
References

1. Introduction

In this article we will discuss recent developments on several seemingly unrelated arithmetic problems, which each boil down to the same issue of proving a “local-global principle for thin orbits”. In each of these problems, we study the orbit

$$\mathcal{O} = \Gamma \cdot v_0,$$

of some given vector $v_0 \in \mathbb{Z}^d$, under the action of some given group or semigroup, $\Gamma$, (under multiplication) of $d$-by-$d$ integer matrices. It will turn out that the orbits arising naturally in our problems are thin; roughly speaking, this means that each orbit is degenerate in its algebro-geometric closure, containing relatively very few points.

Received by the editors August 15, 2012, and, in revised form, November 4, 2012.
2010 Mathematics Subject Classification. Primary 11F41, 11J70, 11P55, 20H10, 22E40.
Partially supported by NSF grants DMS-1209373, DMS-1064214 and DMS-1001252.

©2013 American Mathematical Society. Reverts to public domain 28 years from publication.


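Each of the problems then takes another vector $w_0 \in \mathbb{Z}^d$, and for the standard inner product $\langle \cdot, \cdot \rangle$ on $\mathbb{R}^d$, forms the set

$$\mathcal{S} := \langle w_0, \mathcal{O} \rangle \subset \mathbb{Z}$$

of integers, asking what numbers are in $\mathcal{S}$.

For an integer $q \ge 1$, the projection map

$$\mathbb{Z} \to \mathbb{Z}/q\mathbb{Z}$$

can give an obvious obstruction to membership. Let $\mathcal{S} \pmod q$ be the image of this projection,

$$\mathcal{S} \pmod q := \{ s \pmod q : s \in \mathcal{S} \} \subset \mathbb{Z}/q\mathbb{Z}.$$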

For example, suppose that any number in $\mathcal{S}$ leaves a remainder of 1, 2 or 3 when divided by 4, that is, $\mathcal{S} \pmod 4 = \{1, 2, 3\}$. Then one can conclude, without any further consideration, that $10^{10^{10}} \notin \mathcal{S}$, since $10^{10^{10}} \equiv 0 \pmod 4$. This is called a local obstruction. Call $n$ admissible if it avoids all local obstructions,

$$n \in \mathcal{S} \pmod q, \quad \text{for all } q \ge 1.$$

In many applications, the set $\mathcal{S} \pmod q$ is significantly easier to analyze than the set $\mathcal{S}$ itself. But a local to global phenomenon predicts that, if $n$ is admissible, then in fact $n \in \mathcal{S}$, thereby reducing the seemingly more difficult problem to the easier one.

It is the combination of these concepts, (i) thin orbits, and (ii) local-global phenomena, which will turn out to be the “beef” of the problems we intend to discuss.
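The local obstruction in the example above can be checked mechanically. The following is a minimal sketch, assuming the hypothetical residue set $\{1, 2, 3\}$ modulo 4 from that example; the helper name `is_locally_obstructed` is illustrative and not from the article. Python's three-argument `pow` reduces $10^{10^{10}}$ modulo 4 without ever forming the full number.

```python
def is_locally_obstructed(n_mod_q, residues):
    """True if the residue of n is missing from the set S (mod q)."""
    return n_mod_q not in residues

q = 4
S_mod_4 = {1, 2, 3}            # hypothetical S (mod 4), as in the example above
n_mod_4 = pow(10, 10**10, q)   # 10^(10^10) reduced modulo 4 by modular exponentiation

print(n_mod_4, is_locally_obstructed(n_mod_4, S_mod_4))
```

The point of the design is that only residues are ever handled, which is what makes the local question so much easier than the global one.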

1.1. Outline. We begin in §2 with Zaremba’s Conjecture. We will explain how this problem arose naturally in the study of “good lattice points” for quasi–Monte Carlo methods in multi-dimensional numerical integration, and how it also has applications to the linear congruential method for pseudo-random number generators. But the assertion of the conjecture is a statement about continued fraction expansions of rational numbers, and as such is so elementary that Euclid himself could have posed it. We will discuss recent progress by Bourgain and the author, proving a density version of the conjecture.

We change our focus in §3 to the ancient geometer Apollonius of Perga. As we will explain, his straight-edge and compass construction of tangent circles, when iterated ad infinitum, gives rise to a beautiful fractal circle packing in the plane, such as that shown in Figure 1. Recall that the curvature of a circle is just one over its radius. For special configurations, all the curvatures of circles in the given packing turn out to be integers; these are the numbers shown in Figure 1. In §3 we will present progress on the problem, which integers appear? It was recently proved by Bourgain and the author that almost every admissible number appears.

In §4, also stemming from Greek mathematics, we describe a local-global problem for a thin orbit of Pythagorean triples, as will be defined there. This problem is a variant of the so-called Affine Sieve, recently introduced by Bourgain, Gamburd, and Sarnak. We will explain an “almost” local-global theorem in this context due to Bourgain and the author.

Finally, these three problems are reformulated to the aforementioned common umbrella in §5, where some of the ingredients of the proofs are sketched. The problems do not naturally fit in an established area of research, having no L-functions or Hecke theory (though they are unquestionably problems about whole numbers), not being part of the Langlands Program (though involving automorphic forms and representations), nor falling under the purview of the classical circle method or sieve, which attempt to solve equations or produce primes in polynomials (here it is not polynomials that generate points, but the aforementioned matrix actions). Instead the proofs borrow bits and pieces from these fields and others, the major tools including analysis (the circle method, exponential sum bounds, infinite volume spectral theory), algebra (strong approximation, Zariski density, spin and orthogonal groups associated to quadratic forms, representation theory), geometry (hyperbolic manifolds, circle packings, Diophantine approximation), combinatorics (sum-product, expander graphs, spectral gaps), and dynamics (ergodic theory, mixing rates, the thermodynamic formalism). We aim to highlight some of these ingredients throughout.

Figure 1. An integral Apollonian gasket.

1.2. Notation. We use the following standard notation. A quantity is defined via the symbol “:=”, and a concept being defined is italicized. Write $f \sim g$ for $f/g \to 1$, $f = o(g)$ for $f/g \to 0$, and $f = O(g)$ or $f \ll g$ for $f \le Cg$. Here $C > 0$ is called an implied constant, and is absolute unless otherwise specified. Moreover, $f \asymp g$ means $f \ll g \ll f$. We use $e(x) = e^{2\pi i x}$. The cardinality of a finite set $S$ is written as $|S|$ or $\#S$. The transpose of a vector $v$ is written $v^t$. The meaning of algebraic symbols can change from section to section; for example the (semi)group $\Gamma$ and quadratic form $Q$ will vary depending on the context.

2. Zaremba’s Conjecture

Countless applications require pseudo-random numbers: deterministic algorithms which “behave randomly”. Probably the simplest, oldest, and best known among these is the so-called linear congruential method: For some starting seed $x_0$, iterate the map

$$x \mapsto bx + c \pmod d. \tag{2.1}$$

Figure 2. Graphs of the map (2.2) with prime modulus $d = 10037$ and multiplier $b$ as shown: (a) multiplier $b = 4217$; (b) multiplier $b = 4015$. In each panel the horizontal axis is $n$ and the vertical axis is $b^n \bmod d$.

Here $b$ is called the multiplier, $c$ the shift, and $d$ the modulus. For simplicity, we consider the homogeneous case $c = 0$. To have as long a sequence as possible, take $d$ to be prime, and $b$ a primitive root mod $d$, that is, a generator of the cyclic group $(\mathbb{Z}/d\mathbb{Z})^\times$. In this case we may as well start with the seed $x_0 = 1$; then the iterates of (2.1) are nothing more than the map

$$n \mapsto b^n \pmod d. \tag{2.2}$$

We show graphs of this map in Figure 2 for the prime $d = 10037$, with two choices of roots, $b = 4217$ and $b = 4015$. In both cases, the graphs “look” random, in that, given $b$ and $n$, it is hard to guess where $b^n \pmod d$ will lie (without just computing). Similarly, given $b$ and $b^n \pmod d$, it is typically difficult to determine $n$; this is the classical problem of computing a discrete logarithm.
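The linear congruential iteration and the primitive-root condition can be sketched in a few lines. This is an illustration only, not the article's code: the function names are hypothetical, and the order is found by naive iteration, which is adequate for moduli of the size used here.

```python
from math import gcd

def multiplicative_order(b, d):
    """Smallest k >= 1 with b**k = 1 (mod d); requires d >= 2 and gcd(b, d) == 1."""
    if d < 2 or gcd(b, d) != 1:
        raise ValueError("need d >= 2 and b invertible modulo d")
    k, x = 1, b % d
    while x != 1:
        x = (x * b) % d
        k += 1
    return k

def lcg_orbit(b, d, seed=1, c=0):
    """Iterates of x -> b*x + c (mod d), starting at seed, up to the first repeat."""
    seen, orbit, x = set(), [], seed % d
    while x not in seen:
        seen.add(x)
        orbit.append(x)
        x = (b * x + c) % d
    return orbit

def is_primitive_root(b, p):
    """For a prime p: b generates (Z/pZ)^x exactly when its order is p - 1."""
    return multiplicative_order(b, p) == p - 1

print(lcg_orbit(3, 7))  # homogeneous case c = 0 with seed 1: the powers of 3 mod 7
```

With $c = 0$ and seed 1, the length of the orbit equals the multiplicative order of $b$, which is why a primitive root gives the longest possible sequence.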

A slightly more rigorous statistical test for randomness is the serial correlation of pairs: How well can we guess where $b^{n+1}$ is, knowing $b^n$? To this end, we plot in Figure 3 these pairs, or what is the same, the pairs

$$\left\{ \left( \frac{b^n}{d}, \frac{b^{n+1}}{d} \right) \pmod 1 \right\}_{n=1}^{d} \subset \mathbb{R}^2/\mathbb{Z}^2 \tag{2.3}$$

in the unit square, with the previous choices of modulus and multiplier. Focus first on Figure 3a: it looks like a fantastically equidistributed grid. Keep in mind that the mesh in each coordinate is of size $1/d \approx 1/10000$, so we have $(10000)^2$ points from which to choose, yet we are only plotting 10000 points, square-root the total number of options.

On the other hand, look at Figure 3b: these parameters make a terrible random number generator! Given the first few terms in this sequence $(x_1, x_2, x_3, \dots, x_k)$, with $x_n = b^n/d \pmod 1$, we simply plot the pairs $(x_1, x_2), (x_2, x_3), \dots, (x_{k-1}, x_k)$, and then have a 1 : 5 guess for where $x_{k+1}$ will be.

Figure 3. Plots of the points (2.3) for the same choices of modulus $d = 10037$ and multipliers as in Figure 2: (a) multiplier $b = 4217$; (b) multiplier $b = 4015$. Both axes run over the unit interval.

A related phenomenon also appears in two-dimensional numerical integration: Suppose that you wish to integrate a “nice” function $f$ on $\mathbb{R}^2/\mathbb{Z}^2 \cong [0, 1) \times [0, 1)$, say of finite variation, $V(f) < \infty$, where

$$V(f) := \int_0^1 \int_0^1 \left( |f| + \left| \frac{\partial}{\partial x} f \right| + \left| \frac{\partial}{\partial y} f \right| + \left| \frac{\partial^2}{\partial x \partial y} f \right| \right) dx\, dy.$$

The idea is to take a large sample of points $Z$ in $\mathbb{R}^2/\mathbb{Z}^2$ and approximate the integral by the average of $f(z)$, $z \in Z$. For this to be a good approximation one obviously needs that $f$ does not vary much in a small ball, and that the points of $Z$ are well distributed throughout $\mathbb{R}^2/\mathbb{Z}^2$. In fact, the famous Koksma–Hlawka inequality (see [Nie78, p. 966]) states, rather beautifully, that this is all that one needs to take into account:

$$\left| \int_0^1 \int_0^1 f(x, y)\, dx\, dy - \frac{1}{|Z|} \sum_{z \in Z} f(z) \right| \le C \cdot V(f) \cdot \mathrm{Disc}(Z).$$

Here $C > 0$ is an absolute constant, and Disc is the discrepancy of the set $Z$, defined as follows. Take a rectangle $R = [a, b] \times [c, d] \subset \mathbb{R}^2/\mathbb{Z}^2$. One would like the fraction of points in $R$ to be close to its area, so set

$$\mathrm{Disc}(Z) := \sup_{R \subset \mathbb{R}^2/\mathbb{Z}^2} \left| \frac{\#(Z \cap R)}{\#Z} - \mathrm{Area}(R) \right|.$$

It is elementary that for a growing family $Z^{(k)} \subset \mathbb{R}^2/\mathbb{Z}^2$, $|Z^{(k)}| \to \infty$, the discrepancy $\mathrm{Disc}(Z^{(k)})$ decays to 0 if and only if $Z^{(k)}$ becomes equidistributed in $\mathbb{R}^2/\mathbb{Z}^2$. But more than just indicating equidistribution, the discrepancy measures the rate. For example, observe that for any finite sample set $Z$, we have the lower bound $\mathrm{Disc}(Z) \ge 1/|Z|$. Indeed, take a family of rectangles $R$ zooming in on a single point in $Z$; the proportion of points in $R$ is always $1/|Z|$, while the area of $R$ can be made arbitrarily small. It turns out there is a sharpest possible lower bound, due to Schmidt [Sch72]:

$$\text{for any finite } Z \subset \mathbb{R}^2/\mathbb{Z}^2, \quad \mathrm{Disc}(Z) \gg \frac{\log |Z|}{|Z|}. \tag{2.4}$$

Standard Monte Carlo integration is the process of computing the integral of $f$ by just sampling $z \in Z$ according to the uniform measure; the Central Limit Theorem then predicts that

$$\mathrm{Disc}(Z) \approx \frac{1}{|Z|^{1/2}}, \tag{2.5}$$

ignoring log log factors. So comparing (2.5) to (2.4), it is clear that uniformly sampled sequences are far from optimal in numerical integration. Alternatively, one could take $Z$ to be an evenly spaced $d$-by-$d$ grid,

$$Z = \{ (i/d, j/d) : 0 \le i, j < d \},$$

with $|Z| = d^2$. But then the rectangle $[\varepsilon, 1/d - \varepsilon] \times [0, 1]$ contains no grid points while its area is almost $1/d = 1/|Z|^{1/2}$, again giving (2.5).

In the quasi–Monte Carlo method, rather than sampling uniformly, one tries to find a special sample set $Z$ to come as close as possible to the optimal discrepancy (2.4). Ideally, such a set $Z$ would also be quickly and easily constructible by a computer algorithm. Not surprisingly, the set $Z$ illustrated in Figure 3a makes an excellent sample set. It was this problem which led Zaremba to his theorem and conjecture, described below.

Returning to our initial discussion, observe that the sequence (2.3) is essentially (since $b$ is a generator) the same as

$$Z_{b,d} := \left\{ \left( \frac{n}{d}, \frac{bn}{d} \right) \right\}_{n=1}^{d} \pmod 1. \tag{2.6}$$

And this is nothing more than a graph of our first map (2.1). Now it is clear that both Figures 3a and 3b are “lines”, but the first must be “close to a line with irrational slope”, causing the equidistribution. This Diophantine property is best described in terms of continued fractions, as follows.

For $x \in (0, 1)$, we use the notation

$$x = [a_1, a_2, \dots]$$

for the continued fraction expansion

$$x = \cfrac{1}{a_1 + \cfrac{1}{a_2 + \ddots}}.$$

The integers $a_j \ge 1$ are called partial quotients of $x$. Rational numbers have finite continued fraction expansions.

One is then immediately prompted to study the continued fraction expansions of the “slopes” $b/d$ in Figure 3:

$$4217/10037 = [2, 2, 1, 1, 1, 2, 2, 2, 1, 2, 2, 1, 2],$$

$$4015/10037 = [2, 2, 2007].$$
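The two expansions above can be reproduced with Euclid's algorithm. The sketch below is an illustration, with a hypothetical function name; it returns the expansion that Euclid's algorithm produces, in which the last partial quotient is at least 2 whenever $d > 1$.

```python
def continued_fraction(b, d):
    """Partial quotients [a1, ..., ak] of b/d for 0 < b <= d, via Euclid's algorithm."""
    if not 0 < b <= d:
        raise ValueError("expected 0 < b <= d")
    quotients = []
    num, den = d, b  # b/d = 1 / (d/b), so the first quotient is floor(d/b)
    while den:
        q, r = divmod(num, den)
        quotients.append(q)
        num, den = den, r
    return quotients

print(continued_fraction(4015, 10037))  # [2, 2, 2007]
print(max(continued_fraction(4217, 10037)))  # largest partial quotient is 2
```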

Note the gigantic partial quotient 2007 in the second expression, while the partial quotients in the first are all 1’s and 2’s. Observations of this kind naturally led Zaremba to the following

Theorem 2.7 (Zaremba, 1966 [Zar66, Corollary 5.2]). Fix $(b, d) = 1$ with $b/d = [a_1, a_2, \dots, a_k]$ and let $A := \max a_j$. Then for $Z_{b,d}$ given in (2.6),

$$\mathrm{Disc}(Z_{b,d}) \le \left( \frac{4A}{\log(A + 1)} + \frac{4A + 1}{\log d} \right) \frac{\log d}{d}. \tag{2.8}$$


Since $|Z_{b,d}| = d$, comparing the upper bound (2.8) to Schmidt’s lower bound (2.4) shows that the sequences (2.6) are essentially best possible, up to the “constant” $A$ (and this optimal equidistribution is precisely what we observe visually in Figure 3a). But the previous sentence is complete nonsense: $A$ is not constant at all; it depends on $d$ (see footnote 1), and Figure 3b perfectly illustrates what can go wrong.
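The contrast between small and large partial quotients can be seen numerically on a small lattice. The sketch below is only illustrative: it uses the modulus $d = 89$ with multipliers 55 and 1 (choices made here for speed, not taken from the article), and it maximizes over boxes $[0, i/d] \times [0, j/d]$ anchored at the origin, so it computes a lower bound for $\mathrm{Disc}(Z_{b,d})$ rather than the discrepancy itself. Note that $55/89$ has all partial quotients at most 2, while $1/89 = [89]$.

```python
def lattice_points(b, d):
    """Numerators (n mod d, b*n mod d), n = 1..d, of the points of Z_{b,d} in (2.6)."""
    return [(n % d, (b * n) % d) for n in range(1, d + 1)]

def anchored_discrepancy(b, d):
    """Max over closed boxes [0, i/d] x [0, j/d] of |#(Z in box)/d - area(box)|."""
    hits = [[0] * d for _ in range(d)]
    for x, y in lattice_points(b, d):
        hits[x][y] += 1
    # prefix[i+1][j+1] counts points whose numerators are <= i and <= j
    prefix = [[0] * (d + 1) for _ in range(d + 1)]
    worst = 0.0
    for i in range(d):
        for j in range(d):
            prefix[i + 1][j + 1] = (hits[i][j] + prefix[i][j + 1]
                                    + prefix[i + 1][j] - prefix[i][j])
            fraction = prefix[i + 1][j + 1] / d
            area = (i / d) * (j / d)
            worst = max(worst, abs(fraction - area))
    return worst

print(anchored_discrepancy(55, 89), anchored_discrepancy(1, 89))
```

The two-dimensional prefix sum keeps the cost at order $d^2$, which is why a small modulus is used; the article's $d = 10037$ would need a different approach.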

With this motivation, Zaremba predicted that in fact $A$ can be taken constant:

Conjecture Z (Zaremba, 1972 [Zar72, p. 76]). Every natural number is the denominator of a reduced fraction whose partial quotients are absolutely bounded.

That is, there exists some absolute $A > 1$ so that for each $d \ge 1$, there is some $(b, d) = 1$, so that $b/d = [a_1, \dots, a_k]$ with $\max a_j \le A$.

Zaremba even suggested a sufficient value for $A$, namely $A = 5$. So this is really a problem that could have been posed in Book VII of the Elements (after Euclid’s algorithm): Using the partial quotients $a_j \in \{1, \dots, 5\}$, does the set of (reduced) fractions with expansion $[a_1, \dots, a_k]$ contain every integer as a denominator? The reason for Zaremba’s guess $A = 5$ is simply that it is false for $A = 4$, as we now explain. First some more notation.

Let $\mathfrak{R}_A$ be the set of rationals with the desired property that all partial quotients are at most $A$,

$$\mathfrak{R}_A := \left\{ \frac{b}{d} = [a_1, \dots, a_k] : (b, d) = 1, \text{ and } a_j \le A,\ \forall j \right\};$$

and let $\mathfrak{D}_A$ be the set of denominators which arise,

$$\mathfrak{D}_A := \left\{ d : \exists (b, d) = 1 \text{ with } \frac{b}{d} \in \mathfrak{R}_A \right\}.$$

Then Zaremba’s conjecture is that $\mathfrak{D}_5 = \mathbb{N}$, and we claim that this is false for $\mathfrak{D}_4$. Indeed, $6 \notin \mathfrak{D}_4$: the only numerators to try are 1 and 5, but the continued fraction expansion of $1/6$ is just $[6]$, and $5/6 = [1, 5]$, so the largest partial quotient in both is too big.
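The claim $6 \notin \mathfrak{D}_4$ can be checked by brute force over numerators. The following is a rough sketch with hypothetical function names. It assumes the expansion produced by Euclid's algorithm (so the last partial quotient is at least 2 when $d > 1$), which matches the expansions $[6]$ and $[1, 5]$ used above; that convention is an assumption of this sketch.

```python
from math import gcd

def max_partial_quotient(b, d):
    """Largest partial quotient in the Euclidean expansion of b/d, 0 < b <= d."""
    num, den, biggest = d, b, 0
    while den:
        q, r = divmod(num, den)
        biggest = max(biggest, q)
        num, den = den, r
    return biggest

def is_denominator(d, A):
    """True if some b coprime to d has all partial quotients of b/d at most A."""
    return any(gcd(b, d) == 1 and max_partial_quotient(b, d) <= A
               for b in range(1, d + 1))

def missing_denominators(A, N):
    """Integers 1 <= d <= N that fail is_denominator(d, A)."""
    return [d for d in range(1, N + 1) if not is_denominator(d, A)]

print(missing_denominators(4, 60))
```

The search is quadratic in $N$, which is fine for small ranges but is not how large-scale verifications would be carried out.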

That said, there are only two other numbers, 54 and 150, known to be missing from $\mathfrak{D}_4$ (see [OEI]), leading one to ask what happens if a finite number of exceptions is permitted. Indeed, Niederreiter [Nie78, p. 990] conjectured in 1978 that for $A = 3$, $\mathfrak{D}_3$ already contains every sufficiently large number; we write this as

$$\mathfrak{D}_3 \supset \mathbb{N}_{\gg 1}.$$

With lots more computational capacity and evidence, Hensley almost 20 years later [Hen96] conjectured even more boldly that the same holds already for $A = 2$:

$$\mathfrak{D}_2 \supset \mathbb{N}_{\gg 1}. \tag{2.9}$$

Lest the reader be tempted to one-up them all, let us consider the case $A = 1$. Here $\mathfrak{R}_1$ contains only continued fractions of the form $[1, \dots, 1]$, and these are quotients of consecutive Fibonacci numbers $F_n$,

$$\mathfrak{R}_1 = \{ F_n / F_{n+1} \}.$$

So $\mathfrak{D}_1 = \{ F_n \}$ is just the Fibonacci numbers, and this is an exponentially thin sequence.

Footnote 1. The value $A$ also depends on $b$, but the important variable for applications is $|Z_{b,d}| = d$.


In fact, Hensley conjectured something much stronger than (2.9). First some more notation. Let $\mathfrak{C}_A$ be the set of limit points of $\mathfrak{R}_A$,

$$\mathfrak{C}_A := \{ [a_1, a_2, \dots] : a_j \le A,\ \forall j \}.$$

To get our bearings, consider again the case $A = 1$. Then $\mathfrak{C}_1 = \{ 1/\varphi \}$ is just the singleton consisting of the reciprocal of the golden mean.

Now take $A = 2$. Consider the unit interval $[0, 1]$. The numbers in the range $(1/2, 1]$ have first partial quotient $a_1 = 1$, and those in $(1/3, 1/2]$ have first partial quotient $a_1 = 2$. The remaining interval $[0, 1/3]$ has numbers whose first partial quotient is already too big, and thus is cut out. We repeat in this way, cutting out intervals for each partial quotient, and arriving at $\mathfrak{C}_2$; see Figure 4.

For any $A \ge 1$, the Cantor-like set $\mathfrak{C}_A$ has some Hausdorff dimension

$$\delta_A := \dim(\mathfrak{C}_A), \tag{2.10}$$

which we recall is defined as the infimum of all $s \ge 0$ for which

$$\inf_{\bigcup_j B_j \supset \mathfrak{C}_A} \left\{ \sum_j r(B_j)^s \right\} \tag{2.11}$$

vanishes. The infimum in (2.11) is over collections $\{B_j\}_j$ of open balls (intervals) which cover $\mathfrak{C}_A$, and $r(B_j)$ is the radius of $B_j$ (half the length of the interval).

Clearly $\delta_1 = 0$, since $\mathfrak{C}_1$ is a single point. There is a substantial literature estimating the dimension $\delta_2$ which we will not survey, but the current record is due to Jenkinson and Pollicott [JP01], whose superexponential algorithm estimates

$$\delta_2 = 0.5312805062772051416244686 \cdots. \tag{2.12}$$

If we relax the bound $A$, the Cantor sets increase, as do their dimensions. In fact, Hensley [Hen92] determined the asymptotic expansion, which to first order is

$$\delta_A = 1 - \frac{6}{\pi^2 A} + o\left( \frac{1}{A} \right), \tag{2.13}$$

as $A \to \infty$. In particular, the dimension can be made arbitrarily close to 1 by taking $A$ large.

Figure 4. The Cantor set $\mathfrak{C}_2 = \bigcap_{k=1}^{\infty} \mathfrak{C}_2^{(k)}$, where $\mathfrak{C}_2^{(k)} = \{ [a_1, \dots, a_j, \dots, a_k, \dots] : a_j \le A \text{ for all } 1 \le j \le k \}$ restricts only the first $k$ partial quotients.


We can now explain Hensley’s stronger conjecture. His observation is that one need not only consider restricting the partial quotients $a_j$ to the full interval $[1, A]$, one can allow more flexibility by fixing any finite “alphabet” $\mathcal{A} \subset \mathbb{N}$, and restricting the partial quotients to the “letters” in this alphabet. To this end, let $\mathfrak{C}_\mathcal{A}$ be the Cantor set

$$\mathfrak{C}_\mathcal{A} := \{ [a_1, a_2, \dots] : a_j \in \mathcal{A},\ \forall j \ge 1 \},$$

and similarly let $\mathfrak{R}_\mathcal{A}$ be the partial convergents to $\mathfrak{C}_\mathcal{A}$, $\mathfrak{D}_\mathcal{A}$ the denominators of $\mathfrak{R}_\mathcal{A}$, and $\delta_\mathcal{A}$ the Hausdorff dimension of $\mathfrak{C}_\mathcal{A}$. Then Hensley’s elegant claim is the following

Conjecture 2.14 (Hensley, 1996 [Hen96, Conjecture 3, p. 16]).

$$\mathfrak{D}_\mathcal{A} \supset \mathbb{N}_{\gg 1} \iff \delta_\mathcal{A} > 1/2. \tag{2.15}$$

Observe in particular that $\delta_2$ in (2.12) exceeds $1/2$, and hence Hensley’s full conjecture (2.15) implies the special case $A = 2$ in (2.9).

Here is some heuristic evidence in favor of (2.15). Let us visualize the set $\mathfrak{R}_\mathcal{A}$ of rationals, by grading each fraction according to the denominator. That is, plot each fraction $b/d$ at height $d$, showing the set

$$\left\{ \left( \frac{b}{d}, d \right) : \frac{b}{d} \in \mathfrak{R}_\mathcal{A},\ (b, d) = 1 \right\}. \tag{2.16}$$

We show this plot in Figure 5a for $\mathcal{A} = \{1, 2\}$ truncated at height $N = 10000$, and in Figure 5b for $\mathcal{A} = \{1, 2, 3, 4, 5\}$ truncated at height $N = 1000$. We give a name to this truncation, defining

$$\mathfrak{R}_\mathcal{A}(N) := \left\{ \frac{b}{d} \in \mathfrak{R}_\mathcal{A} : (b, d) = 1,\ 1 \le b < d < N \right\}.$$

Observe that the “vertical tentacles” in Figure 5 emanate from points on the $x$-axis lying in the Cantor sets $\mathfrak{C}_\mathcal{A}$; compare Figures 5a and 4. Moreover, note that if at least one point has been placed at height $d$, then $d \in \mathfrak{D}_\mathcal{A}$. That is, the “beef” of this problem boils down to, what are the projections of the plots in Figure 5 to the $y$-axis? In particular, does every (sufficiently large) integer appear?

The first question to address is, How big is $|\mathfrak{R}_\mathcal{A}(N)|$, that is, how many points are being plotted in Figures 5a and 5b? Hensley [Hen89] showed that, as $N \to \infty$,

$$\#\mathfrak{R}_\mathcal{A}(N) \asymp N^{2\delta_\mathcal{A}}, \tag{2.17}$$

where the implied constant can depend on $\mathcal{A}$. (Hensley proved this for the alphabet $\mathcal{A} = \{1, 2, \dots, A\}$, but the same proof works for an arbitrary finite $\mathcal{A}$.)

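Now, the $\Longrightarrow$ direction of (2.15) is trivial. Indeed, let

$$\mathfrak{D}_\mathcal{A}(N) := \mathfrak{D}_\mathcal{A} \cap [1, N],$$

so that the left-hand side of (2.15) is equivalent to

$$\#\mathfrak{D}_\mathcal{A}(N) = N + O(1), \quad \text{as } N \to \infty. \tag{2.18}$$

Then it is clear that $\#\mathfrak{R}_\mathcal{A}(N)$ counts $d$’s with multiplicity, whereas $\#\mathfrak{D}_\mathcal{A}(N)$ counts each appearing $d$ only once; hence

$$\#\mathfrak{D}_\mathcal{A}(N) \le \#\mathfrak{R}_\mathcal{A}(N) \overset{(2.17)}{\ll} N^{2\delta_\mathcal{A}}. \tag{2.19}$$

So if (2.18) holds, then (2.19) implies that $2\delta_\mathcal{A}$ must be at least 1.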

Figure 5. For each $b/d \in \mathfrak{R}_\mathcal{A}(N)$, plot $b/d$ versus $d$, with $A$ and truncation parameter $N$ as shown: (a) $A = 2$, $N = 10000$; (b) $A = 5$, $N = 1000$. The horizontal axis is $b/d$ and the vertical axis is $d$.

A caveat: we do not know how to verify (2.18) for a single alphabet! Nevertheless the content of Hensley’s Conjecture is clearly the opposite $\Longleftarrow$ direction. Here is some evidence in favor of this claim.

An old theorem of Marstrand’s [Mar54] states the following. Let $E \subset [0, 1] \times [0, 1]$ be a Hausdorff measurable set having Hausdorff dimension $\alpha > 1$. Then the projection of $E$ into a line of slope $\tan \theta$ is “large,” for Lebesgue-almost every $\theta \in \mathbb{R}/2\pi\mathbb{Z}$. Here “large” means of positive Lebesgue measure. One may thus heuristically think of (2.16) as $E$ above, with (2.17) suggesting the “dimension” $\alpha = 2\delta_\mathcal{A}$. Then $\mathfrak{D}_\mathcal{A}$ is the projection of this $E$ to the $y$-axis, and it should be “large” according to the analogy. Marstrand’s theorem says nothing about an individual line, and does not apply to the countable set (2.16), so the analogy cannot be furthered in any meaningful way. Nevertheless, we see the condition $\alpha > 1$ is converted into $2\delta_\mathcal{A} > 1$, giving evidence for the $\Longleftarrow$ direction of (2.15).

For another heuristic, if one uniformly samples $N^{2\delta}$ pairs $(b, d)$ out of the integers up to $N$, a given $d$ is expected to appear with multiplicity roughly $N^{2\delta - 1}$. For $\delta > 1/2$ and $N$ growing, this multiplicity will be positive with probability tending to 1.

This heuristic does not rule out the possible conspiracy that only very few (about $N^{2\delta - 1}$) $d$’s actually appear, each with very high (about $N$) multiplicity. But such an argument in reverse leads to another bit of evidence toward (2.15): since the multiplicity of any $d < N$ is at most $N$, we have the elementary lower bound

$$\#\mathfrak{D}_\mathcal{A}(N) \ge \frac{1}{N} \#\mathfrak{R}_\mathcal{A}(N) \overset{(2.17)}{\gg} \frac{1}{N} N^{2\delta_\mathcal{A}} = N^{2\delta_\mathcal{A} - 1}.$$

So if $\delta_\mathcal{A} > 1/2$, then the set $\mathfrak{D}_\mathcal{A}$ already grows at least at a power rate. Furthermore, for any fixed $\varepsilon > 0$, one can take some $\mathcal{A} = \mathcal{A}(\varepsilon)$ sufficiently large so that $2\delta_\mathcal{A} - 1 > 1 - \varepsilon$. For example, using (2.13), we can take $\mathcal{A} = \{1, 2, \dots, A\}$ where

$$A > \frac{12}{\pi^2 \varepsilon} (1 + o(1)).$$

Here $o(1) \to 0$ as $\varepsilon \to 0$. Hence one can produce $N^{1 - \varepsilon}$ points in $\mathfrak{D}_\mathcal{A}(N)$, which is already substantial progress toward (2.18).
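For the reader's convenience, the threshold on $A$ can be recovered from (2.13) in one line; this is only a restatement of the first-order expansion, with the error term carried along unchanged.

```latex
\begin{align*}
2\delta_A - 1
  &= 2\left(1 - \frac{6}{\pi^2 A} + o\!\left(\frac{1}{A}\right)\right) - 1
   = 1 - \frac{12}{\pi^2 A}\,\bigl(1 + o(1)\bigr), \\
2\delta_A - 1 > 1 - \varepsilon
  &\iff \frac{12}{\pi^2 A}\,\bigl(1 + o(1)\bigr) < \varepsilon
   \iff A > \frac{12}{\pi^2 \varepsilon}\,\bigl(1 + o(1)\bigr).
\end{align*}
```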

But unfortunately, Hensley’s conjecture (2.15), as stated, is false.

Lemma 2.20 (Bourgain and Kontorovich, 2011 [BK11, Lemma 1.19]). The alphabet $\mathcal{A} = \{2, 4, 6, 8, 10\}$ has dimension $\delta_\mathcal{A} = 0.517\ldots$, which exceeds $1/2$, but its set of denominators $\mathfrak{D}_\mathcal{A}$ does not contain every sufficiently large number.

Proof. The dimension can be computed by the Jenkinson–Pollicott algorithm used to establish (2.12). It is an elementary calculation from the definitions to show for this alphabet that every fraction in $\mathfrak{R}_\mathcal{A}$ is of the form $2m/(4n + 1)$ or $(4n + 1)/(2m)$, and so $\mathfrak{D}_\mathcal{A} \equiv \{0, 1, 2\} \pmod 4$. Hence $\mathfrak{D}_\mathcal{A}$ does not contain every sufficiently large number. $\square$
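The congruence obstruction in the proof can be observed experimentally. The sketch below is an illustration with a hypothetical function name. It enumerates denominators through the standard continuant recurrence $q_k = a_k q_{k-1} + q_{k-2}$ with $q_{-1} = 0$, $q_0 = 1$, which is a textbook fact about continued fractions rather than something spelled out in the text above.

```python
def denominators_up_to(alphabet, N):
    """Denominators q_k <= N of [a1, ..., ak] with every letter a_j in the alphabet."""
    found = set()
    stack = [(0, 1)]  # pairs (q_{k-2}, q_{k-1}), starting from (q_{-1}, q_0)
    while stack:
        prev, cur = stack.pop()
        for a in alphabet:
            nxt = a * cur + prev
            if nxt <= N:
                found.add(nxt)
                stack.append((cur, nxt))
    return found

D = denominators_up_to([2, 4, 6, 8, 10], 10000)
print(sorted(d % 4 for d in D)[-1])  # largest residue mod 4 that occurs
```

For this alphabet the recurrence also gives a quick proof by induction: $q_k$ is even for odd $k$, and $q_k \equiv 1 \pmod 4$ for even $k$.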

That is, there can be congruence obstructions, in addition to the condition on dimension. This suggests instead a closer analogy with Hilbert’s 11th problem, which asks, What numbers are represented by a given integral (or rational) quadratic form? According to this analogy, we make the following

Definition 2.21. Call $d$ represented by the given alphabet $\mathcal{A}$ if $d \in \mathfrak{D}_\mathcal{A}$. Also, call $d$ admissible for the alphabet $\mathcal{A}$ if it is everywhere locally represented, meaning that $d \in \mathfrak{D}_\mathcal{A} \pmod q$ for all $q \ge 1$.


One can then modify Hensley’s conjecture to state that, if $\delta_\mathcal{A}$ exceeds $1/2$ (an Archimedean condition), then every sufficiently large admissible number is represented, akin to Hasse’s local-to-global principle.

Remark 2.22. We will explain in §2.2 that the alphabet $\mathcal{A} = \{1, 2\}$ (the case $A = 2$) has no local obstructions, so Hensley’s first conjecture (2.9) is still plausible.

Here is some progress toward the conjecture.

Theorem Z (Bourgain and Kontorovich, 2011 [BK11]). Almost every natural number is the denominator of a reduced fraction whose partial quotients are bounded by 50.

Here “almost every” is in the sense of density: for $\mathcal{A} = \{1, 2, \dots, 50\}$,

$$\frac{1}{N} \#(\mathfrak{D}_\mathcal{A} \cap [1, N]) \to 1,$$

as $N \to \infty$. The proof in fact shows that for any alphabet $\mathcal{A}$ having sufficiently large dimension

$$\delta_\mathcal{A} > \delta_0, \tag{2.23}$$

almost every admissible number is represented, where the value

$$\delta_0 = 1 - 5/312 \approx 0.98 \tag{2.24}$$

is sufficient. Using refined versions of Hensley’s asymptotic expansion (2.13), the value $A = 50$ seems to satisfy (2.23). The reason Theorem Z needs no mention of admissibility is that any alphabet $\mathcal{A}$ with such a large dimension (2.24) must already contain both 1 and 2; missing even one of these letters will drop the dimension by too much. Hence there are actually no local obstructions in the theorem, cf. Remark 2.22.
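As a rough plausibility check only, one can compare the threshold (2.24) with the first-order term of (2.13) at $A = 50$. The $o(1/A)$ error is not quantified here, so the numbers below are suggestive rather than a verification of (2.23).

```latex
\begin{align*}
\delta_0 &= 1 - \frac{5}{312} \approx 1 - 0.01603 = 0.98397, \\
1 - \frac{6}{\pi^2 \cdot 50} &\approx 1 - \frac{6}{493.48} \approx 1 - 0.01216 = 0.98784 .
\end{align*}
```

The first-order value exceeds $\delta_0$ by about $0.004$, which is consistent with the statement that refined versions of the expansion are needed to decide the matter.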

To explain the source of this progress, we reformulate Zaremba’s problem in a way that highlights the role of the hitherto unmentioned “thin orbit” lurking underneath.

2.1. Reformulation. The key to the above progress is the old and elementary observation that

$$\frac{b}{d} = [a_1, \dots, a_k]$$

is equivalent to

$$\begin{pmatrix} * & b \\ * & d \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 1 & a_1 \end{pmatrix} \cdots \begin{pmatrix} 0 & 1 \\ 1 & a_k \end{pmatrix}. \tag{2.25}$$

With this observation, it is natural to introduce the semigroup generated by matrices of the above form with partial quotients restricted to the given alphabet. Let

$$\Gamma = \Gamma_\mathcal{A} := \left\langle \begin{pmatrix} 0 & 1 \\ 1 & a \end{pmatrix} : a \in \mathcal{A} \right\rangle^{+}, \tag{2.26}$$

where the superscript “+” denotes generation as a semigroup (no inverse matrices). Then the orbit

$$\mathcal{O} = \mathcal{O}_\mathcal{A} := \Gamma \cdot v_0 \tag{2.27}$$


with

$$v_0 = (0, 1)^t \tag{2.28}$$

isolates the set of second columns in $\Gamma$, and from (2.25) is hence in bijection with the set $\mathfrak{R}_\mathcal{A}$. The “thinness” of the orbit is explained by Hensley’s counting statement (2.17), which implies that

$$\#\{ v \in \mathcal{O} : \|v\| < N \} \asymp N^{2\delta_\mathcal{A}},$$

as $N \to \infty$. If $\mathcal{O}$ consisted of all integer pairs $(b, d)^t$, the above count would be replaced by $N^2$, ignoring constants. So this is the reason we call $\mathcal{O}$ thin: it contains many fewer points than the ambient set in which it naturally sits.

From (2.25) again, the set $\mathfrak{D}_\mathcal{A}$ is nothing more than the set of bottom right entries of matrices in $\Gamma_\mathcal{A}$. This can be isolated via

$$\langle v_0, \mathcal{O} \rangle = \langle v_0, \Gamma \cdot v_0 \rangle = \mathfrak{D}_\mathcal{A}, \tag{2.29}$$

where the inner product is the standard one on $\mathbb{R}^2$. Thus $d$ is represented if and only if there is a $\gamma \in \Gamma$ so that

$$d = \langle v_0, \gamma \cdot v_0 \rangle, \tag{2.30}$$

with $v_0$ given in (2.28).
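The identity (2.25) and the pairing (2.30) are easy to test on the two fractions from §2. The following sketch uses plain tuples for 2-by-2 integer matrices; the function names are illustrative and not from the article.

```python
def mat_mul(X, Y):
    """Product of two 2x2 integer matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = X
    (e, f), (g, h) = Y
    return ((a * e + b * g, a * f + b * h), (c * e + d * g, c * f + d * h))

def product_of_generators(quotients):
    """The right-hand side of (2.25) for the partial quotients a1, ..., ak."""
    M = ((1, 0), (0, 1))
    for a in quotients:
        M = mat_mul(M, ((0, 1), (1, a)))
    return M

def represented_denominator(quotients):
    """d = <v0, gamma . v0> as in (2.30), with v0 = (0, 1)^t."""
    M = product_of_generators(quotients)
    v0 = (0, 1)
    gamma_v0 = (M[0][0] * v0[0] + M[0][1] * v0[1],
                M[1][0] * v0[0] + M[1][1] * v0[1])
    return v0[0] * gamma_v0[0] + v0[1] * gamma_v0[1]

print(product_of_generators([2, 2, 2007]))  # second column is (4015, 10037)
```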

2.2. Local obstructions. One can now easily understand Remark 2.22, and thesource of any potential local obstructions. The key observation, via (2.29), is thatto understand DA(mod q), one needs only to understand the reduction of Γ(mod q).And the latter can be analyzed by some algebra, namely the so-called strong ap-proximation property; see e.g. [Rap12] for a comprehensive survey. As we will seebelow, this is a property which determines when the reduction mod q map is onto.For general algebraic groups this is a deep theory, the first proof [MVW84] usingthe classification of finite simple groups. But for SL2, the proofs are elementary,see e.g. [DSV03].

First observe that Γ sits inside the integer points of the algebraic group GL2, meaning that any solution in Z to the polynomial equation (ad − bc)m = 1 gives an element $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in GL_2(\mathbb{Z})$, and vice-versa. Actually GL2 does not have strong approximation (e.g. the determinant in GL2(Z) can only be ±1, while in GL2(Z/5Z) it is 1, 2, 3 or 4; hence the reduction map cannot be onto). So we first pass to SL2, as follows. The generators in (2.26) all have determinant −1, so the product of any two has determinant +1. We make these products the generators for a subsemigroup $\tilde{\Gamma}$ of Γ, that is, set $\tilde{\Gamma} := \Gamma \cap SL_2$. We recover the original Γ-orbit O in (2.27) by a finite union of $\tilde{\Gamma}$-orbits. The limiting Cantor set and its Hausdorff dimension are unaffected.

Then strong approximation says essentially that for p a sufficiently large prime and $q = p^e$ any p power, the reduction of $\tilde{\Gamma}$ mod q is all of SL2(Z/qZ). (It does not matter that $\tilde{\Gamma}$ is only a semigroup; upon reduction mod q, it becomes a group.) Moreover for ramified primes p (those for which the reduction mod p is not onto), the reduction mod sufficiently large powers of p stabilizes after some finite height. This means that there is some power $e_0 = e_0(p, \tilde{\Gamma})$ so that the following holds. For any higher power $e > e_0$, if $M \in SL_2(\mathbb{Z}/p^e\mathbb{Z})$ is such that its reduction is in $\tilde{\Gamma} \pmod{p^{e_0}}$, then M is also in $\tilde{\Gamma} \pmod{p^e}$. (These statements are best made in the language of p-adic numbers, which we avoid here.) A key ingredient is that, while $\tilde{\Gamma}$ is some strange subset of SL2(Z), it is nevertheless Zariski dense in SL2. This means that if P(a, b, c, d) is a polynomial which vanishes for every $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \tilde{\Gamma}$, then P also vanishes on all matrices in SL2 with entries in C.

In the above, “sufficiently large”, both for primes p to be unramified and the stabilizing powers e0 of ramified primes, can be effectively computed in terms of the generators. Then for an arbitrary modulus $q = p_1^{e_1} \cdots p_k^{e_k}$, the reduction mod q can be pieced together from those mod $p_j^{e_j}$ using a type of Chinese Remainder Theorem for groups called Goursat’s Lemma. This leaves some finite group theory to determine completely the reduction of $\tilde{\Gamma}$ mod any q, and hence explains all local obstructions via (2.29).
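The surjectivity of the reduction map at a given prime can be checked by brute force. The following Python sketch is only an illustration under a stated assumption: the generators in (2.26) are not reproduced in this section, so the code takes them to be the matrices (0 1; 1 a) with 1 ≤ a ≤ A, and the function names are hypothetical. It closes the pairwise products under multiplication mod q and compares the size of the result with the order p(p² − 1) of SL2(Z/pZ).

```python
from itertools import product

def mat_mul(m1, m2, q):
    """Multiply 2x2 matrices stored as tuples (a, b, c, d), reducing mod q."""
    a, b, c, d = m1
    e, f, g, h = m2
    return ((a*e + b*g) % q, (a*f + b*h) % q,
            (c*e + d*g) % q, (c*f + d*h) % q)

def reduction_mod_q(A, q):
    """Image mod q of the semigroup generated by all products of two generators.

    ASSUMPTION (not taken from this section): the generators in (2.26)
    are the matrices (0 1; 1 a) with 1 <= a <= A.
    """
    gens = [(0, 1, 1, a) for a in range(1, A + 1)]
    pair_gens = {mat_mul(g, h, q) for g, h in product(gens, repeat=2)}
    seen = set(pair_gens)
    frontier = list(pair_gens)
    while frontier:                      # closure under right multiplication
        m = frontier.pop()
        for g in pair_gens:
            new = mat_mul(m, g, q)
            if new not in seen:
                seen.add(new)
                frontier.append(new)
    return seen

def sl2_order(p):
    """Order of SL2(Z/pZ) for a prime p."""
    return p * (p * p - 1)

for p in (5, 7):
    print(p, len(reduction_mod_q(2, p)), sl2_order(p))
```

Under this assumption and with A = 2, the two counts agree for p = 5 and p = 7, so these primes are unramified in the sense above.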

We now leave Zaremba’s problem and return to sketch a proof of Theorem Z in §5.

3. Integral Apollonian gaskets

Apollonius of Perga (ca. 262–190 BCE) wrote a two-volume book on tangencies, solving in every conceivable configuration the following general problem: Given three circles in the plane, any of which may have radius zero (a point) or infinity (a line), construct a circle tangent to the given ones. The volumes were lost but the statements survived via a survey of the work by Pappus. In the special case when the given three circles are themselves mutually tangent with disjoint points of tangency (Figure 6a), Apollonius proved that

(3.1) there are exactly two solutions

to his problem (Figure 6b). Adding these new circles to the configuration, one has many other triples of tangent circles, and Apollonius’s construction can be applied to them (Figure 6c). Iterating in this way ad infinitum, as apparently was first done in Leibniz’s notebook, gives rise to a circle packing, the closure of which has become known in the last century as an Apollonian gasket. We restrict our discussion henceforth to bounded gaskets, such as that illustrated in Figure 1; there the number shown inside a circle is its curvature, that is, one over its radius. Such pictures have received considerable attention recently; see e.g. [LMW02, GLM03, GLM05, GLM06a, GLM06b, EL07, Sar07, Sar08, BGS10, KO11, Oh10, BF11, Sar11, Fuc11, FS11, OS12, Vin12, LO12, BK12]. We will focus our discussion on the following two problems.

Figure 6. (a) Three mutually tangent circles. (b) Two more tangent circles. (c) Six more tangent circles.


(1) The counting problem: For a fixed gasket G, how quickly do the circles shrink, or alternatively, how many circles are there in G with curvature bounded by a growing parameter T?

(2) The local-global problem: Suppose G is furthermore integral, meaning that its circles all have integer curvatures, such as the gasket in Figure 1. How many distinct integers appear up to a growing parameter N? That is, count curvatures up to N, but without multiplicity.

Problem (2) does not yet look like a local-global question, but will soon turn into one. We first address problem (1) in more detail.

3.1. The counting problem.

3.1.1. Preliminaries. Some notation: for a typical circle C in a fixed bounded gasket G, let r(C) be its radius and

κ(C) = 1/r(C)

its curvature. Let

(3.2) NG (T ) := #{C ∈ G : κ(C) < T}

be the desired counting function. To study this quantity, one might introduce an “L-function”:

(3.3) $L_G(s) := \sum_{C \in G} \frac{1}{\kappa(C)^s} = \sum_{C \in G} r(C)^s.$

Since the sum of the areas of inside circles in G yields the area of the bounding circle, the series $L_G$ converges for Re(s) ≥ 2. It has some abscissa of convergence δ, meaning $L_G$ converges for Re(s) > δ and diverges for Re(s) < δ. Boyd [Boy73] proved that this abscissa δ is none other than the Hausdorff dimension of the gasket G, as should not be too surprising, comparing (3.3) with the definition (see (2.11)). In fact, Apollonian gaskets are rigid, in the sense that one can be mapped to any other by Möbius transformations. The latter are conformal (angle-preserving) motions of the complex plane, sending z → (az + b)/(cz + d), ad − bc = 1. Hence δ is a universal constant; McMullen [McM98] estimates that

(3.4) δ = 1.30568 · · · .

From such considerations, Boyd [Boy82] was able to conclude that

$\frac{\log N_G(T)}{\log T} \to \delta,$

as T → ∞. To refine this crude estimate to an asymptotic formula for $N_G(T)$, the author and Oh [KO11] established a “spectral interpretation” for $L_G$, proving

(3.5) $N_G(T) \sim c \cdot T^{\delta},$

for some c = c(G) > 0, as T → ∞. (This asymptotic was recently refined further in Vinogradov’s thesis [Vin12] and independently by Lee and Oh [LO12], giving lower order error terms.) The remainder of this subsection is devoted to explaining this spectral interpretation and highlighting some of the ideas going into the proof of (3.5).


Figure 7. Generation from a root quadruple.

3.1.2. Root quadruples and generation by reflection. It is easy to see [GLM03, p. 14] that each such gasket G contains a root configuration C = C(G) := (C1, C2, C3, C4) of four largest mutually tangent circles in G. Let

(3.6) $v_0 = v_0(G) = (\kappa_1, \kappa_2, \kappa_3, \kappa_4)^t$

with κj = κ(Cj) be the root quadruple of corresponding curvatures. The bounding circle, being internally tangent to the others, is given the opposite orientation to make all interiors disjoint; this is accounted for by giving it negative curvature. For example, in Figure 1, the root quadruple is

(3.7) $v_0 = (-10, 18, 23, 27)^t,$

where the bounding circle has radius 1/10.

Three tangent circles, say C1, C2, C3, have three points of tangency, and they determine a dual circle $\tilde{C}_4$ passing through these points; see Figure 7. Thus the root configuration C determines a dual configuration $\tilde{C} = (\tilde{C}_1, \tilde{C}_2, \tilde{C}_3, \tilde{C}_4)$ of four mutually tangent circles, orthogonal to those in C; see Figure 8. Reflection through $\tilde{C}_4$ fixes C1, C2, and C3, and sends C4 to $C_4'$, the other solution to Apollonius’s problem (3.1); see Figure 7. Starting with the root configuration, repeated reflections through the dual circles give the whole circle packing.

3.1.3. Hyperbolic space and the group A. Following Poincaré, we extend these circle reflections to the hyperbolic upper half-space,

(3.8) H3 := {(x1, x2, y) : x1, x2 ∈ R, y > 0},

replacing the action of the dual circle $\tilde{C}_j$ by a reflection through a (hemi)sphere sj whose equator is $\tilde{C}_j$ (with j = 1, . . . , 4). We abuse notation, writing sj for both the hemisphere and the conformal map reflecting through sj. The group

(3.9) A := 〈s1, s2, s3, s4〉 < Isom(H3),

generated by these reflections, acts discretely on H3; it is a so-called Schottky group, in that the four generating spheres have disjoint interiors.

The A-orbit of any fixed basepoint p0 ∈ H3 has a limit set in the boundary ∂H3, which is easily seen to be the original gasket; see Figure 9. A fundamental domain for an action is a region

(3.10) Ω ⊂ H3,

so that any point in H3 can be sent to Ω in an essentially unique way; for the action of A, one can take Ω to be the exterior of the four hemispheres.

Figure 8. Root and dual configurations.

To see this, observe that if a point p = (x1, x2, y) ∈ H3 is inside one of the spheres sj, then its reflection sj(p) is outside of sj and has a strictly larger y-value. This does not guarantee that sj(p) is outside all of the other spheres, but if it is inside some sk, then reflection through sk will again have even higher y-value. This procedure must halt after finitely many iterations, since the only limit points of A are in the boundary ∂H3, where y = 0. And it halts only when the image is outside of the four geodesic hemispheres. Uniqueness follows since any reflection sj takes a point in Ω to a point inside sj, that is, not in Ω.
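The reduction procedure just described is easy to simulate. The Python sketch below is a hedged illustration: the hemispheres are hypothetical (three mutually tangent unit circles, not the dual circles of the gasket in Figure 1), and the function names are ours. Reflection through a hemisphere is implemented as inversion in the sphere of the same centre and radius, which multiplies the y-coordinate by a factor larger than 1 whenever the point lies inside.

```python
import math

def reflect(p, center, radius):
    """Inversion of p = (x1, x2, y) in the hemisphere over the circle (center, radius)."""
    x1, x2, y = p
    c1, c2 = center
    d2 = (x1 - c1) ** 2 + (x2 - c2) ** 2 + y ** 2
    s = radius ** 2 / d2          # s > 1 exactly when p is inside the hemisphere
    return (c1 + s * (x1 - c1), c2 + s * (x2 - c2), s * y)

def is_inside(p, center, radius):
    x1, x2, y = p
    c1, c2 = center
    return (x1 - c1) ** 2 + (x2 - c2) ** 2 + y ** 2 < radius ** 2

def reduce_to_domain(p, hemispheres, max_steps=10_000):
    """Reflect p until it lies outside every hemisphere; return (point, word)."""
    word = []
    for _ in range(max_steps):
        for j, (center, radius) in enumerate(hemispheres):
            if is_inside(p, center, radius):
                p = reflect(p, center, radius)
                word.append(j)
                break
        else:                     # no hemisphere contains p: we are in the domain
            return p, word
    raise RuntimeError("did not terminate within max_steps")

# HYPOTHETICAL configuration: three mutually tangent unit circles in the plane.
HEMISPHERES = [((0.0, 0.0), 1.0), ((2.0, 0.0), 1.0), ((1.0, math.sqrt(3.0)), 1.0)]

point, word = reduce_to_domain((0.1, 0.0, 0.01), HEMISPHERES)
print(point, word)
```

The returned word records which reflections were applied, in order; the y-coordinate of the point increases at every step, which is the mechanism behind termination in the argument above.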

Two facts are evident from the above: first of all, A is geometrically finite, meaning it has a fundamental domain bounded by a finite number (here it is four) of geodesic² hemispheres; on the other hand, A has infinite covolume, that is, any fundamental domain has infinite volume with respect to the hyperbolic measure

$y^{-3}\, dx_1\, dx_2\, dy$

in the coordinates (3.8). Note moreover that A has the structure of a Coxeter group, being free save the relations $s_j^2 = I$ for the generators. It is also the symmetry group of all Möbius transformations fixing G.

² A geodesic in hyperbolic space is a straight vertical line or a semicircle orthogonal to the boundary ∂H3.

Figure 9. Poincaré extension: an A-orbit in H3.

3.1.4. Descartes’ Circle Theorem and integral gaskets. Next we need an observation attributed to Descartes in the year 1643 ([Des01, pp. 37–50], though his proof had a gap [Cox68]), that a quadruple $v = (\kappa_1, \kappa_2, \kappa_3, \kappa_4)^t$ of signed curvatures of four mutually tangent circles lies on the cone

(3.11) Q(v) = 0,

where Q is the so-called “Descartes quadratic form”

(3.12) $Q(v) := 2(\kappa_1^2 + \kappa_2^2 + \kappa_3^2 + \kappa_4^2) - (\kappa_1 + \kappa_2 + \kappa_3 + \kappa_4)^2.$

By a real linear change of variables, Q can be diagonalized to the form

$x^2 + y^2 + z^2 - w^2,$

that is, it has signature (3, 1). Arguably the most beautiful formulation of Descartes’ Theorem (rediscovered on many separate occasions) is the following excerpt from Soddy’s 1936 Nature poem [Sod36]:

Four circles to the kissing come. / The smaller are the bender. / The bend is just the inverse of / The distance from the center. / Though their intrigue left Euclid dumb / There’s now no need for rule of thumb. / Since zero bend’s a dead straight line / And concave bends have minus sign, / The sum of the squares of all four bends / Is half the square of their sum.

If κ1, κ2 and κ3 are given, then (3.11) is a quadratic equation in κ4 with two solutions, κ4 and $\kappa_4'$, say; this is an algebraic proof of Apollonius’s theorem (3.1). It is then an elementary exercise to see that

$\kappa_4 + \kappa_4' = 2(\kappa_1 + \kappa_2 + \kappa_3).$
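As a quick check of this exercise, one can solve (3.11) for the fourth curvature explicitly: expanding Q gives the roots κ1 + κ2 + κ3 ± 2√(κ1κ2 + κ2κ3 + κ3κ1). The short Python sketch below (function names are ours, and it assumes the roots are integers) applies this to three circles of the root quadruple (3.7).

```python
from math import isqrt

def descartes_form(v):
    """Descartes quadratic form Q(v) = 2*sum(k^2) - (sum k)^2, as in (3.12)."""
    return 2 * sum(k * k for k in v) - sum(v) ** 2

def fourth_curvatures(k1, k2, k3):
    """Both roots k4 of Q(k1, k2, k3, k4) = 0, assuming they are integers."""
    s = k1 + k2 + k3
    disc = k1 * k2 + k2 * k3 + k3 * k1      # roots are s -/+ 2*sqrt(disc)
    root = isqrt(disc)
    if root * root != disc:
        raise ValueError("roots are not integers for this triple")
    return s - 2 * root, s + 2 * root

k4, k4_prime = fourth_curvatures(-10, 18, 23)
print(k4, k4_prime)                          # the two solutions to Apollonius's problem
print(k4 + k4_prime == 2 * (-10 + 18 + 23))  # the sum relation
```

For the triple (−10, 18, 23) the two roots are 27, the fourth entry of the root quadruple, and 35, the curvature of the circle obtained by one reflection.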

In other words, if the quadruple $(\kappa_1, \kappa_2, \kappa_3, \kappa_4)^t$ is given, then one obtains the quadruple with κ4 replaced by $\kappa_4'$ via a linear action:

$\begin{pmatrix} 1 & & & \\ & 1 & & \\ & & 1 & \\ 2 & 2 & 2 & -1 \end{pmatrix} \cdot \begin{pmatrix} \kappa_1 \\ \kappa_2 \\ \kappa_3 \\ \kappa_4 \end{pmatrix} = \begin{pmatrix} \kappa_1 \\ \kappa_2 \\ \kappa_3 \\ \kappa_4' \end{pmatrix}.$

Hence we have given an algebraic realization to the geometric action of $\tilde{C}_4$ (or s4) on the root quadruple; see again Figure 7. Call the above 4 × 4 matrix S4. Of course one could also send other κj to $\kappa_j'$ keeping the three complementary curvatures fixed, via the matrices

(3.13) $S_1 = \begin{pmatrix} -1 & 2 & 2 & 2 \\ & 1 & & \\ & & 1 & \\ & & & 1 \end{pmatrix},\quad S_2 = \begin{pmatrix} 1 & & & \\ 2 & -1 & 2 & 2 \\ & & 1 & \\ & & & 1 \end{pmatrix},\quad S_3 = \begin{pmatrix} 1 & & & \\ & 1 & & \\ 2 & 2 & -1 & 2 \\ & & & 1 \end{pmatrix}.$

Moreover one can iterate these actions, so we introduce the so-called Apollonian group Γ, isomorphic to A, generated by the Sj,

(3.14) Γ := 〈S1, S2, S3, S4〉.

Then the orbit

(3.15) O := Γ · v0

of the root quadruple v0 under the Apollonian group Γ consists of all quadruples corresponding to curvatures of four mutually tangent circles in the gasket G. We can now explain the integrality of all curvatures in Figure 1: the group Γ has only integer matrices, so if the root quadruple v0 (or for that matter any four curvatures of mutually tangent circles in G) is integral, then all curvatures in G are integers! This fact seems to have been first observed by Soddy [Sod37].
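The generators and the first few levels of the orbit can be written out directly. The following Python sketch (helper names are ours) encodes S1, ..., S4 as integer matrices, with the omitted entries equal to zero, and lists the quadruples γ · v0 for reduced words γ of bounded length, starting from the root quadruple (3.7).

```python
# The four generators of the Apollonian group, as in (3.13) and the matrix S4.
S = [
    [[-1, 2, 2, 2], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],   # S1
    [[1, 0, 0, 0], [2, -1, 2, 2], [0, 0, 1, 0], [0, 0, 0, 1]],   # S2
    [[1, 0, 0, 0], [0, 1, 0, 0], [2, 2, -1, 2], [0, 0, 0, 1]],   # S3
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [2, 2, 2, -1]],   # S4
]

def act(m, v):
    """Left multiplication of the column vector v by the 4x4 matrix m."""
    return tuple(sum(m[i][j] * v[j] for j in range(4)) for i in range(4))

def orbit_words(v0, depth):
    """Quadruples gamma * v0 for all reduced words gamma of length <= depth."""
    level = [(tuple(v0), None)]
    out = [tuple(v0)]
    for _ in range(depth):
        nxt = []
        for v, last in level:
            for j in range(4):
                if j != last:            # reduced word: never repeat a generator
                    w = act(S[j], v)
                    nxt.append((w, j))
                    out.append(w)
        level = nxt
    return out

v0 = (-10, 18, 23, 27)
for quad in orbit_words(v0, 1):
    print(quad)
```

Since only integer additions and multiplications occur, every quadruple produced from an integral v0 is integral, which is the observation made above.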

3.1.5. Reformulating the counting statement and the thin orbit. We note, moreover, that starting with v0, any new circle generated by a reflection is the smallest in its configuration, and hence has largest curvature. That is, for v = γ · v0 ∈ O, writing γ ∈ Γ as a reduced word in the generators $\gamma = S_{i_k} \cdots S_{i_1}$, the last multiplication by $S_{i_k}$ changes one entry, which is the largest entry in v. Hence, setting ‖v‖∞ to be the max-norm and for T large, we can rewrite $N_G(T)$ in (3.2) as

(3.16) $N_G(T) = 4 + \#\{v \in O : v \neq v_0,\ \|v\|_\infty < T\}.$

Here the first “4” accounts for the root quadruple v0.
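Formula (3.16) translates into a simple tree search: each reduced word contributes one new circle, whose curvature is the entry just changed, and a branch can be abandoned as soon as that entry reaches T, since curvatures only grow further down the branch. The Python sketch below is one way to organize this (the function name is ours); it replaces the entry in position j by 2·(sum of the other three) minus itself, which is the action of Sj.

```python
import math

def count_circles(v0, T):
    """N_G(T): number of circles of curvature < T, following (3.16)."""
    count = sum(1 for k in v0 if k < T)      # circles of the root quadruple
    stack = [(tuple(v0), None)]
    while stack:
        v, last = stack.pop()
        total = sum(v)
        for j in range(4):
            if j == last:                    # reduced words only
                continue
            new = 2 * (total - v[j]) - v[j]  # entry j of S_j * v
            if new < T:                      # otherwise prune this branch
                count += 1
                stack.append((v[:j] + (new,) + v[j + 1:], j))
    return count

v0 = (-10, 18, 23, 27)
for T in (10**2, 10**3, 10**4):
    n = count_circles(v0, T)
    print(T, n, math.log(n) / math.log(T))   # crude exponent, to compare with delta
```

The last column is only a rough numerical proxy for the exponent δ in (3.5); the constant c(G) makes the convergence of log N / log T slow.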

We have thus converted the circle counting problem into something seemingly more tractable: the counting problem for a Γ-orbit. That said, we clearly need a better understanding of the group Γ. Returning to the Descartes form Q in (3.12), we have by construction (and one can check directly) that for each j = 1, . . . , 4,

Q(Sj · v) = Q(v),

for any v. That is, each generator Sj lies in the so-called orthogonal group preserving the quadratic form Q,

$O_Q := \{g \in GL_4 : Q(g \cdot v) = Q(v),\ \forall v\}.$

Hence Γ also sits inside $O_Q$, and moreover inside $O_Q(\mathbb{Z})$, the group of matrices in $O_Q$ with integer entries. The latter is a well understood algebraic group, again meaning that any solution to a certain set of polynomial equations gives an element in $O_Q$, and vice-versa. But Γ is quite a mysterious group, in particular having infinite index in $O_Q(\mathbb{Z})$ (this fact is equivalent to A having infinite covolume). It is also worth noting here that the general membership problem in a group is known to be undecidable [Nov55], so presenting a matrix group via its generators leaves much to be desired.³
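The invariance Q(Sj · v) = Q(v) is equivalent to the matrix identity Sjᵗ · G · Sj = G, where G is the Gram matrix of Q (diagonal entries 1, off-diagonal entries −1, so that Q(v) = vᵗ G v). The Python sketch below, with helper names of our choosing, verifies this for the four generators and also checks that each Sj is an involution.

```python
S = [
    [[-1, 2, 2, 2], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]],   # S1
    [[1, 0, 0, 0], [2, -1, 2, 2], [0, 0, 1, 0], [0, 0, 0, 1]],   # S2
    [[1, 0, 0, 0], [0, 1, 0, 0], [2, 2, -1, 2], [0, 0, 0, 1]],   # S3
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [2, 2, 2, -1]],   # S4
]

# Gram matrix of the Descartes form: Q(v) = 2*sum(k_i^2) - (sum k_i)^2 = v^t G v.
G = [[1 if i == j else -1 for j in range(4)] for i in range(4)]
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def transpose(m):
    return [list(row) for row in zip(*m)]

def preserves_form(m):
    """True when m^t G m = G, i.e. Q(m v) = Q(v) for every v."""
    return matmul(matmul(transpose(m), G), m) == G

print([preserves_form(s) for s in S])
print([matmul(s, s) == IDENTITY for s in S])
```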

³ That said, for our particular group Γ, one can use a reduction algorithm to root quadruples to determine membership.

Just as in Zaremba’s problem, we can now again call this orbit O thin; indeed, for the counting problem with Γ replaced by the full group $O_Q(\mathbb{Z})$ (which is an example of what is called an “arithmetic lattice”), standard arguments in automorphic forms or ergodic theory [DRS93, EM93] show that

(3.17) $\#\{v \in O_Q(\mathbb{Z}) \cdot v_0 : \|v\|_\infty < T\} \sim c\, T^2,$ as T → ∞,

for some c > 0. So comparing (3.17) to (3.16), (3.5) and (3.4), where the power drops from $T^2$ to $T^\delta$ with δ < 2, we see that the Γ orbit is quite degenerate, having many fewer points.

3.1.6. Sketch of the counting statement. Finally, we explain the aforementioned spectral interpretation by first giving an analogous elementary example of a counting statement in another discrete group: the integers. Let us spectrally count the number of integers of size at most T,

$N_{\mathbb{Z}}(T) := \#\{n \in \mathbb{Z} : |n| < T\}.$

Of course this is a trivial problem,

(3.18) $N_{\mathbb{Z}}(T) = \lfloor 2T + 1 \rfloor = 2T + O(1),$

but it will be instructive to analyze it by harmonic analysis. To this end, let

$f(x) := 1_{\{|x|<1\}},$

where $1_{\{\cdot\}}$ is the indicator function. Scale f to

$f_T(x) := f(x/T) = 1_{\{|x|<T\}},$

and periodize it with respect to the discrete group Z,

(3.19) $F_T(x) := \sum_{n \in \mathbb{Z}} f_T(n + x).$

Then we have

(3.20) $F_T(0) = \sum_{n \in \mathbb{Z}} 1_{\{|n|<T\}} = N_{\mathbb{Z}}(T).$

By construction, $F_T(x) = F_T(x + 1)$, that is, it takes values on the circle $X := \mathbb{Z}\backslash\mathbb{R}$, and is square-integrable, $F_T \in L^2(X)$. The Laplace operator

$\Delta := -\operatorname{div} \circ \operatorname{grad} = -\frac{\partial^2}{\partial x^2}$

on smooth functions can be extended to act on the whole Hilbert space $L^2(X)$ and is self-adjoint and positive definite (by our choice of sign) with respect to the standard inner product

$\langle F, G \rangle = \int_X F(x)\, \overline{G(x)}\, dx.$

(Proof: partial integration.) Its spectrum Spec(Δ) is just the set of its eigenvalues, with multiplicity. Elementary Fourier analysis shows that eigenfunctions of Δ invariant under Z-translations are scalar multiples of

$\varphi_m : x \mapsto e^{2\pi i m x}$

for m ∈ Z. This function has Laplace eigenvalue

$\lambda_m = 4\pi^2 m^2,$

and hence these numbers λm completely exhaust the spectrum (they have multiplicity two, except when m = 0). Expanding spectrally gives

(3.21) $F_T(x) = \sum_{\lambda_m \in \mathrm{Spec}(\Delta)} \langle F_T, \varphi_m \rangle\, \varphi_m(x),$

where equality is in the $L^2$-sense. (Note that the ϕm are already scaled to have unit $L^2$-norm.) The bottom of the spectrum λ0 = 0 corresponds to the constant function ϕ0(x) = 1, and contributes the entire “main term” in (3.18) to (3.21):

$\langle F_T, \varphi_0 \rangle \cdot \varphi_0 = \left( \int_{\mathbb{Z}\backslash\mathbb{R}} \sum_{n \in \mathbb{Z}} f_T(n + x) \cdot 1\, dx \right) \cdot 1 = T \int_{\mathbb{R}} f(x)\, dx = 2T,$

after inserting (3.19), a change of variables, and “unfolding” $\int_{\mathbb{Z}\backslash\mathbb{R}} \sum_{\mathbb{Z}}$ to just $\int_{\mathbb{R}}$. That said, the equality (3.21) is in the $L^2$ sense, not pointwise (we cannot evaluate (3.21) at the point x = 0, as needed in (3.20)). Moreover, the rest of the spectrum in (3.21), if bounded in absolute value,

$\sum_{\substack{\lambda_m \in \mathrm{Spec}(\Delta) \\ \lambda_m \neq \lambda_0}} \left| \langle F_T, \varphi_m \rangle\, \varphi_m \right|,$

does not converge, the mth term being of size 1/m. (Exercise.) But there are standard methods (smoothing and later unsmoothing) which overcome these technical irritants.
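The toy example can be explored numerically. In the Python sketch below (function names are ours), the coefficient 〈F_T, ϕm〉 is computed by unfolding, giving sin(2πmT)/(πm) for m ≠ 0, which exhibits the 1/m decay. The sketch assumes T is not an integer, so that F_T is continuous at x = 0; in that case classical Fourier theory gives conditional convergence of the symmetric partial sums at x = 0, even though the series is not absolutely convergent.

```python
import math

def count_integers(T):
    """N_Z(T) = #{n in Z : |n| < T}, by brute force."""
    bound = math.ceil(T)
    return sum(1 for n in range(-bound, bound + 1) if abs(n) < T)

def fourier_coefficient(T, m):
    """<F_T, phi_m>: the integral of 1_{|x|<T} * exp(-2 pi i m x) over R."""
    if m == 0:
        return 2.0 * T                       # the main term
    return math.sin(2 * math.pi * m * T) / (math.pi * m)

def symmetric_partial_sum(T, M):
    """Sum of <F_T, phi_m> * phi_m(0) over |m| <= M."""
    return sum(fourier_coefficient(T, m) for m in range(-M, M + 1))

T = 3.3                                      # ASSUMPTION: T is not an integer
print(count_integers(T), fourier_coefficient(T, 0))
for M in (10, 100, 1000, 10000):
    print(M, symmetric_partial_sum(T, M))
```

The m = 0 term alone gives 2T, and the remaining terms supply the bounded correction; the slow, oscillating convergence in M is the numerical face of the "technical irritants" that smoothing is designed to remove.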

A version of the above works with the Apollonian group Γ in place of Z, once one overcomes a number of further technical obstructions. The reader may wish to omit the following paragraph on the first pass; it is not essential to the sequel.

We now need non-Abelian harmonic analysis on the space $L^2(X)$ with

X := A\H3,

the hyperbolic 3-fold in Figure 9. The (positive definite) hyperbolic Laplacian is

$\Delta = -y^2 \left( \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial y^2} \right) + y \frac{\partial}{\partial y}$

in the coordinates (3.8). The spectrum in this setting, as studied by Lax and Phillips [LP82], has both continuous and discrete components (though only a finite number of the latter). As X has infinite volume, the constant function is no longer square-integrable, and the bottom eigenvalue λ0 is strictly positive. A beautiful result in Patterson–Sullivan theory [Pat76, Sul84] relates this eigenvalue to the Hausdorff dimension of the limiting gasket G, namely

λ0 = δ(2 − δ).

The corresponding base eigenfunction ϕ0 replaces the role of the constant function. Here we have used crucially that A is geometrically finite and that δ > 1; see (3.4). Even this is insufficient: because of the non-Euclidean norm ‖ · ‖∞ in (3.16), one must work not on X but its unit tangent bundle $Y := T^1(X)$. And moreover we do not know how to handle the continuous spectrum directly, applying instead general results in the representation theory of semisimple groups about ergodic properties of flows on Y. At this point, we will not say more about the proof, inviting the interested reader to consult the original references [KO11, Vin12, LO12].


3.2. The local-global problem. Assume now that G is not only bounded but also integral (recall that this means it has only integer curvatures). If the curvatures are all even, say, then we can stretch the gasket by a factor of two, doubling the radii and halving the (still integral) curvatures. In this way, we can rescale an integral gasket to make it primitive, meaning that there is no number other than ±1 dividing all of the curvatures. In fact, all of the salient features of the problem persist if we fix G to be the packing shown in Figure 1, and we do so henceforth. Recall the problem we wish to now address is, How many curvatures are there up to some parameter N, counting without multiplicity, that is, counting only distinct curvatures?

First some more notation. Let K = K(G) be the set of all curvatures of circles C in the gasket G,

K := {n ∈ Z : ∃C ∈ G with κ(C) = n},

and call n represented if n ∈ K. Staring at Figure 1 for a moment or two, one might observe that every curvature in our G is

(3.22) ≡ 2, 3, 6, 11, 14, 15, 18, or 23 (mod 24).

These are the local obstructions for G; accordingly, we call n admissible if it satisfies (3.22), and we set A = A(G) to be the set of admissible numbers. In general, one calls n admissible if, as before, it is everywhere locally represented,

(3.23) n ∈ K (mod q), ∀q ≥ 1.

It cannot be the case that A = K, since, for example, n = 15 is admissible, but a circle of radius 1/15 does not appear in our gasket. Nevertheless, as in Zaremba’s problem, we have the following

Conjecture A. Every sufficiently large admissible number is the curvature of some circle in G.
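The congruence condition (3.22), and the gap between admissible and represented numbers, can be observed experimentally. The Python sketch below (function and variable names are ours) collects the distinct curvatures below a bound for the root quadruple (3.7), reduces them mod 24, and lists the admissible numbers in that range which are not represented.

```python
def curvatures_below(v0, T):
    """Set of distinct curvatures < T in the gasket with root quadruple v0."""
    found = {k for k in v0 if k < T}
    stack = [(tuple(v0), None)]
    while stack:
        v, last = stack.pop()
        total = sum(v)
        for j in range(4):
            if j == last:
                continue
            new = 2 * (total - v[j]) - v[j]      # entry j of S_j * v
            if new < T:
                found.add(new)
                stack.append((v[:j] + (new,) + v[j + 1:], j))
    return found

ADMISSIBLE = {2, 3, 6, 11, 14, 15, 18, 23}       # residues mod 24 from (3.22)
BOUND = 1000                                     # arbitrary cutoff for the experiment

K = curvatures_below((-10, 18, 23, 27), BOUND)
residues = {k % 24 for k in K}
missing = sorted(n for n in range(1, BOUND)
                 if n % 24 in ADMISSIBLE and n not in K)

print(sorted(residues))
print(len(missing), missing[:10])
```

The list of missing admissible numbers starts with the small values below 18, among them n = 15; Conjecture A asserts that this list stays finite as the cutoff grows.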

This conjecture is stated by Graham, Lagarias, Mallows, Wilks, and Yan [GLM03, p. 37] in the first of a lovely series of papers on Apollonian gaskets and generalizations. They observe empirically that congruence obstructions for any integral gasket seem to be to the modulus 24, and this is completely clarified (as we explain below) by Fuchs [Fuc11] in her thesis. Further convincing numerical evidence toward the conjecture is given in Fuchs and Sanden [FS11]. Here is some recent progress.

Theorem A (Bourgain and Kontorovich, 2012 [BK12]). Almost every admissible number is the curvature of some circle in G.

Again, “almost every” is in the sense of density, that

(3.24) $\frac{\#(K \cap [1, N])}{\#(A \cap [1, N])} \to 1,$

as N → ∞. It follows from the congruence restrictions (3.22) that for N large, #(A ∩ [1, N]) is about N/3 (there are eight admissible residue classes mod 24), so (3.24) is equivalent to

$\#(K \cap [1, N]) \sim \frac{N}{3}.$


Some history on this problem: Graham et al. [GLM03] already made the first progress, proving that

(3.25) $\#(K \cap [1, N]) \gg N^{1/2}.$

Then Sarnak [Sar07] showed

(3.26) $\#(K \cap [1, N]) \gg \frac{N}{\sqrt{\log N}},$

before Bourgain and Fuchs [BF11] settled the so-called Positive Density Conjecture, that

(3.27) $\#(K \cap [1, N]) \gg N.$

A key observation in the proof of Theorem A is that the problem is nearly identical to Zaremba’s, in the following sense. Recall from (3.15) that the orbit O = Γ · v0 of the root quadruple v0 under the Apollonian group Γ contains all quadruples of curvatures, and in particular its entries consist of all curvatures in G. Hence the set K of all curvatures is simply the finite union of sets of the form

(3.28) 〈w0, O〉 = 〈w0, Γ · v0〉,

as w0 ranges through the standard basis vectors

$e_1 = (1, 0, 0, 0)^t, \ldots, e_4 = (0, 0, 0, 1)^t,$

each picking off one entry of O. A heuristic analogy between Zaremba and the Apollonian problem is actually already given in [GLM03, p. 37], but it is crucial for us that both problems are exactly of the form (3.28); compare to (2.29). That is, n is represented if and only if there is a γ in the Apollonian group Γ and some w0 ∈ {e1, . . . , e4} so that

(3.29) n = 〈w0, γ · v0〉.

Before saying more about the proof of Theorem A, we first discuss admissibility in greater detail.

3.2.1. Local obstructions. Through (3.28), the admissibility condition (3.23) is again reduced to the study of the projection of Γ modulo q. An important feature here is that, as in the Zaremba case, the group Γ is Zariski dense in $O_Q$. Recall that this means if P(γ) is a polynomial in the entries of a 4 × 4 matrix γ which vanishes for every γ ∈ Γ, then P also vanishes on all complex matrices in $O_Q$.

We would like again to exploit strong approximation, but neither $O_Q$ nor its orientation-preserving subgroup $SO_Q := O_Q \cap SL_4$ has this property (being not even connected). But there is a standard method of applying strong approximation anyway, by first passing to a certain cover, as we now describe.

From the theory of rational quadratic forms [Cas78], special orthogonal groups are covered by so-called spin groups, and it is a pleasant accident that, since Q has signature (3, 1), the spin group of $SO_Q(\mathbb{R})$ is isomorphic to $SL_2(\mathbb{C})$. Let us explain this covering map. The formulae are nicer if we first change variables (over ℚ) from our quadratic form Q to the equivalent form

$\tilde{Q}(x, y, z, w) := xw + y^2 + z^2.$

Observe that the matrix

$M := \begin{pmatrix} -x & y + iz \\ y - iz & w \end{pmatrix}$

has determinant equal to $-\tilde{Q}$ and is Hermitian, that is, fixed under transpose-conjugation. The group $SL_2(\mathbb{C})$, consisting of 2 × 2 complex matrices of determinant one, acts on M by

$SL_2(\mathbb{C}) \ni g : M \mapsto g \cdot M \cdot \bar{g}^{t} =: M' = \begin{pmatrix} -x' & y' + iz' \\ y' - iz' & w' \end{pmatrix},$

with M′ also Hermitian and of determinant $-\tilde{Q}$. Then it is easy to see that $(x', y', z', w')^t$ is a linear change of variables from $(x, y, z, w)^t$, via left multiplication by a matrix whose entries are quadratic in the entries of g. Explicitly, if

(3.30) $g = \begin{pmatrix} a + \alpha i & b + \beta i \\ c + \gamma i & d + \delta i \end{pmatrix},$

then the change of variables matrix is

(3.31) $\frac{1}{|\det(g)|^2} \begin{pmatrix} a^2 + \alpha^2 & 2(ac + \alpha\gamma) & 2(c\alpha - a\gamma) & -c^2 - \gamma^2 \\ ab + \alpha\beta & bc + ad + \beta\gamma + \alpha\delta & d\alpha + c\beta - b\gamma - a\delta & -cd - \gamma\delta \\ a\beta - b\alpha & -d\alpha + c\beta - b\gamma + a\delta & -bc + ad - \beta\gamma + \alpha\delta & d\gamma - c\delta \\ -b^2 - \beta^2 & -2(bd + \beta\delta) & 2(b\delta - d\beta) & d^2 + \delta^2 \end{pmatrix}.$

Let $\tilde{\rho}$ be the (rational) map from $SL_2(\mathbb{C})$ to $GL_4(\mathbb{R})$, sending (3.30) to (3.31); then by construction (again one can verify directly) the image is in $SO_{\tilde{Q}}(\mathbb{R})$. (Some minor technical points: Being quadratic in the entries, $\tilde{\rho}$ is a double cover, with ±I having the same image. Moreover, $SL_2(\mathbb{C})$ is connected while $SO_{\tilde{Q}}(\mathbb{R})$ has two connected components, so $\tilde{\rho}$ only maps onto the identity component $SO^{\circ}_{\tilde{Q}}$.) Then changing variables from $\tilde{Q}$ back to the Descartes form Q by a conjugation, one gets the desired map

$\rho : SL_2(\mathbb{C}) \to SO_Q(\mathbb{R}).$
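The mechanism behind the covering map can be checked numerically. The Python sketch below (all names are ours) builds the Hermitian matrix M from (x, y, z, w), applies g · M · ḡᵗ directly from the definition rather than transcribing the entries of (3.31), reads off the new coordinates, and confirms that the form xw + y² + z² is unchanged when det g = 1.

```python
def hermitian_from(v):
    """The matrix M = (-x, y+iz; y-iz, w) attached to v = (x, y, z, w)."""
    x, y, z, w = v
    return [[complex(-x, 0), complex(y, z)],
            [complex(y, -z), complex(w, 0)]]

def mul2(a, b):
    """Product of two 2x2 complex matrices."""
    return [[a[i][0] * b[0][j] + a[i][1] * b[1][j] for j in range(2)]
            for i in range(2)]

def conj_transpose(g):
    return [[complex(g[j][i]).conjugate() for j in range(2)] for i in range(2)]

def act(g, v):
    """Coordinates (x', y', z', w') of g * M * conj(g)^t."""
    m = mul2(mul2(g, hermitian_from(v)), conj_transpose(g))
    return (-m[0][0].real, m[0][1].real, m[0][1].imag, m[1][1].real)

def q_tilde(v):
    """The form xw + y^2 + z^2."""
    x, y, z, w = v
    return x * w + y * y + z * z

v = (1, 2, 3, 4)
for g in ([[1, 2], [0, 1]], [[1 + 2j, -2], [-2, 1 - 2j]]):   # both have determinant 1
    v_new = act(g, v)
    print(v_new, q_tilde(v), q_tilde(v_new))
```

Invariance of the form follows from det(g · M · ḡᵗ) = |det g|² · det M, which is the reason the action lands in the orthogonal group of the form.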

It is straightforward then to compute the pullback of $\Gamma \cap SO_Q$ under ρ (see [GLM05, Fuc11]), the answer being the following

Lemma 3.32. There is⁴ a homomorphism $\rho : SL_2(\mathbb{C}) \to SO_Q(\mathbb{R})$ so that the group $\tilde{\Gamma} := \rho^{-1}(\Gamma \cap SO_Q)$ sits in $SL_2(\mathbb{Z}[i])$ and is generated by

(3.33) $\tilde{\Gamma} = \left\langle \pm\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix},\ \pm\begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix},\ \pm\begin{pmatrix} 1 + 2i & -2 \\ -2 & 1 - 2i \end{pmatrix} \right\rangle.$

Moreover, recalling the generators Sj for Γ in (3.13), one can arrange ρ so that $\rho : \begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix} \mapsto S_2S_3$, and $\rho : \begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix} \mapsto S_4S_3$.

⁴ And one can easily write it down explicitly: it is a conjugate of (3.31), but much messier and not particularly enlightening. We spare the reader.

In fact, we have just realized a conjugate of the group A (or rather its index-two orientation-preserving subgroup) explicitly in terms of matrices in PSL(2, C) ≅ Isom⁺(H3).

From here, one follows the strategy outlined in §2.2. Fuchs [Fuc11] proved an explicit version of strong approximation for $\tilde{\Gamma} < SL_2(\mathbb{Z}[i])$ (one considers reduction mod principal ideals (q)) via Goursat’s Lemma, some finite group theory, and other ingredients, enabling her to determine completely the reduction of Γ modulo any q, and hence explaining all local obstructions. The answer is that all primes other than 2 and 3 are unramified, meaning, as in §2.2, that for (q, 6) = 1,

$\Gamma \cap SO_Q \pmod{q} = SO_Q(\mathbb{Z}/q\mathbb{Z}).$

Recall again that the right-hand side above is a well understood group. And moreover, the prime 2 stabilizes (with the same meaning as §2.2) at the power e0(2) = 3, that is at 8, and the prime 3 stabilizes immediately at e0(3) = 1. Then reducing Γ modulo $2^3 \cdot 3 = 24$, one obtains some explicit finite group, and looking at all the values of (3.28) for the given root quadruple v0(G), one immediately sees all admissible residue classes.

3.2.2. Partial progress. Lemma 3.32 can already be quite useful; in particular, it easily implies (3.25) and (3.26), as follows.

The Apollonian group Γ contains the matrix S4S3, which by Lemma 3.32 is the image under ρ of $\begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}$. The latter (and hence the former) is a unipotent matrix, meaning that all its eigenvalues are equal to 1. These have the important property that they grow only polynomially under exponentiation; in particular, $\begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}^k = \begin{pmatrix} 1 & 0 \\ 2k & 1 \end{pmatrix}$, and one can check directly from the definitions (3.13) that

$(S_4S_3)^k = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 4k^2 - 2k & 4k^2 - 2k & 1 - 2k & 2k \\ 4k^2 + 2k & 4k^2 + 2k & -2k & 2k + 1 \end{pmatrix}.$

Put the above matrix into (3.29) with the root quadruple v0 for our fixed gasket from (3.7), and take w0 = e4, say. Then for any k ∈ Z, the number

(3.34) $\langle e_4, (S_4S_3)^k \cdot v_0 \rangle = 32k^2 + 24k + 27$

is represented. That is, the set of represented numbers contains the values of this quadratic polynomial. From this observation, made in [GLM03], it is immediate that (3.25) holds. Geometrically, these curvatures correspond to circles in the packing tangent to C1 and C2, since these are fixed by the corresponding reflections through $\tilde{C}_4$ and $\tilde{C}_3$. For example, the values k = −2, −1, 0, 1, 2 in (3.34) give curvatures 107, 35, 27, 83, 203, respectively. These are visible in Figure 1; they are all tangent to the circles of curvature −10 (the bounding circle) and 18, skipping every other such circle. Using w0 = e3 instead of e4 in (3.34) gives the polynomial $32k^2 - 8k + 23$, the values of which correspond to the skipped circles.
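The closed form for the powers of S4S3 and the two quadratic polynomials can be confirmed by direct multiplication. In the Python sketch below (helper names are ours), negative powers are computed with S3S4, which is the inverse of S4S3 because each Sj is an involution.

```python
S3 = [[1, 0, 0, 0], [0, 1, 0, 0], [2, 2, -1, 2], [0, 0, 0, 1]]
S4 = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [2, 2, 2, -1]]
V0 = (-10, 18, 23, 27)                       # root quadruple (3.7)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def act(m, v):
    return tuple(sum(m[i][j] * v[j] for j in range(4)) for i in range(4))

def s4s3_power(k):
    """(S4 S3)^k for any integer k, by repeated multiplication."""
    base = matmul(S4, S3) if k >= 0 else matmul(S3, S4)   # S3 S4 = (S4 S3)^(-1)
    result = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
    for _ in range(abs(k)):
        result = matmul(result, base)
    return result

def closed_form(k):
    """The matrix displayed above for (S4 S3)^k."""
    return [[1, 0, 0, 0],
            [0, 1, 0, 0],
            [4*k*k - 2*k, 4*k*k - 2*k, 1 - 2*k, 2*k],
            [4*k*k + 2*k, 4*k*k + 2*k, -2*k, 2*k + 1]]

for k in range(-2, 3):
    v = act(s4s3_power(k), V0)
    print(k, s4s3_power(k) == closed_form(k), v[3], v[2])
```

The last two columns are the fourth and third entries of (S4S3)ᵏ · v0, to be compared with 32k² + 24k + 27 and 32k² − 8k + 23.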

To prove (3.26), we make the following observation, due to Sarnak [Sar07]. It is well known that the matrices $\pm\begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix}$ and $\pm\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix}$ (which map under ρ to S4S3 and S2S3, respectively) generate the group

(3.35) $\Lambda(2) := \left\{ \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL_2(\mathbb{Z}) : \begin{array}{l} a \equiv d \equiv 1 \pmod 2 \\ b \equiv c \equiv 0 \pmod 2 \end{array} \right\}.$

This is the so-called level-2 principal congruence subgroup of SL2(Z). Hence by Lemma 3.32, the group Γ contains

(3.36) Ξ := 〈S2S3, S4S3〉 = ρ(Λ(2)).

The point is that Λ(2) is arithmetic, being defined in (3.35) by congruences. Then for any integer ℓ coprime to 2k, there is a matrix $\begin{pmatrix} * & * \\ 2k & \ell \end{pmatrix}$ in Λ(2). One can work out, with the same v0 and w0 as above, that

(3.37) $\left\langle e_4,\ \rho\begin{pmatrix} * & * \\ 2k & \ell \end{pmatrix} \cdot v_0 \right\rangle = 32k^2 + 24k\ell + 17\ell^2 + 10.$

For example, the choices (2k, ℓ) = (4, −3), (2, −1), (4, −1), and (6, −1) give curvatures 147, 35, 107, and 243, respectively, visible up the left side of Figure 1, all tangent to the bounding circle (since Ξ in (3.36) fixes C1). Observe also that setting ℓ = 1 in (3.37) recovers (3.34). In this way, Sarnak [Sar07] proved that the set K of represented numbers contains all primitive (meaning with 2k and ℓ coprime) values of the shifted binary quadratic form in (3.37). Note that the quadratic form has discriminant $24^2 - 4 \cdot 32 \cdot 17 = -1600$, and so (3.37) is definite, taking only positive values. The number of distinct primitive values of (3.37) up to N was determined by Landau [Lan08]: it is asymptotic to a constant times $N/\sqrt{\log N}$, thereby proving (3.26). A much more delicate and clever, but still “elementary” (no automorphic forms are harmed), argument goes into the proof of the Positive Density Conjecture (3.27), using an ensemble of such shifted binary quadratic forms. For Theorem A, one needs the theory of automorphic representations for the full Apollonian group, as hinted at the end of §3.1.6.
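The values of the shifted binary quadratic form in (3.37) are easy to tabulate. The Python sketch below (function names are ours) enumerates the distinct primitive values up to a bound N; since the form is positive definite with smallest eigenvalue larger than 10, it suffices to search |k|, |ℓ| ≤ √N.

```python
from math import gcd, isqrt

def shifted_form(k, l):
    """The right-hand side of (3.37)."""
    return 32*k*k + 24*k*l + 17*l*l + 10

def primitive_values(N):
    """Distinct values <= N of (3.37) over pairs with gcd(2k, l) = 1."""
    bound = isqrt(N) + 1
    values = set()
    for k in range(-bound, bound + 1):
        for l in range(-bound, bound + 1):
            if gcd(2 * k, l) == 1:
                value = shifted_form(k, l)
                if value <= N:
                    values.add(value)
    return values

print(24**2 - 4 * 32 * 17)                                  # discriminant of the form
print([shifted_form(k, l) for k, l in ((2, -3), (1, -1), (2, -1), (3, -1))])
print(sorted(primitive_values(300)))
```

The second line reproduces the four examples quoted above, with (2k, ℓ) = (4, −3), (2, −1), (4, −1), (6, −1) entered as (k, ℓ).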

We now leave the discussion of the Apollonian problem, returning to it again in §5.

4. The thin Pythagorean problem

A Pythagorean triple x = (x, y, z)t is a point on the cone

(4.1) Q(x) = 0,

where Q is the “Pythagorean quadratic form”

$Q(x) := x^2 + y^2 - z^2.$

Throughout we consider only integral triples, x ∈ Z³, and assume that x, y, and z are coprime; such a triple is called primitive. Elementary considerations then force the hypotenuse z to be odd, and x and y to be of opposite parity; we assume henceforth that x is odd and y is even. The cone has a singularity at the origin, so we only consider its top half, assuming subsequently that the hypotenuse is positive, z > 0.

Diophantus (and likely the Babylonians [Pli] who preceded him by about as much as he precedes us) knew how to parametrize Pythagorean triples: Given x, there is a pair v = (u, v) of coprime integers of opposite parity so that

(4.2) $\begin{cases} x = u^2 - v^2 \\ y = 2uv \\ z = u^2 + v^2. \end{cases}$

That the converse is true is elementary algebra: any such pair v inserted into (4.2) gives rise to a triple x satisfying (4.1). For example, it is easy to see that the triple

(4.3) $x_0 = (3, 4, 5)^t$

corresponds to the pair

(4.4) $v_0 = (2, 1)^t.$
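The parametrization (4.2) and its inverse are short enough to write out. The Python sketch below uses function names of our choosing and, as a simplifying assumption, recovers the pair with u > 0 only (the pair (−u, −v) gives the same triple).

```python
from math import isqrt

def triple_from_pair(u, v):
    """The triple (x, y, z) attached to (u, v) by (4.2)."""
    return (u*u - v*v, 2*u*v, u*u + v*v)

def pair_from_triple(x, y, z):
    """A pair (u, v) with u > 0 mapping to (x, y, z); needs x odd, y even, z > |x|."""
    u_squared, remainder = divmod(z + x, 2)      # (z + x)/2 = u^2
    u = isqrt(u_squared)
    if remainder or u * u != u_squared or u == 0:
        raise ValueError("no pair (u, v) with u > 0 for this triple")
    return (u, y // (2 * u))                     # y = 2uv

print(triple_from_pair(2, 1))        # the base triple (4.3)
print(pair_from_triple(3, 4, 5))     # the base pair (4.4)
print(pair_from_triple(15, 8, 17))
```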

4.1. Orbits and the spin representation. As in the Apollonian case, the Pytha-gorean form Q has a special (determinant one) orthogonal group preserving it:

(4.5) SOQ := {g ∈ SL3 : Q(g · x) = Q(x)}.And as before, this group is also better understood by passing to its spin cover.Since the Pythagorean form Q has signature (2, 1), there is an accidental isomor-phism between its spin group and SL2(R), given explicitly as follows.

Page 27: FROM APOLLONIUS TO ZAREMBA:  LOCAL-GLOBAL PHENOMENA IN THIN ORBITS

FROM APOLLONIUS TO ZAREMBA 213

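Observe that $\mathrm{SL}_2$ acts on a pair $\mathbf{v}$ by left multiplication; via (4.2), this action then extends to a linear action on $\mathbf{x}$. In coordinates, it is an elementary computation that the action of $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ on $\mathbf{v}$ corresponds to left multiplication on $\mathbf{x}$ by

(4.6) $$\frac{1}{ad - bc} \begin{pmatrix} \tfrac{1}{2}\left(a^2 - b^2 - c^2 + d^2\right) & ac - bd & \tfrac{1}{2}\left(a^2 - b^2 + c^2 - d^2\right) \\ ab - cd & bc + ad & ab + cd \\ \tfrac{1}{2}\left(a^2 + b^2 - c^2 - d^2\right) & ac + bd & \tfrac{1}{2}\left(a^2 + b^2 + c^2 + d^2\right) \end{pmatrix}.$$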

One can check directly from the definition (4.5) that (4.6) is an element of $\mathrm{SO}_Q$, in fact of the connected component $\mathrm{SO}^{\circ}_Q$ of the identity, and hence we have explicitly constructed the spin homomorphism

$$\rho : \mathrm{SL}_2(\mathbb{R}) \to \mathrm{SO}_Q(\mathbb{R}) : \begin{pmatrix} a & b \\ c & d \end{pmatrix} \mapsto (4.6).$$
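The claim that (4.6) preserves $Q$ can be spot-checked with exact rational arithmetic. The sketch below is illustrative only; the function names are not from the text. One convention is an assumption worth stating loudly: with the matrix entries exactly as printed in (4.6), the compatible action on pairs is on the row vector $(u, v)$ from the right, $(u, v) \mapsto (au + cv,\ bu + dv)$, and that is the action the last helper uses.

```python
from fractions import Fraction


def rho(a, b, c, d):
    """The 3x3 matrix (4.6) attached to the 2x2 matrix [[a, b], [c, d]]."""
    a, b, c, d = (Fraction(t) for t in (a, b, c, d))
    det = a * d - b * c
    h = Fraction(1, 2)
    rows = [
        [h * (a*a - b*b - c*c + d*d), a*c - b*d, h * (a*a - b*b + c*c - d*d)],
        [a*b - c*d,                   b*c + a*d, a*b + c*d],
        [h * (a*a + b*b - c*c - d*d), a*c + b*d, h * (a*a + b*b + c*c + d*d)],
    ]
    return [[entry / det for entry in row] for row in rows]


def apply(g, x):
    """Left multiplication of the column vector x by the 3x3 matrix g."""
    return tuple(sum(g[i][j] * x[j] for j in range(3)) for i in range(3))


def Q(x):
    """The Pythagorean form (4.1) on a triple."""
    return x[0] ** 2 + x[1] ** 2 - x[2] ** 2


def triple_from_pair(u, v):
    """The parametrization (4.2)."""
    return (u * u - v * v, 2 * u * v, u * u + v * v)


def act_on_pair(m, pair):
    """Row vector (u, v) times [[a, b], [c, d]] (assumed convention, see lead-in)."""
    a, b, c, d = m
    u, v = pair
    return (a * u + c * v, b * u + d * v)


g = rho(2, 1, 1, 1)             # a determinant-one sample; its image is not integral
same_form = Q(apply(g, (1, 2, 3))) == Q((1, 2, 3))
```

Because `Fraction` arithmetic is exact, equality checks here are genuine identities for the sampled inputs rather than floating-point approximations.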

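Given a Pythagorean triple $\mathbf{x}_0$, such as that in (4.3), the group $\Gamma := \mathrm{SO}^{\circ}_Q(\mathbb{Z})$ of all integer matrices in $\mathrm{SO}^{\circ}_Q$ acts by left multiplication, giving the full orbit $\mathcal{O} = \Gamma \cdot \mathbf{x}_0$ of all Pythagorean triples (with our convention that $z > 0$, $x$ is odd, and $y$ is even). Via (4.2) again, this $\mathrm{SO}_Q$ action on $\mathbf{x}$ is equivalent to the $\mathrm{SL}_2$ action on $\mathbf{v}$. For a primitive $\mathbf{v} \in \mathbb{Z}^2$, both the integrality and primitivity are preserved by restricting the action to just the integral matrices $\mathrm{SL}_2(\mathbb{Z})$. Moreover, one should preserve the parity condition on $\mathbf{v}$ by restricting further to only the principal 2-congruence subgroup

$$\Lambda(2) = \left\{\gamma \in \mathrm{SL}_2(\mathbb{Z}) : \gamma \equiv I \ (\mathrm{mod}\ 2)\right\} = \left\langle \pm\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix},\ \pm\begin{pmatrix} 1 & 0 \\ 2 & 1 \end{pmatrix} \right\rangle,$$

which already appeared in §3.2.2. One can check directly that the image (4.6) of any $\gamma \in \Lambda(2)$ is an integral matrix, that is, in $\mathrm{SO}_Q(\mathbb{Z})$. For $\mathbf{v}_0$ corresponding to $\mathbf{x}_0$, the orbit $\mathcal{O} := \Gamma \cdot \mathbf{v}_0$ under the full group $\Gamma := \Lambda(2)$ consists of all coprime $(u, v)$ with $u$ even and $v$ odd.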
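The integrality of the image of $\Lambda(2)$ under (4.6) comes down to a parity count, which can be written out as follows (a short worked step, using only the congruence condition defining $\Lambda(2)$):

```latex
% Assumption: \gamma = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \Lambda(2),
% so a and d are odd, b and c are even, and ad - bc = 1.
\begin{align*}
a^2 \equiv d^2 \equiv 1, \quad b^2 \equiv c^2 \equiv 0 \pmod{2}
\quad\Longrightarrow\quad
\pm a^2 \pm b^2 \pm c^2 \pm d^2 \equiv 0 \pmod{2}
\end{align*}
% for every choice of signs. Hence each entry of (4.6) of the form
% \tfrac{1}{2}(\pm a^2 \pm b^2 \pm c^2 \pm d^2) is an integer; the remaining
% entries are integer polynomials in a, b, c, d, and the prefactor 1/(ad - bc) is 1.
```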

Prompted by the Affine Sieve⁵ [BGS06, BGS10, SGS11] one may wish to study thin orbits $\mathcal{O}$ of Pythagorean triples. Here one replaces the full group $\mathrm{SO}_Q(\mathbb{Z})$ by some finitely generated subgroup $\Gamma$ of infinite index. Equivalently one can consider an orbit $\mathcal{O}$ of $\mathbf{v}_0$ under an infinite index subgroup $\Gamma$ of $\Lambda(2)$. We illustrate the general theory via the following concrete example.

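We first give a sample orbit $\mathcal{O}$: in comparison with the generators of $\Lambda(2)$, let $\Gamma$ be the group generated by the following two matrices:

(4.7) $$\Gamma := \left\langle \pm\begin{pmatrix} 1 & 2 \\ 0 & 1 \end{pmatrix},\ \pm\begin{pmatrix} 1 & 0 \\ 4 & 1 \end{pmatrix} \right\rangle.$$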

This group clearly sits inside $\Lambda(2)$ but it is not immediately obvious whether it is of finite or infinite index; as we will see later, the index is infinite. Taking the base pair $\mathbf{v}_0$ in (4.4), we form the orbit

(4.8) $\mathcal{O} := \Gamma \cdot \mathbf{v}_0.$

Correspondingly, we can take the base triple $\mathbf{x}_0$ in (4.3), and form the orbit

(4.9) $\mathcal{O} := \Gamma \cdot \mathbf{x}_0$

⁵ We have insufficient room to survey this beautiful theory, for which the reader is directed to any number of excellent surveys; see e.g. [SG12].


Figure 10. The thin Pythagorean orbit $\mathcal{O}$ in (4.9): (a) view from the side; (b) view from below. Points are marked according to whether the hypotenuse is prime or composite.

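of $\mathbf{x}_0$ under the group

(4.10) $\Gamma := \langle M_1, M_2 \rangle,$

where $M_1$ and $M_2$ are the images under $\rho$ of the matrices generating $\Gamma$. One can elementarily compute from (4.7) and (4.6) that

(4.11) $$M_1 := \begin{pmatrix} -1 & -2 & -2 \\ 2 & 1 & 2 \\ 2 & 2 & 3 \end{pmatrix}, \qquad M_2 := \begin{pmatrix} -7 & 4 & 8 \\ -4 & 1 & 4 \\ -8 & 4 & 9 \end{pmatrix}.$$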

Figure 10 illustrates this orbit $\mathcal{O}$. We can visually verify that the orbit looks thin, and in the next subsection we confirm this rigorously.
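The orbit (4.9) can be explored directly from the matrices (4.11). The sketch below is an illustration under stated assumptions: the breadth-first search discards any point whose hypotenuse exceeds a cutoff, so in principle it only guarantees a subset of the orbit points below that height; the cutoff `1000` and all function names are arbitrary choices made here.

```python
from collections import deque

# The generators (4.11).
M1 = ((-1, -2, -2), (2, 1, 2), (2, 2, 3))
M2 = ((-7, 4, 8), (-4, 1, 4), (-8, 4, 9))
J = (1, 1, -1)  # diagonal Gram matrix of Q(x) = x^2 + y^2 - z^2


def inverse(g):
    """Inverse of g in SO_Q: from g^T J g = J one gets g^{-1} = J g^T J."""
    return tuple(tuple(J[i] * g[j][i] * J[j] for j in range(3)) for i in range(3))


def apply(g, x):
    """Left multiplication of the column vector x by g."""
    return tuple(sum(g[i][j] * x[j] for j in range(3)) for i in range(3))


GENERATORS = (M1, M2, inverse(M1), inverse(M2))


def orbit_up_to(x0, max_z):
    """Orbit points reachable from x0 through points with hypotenuse at most max_z."""
    seen = {x0}
    queue = deque([x0])
    while queue:
        x = queue.popleft()
        for g in GENERATORS:
            y = apply(g, x)
            if 0 < y[2] <= max_z and y not in seen:
                seen.add(y)
                queue.append(y)
    return seen


orbit = orbit_up_to((3, 4, 5), 1000)
smallest = sorted(orbit, key=lambda t: t[2])[:10]   # a few points of low height
```

Each point found satisfies $x^2 + y^2 = z^2$ with $x$ odd and $y$ even, as the conventions of this section require; note that $x$ and $y$ may be negative.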

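4.2. The orbit is thin. The group $\mathrm{SL}_2(\mathbb{R})$ also acts on the hyperbolic upper half-plane

$$\mathbb{H} := \{z = x + iy : x \in \mathbb{R},\ y > 0\}$$

by fractional linear transformations,

(4.12) $$\begin{pmatrix} a & b \\ c & d \end{pmatrix} : z \mapsto \frac{az + b}{cz + d}.$$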

The action of our group $\Gamma$ in (4.7) on $\mathbb{H}$ has a fundamental domain (the definition is similar to (3.10)) given by

$$\{z \in \mathbb{H} : |\mathrm{Re}(z)| < 1,\ |z - 1/4| > 1/4,\ |z + 1/4| > 1/4\},$$

where the distances above are Euclidean; see Figure 11a. The hyperbolic measure is $y^{-2}\,dx\,dy$, and hence this region again has infinite hyperbolic area. Equivalently, the index of $\Gamma$ in $\Lambda(2)$ is infinite (it is well known that $\Lambda(2)$ has finite co-area), as claimed.
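The infinite area can be made explicit. Writing $\mathcal{F}$ for the fundamental domain above (a label introduced here only for convenience), one checks that $\mathcal{F}$ contains a full vertical strip reaching down to the real axis:

```latex
% For 1/2 < x < 1 and any y > 0, the point z = x + iy satisfies
% |z - 1/4| \ge x - 1/4 > 1/4 and |z + 1/4| > 1/4, so z \in \mathcal{F}.
\begin{align*}
\operatorname{area}(\mathcal{F})
  \;\ge\; \int_{1/2}^{1} \int_{0}^{\infty} \frac{dy}{y^{2}}\, dx
  \;=\; \int_{1/2}^{1} \Big[ -\frac{1}{y} \Big]_{y \to 0^{+}}^{y \to \infty} dx
  \;=\; \infty .
\end{align*}
```

The divergence comes entirely from $y \to 0^{+}$, that is, from the part of the domain that touches the boundary $\partial\mathbb{H}$ along an interval.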

Any orbit of a fixed basepoint in $\mathbb{H}$ under $\Gamma$ has some limit set $\mathcal{C} = \mathcal{C}(\Gamma)$ in the boundary $\partial\mathbb{H}$. A piece of this Cantor-like set can already be seen in Figure 11a. But to see it fully, we show in Figure 11b the same $\Gamma$-orbit in the disk model

$$\mathbb{D} = \{z \in \mathbb{C} : |z| < 1\},$$


Figure 11. The orbit of $i \in \mathbb{H}$ under $\Gamma$: (a) upper half-plane model; (b) disk model.

by composing the action of $\Gamma$ with the map

$$\mathbb{H} \to \mathbb{D} : z \mapsto \frac{z - i}{z + i}$$

(which encodes the observation that points in the upper half-plane are closer to $i$ than they are to $-i$). In the disk model, one more clearly sees the limit set as the set of “directions” in which the orbit $\mathcal{O}$ can grow; juxtapose Figure 10b on Figure 11b. This limit set $\mathcal{C}$ has some Hausdorff dimension $\delta = \delta(\Gamma) \in [0, 1]$; one can estimate

(4.13) $\delta \approx 0.59\cdots.$

This dimension (also called the “critical exponent of $\Gamma$”) is again an important geometric invariant, measuring the “thinness” of $\Gamma$, as illustrated in the following counting statement [Kon07, Kon09, KO12]. Let $\|\mathbf{x}\|$ be the Euclidean norm. There is some $c > 0$ so that

(4.14) $\#\{\mathbf{x} \in \mathcal{O} : \|\mathbf{x}\| < N\} \sim c N^{\delta}$, as $N \to \infty$.

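Once again, (4.14) should be compared with the orbit of $\mathbf{x}_0$ under the full ambient group, $\mathrm{SO}_Q(\mathbb{Z})$. Elementary methods show that

$$\#\{\mathbf{x} \in \mathrm{SO}_Q(\mathbb{Z}) \cdot \mathbf{x}_0 : \|\mathbf{x}\| < N\} \sim cN.$$

So in passing from the full orbit to $\mathcal{O}$, the asymptotic drops from $N$ to $N^{\delta}$, with $\delta < 1$. Thus the orbit $\mathcal{O}$ is thin.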

The fact that $\rho$ is a quadratic map in the entries (see (4.6)) implies that the count (4.14) on triples $\mathbf{x} \in \mathcal{O}$ is equivalent to the following asymptotic for the pairs $\mathbf{v} \in \mathcal{O}$:

(4.15) $\#\{\mathbf{v} \in \mathcal{O} : \|\mathbf{v}\| < N\} \sim c' \cdot N^{2\delta},$

as $N \to \infty$. Note that the power of $N$ is now $2\delta$. It can also be seen immediately from (4.1) and (4.2) that

(4.16) $$\|\mathbf{x}\| = \sqrt{x^2 + y^2 + z^2} = \sqrt{2}\, z = \sqrt{2}\,(u^2 + v^2) = \sqrt{2}\, \|\mathbf{v}\|^2.$$

(Geometrically, the cone (4.1) intersects the sphere of radius $N$ at a circle of radius $N/\sqrt{2}$.) Observe that (4.14) looks like the Apollonian asymptotic (3.5), while (4.15)


is more similar to Hensley’s estimate (2.17) in Zaremba’s problem. This is just a consequence of choosing between working in the orthogonal group or its spin cover.
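The contrast between the thin orbit of pairs and the full $\Lambda(2)$-orbit can be seen in a small experiment. This is a rough sketch under stated assumptions, not a reproduction of the cited counting theorems: pairs are treated as row vectors acted on from the right (the convention under which (4.6) reproduces (4.11)), and the search prunes at the norm cutoff, so the thin count is in general only guaranteed to be a lower bound. The function names and the cutoffs are illustrative.

```python
from collections import deque
from math import gcd

V0 = (2, 1)  # base pair (4.4)


def steps(p):
    """Images of the row vector p under the generators of (4.7), their inverses, and -I."""
    u, v = p
    return [(u, v + 2 * u), (u, v - 2 * u),   # right multiplication by [[1, +-2], [0, 1]]
            (u + 4 * v, v), (u - 4 * v, v),   # right multiplication by [[1, 0], [+-4, 1]]
            (-u, -v)]                         # the sign -I


def thin_orbit(N):
    """Points of the orbit of V0 reached through pairs of norm less than N."""
    seen, queue = {V0}, deque([V0])
    while queue:
        p = queue.popleft()
        for q in steps(p):
            if q[0] ** 2 + q[1] ** 2 < N * N and q not in seen:
                seen.add(q)
                queue.append(q)
    return seen


def full_orbit(N):
    """All coprime (u, v) with u even, v odd, and norm less than N."""
    return {(u, v)
            for u in range(-N, N + 1) for v in range(-N, N + 1)
            if u % 2 == 0 and v % 2 == 1 and gcd(u, v) == 1
            and u * u + v * v < N * N}


def triple_norm_squared(u, v):
    """Squared Euclidean norm of the triple (4.2) attached to (u, v)."""
    x, y, z = u * u - v * v, 2 * u * v, u * u + v * v
    return x * x + y * y + z * z


# (thin count, full count) for a few cutoffs; compare the growth rates by eye.
counts = {N: (len(thin_orbit(N)), len(full_orbit(N))) for N in (10, 20, 40)}
```

For instance, the pair $(4, 1)$ lies in the full orbit but is never produced by the thin search, while $(2, 5)$ and $(6, 1)$ are. The helper `triple_norm_squared` lets one confirm (4.16) in the form $\|\mathbf{x}\|^2 = 2\|\mathbf{v}\|^4$ on every point found.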

4.3. Diophantine problems. One can now pose a variety of Diophantine questions about the values of various functions on such thin orbits. Given an orbit $\mathcal{O} = \Gamma \cdot \mathbf{x}_0$ and a function $f : \mathcal{O} \to \mathbb{Z}$, call

(4.17) $\mathcal{P} := f(\mathcal{O}) \subset \mathbb{Z}$

the set of represented numbers. That is, $n$ is represented by the pair $(\mathcal{O}, f)$ if there is some $\gamma \in \Gamma$ so that $n = f(\gamma \cdot \mathbf{x}_0)$. And as before, we say $n$ is admissible if $n \in \mathcal{P}\ (\mathrm{mod}\ q)$ for all $q$. For example, if $f$ is the “hypotenuse” function, $f(\mathbf{x}) = z$, one can ask whether $(\mathcal{O}, f)$ represents infinitely many admissible primes. Evidence to the affirmative is illustrated in Figure 10, where a triple is highlighted if its hypotenuse is prime. Unfortunately this problem on thin orbits⁶ seems out of reach of current technology.

But for a restricted class $\mathcal{F}$ of functions $f$, and orbits $\mathcal{O}$ which are “not too thin”, recent progress has been made toward the local-global problem in $\mathcal{P}$. Let $\mathcal{F}$ be the set of functions $f$ which are linear, not on the triples $\mathbf{x}$, but on the corresponding pairs $\mathbf{v}$. For example, it is not particularly well known that in a Pythagorean triple, the sum of the hypotenuse $z$ and the even side $y$ is always a perfect square. This follows immediately from the parametrization (4.2); in particular, $y + z = 2uv + u^2 + v^2 = (u + v)^2$. So the function

(4.18) $f(\mathbf{x}) = \sqrt{y + z} = u + v$

is integer valued on $\mathcal{O}$ and linear⁷ in $\mathbf{v}$.

Another way of saying this is to pass to the corresponding orbit $\mathcal{O} = \Gamma \cdot \mathbf{v}_0$ of pairs. Any such linear function on $\mathbf{v}$ is of the form

(4.19) $f(\mathbf{v}) = \langle \mathbf{w}_0, \mathbf{v} \rangle,$

for some fixed $\mathbf{w}_0 \in \mathbb{Z}^2$. In the example (4.18), take $\mathbf{w}_0 = (1, 1)^t$. Then $\mathcal{F}$ consists of all functions on the orbit of triples which, pulled back to the orbit of pairs, are of the form (4.19).

Theorem P (Bourgain and Kontorovich, 2010 [BK10]). Fix any such linear $f \in \mathcal{F}$ and Pythagorean triple $\mathbf{x}_0$. There is some $\delta_0 < 1$ (the value $\delta_0 = 0.99995$ suffices) so that if the orbit $\mathcal{O} = \Gamma \cdot \mathbf{x}_0$ is not too thin, meaning the exponent $\delta$ of $\Gamma$ satisfies

(4.20) $\delta > \delta_0,$

then almost every admissible number is represented in $\mathcal{P} = f(\mathcal{O})$.

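We are finally in position to relate this Pythagorean problem to the Apollonian and Zaremba’s. Indeed, passing to the corresponding orbit $\mathcal{O} = \Gamma \cdot \mathbf{v}_0$ and fixing the function $f(\mathbf{v}) = \langle \mathbf{w}_0, \mathbf{v} \rangle$, we have that $n$ is represented if there is a $\gamma \in \Gamma$ so that

(4.21) $n = \langle \mathbf{w}_0, \gamma \cdot \mathbf{v}_0 \rangle.$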

⁶ For the full orbit of all Pythagorean triples, infinitely many hypotenuses are prime. This follows from (4.2), which gives $z = u^2 + v^2$, and Fermat’s theorem that all primes $\equiv 1\ (\mathrm{mod}\ 4)$ are sums of two squares.

⁷ Really we want the values of $|u + v|$, which within the positive integers are the union of the values of $u + v$ and $-u - v$. Alternatively, we can assume that $-I \in \Gamma$, as is the case for (4.7).


That is,

(4.22) $\mathcal{P} = \langle \mathbf{w}_0, \Gamma \cdot \mathbf{v}_0 \rangle,$

which is of the same form as (2.29) and (3.28). The condition of admissibility is analyzed again given the generators of $\Gamma$ by strong approximation, Goursat’s Lemma, and finite group theory, as in §2.2.
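For a single modulus $q$, the residues of $\mathcal{P}$ can be computed exactly by closing up the orbit of $\mathbf{v}_0$ in $(\mathbb{Z}/q\mathbb{Z})^2$. The sketch below illustrates the definition of admissibility only; it is not the strong approximation argument referred to above. It assumes the group (4.7), the example $\mathbf{w}_0 = (1, 1)$ from (4.18), and the row-vector convention for the action; the finite list of test moduli in `is_admissible` is an arbitrary stand-in for “all $q$”.

```python
def residues_mod_q(q, v0=(2, 1), w0=(1, 1)):
    """Residues mod q of <w0, v0 * gamma> as gamma runs over the group (4.7).

    Modulo q each generator has finite order, so its inverse is a positive
    power of itself and the forward closure below already gives the full orbit.
    """
    start = (v0[0] % q, v0[1] % q)
    seen = {start}
    stack = [start]
    while stack:
        u, v = stack.pop()
        for nxt in (
            (u, (v + 2 * u) % q),      # right multiplication by [[1, 2], [0, 1]]
            ((u + 4 * v) % q, v),      # right multiplication by [[1, 0], [4, 1]]
            ((-u) % q, (-v) % q),      # the sign -I
        ):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return {(w0[0] * u + w0[1] * v) % q for (u, v) in seen}


def is_admissible(n, moduli=(2, 3, 4, 5, 7, 8)):
    """Necessary test for admissibility using only the listed moduli (assumption)."""
    return all(n % q in residues_mod_q(q) for q in moduli)
```

Since $u$ is even and $v$ is odd throughout the orbit, every represented value $u + v$ is odd, so the only residue modulo 2 is 1 and even numbers fail the test at once.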

Note that in light of the asymptotic counting formula (4.15), the minimal dimension $\delta_0$ in (4.20) cannot go below $1/2$: the numbers in $\mathcal{P}$ up to $N$ (counted with multiplicity) have cardinality roughly $N^{2\delta}$, so if $\delta$ is less than $1/2$, then certainly a local-global principle fails miserably. (Such a phenomenon appeared already in the context of Hensley’s conjecture (2.15) in Zaremba’s problem.)

5. The circle method: tools and proofs

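We briefly review the previous three sections, unifying the (re)formulations of the problems. The Apollonian, Pythagorean, and Zaremba theorems will henceforth be referred to as Theorem X, where

$$X = A,\ P,\ \text{or } Z,$$

respectively. Theorem X concerns the set $\mathcal{S}$ of numbers of the form

(5.1) $\mathcal{S} = \langle \mathbf{w}_0, \Gamma \cdot \mathbf{v}_0 \rangle.$

Here

$$\mathcal{S} = \begin{cases} \mathcal{K} = \text{the set of curvatures (3.28)} & \text{if } X = A, \\ \mathcal{P} = \text{the set of square-roots of sums of hypotenuses and even sides (4.22), (4.18)} & \text{if } X = P, \\ D_A = \text{the set of denominators (2.29)} & \text{if } X = Z, \end{cases}$$

$$\Gamma = \begin{cases} \text{the Apollonian group } \Gamma & \text{if } X = A, \\ \text{an infinite index subgroup } \Gamma < \Lambda(2) & \text{if } X = P, \\ \text{the semigroup } \Gamma_A & \text{if } X = Z, \end{cases}$$

$$\mathbf{v}_0 = \begin{cases} \text{the root quadruple} & \text{if } X = A, \\ \text{any coprime pair of opposite parity} & \text{if } X = P, \\ (0, 1)^t & \text{if } X = Z, \end{cases}$$

and

$$\mathbf{w}_0 = \begin{cases} \text{a standard basis vector } \mathbf{e}_j & \text{if } X = A, \\ \text{any fixed pair} & \text{if } X = P, \\ (0, 1)^t & \text{if } X = Z. \end{cases}$$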

But now we can forget the individual problems and just focus on the general setting (5.1); one need not keep the above taxonomy in one’s head throughout.

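To study the local-global problem for $\mathcal{S}$, we introduce the representation function

(5.2) $$R_N(n) := \sum_{\gamma \in \Omega_N} \mathbf{1}_{\{n = \langle \mathbf{w}_0, \gamma \cdot \mathbf{v}_0 \rangle\}}.$$

Here $N$ is a growing parameter and $\Omega_N$ is a certain subset of the radius $N$ ball in $\Gamma$,

$$\Omega_N \subset \{\gamma \in \Gamma : \|\gamma\| < N\},$$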


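which we will describe in more detail later. For now, one can just think of $\Omega_N$ as the whole radius $N$ ball. To get our bearings, let us recall roughly⁸ the size of $\Omega_N$:

$$\#\{\gamma \in \Gamma : \|\gamma\| < N\} \asymp \begin{cases} N^{\delta}, & \text{if } X = A, \text{ see (3.5)}, \\ N^{2\delta}, & \text{if } X = P, \text{ see (4.15)}, \\ N^{2\delta_A}, & \text{if } X = Z, \text{ see (2.17)}. \end{cases}$$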

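We can write this uniformly by introducing the parameter $\alpha$, defined by

$$\alpha := \begin{cases} \delta, \text{ the dimension of an Apollonian packing} & \text{if } X = A, \text{ see (3.4)}, \\ 2\delta, \text{ where } \delta \text{ is the dimension of } \mathcal{C}(\Gamma) & \text{if } X = P, \text{ see (4.13)}, \\ 2\delta_A, \text{ where } \delta_A \text{ is the dimension of } \mathcal{C}_A & \text{if } X = Z, \text{ see (2.10)}. \end{cases}$$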

In each case α satisfies

(5.3) 1 < α < 2.

Then the cardinality of such a ball $\Omega_N$ is roughly

(5.4) $|\Omega_N| \asymp N^{\alpha}.$

Returning to (5.2), we see by construction that $R_N$ is nonnegative. Moreover observe that

(5.5) if $R_N(n) > 0$, then certainly $n$ is represented in $\mathcal{S}$.

Also record that

(5.6) $R_N$ is supported on $n$ of size $|n| \ll N$.

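Recalling the notation $e(x) = e^{2\pi i x}$, the Fourier transform

(5.7) $$S_N(\theta) := \widehat{R_N}(\theta) = \sum_{n \in \mathbb{Z}} R_N(n)\, e(n\theta) = \sum_{\gamma \in \Omega_N} e(\theta \langle \mathbf{w}_0, \gamma \cdot \mathbf{v}_0 \rangle)$$

is a wildly oscillating exponential sum on the circle $\mathbb{R}/\mathbb{Z} = [0, 1)$, whose graph looks something like Figure 12. One recovers $R_N$ through elementary Fourier inversion,

(5.8) $$R_N(n) = \int_{\mathbb{R}/\mathbb{Z}} S_N(\theta)\, e(-n\theta)\, d\theta,$$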

but without further ingredients, one is going around in circles (no pun intended).

Hardy and Littlewood had the idea that the bulk of the integral (5.8) could be captured just by integrating over frequencies $\theta$ that are very close to rational numbers $a/q$, $(a, q) = 1$, with very small denominators $q$; some of these intervals are shaded in Figure 12. These are now called the major arcs $\mathfrak{M}$; the name refers not to their total length (they comprise a tiny fraction of the circle $\mathbb{R}/\mathbb{Z}$) but to the fact that they are supposed to account for a preponderance of $R_N(n)$. Accordingly, we decompose (5.8) as

$$R_N(n) = \mathcal{M}_N(n) + \mathcal{E}_N(n),$$

where the major arc contribution

(5.9) $$\mathcal{M}_N(n) := \int_{\mathfrak{M}} S_N(\theta)\, e(-n\theta)\, d\theta$$

⁸ Technically the quoted results are about counting in the corresponding orbits $\mathcal{O}$ and not in the groups $\Gamma$, but the order of magnitude is the same for both.


Figure 12. The real part of an exponential sum of the form (5.7), plotted for $\theta \in [0, 1]$, with the rationals $1/2$, $1/3$, $2/3$, $1/4$, $3/4$, $1/5$, $2/5$, $3/5$, $4/5$, $1/6$, $5/6$ labeled.

is supposed to give the “main” term, and

(5.10) $$\mathcal{E}_N(n) := \int_{\mathfrak{m}} S_N(\theta)\, e(-n\theta)\, d\theta$$

should be the “error”. Here $\mathfrak{m} := [0, 1) \setminus \mathfrak{M}$ are the complementary so-called minor arcs. If $\mathcal{M}_N(n)$ is positive and bigger than $|\mathcal{E}_N(n)|$, then certainly

(5.11) $R_N(n) \ge \mathcal{M}_N(n) - |\mathcal{E}_N(n)| > 0,$

so again, $n$ is represented. In practice, one typically tries to prove an asymptotic formula (or at least a lower bound) for $\mathcal{M}_N$, and then give an upper bound for $|\mathcal{E}_N|$.

The reason for this decomposition is that exponential sums such as $S_N$ should be mostly supported on $\mathfrak{M}$, having their biggest peaks and valleys at (or very near) these frequencies (some of this phenomenon is visible in Figure 12). Indeed, the value $\theta = 0$ is as big as $S_N$ will ever get,

(5.12) $|S_N(\theta)| \le S_N(0) = |\Omega_N|,$

which follows trivially (and is thus called the trivial bound) from the triangle inequality: every summand in (5.7) is a complex number of absolute value 1. Also for other $\theta \in \mathfrak{M}$, $\theta \approx a/q$, the summands should all point in a limited number of directions, colluding to give a large contribution to $S_N$. As we will see later, at these frequencies, one is in a sense measuring the distribution of $\mathcal{S}$ (or equivalently $\Omega_N$) along certain arithmetic progressions. This strategy of coaxing out the (conjectural) main term for $R_N$ works in surprisingly great generality, but can also give false predictions (even for the Prime Number Theorem, see e.g. [Gra95]).
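The objects (5.2), (5.7), (5.8), and (5.12) can be computed for a toy ensemble. The sketch below is illustrative and rests on several assumptions that are not from the text: it uses the Pythagorean group (4.7) with the sign $-I$ dropped, $\mathbf{v}_0 = (2, 1)$ and $\mathbf{w}_0 = (1, 1)$, the Frobenius norm for $\|\gamma\|$, the row-vector convention $\mathbf{v}_0 \gamma$, and a pruned search that yields some subset $\Omega_N$ of the norm ball. Fourier inversion is carried out as an average over $M$ equally spaced frequencies, which is exact for a trigonometric polynomial once $M$ exceeds the spread of its frequencies.

```python
import cmath
from collections import Counter, deque

V0 = (2, 1)   # base pair (4.4)
W0 = (1, 1)   # the linear form of example (4.18)
# Generators of (4.7) and their inverses, stored as (a, b, c, d); the sign -I is dropped.
GENS = [(1, 2, 0, 1), (1, -2, 0, 1), (1, 0, 4, 1), (1, 0, -4, 1)]


def mul(p, q):
    """Product of two 2x2 matrices stored as (a, b, c, d)."""
    a, b, c, d = p
    e, f, g, h = q
    return (a * e + b * g, a * f + b * h, c * e + d * g, c * f + d * h)


def ball(N):
    """Elements reached from I through words staying inside Frobenius norm < N."""
    identity = (1, 0, 0, 1)
    seen, queue = {identity}, deque([identity])
    while queue:
        m = queue.popleft()
        for s in GENS:
            nxt = mul(m, s)
            if sum(t * t for t in nxt) < N * N and nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def value(m):
    """<w0, v0 * gamma> for the row vector v0 acted on from the right."""
    a, b, c, d = m
    return W0[0] * (V0[0] * a + V0[1] * c) + W0[1] * (V0[0] * b + V0[1] * d)


def phase(x):
    """e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * cmath.pi * x)


omega = ball(12)
R = Counter(value(m) for m in omega)          # representation function (5.2)


def S(theta):
    """Exponential sum (5.7)."""
    return sum(count * phase(n * theta) for n, count in R.items())


lo, hi = min(R), max(R)
M = hi - lo + 1                               # enough sample points to avoid aliasing
S_grid = [S(k / M) for k in range(M)]


def R_by_inversion(n):
    """Discrete form of (5.8): average S(k/M) e(-nk/M) over the grid."""
    return (sum(S_grid[k] * phase(-n * k / M) for k in range(M)) / M).real
```

Comparing `R_by_inversion(n)` with `R[n]` confirms that inversion alone returns exactly what was put in, which is the sense in which one is “going around in circles” without a major and minor arc analysis.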


Having made this decomposition, we should determine what we expect for the main term. From (5.7), we have that

$$\sum_{n} R_N(n) = S_N(0) = |\Omega_N|,$$

so recalling the support (5.6) of $R_N$, one might expect that an admissible number of size about $n \asymp N$ is represented roughly $|\Omega_N|/N$ times. In particular, since every admissible number is expected to be represented, one would like to show, say, for $N/2 \le n < N$, that

(5.13) $$\mathcal{M}_N(n) \gg \mathfrak{S}(n)\, \frac{|\Omega_N|}{N}.$$

Here $\mathfrak{S}(n) \ge 0$ is a certain product of local densities called the singular series; it alone is responsible for the notion of admissibility, vanishing on nonadmissible $n$. For admissible $n$, it typically does not fluctuate too much; crudely, one can show in many contexts the lower bound $\mathfrak{S}(n) \gg N^{-\varepsilon}$ for any $\varepsilon > 0$. For ease of exposition, let us just pretend for now that every $n$ is admissible and remove the role of the singular series, allowing ourselves to assume that

(5.14) $\mathfrak{S}(n) = 1.$

Observe also that, in light of the cardinality (5.4) of $\Omega_N$ and with exponent $\alpha$ ranging in (5.3), the lower bound in (5.13) is of the order $N^{\alpha - 1}$, with $\alpha > 1$. That is, there should be quite a lot of representations of an admissible $n \asymp N$ large, giving further indication that every sufficiently large admissible number may be represented.

One is then left with the problem of estimating away the remainder term $\mathcal{E}_N$, and this is why (as Peter Sarnak likes to say) the circle method is a “method” and not a “theorem”: establishing such estimates is much more of an art than a science. The Hardy–Littlewood procedure suggests somehow exploiting the fact that on the minor arc frequencies, $\theta \in \mathfrak{m}$, the exponential sum $S_N$ in (5.7) should itself already be quite small, being a sum of canceling phases. If one could indeed prove at the level of individual $n$ an upper bound for the error term $\mathcal{E}_N$, which is asymptotically smaller than the lower bound (5.13) for $\mathcal{M}_N$, then one would immediately conclude the full local-global conjecture that every sufficiently large admissible $n$ is represented. Unfortunately, at present we do not know how to give such strong upper bounds on the minor arcs.

Instead, we settle for an “almost” local-global statement, by proving a sharp bound not for individual $n$, but for $n$ in an average sense, as follows. Parseval’s theorem states that the $L^2$-norm of a function is equal to that of its Fourier transform; that is, the Fourier transform is a unitary operator on these Hilbert spaces. Using the definition (5.10), Parseval’s theorem then gives

(5.15) $$\sum_{n} |\mathcal{E}_N(n)|^2 = \int_{\mathfrak{m}} |S_N(\theta)|^2\, d\theta.$$

Inserting our trivial bound (5.12) for $S_N$ into the above yields a trivial bound for (5.15) of

(5.16) $$\int_{\mathfrak{m}} |S_N(\theta)|^2\, d\theta \le |\Omega_N|^2.$$


We claim that it suffices for our applications to establish a bound of the form

(5.17) $$\int_{\mathfrak{m}} |S_N(\theta)|^2\, d\theta = o\!\left(\frac{|\Omega_N|^2}{N}\right).$$

That is, the above saves a little more than $\sqrt{N}$ on average over $\mathfrak{m}$ off of each term $S_N$ relative to the trivial bound (5.16). We first explain why this suffices.

5.1. Proof of Theorem X, assuming (5.13) and (5.17). Let $\mathfrak{E}(N)$ be the set of exceptional $n$ (those that are admissible but not represented) in the range $N/2 \le n < N$. Recalling the sufficient condition (5.11) for representation, the number of exceptions is bounded by

$$\#\mathfrak{E}(N) \le \sum_{\substack{N/2 < |n| < N \\ n \text{ is admissible}}} \mathbf{1}_{\{|\mathcal{E}_N(n)| \ge \mathcal{M}_N(n)\}}.$$

For admissible $n$, we have the supposed major arc lower bound (5.13) and recall our simplifying assumption (5.14) to ignore the singular series; thus

(5.18) $$\#\mathfrak{E}(N) \le \sum_{n} \mathbf{1}_{\{|\mathcal{E}_N(n)| \gg |\Omega_N|/N\}}.$$

Here is a pleasant (standard) trick: For those $n$ contributing a 1 rather than 0 to (5.18), we have

$$1 \ll \frac{|\mathcal{E}_N(n)|}{|\Omega_N|/N},$$

both sides of which may be squared. Hence (5.18) implies that

$$\#\mathfrak{E}(N) \ll \frac{N^2}{|\Omega_N|^2} \cdot \sum_{n} |\mathcal{E}_N(n)|^2.$$

Now we apply Parseval’s theorem (5.15) and the supposed minor arcs bound (5.17). This gives

$$\#\mathfrak{E}(N) = o\!\left(\frac{N^2}{|\Omega_N|^2} \cdot \frac{|\Omega_N|^2}{N}\right) = o(N),$$

and thus 100% of the admissible numbers in the range $[N/2, N)$ are represented. Combining such dyadic intervals, we conclude that almost every admissible number is represented.

Now “all” that is left is to establish the major arcs bound (5.13) and the error bound (5.17). In the next two subsections, we focus individually on the tools needed to prove these claims.

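5.2. The major arcs. Recall that $\mathcal{M}_N$ in (5.9) is an integral over the major arcs $\theta \in \mathfrak{M}$; here $\theta$ is very close to a fraction $a/q$, with $q$ “small” (the meaning of which is explained below). Also let us pretend for now that $\Omega_N$ is just the whole $\Gamma$-ball,

(5.19) $\Omega_N = \{\gamma \in \Gamma : \|\gamma\| < N\}.$

We begin by trying to evaluate (5.7) at $\theta = a/q$:

$$S_N\!\left(\frac{a}{q}\right) = \sum_{\substack{\gamma \in \Gamma \\ \|\gamma\| < N}} e\!\left(\frac{a}{q} \langle \mathbf{w}_0, \gamma \cdot \mathbf{v}_0 \rangle\right).$$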

An important observation in the above is that the summation may be grouped according to the residue class mod $q$ of the integer $\langle \mathbf{w}_0, \gamma \cdot \mathbf{v}_0 \rangle$. Or what is essentially


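Figure 13. An expander; shown with $q = 101$. Edge legend: $\begin{pmatrix} 1 & \pm 2 \\ 0 & 1 \end{pmatrix} : x \mapsto x \pm 2$ and $\begin{pmatrix} 1 & 0 \\ \pm 4 & 1 \end{pmatrix} : x \mapsto x(\pm 4x + 1)^{-1}$.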

the same, we can decompose the sum according to the residue class of $\gamma\ (\mathrm{mod}\ q)$. To this end, let $\Gamma_q = \Gamma\ (\mathrm{mod}\ q)$ be the set of such residue classes (which we have already studied in the context of admissibility and strong approximation). Then we split the sum as

(5.20) $$S_N\!\left(\frac{a}{q}\right) = |\Omega_N| \sum_{\gamma_0 \in \Gamma_q} e\!\left(\frac{a}{q} \langle \mathbf{v}_0 \cdot \gamma_0, \mathbf{w}_0 \rangle\right) \left[\frac{1}{|\Omega_N|} \sum_{\substack{\gamma \in \Gamma \\ \|\gamma\| < N}} \mathbf{1}_{\{\gamma \equiv \gamma_0 (\mathrm{mod}\ q)\}}\right],$$

where we have artificially multiplied and divided by the cardinality of $\Omega_N$. Now for $\gamma_0$ fixed, the bracketed term is then measuring the “probability” that $\gamma \equiv \gamma_0\ (\mathrm{mod}\ q)$. As one may suspect, our groups do not have particular preferences for certain residue classes over others; that is, this probability becomes equidistributed as $N$ grows, with $q$ also allowed to grow, but at a much slower rate. (In fact, this is exactly what we mean by the denominator $q$ being “small”, relative to $N$, in the major arcs $\mathfrak{M}$.) To explain how this happens, we briefly discuss the crucial notion of an expander.

Rather than going into the general theory (for which we refer the reader to the beautiful survey [Lub12]; see also [Sar04]), we content ourselves with but one illustrative example of expansion. Figure 13 shows the following graph. For $q = 101$, say, take the vertices to be the elements of $\mathbb{Z}/q\mathbb{Z}$, organized around the unit circle by placing $x \in \mathbb{Z}/q\mathbb{Z}$ at $e(x/q)$. For the edges, connect each

(5.21) $x$ to $x \pm 2$, and also to $x(\pm 4x + 1)^{-1}$,

when inversion (mod $q$) is possible. This is nothing more than the fractional linear action (see (4.12)) of the generating matrices in (4.7) (and their inverses) on $\mathbb{Z}/q\mathbb{Z}$. We first claim that our graph on $q$ vertices is “sparse”. Indeed, the complete graph (connecting any vertex to any other) has on the order of $q^2$ edges, whereas our graph has only on the order of $q$ edges (since (5.21) implies that any vertex is connected to at most four others). So we have only about the square root of the total number of possible edges, and our graph is indeed quite sparse.
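The graph (5.21) is small enough to build explicitly. The following sketch (function names are illustrative, and $q$ is assumed prime so that every nonzero residue is invertible) constructs the adjacency sets and checks sparsity and connectivity by breadth-first search. It does not measure the mixing rate or spectral gap that the expander property actually concerns.

```python
from collections import deque


def expander_graph(q):
    """Adjacency sets of the graph (5.21) on Z/qZ, for a prime q (assumption)."""
    adj = {x: set() for x in range(q)}
    for x in range(q):
        adj[x].add((x + 2) % q)
        adj[x].add((x - 2) % q)
        for s in (4, -4):
            denom = (s * x + 1) % q
            if denom != 0:                       # skip when inversion is impossible
                adj[x].add(x * pow(denom, -1, q) % q)
    return adj


def distances_from(adj, source):
    """Breadth-first-search distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist


q = 101
adj = expander_graph(q)
num_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
max_degree = max(len(nbrs) for nbrs in adj.values())
diameter = max(max(distances_from(adj, x).values()) for x in adj)
```

With $q = 101$ the complete graph would have $101 \cdot 100 / 2 = 5050$ edges, whereas `num_edges` here is at most $2q = 202$. Comparing `diameter` with the value 50 attained by the cycle using only the edges $x \mapsto x \pm 2$ gives a first, very crude, impression of how much the extra edges improve connectivity.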


Despite having few edges, it is a fact that this graph is nevertheless highly connected, in the sense that a random walk on it is rapidly mixing. Moreover, this rate of mixing, properly normalized, is independent of the choice of $q$ above. That is, by varying $q$, we in fact have a whole family of such sparse but highly connected graphs, and with a uniform mixing rate; this is exactly what characterizes an expander.

Proofs of expansion use, among other things, tools from additive combinatorics, in particular, the so-called sum-product [BKT04, Bou08] and triple-product [Hel08, BGT11, PS10] estimates, and quite a lot of other work which we will not survey; see e.g. [SX91, Gam02, BG08, BGS10, Var10, BV11, SGV11]. Once one proves uniform expansion for such finite graphs, the statements must be converted into the Archimedean form needed for the bracketed term in (5.20). To handle such counting statements, one uses

$$\begin{cases} \text{infinite volume spectral and representation theory à la §3.1.6, specifically Vinogradov’s thesis [Vin12],} & \text{if } X = A, \\ \text{similar techniques developed by Bourgain, Kontorovich, and Sarnak [BKS10],} & \text{if } X = P, \\ \text{the thermodynamic formalism, analytically continuing certain Ruelle transfer operators [Lal89, Dol98, Nau05] and their “congruence” extensions; see [BGS11],} & \text{if } X = Z. \end{cases}$$

Without going into details, the upshot is that, up to acceptable errors, the bracketed term in (5.20) is just $1/|\Gamma_q|$, confirming the desired equidistribution. Inserting this estimation into $\mathcal{M}_N$ in (5.9), one uses these techniques and some more standard circle method analysis to eventually conclude (5.13).

5.3. The minor arcs. We use different strategies to prove the minor arcs bound (5.17) for the Pythagorean and Zaremba settings $X = P$ or $Z$, versus the Apollonian setting $X = A$, so we present them individually. As the details quickly become quite technical, we will only scratch the surface, inviting the interested reader to study the original manuscripts [BK10, BK11, BK12]. Hopefully the short sketches below give some indication for the flavor of the arguments involved.

5.3.1. Pythagorean and Zaremba settings. To handle the minor arcs here, we make the observation that the ensemble $\Omega_N$ in the definition of $S_N$ from (5.7) need not be an exact $\Gamma$-ball as in (5.19), but can be replaced by, say, a product of two such. That is, the definition of $S_N$ can be changed to

(5.22) $$S_N(\theta) := \sum_{\substack{\gamma_1 \in \Gamma \\ \|\gamma_1\| < \sqrt{N}}} \ \sum_{\substack{\gamma_2 \in \Gamma \\ \|\gamma_2\| < \sqrt{N}}} e(\theta \langle \mathbf{v}_0 \gamma_1 \gamma_2, \mathbf{w}_0 \rangle),$$

without irreparably damaging the major arcs analysis. This new sum encodes much more of the (semi)group structure of $\Gamma$, while preserving the “nonvanishing implies represented” property (5.5), where $R_N$ is redefined by Fourier inversion (5.8). (In reality, we use even more complicated exponential sums.) The advantage of (5.22) is that we can now exploit this structure à la Vinogradov’s method [Vin37] for estimating bilinear forms: one can think of (5.22) as the sum of all entries in


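the matrix indexed by $\gamma_1$ and $\gamma_2$ with entries $e(\theta \langle \mathbf{v}_0 \gamma_1 \gamma_2, \mathbf{w}_0 \rangle)$. Just one standard maneuver in estimating bilinear forms is the following.

Apply the Cauchy–Schwarz inequality to (5.22) in the $\gamma_1$ variable:

$$|S_N(\theta)| \le \left(\sum_{\substack{\gamma_1 \in \Gamma \\ \|\gamma_1\| < \sqrt{N}}} 1\right)^{1/2} \left(\sum_{\substack{\gamma_1 \in \mathrm{SL}_2(\mathbb{Z}) \\ \|\gamma_1\| < \sqrt{N}}} \left|\sum_{\substack{\gamma_2 \in \Gamma \\ \|\gamma_2\| < \sqrt{N}}} e(\theta \langle \mathbf{v}_0 \gamma_1 \gamma_2, \mathbf{w}_0 \rangle)\right|^2\right)^{1/2}.$$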

Notice in the second appearance of a $\gamma_1$ sum, we have replaced the thin and mysterious group $\Gamma$ (or semigroup $\Gamma_A$) by the full ambient group $\mathrm{SL}_2(\mathbb{Z})$. On one hand, this allows us to now use more classical tools to get the requisite cancellation (5.17) in the minor arcs integral. On the other hand, this type of perturbation argument only succeeds when $\delta$ is near 1, explaining the dimension restrictions (2.23) and (4.20).

5.3.2. The Apollonian case. The above strategy fails for the Apollonian problem, because the Hausdorff dimension (3.4) is a fixed invariant which refuses to be adjusted to suit our needs. Instead, we recall that the Apollonian group $\Gamma$ contains the special (arithmetic) subgroup $\Xi$ from (3.36). Then, like (5.22), we change the definition of the exponential sum to something of the (again, bilinear) form

(5.23) $$S_N(\theta) := \sum_{\substack{\xi \in \Xi \\ \|\xi\| < X}} \ \sum_{\substack{\gamma \in \Gamma \\ \|\gamma\| < T}} e(\theta \langle \mathbf{v}_0 \cdot \xi \gamma, \mathbf{w}_0 \rangle),$$

for certain parameters $X$ and $T$ chosen optimally in relation to $N$. One uses the full sum over the group $\Gamma$ to capture the major arcs and admissibility conditions. For the minor arcs bound, one keeps $\gamma$ fixed and uses the classical arithmetic group $\Xi$ to get sufficient cancellation to prove the desired bound (5.17). Again, we spare the reader all details.

5.4. Conclusion. Putting together the above-sketched minor arcs upper bound (5.17) with the major arcs lower bound (5.13) discussed in §5.2, we prove the main Theorem X, as explained in §5.1. We end by emphasizing again that, though the problems have nearly identical reformulations, the circle method is only a method and not an applicable theorem: while the idea of breaking the integral (5.8) into

Table 1

| Theorem | Tools for Major Arcs | Ingredients for Minor Arcs |
|---|---|---|
| A | infinite volume hyperbolic 3-folds, automorphic forms, representations, expanders | that $\Gamma$ contains the arithmetic subgroup $\Xi \cong \Lambda(2)$, bilinear forms |
| P | infinite volume hyperbolic 2-folds, automorphic forms, representations, expanders | replacing $\Gamma$ by $\mathrm{SL}_2(\mathbb{Z})$ for $\delta$ near 1, bilinear forms |
| Z | thermodynamic formalism, congruence transfer operators, expanders | replacing $\Gamma_A$ by $\mathrm{SL}_2(\mathbb{Z})$ for $\delta_A$ near 1, bilinear forms |


major and minor arcs is ubiquitous, the actual execution of this idea is handled by very different tools in each case; see Table 1. Besides the circle method, the only other pervasive and critical ingredients are expanders for the major arcs and bilinear forms for the minor arcs.

Acknowledgments

The author wishes to thank Andrew Granville for encouraging him to pen these notes and for his insightful and detailed input on various drafts. The author also thanks Mel Nathanson for inviting him to give a mini-course at CANT 2012, as a result of which these notes were finally assembled. The author is grateful to Peter Sarnak for introducing him to Apollonian gaskets and infinite volume spectral methods, to Hee Oh for introducing him to homogeneous dynamics, and to Dorian Goldfeld for his constant support and advice. Thanks to Elena Fuchs, Aryeh Kontorovich, Sam Payne, and especially the referee for detailed comments on an earlier draft. Most of all, the author owes a huge debt of gratitude to Jean Bourgain for his generous tutelage and collaboration.

About the author

Alex Kontorovich is assistant professor of mathematics at Yale University. He received his Ph.D. at Columbia University in 2007 under Dorian Goldfeld and Peter Sarnak, and has held positions at Brown University, the Institute for Advanced Study, and Stony Brook University.

References

[BF11] Jean Bourgain and Elena Fuchs, A proof of the positive density conjecture for integer Apollonian circle packings, J. Amer. Math. Soc. 24 (2011), no. 4, 945–967, DOI 10.1090/S0894-0347-2011-00707-8. MR2813334 (2012d:11072)

[BG08] Jean Bourgain and Alex Gamburd, Uniform expansion bounds for Cayley graphs of SL2(Fp), Ann. of Math. (2) 167 (2008), no. 2, 625–642, DOI 10.4007/annals.2008.167.625. MR2415383 (2010b:20070)

[BGS06] Jean Bourgain, Alex Gamburd, and Peter Sarnak, Sieving and expanders, C. R. Math. Acad. Sci. Paris 343 (2006), no. 3, 155–159, DOI 10.1016/j.crma.2006.05.023 (English, with English and French summaries). MR2246331 (2007b:11139)

[BGS10] Jean Bourgain, Alex Gamburd, and Peter Sarnak, Affine linear sieve, expanders, and sum-product, Invent. Math. 179 (2010), no. 3, 559–644, DOI 10.1007/s00222-009-0225-3. MR2587341 (2011d:11018)

[BGS11] Jean Bourgain, Alex Gamburd, and Peter Sarnak, Generalization of Selberg’s 3/16 theorem and affine sieve, Acta Math. 207 (2011), no. 2, 255–290, DOI 10.1007/s11511-012-0070-x. MR2892611

[BGT11] Emmanuel Breuillard, Ben Green, and Terence Tao, Approximate subgroups of linear groups, Geom. Funct. Anal. 21 (2011), no. 4, 774–819, DOI 10.1007/s00039-011-0122-y. MR2827010

[BK10] Jean Bourgain and Alex Kontorovich, On representations of integers in thin subgroups of SL2(Z), Geom. Funct. Anal. 20 (2010), no. 5, 1144–1174, DOI 10.1007/s00039-010-0093-4. MR2746949 (2012i:11008)

[BK11] Jean Bourgain and Alex Kontorovich, On Zaremba’s conjecture, C. R. Math. Acad. Sci. Paris 349 (2011), no. 9-10, 493–495, DOI 10.1016/j.crma.2011.03.023 (English, with English and French summaries). MR2802911 (2012e:11012)

[BK12] J. Bourgain and A. Kontorovich. On the local-global conjecture for integral Apollonian gaskets, 2012. Preprint, arXiv:1205.4416v1.

[BKS10] Jean Bourgain, Alex Kontorovich, and Peter Sarnak, Sector estimates for hyperbolic isometries, Geom. Funct. Anal. 20 (2010), no. 5, 1175–1200, DOI 10.1007/s00039-010-0092-5. MR2746950


[BKT04] J. Bourgain, N. Katz, and T. Tao, A sum-product estimate in finite fields, and appli-cations, Geom. Funct. Anal. 14 (2004), no. 1, 27–57, DOI 10.1007/s00039-004-0451-1.MR2053599 (2005d:11028)

[Bou08] Jean Bourgain, The sum-product theorem in Zq with q arbitrary, J. Anal. Math. 106(2008), 1–93, DOI 10.1007/s11854-008-0044-2. MR2448982 (2009i:11010)

[Boy73] David W. Boyd, The residual set dimension of the Apollonian packing, Mathematika20 (1973), 170–174. MR0493763 (58 #12732)

[Boy82] David W. Boyd, The sequence of radii of the Apollonian packing, Math. Comp. 39(1982), no. 159, 249–254, DOI 10.2307/2007636. MR658230 (83i:52013)

[BV11] Jean Bourgain and Peter P. Varju, Expansion in SLd(Z/qZ), q arbitrary, Invent. Math.188 (2012), no. 1, 151–173, DOI 10.1007/s00222-011-0345-4. MR2897695

[Cas78] J. W. S. Cassels, Rational quadratic forms, London Mathematical Society Monographs,vol. 13, Academic Press Inc. [Harcourt Brace Jovanovich Publishers], London, 1978.MR522835 (80m:10019)

[Cox68] H. S. M. Coxeter, The problem of Apollonius, Amer. Math. Monthly 75 (1968), 5–15.MR0230204 (37 #5767)

[Des01] Rene Descartes. Œuvres, volume 4. Paris, 1901. C. Adams and P. Tannery, eds.[Dol98] Dmitry Dolgopyat, On decay of correlations in Anosov flows, Ann. of Math. (2) 147

(1998), no. 2, 357–390, DOI 10.2307/121012. MR1626749 (99g:58073)[DRS93] W. Duke, Z. Rudnick, and P. Sarnak, Density of integer points on affine homogeneous

varieties, Duke Math. J. 71 (1993), no. 1, 143–179, DOI 10.1215/S0012-7094-93-07107-4. MR1230289 (94k:11072)

[DSV03] Giuliana Davidoff, Peter Sarnak, and Alain Valette, Elementary number theory, group theory, and Ramanujan graphs, London Mathematical Society Student Texts, vol. 55, Cambridge University Press, Cambridge, 2003. MR1989434 (2004f:11001)

[EL07] Nicholas Eriksson and Jeffrey C. Lagarias, Apollonian circle packings: number theory. II. Spherical and hyperbolic packings, Ramanujan J. 14 (2007), no. 3, 437–469, DOI 10.1007/s11139-007-9052-6. MR2357448 (2008i:11095)

[EM93] Alex Eskin and Curt McMullen, Mixing, counting, and equidistribution in Lie groups, Duke Math. J. 71 (1993), no. 1, 181–209, DOI 10.1215/S0012-7094-93-07108-6. MR1230290 (95b:22025)

[FS11] Elena Fuchs and Katherine Sanden, Some experiments with integral Apollonian circle packings, Exp. Math. 20 (2011), no. 4, 380–399, DOI 10.1080/10586458.2011.565255. MR2859897 (2012j:52039)

[Fuc11] Elena Fuchs, Strong approximation in the Apollonian group, J. Number Theory 131 (2011), no. 12, 2282–2302, DOI 10.1016/j.jnt.2011.05.010. MR2832824 (2012g:11123)

[Gam02] Alex Gamburd, On the spectral gap for infinite index “congruence” subgroups of SL2(Z), Israel J. Math. 127 (2002), 157–200, DOI 10.1007/BF02784530. MR1900698 (2003b:11050)

[GLM03] Ronald L. Graham, Jeffrey C. Lagarias, Colin L. Mallows, Allan R. Wilks, and Catherine H. Yan, Apollonian circle packings: number theory, J. Number Theory 100 (2003), no. 1, 1–45, DOI 10.1016/S0022-314X(03)00015-5. MR1971245 (2004d:11055)

[GLM05] Ronald L. Graham, Jeffrey C. Lagarias, Colin L. Mallows, Allan R. Wilks, and Catherine H. Yan, Apollonian circle packings: geometry and group theory. I. The Apollonian group, Discrete Comput. Geom. 34 (2005), no. 4, 547–585, DOI 10.1007/s00454-005-1196-9. MR2173929 (2009a:11090a)

[GLM06a] Ronald L. Graham, Jeffrey C. Lagarias, Colin L. Mallows, Allan R. Wilks, and Catherine H. Yan, Apollonian circle packings: geometry and group theory. II. Super-Apollonian group and integral packings, Discrete Comput. Geom. 35 (2006), no. 1, 1–36, DOI 10.1007/s00454-005-1195-x. MR2183489 (2009a:11090b)

[GLM06b] Ronald L. Graham, Jeffrey C. Lagarias, Colin L. Mallows, Allan R. Wilks, and Catherine H. Yan, Apollonian circle packings: geometry and group theory. III. Higher dimensions, Discrete Comput. Geom. 35 (2006), no. 1, 37–72, DOI 10.1007/s00454-005-1197-8. MR2183490 (2009a:11090c)

[Gra95] Andrew Granville, Harald Cramér and the distribution of prime numbers, Scand. Actuar. J. 1 (1995), 12–28, DOI 10.1080/03461238.1995.10413946. Harald Cramér Symposium (Stockholm, 1993). MR1349149 (96g:01002)

[Hel08] H. A. Helfgott, Growth and generation in SL2(Z/pZ), Ann. of Math. (2) 167 (2008), no. 2, 601–623, DOI 10.4007/annals.2008.167.601. MR2415382 (2009i:20094)

[Hen89] Doug Hensley, The distribution of badly approximable numbers and continuants with bounded digits, Théorie des nombres (Québec, PQ, 1987), de Gruyter, Berlin, 1989, pp. 371–385. MR1024576 (91e:11078)

[Hen92] Doug Hensley, Continued fraction Cantor sets, Hausdorff dimension, and functional analysis, J. Number Theory 40 (1992), no. 3, 336–358, DOI 10.1016/0022-314X(92)90006-B. MR1154044 (93c:11058)

[Hen96] Douglas Hensley, A polynomial time algorithm for the Hausdorff dimension of continued fraction Cantor sets, J. Number Theory 58 (1996), no. 1, 9–45, DOI 10.1006/jnth.1996.0058. MR1387719 (97i:11085a)

[JP01] Oliver Jenkinson and Mark Pollicott, Computing the dimension of dynamically defined sets: E2 and bounded continued fractions, Ergodic Theory Dynam. Systems 21 (2001), no. 5, 1429–1445, DOI 10.1017/S0143385701001687. MR1855840 (2003m:37027)

[KO11] Alex Kontorovich and Hee Oh, Apollonian circle packings and closed horospheres on hyperbolic 3-manifolds, J. Amer. Math. Soc. 24 (2011), no. 3, 603–648, DOI 10.1090/S0894-0347-2011-00691-7. With an appendix by Oh and Nimish Shah. MR2784325

[KO12] Alex Kontorovich and Hee Oh, Almost prime Pythagorean triples in thin orbits, J. Reine Angew. Math. 667 (2012), 89–131. MR2929673

[Kon07] Alex V. Kontorovich, The hyperbolic lattice point count in infinite volume with applications to sieves, ProQuest LLC, Ann Arbor, MI, 2007. Thesis (Ph.D.)–Columbia University. MR2710911

[Kon09] Alex V. Kontorovich, The hyperbolic lattice point count in infinite volume with applications to sieves, Duke Math. J. 149 (2009), no. 1, 1–36, DOI 10.1215/00127094-2009-035. MR2541126 (2011f:11125)

[Lal89] Steven P. Lalley, Renewal theorems in symbolic dynamics, with applications to geodesic flows, non-Euclidean tessellations and their fractal limits, Acta Math. 163 (1989), no. 1-2, 1–55, DOI 10.1007/BF02392732. MR1007619 (91c:58112)

[Lan08] E. Landau. Über die Einteilung der positiven ganzen Zahlen in vier Klassen nach der Mindestzahl der zu ihrer additiven Zusammensetzung erforderlichen Quadrate. Arch. der Math. u. Phys., 13(3):305–312, 1908.

[LMW02] Jeffrey C. Lagarias, Colin L. Mallows, and Allan R. Wilks, Beyond the Descartes circle theorem, Amer. Math. Monthly 109 (2002), no. 4, 338–361, DOI 10.2307/2695498. MR1903421 (2003e:51030)

[LO12] M. Lee and H. Oh. Effective circle count for Apollonian packings and closed horospheres, 2012. Preprint, arXiv:1202.1067.

[LP82] Peter D. Lax and Ralph S. Phillips, The asymptotic distribution of lattice points in Euclidean and non-Euclidean spaces, J. Funct. Anal. 46 (1982), no. 3, 280–350, DOI 10.1016/0022-1236(82)90050-7. MR661875 (83j:10057)

[Lub12] Alexander Lubotzky, Expander graphs in pure and applied mathematics, Bull. Amer. Math. Soc. (N.S.) 49 (2012), no. 1, 113–162, DOI 10.1090/S0273-0979-2011-01359-3. MR2869010 (2012m:05003)

[Mar54] J. M. Marstrand, Some fundamental geometrical properties of plane sets of fractional dimensions, Proc. London Math. Soc. (3) 4 (1954), 257–302. MR0063439 (16,121g)

[McM98] Curtis T. McMullen, Hausdorff dimension and conformal dynamics. III. Computation of dimension, Amer. J. Math. 120 (1998), no. 4, 691–721. MR1637951 (2000d:37055)

[MVW84] C. R. Matthews, L. N. Vaserstein, and B. Weisfeiler, Congruence properties of Zariski-dense subgroups. I, Proc. London Math. Soc. (3) 48 (1984), no. 3, 514–532, DOI 10.1112/plms/s3-48.3.514. MR735226 (85d:20040)

[Nau05] Frédéric Naud, Expanding maps on Cantor sets and analytic continuation of zeta functions, Ann. Sci. École Norm. Sup. (4) 38 (2005), no. 1, 116–153, DOI 10.1016/j.ansens.2004.11.002 (English, with English and French summaries). MR2136484 (2006e:37033)

[Nie78] Harald Niederreiter, Quasi-Monte Carlo methods and pseudo-random numbers, Bull. Amer. Math. Soc. 84 (1978), no. 6, 957–1041, DOI 10.1090/S0002-9904-1978-14532-7. MR508447 (80d:65016)

[Nov55] P. S. Novikov, Ob algoritmičeskoĭ nerazrešimosti problemy toždestva slov v teorii grupp, Trudy Mat. Inst. im. Steklov. no. 44, Izdat. Akad. Nauk SSSR, Moscow, 1955 (Russian). MR0075197 (17,706b)

[OEI] http://oeis.org/A195901.

[Oh10] Hee Oh, Dynamics on geometrically finite hyperbolic manifolds with applications to Apollonian circle packings and beyond, Proceedings of the International Congress of Mathematicians. Volume III, Hindustan Book Agency, New Delhi, 2010, pp. 1308–1331. MR2827842 (2012f:37004)

[OS12] Hee Oh and Nimish Shah, The asymptotic distribution of circles in the orbits of Kleinian groups, Invent. Math. 187 (2012), no. 1, 1–35, DOI 10.1007/s00222-011-0326-7. MR2874933 (2012k:37011)

[Pat76] S. J. Patterson, The limit set of a Fuchsian group, Acta Math. 136 (1976), no. 3-4, 241–273. MR0450547 (56 #8841)

[Pli] http://en.wikipedia.org/wiki/Plimpton_322.

[PS10] L. Pyber and E. Szabó. Growth in finite simple groups of Lie type of bounded rank, 2010. Preprint arXiv:1005.1858.

[Rap12] A. Rapinchuk. On strong approximation for algebraic groups, 2012. Preprint arXiv:1207.4425.

[Sar04] Peter Sarnak, What is... an expander?, Notices Amer. Math. Soc. 51 (2004), no. 7, 762–763. MR2072849

[Sar07] P. Sarnak. Letter to J. Lagarias, 2007. http://web.math.princeton.edu/sarnak/AppolonianPackings.pdf.

[Sar08] Peter Sarnak, Equidistribution and primes, Astérisque 322 (2008), 225–240. Géométrie différentielle, physique mathématique, mathématiques et société. II. MR2521658 (2010k:11146)

[Sar11] Peter Sarnak, Integral Apollonian packings, Amer. Math. Monthly 118 (2011), no. 4, 291–306, DOI 10.4169/amer.math.monthly.118.04.291. MR2800340 (2012e:52047)

[Sch72] Wolfgang M. Schmidt, Irregularities of distribution. VII, Acta Arith. 21 (1972), 45–50. MR0319933 (47 #8474)

[SG12] Alireza Salehi Golsefidy, Counting lattices in simple Lie groups: the positive characteristic case, Duke Math. J. 161 (2012), no. 3, 431–481, DOI 10.1215/00127094-1507421. MR2881228 (2012k:22015)

[SGS11] A. Salehi Golsefidy and P. Sarnak. Affine sieve, 2011. Preprint.

[SGV11] A. Salehi Golsefidy and P. Varjú. Expansion in perfect groups, 2011. Preprint.

[Sod36] F. Soddy. The kiss precise. Nature, 137:1021, 1936.

[Sod37] F. Soddy. The bowl of integers and the hexlet. Nature, 139:77–79, 1937.

[Sul84] Dennis Sullivan, Entropy, Hausdorff measures old and new, and limit sets of geometrically finite Kleinian groups, Acta Math. 153 (1984), no. 3-4, 259–277, DOI 10.1007/BF02392379. MR766265 (86c:58093)

[SX91] Peter Sarnak and Xiao Xi Xue, Bounds for multiplicities of automorphic representations, Duke Math. J. 64 (1991), no. 1, 207–227, DOI 10.1215/S0012-7094-91-06410-0. MR1131400 (92h:22026)

[Var10] P. Varjú. Expansion in SLd(OK/I), I square-free, 2010. arXiv:1001.3664v1.

[Vin37] I. M. Vinogradov. Representation of an odd number as a sum of three primes. Dokl. Akad. Nauk SSSR, 15:291–294, 1937.

[Vin12] I. Vinogradov. Effective bisector estimate with application to Apollonian circle packings, 2012. Princeton University Thesis, arXiv:1204.5498v1.

[Zar66] S. C. Zaremba, Good lattice points, discrepancy, and numerical integration, Ann. Mat. Pura Appl. (4) 73 (1966), 293–317. MR0218018 (36 #1107)

[Zar72] S. K. Zaremba, La méthode des “bons treillis” pour le calcul des intégrales multiples, Applications of number theory to numerical analysis (Proc. Sympos., Univ. Montréal, Montréal, Que., 1971), Academic Press, New York, 1972, pp. 39–119 (French, with English summary). MR0343530 (49 #8271)

Department of Mathematics, Yale University, New Haven, Connecticut

E-mail address: [email protected]