CUTOFF ON ALL RAMANUJAN GRAPHS
EYAL LUBETZKY AND YUVAL PERES
Abstract. We show that on every Ramanujan graph G, the simple
random walk exhibits cutoff: when G has n vertices and degree d, the
total-variation distance of the walk from the uniform distribution at
time $t = \frac{d}{d-2}\log_{d-1} n + s\sqrt{\log n}$ is asymptotically $P(Z > c\,s)$ where
Z is a standard normal variable and c = c(d) is an explicit constant.
Furthermore, for all 1 ≤ p ≤ ∞, d-regular Ramanujan graphs minimize
the asymptotic $L^p$-mixing time for SRW among all d-regular graphs.
Our proof also shows that, for every vertex x in G as above, its distance
from n − o(n) of the vertices is asymptotically $\log_{d-1} n$.
1. Introduction
A family of d-regular graphs Gn with d ≥ 3 fixed is called an expander,
following the works of Alon and Milman [4,6] from the 1980’s, if all nontrivial
eigenvalues of the adjacency matrices are uniformly bounded away from d.
Lubotzky, Phillips, and Sarnak [25] defined a connected d-regular graph G
with d ≥ 3 to be Ramanujan iff every eigenvalue λ of its adjacency matrix
is either ±d or satisfies $|\lambda| \le 2\sqrt{d-1}$. Such expanders, which in light of
the Alon–Boppana Theorem [29] have an asymptotically optimal spectral
gap, were first constructed, using deep number theoretic tools, in [25] and
independently by Margulis [28] (see also [13, 24] and Fig. 1). Due to their
remarkable expansion properties, Ramanujan graphs have found numerous
applications (cf. [19] and the references therein). However, after 25 years of
study, the geometry of these objects is still mysterious, and in particular,
determining the profile of distances between vertices in such a graph and
the precise mixing time of simple random walk (SRW) remained open.
Formally, letting $\|\mu-\nu\|_{\mathrm{tv}} = \sup_A[\mu(A)-\nu(A)]$ denote total-variation
distance, the ($L^1$) mixing time of a finite Markov chain with transition kernel
P and stationary distribution π is defined as
$$t_{\mathrm{mix}}(\varepsilon) = \min\{t : D_{\mathrm{tv}}(t) \le \varepsilon\} \quad\text{where}\quad D_{\mathrm{tv}}(t) = \max_x \|P^t(x,\cdot) - \pi\|_{\mathrm{tv}} .$$
A sequence of finite ergodic Markov chains is said to exhibit cutoff if its
total-variation distance from stationarity drops abruptly, over a period of
time referred to as the cutoff window, from near 1 to near 0; that is, there
is cutoff iff $t_{\mathrm{mix}}(\varepsilon) = (1+o(1))\,t_{\mathrm{mix}}(\varepsilon')$ for any fixed 0 < ε, ε′ < 1.
Our main result shows that the Ramanujan assumption implies cutoff
with $t_{\mathrm{mix}}(\varepsilon) = (\frac{d}{d-2} + o(1))\log_{d-1} n$ and window $O(\sqrt{\log n})$.
Figure 1. A ball of radius 4 in the Lubotzky–Phillips–Sarnak 6-regular Ramanujan graph on n = 12180 vertices via $\mathrm{PSL}(2,\mathbb{F}_{29})$.
Theorem 1. On any sequence of d-regular non-bipartite Ramanujan graphs,
SRW exhibits cutoff. More precisely, let G be such a graph on n vertices and
$$t_\star(n) := \frac{d}{d-2}\log_{d-1} n .$$
Then for every fixed s ∈ R and every initial vertex x, the SRW satisfies
$$D_{\mathrm{tv}}\big(t_\star(n) + s\sqrt{\log_{d-1} n}\big) \to P(Z > c_d\, s) \quad\text{as } n\to\infty , \tag{1.1}$$
where Z is a standard normal random variable and $c_d = \frac{(d-2)^{3/2}}{2\sqrt{d(d-1)}}$.
Consequently, we obtain that the profile of graph distances from every
vertex x in a d-regular Ramanujan graph G concentrates on $\log_{d-1} n$ (the
minimum possible value it can concentrate on in a d-regular graph).
Corollary 2. Let G be a d-regular Ramanujan graph on n vertices. Then
for every vertex x in G,
$$\#\{y : |\mathrm{dist}(x,y) - \log_{d-1} n| > 3\log_{d-1}\log n\} = o(n) ,$$
and furthermore, for all except o(n) vertices y there is a nonbacktracking
cycle¹ through x, y of length at most $2\log_{d-1} n + 6\log_{d-1}\log n$.
More can be said about high-girth Ramanujan graphs, e.g., the bipartite
LPS expanders whose girth is asymptotically $\frac{4}{3}\log_{d-1} n$ (see, e.g., [24, §7]).
Corollary 3. Let G be a d-regular Ramanujan graph with n vertices and
girth g, and set $R = \lceil \log_{d-1} n + 5\log_{d-1}\log n \rceil$. For every k ≤ g − R and
simple path $(x_i)_{i=1}^k$ in G, for all but a o(1)-fraction of simple paths $(y_i)_{i=1}^k$
in G, there are vertex-disjoint paths of length R from $x_i$ to $y_i$ for all i.
¹ A nonbacktracking cycle is a sequence of adjacent vertices $v_0, \ldots, v_k$ such that $v_0 = v_k$ and $v_i \ne v_{(i+2) \bmod k}$ for all i.
Figure 2. Distance of SRW from equilibrium in $L^1$ (blue) and $L^2$ (red, capped at 1). On left, the LPS graph on $\mathrm{PSL}(2,\mathbb{F}_{29})$ shown in Fig. 1; on right, asymptotics via Theorem 1 and Proposition 6. [Plots omitted; the plot labels $\frac{d}{d-2}\log_{d-1} n$ and $\frac12\log_{1/\rho} n$ appear.]
Figure 3. The analogue of Fig. 2 for the NBRW (see §1.4). [Plots omitted; the plot label $\log_{d-1} n$ appears.]
That Corollaries 2–3 also cover bipartite Ramanujan graphs (recently
shown to exist for every degree d ≥ 3 in [27]) follows from an extension of
the proof of Theorem 1 to the bipartite setting (see Corollary 3.9). Moreover,
it extends to the case where the graph G is weakly Ramanujan (see §1.2).
1.1. Background and related work. The cutoff phenomenon was first
identified in pioneering studies of Diaconis, Shahshahani and Aldous [1,2,14]
in the early 1980’s, and while believed to be widespread, rigorous examples
where it was confirmed were scarce. In view of the canonical example where
there is no cutoff—SRW on a cycle—and the fact that a necessary condition
for any reversible Markov chain to have cutoff is for its inverse spectral-gap to
be negligible compared to its mixing time, the second author conjectured [30]
in 2004 that on every transitive expander SRW has cutoff.
Durrett [15, §6] conjectured in 2008 that the random walk should have
cutoff on a uniformly chosen d-regular graph on n vertices (typically a good
expander) with probability tending to 1 as n→∞; indeed this is the case, as
was verified by the first author and Sly [22] in 2010. Subsequently, expanders
without cutoff were constructed in [23], but these were highly asymmetric.
The conjectured behavior of cutoff for all transitive expanders was reiterated
in the latter works (see [22, Conjecture 6.1] and [23, §3]), yet this was
neither verified nor refuted on any single example to date.
As a special case, Theorem 1 confirms cutoff on all transitive Ramanujan
graphs—in particular for the Lubotzky–Phillips–Sarnak graphs (see Fig. 2).
The concentration of measure phenomenon in expanders, discovered by
Alon and Milman [6], implies that the distance from a prescribed vertex
is concentrated up to an O(1)-window. Formally, for every sequence of
expander graphs $G_n$ on n vertices and vertex $x \in V(G_n)$ there exists a
sequence $m_{n,x}$ and constants a, C > 0 so that, for every r > 0,
$$\#\{y \in V(G_n) : |\mathrm{dist}_{G_n}(x,y) - m_{n,x}| > r\} \le C a^{-r} n . \tag{1.2}$$
Corollary 2 shows that $m_{n,x} = \log_{d-1} n + O(\log\log n)$ for Ramanujan graphs.
As for the diameter, Alon and Milman ([6, Theorem 2.7]) showed that
$\mathrm{diam}(G) \le 2\sqrt{2d/(d-\lambda)}\,\log_2 n$ for every d-regular graph G on n vertices
where all nontrivial eigenvalues are at most λ in absolute value. This bound
was improved to $\lceil \log_{d/\lambda}(n-1) \rceil$ by Chung [11, Theorem 1], and then to
$\lfloor \cosh^{-1}(n-1)/\cosh^{-1}(d/\lambda) \rfloor + 1$ in [12] using properties of $T_k(x)$, the Chebyshev polynomials
of the first kind. Since $\cosh(\frac12\log(d-1)) = d/(2\sqrt{d-1})$ for any d, this bound
translates to $2\log_{d-1} n + O(1)$ for Ramanujan graphs, and remains the best
known upper bound on the diameter of the LPS expanders (for which this
was proved directly in [25] via the polynomials $T_k(x)$ as later used in [12]).
Corollary 2 implies this asymptotically for every Ramanujan graph: as the
distance from any vertex x to most of the vertices is $(1+o(1))\log_{d-1} n$,
the distance between any two vertices x, y is at most $(2+o(1))\log_{d-1} n$.
Moreover, one can deduce that for every two vertices x, y and every integer
$\ell \ge (2+o(1))\log_{d-1} n$, there exists a path of length exactly ℓ between x, y.
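As a rough numerical illustration of the diameter discussion above, the following sketch evaluates the Chebyshev-type bound $\lfloor \cosh^{-1}(n-1)/\cosh^{-1}(d/\lambda) \rfloor + 1$ at the Ramanujan threshold $\lambda = 2\sqrt{d-1}$ and compares it with $2\log_{d-1} n$. The function names are illustrative choices, not notation from the paper, and the sample values of n are arbitrary apart from n = 12180 from Fig. 1.

```python
import math

def chebyshev_diameter_bound(n: int, d: int, lam: float) -> int:
    """floor(acosh(n-1) / acosh(d/lam)) + 1, the bound quoted from [12]."""
    return math.floor(math.acosh(n - 1) / math.acosh(d / lam)) + 1

def ramanujan_diameter_bound(n: int, d: int) -> int:
    """The same bound with lam = 2*sqrt(d-1), the Ramanujan threshold."""
    return chebyshev_diameter_bound(n, d, 2 * math.sqrt(d - 1))

d = 6
# Identity used in the text: cosh((1/2) log(d-1)) = d / (2 sqrt(d-1)).
identity_holds = math.isclose(math.cosh(0.5 * math.log(d - 1)),
                              d / (2 * math.sqrt(d - 1)))

for n in (12180, 10**6, 10**9):
    bound = ramanujan_diameter_bound(n, d)
    leading = 2 * math.log(n, d - 1)  # 2 log_{d-1} n
    print(n, bound, round(leading, 3), identity_holds)
```

In these samples the gap between the bound and $2\log_{d-1} n$ stays below $2\log_{d-1} 2 + 1$, consistent with the O(1) term in the text.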
A new impetus for understanding distances in Ramanujan graphs is due
to their role as building blocks in quantum computing; see the influential
letter by P. Sarnak [32]. Some of Sarnak’s ideas were developed further
by his student N.T. Sardari in an insightful paper [31] posted to the arXiv
a few months after the initial posting of the present paper. For a certain
infinite family of (p + 1)-regular n-vertex Ramanujan graphs, Sardari [31]
shows that the diameter is at least $\lfloor \frac{4}{3}\log_p(n) \rfloor$ and also gives an alternative
proof of the first part of Corollary 2.
1.2. Extensions. A sequence of connected d-regular graphs (d ≥ 3 fixed)
$G_n$ on n vertices is called weakly Ramanujan if, for some $\delta_n = o(1)$ as
n → ∞, every eigenvalue λ of $G_n$ is either ±d or has $|\lambda| \le 2\sqrt{d-1} + \delta_n$.
Theorem 4. On any sequence of d-regular non-bipartite weakly Ramanujan
graphs, SRW exhibits cutoff. More precisely, if $G_n$ is such a graph on n
vertices then for every initial vertex x, the SRW has
$$t_{\mathrm{mix}}(\varepsilon) = \Big(\frac{d}{d-2} + o(1)\Big)\log_{d-1} n \quad\text{for every fixed } 0 < \varepsilon < 1 .$$
Corollary 5. Let $G_n$ be a d-regular weakly Ramanujan sequence of graphs
on n vertices. Then for every vertex x in $G_n$,
$$\#\Big\{y : \Big|\frac{\mathrm{dist}(x,y)}{\log_{d-1} n} - 1\Big| > \varepsilon\Big\} = o(n) \quad\text{for every fixed } \varepsilon > 0 .$$
Remark. The weakly Ramanujan hypothesis in Theorem 4 and Corollary 5
may be relaxed to allow some exceptional eigenvalues; for instance, we can
allow $n^{o(1)}$ eigenvalues λ to only satisfy |λ| < d − ε′ for some ε′ > 0 fixed.
By the result of Friedman [17] that a random (uniformly chosen) d-regular
graph on n vertices is typically weakly Ramanujan (as conjectured by Alon),
Theorem 4 then implies cutoff, re-deriving the above mentioned result of [22].
More generally, for two graphs F and G, a covering map φ : V(G) → V(F)
is a graph homomorphism that, for every x ∈ V(G), induces a bijection
between the edges incident to x and those incident to φ(x). If such a map
exists, we say G is a lift (or a cover) of F; a random n-lift of F is a uniformly
chosen lift out of all those with cover number n (i.e., $|\varphi^{-1}(x)| = n$ for all x).
Friedman and Kohler [18] recently proved (see also [9, Corollary 20] by
Bordenave) that for every fixed d-regular base graph F and δ > 0, if G is
a random n-lift of F then typically all of its “new” eigenvalues (those not
inherited from F via pullback) are at most $2\sqrt{d-1} + \delta$. By the remark
above, Theorem 4 and Corollary 5 apply here (for any fixed regular F ).
1.3. Cutoff in Lp-distance. Theorem 1 showed that Ramanujan graphs
have an optimal tmix for SRW: the total-variation distance from (1.1) matches
a lower bound valid for every d-regular graph on n vertices (Fact 2.1 in §2).
It turns out that Ramanujan graphs are extremal for Lp-mixing for all p ≥ 1.
For 1 ≤ p ≤ ∞, the Lp-mixing time of a Markov chain with transition
kernel P from its stationary distribution π is defined as
$$t^{(L^p)}_{\mathrm{mix}}(\varepsilon) = \min\{t : D_p(t) \le \varepsilon\} \quad\text{where}\quad D_p(t) = \max_x \Big\| \frac{P^t(x,\cdot)}{\pi} - 1 \Big\|_{L^p(\pi)}$$
(note that p = 1 measures total-variation mixing since D1(t) = 2Dtv(t),
whereas the L2-distance D2(t) is also known as the chi-square distance).
Chen and Saloff-Coste [10, Theorem 1.5] showed that a lazy random walk
on a family of expander graphs exhibits Lp-cutoff, at some unknown location,
for all p ∈ (1,∞]. (On the notable exception of p = 1, it is said in [10] that
there “the question is more subtle and no good general answer is known.”)
The following result gives a lower bound on $t^{(L^p)}_{\mathrm{mix}}$ for SRW on a d-regular
graph, asymptotically achieved by Ramanujan graphs for all p ∈ (1,∞].
Figure 4. Normalized eigenvalues of Ramanujan graphs on $n \approx 10^4$
vertices: the 6-regular LPS expanders on $\mathrm{PSL}(2,\mathbb{F}_q)$ for q = 29
(front; every nontrivial eigenvalue has multiplicity at least $\frac{q-1}{2}$) and
a 1000-lift of the 3-regular Petersen graph (back).
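Proposition 6. Fix d ≥ 3 and let $\rho = 2\sqrt{d-1}/d$. Then for all connected
d-regular graphs G on n vertices and every fixed ε > 0, the SRW satisfies
$$t^{(L^p)}_{\mathrm{mix}}(\varepsilon) \ge \begin{cases} c_{d,p}\log_{d-1} n - O(\log\log n) & \text{if } p \in (1,2] \\ \frac{p-1}{p}\log_{1/\rho} n - O(\log\log n) & \text{if } p \in [2,\infty] \end{cases} , \tag{1.3}$$
where $c_{d,p} = \big[2\beta - 1 + \frac{p}{p-1} H_{d-1}\big(\beta \,\|\, \frac{d-1}{d}\big)\big]^{-1}$ for $\beta = [1 + (d-1)^{(p-2)/p}]^{-1}$ and
$H_b(\beta \,\|\, \alpha) = \beta\log_b\big(\frac{\beta}{\alpha}\big) + (1-\beta)\log_b\big(\frac{1-\beta}{1-\alpha}\big)$ is the relative entropy function.
Furthermore, if G is non-bipartite Ramanujan then, with the same notation,
$$t^{(L^p)}_{\mathrm{mix}}(\varepsilon) = \begin{cases} c_{d,p}\log_{d-1} n + O(\log\log n) & \text{if } p \in (1,2] \\ \frac{p-1}{p}\log_{1/\rho} n + O(\log\log n) & \text{if } p \in [2,\infty] \end{cases} . \tag{1.4}$$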
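To get a feel for the constants in Proposition 6, here is a minimal sketch that evaluates $c_{d,p}$ and the leading coefficient of $\log_{d-1} n$ in (1.4). The function names are illustrative and not from the paper. Under the formulas as stated, the two branches should agree at p = 2, and $c_{d,p}$ should approach $\frac{d}{d-2}$ as p decreases to 1, matching the total-variation cutoff location of Theorem 1.

```python
import math

def rel_entropy(beta: float, alpha: float, base: float) -> float:
    """H_b(beta || alpha) with logs in the given base; 0*log(0) is read as 0."""
    total = 0.0
    if beta > 0:
        total += beta * math.log(beta / alpha, base)
    if beta < 1:
        total += (1 - beta) * math.log((1 - beta) / (1 - alpha), base)
    return total

def c_dp(d: int, p: float) -> float:
    """c_{d,p} from Proposition 6, intended for p in (1, 2]."""
    beta = 1.0 / (1.0 + (d - 1) ** ((p - 2) / p))
    h = rel_entropy(beta, (d - 1) / d, d - 1)
    return 1.0 / (2 * beta - 1 + p / (p - 1) * h)

def lp_cutoff_coefficient(d: int, p: float) -> float:
    """Leading coefficient of log_{d-1} n in (1.4); p = math.inf is allowed."""
    if p <= 2:
        return c_dp(d, p)
    rho = 2 * math.sqrt(d - 1) / d
    ratio = 1.0 if math.isinf(p) else (p - 1) / p
    # log_{1/rho} n = log_{d-1} n / log_{d-1}(1/rho)
    return ratio / math.log(1 / rho, d - 1)

for p in (1.001, 1.5, 2.0, 3.0, math.inf):
    print(p, round(lp_cutoff_coefficient(6, p), 4))
```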
1.4. Method of proof. The natural route to exploit spectral details on
the transition kernel P for an upper bound on the L1-distance from the
stationary distribution π is via the L2-distance (see, e.g., [19, Theorem 3.2]).
However, this fails to give the sought bound $\frac{d+o(1)}{d-2}\log_{d-1} n$ for the SRW, as
we see from Proposition 6 that the SRW on Ramanujan graphs exhibits an
$L^2$-cutoff at $\frac12\log_{1/\rho} n > (1+\eta)\frac{d}{d-2}\log_{d-1} n$ for some η(d) > 0 (see (2.7)).
To remedy this, we turn to the nonbacktracking random walk (NBRW),
which moves from the directed edge (x, y) to a uniformly chosen edge (y, z)
such that z ≠ x. In recent years, delicate spectral information on random
graphs has been extracted by counting nonbacktracking paths; notably, this
was essential in the proofs that random d-regular graphs and random lifts
are weakly Ramanujan [9,17,18]. Here we follow the reverse route, and use
spectral information on the graph to control the nonbacktracking paths.
Figure 5. Eigenvalues of the nonbacktracking operator B of the graphs from Fig. 4 (with the LPS expander on the left). Colors of chords between pairs of eigenvalues θ, θ′ depict the inner product of their corresponding eigenvectors w, w′ (blue near 1). [Plots omitted.]
The known relation between the spectrum of G and the spectrum of
the nonbacktracking operator B implies that if G is Ramanujan, each of
its nontrivial eigenvalues λ is mapped to eigenvalues θ, θ′ ∈ C of B with
modulus $\sqrt{d-1}$ (see Fig. 4–5 showing this effect for two Ramanujan graphs
with drastically different spectral features). For intuition, note that, had
the operator B been self-adjoint and transitive (it is neither), we would
have gotten that the $L^2$-distance at time t is $O(\sqrt{n}\,(d-1)^{-t/2})$, implying
the correct upper bound of $(1+o(1))\log_{d-1} n$ for the NBRW.
Fortunately, it turns out that, while not a normal operator, B is unitarily
similar to a matrix Λ that is block-diagonal with n− 1 non-singleton blocks
(n− 2 if G is bipartite), each of which has size 2× 2 (despite potential high
multiplicities in the eigenvalues of G) and corresponds to an eigenvector
pair w, w′ with matching eigenvalues θ, θ′. This description of B appears in
Proposition 3.1 and may be of independent interest.
1.5. Organization. The rest of this paper is organized as follows. Section 2
describes the reduction of L1-mixing for the SRW to that of the NBRW and
establishes the optimality of the Lp-cutoff of SRW on Ramanujan graphs for
all p > 1 (Proposition 6). Section 3 studies the NBRW, beginning in §3.1
with the aforementioned spectral decomposition and its properties (an exact
computation of the off-diagonal entries is deferred to Proposition 4.1 in §4).
In §3.2 we give the proof of the non-bipartite case, which implies Theorem 1
and Corollary 2; and §3.3 includes the proofs of the extensions to bipartite
and weakly Ramanujan graphs, which imply Theorem 4 and Corollary 5.
2. Simple random walk
2.1. Reduction to NBRW. As described in [22] (see §2.3 and §5.2 there),
cutoff for SRW can be reduced to cutoff for the NBRW as follows. Let
G = (V,E) be a d-regular graph and let $T_d$ be the infinite regular tree
rooted at ξ, the universal cover of G. For a given vertex x ∈ V, consider a
cover map $\varphi : T_d \to V$ with φ(ξ) = x, and observe that if $(\mathcal{X}_t)$ is SRW on $T_d$
started at ξ, then $X_t = \varphi(\mathcal{X}_t)$ is SRW on G started at x. (This was also used
in the proof of the Alon–Boppana Theorem given in [25, Proposition 4.2].)
Similarly, if $\mathcal{Y}_t$ is NBRW on $T_d$ started at $(\xi,\sigma) \in \vec{E}(T_d)$, and we write
$\mathcal{Y}_t = (\mathcal{Y}'_t, \mathcal{Y}''_t)$ to denote its endpoint vertices, then $Y_t = (Y'_t, Y''_t)$ given
by $Y'_t = \varphi(\mathcal{Y}'_t)$ and $Y''_t = \varphi(\mathcal{Y}''_t)$ is NBRW on G started at (x, φ(σ)). By
symmetry, if
$$E_{t,\ell} := \{\mathrm{dist}(\xi, \mathcal{X}_t) = \ell\} ,$$
then the conditional distribution of $\mathcal{X}_t$ given $E_{t,\ell}$ is uniform over the vertices
at distance ℓ from ξ in $T_d$. Therefore,
$$P_x(X_t \in \cdot \mid E_{t,\ell}) = \frac{1}{d}\sum_{\sigma:\,\xi\sigma \in E(T_d)} P_{(x,\varphi(\sigma))}\big(Y''_{\ell-1} \in \cdot\big) .$$
As a projection can only decrease total-variation distance, letting $\ell = t_{\mathrm{mix}}(\varepsilon)$
for the NBRW on G and π be the uniform distribution over V(G), we get
$$\|P_x(X_t \in \cdot) - \pi\|_{\mathrm{tv}} \le \varepsilon + P(\mathrm{dist}(\xi, \mathcal{X}_t) < \ell) ,$$
and in particular, taking a maximum over x shows that the SRW on G has
$$D_{\mathrm{tv}}(t) \le \varepsilon + P(\mathrm{dist}(\xi,\mathcal{X}_t) < \ell) . \tag{2.1}$$
Finally, since SRW on $T_d$ (d ≥ 3) is transient, $\mathcal{X}_t$ returns to ξ only a finite
number of times almost surely. If $\mathcal{X}_t \ne \xi$ then $\mathrm{dist}(\mathcal{X}_{t+1},\xi) - \mathrm{dist}(\mathcal{X}_t,\xi)$ is
equal to −1 with probability 1/d and +1 otherwise. Therefore, by the CLT,
$$\frac{\mathrm{dist}(\mathcal{X}_t,\xi) - ((d-2)/d)\,t}{(2\sqrt{d-1}/d)\sqrt{t}} \Rightarrow N(0,1) . \tag{2.2}$$
Thus, if ℓ → ∞ then by (2.1), for every fixed s ≥ 0 the SRW on G satisfies
$$\limsup_{n\to\infty} D_{\mathrm{tv}}\big(\tfrac{d}{d-2}\ell + s\sqrt{\ell}\big) \le \varepsilon + P(Z > c_d\, s) , \tag{2.3}$$
where Z is a standard normal random variable and $c_d = \frac{(d-2)^{3/2}}{2\sqrt{d(d-1)}}$.
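The constants in (2.2) and (2.3) can be traced by hand: away from the root, one increment of the distance is +1 with probability (d−1)/d and −1 with probability 1/d. The sketch below (function names are illustrative only) computes the mean and standard deviation of that increment, and recovers $c_d$ by standardizing the event that the distance is below ℓ at time $\frac{d}{d-2}\ell + s\sqrt{\ell}$, keeping only the leading-order terms.

```python
import math

def step_mean_std(d: int):
    """Mean and standard deviation of one increment of the distance from the
    root: +1 with probability (d-1)/d and -1 with probability 1/d."""
    mean = (d - 1) / d - 1 / d
    var = 1 - mean ** 2  # the increment squared is always 1
    return mean, math.sqrt(var)

def c_d(d: int) -> float:
    """c_d = (d-2)^{3/2} / (2 sqrt(d (d-1))), as in Theorem 1."""
    return (d - 2) ** 1.5 / (2 * math.sqrt(d * (d - 1)))

def limiting_tv_profile(d: int, s: float) -> float:
    """P(Z > c_d * s) for a standard normal Z."""
    return 0.5 * math.erfc(c_d(d) * s / math.sqrt(2))

for d in (3, 6, 12):
    mean, std = step_mean_std(d)
    # Drift surplus mean*s*sqrt(l) divided by the spread std*sqrt(d/(d-2) * l).
    derived = mean / (std * math.sqrt(d / (d - 2)))
    print(d, round(mean, 4), round(std, 4), round(derived, 4), round(c_d(d), 4))
```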
Conversely, the number of vertices at distance ℓ from a given vertex x ∈ V
is at most $d(d-1)^{\ell}$. So, on the event $\mathrm{dist}(\mathcal{X}_t,\xi) < \log_{d-1}(\varepsilon n/d)$, the SRW
$X_t$ is confined to a set of at most εn vertices of G, thus its total-variation
distance from π is at least 1 − ε. Altogether, (2.2) implies the following.
Figure 6. The $L^p$-distance (p ≥ 1) from equilibrium of SRW on the LPS graph on $\mathrm{PSL}(2,\mathbb{F}_{29})$ shown in Fig. 1, highlighting p = 1, 2, ∞. [Plot omitted; curves labeled $L^1$, $L^{3/2}$, $L^2$, $L^3$, $L^5$, $L^{25}$, $L^\infty$.]
Fact 2.1. For every d-regular graph G on n vertices with d ≥ 3 fixed, and
every fixed s, ε > 0, the SRW on G satisfies
$$\liminf_{n\to\infty} D_{\mathrm{tv}}\big(t - s\sqrt{\log_{d-1} n}\big) \ge 1 - \varepsilon - P(Z > c_d\, s) \tag{2.4}$$
at $t = \frac{d}{d-2}\log_{d-1}(\varepsilon n/d)$, where $c_d = \frac{(d-2)^{3/2}}{2\sqrt{d(d-1)}}$ and Z ∼ N(0, 1).
Comparing the two bounds (2.3)–(2.4) with the desired estimate (1.1) for
the SRW in Theorem 1, we see that the latter will follow if we show that the
NBRW has cutoff at time $\log_{d-1} n + o(\sqrt{\log n})$ with window $o(\sqrt{\log n})$. This
will be achieved in §3 via a spectral analysis of the nonbacktracking walk.
2.2. Optimal Lp-mixing on Ramanujan graphs. We begin with the
special case p = 2 of Proposition 6.
Lemma 2.2. Fix d ≥ 3 and let $\rho = 2\sqrt{d-1}/d$. For every fixed ε > 0 and
every connected d-regular graph G on n vertices, the SRW satisfies
$$t^{(L^2)}_{\mathrm{mix}}(\varepsilon) \ge \tfrac12\log_{1/\rho} n - O(\log\log n) .$$
Moreover, if G is non-bipartite Ramanujan then this is tight: SRW has an
$L^2$-cutoff at $\frac12\log_{1/\rho} n > (1+\eta)\frac{d}{d-2}\log_{d-1} n$ for some constant η = η(d) > 0.
Proof of Lemma 2.2. Let $P^t$ be the t-step transition kernel of SRW, and
let π be the uniform distribution on V(G). For any x ∈ V(G),
$$\sum_y P^t(x,y)^2 = P^{2t}(x,x) \ge Q^{2t}(\xi,\xi) ,$$
where $Q^t$ is the t-step transition kernel of SRW on $T_d$, the infinite d-regular
tree rooted at ξ; indeed, as argued above, if $\mathcal{X}_t$ is SRW on the cover tree $T_d$
then $X_t = \varphi(\mathcal{X}_t)$ is SRW on G, where φ is the cover map, and in particular
a return to the root in the former implies a return to the origin in the latter.
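The probability $Q^{2t}(\xi,\xi)$ is nothing but the probability of a $\frac{1}{d}$-biased
walk, reflected at 0, to be 0 at time 2t, well-known (cf. [35, §5, p128]) to be
$$Q^{2t}(\xi,\xi) = \frac{2\rho^2}{1-\rho}\,\frac{\rho^{2t}}{t\,2^{2t-2}}\binom{2t-2}{t-1} \sim \frac{2\rho^2}{1-\rho^2}\,\frac{\rho^{2t}}{\sqrt{\pi t^3}} \quad\text{for } \rho = \frac{2\sqrt{d-1}}{d} . \tag{2.5}$$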
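In particular, using the standard expansion of the $L^2$-distance,
$$\sum_x \Big(\frac{\mu(x)}{\pi(x)} - 1\Big)^2 \pi(x) = \sum_x \frac{\mu^2(x)}{\pi(x)} - 1 , \tag{2.6}$$
which holds for every probability distribution µ, we have
$$\Big\|\frac{P^t(x,\cdot)}{\pi} - 1\Big\|^2_{L^2(\pi)} \ge c_d\, n\,\rho^{2t}\, t^{-3/2} - 1$$
for $c_d = 2\rho^2[(1-\rho^2)\sqrt{\pi}]^{-1}$. Consequently, from any initial x ∈ V we have
$$t^{(L^2)}_{\mathrm{mix}}(x,\varepsilon) \ge \frac{\log(n/\varepsilon)}{2\log(1/\rho)} - O(\log\log n) ,$$
where $t^{(L^2)}_{\mathrm{mix}}(x,\varepsilon)$ is the first t where $\|P^t(x,\cdot)/\pi - 1\|_{L^2(\pi)}$ becomes at most ε.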
We next argue that
$$\tfrac12\log_{1/\rho} n > \tfrac{d}{d-2}\log_{d-1} n \quad\text{for every real } d \in (2,\infty) \text{ and } n \ge 2 . \tag{2.7}$$
Indeed, (2.7) is equivalent to having $\frac{d-2}{d}\log(d-1) > 2\log\big(\frac{d}{2\sqrt{d-1}}\big)$ for all
real d ∈ (2,∞), which, in turn, immediately follows from the fact that
$$f(d) := \frac{d-2}{d}\log(d-1) - 2\log\Big(\frac{d}{2\sqrt{d-1}}\Big)$$
has $f'(d) = 2d^{-2}\log(d-1)$, so f(2) = 0 whereas f′(d) > 0 for all d > 2.
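The monotonicity argument for (2.7) is easy to sanity-check numerically. The sketch below (helper names are illustrative) evaluates f and the claimed derivative, compares the latter with a central finite difference, and prints the ratio of the $L^2$-cutoff location $\frac12\log_{1/\rho} n$ to the $L^1$-cutoff location $\frac{d}{d-2}\log_{d-1} n$, which does not depend on n.

```python
import math

def f(d: float) -> float:
    """f(d) = (d-2)/d * log(d-1) - 2 log(d / (2 sqrt(d-1)))."""
    return (d - 2) / d * math.log(d - 1) - 2 * math.log(d / (2 * math.sqrt(d - 1)))

def f_prime(d: float) -> float:
    """Claimed derivative 2 d^{-2} log(d-1)."""
    return 2 * math.log(d - 1) / d ** 2

def l2_over_l1(d: float) -> float:
    """Ratio of (1/2) log_{1/rho} n to d/(d-2) log_{d-1} n."""
    rho = 2 * math.sqrt(d - 1) / d
    return (0.5 / math.log(1 / rho)) / (d / (d - 2) / math.log(d - 1))

h = 1e-5
for d in (3, 6, 12, 100):
    finite_diff = (f(d + h) - f(d - h)) / (2 * h)
    print(d, round(f(d), 5), round(finite_diff - f_prime(d), 9), round(l2_over_l1(d), 4))
```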
Finally, when G is Ramanujan, the sought upper bound on the $L^2$-distance
follows from considering the spectral representation (see, e.g., [3])
$$\|P^t(x,\cdot)/\pi - 1\|^2_{L^2(\pi)} = n\sum_{i=2}^n |f_i(x)|^2 (\lambda_i/d)^{2t} \tag{2.8}$$
for $\{f_i\}_{i=1}^n$ an orthonormal basis of eigenfunctions with eigenvalues $\{\lambda_i\}_{i=1}^n$
of the adjacency matrix and $\lambda_1 = d$, and plugging in $|\lambda_i| \le 2\sqrt{d-1}$. □
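As a small worked instance of (2.8), consider the Petersen graph, chosen here purely for illustration: it is 3-regular on n = 10 vertices, vertex-transitive, and has adjacency eigenvalues 3, 1 (multiplicity 5) and −2 (multiplicity 4), so it is Ramanujan. By transitivity, the right-hand side of (2.8) becomes $5(1/3)^{2t} + 4(2/3)^{2t}$ from every starting vertex. The sketch below computes the left-hand side exactly with rational arithmetic; the function names are hypothetical.

```python
from fractions import Fraction

def petersen_adjacency():
    """Adjacency sets of the Petersen graph (3-regular, n = 10)."""
    adj = {v: set() for v in range(10)}
    def add(u, v):
        adj[u].add(v)
        adj[v].add(u)
    for i in range(5):
        add(i, (i + 1) % 5)          # outer 5-cycle
        add(i, i + 5)                # spokes
        add(5 + i, 5 + (i + 2) % 5)  # inner pentagram
    return adj

def srw_distribution(adj, x, t):
    """Exact law of SRW after t steps from x, as Fractions."""
    d = len(adj[x])
    dist = {v: Fraction(0) for v in adj}
    dist[x] = Fraction(1)
    for _ in range(t):
        new = {v: Fraction(0) for v in adj}
        for v, mass in dist.items():
            for u in adj[v]:
                new[u] += mass / d
        dist = new
    return dist

def l2_distance_squared(adj, x, t):
    """|| P^t(x,.)/pi - 1 ||^2 in L^2(pi), with pi uniform."""
    n = len(adj)
    dist = srw_distribution(adj, x, t)
    return sum((n * q - 1) ** 2 for q in dist.values()) / n

adj = petersen_adjacency()
for t in range(1, 5):
    spectral = 5 * Fraction(1, 3) ** (2 * t) + 4 * Fraction(2, 3) ** (2 * t)
    print(t, l2_distance_squared(adj, 0, t), spectral)
```

Here $\rho^2 = 8/9$, so the Ramanujan bound $n\rho^{2t}$ on the squared $L^2$-distance reads $10\,(8/9)^t$.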
Remark 2.3. A different perspective on Lemma 2.2 is given by the next
proof of a slightly weaker statement. By the generalization by Serre [33]
(see [13, Theorem 1.4.9]) of the Alon–Boppana Theorem [29], for every ε > 0
there exists $c_{\varepsilon,d} > 0$ such that G has at least $c_{\varepsilon,d}\, n$ eigenvalues λ with
$|\lambda| > 2\sqrt{d-1} - \varepsilon$. Applying this fact for some ε(d) > 0 to be specified
later, since $\frac1n\sum_x \big\|\frac{P^t(x,\cdot)}{\pi} - 1\big\|^2_{L^2(\pi)} = \sum_{i=2}^n (\lambda_i/d)^{2t}$ where $\lambda_2,\ldots,\lambda_n$ are the
nontrivial eigenvalues of G (this follows from (2.8) since an average over x
allows one to replace $\sum_x |f_i(x)|^2$ by 1 for each i), we deduce that
$$\max_x \Big\|\frac{P^t(x,\cdot)}{\pi} - 1\Big\|^2_{L^2(\pi)} \ge \frac1n\sum_x \Big\|\frac{P^t(x,\cdot)}{\pi} - 1\Big\|^2_{L^2(\pi)} \ge c_{\varepsilon,d}\, n\Big(\frac{2\sqrt{d-1}-\varepsilon}{d}\Big)^{2t} .$$
Consequently,
$$t^{(L^2)}_{\mathrm{mix}}(\delta) \ge \frac{\log(n/\delta)}{2\log\big(\frac{d}{2\sqrt{d-1}-\varepsilon}\big)} - O(1) . \tag{2.9}$$
The proof now follows from (2.7) as we may choose ε(d), η(d) > 0 so that the
right-hand side of (2.9) would be at least $(1+\eta-o(1))\frac{d}{d-2}\log_{d-1} n$, as needed.
For the general case of p ∈ [1,∞], we need the following simple claims.
Claim 2.4. Let G be a d-regular graph on n vertices, and let $T_d$ be the
infinite d-regular tree rooted at ξ. For every 1 ≤ p < ∞, SRW on G satisfies
$$\|P^t(x,\cdot)/\pi - 1\|_{L^p(\pi)} \ge n^{(p-1)/p}\|Q^t(\xi,\cdot)\|_p - 1$$
for all x, t, where P and Q are the transition kernels of SRW on G and $T_d$.
Proof. By the triangle inequality w.r.t. $\|\cdot\|_{L^p(\pi)}$,
$$\|P^t(x,\cdot)/\pi - 1\|_{L^p(\pi)} \ge n^{(p-1)/p}\|P^t(x,\cdot)\|_p - 1 .$$
Since $P^t(x,y) = \sum_{\eta\in\varphi^{-1}(y)} Q^t(\xi,\eta)$ for every cover map $\varphi : V(T_d) \to V(G)$
with φ(ξ) = x, using the fact $(\sum_{i=1}^k a_i)^p \ge \sum_{i=1}^k a_i^p$ for every $a_1,\ldots,a_k > 0$
and p ≥ 1 gives
$$\big(P^t(x,y)\big)^p \ge \sum_{\eta\in\varphi^{-1}(y)} \big(Q^t(\xi,\eta)\big)^p .$$
Summing over all y gives $\|P^t(x,\cdot)\|_p \ge \|Q^t(\xi,\cdot)\|_p$, as required. □
Claim 2.5. Fix d ≥ 3 and let $T_d$ be the infinite d-regular tree rooted at ξ.
There exist constants $c_1(d), c_2(d) > 0$ such that, for all k and t,
$$c_1(d) \le \frac{P_\xi(|X_t| = k)}{\frac{k+1}{t}\,P\big(Z_t = \frac{k+t}{2}\big)} \le c_2(d)$$
where $|X_t|$ is the distance of $X_t$ from its origin ξ, and $Z_t \sim \mathrm{Bin}(t, \frac{d-1}{d})$.
Proof. The case k = 0 follows from (2.5), since $P(Z_{2t} = t) = (\frac{d-1}{d})^t d^{-t}\binom{2t}{t}$,
which is $(\rho/2)^{2t}\binom{2t}{t}$. This extends to all k using the decomposition
$$P_\xi(|X_t| = k) = \sum_{\ell=0}^{t-1} P_\xi(|X_\ell| = 0)\, P_\xi\big(\{|X_j| > 0 : 1 \le j \le t-\ell\},\ |X_{t-\ell}| = k\big)$$
and the Ballot Theorem (see, e.g., [16, §III.1]). □
Figure 7. $L^p$-cutoff location (normalized by $\log_{d-1} n$) as a function of p ≥ 1 for Ramanujan graphs with degree d = 3, . . . , 12. The functions are $C^1$, but not $C^2$ at p = 2. [Plot omitted; curves labeled d = 3 through d = 12.]
For more general local limit theorems on trees, see, e.g., [21].
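Proof of Proposition 6. With Claims 2.4 and 2.5 in mind, and using their
notation, for every t and p ∈ [1,∞] we have
$$\|Q^t(\xi,\cdot)\|_p^p = \sum_{k\ge0} (d-1)^k\big((d-1)^{-k} P_\xi(|X_t| = k)\big)^p \ge \Big(\frac{c_1(d)}{t}\Big)^p \sum_{k\ge0}\big((d-1)^{k(1-p)/p}\, P(Z_t = (k+t)/2)\big)^p . \tag{2.10}$$
Writing βt = (k + t)/2 (so that k = (2β − 1)t), the large deviation estimate
$$P(Z_t = \beta t) \asymp t^{-1/2}\exp\big[-H_e\big(\beta \,\|\, \tfrac{d-1}{d}\big)\,t\big] \tag{2.11}$$
for the binomial variable $Z_t$ thus leads to the following optimization problem:
$$\min\Big\{\frac{p-1}{p}(2\beta-1) + H_{d-1}\big(\beta \,\|\, \tfrac{d-1}{d}\big) : \tfrac12 \le \beta \le 1\Big\} . \tag{2.12}$$
(Observe that in fact $\beta \le \frac{d-1}{d}$ since for $\beta > \frac{d-1}{d}$ both terms are increasing.)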
d both terms are increasing.)
Let f(β) denote the objective in (2.12). Then
$$f'(\beta) = \frac{2(p-1)}{p} + \log_{d-1}\Big(\frac{\beta}{(1-\beta)(d-1)}\Big) ,$$
and solving f′(β) = 0 we get $\frac{1-\beta}{\beta} = (d-1)^{(p-2)/p}$. Since f″(β) is positive,
it follows that the minimizer of (2.12) is at
$$\beta_* = \frac{1}{(d-1)^{(p-2)/p} + 1} \vee \frac12 . \tag{2.13}$$
(Observe that $\beta_* = 1/2$ iff p ≥ 2, hence the two regimes for the $L^p$-cutoff
location as a function of p.) By (2.10)–(2.11), for some $c = c_d > 0$,
$$\|Q^t(\xi,\cdot)\|_p \ge c_d\, t^{-3/2}(d-1)^{-f(\beta_*)t} ,$$
and therefore, by Claim 2.4, for every starting vertex x,
$$\Big\|\frac{P^t(x,\cdot)}{\pi} - 1\Big\|_{L^p(\pi)} \ge c_d\, n^{(p-1)/p}\, t^{-3/2}(d-1)^{-f(\beta_*)t} - 1 . \tag{2.14}$$
This implies (1.3) (and is furthermore valid for every starting vertex x).
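The closed-form minimizer (2.13) can be cross-checked against a brute-force search over β ∈ [1/2, 1]. This is only a numerical sanity check under the formulas as written above, with illustrative function names and an arbitrary grid resolution; it is not part of the proof.

```python
import math

def objective(beta: float, d: int, p: float) -> float:
    """f(beta) from (2.12): (p-1)/p * (2 beta - 1) + H_{d-1}(beta || (d-1)/d)."""
    alpha = (d - 1) / d
    h = 0.0
    if beta > 0:
        h += beta * math.log(beta / alpha, d - 1)
    if beta < 1:
        h += (1 - beta) * math.log((1 - beta) / (1 - alpha), d - 1)
    return (p - 1) / p * (2 * beta - 1) + h

def beta_star(d: int, p: float) -> float:
    """Closed-form minimizer (2.13)."""
    return max(1.0 / ((d - 1) ** ((p - 2) / p) + 1.0), 0.5)

def grid_minimizer(d: int, p: float, steps: int = 20000) -> float:
    """Brute-force minimizer of the objective over a grid on [1/2, 1]."""
    grid = [0.5 + 0.5 * i / steps for i in range(steps + 1)]
    return min(grid, key=lambda b: objective(b, d, p))

for d, p in ((3, 1.5), (6, 1.2), (4, 2.0), (5, 3.0)):
    print(d, p, round(beta_star(d, p), 4), round(grid_minimizer(d, p), 4))
```

For p ≥ 2 the minimizer sits at β = 1/2, where the objective equals $\log_{d-1}(1/\rho)$, which is how the factor $\rho^t$ arises in the p ≥ 2 regime.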
For matching upper bounds in case G is a Ramanujan graph, first take
p ≥ 2. The lower bound established above is $D_p(t) \ge c_d\, n^{(p-1)/p} t^{-3/2}\rho^t - 1$.
Recalling Lemma 2.2, for Ramanujan graphs,
$$D_2(t) \le \sqrt{n}\,\rho^t , \qquad D_\infty(t) \le D_2(\lfloor t/2\rfloor)^2 \le n\rho^t ,$$
using the well-known fact (a routine application of Cauchy–Schwarz) that
$$D_\infty(s+t) \le D_2(t)\,D_2^*(s) , \tag{2.15}$$
where $D_2^*(s)$ corresponds to the reversed chain (here $D_2(s) = D_2^*(s)$ as SRW
is reversible). So, by the Riesz–Thorin Interpolation Theorem (see, e.g., [34,
Theorem 1.3, p. 179]), for 2 ≤ p ≤ ∞, we deduce that $D_p(t) \le n^{(p-1)/p}\rho^t$.
Having established (1.4) for p ≥ 2, now take 1 < p ≤ 2. Let
$$P^t(x,\cdot) = \sum_k P_\xi(|X_t| = k)\,\mu_k(x,\cdot) \tag{2.16}$$
where $\mu_k$ is the law of the projection of NBRW on the endpoint of its directed
edge, started at a uniform edge originating from x. By Jensen’s inequality,
$$\big(d(d-1)^{k-1}\big)^{-1/p}\big\|\mu_k(x,\cdot) - \tfrac1n\big\|_p \le \big(d(d-1)^{k-1}\big)^{-1/2}\big\|\mu_k(x,\cdot) - \tfrac1n\big\|_2 .$$
Notice that
$$n\Big\|\mu_k(x,\cdot) - \frac1n\Big\|_2^2 = \Big\|\frac{\mu_k(x,\cdot)}{\pi} - 1\Big\|^2_{L^2(\pi)} \le \max_{y:\,xy\in E(G)} \Big\|\frac{\mu_{k-1}((x,y),\cdot)}{\pi_{\vec E}} - 1\Big\|^2_{L^2(\pi_{\vec E})}$$
where $\mu_k((x,y),\cdot)$, taking a directed edge as its first argument, is the k-step transition kernel of the NBRW and $\pi_{\vec E}$ is its stationary
distribution. In our analysis of the NBRW in §3, we will show (see (3.13))
that the right-hand side of the last display is $O(nk^2(d-1)^{-k})$, whence
$$\big\|\mu_k(x,\cdot) - \tfrac1n\big\|_p \le c_d\, k\,(d-1)^{k(1-p)/p} .$$
Recalling (2.16), it now follows that
$$\big\|P^t(x,\cdot) - \tfrac1n\big\|_p \le (t+1)\max_{0\le k\le t}\big(P_\xi(|X_t| = k)\big)\big\|\mu_k(x,\cdot) - \tfrac1n\big\|_p \le c'_d\, t^2 \max_{0\le k\le t}(d-1)^{k(1-p)/p}\big(P_\xi(|X_t| = k)\big) ,$$
which, in view of (2.10), gives rise to the same optimization problem (2.12).
Therefore, the right-hand side of the last display is at most $t^C(d-1)^{-f(\beta_*)t}$
for C > 0 fixed. Taking t as in (1.4) with a suitable additive O(log log n)
term gives $(d-1)^{f(\beta_*)t} \ge n^{(p-1)/p}\,t^{2C}$. Thus,
$$\|P^t(x,\cdot)/\pi - 1\|_{L^p(\pi)} = n^{(p-1)/p}\big\|P^t(x,\cdot) - \tfrac1n\big\|_p \le t^{-C} ,$$
establishing (1.4) for all 1 < p ≤ 2. □
3. Nonbacktracking walks
3.1. Spectral decomposition. The spectrum of the nonbacktracking walk
has been thoroughly studied, in part due to the fact that its eigenvalues are
precisely the inverse of the poles of the so-called Ihara Zeta function of
the graph (cf. [8, 20]). Our analysis here, on the other hand, hinges on
the structure of the eigenfunctions, starting with a spectral decomposition
of the nonbacktracking operator; this builds on properties of this operator
that appear implicitly in [20] (see also [5,7,8] as well as [26, Exercise 6.59]).
Proposition 3.1 below gives a more complete picture.
Throughout this section, for a graph G = (V,E), we denote its adjacency
matrix by A = A(G) and let λ1 = d ≥ λ2 ≥ . . . ≥ λn be its eigenvalues.
Denote by $\vec{E}$ the set of N = 2|E| directed edges of G; we refer to undirected
edges as xy ∈ E and to directed ones as $(x,y) \in \vec{E}$ for the sake of clarity.
The nonbacktracking walk matrix B is the $(\vec{E}\times\vec{E})$-matrix given by
$$B_{(u,v),(x,y)} = 1_{\{v = x,\ u \ne y\}} \quad\text{for } (u,v) \text{ and } (x,y) \text{ in } \vec{E} . \tag{3.1}$$
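To make definition (3.1) concrete, the following sketch builds B for the complete graph K4 (an illustrative choice: 3-regular, n = 4, N = 12 directed edges) using plain nested lists. The helper names are hypothetical. Every row and column of B sums to d − 1, and comparing $BB^{T}$ with $B^{T}B$ shows directly that B is not normal in this example.

```python
def directed_edges(adj):
    """All ordered pairs (x, y) with xy an edge."""
    return [(x, y) for x in sorted(adj) for y in sorted(adj[x])]

def nonbacktracking_matrix(adj):
    """B[(u,v),(x,y)] = 1 if v == x and u != y, else 0, as in (3.1)."""
    edges = directed_edges(adj)
    B = [[1 if (v == x and u != y) else 0 for (x, y) in edges]
         for (u, v) in edges]
    return edges, B

def transpose(X):
    return [list(col) for col in zip(*X)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

# Complete graph K4: every vertex is adjacent to the other three.
K4 = {v: {u for u in range(4) if u != v} for v in range(4)}
edges, B = nonbacktracking_matrix(K4)
Bt = transpose(B)

print(len(edges))                       # number of directed edges N = dn
print({sum(row) for row in B})          # row sums
print(matmul(B, Bt) == matmul(Bt, B))   # normality check
```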
Though B may not be a normal operator, it can be decomposed as follows.
Proposition 3.1. Let G = (V,E) be a connected d-regular graph (d ≥ 3) on
n vertices. Let N = dn and let $\{\lambda_i\}_{i=1}^n$ be the eigenvalues of the adjacency
matrix, with $\lambda_1 = d$. Then the operator B from (3.1) is unitarily similar to
$$\Lambda = \mathrm{diag}\Big(d-1,\ \begin{pmatrix}\theta_2 & \alpha_2\\ 0 & \theta'_2\end{pmatrix}, \ldots, \begin{pmatrix}\theta_n & \alpha_n\\ 0 & \theta'_n\end{pmatrix},\ \overbrace{-1,\ldots,-1}^{N/2-n},\ \overbrace{1,\ldots,1}^{N/2-n+1}\Big) \tag{3.2}$$
where $|\alpha_i| < 2(d-1)$ for all i and $\theta_i, \theta'_i \in C$ are defined as the solutions to
$$\theta^2 - \lambda_i\theta + d - 1 = 0 . \tag{3.3}$$
Remark 3.2. The exact value of $|\alpha_i|$ is shown in Proposition 4.1 to be d − 2
for every $|\lambda_i| \le 2\sqrt{d-1}$ and $\sqrt{d^2 - \lambda_i^2}$ for every $2\sqrt{d-1} < |\lambda_i| < d$.
Remark 3.3. We see that every eigenvalue θ ≠ ±1 of B is of the form
$\lambda/2 \pm \sqrt{(\lambda/2)^2 - (d-1)}$ for some eigenvalue λ of A (with θ = d−1 matching
the principal eigenvalue λ = d). Indeed, this well-known fact follows from
Bass’s Formula [8], which in the d-regular case is equivalent to the statement
that $f_B(\theta) = (1-\theta^2)^{N/2-n} f_A(\theta + (d-1)/\theta)$ for $f_A$ and $f_B$ the characteristic
polynomials of A and B, respectively.
(i) λ = d corresponds to θ = d−1, the principal eigenvalue of B matching
the eigenvector $w_1 \equiv N^{-1/2}$; the second solution, θ′ = 1, was already
accounted for in (3.2). An eigenvalue of λ = −d (when G is bipartite)
yields θ = −(d−1) and an extra −1 eigenvalue (N/2 − n + 1 altogether).
(ii) $2\sqrt{d-1} < |\lambda| < d$ yields two eigenvalues θ ≠ θ′ ∈ R of B.
(iii) $|\lambda| < 2\sqrt{d-1}$ yields $\theta = \overline{\theta'} \in C \setminus R$ with $|\theta| = \sqrt{d-1}$ (for instance,
λ = 0 corresponds to $\theta = i\sqrt{d-1}$ and $\theta' = -i\sqrt{d-1}$).
(iv) $\lambda = \pm 2\sqrt{d-1}$ gives a single solution $\theta = \pm\sqrt{d-1}$ with multiplicity 2.
Remark 3.4. For each θ ∈ C, define $T_\theta : \ell^2(V) \to \ell^2(\vec E)$ by
$$(T_\theta f)(x,y) := \theta f(y) - f(x) . \tag{3.4}$$
Each solution θ ≠ ±1 of equation (3.3), for some λ such that Af = λf, is
an eigenvalue of B corresponding to the eigenvector $T_\theta f$; indeed,
$$\begin{aligned}(BT_\theta f)(x,y) &= \sum_{z:\,yz\in E,\ z\ne x}(\theta f(z) - f(y)) = \theta[(Af)(y) - f(x)] - (d-1)f(y)\\ &= [\theta\lambda - (d-1)]f(y) - \theta f(x) = \theta(T_\theta f)(x,y) ,\end{aligned}$$
where the last equality used (3.3) to replace θλ by θ² + d − 1; thus, $T_\theta f$ is
an eigenfunction of B corresponding to θ as long as $T_\theta f \ne 0$, and clearly
$T_\theta f = 0$ only if θ = ±1 (which, in turn, occurs iff λ = ±d).
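The computation in Remark 3.4 can be replayed on a tiny example. The sketch below uses K4 (d = 3) as an assumed test graph, with the adjacency eigenfunction f = (1, −1, 0, 0) for λ = −1, and checks that $T_\theta f$ is an eigenvector of B for both roots of (3.3), while a value of θ that is not a root fails. All names are illustrative.

```python
import cmath

# K4 as a test graph: d = 3, adjacency eigenvalue lam = -1 with f = (1, -1, 0, 0).
d = 3
adj = {v: [u for u in range(4) if u != v] for v in range(4)}
lam = -1
f = [1, -1, 0, 0]

def apply_B(w):
    """(Bw)(x, y) = sum of w(y, z) over neighbours z of y with z != x."""
    return {(x, y): sum(w[(y, z)] for z in adj[y] if z != x) for (x, y) in w}

def T(theta, f):
    """(T_theta f)(x, y) = theta * f(y) - f(x) on directed edges, as in (3.4)."""
    return {(x, y): theta * f[y] - f[x] for x in adj for y in adj[x]}

def is_eigenpair(theta, f, tol=1e-12):
    """Check B T_theta f = theta T_theta f entrywise."""
    w = T(theta, f)
    Bw = apply_B(w)
    return all(abs(Bw[e] - theta * w[e]) < tol for e in w)

disc = cmath.sqrt((lam / 2) ** 2 - (d - 1))
theta, theta_prime = lam / 2 + disc, lam / 2 - disc

print(theta, abs(theta))               # modulus should be sqrt(d - 1)
print(is_eigenpair(theta, f), is_eigenpair(theta_prime, f))
print(is_eigenpair(0.5, f))            # 0.5 does not solve (3.3)
```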
Proof of Proposition 3.1. Observe that $\ell^2(\vec E) = \ell^2_+(\vec E) \oplus \ell^2_-(\vec E)$ where
$$\ell^2_+(\vec E) = \{w : w(x,y) = w(y,x)\} , \qquad \ell^2_-(\vec E) = \{w : w(x,y) = -w(y,x)\} ,$$
as the term for (x, y) in $\langle w_+, w_-\rangle$ cancels with that of (y, x) if $w_\pm \in \ell^2_\pm(\vec E)$.
With this in mind, the eigenspaces of 1 and −1 in B are straightforward:
the star spaces $S_- \subset \ell^2_-(\vec E)$ and $S_+ \subset \ell^2_+(\vec E)$ are defined by
$$S_\pm = \mathrm{Span}\big(\{s^\pm_x : x \in V\}\big) , \quad\text{where}\quad s^\pm_x(u,v) = \begin{cases} 1 & u = x ,\\ \pm1 & v = x ,\\ 0 & \text{otherwise} . \end{cases}$$
For every $w \in \ell^2_-(\vec E)$ and $s^-_x$ as above, $\langle w, s^-_x\rangle = 2\sum_{y:\,xy\in E} w(x,y)$, and so
$(Bw)(x,y) = -w(y,x) = w(x,y)$ when in addition $w \perp s^-_y$. Thus,
$$Bw = w \quad\text{for every } w \in \ell^2_-(\vec E) \cap S_-^\perp , \tag{3.5}$$
and similarly,
$$Bw = -w \quad\text{for every } w \in \ell^2_+(\vec E)\cap S_+^\perp . \tag{3.6}$$
As for the dimension of these spaces, note that if $\{a_x\}_{x\in V}$ is such that
$\sum a_x s^-_x = 0$ then $a_x = a_y$ for every xy ∈ E; since G is connected, this implies
that $\dim(S_-) = n - 1$, thus B has an orthonormal system of N/2 − (n − 1)
eigenvectors with eigenvalue 1. Similarly, if $\sum a_x s^+_x = 0$ then $a_x = -a_y$ for
every xy ∈ E, so the eigenspace of −1 has dimension N/2 − (n − 1) if G is
bipartite and dimension N/2 − n otherwise.
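The dimension counts above can be checked exactly on small graphs by computing the nullity of B ∓ I over the rationals. The sketch below does this for K4 (non-bipartite, n = 4, N = 12) and for K_{3,3} (bipartite, n = 6, N = 18); both graphs and all helper names are illustrative choices, and the rank routine is plain Gaussian elimination rather than anything taken from the paper.

```python
from fractions import Fraction

def nonbacktracking_matrix(adj):
    """B from (3.1) with Fraction entries, indexed by sorted directed edges."""
    edges = [(x, y) for x in sorted(adj) for y in sorted(adj[x])]
    return [[Fraction(int(v == x and u != y)) for (x, y) in edges]
            for (u, v) in edges]

def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    M = [row[:] for row in M]
    r = 0
    for c in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                factor = M[i][c] / M[r][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def eigenspace_dim(B, value):
    """Dimension of ker(B - value * I)."""
    N = len(B)
    shifted = [[B[i][j] - (value if i == j else 0) for j in range(N)]
               for i in range(N)]
    return N - rank(shifted)

K4 = {v: {u for u in range(4) if u != v} for v in range(4)}
K33 = {v: ({3, 4, 5} if v < 3 else {0, 1, 2}) for v in range(6)}

for name, graph in (("K4", K4), ("K33", K33)):
    B = nonbacktracking_matrix(graph)
    print(name, len(B), eigenspace_dim(B, 1), eigenspace_dim(B, -1))
```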
Having specified these eigenvectors of B as well as those corresponding to
θi, θ′i in Remark 3.4, we proceed to analyzing their inner products. Observe
that after appropriate permutations of its rows and columns, B becomes
block diagonal with blocks $J_d - I_d$, where $J_d$ and $I_d$ are the all-one matrix and
identity matrix of order d, respectively; thus, B has an inverse, which under
the same permutations is block diagonal with blocks $(d-1)^{-1}J_d - I_d$, so the
matrix $C := (d-1)B^{-1} + B$ (which, of course, satisfies $Cw = ((d-1)/\theta + \theta)w$
for every eigenfunction w of B with eigenvalue θ) is given by
$$C_{(u,v),(x,y)} = \begin{cases} 1 & \{v = x,\ u \ne y\} \text{ or } \{y = u,\ v \ne x\} ,\\ -(d-2) & (u,v) = (y,x) ,\\ 0 & \text{otherwise} . \end{cases}$$
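Thus, C is real symmetric, and $\ell^2_+(\vec E)$ and $\ell^2_-(\vec E)$ are invariant under it.
Furthermore, if $f \in \ell^2(V)$ and $w_f \in \ell^2(\vec E)$ is given by $w_f(x,y) := f(y)$ then
$$\begin{aligned}(Cw_f)(x,y) &= \sum_{z:\,yz\in E,\ z\ne x} f(z) + \sum_{v:\,vx\in E,\ v\ne y} f(x) - (d-2)f(x)\\ &= \sum_{z:\,yz\in E,\ z\ne x} f(z) + f(x) = (Af)(y) ,\end{aligned}$$
and similarly, if $w'_f(x,y) := f(x)$ then $(Cw'_f)(x,y) = (Af)(x)$. Moreover,
$$\langle w_f, w_g\rangle = \langle w'_f, w'_g\rangle = d\,\langle f,g\rangle \quad\text{and}\quad \langle w_f, w'_g\rangle = \langle f, Ag\rangle \quad\text{for } f,g \in \ell^2(V).$$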
In particular, the eigenfunctions $(f_i)_{i=1}^n$ correspond in this way to pairwise
orthogonal eigenspaces of C with eigenvalues $(\lambda_i)_{i=1}^n$; the dimension of each
eigenspace is 1 if $\lambda_i = \pm d$ and 2 otherwise (as before, $w_f$ can be a multiple of
$w'_f$ only if w ≡ c or when G is bipartite and w ≡ c on one part and w ≡ −c
on the other), and they notably include the eigenfunctions $T_{\theta_i} f_i$ of B.
Of course, every such 2-dimensional eigenspace corresponding to $\lambda_i \ne \pm d$
is orthogonal to the eigenvectors of B from (3.5)–(3.6) (corresponding to the
eigenvalues ±1), as those are also eigenvectors of ±d for the self-adjoint C.
Finally, the eigenvector w ≡ 1 with the eigenvalue d−1 of B (and eigenvalue
d of C) is orthogonal to $\ell^2_-(\vec E)$ (thus to all eigenvectors from (3.5)), whereas
if G is bipartite and we take w ≡ 1 on outgoing edges from a prescribed
part of G and w ≡ −1 on the incoming ones (with eigenvalue −(d − 1) of
B) then $w \perp \ell^2_+(\vec E)$, thus it is orthogonal to all eigenvectors from (3.6).
Suppose for now that A has no eigenvalue $\lambda_i$ such that $|\lambda_i| = 2\sqrt{d-1}$.
Then there are two distinct solutions to (3.3) for each of the $\lambda_i$'s, and so, in
particular, the eigenspace of C corresponding to $\lambda_i \neq \pm d$ has two linearly
independent eigenvectors of B, corresponding to eigenvalues $\theta_i$ and $\theta'_i$. The
orthogonality of the eigenspaces from the discussion above now establishes
the form of $\Lambda$ from (3.2).
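When there exist eigenvalues of A such that $|\lambda_i| = 2\sqrt{d-1}$, we have the
unique solution $\theta_i = \lambda_i/2$ for (3.3), and claim that this gives rise to a Jordan
block $\begin{pmatrix} \lambda_i/2 & 1 \\ 0 & \lambda_i/2 \end{pmatrix}$. Indeed, recalling that
$B T_{\theta_i} f_i = \theta_i T_{\theta_i} f_i$, observe that
\begin{align*}
(B T_{1+\theta_i} f_i)(x,y) &= [(1+\theta_i)\lambda_i - (d-1)]\, f_i(y) - (1+\theta_i) f_i(x) \\
&= \theta_i\big[(1+\theta_i) f_i(y) - f_i(x)\big] + \big[\theta_i^2 + \theta_i - (d-1)\big] f_i(y) - f_i(x) \\
&= \theta_i (T_{1+\theta_i} f_i)(x,y) + (T_{\theta_i} f_i)(x,y), \tag{3.7}
\end{align*}
where the second equality used $\theta_i = \lambda_i/2$ and the last one used $\theta_i^2 = d-1$. As
these both belong to the corresponding eigenspace of C, we arrive at (3.2).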
To conclude the proof, it remains to show that $|\alpha_i| < 2(d-1)$ if $\lambda_i \neq \pm d$.
Recall that there exist unit vectors $w_i, w'_i$ such that $Bw'_i = \alpha_i w_i + \theta'_i w'_i$ (these
can be taken as columns 2i and 2i + 1 of U as above). Hence,
\[
(B - \theta'_i I) w'_i = \alpha_i w_i. \tag{3.8}
\]
Let $\|\cdot\|_{2\to 2}$ be the $\ell^2(\vec E) \to \ell^2(\vec E)$ operator norm; we claim $\|B\|_{2\to 2} = d-1$.
Indeed, it is easy to verify that
\[
(BB^*)_{(u,v),(x,y)} =
\begin{cases}
d-1 & x = u \text{ and } y = v, \\
d-2 & x \neq u \text{ and } y = v, \\
0 & \text{otherwise}.
\end{cases}
\]
We see that $BB^*$ has $\|BB^*\|_{\infty\to\infty} = (d-1)^2$ and an eigenvalue $(d-1)^2$
corresponding to the eigenvector $w \equiv 1$; thus, $\|B\|_{2\to 2} = d-1$. By (3.8),
using $|\theta'_i| < d-1$ and $\|w_i\| = \|w'_i\| = 1$, we infer that
$|\alpha_i| = \|(B - \theta'_i I) w'_i\| \le \|B\|_{2\to 2} + |\theta'_i| < 2(d-1)$,
concluding the proof of the proposition. □
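As a sanity check of the formula for $BB^*$, here is a short sketch on the Petersen graph (d = 3), again assuming the convention $B_{(u,v),(x,y)} = 1$ iff $v = x$ and $u \neq y$. The helper names are hypothetical. It compares every entry of $BB^*$ with the case formula and confirms that each row sums to $(d-1)^2$.

```python
def petersen():
    """Adjacency sets of the Petersen graph (3-regular, 10 vertices)."""
    adj = {v: set() for v in range(10)}
    def link(a, b):
        adj[a].add(b)
        adj[b].add(a)
    for i in range(5):
        link(i, (i + 1) % 5)           # outer 5-cycle
        link(i, i + 5)                 # spokes
        link(i + 5, (i + 2) % 5 + 5)   # inner pentagram
    return adj

def check_bb_star(adj):
    """Return (entries_ok, row_sums_ok) for the real matrix B B^T."""
    d = len(next(iter(adj.values())))
    edges = [(x, y) for x in sorted(adj) for y in sorted(adj[x])]
    B = [[int(v == x and u != y) for (x, y) in edges] for (u, v) in edges]
    # (B B^T)[e][e'] counts the directed edges reachable from both e and e'.
    BBt = [[sum(a * b for a, b in zip(row_e, row_f)) for row_f in B] for row_e in B]

    def expected(u, v, x, y):
        if y != v:
            return 0
        return d - 1 if x == u else d - 2

    entries_ok = all(BBt[i][j] == expected(u, v, x, y)
                     for i, (u, v) in enumerate(edges)
                     for j, (x, y) in enumerate(edges))
    row_sums_ok = all(sum(row) == (d - 1) ** 2 for row in BBt)
    return entries_ok, row_sums_ok

print(check_bb_star(petersen()))
```

Since the entries are nonnegative and the rows all have the same sum, the constant vector is an eigenvector with eigenvalue equal to that row sum, which is the step used to get $\|B\|_{2\to 2} = d-1$.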
3.2. Cutoff on non-bipartite Ramanujan graphs. On every d-regular
graph on n vertices, the number of directed edges at distance $\ell$ from a given
$(x,y) \in \vec E$ is at most $(d-1)^\ell$; this readily implies (as stated in [22, Claim 4.8])
that the nonbacktracking random walk satisfies
\[
t_{\mathrm{mix}}(1-\varepsilon) \ge \lceil \log_{d-1}(dn) \rceil - \lceil \log_{d-1}(1/\varepsilon) \rceil
\quad\text{for any } 0 < \varepsilon < 1. \tag{3.9}
\]
Our goal in this section is to show an asymptotically tight upper bound on
$t_{\mathrm{mix}}$ using the spectral decomposition of the nonbacktracking operator B.
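For concreteness, the right-hand side of (3.9) can be evaluated exactly with integer arithmetic. This is a small sketch under the simplifying assumption that $1/\varepsilon$ is an integer (passed as `inv_eps`); both function names are illustrative.

```python
def ceil_log(base, m):
    """Smallest integer k >= 0 with base**k >= m (integers base >= 2, m >= 1)."""
    k, power = 0, 1
    while power < m:
        power *= base
        k += 1
    return k

def nbrw_mixing_lower_bound(d, n, inv_eps):
    """Right-hand side of (3.9) with eps = 1 / inv_eps."""
    return ceil_log(d - 1, d * n) - ceil_log(d - 1, inv_eps)

# d = 3, n = 1000, eps = 1/4: ceil(log2 3000) - ceil(log2 4) = 12 - 2
print(nbrw_mixing_lower_bound(3, 1000, 4))
```

Working with integer powers avoids floating-point rounding in the ceilings when the argument is an exact power of d − 1.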
Theorem 3.5. Let G be a non-bipartite Ramanujan graph on n vertices
with degree d ≥ 3. Let $\mu_t$ be the t-step transition kernel of the NBRW, and
let $\pi$ be the uniform distribution on $\vec E$. Then for some fixed c(d) > 0,
\[
\max_{(x,y)\in\vec E} \Big\| \frac{\mu_t((x,y),\cdot)}{\pi} - 1 \Big\|^2_{L^2(\pi)} \le \frac{c(d)}{\log n}
\quad\text{at}\quad t = \big\lceil \log_{d-1} n + 3\log_{d-1}\log n \big\rceil .
\]
Consequently, on any sequence of such graphs, the NBRW exhibits $L^1$-cutoff
and $L^2$-cutoff both at time $\log_{d-1} n$.
Remark 3.6. The constant c(d) in the above theorem can be taken to be
$8(d-1)\log^{-2}(d-1) + 1$ for any sufficiently large n (cf. (3.13) below).
Proof of Theorem 3.5. Appealing to Proposition 3.1, let U be the unitary
matrix such that $B = U\Lambda U^*$ with $\Lambda$ from (3.2), and write
\[
U = \big( w_1 \mid w_2 \mid w'_2 \mid w_3 \mid w'_3 \mid \ldots \mid w_n \mid w'_n \mid u_1 \mid \ldots \mid u_{N-(2n-1)} \big),
\]
in which $w_1 \equiv N^{-1/2}$. Recalling Remark 3.3, observe that the assumption
that G is non-bipartite Ramanujan implies that for all i = 2, . . . , n, the
solutions $\theta_i, \theta'_i$ to (3.3) satisfy $\theta'_i = \bar\theta_i$ and $|\theta_i| = \sqrt{d-1}$.
Let $(x_0,y_0) \in \vec E$ be some initial edge for the NBRW; by the expansion (2.6)
of the $L^2$-distance, the t-step transition kernel $\mu_t = (d-1)^{-t}B^t$ satisfies
\begin{align*}
\Big\| \frac{\mu_t((x_0,y_0),\cdot)}{\pi} - 1 \Big\|^2_{L^2(\pi)}
&= N \sum_{(x,y)} \big| \mu_t((x_0,y_0),(x,y)) \big|^2 - 1 \\
&= N \big\| \mu_t((x_0,y_0),\cdot) \big\|^2 - 1. \tag{3.10}
\end{align*}
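Using $B = U\Lambda U^*$ with $\Lambda$ from (3.2) and U as specified above we find that
\begin{align*}
B^t\big((x_0,y_0),\cdot\big) = {} & (d-1)^t w_1(x_0,y_0)\, w_1 + \sum_i (\pm 1)^t u_i(x_0,y_0)\, u_i \\
& + \sum_{i=2}^n \Big[ \theta_i^t w_i(x_0,y_0)\, w_i + \big( \bar\theta_i^{\,t} w'_i(x_0,y_0) + \gamma_i(t) w_i(x_0,y_0) \big) w'_i \Big],
\end{align*}
where
\[
\gamma_i(t) := \alpha_i \sum_{j=0}^{t-1} \theta_i^{\,j}\, \bar\theta_i^{\,t-1-j}
\]
with $\alpha_i$ from Proposition 3.1. Note that in particular, as $|\alpha_i| < 2(d-1)$,
\[
|\gamma_i(t)| \le 2(d-1)\, t\, |\theta_i|^{t-1}. \tag{3.11}
\]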
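From the above expansion of $B^t$, since U is unitary and $w_1 \equiv N^{-1/2}$,
\begin{align*}
\big\| \mu_t((x_0,y_0),\cdot) \big\|^2 = {} & \frac1N + \sum_i (d-1)^{-2t} |u_i(x_0,y_0)|^2 \\
& + (d-1)^{-2t} \sum_{i=2}^n \Big( |\theta_i|^{2t} |w_i(x_0,y_0)|^2 + \big| \bar\theta_i^{\,t} w'_i(x_0,y_0) + \gamma_i(t) w_i(x_0,y_0) \big|^2 \Big). \tag{3.12}
\end{align*}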
Now we exploit the fact that G is Ramanujan: since $|\theta_i| = \sqrt{d-1}$ for every
$2 \le i \le n$, the expression in the second line of (3.12) is at most
\[
(d-1)^{-t} \sum_{i=2}^n \Big( |w_i(x_0,y_0)|^2 + 2|w'_i(x_0,y_0)|^2 + \frac{2|\gamma_i(t)|^2}{(d-1)^t} |w_i(x_0,y_0)|^2 \Big),
\]
using the parallelogram law. Since by Parseval's identity,
\[
\sum_i |u_i(x_0,y_0)|^2 + \sum_i \big( |w_i(x_0,y_0)|^2 + |w'_i(x_0,y_0)|^2 \big) = \|\delta_{(x_0,y_0)}\|^2 = 1,
\]
and with (3.10) in mind, we infer that
\[
\Big\| \frac{\mu_t((x_0,y_0),\cdot)}{\pi} - 1 \Big\|^2_{L^2(\pi)} \le 2N(d-1)^{-t} \Big( 1 + \max_i \frac{|\gamma_i(t)|^2}{(d-1)^t} \Big).
\]
Substituting the bound (3.11) on $\gamma_i(t)$, again using that G is Ramanujan,
\[
\Big\| \frac{\mu_t((x_0,y_0),\cdot)}{\pi} - 1 \Big\|^2_{L^2(\pi)} \le 2N(d-1)^{-t} \big( 4(d-1)t^2 + 1 \big). \tag{3.13}
\]
In particular, for $t = \lceil \log_{d-1} n + 3\log_{d-1}\log n \rceil$,
\[
\Big\| \frac{\mu_t((x_0,y_0),\cdot)}{\pi} - 1 \Big\|^2_{L^2(\pi)} \le O(1/\log n) = o(1),
\]
thus concluding the proof of Theorem 3.5. □
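To get a feel for the sizes involved, the right-hand side of (3.13) can be evaluated at the time t from Theorem 3.5. The sketch below assumes N = dn directed edges and natural logarithm for the inner log n; the function name is illustrative. Note that the O(1/log n) decay is asymptotic: for moderate n the bound is still larger than 1.

```python
import math

def l2_upper_bound(d, n):
    """Evaluate 2N (d-1)^{-t} (4 (d-1) t^2 + 1) at
    t = ceil(log_{d-1} n + 3 log_{d-1} log n), with N = d * n (assumed).
    Returns the pair (bound, t)."""
    log_b = lambda x: math.log(x) / math.log(d - 1)
    t = math.ceil(log_b(n) + 3 * log_b(math.log(n)))
    N = d * n
    # Integer numerator and denominator keep the huge powers exact
    # until the final (correctly rounded) division.
    bound = 2 * N * (4 * (d - 1) * t * t + 1) / (d - 1) ** t
    return bound, t

for n in (10**6, 10**30, 10**100):
    bound, t = l2_upper_bound(3, n)
    print(n, t, bound, bound * math.log(n))
```

The last printed column, the bound multiplied by log n, is the quantity that Theorem 3.5 asserts stays bounded by a constant c(d).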
Using the reduction in §2.1 from SRW to NBRW (see (2.3)–(2.4)), one
can deduce Theorem 1 from Theorem 3.5, as the O(log log n) window for
the NBRW is negligible compared with the term $s\sqrt{\log_{d-1} n}$ in (1.1).
Note that for every integer $\ell \ge (2+o(1))\log_{d-1} n$ there is a path of length
exactly $\ell$ between every pair of vertices x, y; this follows using $D_\infty(2t) \le D_2(t) D_2^*(t)$ for
the NBRW (recall (2.15), and that the chain and its reversal are isomorphic).
Proof of Corollary 2. Since $\max_{(x,y)} \|\mu_t((x,y),\cdot) - \pi\|_{\mathrm{tv}} = o(1)$ at time t
as per Theorem 3.5, for every x, all but o(n) directed edges can be reached
by a nonbacktracking path of length t from x. The remark above (3.9)
on the growth of balls in a d-regular graph thus implies the corollary: the
statement on a nonbacktracking cycle follows from applying this argument
once on a directed edge originating from x (and reaching almost every y
within the proper length bound) and once on an arbitrarily chosen other
directed edge ending at x, in the reversed NBRW. �
Proof of Corollary 3. Note that at time R, the $L^2$-distance of the NBRW
from equilibrium is $O(1/\log^{3/2} n)$ by (3.13), and that $k = O(\log n)$ since
$k \le g$. For a uniformly chosen path $(y_i)_{i=1}^k$ in G, each $y_i$ is uniform by
the stationarity of the NBRW. Thus, by a union bound over the vertices $y_i$,
for each i there exists a path of length R from the edge $(x_i, z_i)$ to $(y_i, z'_i)$,
except with probability $O(k/\log^{3/2} n) = o(1)$, where $z_i$ and $z'_i$ are not on
the paths $(x_i)$ and $(y_i)$, respectively. The conclusion now follows since, if
vertex $\ell$ of the path from $x_i$ coincides with vertex $\ell'$ of the path from $x_j$, then
$\ell + \ell' + k > g$ and $(R-\ell) + (R-\ell') + k > g$, so $k > g - R$, a contradiction. □
Remark 3.7. In the setting of Theorem 3.5, if G is in addition transitive
then, by using the exact value $|\alpha_i| = d-2$ from Proposition 4.1 below, the
$L^2$-mixing time of the NBRW can be pinpointed precisely: let
\[
\Upsilon_G(k) := (d-2)^2 (d-1)^{-1} \int U_{k-1}(x)^2 \, d\mu_G
\]
for $\mu_G = \frac1n \sum_i \delta_{\lambda_i/(2\sqrt{d-1})}$ the empirical spectral distribution (ESD) of G
and $U_k(\cos x) = \sin((k+1)x)/\sin x$ the Chebyshev polynomial of the second kind.
Then for any fixed $\varepsilon > 0$,
\[
t^{(L^2)}_{\mathrm{mix}}(\varepsilon) = \Big\lceil \log_{d-1}(n) + \log_{d-1}\big( \Upsilon_G(\log_{d-1} n) + 2 \big) + \log_{d-1}(1/\varepsilon) \Big\rceil. \tag{3.14}
\]
Indeed, from (3.12) we see that for any non-bipartite Ramanujan graph G
(not necessarily transitive), averaging over the initial state $(x_0,y_0)$ gives
\[
\frac1N \sum_{(x_0,y_0)} \big\| \mu_t((x_0,y_0),\cdot) \big\|^2
= \frac1N + \frac{N-2n+1}{N(d-1)^{2t}} + \frac{2n-2}{N(d-1)^t} + \frac{\sum_i |\gamma_i(t)|^2}{N(d-1)^{2t}},
\]
using that $w_i \perp w'_i$ and $\|u_i\| = \|w_i\| = \|w'_i\| = 1$ for all i. Thus, by (3.10),
\[
\frac1N \sum_{(x_0,y_0)} \Big\| \frac{\mu_t((x_0,y_0),\cdot)}{\pi} - 1 \Big\|^2_{L^2(\pi)}
= (1+o(1)) \Big( \frac1n \sum_i \frac{|\gamma_i(t)|^2}{(d-1)^t} + 2 \Big) \frac{n}{(d-1)^t},
\]
provided that $t \to \infty$ with n. Writing $\lambda_i = 2\sqrt{d-1}\cos\varphi_i$ (so $\theta_i = \sqrt{d-1}\,e^{i\varphi_i}$ for
i = 2, . . . , n) and using Proposition 4.1,
\[
|\gamma_i(t)| = (d-2) \left| \frac{\theta_i^t - \bar\theta_i^{\,t}}{\theta_i - \bar\theta_i} \right|
= (d-2)(d-1)^{(t-1)/2} \left| \frac{\sin(t\varphi_i)}{\sin\varphi_i} \right|,
\]
which implies the analogue of (3.14) for the average of the mixing times over
the initial states $(x_0,y_0)$, thus establishing (3.14) for the transitive case.
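The closed form for $|\gamma_i(t)|$ is the usual geometric-sum identity for a pair of complex conjugates. A brief numerical sketch, assuming $\theta = \sqrt{d-1}\,e^{i\varphi}$ and $|\alpha| = d-2$ as in Proposition 4.1 (function names are illustrative):

```python
import cmath
import math

def gamma_direct(d, phi, t):
    """|alpha| * |sum_{j<t} theta^j conj(theta)^(t-1-j)| with |alpha| = d - 2."""
    theta = math.sqrt(d - 1) * cmath.exp(1j * phi)
    total = sum(theta ** j * theta.conjugate() ** (t - 1 - j) for j in range(t))
    return (d - 2) * abs(total)

def gamma_closed_form(d, phi, t):
    """(d-2) (d-1)^((t-1)/2) |sin(t phi) / sin(phi)|."""
    return (d - 2) * (d - 1) ** ((t - 1) / 2) * abs(math.sin(t * phi) / math.sin(phi))

print(gamma_direct(4, 0.7, 10), gamma_closed_form(4, 0.7, 10))
```

The ratio $\sin(t\varphi)/\sin\varphi$ is the value $U_{t-1}(\cos\varphi)$ of the Chebyshev polynomial of the second kind, which is how $\Upsilon_G$ enters (3.14).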
3.3. Extensions. We conclude with corollaries of the proof of Theorem 3.5.
3.3.1. Bipartite Ramanujan graphs. Following is the analog for NBRW in the
bipartite case; its SRW counterpart follows from the cover-tree reduction.
Corollary 3.8. Let $G = (V_0, V_1, E)$ be a bipartite Ramanujan graph on n
vertices with degree d ≥ 3. Let $\mu_t$ be the t-step transition kernel of the
NBRW, and let $\pi_0$ and $\pi_1$ be the uniform distribution on the N/2 directed
edges originating from $V_0$ and $V_1$, respectively. Then for some fixed c(d) > 0,
\[
\max_{\substack{(x_0,y_0)\in\vec E \\ x_0\in V_0}} \Big\| \frac{\mu_t((x_0,y_0),\cdot)}{\pi_{(t \bmod 2)}} - 1 \Big\|^2_{L^2(\pi_{(t \bmod 2)})} \le \frac{c(d)}{\log n}
\quad\text{at time}\quad t = \big\lceil \log_{d-1} n + 3\log_{d-1}\log n \big\rceil.
\]
Consequently, on any sequence of such graphs, the NBRW that is modified
to be lazy in its first step exhibits $L^1$-cutoff and $L^2$-cutoff at time $\log_{d-1} n$.
Proof. Following the arguments used to prove Theorem 3.5, observe that in
computing $\mathbb E\big[\,|\mu_t((x_0,y_0),(x,y))/\pi_{(t \bmod 2)} - 1|^2\,\big]$, the identity (3.10) becomes
valid once we replace N by N/2. The only other modification needed is to
treat $\lambda_n = -d$, which produces the eigenvalue $\theta_n = -(d-1)$. Since all the
coordinates of $w_n$ are $\pm N^{-1/2}$, the contribution of this eigenvalue to the
right-hand side of (3.12) is 1/N, exactly that of the eigenvalue d − 1 of B. The
combined 2/N cancels via the modified identity (3.10), thus (3.13) becomes
\[
\Big\| \frac{\mu_t((x_0,y_0),\cdot)}{\pi_{(t \bmod 2)}} - 1 \Big\|^2_{L^2(\pi_{(t \bmod 2)})} \le N(d-1)^{-t} \big( 4(d-1)t^2 + 1 \big),
\]
which is O(1/log n) at the same value of t. □
Corollary 3.9. Let $G = (V_0, V_1, E)$ be a bipartite Ramanujan graph on n
vertices with degree d ≥ 3. Let $P^t$ be the t-step transition kernel of the SRW,
and let $\pi_0$ and $\pi_1$ be the uniform distribution on $V_0$ and $V_1$, respectively.
Then for every fixed $s \in \mathbb R$ and every initial vertex x, the SRW at time
\[
t = \frac{d}{d-2}\log_{d-1} n + s\sqrt{\log_{d-1} n}
\]
satisfies
\[
\max_{x_0\in V_0} \big\| P^t(x_0,\cdot) - \pi_{(t \bmod 2)} \big\|_{\mathrm{tv}} \to \mathbb P(Z > c_d\, s) \quad\text{as } n \to \infty,
\]
where Z is a standard normal random variable and $c_d = \frac{(d-2)^{3/2}}{2\sqrt{d(d-1)}}$.
Consequently, on any sequence of such graphs, the SRW that is modified to
be lazy in its first step exhibits $L^1$-cutoff and $L^2$-cutoff at time $\frac{d}{d-2}\log_{d-1} n$.
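The cutoff profile in the corollary is explicit enough to tabulate. A minimal sketch, using the identity $\mathbb P(Z > x) = \tfrac12\,\mathrm{erfc}(x/\sqrt2)$ for a standard normal Z; the function names are illustrative and the time is left unrounded.

```python
import math

def cutoff_constant(d):
    """c_d = (d-2)^{3/2} / (2 sqrt(d (d-1)))."""
    return (d - 2) ** 1.5 / (2 * math.sqrt(d * (d - 1)))

def srw_time(d, n, s):
    """t = d/(d-2) * log_{d-1} n + s * sqrt(log_{d-1} n), not rounded."""
    log_n = math.log(n) / math.log(d - 1)
    return d / (d - 2) * log_n + s * math.sqrt(log_n)

def limiting_tv(d, s):
    """Limiting total-variation distance P(Z > c_d s)."""
    return 0.5 * math.erfc(cutoff_constant(d) * s / math.sqrt(2))

for s in (-10, -2, 0, 2, 10):
    print(s, round(srw_time(3, 10**6, s), 2), round(limiting_tv(3, s), 4))
```

The limit decreases from 1 to 0 as s ranges over the real line, which is the cutoff window of order $\sqrt{\log_{d-1} n}$ around $\frac{d}{d-2}\log_{d-1} n$.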
3.3.2. Weakly Ramanujan graphs. It suffices to establish the result for the
NBRW (here we do not specify $D_{\mathrm{tv}}$ for the SRW within the cutoff window,
thus there is no need to control the NBRW within a window of $o(\sqrt{\log n})$),
and Theorem 4 and Corollary 5 will then follow using the above reduction.
Corollary 3.10. Fix d ≥ 3 and let G be a d-regular graph on n vertices
whose nontrivial eigenvalues $\{\lambda_i\}_{i=2}^n$ all satisfy $|\lambda_i| \le (1+\delta_n)2\sqrt{d-1}$ for
some $\delta_n$ going to 0 as $n \to \infty$. Let $\mu_t$ be the t-step transition kernel of the
NBRW, and let $\pi$ be the uniform distribution on $\vec E$. For some fixed c(d) > 0,
\[
\max_{(x,y)\in\vec E} \Big\| \frac{\mu_t((x,y),\cdot)}{\pi} - 1 \Big\|^2_{L^2(\pi)} \le \frac{c(d)}{\log n}
\quad\text{at time}\quad t = \Big\lceil \big(1 + 5\sqrt{\delta_n}\big)\log_{d-1} n + 3\log_{d-1}\log n \Big\rceil.
\]
Consequently, on any sequence of such graphs, the NBRW exhibits $L^1$-cutoff
and $L^2$-cutoff both at time $\log_{d-1} n$.
Proof. The analysis of blocks of $\Lambda$ corresponding to eigenvalues $\lambda_i$ (i ≥ 2)
of A such that $|\lambda_i| \le 2\sqrt{d-1}$ remains valid unchanged, and it remains to
consider the effect of
\[
|\lambda_i| = (1+\varepsilon)\,2\sqrt{d-1} \quad\text{for some } 0 < \varepsilon \le \delta_n. \tag{3.15}
\]
As mentioned in the proof of Theorem 3.5, the fact that G is Ramanujan
is exploited when replacing $|\theta_i|^{2t}$ by $(d-1)^t$ for all i ≥ 2 in the spectral
decomposition (3.12), and once again (just above (3.13)) in the bound (3.11)
on $\gamma_i(t)$. For $\lambda_i$ as in (3.15), the corresponding real eigenvalues $\theta_i, \theta'_i$ of B
are given, as per (3.3), by
\[
\big( 1 + \varepsilon \pm \sqrt{\varepsilon(2+\varepsilon)} \big)\sqrt{d-1}\,;
\]
in particular, labeling them so that $|\theta_i| > |\theta'_i|$, we have $|\theta_i| = (1 + \sqrt{2\varepsilon} + O(\varepsilon))\sqrt{d-1}$
(while at the same time $|\theta'_i| < \sqrt{d-1}$). We account for this modified value
of $|\theta_i|^{2t}$ in the spectral decomposition of $\|\mu_t((x,y),\cdot)\|^2$ via the pre-factor
\[
\big( 1 + \sqrt{2\varepsilon} + O(\varepsilon) \big)^{2t} \le \exp\big[ \big( 2\sqrt{2\delta_n} + O(\delta_n) \big) t \big],
\]
thus replacing the right-hand side of (3.13) by
\[
2N(d-1)^{-t}\, e^{[2\sqrt{2\delta_n} + O(\delta_n)]t}\, \big( 4(d-1)t^2 + 1 \big).
\]
For the designated value of t (in which there is an extra additive term of
$5\sqrt{\delta_n}\log_{d-1} n$ compared to t from Theorem 3.5) and using that $\delta_n \to 0$, we
find that there exists some fixed c(d) > 0 such that $\|\mu_t((x,y),\cdot)/\pi - 1\|^2_{L^2(\pi)}$
is at most
\[
\frac{c(d) + o(1)}{\log n} \exp\Big[ \big( 2\sqrt2 - 5\log(d-1) + o(1) \big)\sqrt{\delta_n}\, t \Big],
\]
which is O(1/log n) since $2\sqrt2 < 5\log(d-1)$ for all d ≥ 3. □
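Two elementary facts used in this proof can be confirmed numerically: the displayed pair of values has sum $\lambda_i$ and product d − 1 (taking $\lambda_i > 0$ for illustration), and the exponent $2\sqrt2 - 5\log(d-1)$ is negative for every d ≥ 3. A minimal sketch with illustrative names:

```python
import math

def perturbed_thetas(d, eps):
    """The two values (1 + eps +/- sqrt(eps (2 + eps))) sqrt(d - 1)."""
    r = math.sqrt(d - 1)
    root = math.sqrt(eps * (2 + eps))
    return (1 + eps + root) * r, (1 + eps - root) * r

def exponent_is_negative(d):
    """True when 2 sqrt(2) < 5 log(d - 1) (natural logarithm)."""
    return 2 * math.sqrt(2) - 5 * math.log(d - 1) < 0

d, eps = 5, 0.01
theta, theta_p = perturbed_thetas(d, eps)
lam = (1 + eps) * 2 * math.sqrt(d - 1)
print(theta + theta_p - lam, theta * theta_p - (d - 1))  # both close to 0
print(all(exponent_is_negative(k) for k in range(3, 100)))
```

For d = 3 the margin is the smallest, since $5\log 2 \approx 3.47$ while $2\sqrt2 \approx 2.83$.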
Remark 3.11. Suppose that, for some $\delta_n = o(1)$ and fixed $\varepsilon' > 0$, the graph
G has $|\lambda| \le 2\sqrt{d-1} + \delta_n$ for all eigenvalues $\lambda$ except for $n^{o(1)}$ exceptional
ones, which instead satisfy $|\lambda| < d - \varepsilon'$. Each eigenvalue of the latter form
corresponds to an additive term of $O(a^{2t})$ in the right-hand side of (3.13), where
0 < a < 1 depends only on d and $\varepsilon'$. For the prescribed t from Corollary 3.10,
this amounts to $O(n^{-\varepsilon''})$ for some fixed $\varepsilon'' > 0$, thus the overall contribution
of these $n^{o(1)}$ exceptional eigenvalues is negligible and the same result holds.
4. Pinpointing the spectral decomposition
The following proposition gives the precise moduli of the off-diagonal
terms in Λ from the spectral decomposition (3.2) in Proposition 3.1.
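Proposition 4.1. In the setting of Proposition 3.1, for all i ≥ 2 we have
$\alpha_i = 0$ if $\lambda_i = -d$ (and i = n), and otherwise
\[
|\alpha_i| =
\begin{cases}
d-2 & \text{if } |\lambda_i| \le 2\sqrt{d-1}, \\
\sqrt{d^2 - \lambda_i^2} & \text{if } |\lambda_i| > 2\sqrt{d-1}.
\end{cases}
\]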
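Proof. Let i ≥ 2, and for simplicity, omit its indices from the corresponding
subscripts; namely, let $\theta, \theta'$ correspond to the eigenvalue $\lambda \neq \pm d$, and let f
be so that $Af = \lambda f$ and $\|f\| = 1$, where A is the adjacency matrix of G.
Case (1): $|\lambda| \neq 2\sqrt{d-1}$. Recalling $T_\theta$ from (3.4), we claim that
\[
\alpha = \frac{\beta(\theta' - \theta)}{\sqrt{1 - |\beta|^2}}
\quad\text{where}\quad
\beta := \frac{\langle T_{\theta'} f, T_\theta f \rangle}{\|T_{\theta'} f\|\, \|T_\theta f\|}. \tag{4.1}
\]
Indeed, taking
\[
w = \frac{T_\theta f}{\|T_\theta f\|}, \qquad
w' = \frac{T_{\theta'} f}{\|T_{\theta'} f\|}, \qquad
w'' = \frac{w' - \beta w}{\|w' - \beta w\|}
\]
for $\beta$ as above gives $Bw = \theta w$, $Bw' = \theta' w'$, and
\[
\|w' - \beta w\|^2 = 1 + |\beta|^2 - \beta\langle w, w' \rangle - \bar\beta\langle w', w \rangle = 1 - |\beta|^2, \tag{4.2}
\]
so $w'' = (1 - |\beta|^2)^{-1/2}(w' - \beta w)$ satisfies
\[
Bw'' = \frac{\theta' w' - \beta\theta w}{\sqrt{1 - |\beta|^2}}
= \theta' w'' + \frac{\beta(\theta' - \theta)}{\sqrt{1 - |\beta|^2}}\, w
= \theta' w'' + \alpha w,
\]
as claimed. To estimate $\alpha$, observe that for every $f \in \ell^2(V)$,
\[
\|T_\theta f\|^2_{\ell^2(\vec E)} = d(|\theta|^2 + 1)\|f\|^2_{\ell^2(V)} - (\theta + \bar\theta)\langle Af, f \rangle_{\ell^2(V)}. \tag{4.3}
\]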
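Case (1.a): below the Ramanujan threshold. When $|\lambda| < 2\sqrt{d-1}$ we
have $\theta' = \bar\theta \in \mathbb C \setminus \mathbb R$. Since $\theta^2 + d - 1 = \lambda\theta$ and $|\theta| = \sqrt{d-1}$,
\begin{align*}
\|T_\theta f\|^2 &= d^2 - (\theta + \bar\theta)\lambda = d^2 - 2(d-1) - (\theta^2 + \bar\theta^2) \\
&= d^2 - 2(d-1)\,[1 + \cos(2\varphi)] = (d-2)^2 + 2(d-1)(1 - \cos(2\varphi)),
\end{align*}
where we let $\theta = \sqrt{d-1}\exp(i\varphi)$. Similarly,
\[
\beta = \frac{d(\bar\theta^2 + 1) - 2\bar\theta\lambda}{(d-2)^2 + 2(d-1)(1 - \cos(2\varphi))}
= \frac{(d-2)(\bar\theta^2 - 1)}{(d-2)^2 + 2(d-1)(1 - \cos(2\varphi))},
\]
and so
\begin{align*}
1 - |\beta|^2 &= 1 - \frac{(d-1)^2 + 1 - 2(d-1)\cos(2\varphi)}{\big[ d - 2 + 2\tfrac{d-1}{d-2}(1 - \cos(2\varphi)) \big]^2} \\
&= \frac{4\big(\tfrac{d-1}{d-2}\big)^2 (1 - \cos(2\varphi))^2 + 2(d-1)(1 - \cos(2\varphi))}{\big[ d - 2 + 2\tfrac{d-1}{d-2}(1 - \cos(2\varphi)) \big]^2}.
\end{align*}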
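Substituting $\cos(2\varphi) = 1 - 2\sin^2\varphi$ we see that
\[
1 - |\beta|^2 = \frac{4(d-1)\big( 1 + 4\tfrac{d-1}{(d-2)^2}\sin^2\varphi \big)\sin^2\varphi}{\big( d - 2 + 4\tfrac{d-1}{d-2}\sin^2\varphi \big)^2}
= \frac{4(d-1)\sin^2\varphi}{(d-2)^2 + 4(d-1)\sin^2\varphi},
\]
and so
\[
\frac{|\beta|^2}{1 - |\beta|^2} = \frac{1}{1 - |\beta|^2} - 1 = \frac{(d-2)^2}{4(d-1)\sin^2\varphi}.
\]
Since $|\theta - \theta'| = 2\sqrt{d-1}\,|\sin\varphi|$, we conclude from (4.1) that $|\alpha| = d-2$.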
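Case (1.b): above the Ramanujan threshold. For $2\sqrt{d-1} < |\lambda| < d$,
we have $\theta \neq \theta' \in \mathbb R$, and assume w.l.o.g. that $\theta > \theta'$. By (4.3) we get
\[
\|T_\theta f\|^2 = d(\theta^2 + 1) - 2\theta\lambda = d(\theta^2 + 1) - 2(\theta^2 + d - 1) = (d-2)(\theta^2 - 1),
\]
and for the same reason, $\|T_{\theta'} f\|^2 = (d-2)({\theta'}^2 - 1)$. Similarly,
\[
\langle T_\theta f, T_{\theta'} f \rangle = d(\theta\theta' + 1) - (\theta + \theta')\lambda = d^2 - \lambda^2,
\]
using that $\theta\theta' = d-1$ whereas $\theta + \theta' = \lambda$ through their definition in (3.3).
Since we also have $\theta^2 + {\theta'}^2 = \lambda^2 - 2(d-1)$, we see that
\[
(\theta^2 - 1)({\theta'}^2 - 1) = (d-1)^2 - \big( \lambda^2 - 2(d-1) \big) + 1 = d^2 - \lambda^2,
\]
and altogether deduce that
\[
\beta = \frac{d^2 - \lambda^2}{\big[ (d-2)(\theta^2 - 1) \big]^{1/2} \big[ (d-2)({\theta'}^2 - 1) \big]^{1/2}} = \frac{\sqrt{d^2 - \lambda^2}}{d-2}.
\]
Therefore,
\[
\frac{\beta^2}{1 - \beta^2} = \frac{1}{1 - \beta^2} - 1 = \frac{d^2 - \lambda^2}{(d-2)^2 - (d^2 - \lambda^2)} = \frac{d^2 - \lambda^2}{\lambda^2 - 4(d-1)}.
\]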
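Recalling the definition (4.1) of $\alpha$, and using that $\theta - \theta' = \sqrt{\lambda^2 - 4(d-1)}$,
we infer that $\alpha^2 = d^2 - \lambda^2$.
Case (2): at the Ramanujan threshold. For $|\lambda| = 2\sqrt{d-1}$ we claim
\[
\alpha = \frac{\|T_\theta f\|}{\|T_{1+\theta} f\| \sqrt{1 - |\beta|^2}}
\quad\text{where}\quad
\beta := \frac{\langle T_{1+\theta} f, T_\theta f \rangle}{\|T_{1+\theta} f\|\, \|T_\theta f\|}. \tag{4.4}
\]
To see this, take
\[
w = \frac{T_\theta f}{\|T_\theta f\|}, \qquad
w' = \frac{T_{1+\theta} f}{\|T_{1+\theta} f\|}, \qquad
w'' = \frac{w' - \beta w}{\|w' - \beta w\|};
\]
since $Bw' = \theta w' + (\|T_\theta f\| / \|T_{1+\theta} f\|)\, w$ by (3.7), while $\|w' - \beta w\|^2 = 1 - |\beta|^2$
(by the same calculation as in (4.2)),
\[
Bw'' = \frac{\theta w' + \|T_\theta f\|\, \|T_{1+\theta} f\|^{-1} w - \beta\theta w}{\sqrt{1 - |\beta|^2}} = \theta w'' + \alpha w,
\]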
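as claimed. To compute $\alpha$, we recall that $\theta = \lambda/2$, and infer from (4.3) that
\begin{align*}
\|T_\theta f\|^2 &= d(\theta^2 + 1) - 2\theta\lambda = d^2 - 2\theta\lambda = (d-2)^2, \tag{4.5} \\
\|T_{1+\theta} f\|^2 &= d\big( (1+\theta)^2 + 1 \big) - 2(1+\theta)\lambda = (d-2)(d + 2\theta - 2) + d,
\end{align*}
as well as that
\[
\langle T_{1+\theta} f, T_\theta f \rangle = d\big( \theta(1+\theta) + 1 \big) - (2\theta + 1)\lambda = (d-2)(d + \theta - 2).
\]
We therefore have
\begin{align*}
1 - \beta^2 &= 1 - \frac{(d + \theta - 2)^2}{(d-2)(d + 2\theta - 2) + d}
= \frac{(d + \theta - 2)(-\theta) + \theta(d-2) + d}{(d-2)(d + 2\theta - 2) + d} \\
&= \frac{d - \theta^2}{(d-2)(d + 2\theta - 2) + d} = \frac{1}{\|T_{1+\theta} f\|^2},
\end{align*}
and so, by (4.4)–(4.5), $\alpha = \|T_\theta f\| = d-2$. □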
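The three cases of Proposition 4.1 can be cross-checked numerically from (4.1) and (4.4). The sketch below assumes $(T_a f)(x,y) = a f(y) - f(x)$ (the form consistent with (4.3)) and a real unit eigenvector f with $Af = \lambda f$, for which $\langle T_a f, T_b f\rangle = d(a\bar b + 1) - (a + \bar b)\lambda$. The function names and the tolerance used to detect the threshold case are illustrative choices.

```python
import cmath
import math

def t_inner(a, b, d, lam):
    """<T_a f, T_b f> for a real unit f with A f = lam f (assumed form of T)."""
    a, b = complex(a), complex(b)
    return d * (a * b.conjugate() + 1) - (a + b.conjugate()) * lam

def alpha_modulus(d, lam, tol=1e-9):
    """|alpha| computed from (4.1), or from (4.4) when |lam| = 2 sqrt(d-1)."""
    norm = lambda a: math.sqrt(t_inner(a, a, d, lam).real)
    disc = lam * lam - 4 * (d - 1)
    if abs(disc) < tol:
        # Case (2): double root theta = lam / 2, generalized eigenvector T_{1+theta} f.
        theta = lam / 2
        beta = t_inner(1 + theta, theta, d, lam) / (norm(1 + theta) * norm(theta))
        return norm(theta) / (norm(1 + theta) * math.sqrt(1 - abs(beta) ** 2))
    # Case (1): two distinct roots of theta^2 - lam * theta + (d - 1) = 0.
    root = cmath.sqrt(disc)
    theta, theta_p = (lam + root) / 2, (lam - root) / 2
    beta = t_inner(theta_p, theta, d, lam) / (norm(theta_p) * norm(theta))
    return abs(beta * (theta_p - theta)) / math.sqrt(1 - abs(beta) ** 2)

d = 4
for lam in (1.0, -2.5, 2 * math.sqrt(d - 1), 3.8):
    print(lam, alpha_modulus(d, lam))
```

For d = 4 the printed values should be close to d − 2 = 2 for the first three choices of $\lambda$ and to $\sqrt{d^2 - \lambda^2}$ for $\lambda = 3.8$.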
Acknowledgements. We thank Shayan Oveis Gharan for suggesting that
we study cutoff on Ramanujan graphs, and Perla Sousi for comments on an
earlier version of this manuscript. The research of E.L. was supported in
part by NSF grant DMS-1513403.
References
[1] D. Aldous. Random walks on finite groups and rapidly mixing Markov chains. In
Seminar on probability, XVII, volume 986 of Lecture Notes in Math., pages 243–297.
Springer, Berlin, 1983.
[2] D. Aldous and P. Diaconis. Shuffling cards and stopping times. Amer. Math. Monthly,
93(5):333–348, 1986.
[3] D. Aldous and J. A. Fill. Reversible Markov chains and random walks on graphs,
2002. Available at http://www.stat.berkeley.edu/~aldous/RWG/book.html.
[4] N. Alon. Eigenvalues and expanders. Combinatorica, 6(2):83–96, 1986.
[5] N. Alon, I. Benjamini, E. Lubetzky, and S. Sodin. Non-backtracking random walks
mix faster. Commun. Contemp. Math., 9(4):585–603, 2007.
[6] N. Alon and V. D. Milman. λ1, isoperimetric inequalities for graphs, and supercon-
centrators. J. Combin. Theory Ser. B, 38(1):73–88, 1985.
[7] O. Angel, J. Friedman, and S. Hoory. The non-backtracking spectrum of the universal
cover of a graph. Trans. Amer. Math. Soc., 367(6):4287–4318, 2015.
[8] H. Bass. The Ihara-Selberg zeta function of a tree lattice. Internat. J. Math., 3(6):717–
797, 1992.
[9] C. Bordenave. A new proof of Friedman’s second eigenvalue Theorem and its exten-
sion to random lifts. 2015. Preprint, available at arXiv:1502.04482.
[10] G.-Y. Chen and L. Saloff-Coste. The cutoff phenomenon for ergodic Markov processes.
Electron. J. Probab., 13:no. 3, 26–78, 2008.
[11] F. R. K. Chung. Diameters and eigenvalues. J. Amer. Math. Soc., 2(2):187–196, 1989.
[12] F. R. K. Chung, V. Faber, and T. A. Manteuffel. An upper bound on the diameter
of a graph from eigenvalues associated with its Laplacian. SIAM J. Discrete Math.,
7(3):443–457, 1994.
[13] G. Davidoff, P. Sarnak, and A. Valette. Elementary number theory, group theory,
and Ramanujan graphs, volume 55 of London Mathematical Society Student Texts.
Cambridge University Press, Cambridge, 2003.
[14] P. Diaconis and M. Shahshahani. Generating a random permutation with random
transpositions. Z. Wahrsch. Verw. Gebiete, 57(2):159–179, 1981.
[15] R. Durrett. Random graph dynamics. Cambridge Series in Statistical and Probabilistic
Mathematics. Cambridge University Press, Cambridge, 2010.
[16] W. Feller. An introduction to probability theory and its applications. Vol. I. Third
edition. John Wiley & Sons, Inc., New York-London-Sydney, 1968.
[17] J. Friedman. A proof of Alon’s second eigenvalue conjecture and related problems.
Mem. Amer. Math. Soc., 195(910):viii+100, 2008.
[18] J. Friedman and D. Kohler. The relativized second eigenvalue conjecture of Alon.
2014. Preprint, available at arXiv:1403.3462.
[19] S. Hoory, N. Linial, and A. Wigderson. Expander graphs and their applications. Bull.
Amer. Math. Soc. (N.S.), 43(4):439–561 (electronic), 2006.
[20] M. Kotani and T. Sunada. Zeta functions of finite graphs. J. Math. Sci. Univ. Tokyo,
7(1):7–25, 2000.
[21] S. P. Lalley. Finite range random walk on free groups and homogeneous trees. Ann.
Probab., 21(4):2087–2130, 1993.
[22] E. Lubetzky and A. Sly. Cutoff phenomena for random walks on random regular
graphs. Duke Math. J., 153(3):475–510, 2010.
[23] E. Lubetzky and A. Sly. Explicit expanders with cutoff phenomena. Electron. J.
Probab., 16:no. 15, 419–435, 2011.
[24] A. Lubotzky. Discrete groups, expanding graphs and invariant measures. Modern
Birkhäuser Classics. Birkhäuser Verlag, Basel, 2010.
[25] A. Lubotzky, R. Phillips, and P. Sarnak. Ramanujan graphs. Combinatorica,
8(3):261–277, 1988.
[26] R. Lyons and Y. Peres. Probability on Trees and Networks. Cambridge University
Press. In preparation. Current version available at http://pages.iu.edu/~rdlyons/.
[27] A. Marcus, D. A. Spielman, and N. Srivastava. Interlacing families I: bipartite Ra-
manujan graphs of all degrees. Ann. of Math., 182(1):307–325, 2015.
[28] G. A. Margulis. Explicit group-theoretic constructions of combinatorial schemes and
their applications in the construction of expanders and concentrators. Problemy
Peredachi Informatsii, 24(1):51–60, 1988.
[29] A. Nilli. On the second eigenvalue of a graph. Discrete Math., 91(2):207–210, 1991.
[30] Y. Peres. American Institute of Mathematics (AIM) research workshop “Sharp
Thresholds for Mixing Times”, Palo Alto, December 2004. Summary available at
http://www.aimath.org/WWN/mixingtimes.
[31] N. T. Sardari. Diameter of Ramanujan graphs and random Cayley graphs with nu-
merics. 2015. Preprint, available at arXiv:1511.09340.
[32] P. Sarnak. Letter to Scott Aaronson and Andrew Pollington on the Solovay–Kitaev
Theorem and Golden Gates (with an appendix on optimal lifting of integral points).
February 2015. Available at http://publications.ias.edu/sarnak/paper/2637.
[33] J.-P. Serre. Répartition asymptotique des valeurs propres de l’opérateur de Hecke Tp.
J. Amer. Math. Soc., 10(1):75–102, 1997.
[34] E. M. Stein and G. Weiss. Introduction to Fourier analysis on Euclidean spaces.
Princeton University Press, Princeton, N.J., 1971.
[35] W. Woess. Denumerable Markov chains. European Mathematical Society (EMS),
Zürich, 2009. Generating functions, boundary theory, random walks on trees.
E. Lubetzky
Courant Institute, New York University, 251 Mercer St., New York, NY 10012.
E-mail address: [email protected]
Y. Peres
Microsoft Research, One Microsoft Way, Redmond, WA 98052.
E-mail address: [email protected]