
  • A Matrix Expander Chernoff Bound

    Ankit Garg, Yin Tat Lee, Zhao Song, Nikhil Srivastava

    UC Berkeley

  • Vanilla Chernoff Bound

    Thm [Hoeffding, Chernoff]. If X_1, …, X_k are independent mean-zero random variables with |X_i| ≤ 1, then

    ℙ[ |(1/k) Σ_i X_i| ≥ ε ] ≤ 2 exp(−k ε²/4)

    Two extensions:
    1. Dependent random variables
    2. Sums of random matrices
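A quick Monte Carlo sanity check (our addition, not from the talk) of the bound for Rademacher variables; the trial count and parameters are arbitrary choices:

```python
import numpy as np

# Compare the empirical tail of a mean of k Rademacher variables
# against the stated bound 2*exp(-k*eps^2/4).
rng = np.random.default_rng(0)
k, eps, trials = 200, 0.3, 20000
sums = rng.choice([-1.0, 1.0], size=(trials, k)).mean(axis=1)
empirical_tail = np.mean(np.abs(sums) >= eps)
chernoff_bound = 2 * np.exp(-k * eps**2 / 4)
print(empirical_tail, chernoff_bound)
assert empirical_tail <= chernoff_bound
```

The true tail here is far below the bound, as expected: Chernoff is loose but dimension-free.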

  • Expander Chernoff Bound [AKS’87, G’94]

    Thm [Gillman’94]: Suppose G = (V, E) is a regular graph with transition matrix P which has second eigenvalue λ. Let f: V → ℝ be a function with |f(v)| ≤ 1 and Σ_v f(v) = 0. Then, if v_1, …, v_k is a stationary random walk:

    ℙ[ |(1/k) Σ_i f(v_i)| ≥ ε ] ≤ 2 exp(−c(1 − λ) k ε²)

    Implies a walk of length k ≈ (1 − λ)^{−1} concentrates around the mean.

  • Intuition for Dependence on Spectral Gap

    [figure: two clusters joined by a bottleneck, with f = +1 on one cluster, f = −1 on the other, and 1 − λ = 1/n²]

    Typical random walk takes Ω(n²) steps to see both ±1.
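A minimal numerical illustration (our addition; it uses a lazy walk on the n-cycle rather than the two-cluster graph pictured, but the 1/n² gap scaling is the same):

```python
import numpy as np

def gap(n):
    """Spectral gap 1 - lambda_2 of the lazy random walk on the n-cycle."""
    P = 0.5 * np.eye(n)                  # stay put with prob. 1/2
    for v in range(n):
        P[v, (v - 1) % n] += 0.25
        P[v, (v + 1) % n] += 0.25
    w = np.sort(np.linalg.eigvalsh(P))
    return 1.0 - w[-2]

# Doubling n should roughly quarter the gap: gap(n) ~ pi^2 / n^2.
r = gap(20) / gap(40)
print(gap(20), gap(40), r)
assert 3.5 < r < 4.5
```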

  • Derandomization Motivation

    Thm [Gillman’94]: Suppose G = (V, E) is a regular graph with transition matrix P which has second eigenvalue λ. Let f: V → ℝ be a function with |f(v)| ≤ 1 and Σ_v f(v) = 0. Then, if v_1, …, v_k is a stationary random walk:

    ℙ[ |(1/k) Σ_i f(v_i)| ≥ ε ] ≤ 2 exp(−c(1 − λ) k ε²)

    To generate k independent samples from [n] we need k log n bits. If G is 3-regular with constant λ, the walk needs only log n + k log(3) bits.

    When k = O(log n) this reduces the randomness quadratically, and since the seed is then only O(log n) bits, one can completely derandomize in polynomial time by enumerating all walks.
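The bit counts above can be made concrete (our addition; the specific n and k below are illustrative choices):

```python
import math

# Independent sampling of k points from [n] costs k*log2(n) bits;
# a walk on a 3-regular expander costs log2(n) + k*log2(3) bits.
n = 2**20                                     # n = 1,048,576 vertices
k = 20                                        # k = O(log n) samples
indep_bits = k * math.log2(n)                 # 20 * 20 = 400 bits
walk_bits = math.log2(n) + k * math.log2(3)   # 20 + 20*log2(3) ~ 51.7 bits
print(indep_bits, walk_bits)
```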

  • Matrix Chernoff Bound

    Thm [Rudelson’97, Ahlswede-Winter’02, Oliveira’08, Tropp’11, …]. If X_1, …, X_k are independent mean-zero random d × d Hermitian matrices with ||X_i|| ≤ 1, then

    ℙ[ ||(1/k) Σ_i X_i|| ≥ ε ] ≤ 2d exp(−k ε²/4)

    Very generic bound (no independence assumptions on the entries). Many applications + martingale extensions (see Tropp).

    The factor d is tight because of the diagonal case.
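A Monte Carlo sketch (our addition) of the diagonal case: for diagonal matrices the spectral norm of the average is the max of d independent scalar sums, so the failure probability grows by a factor of roughly d:

```python
import numpy as np

rng = np.random.default_rng(4)
k, eps, trials = 50, 0.25, 4000

def tail(d):
    # Each diagonal entry is an independent mean of k Rademacher variables;
    # the spectral norm of a diagonal matrix is the max |diagonal entry|.
    sums = rng.choice([-1.0, 1.0], size=(trials, d, k)).mean(axis=2)
    return np.mean(np.abs(sums).max(axis=1) >= eps)

t1, t16 = tail(1), tail(16)
print(t1, t16)
assert t16 > 3 * t1   # tail probability inflates with the dimension
```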

  • Matrix Expander Chernoff Bound?

    Conj [Wigderson-Xiao’05]: Suppose G = (V, E) is a regular graph with transition matrix P which has second eigenvalue λ. Let f: V → ℂ^{d×d} be a function with ||f(v)|| ≤ 1 and Σ_v f(v) = 0. Then, if v_1, …, v_k is a stationary random walk:

    ℙ[ ||(1/k) Σ_i f(v_i)|| ≥ ε ] ≤ 2d exp(−c(1 − λ) k ε²)

    Motivated by the derandomized Alon-Roichman theorem.

  • Main Theorem

    Thm. Suppose G = (V, E) is a regular graph with transition matrix P which has second eigenvalue λ. Let f: V → ℂ^{d×d} be a function with ||f(v)|| ≤ 1 and Σ_v f(v) = 0. Then, if v_1, …, v_k is a stationary random walk:

    ℙ[ ||(1/k) Σ_i f(v_i)|| ≥ ε ] ≤ 2d exp(−c(1 − λ) k ε²)

    Gives a black-box derandomization of any application of matrix Chernoff.

  • 1. Proof of Chernoff: reduction to mgf

    Thm [Hoeffding, Chernoff]. If X_1, …, X_k are independent mean-zero random variables with |X_i| ≤ 1, then ℙ[ |(1/k) Σ_i X_i| ≥ ε ] ≤ 2 exp(−k ε²/4).

    Proof sketch, one step per assumption:

    ℙ[ Σ_i X_i ≥ kε ] ≤ e^{−tkε} 𝔼 exp(t Σ_i X_i)        (Markov)
                      = e^{−tkε} Π_i 𝔼 e^{t X_i}          (Indep.)
                      ≤ e^{−tkε} (1 + t 𝔼X_i + t²)^k      (Bounded: e^y ≤ 1 + y + y² for |y| ≤ 1)
                      ≤ e^{−tkε + kt²}                    (Mean zero: 𝔼X_i = 0)
                      ≤ e^{−kε²/4}                        (choosing t = ε/2)
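The "Bounded" step rests on the scalar inequality e^y ≤ 1 + y + y² for |y| ≤ 1; a numerical check (our addition):

```python
import numpy as np

# e^y <= 1 + y + y^2 on [-1, 1], which gives E[e^{tX}] <= 1 + t*E[X] + t^2
# whenever |X| <= 1 and |t| <= 1.
y = np.linspace(-1.0, 1.0, 2001)
slack = (1 + y + y**2) - np.exp(y)
assert np.all(slack >= -1e-12)
print(float(slack.min()), float(slack.max()))
```

The slack is zero only at y = 0, so the quadratic upper bound is tight exactly where the Taylor expansion is centered.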

  • 2. Proof of Expander Chernoff

    Goal: Show 𝔼 exp(t Σ_i f(v_i)) ≤ exp(c_λ k t²)

    Issue: 𝔼 exp(t Σ_i f(v_i)) ≠ Π_i 𝔼 exp(t f(v_i))

    How to control the mgf without independence?

  • Step 1: Write mgf as quadratic form

    𝔼 e^{t Σ_{i≤k} f(v_i)}
      = Σ_{i_0,…,i_k ∈ V} ℙ(v_0 = i_0) P(i_0, i_1) ⋯ P(i_{k−1}, i_k) exp(t Σ_{1≤j≤k} f(i_j))
      = (1/n) Σ_{i_0,…,i_k ∈ V} P(i_0, i_1) e^{t f(i_1)} ⋯ P(i_{k−1}, i_k) e^{t f(i_k)}
      = ⟨u, (EP)^k u⟩

    where u = (1/√n, …, 1/√n) and E = diag(e^{t f(1)}, …, e^{t f(n)}).

    [figure: a length-3 walk i_0 → i_1 → i_2 → i_3, with edge weights P(i_j, i_{j+1}) and vertex weights e^{t f(i_j)}]
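A brute-force check of Step 1 on a tiny graph (our addition; we take the symmetric 4-cycle, where stationarity makes the identity exact as written):

```python
import itertools
import numpy as np

# mgf of a stationary walk equals <u, (EP)^k u>, with P the transition
# matrix, E = diag(e^{t f(v)}), u = (1,...,1)/sqrt(n).
n, k, t = 4, 3, 0.3
P = np.zeros((n, n))
for v in range(n):                       # 4-cycle: step left/right w.p. 1/2
    P[v, (v - 1) % n] = P[v, (v + 1) % n] = 0.5
f = np.array([1.0, -1.0, 1.0, -1.0])     # |f| <= 1 and sum_v f(v) = 0

# Brute force over all walks i_0 -> i_1 -> ... -> i_k
mgf = 0.0
for path in itertools.product(range(n), repeat=k + 1):
    prob = 1.0 / n
    for a, b in zip(path, path[1:]):
        prob *= P[a, b]
    mgf += prob * np.exp(t * sum(f[v] for v in path[1:]))

# Quadratic form
E = np.diag(np.exp(t * f))
u = np.ones(n) / np.sqrt(n)
quad = u @ np.linalg.matrix_power(E @ P, k) @ u
print(mgf, quad)
assert np.isclose(mgf, quad)
```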

  • Step 2: Bound quadratic form

    Goal: Show ⟨u, (EP)^k u⟩ ≤ exp(c_λ k t²)

    Observe: ||P − J|| ≤ λ, where J = transition matrix of the complete graph with self loops.

    So for small λ, we should have ⟨u, (EP)^k u⟩ ≈ ⟨u, (EJ)^k u⟩.

    Approach 1. Use perturbation theory to show ||EP|| ≤ exp(c_λ t²).

    Approach 2. (Healy’08) Track the projection of the iterates along u.
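The observation ||P − J|| ≤ λ can be verified directly on a small example (our addition; the 5-cycle, where both sides are computable in closed form):

```python
import numpy as np

# ||P - J|| equals the second-largest |eigenvalue| of P for a regular graph.
# J = (1/n) * all-ones = complete graph with self loops.
n = 5
P = np.zeros((n, n))
for v in range(n):
    P[v, (v - 1) % n] = P[v, (v + 1) % n] = 0.5
J = np.full((n, n), 1.0 / n)

lam = sorted(np.abs(np.linalg.eigvalsh(P)))[-2]   # second eigenvalue
assert np.isclose(np.linalg.norm(P - J, 2), lam)
print(lam)   # for the 5-cycle this is |cos(4*pi/5)| = cos(pi/5)
```

This works because P and J commute (J projects onto the top eigenvector of P), so subtracting J simply zeroes out the top eigenvalue.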

  • Simplest case: λ = 0

    [figure: u repeatedly multiplied by EP: u ↦ EPu ↦ (EP)²u ↦ … ↦ (EP)^k u]

  • Observations

    Observe: P shrinks every vector orthogonal to u by a factor λ.

    ⟨u, Eu⟩ = (1/n) Σ_{v∈V} e^{t f(v)} ≤ 1 + t²,

    since the linear term vanishes by the mean-zero condition.
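A randomized check of this observation (our addition; instances generated to satisfy the mean-zero and boundedness hypotheses):

```python
import numpy as np

# For f with sum zero and |f| <= 1, and any |t| <= 1:
#   <u, Eu> = (1/n) * sum_v e^{t f(v)} <= 1 + t^2
rng = np.random.default_rng(5)
for _ in range(200):
    f = rng.uniform(-1, 1, size=8)
    f -= f.mean()                        # enforce sum_v f(v) = 0
    f /= max(1.0, np.abs(f).max())       # keep |f(v)| <= 1
    t = rng.uniform(-1, 1)
    val = np.mean(np.exp(t * f))
    assert val <= 1 + t**2 + 1e-12
print("holds on 200 random instances")
```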

  • A small dynamical system

    Track v = (EP)^j u through two scalars:

    m_∥ = ⟨v, u⟩  (mass along u),   m_⊥ = ||Q_⊥ v||  (mass orthogonal to u)

    One step v ↦ EPv updates them, schematically:

    m_∥ ← e^{t²} m_∥ + λt m_⊥
    m_⊥ ← t m_∥ + λe^t m_⊥

    For λ = 0 the m_⊥ contributions vanish; for t ≤ log(1/λ) the coefficient λe^t is at most 1.

    Any mass that leaves gets shrunk by λ.

  • Analyzing dynamical system gives

    Thm [Gillman’94]: Suppose G = (V, E) is a regular graph with transition matrix P which has second eigenvalue λ. Let f: V → ℝ be a function with |f(v)| ≤ 1 and Σ_v f(v) = 0. Then, if v_1, …, v_k is a stationary random walk:

    ℙ[ |(1/k) Σ_i f(v_i)| ≥ ε ] ≤ 2 exp(−c(1 − λ) k ε²)

  • Generalization to Matrices?

    Setup: f: V → ℂ^{d×d}, random walk v_1, …, v_k.

    Goal: 𝔼 Tr exp(t Σ_i f(v_i)) ≤ d · exp(c k t²), where e^A is defined as a power series.

    Main Issue: exp(A + B) ≠ exp(A) exp(B) unless [A, B] = 0, so we can’t express exp(sum) as an iterated product.

  • The Golden-Thompson Inequality

    Partial Workaround [Golden-Thompson’65]:

    Tr exp(A + B) ≤ Tr exp(A) exp(B)

    Sufficient for the independent case by induction. For the expander case, we need this for k matrices. False!

    There exist A, B, C with Tr e^{A+B+C} > 0 > Tr(e^A e^B e^C).

  • Key Ingredient

    [Sutter-Berta-Tomamichel’16] If A_1, …, A_k are Hermitian, then

    log Tr e^{A_1 + … + A_k} ≤ ∫ dβ(b) log Tr[ ( e^{A_1(1+ib)/2} ⋯ e^{A_k(1+ib)/2} ) ( e^{A_1(1+ib)/2} ⋯ e^{A_k(1+ib)/2} )* ]

    where β(b) is an explicit probability density on ℝ.

    1. The matrix on the RHS is always PSD.
    2. Average-case inequality: the e^{A_i/2} are conjugated by unitaries.
    3. Implies Lieb’s concavity, the triple-matrix inequality, Araki-Lieb-Thirring (ALT), and more.

  • Proof of SBT: Lie-Trotter Formula

    e^{A+B+C} = lim_{θ→0⁺} ( e^{θA} e^{θB} e^{θC} )^{1/θ}

    log Tr e^{A+B+C} = lim_{θ→0⁺} (2/θ) log ||G(θ)||_{S_{2/θ}}

    for G(z) := e^{zA/2} e^{zB/2} e^{zC/2}, with ||·||_{S_p} the Schatten p-norm.
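The Lie-Trotter limit can be seen numerically (our addition; θ = 1/m with integer m, so the 1/θ power is an ordinary matrix power):

```python
import numpy as np

def expm_h(A):
    """Matrix exponential of a symmetric matrix via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * np.exp(w)) @ V.conj().T

# (e^{A/m} e^{B/m} e^{C/m})^m  ->  e^{A+B+C}  as m -> infinity.
rng = np.random.default_rng(2)
mats = []
for _ in range(3):
    X = rng.standard_normal((3, 3))
    mats.append((X + X.T) / 2)
A, B, C = mats
target = expm_h(A + B + C)

def trotter(m):
    step = expm_h(A / m) @ expm_h(B / m) @ expm_h(C / m)
    return np.linalg.matrix_power(step, m)

err = [np.linalg.norm(trotter(m) - target) for m in (10, 100, 1000)]
print(err)
assert err[0] > err[1] > err[2]   # first-order error, O(1/m)
```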

  • Complex Interpolation (Stein-Hirschman)

    For each θ, find an analytic F(z) on the strip 0 ≤ Re z ≤ 1 such that:

    |F(θ)| = ||G(θ)||_{S_{2/θ}}
    |F(1 + it)| ≤ ||G(1 + it)||_{S_2}
    |F(it)| = 1

    Hirschman’s interpolation inequality (with Poisson-kernel weights on the two boundary lines) gives

    log |F(θ)| ≤ ∫ log |F(it)| + ∫ log |F(1 + it)|

    Take the limit θ → 0.

  • Key Ingredient

    Issue. SBT involves integration over an unbounded region (b ranges over all of ℝ), which is bad for Taylor expansion.

  • Bounded Modification of SBT

    Solution. Prove a bounded version of SBT by replacing the strip with a half-disk.

    [Thm] If A_1, …, A_k are Hermitian, then

    log Tr e^{A_1 + … + A_k} ≤ ∫ dβ(b) log Tr[ ( e^{A_1 e^{ib}/2} ⋯ e^{A_k e^{ib}/2} ) ( e^{A_1 e^{ib}/2} ⋯ e^{A_k e^{ib}/2} )* ]

    where β(b) is an explicit probability density on [−π/2, π/2].

    Proof. Analytic F(z) + Poisson kernel + Riemann map (half-disk instead of strip).

  • Handling Two-sided Products

    Issue. Two-sided rather than one-sided products:

    Tr[ ( e^{tf(v_1)e^{ib}/2} ⋯ e^{tf(v_k)e^{ib}/2} ) ( e^{tf(v_1)e^{ib}/2} ⋯ e^{tf(v_k)e^{ib}/2} )* ]

    Solution. Encode as a one-sided product using the identity vec(AXB) = (A ⊗ Bᵀ) vec(X):

    ⟨ ( e^{tf(v_1)e^{ib}/2} ⊗ e^{tf(v_1)^{*T}e^{−ib}/2} ) ⋯ ( e^{tf(v_k)e^{ib}/2} ⊗ e^{tf(v_k)^{*T}e^{−ib}/2} ) vec(Id), vec(Id) ⟩
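The vectorization identities used here can be checked directly (our addition; note the A ⊗ Bᵀ form corresponds to the row-major vec convention, which is what numpy's `.ravel()` uses):

```python
import numpy as np

# vec(A X B) = (A kron B^T) vec(X)  in the row-major convention,
# and Tr(M M*) = ||vec(M)||^2.
rng = np.random.default_rng(3)
A, X, B = (rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
           for _ in range(3))

lhs = (A @ X @ B).ravel()
rhs = np.kron(A, B.T) @ X.ravel()
assert np.allclose(lhs, rhs)

M = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
assert np.isclose(np.trace(M @ M.conj().T).real,
                  np.linalg.norm(M.ravel())**2)
print("vec identities verified")
```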

  • Finishing the Proof

    Carry out a version of Healy’s argument with P ⊗ I_{d²} in place of P, with

    E = diag_{v∈V} ( e^{tf(v)e^{ib}/2} ⊗ e^{tf(v)^{*T}e^{−ib}/2} )

    and vec(Id) ⊗ u instead of u.

    This leads to the additional d factor, since ||vec(Id)||² = d.

    [figure: the same length-3 walk as in Step 1, now with matrix-valued vertex weights]

  • Main Theorem

    Thm. Suppose G = (V, E) is a regular graph with transition matrix P which has second eigenvalue λ. Let f: V → ℂ^{d×d} be a function with ||f(v)|| ≤ 1 and Σ_v f(v) = 0. Then, if v_1, …, v_k is a stationary random walk:

    ℙ[ ||(1/k) Σ_i f(v_i)|| ≥ ε ] ≤ 2d exp(−c(1 − λ) k ε²)

  • Open Questions

    Other matrix concentration inequalities (multiplicative, low-rank, moments)

    Other Banach spaces (Schatten norms)

    More applications of complex interpolation