
Duality for optimal couplings in free probability

David Jekel (joint work with Wilfrid Gangbo, Kyeongsik Nam, and Dimitri Shlyakhtenko)

University of California, San Diego

COSy, May 31, 2021

David Jekel (UCSD) Optimal couplings 2021-05-31 1 / 25

Acknowledgements

Land Acknowledgement: UCSD stands on the land of the Kumeyaay people.

Funding Acknowledgement: D.J. was supported by the NSF postdoc grant DMS-2002826. W.G. was supported by DMS-1700202 and U.S. Air Force grant FA9550-18-1-0502. D.S. was supported by NSF grant DMS-1762360.

Inspiration acknowledgement: We thank Alice Guionnet, Yoann Dabrowski, Wuchen Li, Ben Hayes, and Srivatsav Kunnawalkam Elayavalli for useful conversations.

Thanks to the conference organizers!


Non-commutative probability spaces

A W∗-algebra is a unital C∗-algebra A that is also a dual space (Sakai showed that this is equivalent to being a von Neumann algebra).

A tracial W∗-algebra is a pair (A, τ) where A is a W∗-algebra and τ : A → C is a linear functional that is

Unital: τ(1) = 1,

Positive: τ(a∗a) ≥ 0,

Tracial: τ(ab) = τ(ba),

Faithful: τ(a∗a) = 0 =⇒ a = 0.

It is a well-known theorem that every commutative tracial W∗-algebra is isomorphic to L^∞(Ω,P) for some probability space (Ω,P), with the trace being given by τ(f) = ∫ f dP.

Hence, a tracial W∗-algebra may be viewed as a non-commutative probability space.


Non-commutative probability spaces

classical                               non-commutative

L^∞(Ω,P)                                A
expectation E                           trace τ
bounded random variable Z               Z ∈ A
bounded real random variable            Z ∈ A_sa (self-adjoint)
bounded R^m-valued random variable      X = (X_1, ..., X_m) ∈ A^m_sa
unbounded random variables              operators affiliated to A
L^2(Ω,P)                                L^2(A, τ)

Here L^2(A, τ) is the Hilbert space from the GNS construction, and this can naturally be identified with the space of square-integrable affiliated operators.


Non-commutative laws

In probability theory, if (X_1, ..., X_m) are bounded real random variables, then their law is a probability measure on [−R,R]^m for some R, that is, a trace on C([−R,R])^{⊗m}.

A non-commutative law is a trace on the C∗-universal free product C([−R,R])^{∗m} (for some R).

Σ_{m,R} denotes this trace space with the weak-∗ topology.

More combinatorially, an element of Σ_{m,R} is a unital, positive, tracial map µ : C⟨X_1, ..., X_m⟩ → C satisfying

|µ(X_{i_1} ... X_{i_n})| ≤ R^n.

This encodes the non-commutative moments of some tuple of non-commutative random variables.


Non-commutative laws

For every X ∈ A^m_sa, the non-commutative law of X, denoted λ_X, is given by λ_X(p) = τ(p(X)) for every non-commutative polynomial p.

In fact, tracial non-commutative laws ↔ tracial W∗-algebras with a specified generating m-tuple up to isomorphism.

→ GNS construction.

← evaluate moments of your generators.
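As a concrete illustration of "evaluate moments of your generators", here is a minimal sketch in plain Python that computes moments of a tuple of matrices under the normalized trace on M_2(C). The choice of two Pauli matrices as the tuple, and the helper names `matmul`, `normalized_trace`, and `moment`, are assumptions made for this example and do not come from the talk.

```python
from functools import reduce


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def normalized_trace(a):
    """tau(a) = (1/n) * Tr(a), so that tau(1) = 1."""
    n = len(a)
    return sum(a[i][i] for i in range(n)) / n


def moment(xs, word):
    """Law of xs applied to the monomial X_{i_1} ... X_{i_k}; word holds 0-based indices."""
    n = len(xs[0])
    identity = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return normalized_trace(reduce(matmul, (xs[i] for i in word), identity))


# Illustrative self-adjoint pair in M_2(C): two Pauli matrices (operator norm 1, so R = 1).
sigma_x = [[0, 1], [1, 0]]
sigma_y = [[0, -1j], [1j, 0]]
X = (sigma_x, sigma_y)

m_xxyy = moment(X, (0, 0, 1, 1))  # tau(X1 X1 X2 X2)
m_xyxy = moment(X, (0, 1, 0, 1))  # tau(X1 X2 X1 X2)
```

For this pair the two fourth moments are 1 and −1 respectively, so the law records the order of the variables in a way that a classical joint distribution cannot, while both values respect the bound R^n with R = 1.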


Classical optimal couplings

Given µ, ν ∈ P([−R,R]^m):

A coupling of µ and ν is a probability space (Ω,P) together with random variables X and Y such that X ∼ µ and Y ∼ ν.

Couplings can alternatively be defined by looking at the joint law π of (X,Y) on [−R,R]^{2m}, which is a probability distribution with marginals µ and ν.

The cost of the coupling is ‖X − Y‖_{L^2(Ω,P)}.

A coupling is said to be optimal if it achieves the minimal cost.

The minimal value of the cost is called the Wasserstein distance between µ and ν and is denoted by d_W(µ, ν).

d_W defines a metric on P([−R,R]^m), and it metrizes the weak-∗ topology on P([−R,R]^m).
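For intuition, the definitions above can be tried out on a toy case. The sketch below assumes that µ and ν are uniform measures on the same number n of atoms; in that situation an optimal coupling can be taken to be a matching of atoms, so a brute-force search over permutations finds it. The function names and the sample points are hypothetical, and the search is only practical for very small n.

```python
from itertools import permutations
from math import sqrt


def coupling_cost(xs, ys, perm):
    """L^2 cost of the coupling that sends atom xs[i] to atom ys[perm[i]]."""
    n = len(xs)
    total = sum(
        sum((a - b) ** 2 for a, b in zip(xs[i], ys[perm[i]]))
        for i in range(n)
    )
    return sqrt(total / n)


def brute_force_wasserstein(xs, ys):
    """Minimize the cost over all matchings of the atoms (n! candidates)."""
    n = len(xs)
    best = min(permutations(range(n)), key=lambda p: coupling_cost(xs, ys, p))
    return coupling_cost(xs, ys, best), best


# Hypothetical atoms in R^1: each x should be matched to the y just to its right.
xs = [(0.0,), (1.0,), (2.0,)]
ys = [(2.5,), (0.5,), (1.5,)]
distance, matching = brute_force_wasserstein(xs, ys)
```

Here `matching` pairs the atoms in increasing order, which is the expected behavior on the real line, and `distance` is the resulting value of ‖X − Y‖ in L^2.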


Classical optimal couplings — intuition

Couplings provide a mathematical description of a transportation problem. Imagine that µ represents the distribution of mass in some pile of dirt. We want to move the dirt and rearrange it into the shape of ν. The measure π describes a plan to transport the dirt. The "mass" at (x, y) under π represents the "mass" at the point x for µ that will be moved to the point y for ν.

Hence, an optimal coupling represents a way to transport the dirt with the least possible work. It is a classical example of an optimization problem.


Non-commutative optimal couplings (Biane-Voiculescu)

Given µ, ν ∈ Σ_{m,R}:

A coupling of µ and ν is a tracial W∗-algebra (A, τ) together with random variables X, Y ∈ A^m_sa with λ_X = µ and λ_Y = ν.

We can also consider the joint law π = λ_{(X,Y)} ∈ Σ_{2m,R}. By the GNS construction, any π ∈ Σ_{2m,R} with marginals µ and ν produces a coupling in the above sense.

The cost of the coupling is ‖X − Y‖_{L^2(A,τ)^m}.

A coupling is said to be optimal if it achieves the minimal cost. Optimal couplings exist by compactness of Σ_{2m,R}.

The minimal value of the cost is called the (non-commutative) Wasserstein distance of µ and ν and is denoted by d_W(µ, ν).

d_W defines a metric on Σ_{m,R}. But does it generate the weak-∗ topology?


Non-separability of the NC Wasserstein space

Proposition

For m > 1, while Σ_{m,R} is compact in the weak-∗ topology, it is not separable with respect to the Wasserstein distance.

This relies on the following result:

Theorem (Gromov, Olshanskii, Ozawa)

There exists a group G with property (T) and an uncountable family (G_α)_{α∈I} of non-isomorphic quotients of G.

Let G be as above and let g_1, ..., g_m be the generators. Let q_α : G → G_α be the quotient map. Let X_α ∈ L(G_α)^{2m}_sa be given by the real and imaginary parts of q_α(g_1), ..., q_α(g_m). Because of property (T), if λ_{X_α} and λ_{X_β} were coupled sufficiently close together, then we would have

sup_{g∈G} |τ(q_α(g)) − τ(q_β(g))| < 1.

However, since τ(q_α(g)) = δ_{q_α(g)=e}, this would imply that G_α = G_β. Hence, the laws (λ_{X_α})_{α∈I} are ε-separated in Wasserstein distance for some ε.
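The last step of this argument can be spelled out as follows. This is a short sketch that uses only the indicator formula for the trace and the bound on the supremum stated above.

```latex
% Each trace value is an indicator, so it lies in \{0, 1\}:
\tau(q_\alpha(g)) =
  \begin{cases} 1, & g \in \ker q_\alpha, \\ 0, & g \notin \ker q_\alpha. \end{cases}
% Hence a difference of two such values has absolute value 0 or 1, and
\sup_{g \in G} |\tau(q_\alpha(g)) - \tau(q_\beta(g))| < 1
\;\Longrightarrow\;
\tau(q_\alpha(g)) = \tau(q_\beta(g)) \text{ for all } g \in G
\;\Longrightarrow\;
\ker q_\alpha = \ker q_\beta ,
% so G_\alpha and G_\beta are the same quotient of G.
```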


Classical Monge-Kantorovich duality

Recall that if f : R^m → (−∞,∞] is convex, then y is said to be a subgradient of f at x if

f(x′) − f(x) ≥ ⟨x′ − x, y⟩ for all x′ ∈ R^m.

We denote the set of subgradients by ∂f(x).

Theorem (Classical)

Let (Ω,P,X ,Y ) be a coupling of µ and ν. The following are equivalent:

(1) The coupling is optimal.

(2) There exists a convex function f : R^m → (−∞,∞] such that Y is almost surely in ∂f(X).

(3) There exist convex functions f, g : R^m → (−∞,∞] such that f(x) + g(y) ≥ ⟨x, y⟩ everywhere and E f(X) + E g(Y) = ⟨X, Y⟩_{L^2(Ω,P)}.
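One way to see how (2) leads to (3) is through the Legendre transform. The following is a sketch under extra assumptions not stated on the slide: f is not identically +∞, and f(X) and f^*(Y) are integrable. The symbol f^* is notation introduced here.

```latex
% Legendre transform of f (notation introduced for this sketch):
f^*(y) = \sup_{x \in \mathbb{R}^m} \bigl( \langle x, y \rangle - f(x) \bigr).
% Fenchel--Young inequality, valid for all x and y:
f(x) + f^*(y) \geq \langle x, y \rangle,
% with equality, at points x where f(x) < \infty, exactly on the subgradient relation:
f(x) + f^*(y) = \langle x, y \rangle \iff y \in \partial f(x).
% If Y \in \partial f(X) almost surely, take g = f^* and integrate:
\mathbb{E} f(X) + \mathbb{E} g(Y) = \mathbb{E} \langle X, Y \rangle
  = \langle X, Y \rangle_{L^2(\Omega,P)}.
```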


Classical Monge-Kantorovich duality

Let C(µ, ν) be the supremum of ⟨X, Y⟩_{L^2(Ω,P)} over couplings (Ω,P,X,Y). Since ‖X‖_{L^2(Ω,P)} and ‖Y‖_{L^2(Ω,P)} are determined by µ and ν, minimizing ‖X − Y‖_{L^2(Ω,P)} is equivalent to maximizing ⟨X, Y⟩_{L^2(Ω,P)}.
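The equivalence between minimizing the distance and maximizing the inner product comes from expanding the square. In the sketch below, |x| denotes the Euclidean norm on R^m.

```latex
\|X - Y\|_{L^2(\Omega,P)}^2
  = \|X\|_{L^2(\Omega,P)}^2 + \|Y\|_{L^2(\Omega,P)}^2
    - 2 \langle X, Y \rangle_{L^2(\Omega,P)}
  = \int |x|^2 \, d\mu(x) + \int |y|^2 \, d\nu(y)
    - 2 \langle X, Y \rangle_{L^2(\Omega,P)},
% so taking the infimum over couplings gives
d_W(\mu,\nu)^2 = \int |x|^2 \, d\mu(x) + \int |y|^2 \, d\nu(y) - 2\, C(\mu,\nu).
```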

The supremum C(µ, ν) can be expressed as an infimum over pairs of convex functions f and g as in the previous theorem.

Theorem (Classical)

Let µ, ν ∈ P([−R,R]^m). Then C(µ, ν) is the infimum of

∫ f dµ + ∫ g dν

over pairs (f, g) of convex functions f, g : R^m → (−∞,∞] satisfying f(x) + g(y) ≥ ⟨x, y⟩.

In other words, minimizing ∫ f dµ + ∫ g dν over such pairs (f, g) is the dual problem (in the sense of linear programming) to optimal coupling.
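The easy half of this duality, that every admissible pair gives an upper bound for every coupling, follows by integrating the pointwise inequality. A brief sketch:

```latex
% For any coupling (\Omega, P, X, Y) of \mu and \nu, and any pair (f, g)
% with f(x) + g(y) \geq \langle x, y \rangle:
\langle X, Y \rangle_{L^2(\Omega,P)}
  = \mathbb{E} \langle X, Y \rangle
  \leq \mathbb{E} f(X) + \mathbb{E} g(Y)
  = \int f \, d\mu + \int g \, d\nu .
% Taking the supremum on the left and the infimum on the right:
C(\mu,\nu) \leq \inf_{(f,g)} \Bigl( \int f \, d\mu + \int g \, d\nu \Bigr).
```

The content of the theorem is that this inequality is in fact an equality.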


Non-commutative Monge-Kantorovich duality

In adapting this result to the non-commutative setting, the challenge was to find an appropriate analog of convex functions. Although tracial W∗-algebras are an analog of L^∞(Ω,P), the non-commutative probability spaces have no points!

Hence, we consider functions that can be evaluated on random variables rather than points. As motivation, if f : R^m → (−∞,∞] is convex and (Ω,P) is a probability space with σ-algebra F, then we can define a function on random variables, which we also denote by f:

f : L^2(Ω,P;R^m) → (−∞,∞], f(X) = E f(X).

Then f is convex. Moreover, if G is a σ-subalgebra of F, then by Jensen's inequality,

f(E[X|G]) ≤ f(X).

Moreover, f, g : R^m → (−∞,∞] satisfy f(x) + g(y) ≥ ⟨x, y⟩ if and only if f(X) + g(Y) ≥ ⟨X, Y⟩_{L^2(Ω,P;R^m)} for all X, Y ∈ L^2(Ω,P;R^m).
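The "if and only if" in the last statement can be checked directly. In the sketch below the function on random variables is written out as an expectation to keep it apart from the pointwise function.

```latex
% (=>) Apply the pointwise inequality at (X(\omega), Y(\omega)) and take expectations:
\mathbb{E} f(X) + \mathbb{E} g(Y) \geq \mathbb{E} \langle X, Y \rangle
  = \langle X, Y \rangle_{L^2(\Omega,P;\mathbb{R}^m)}.
% (<=) Test the inequality on constant random variables X \equiv x, Y \equiv y:
\mathbb{E} f(X) = f(x), \quad \mathbb{E} g(Y) = g(y), \quad
\langle X, Y \rangle_{L^2(\Omega,P;\mathbb{R}^m)} = \langle x, y \rangle .
```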



An additional complication of the non-commutative setting is that we have to consider multiple non-commutative probability spaces at the same time.

In the classical setting, every probability space can be modeled in [0, 1] with Lebesgue measure. By contrast, Ozawa showed based on Gromov and Olshanskii's work that there is no separable tracial W∗-algebra that contains a copy of all others.

Hence, to study optimal couplings, we need to study functions f that are defined for all tracial W∗-algebras, not only for one particular tracial W∗-algebra.


E -convex functions

To save space, we will use A to denote a pair (A, τ) of a von Neumann algebra with a faithful normal tracial state.

Definition

A tracial W∗-function is a collection of functions f^A : L^2(A)^m_sa → [−∞,∞], one for each tracial W∗-algebra A, such that if ι : A → B is a trace-preserving unital ∗-homomorphism, then f^A = f^B ∘ ι.

Definition

A tracial W∗-function f = (f^A)_A with values in (−∞,∞] is said to be E-convex if

(1) Each f^A is convex and lower semi-continuous on L^2(A)^m_sa.

(2) If ι : A → B is a trace-preserving inclusion and E = ι∗ : B → A is the corresponding trace-preserving conditional expectation, then f^A(E[X]) ≤ f^B(X).



If f is a tracial W∗-function, then f^A(X) only depends on λ_X because f respects inclusions. Hence, µ(f) is well-defined for µ ∈ Σ_{m,R}.

We call a pair (f, g) of tracial W∗-functions admissible if

f^A(X) + g^A(Y) ≥ ⟨X, Y⟩_{L^2(A)^m_sa}

for all A and all X, Y ∈ L^2(A)^m_sa.
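A simple example may help fix these definitions. The quadratic function below is an illustration chosen here, not one taken from the talk; the sketch checks that it is an E-convex tracial W∗-function and that the pair (f, f) is admissible.

```latex
% Candidate function (an illustration, not from the slides):
f^{\mathcal{A}}(X) = \tfrac{1}{2} \|X\|_{L^2(\mathcal{A})^m_{sa}}^2 .
% It respects trace-preserving *-homomorphisms \iota : \mathcal{A} \to \mathcal{B},
% since these are isometric in L^2:
f^{\mathcal{B}}(\iota(X)) = \tfrac{1}{2} \|\iota(X)\|^2 = \tfrac{1}{2} \|X\|^2
  = f^{\mathcal{A}}(X).
% It is convex and continuous, and E is an orthogonal projection in L^2, so for X in \mathcal{B}:
f^{\mathcal{A}}(E[X]) = \tfrac{1}{2} \|E[X]\|^2 \leq \tfrac{1}{2} \|X\|^2
  = f^{\mathcal{B}}(X).
% The pair (f, f) is admissible because
f^{\mathcal{A}}(X) + f^{\mathcal{A}}(Y) - \langle X, Y \rangle
  = \tfrac{1}{2} \|X - Y\|^2 \geq 0 .
```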

Proposition (GJNS)

For µ, ν ∈ Σ_{m,R}, the coupling constant C(µ, ν) is equal to the infimum of µ(f) + ν(g) over all admissible pairs of E-convex functions. Furthermore, a coupling (A, X, Y) of µ and ν is optimal if and only if there exists such an admissible pair (f, g) with f^A(X) + g^A(Y) = ⟨X, Y⟩_{L^2(A)^m_sa}, and in this case both quantities are equal to C(µ, ν).


Applications — isomorphism of von Neumann algebras

If (Ω,P,X,Y) is a classical optimal coupling of µ, ν ∈ P([−R,R]^m), then we saw that Y ∈ ∂f(X) almost surely for some convex function f. If f is differentiable everywhere, then ∂f(X) consists of a single point ∇f(X), and hence Y can be expressed as a function of X.

In the non-commutative setting, if (A, X, Y) is an optimal coupling, then Y ∈ ∂f^A(X) for some E-convex function f. E-convexity implies that ∂f^A(X) contains a point in L^2(W∗(X))^m_sa, and hence if f^A is sufficiently regular, then Y must belong to W∗(X)^m_sa. This led to the following result:

Theorem (GJNS)

Let (A, X, Y) be an optimal coupling of µ, ν ∈ Σ_{m,R}. Then for every t ∈ (0, 1), we have W∗((1 − t)X + tY) = W∗(X, Y).


Applications — demonstrating optimality

Guionnet and Shlyakhtenko (2014) considered the case where µ is a log-concave free Gibbs law and ν is the law of a semicircular family S. They showed that there was some type of convex function f such that ∇f(S) ∼ µ when S = (S_1, ..., S_m) ∼ ν is a free semicircular family. However, they did not show that (W∗(S), S, ∇f(S)) is an optimal coupling of ν and µ.

The preprint of J.-Li-Shlyakhtenko (Jan. 2021) used a special case of non-commutative Monge-Kantorovich duality to prove that the coupling was indeed optimal.

One of the main goals of the current work (Gangbo-J.-Nam-Shlyakhtenko) was to set up the non-commutative Monge-Kantorovich duality in greater generality.


Connections — quantum information theory

Definition (Anantharaman-Delaroche)

If A and B are tracial W∗-algebras, then a completely positive map Φ : A → B is said to be factorizable if it is realized as a trace-preserving inclusion A → C followed by a trace-preserving conditional expectation C → B (and in this case, we say Φ factorizes through C).

Observation (GJNS)

Let X ∈ A^m_sa and Y ∈ B^m_sa be non-commutative tuples. Then

C(λ_X, λ_Y) = sup_{Φ ∈ FM(A,B)} ⟨Φ(X), Y⟩_{L^2(B)^m_sa},

where FM(A,B) denotes the space of factorizable maps (also known as quantum channels).
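One direction of this observation can be sketched directly: every factorizable map produces a coupling, so the supremum is at most C(λ_X, λ_Y). The names α and β for the two inclusions are introduced for this sketch only.

```latex
% Suppose \Phi = \beta^* \circ \alpha, where \alpha : \mathcal{A} \to \mathcal{C} and
% \beta : \mathcal{B} \to \mathcal{C} are trace-preserving inclusions and
% \beta^* : \mathcal{C} \to \mathcal{B} is the conditional expectation. Then
\langle \Phi(X), Y \rangle_{L^2(\mathcal{B})^m_{sa}}
  = \langle \beta^*(\alpha(X)), Y \rangle_{L^2(\mathcal{B})^m_{sa}}
  = \langle \alpha(X), \beta(Y) \rangle_{L^2(\mathcal{C})^m_{sa}} ,
% and (\mathcal{C}, \alpha(X), \beta(Y)) is a coupling of \lambda_X and \lambda_Y, so
\langle \Phi(X), Y \rangle_{L^2(\mathcal{B})^m_{sa}} \leq C(\lambda_X, \lambda_Y).
```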



Theorem (Musat-Rørdam 2020)

For large enough n, there exist factorizable maps M_n(C) → M_n(C) that factorize through the hyperfinite II_1 factor but not through any finite-dimensional algebra.

Corollary (GJNS)

For sufficiently large n, for all d, there exist X, Y ∈ M_n(C)^{n^2}_sa such that an optimal coupling of λ_X and λ_Y requires an algebra of dimension at least d.

Problem

Study the optimal couplings of X, Y ∈ M_n(C)^m_sa for explicit examples. Can you show directly that d may need to be infinite?



Theorem (Haagerup-Musat 2015)

A completely positive map M_n(C) → M_n(C) factorizes through a Connes-embeddable tracial W∗-algebra if and only if it can be approximated by maps that factorize through finite-dimensional algebras. Moreover, the Connes embedding problem has a positive answer if and only if every factorizable map can be approximated by those that factorize through finite-dimensional algebras.

Ji-Natarajan-Vidick-Wright-Yuen 2020 announced a negative solution to the Connes embedding problem, which would imply the following corollary.

Corollary (GJNS)

For sufficiently large n, there exist X, Y ∈ M_n(C)^{n^2}_sa such that every optimal coupling of λ_X and λ_Y uses a non-Connes-embeddable tracial W∗-algebra.


Connections — lifting properties

Proposition (GJNS)

Let µ ∈ Σ_{m,R}, and let X ∈ A^m_sa be such that λ_X = µ and A = W∗(X). Then the following are equivalent:

(1) The weak-∗ topology and the Wasserstein topology on Σ_{m,R} agree at the point µ.

(2) Every trace-preserving embedding of A into a tracial ultraproduct ∏_{n→U} A_n lifts to a sequence of factorizable completely positive maps Φ_n : A → A_n.

Corollary (GJNS)

In the above proposition, if A is Connes-embeddable, then the weak-∗ and Wasserstein topologies agree at µ if and only if A is amenable.

This relies on Connes' 1976 paper of course, and it is related to recent results of Atkinson and Kunnawalkam Elayavalli characterizing amenability through tracial stability properties.


References (in order of appearance) I

1 Wilfrid Gangbo, David Jekel, Kyeongsik Nam, and Dimitri Shlyakhtenko. Duality for optimal couplings in free probability. arXiv:2105.12351

2 Philippe Biane and Dan-Virgil Voiculescu. A free probability analogue of the Wasserstein metric on the trace-state space. Geometric and Functional Analysis, 11:1125–1138, 2001.

3 Narutaka Ozawa. There is no separable universal II_1 factor. Proc. Amer. Math. Soc., 132:487–490, 2004.

4 Alice Guionnet and Dimitri Shlyakhtenko. Free monotone transport. Inventiones Mathematicae, 197(3):613–661, 2014.

5 David Jekel, Wuchen Li, and Dimitri Shlyakhtenko. Tracial non-commutative smooth functions and the free Wasserstein manifold. arXiv:2101.06572, 2021.


References (in order of appearance) II

6 Claire Anantharaman-Delaroche. On ergodic theorems for free group actions on noncommutative spaces. Probab. Theory Rel. Fields, 135:520–546, 2006.

7 Magdalena Musat and Mikael Rørdam. Non-closure of quantum correlation matrices and factorizable channels that require infinite dimensional ancilla (with an appendix by Narutaka Ozawa). Comm. Math. Phys., 375:1761–1776, 2020.

8 Uffe Haagerup and Magdalena Musat. An asymptotic property of factorizable completely positive maps and the Connes embedding problem. Comm. Math. Phys., 338(2):721–752, 2015.

9 Zhengfeng Ji, Anand Natarajan, Thomas Vidick, John Wright, and Henry Yuen. MIP∗ = RE. arXiv:2001.04383

10 Alain Connes. Classification of injective factors. Cases II_1, II_∞, III_λ, λ ≠ 1. Ann. of Math. (2), 104(1):73–115, 1976.


References (in order of appearance) III

11 Scott Atkinson and Srivatsav Kunnawalkam Elayavalli. On ultraproduct embeddings and amenability for tracial von Neumann algebras. International Mathematics Research Notices, 2021(4):2882–2918, 2021.

