Advances in Metric Embedding Theory
Yair Bartal
Hebrew University & Caltech
UCLA IPAM 07
Metric Spaces
Metric space: (X,d), d: X² → R⁺
d(u,v) = d(v,u)
d(v,w) ≤ d(v,u) + d(u,w)
d(u,u) = 0
Data representation: pictures (e.g. faces), web pages, DNA sequences, …
Network: communication distance
Metric Embedding
Simple representation: translate metric data into an easy-to-analyze form and gain geometric structure, e.g. embed in low-dimensional Euclidean space.
Algorithmic application: apply algorithms for a “nice” space to solve problems on “problematic” metric spaces.
Embedding Metric Spaces
Metric spaces (X,d_X), (Y,d_Y).
An embedding is a function f: X→Y.
For an embedding f and u,v in X, let
$\mathrm{dist}_f(u,v) = \frac{d_Y(f(u),f(v))}{d_X(u,v)}$
Distortion: $c = \max_{u \ne v \in X} \mathrm{dist}_f(u,v) \,/\, \min_{u \ne v \in X} \mathrm{dist}_f(u,v)$
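To make the definition concrete, here is a minimal Python sketch that evaluates dist_f(u,v) = d_Y(f(u),f(v)) / d_X(u,v) over all pairs and returns the distortion c = max / min. The function name `distortion` and the 4-cycle example are illustrative assumptions, not taken from the talk.

```python
from itertools import combinations

def distortion(points, d_x, d_y, f):
    """Return max dist_f / min dist_f over all pairs of distinct points."""
    ratios = [d_y(f(u), f(v)) / d_x(u, v) for u, v in combinations(points, 2)]
    return max(ratios) / min(ratios)

# Hypothetical example: the 4-cycle with shortest-path distance,
# mapped to the real line by sending vertex i to the number i.
cycle = [0, 1, 2, 3]
d_cycle = lambda u, v: min(abs(u - v), 4 - abs(u - v))
d_line = lambda a, b: abs(a - b)

c = distortion(cycle, d_cycle, d_line, lambda u: u)
print(c)
```

In this example only the pair (0,3) is stretched (cycle distance 1, line distance 3), so the distortion is driven by a single pair.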
Special Metric Spaces
Euclidean space
l_p metric in R^n: $\|x-y\|_p = \left(\sum_{i=1}^{n} |x_i - y_i|^p\right)^{1/p}$
Planar metrics
Tree metrics
Ultrametrics
Doubling
Embedding in Normed Spaces
[Fréchet Embedding]: Any n-point metric space embeds isometrically in L∞.
Proof.
[Figure: points x, y, w]
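One standard way to realize such an embedding is to map each point x to its vector of distances (d(x,w)) over all w in X. The triangle inequality gives |d(x,w) - d(y,w)| ≤ d(x,y) in every coordinate, and the coordinate w = y attains equality. The Python sketch below checks this on a small cycle metric; the function names and the example are illustrative assumptions.

```python
from itertools import combinations

def frechet_embedding(points, d):
    """Map each point x to the vector (d(x, w)) for w ranging over all points."""
    return {x: [d(x, w) for w in points] for x in points}

def l_inf(a, b):
    """The l_infinity distance between two vectors of equal length."""
    return max(abs(s - t) for s, t in zip(a, b))

# Hypothetical example: the 5-cycle with shortest-path distance.
points = [0, 1, 2, 3, 4]
d = lambda u, v: min(abs(u - v), 5 - abs(u - v))

emb = frechet_embedding(points, d)
is_isometric = all(l_inf(emb[x], emb[y]) == d(x, y)
                   for x, y in combinations(points, 2))
print(is_isometric)
```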
Embedding in Normed Spaces
[Bourgain 85]: Any n-point metric space embeds in L_p with distortion Θ(log n).
[Johnson-Lindenstrauss 85]: Any n-point subset of Euclidean space embeds with distortion (1+ε) in dimension Θ(ε^{-2} log n).
[ABN 06, B 06]: Dimension Θ(log n). In fact: Θ*(log n / loglog n).
Embedding Metrics in their Intrinsic Dimension
Definition: A metric space X has doubling constant λ if any ball with radius r>0 can be covered with λ balls of half the radius.
Doubling dimension: dim(X) = log λ.
[ABN 07b]: Any n-point metric space X can be embedded into L_p with distortion O(log^{1+θ} n) and dimension O(dim(X)).
Same embedding, using: nets, Lovász Local Lemma.
Distortion-Dimension Tradeoff
Average Distortion
Practical measure of the quality of an embedding: network embedding, multi-dimensional scaling, biology, vision, …
Given a non-contracting embedding f: (X,d_X)→(Y,d_Y):
$\mathrm{dist}_f(u,v) = \frac{d_Y(f(u),f(v))}{d_X(u,v)}$
$\mathrm{avgdist}(f) = \binom{n}{2}^{-1} \sum_{u \ne v \in X} \mathrm{dist}_f(u,v)$
$\mathrm{distavg}(f) = \frac{\sum_{u \ne v \in X} d_Y(f(u),f(v))}{\sum_{u \ne v \in X} d_X(u,v)}$
[ABN06]: Every n-point metric space embeds into L_p with average distortion O(1), worst-case distortion Θ(log n) and dimension Θ(log n).
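The l_q-Distortion
l_q-distortion:
$\mathrm{dist}_q(f) = \left( \binom{n}{2}^{-1} \sum_{u \ne v \in X} \mathrm{dist}_f(u,v)^q \right)^{1/q}$
$\mathrm{dist}_\infty(f) = \max_{u \ne v \in X} \mathrm{dist}_f(u,v)$
$\mathrm{dist}_1(f) = \binom{n}{2}^{-1} \sum_{u \ne v \in X} \mathrm{dist}_f(u,v)$
$\mathrm{dist}_2(f) = \left( \binom{n}{2}^{-1} \sum_{u \ne v \in X} \mathrm{dist}_f(u,v)^2 \right)^{1/2}$
[ABN 06]: l_q-distortion is bounded by Θ(q).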
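The l_q-distortion is the q-th power mean of the pairwise ratios dist_f(u,v): q = 1 gives the average distortion and q = ∞ the worst case. The Python sketch below computes it for a non-contracting map; the name `lq_distortion` and the 4-cycle example are illustrative assumptions.

```python
from itertools import combinations

def lq_distortion(points, d_x, d_y, f, q):
    """q-th power mean of dist_f(u, v) over all pairs; q = inf gives the maximum."""
    ratios = [d_y(f(u), f(v)) / d_x(u, v) for u, v in combinations(points, 2)]
    if q == float("inf"):
        return max(ratios)
    return (sum(r ** q for r in ratios) / len(ratios)) ** (1.0 / q)

# Hypothetical example: 4-cycle mapped to the line by vertex index
# (non-contracting: every ratio is at least 1).
cycle = [0, 1, 2, 3]
d_cycle = lambda u, v: min(abs(u - v), 4 - abs(u - v))
d_line = lambda a, b: abs(a - b)
identity = lambda u: u

avg = lq_distortion(cycle, d_cycle, d_line, identity, 1)
l2 = lq_distortion(cycle, d_cycle, d_line, identity, 2)
worst = lq_distortion(cycle, d_cycle, d_line, identity, float("inf"))
print(avg, l2, worst)
```

Power means are non-decreasing in q, so the values interpolate between the average and the worst-case distortion.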
Dimension Reduction into Constant Dimension
[B 07]: Any finite subset of Euclidean space embeds in dimension h with l_q-distortion e^{O(q/h)} ~ 1 + O(q/h).
Corollary: Every finite metric space embeds into L_p in dimension h with l_q-distortion [formula garbled in source: "phqO heq 121)/("].
Local Embeddings
Def: A k-local embedding has distortion D(k) if for every pair of k-nearest neighbors x,y: dist_f(x,y) ≤ D(k).
[ABN 07c]: For fixed k, k-local embedding into L_p with distortion O(log k) and dimension O(log k) (under a very weak growth bound condition).
[ABN 07c]: k-local embedding into L_p with distortion Õ(log k) on neighbors, for all k simultaneously, and dimension O(log n).
Same embedding method
Lovász Local Lemma
Local Dimension Reduction
[BRS 07]: For fixed k, any finite set of points in Euclidean space has a k-local embedding with distortion (1+ε) in dimension O(ε^{-2} log k) (under a very weak growth bound condition).
New embedding ideas
Lovász Local Lemma
Time for a…
Metric Ramsey Problem
Given a metric space, what is the largest subspace that has some special structure, e.g. is close to Euclidean?
Graph theory: Every graph of size n contains either a clique or an independent set of size Ω(log n).
Dvoretzky’s theorem…
[BFM 86]: Every n-point metric space contains a subspace of size Ω(c_ε log n) which embeds in Euclidean space with distortion (1+ε).
Basic Structures: Ultrametric, k-HST [B 96]
[Figure: labeled tree with nodes u, v, w and leaves x, z; Δ(z) = 0 at a leaf]
d(x,z) = Δ(lca(x,z)) = Δ(v)
0 = Δ(z) ≤ Δ(w)/k ≤ Δ(v)/k² ≤ Δ(u)/k³
• An ultrametric k-embeds in a k-HST (moreover this can be done so that labels are powers of k).
Hierarchically Well-Separated Trees
[Figure: tree whose levels carry labels Δ1, Δ2 and Δ3, annotated with Δ1/k next to Δ2 and Δ2/k next to Δ3]
Properties of Ultrametrics
An ultrametric is a tree metric.
Ultrametrics embed isometrically in l_2.
[BM 04]: Any n-point ultrametric (1+ε)-embeds in l_p^d, where d = O(ε^{-2} log n).
A Metric Ramsey Phenomenon
Consider n equally spaced points on the line.
Choose a “Cantor like” set of points, and construct a binary tree over them.
The resulting tree is a 3-HST, and the original subspace embeds in this tree with distortion 3.
Size of subspace: $2^{\log_3 n} = n^{\log_3 2}$.
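The construction can be checked numerically. The Python sketch below assumes n = 3^m points 0, …, n-1 and takes the Cantor-like set to be the integers whose ternary digits are all 0 or 2; the tree distance of two points is taken to be 3^(j+1), where j is the highest ternary digit in which they differ. Both choices are one plausible reading of the slide, not a construction confirmed by it.

```python
import math
from itertools import combinations

def cantor_like(m):
    """Integers below 3**m whose ternary digits are all 0 or 2."""
    pts = [0]
    for j in range(m):
        pts = pts + [p + 2 * 3 ** j for p in pts]
    return sorted(pts)

def tree_distance(a, b):
    """3**(j+1) for the highest ternary digit j where a and b differ (0 if a == b)."""
    top, j = -1, 0
    while a > 0 or b > 0:
        if a % 3 != b % 3:
            top = j
        a, b = a // 3, b // 3
        j += 1
    return 3 ** (top + 1) if top >= 0 else 0

m = 4
n = 3 ** m
subset = cantor_like(m)
ratios = [tree_distance(a, b) / abs(a - b) for a, b in combinations(subset, 2)]
print(len(subset), max(ratios))
```

Every ratio lies in [1, 3), so the tree metric dominates the line metric and stretches it by less than 3, while the subset has 2^m = n^(log_3 2) points.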
Metric Ramsey Phenomena
[BLMN 03, MN 06, B 06]: Any n-point metric space contains a subspace of size n^{1-ε} which embeds in an ultrametric with distortion Θ(1/ε).
[B 06]: Any n-point metric space contains a subspace of linear size which embeds in an ultrametric with l_q-distortion bounded by Õ(q).
Metric Ramsey Theorems
Key ingredient: partitions
Complete Representation via Ultrametrics?
Goal: Given an n-point metric space, we would like to embed it into an ultrametric with low distortion.
Lower bound: Ω(n); in fact this holds even for embedding the n-cycle into arbitrary tree metrics [RR 95].
Probabilistic Embedding
[Karp 89]: The n-cycle probabilistically embeds in n-line spaces with distortion 2.
If u,v are adjacent in the cycle C, then
E(d_L(u,v)) = (n-1)/n + (n-1)/n < 2 = 2 d_C(u,v)
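The expectation for adjacent vertices comes from deleting one of the n cycle edges uniformly at random: with probability (n-1)/n the pair stays at distance 1, and with probability 1/n its own edge is cut and the distance becomes n-1. The Python sketch below computes the expectation exactly; the function name and vertex labeling 0, …, n-1 are illustrative assumptions.

```python
from fractions import Fraction

def expected_line_distance(n, u, v):
    """Exact E[d_L(u, v)] when one edge of the n-cycle is cut uniformly at random."""
    total = Fraction(0)
    for cut in range(n):
        # Cutting edge (cut, cut+1 mod n) leaves a path starting at cut+1.
        pos_u = (u - cut - 1) % n
        pos_v = (v - cut - 1) % n
        total += abs(pos_u - pos_v)
    return total / n

n = 10
adjacent = expected_line_distance(n, 0, 1)
print(adjacent)
```

For adjacent vertices this evaluates to 2(n-1)/n, which is strictly less than 2.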
Probabilistic Embedding
[B 96,98,04, FRT 03]: Any n-point metric space probabilistically embeds into an ultrametric with distortion Θ(log n).
[ABN 05,06, CDGKS 05]: l_q-distortion is Θ(q).
Probabilistic Embedding
Key ingredient: probabilistic partitions
Probabilistic Partitions
P = {S_1, S_2, …, S_t} is a partition of X if $\bigcup_i S_i = X$ and $\forall i \ne j:\ S_i \cap S_j = \emptyset$.
P(x) is the cluster containing x.
P is Δ-bounded if diam(S_i) ≤ Δ for all i.
A probabilistic partition P is a distribution over a set of partitions.
P is (η,δ)-padded if $\Pr[B(x,\eta\Delta) \subseteq P(x)] \ge \delta$.
Call P η-padded if δ = 1/2.
[Figure: points x1 and x2 with padding balls of radius η]
• [B 96]: η = Ω(1/log n)
• [CKR01+FRT03, ABN06]: η(x) = Ω(1/log ρ(x,Δ))
Partitions and Embedding
[B 96, Rao 99, …]
Let Δ_i = 4^i be the scales.
For each scale i, create a probabilistic Δ_i-bounded partition P_i that is η-padded.
For each cluster choose σ_i(S) ~ Ber(½) i.i.d.
f_i(x) = σ_i(P_i(x))·d(x, X\P_i(x))
$f(x) = \bigoplus_{i>0} f_i(x)$
Repeat O(log n) times.
Distortion: O(η^{-1}·log^{1/p} Δ).
Dimension: O(log n·log Δ).
[Figure: scales Δ_i = 4, 16, … up to the diameter Δ of X; a point x and its distance d(x, X\P(x)) to the boundary of its cluster]
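A single coordinate f_i(x) = σ_i(P_i(x))·d(x, X\P_i(x)) can be sketched in Python as follows. The partition and the bits σ are fixed by hand here for illustration, whereas in the scheme above both are random; the function name, the example, and the convention for a cluster equal to all of X are assumptions of this sketch.

```python
from itertools import combinations

def partition_coordinate(points, d, partition, sigma):
    """f(x) = sigma(cluster of x) * distance from x to the complement of its cluster.

    Assumption of this sketch: if a cluster is all of X, the distance
    to the (empty) complement is taken to be 0.
    """
    f = {}
    for idx, cluster in enumerate(partition):
        outside = [y for y in points if y not in cluster]
        for x in cluster:
            dist_out = min((d(x, y) for y in outside), default=0)
            f[x] = sigma[idx] * dist_out
    return f

# Hypothetical example: 8 points on a line, two clusters, sigma = (1, 0).
points = list(range(8))
d = lambda u, v: abs(u - v)
partition = [{0, 1, 2, 3}, {4, 5, 6, 7}]
sigma = [1, 0]

f = partition_coordinate(points, d, partition, sigma)
lipschitz = all(abs(f[x] - f[y]) <= d(x, y) for x, y in combinations(points, 2))
print(f, lipschitz)
```

The coordinate never expands distances: within a cluster the distance to the complement is 1-Lipschitz, and across clusters each value is at most d(x,y).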
Time to…
Uniform Probabilistic Partitions
In a uniform probabilistic partition η: X→[0,1], all points in a cluster have the same padding parameter.
[ABN 06]: Uniform partition lemma: There exists a uniform probabilistic Δ-bounded partition such that for any x ∈ C, η(x) = log^{-1} ρ(v,Δ), where $\rho(v,\Delta) = \min_{x \in C} \rho(x,\Delta)$.
The local growth rate of x at radius r is: $\rho(x,r) = \frac{|B(x,4r)|}{|B(x,r/4)|}$
[Figure: clusters C1 and C2 containing points v1, v2, v3, with padding parameters η(C1) and η(C2)]
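The local growth rate ρ(x,r) = |B(x,4r)| / |B(x,r/4)| is easy to compute directly. The Python sketch below assumes closed balls (points at distance exactly r are counted); the function names and the line example are illustrative.

```python
def ball_size(points, d, x, r):
    """Number of points within distance r of x (closed ball, an assumption here)."""
    return sum(1 for y in points if d(x, y) <= r)

def growth_rate(points, d, x, r):
    """Local growth rate rho(x, r) = |B(x, 4r)| / |B(x, r/4)|."""
    return ball_size(points, d, x, 4 * r) / ball_size(points, d, x, r / 4)

# Hypothetical example: 16 equally spaced points on a line.
points = list(range(16))
d = lambda u, v: abs(u - v)

rho = growth_rate(points, d, 0, 4)
print(rho)
```

Here B(0,16) contains all 16 points and B(0,1) contains 2, so the growth rate at x = 0 and r = 4 is 8.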
Embedding into a single dimension
Let Δ_i = 4^i.
For each scale i, create uniformly padded probabilistic Δ_i-bounded partitions P_i.
For each cluster choose σ_i(S) ~ Ber(½) i.i.d.
$f(x) = \sum_{i>0} f_i(x)$, f_i(x) = σ_i(P_i(x))·η_i^{-1}(x)·d(x, X\P_i(x))
Upper bound: |f(x)-f(y)| ≤ O(log n)·d(x,y).
Lower bound: E[|f(x)-f(y)|] ≥ Ω(d(x,y)).
Replicate D = Θ(log n) times to get high probability.
Upper Bound: |f(x)-f(y)| ≤ O(log n) d(x,y)
f_i(x) = σ_i(P_i(x))·η_i^{-1}(x)·d(x, X\P_i(x))
For all x,y ∈ X:
- P_i(x) ≠ P_i(y) implies f_i(x) ≤ η_i^{-1}(x)·d(x,y)
- P_i(x) = P_i(y) implies f_i(x) - f_i(y) ≤ η_i^{-1}(x)·d(x,y)
Use uniform padding in cluster.
$\sum_{i>0} |f_i(x)-f_i(y)| \le d(x,y) \sum_{i>0} \eta_i^{-1}(x) \le d(x,y) \sum_{i>0} \log\frac{|B(x,4\Delta_i)|}{|B(x,\Delta_i/4)|} \le O(\log n)\, d(x,y)$
The last step telescopes: with Δ_i = 4^i, the ball of radius 4Δ_i at scale i is the ball of radius Δ_{i+2}/4 at scale i+2, so all but four terms cancel and the sum is at most 2 log n.
[Figure: points x and y]
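The O(log n) upper bound rests on a telescoping sum: with Δ_i = 4^i, the sum over scales of log(|B(x,4Δ_i)| / |B(x,Δ_i/4)|) collapses to at most 2 log n, because every ball has between 1 and n points. The Python sketch below checks this numerically on an arbitrary point set; the point set, the closed-ball convention and the function names are assumptions for illustration.

```python
import math

def ball_size(points, d, x, r):
    """Number of points within distance r of x (closed ball)."""
    return sum(1 for y in points if d(x, y) <= r)

def sum_log_growth(points, d, x, num_scales):
    """Sum over scales i = 1..num_scales of log2(|B(x, 4*4^i)| / |B(x, 4^i / 4)|)."""
    total = 0.0
    for i in range(1, num_scales + 1):
        delta = 4 ** i
        total += math.log2(ball_size(points, d, x, 4 * delta)
                           / ball_size(points, d, x, delta / 4))
    return total

# Hypothetical example: unevenly spaced points 0, 1, 4, 9, ... on a line.
points = [i * i for i in range(40)]
d = lambda u, v: abs(u - v)
n = len(points)

# 7 scales suffice here: 4^7 exceeds the diameter of the point set.
worst = max(sum_log_growth(points, d, x, 7) for x in points)
print(worst, 2 * math.log2(n))
```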
Lower Bound: E[|f(x)-f(y)|] ≥ Ω(d(x,y))
Take a scale i such that Δ_i ≈ d(x,y)/4.
It must be that P_i(x) ≠ P_i(y).
With probability ½: η_i^{-1}(x)·d(x, X\P_i(x)) ≥ Δ_i.
Let $R = \left| \sum_{j \ne i} (f_j(x) - f_j(y)) \right|$.
Two cases:
1. R < Δ_i/2: with prob. ⅛, σ_i(P_i(x)) = 1 and σ_i(P_i(y)) = 0. Then f_i(x) ≥ Δ_i and f_i(y) = 0, so |f(x)-f(y)| ≥ Δ_i/2 = Ω(d(x,y)).
2. R ≥ Δ_i/2: with prob. ¼, σ_i(P_i(x)) = 0 and σ_i(P_i(y)) = 0. Then f_i(x) = f_i(y) = 0, so |f(x)-f(y)| ≥ Δ_i/2 = Ω(d(x,y)).
Partial Embedding & Scaling Distortion
Definition: A (1-ε)-partial embedding has distortion D(ε) if at least a 1-ε fraction of the pairs satisfy dist_f(u,v) ≤ D(ε).
Definition: An embedding has scaling distortion D(·) if it is a (1-ε)-partial embedding with distortion D(ε), for all ε>0.
[KSW 04], [ABN 05, CDGKS 05]: Partial distortion and dimension O(log(1/ε)).
[ABN06]: Scaling distortion O(log(1/ε)) for all metrics.
l_q-Distortion vs. Scaling Distortion
Upper bound D(ε) ≤ c log(1/ε) on scaling distortion:
½ of pairs have distortion ≤ c log 2 = c
+ ¼ of pairs have distortion ≤ c log 4 = 2c
+ ⅛ of pairs have distortion ≤ c log 8 = 3c
…
$\mathrm{avgdist} \le \sum_{i>0} 2^{-i} \cdot i \cdot c = 2c$
Average distortion = O(1)
Worst-case distortion = O(log n)
l_q-distortion = O(min{q, log n})
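The averaging argument can be checked numerically: if a 2^(-i) fraction of the pairs has distortion at most c·i, the average is at most c·Σ i·2^(-i) = 2c, and the analogous q-th power mean grows roughly linearly in q. The Python sketch below evaluates both sums; the function names and the truncation at 200 levels are assumptions for illustration.

```python
def average_bound(c, levels=200):
    """Sum over i = 1..levels of 2^(-i) * (c * i); tends to 2c as levels grows."""
    return sum(2.0 ** -i * c * i for i in range(1, levels + 1))

def lq_bound(c, q, levels=200):
    """q-th power mean of the same layered bounds: (sum 2^(-i) * (c*i)^q)^(1/q)."""
    return sum(2.0 ** -i * (c * i) ** q for i in range(1, levels + 1)) ** (1.0 / q)

avg = average_bound(1.0)
growth = {q: lq_bound(1.0, q) for q in (1, 2, 4, 8)}
print(avg, growth)
```

In this example each l_q value stays below 2q, in line with the O(q) bound stated above.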
Coarse Scaling Embedding into L_p
Definition: For u ∈ X, r_ε(u) is the minimal radius such that |B(u, r_ε(u))| ≥ εn.
Coarse scaling embedding: For each u ∈ X, preserves distances to v s.t. d(u,v) ≥ r_ε(u).
[Figure: points u, v, w with balls of radii r_ε(u), r_ε(v), r_ε(w)]
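The radius r_ε(u) is the distance from u to its ⌈εn⌉-th nearest point, counting u itself. The Python sketch below computes it under the assumption of closed balls; the function name `r_eps` and the line example are illustrative.

```python
import math

def r_eps(points, d, u, eps):
    """Smallest radius r with |B(u, r)| >= eps * n, using closed balls."""
    dists = sorted(d(u, v) for v in points)   # includes d(u, u) = 0
    k = max(math.ceil(eps * len(points)), 1)  # number of points the ball must hold
    return dists[k - 1]

# Hypothetical example: 10 equally spaced points on a line.
points = list(range(10))
d = lambda a, b: abs(a - b)

radius = r_eps(points, d, 0, 0.5)
print(radius)
```

Here the ball around 0 must hold 5 of the 10 points, which first happens at radius 4.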
Scaling Distortion
Claim: If d(x,y) ≥ r_ε(x) then 1 ≤ dist_f(x,y) ≤ O(log 1/ε).
Let l be the scale with d(x,y) ≤ Δ_l < 4d(x,y).
Lower bound: E[|f(x)-f(y)|] ≥ Ω(d(x,y))
Upper bound for high diameter terms: $\sum_{i>l} |f_i(x)-f_i(y)| = O(\log(1/\varepsilon))\, d(x,y)$
Upper bound for low diameter terms: $\sum_{i \le l} |f_i(x)-f_i(y)| = O(1)\, d(x,y)$
Replicate D = Θ(log n) times to get high probability.
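Upper Bound for high diameter terms: |f(x)-f(y)| ≤ O(log 1/ε) d(x,y)
Scale l such that r_ε(x) ≤ d(x,y) ≤ Δ_l < 4d(x,y).
f_i(x) = σ_i(P_i(x))·η_i^{-1}(x)·d(x, X\P_i(x))
|B(x, r_ε(x))| ≥ εn
$\sum_{i>l} |f_i(x)-f_i(y)| \le d(x,y) \sum_{i>l} \eta_i^{-1}(x) \le d(x,y) \sum_{i>l} \log\frac{|B(x,4\Delta_i)|}{|B(x,\Delta_i/4)|} \le O(\log(1/\varepsilon))\, d(x,y)$
For i > l we have Δ_i/4 ≥ Δ_l ≥ r_ε(x), so every ball in the denominator has at least εn points, and the telescoping sum is at most 2 log(n/(εn)) = 2 log(1/ε).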
Upper Bound for low diameter terms: |f(u)-f(v)| = O(1) d(u,v)
Scale l such that d(x,y) ≤ Δ_l < 4d(x,y).
Replace f_i(x) = σ_i(P_i(x))·η_i^{-1}(x)·d(x, X\P_i(x)) by the capped version f_i(x) = σ_i(P_i(x))·min{η_i^{-1}(x)·d(x, X\P_i(x)), Δ_i}.
All lower levels i ≤ l are bounded by Δ_i:
$\sum_{i \le l} |f_i(x)-f_i(y)| \le \sum_{i \le l} \Delta_i = O(\Delta_l) = O(1)\, d(x,y)$
Combined: $\sum_{i>0} |f_i(x)-f_i(y)| = \left(O(\log(1/\varepsilon)) + O(1)\right) d(x,y)$
Embedding into Trees with Constant Average Distortion
[ABN 07a]: An embedding of any n-point metric into a single ultrametric.
An embedding of any graph on n vertices into a spanning tree of the graph.
Average distortion = O(1).
l_2-distortion = O(√(log n))
l_q-distortion = Θ(n^{1-2/q}), for 2 < q ≤ ∞
Conclusion
Developing mathematical theory of embedding of finite metric spaces
Fruitful interaction between computer science and pure/applied mathematics
New concepts of embedding yield surprisingly strong properties
Summary
Unified framework for embedding finite metrics.
Probabilistic embedding into ultrametrics.
Metric Ramsey theorems.
New measures of distortion.
Embeddings with strong properties:
- Optimal scaling distortion.
- Constant average distortion.
- Tight distortion-dimension tradeoff.
- Embedding metrics in their intrinsic dimension.
- Embeddings that strongly preserve locality.