the geometry of graphs and some of its algorithmic...

15
The geometry of graphs and some of its algorithmic applications Nathan Linial * Eran London Yuri Rabinovich Institute of Computer Science Institute of Mathematics Department of Computer Science Hebrew University Hebrew University University of Toronto Jerusalem 91904 Jerusalem 91904 Toronto, Ontario M5S 1A4 Israel Israel Canada nati@cs. huji. ac.il [email protected]. ac.il [email protected]. edu Abstract In this paper we explore some implications of view- ing graphs as geometric objects. This approach of- fers a new perspective on a number of graph-theoretic and algorithmic problems. There are several ways to model graphs geometrically and our main concern here is with geometric representations that respect the met- ric of the (possibly weighted) graph. Given a graph G we map its vertices to a normed space in an attempt to (i) Keep down the dimension of the host space and (ii) Guarantee a small distortion, i.e., make sure that distances between vertices in G closely match the dis- tances between their geometric images. In this paper we develop efficient algorithms for em- bedding graphs low-dimensionally with a small distor- tion. Further algorithmic applications include: 0 A simple, unified approach to a number of prob- lems on multicommodity flows, including the Leighton-Rae Theorem [29] and some of its ex- tensions. 0 For graphs embeddable in low-dimensional spaces with a small distortion, we can find low-diameter decompositions (in the sense of [4] and [34]). The parameters of the decomposition depend only on the dimension and the distortion and not on the size of the graph. arators can be found efficiently. 0 In graphs embedded this way, small balanced sep- Faithful low-dimensional representations of statisti- cal data allow for meaningful and efficient cluster- ing, which is one of the most basic tasks in pattern- recognition. For the (mostly heuristic) methods used *Supportedin part by grants from the Israeli Academy of Sciences, the Binational Science FoundationIsrael-USA and the Niedersachsen-Isriel Research Program. in the practice of pattern-recognition, see Duda and Hart [15], especially chapter 6. 1 Introduction Many combinatorial and algorithmic problems con- cern either directly or implicitly the distance, or met- ric on the vertices of a possibly weighted graph. It is, therefore, natural to look for connections between such questions and classical geometric structures. The approach taken here is to model graph metrics by map- ping the vertices to a real normed spaces in such a way that the distance between any two vertices is close to the distance between their geometric images. Embed- dings are sought so as to minimize (i) The dimension of the host space, and (ii) The distortion, i.e., the ex- tent to which the combinatorial and geometric metrics disagree. Specifically, we ask: 1. How small can the dimension and the distortion be for given graphs? Is it computationally feasible to find good embeddings? 2. Which graph-algorithmic problems are easier for graphs with favorable (low-dimensional, small distortion) embeddings? What are the combina- torial implications of such embeddings? 3. The above discussion extends to embeddings of general finite metric spaces. What are the al- gorithmic and combinatorial implications in this more general context? Here are some of the answers provided in this paper: 1. We develop a randomized polynomial-time al- gorithm that embeds every finite metric space in a Euclidean space with minimum distortion. 577 0272-5428/94 $04.00 Q 1994 IEEE

Upload: others

Post on 26-Aug-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: The geometry of graphs and some of its algorithmic ...people.ee.duke.edu/~lcarin/GeometryofGraphs.pdf · The geometry of graphs and some of its algorithmic applications Nathan Linial

The geometry of graphs and some of its algorithmic applications

Nathan Linial * Eran London Yuri Rabinovich

Institute of Computer Science Institute of Mathematics Department of Computer Science Hebrew University Hebrew University University of Toronto Jerusalem 91904 Jerusalem 91904 Toronto, Ontario M5S 1A4

Israel Israel Canada nati@cs. huji. ac. il [email protected]. ac.il [email protected]. edu

Abstract

In this paper we explore some implications of view- ing graphs as geometric objects. This approach of- fers a new perspective on a number of graph-theoretic and algorithmic problems. There are several ways to model graphs geometrically and our main concern here is with geometric representations that respect the met- ric of the (possibly weighted) graph. Given a graph G we map its vertices to a normed space in an attempt to (i) Keep down the dimension of the host space and (ii) Guarantee a small distortion, i.e., make sure that distances between vertices in G closely match the dis- tances between their geometric images.

In this paper we develop efficient algorithms for em- bedding graphs low-dimensionally with a small distor- tion. Further algorithmic applications include:

0 A simple, unified approach to a number of prob- lems on multicommodity flows, including the Leighton-Rae Theorem [29] and some of its ex- tensions.

0 For graphs embeddable in low-dimensional spaces with a small distortion, we can find low-diameter decompositions (in the sense of [4] and [34]). The parameters of the decomposition depend only on the dimension and the distortion and not on the size of the graph.

arators can be found efficiently. 0 In graphs embedded this way, small balanced sep-

Faithful low-dimensional representations of statisti- cal data allow for meaningful and efficient cluster- ing, which is one of the most basic tasks in pattern- recognition. For the (mostly heuristic) methods used

*Supported in part by grants from the Israeli Academy of Sciences, the Binational Science Foundation Israel-USA and the Niedersachsen-Isriel Research Program.

in the practice of pattern-recognition, see Duda and Hart [15], especially chapter 6.

1 Introduction

Many combinatorial and algorithmic problems con- cern either directly or implicitly the distance, or met- r ic on the vertices of a possibly weighted graph. It is, therefore, natural to look for connections between such questions and classical geometric structures. The approach taken here is to model graph metrics by map- ping the vertices to a real normed spaces in such a way that the distance between any two vertices is close to the distance between their geometric images. Embed- dings are sought so as to minimize (i) The dimension of the host space, and (ii) The distortion, i.e., the ex- tent to which the combinatorial and geometric metrics disagree. Specifically, we ask:

1. How small can the dimension and the distortion be for given graphs? Is it computationally feasible to find good embeddings?

2. Which graph-algorithmic problems are easier for graphs with favorable (low-dimensional, small distortion) embeddings? What are the combina- torial implications of such embeddings?

3. The above discussion extends to embeddings of general finite metric spaces. What are the al- gorithmic and combinatorial implications in this more general context?

Here are some of the answers provided in this paper:

1. We develop a randomized polynomial-time al- gorithm that embeds every finite metric space in a Euclidean space with minimum distortion.

577 0272-5428/94 $04.00 Q 1994 IEEE

Page 2: The geometry of graphs and some of its algorithmic ...people.ee.duke.edu/~lcarin/GeometryofGraphs.pdf · The geometry of graphs and some of its algorithmic applications Nathan Linial

Bourgain [7] had shown that every n-point met- ric can be embedded in an O(1og n)-dimensional Euclidean space with a logarithmic distortion and our algorithm finds embeddings which are at least as good. Better embeddings are provided for par- ticular families of graphs.

2. The geometry of graphs provides an excellent framework for studying multicommodity flow problems. In particular, we offer a very short and conceptual proof for several theorems that relate the value of the multicommodity flow in a network to the minimum cut in it (e.g., the Leighton-Rao Theorem [29]), and derive some new results in this area.

A low-diameter decomposition of a graph G is a (not necessarily proper) coloring of the vertices with few colors, so that all monochromatic con- nected subgraphs have small diameters. In [34] the precise tradeoff between the number of colors and the diameter was found for general graphs. In particular, both parameters can be made loga- rithmic in the order of the graph, which is, in gen- eral, essentially tight. Here we establish a better tradeoff that depends only on the dimension and the distortion at which the graph is embeddable, not on the number of vertices. For many algorith- mic applications of low-diameter decompositions, see [4] and [ll].

Graphs embeddable in a d-dimensional normed space with a small distortion have balanced sep- arators of only O(d . n l - i ) vertices. If the em- bedding is given, such separators can be found in random polynomial time. For closely related work see [39].

3. Clustering of statistical data calls for grouping sample points in sets (clusters), so that distances between members of the same cluster are sig- nificantly smaller than between points from di5 tinct clusters. If data points teside in a high- dimensional Euclidean space, or even worse, if distances between points do not conform with any norm, then clustering is notoriously problem- atic. Our algorithms provide one of the few non- heuristic approaches to this fundamental problem in pattern-recognition.

Many open problems remain in this area, some of which appear in section 8.

2 Related work and overview of results

During the past few years a new area of research has been emerging, an area which may be aptly called the geometry of graphs. This is the study of graphs from a geometric perspective. The limited space available forbids a comprehensive review of this area, so we re- strict ourselves to a few brief remarks and examples. Geometric models of graphs can be classified as either (i) Topological models, (ii) Adjacency models, or (iii) Metric models. The topological approach is mainly concerned with graph planarity and embeddability of graphs on other 2-dimensional manifolds. It also deals with 3-dimensional embeddings of a graph, mostly in the context of knot theory.

In an adjacency model, the geometry respects only the relation of adjacency/nonadjacency of vertices, but not necessarily the actual distance. A prime exam- ple for this approach is the Koebe-Andreev-Thurston Theorem (see [28], [l], [2], and [47]) that every planar graph is the contact graph of openly disjoint planar discs. Higher-dimensional results in the same vein appear in [16], [17], [39], and [44].

Another noteworthy adjacency model calls for map- ping the vertices of a graph to a Euclidean unit sphere where adjacent vertices are to be mapped to remote points on the sphere. This approach, initiated in [36], has recently found further interesting algorithmic ap- plications (see [20] and [26]).

Some new questions on adjacency-respecting mod- els appear in section 8.

We now turn to problems where graph metrics have a role. A low-diameter decomposition of a graph (see [4], [34], [ll]) is a (not necessarily proper) coloring of the vertices with 5 k colors, so that every con- nected monochromatic subgraph has diameter < D. In [34] it is shown that an n-vertex graph has such a decomposition whenever kD > n and Dk > n, the con- ditions being essentially tight. The first condition is necessary in decomposing triangulations of Euclidean spaces. The second condition is necessary for expander graphs. The similarity between this dichotomy and that of positive/negative curvature in geometry still awaits a satisfactory explanation.

Low-dimensional models for finite metric spaces, a main topic of the present paper, have previously been studied mostly by functional analysts (see [3], [6], [7], [14], [22], [24], [25], and [38]). Study of graph metrics has also led to the notion of spanners [41] and hop-sets [lo]. Local-global considerations, which are common- place in geometry, arose for graphs as well (see [32] and the references therein, [33], and the recent [43]).

All the referenced work notwithstanding, this short

Page 3: The geometry of graphs and some of its algorithmic ...people.ee.duke.edu/~lcarin/GeometryofGraphs.pdf · The geometry of graphs and some of its algorithmic applications Nathan Linial

discussion leaves out large amounts of relevant work, for example, on embedding graphs in particular fam- ilies of graphs such as d-dimensional lattices, cubes, squashed cubes etc. (see [21], [48]). Particularly rele- vant are notions of dimension that emerge from such considerations, see, e.g., chapter 5 in [5]. The possi- bility of embedding graphs in spaces other than Eu- clidean and spheric geometry is very appealing, and hardly anything has been done in this direction (but see [46]). We have also said nothing about modeling geometric objects with graphs, which is a related vast area.

To initiate our technical discussion, recall that a norm 11 associates a nonnegative number llzll with every point t in real d-space, where (i) 1 1 ~ 1 1 2 0, with equality only if t = 0, (ii) llXrll = 1x1 11211, for every t E Rd and every real A, and (iii) For every 2, y E R d , 112 + yII F 11t11 + Ilyll. The met- ric naturally associated with the norm 1 1 . 1 1 is d(z, y) = 112 - yII. Particularly important are lp norms: lI(t1,. . . , t,)1Ip = (E I~j lp)~/P . We denote R" equipped with lp norm by 1;. Euclidean norm is, of course, 12.

space (X,d) to a metric space (Y,p) which preserves dis- tances, i.e., p('p(z),'p(y)) = d(z,y) for all z,y E X. Given a connected graph G = (V, E ) , we often denote by G also the metric space induced on V by distances in the graph. Isometries are often too restricted and much flexibility is gained by allowing the embedding to distort the metric somewhat. This leads to the main definition in this article:

Definition 2.1: Metric Dimension: For a finite metric space (X,d) and c 2 1, define d im(X) to be the least dimension of a real normed space (Y, 11 . [I), such that there is an embedding 'p of X into Y where every two points 21, z2 E X satisfy

An isometry is a mapping 'p from a metric

1 d(21,Q) 1 ll'p(z1) - ' p ( ~ 2 ) l l 1 ; . d(z1,22).

Such an embedding is said to have distortion 5 c . The (isometric) dimension of X is defined as dim(X) = diml(X). I We recall (Lemma 3.1) that dim(X) 5 n for every n-point metric space (X, d), whence this definition makes sense.

We list some of the main findings on isometric and near-isometric embeddings. Unless stated otherwise, n is always the number of vertices in the graph G or the number of points in a finite metric space (X, d).

'Technically, we are discussing semi-metrics, as we allow two distinct points to have distance zero.

Finding good embeddings

1. In Theorem 3.5 we show: There is a polynomial time algorithm that embeds every n-point metric space (X, d) in a Euclidean space with distortion arbitrarily close to optimal. In random polynomial time X may be embedded in if('0gn) with distortion 5 (1 + E ) . optimal (for any 6 > 0). This optimal distortion is always

In random polynomial time X may be also em- bedded in lF('Oga *) for every p 2 1 with distortion O(1og n). The bound on the distortion is tight for 2 1 p 2 1, regardless of the dimension.

ooog n) by ~71.

2. For every metric space (X, d) on n points there is a y > 0 such that the metric y.d can be embedded in a Hamming metric with an O(1og n) distortion (Corollary 3.7). The bound is tight.

Struct,ural consequences

The gap between the maximum flow and the minimum cut in a multicommodity flow problem equals the least distortion with which a particu- lar metric can be embedded in 11. This metric is defined via the Linear Programming dual of a program for the maximum flow. This is the ba- sis for a unified and simple proof to a number of old and new results on multicommodity flows (Section 7).

Low-dimensional graphs have small separators: A d-dimensional graph G has a set S of O(d . .'-a) vertices which separates the graph, so that no component of G \ S has more than (1 - & + o(1)) n vertices (Theorem 5.1).

The vertices of any d-dimensional graph can be (d + 1)-colored so that each monochromatic con- nected component has diameter 5 2d2. That is, there exist low-diameter decompositions as in [4] and [34], with parameters depending on the di- mension alone (Theorem 6.1).

Low-dimensional graphs have large diameter, diam(G) 1 no(*) (Lemma 4.3).

Algorithmic consequences

1. Near-tight cuts for multicommodity flow prob- lems can be found in random polynomial time (Section 7).

2. Given an isometric embedding of G in d dimen- sions, a balanced separator of size O(d . nl-4) can

579

Page 4: The geometry of graphs and some of its algorithmic ...people.ee.duke.edu/~lcarin/GeometryofGraphs.pdf · The geometry of graphs and some of its algorithmic applications Nathan Linial

3.

be found in random polynomial time (Theorem 5.1).

Low-dimensional, small-distortion representa- tion of dimension is well-defined. tion of statistical data offers a new approach to clustering which is a key problem in pattern- recognition (Section 3.3).

assume B is a centrally symmetric convex polytope. A copy of B centered at z is denoted B(z).

The following well-known lemmashows that the no-

Lemma 3.1: A metric space (x, d ) with n points can be isometrically embedded into 1:.

Isometric dimensions

0

0

0

0

0

3

3.1

All trees have dimension O(1ogn). The bound is tight (Theorem 3.3).

dim(&) = peg, n1 (Proposition 4.1). (This re- sult essentially goes back to [12].)

p0g2 2 dim(Knl,...,?lk) 2 E:=, Fog2 nil - 1, where is the com- plete k-partite graph, and ni 2 2 (Theorem 4.5).

For cycles: dim(2m-Cycle) = m, and m + 1 2 dim((2m + 1)-Cycle) 2 7 - 1. Conse- quently, dim(G) 2 - 1. However, dimq(n-Cycle) = 2 (Proposition 4.7 and Remark 4.8).

dim(d-Cube) = d (Corollary 4.9).

General Results

Isomet ries

All logarithms are to base 2. G = (V, E) is always a connected graph and n is the number of its vertices. Unless otherwise stated, embeddings are into Rd. No distinction is made between a vertex and its im- age under the embedding. If G can be embedded in (X, 11 . 11) we also say that X realizes G.

The unit ball of a d-dimensional real normed space is: B = {z E Rd with 1 1 ~ 1 1 5 1). This is a convex body which is centrally symmetric around the origin. Every centrally symmetric convex body Q induces a norm, called the Minkowski norm: 11zll~ = inf{X > 0 such that f E Q}. Thus, normed

spaces are denoted either as (Rd, 11 1) or as (Rd , Q). The boundary of B is aB = {z E R with 11z11= 1). In an isometric embedding of G in Rd the set { d:Gyyy) 12 # y E V} is contained in dB. It is not hard to see that there is no loss of generality in assuming B = conv{dz-y I z # y E V}. That is, we may

G'(X,Y)

2However, d may also denote a metric. and So will be Daid

Proof: Let X = ( ~ 1 ~ . . . ,zn} with dij = d ( z i , t j ) .

Map zi to a point zi E R" whose k-th coordinate is z: = d ik . Then 1121 - z j l l , = maxklz: - 41 2 12: - 4 I = Jdij - d j j l = d i j . On the other hand, for all k, 1%; - 41 = ldik - d j k l 5 di, by the triangle inequality. I

We show later some examples of graphs with iso- metric dimension ;. The highest isometric dimension among n-vertex graphs is currently unknown (but see

The 1, norm is universal in that, as Lemma 3.1 shows, it realizes all graphs (in fact, all finite metric spaces). In some sense, it is the only universal norm, as the I , norm defines a graph on Z d , which must be realized by any other universal norm. Therefore any universal norm must contain a copy of I , . (Lest the reader suspects that 11 is universal as well, we remark that K2,3 is not embeddable in this norm, viz. [13].) - It is therefore of interest to also study dim(G) (and dim,(G)), the least k such that G can be isometri- - cally embedded in Ik (with distortion c). Clearly dim(G) 2 dim(G), and this gap can be exponential (e.g., dim(d-Cube) = d while dim(d-Cube) = 2d-1; see Corollary 4.9 and comments following Theorem 4.12). One simplification in studying isometric embeddings into 15 is that all vertices may be assumed to map to Z d : Given any embedding, round all coordinates up and isometry is preserved.

Geodetic paths in G and the face lattice of the unit sphere B are related via

Proposition 3.2: Let P = (21,. . . , zk) be a geodetic path in G. In an isometric embedding of G in (Rd , B),

[GI and [491)-

all vectors F, k 2 j > i 2 1, lie on the same face of B. I

J is an isometric subgraph of G if distances within J are the same as in the whole graph G. The di- mension of such a subgraph provides a lower bound on dim(G). Examples of isometric subgraphs of G in- clude cliques, induced subgraphs of diameter 2, geode- tic paths and irreducible cycles. In particular, since dim(C,) 2 2 - 1 (Proposition 4.7 and Remark 4.8), if G has finite girth, then dim(G) 2

To get the reader initiated on the methods for esti- - 1.

extra, [8] page 223. mating isometric dimensions, we present a bound on

580

Page 5: The geometry of graphs and some of its algorithmic ...people.ee.duke.edu/~lcarin/GeometryofGraphs.pdf · The geometry of graphs and some of its algorithmic applications Nathan Linial

the dimension of trees: Theorem 3.5:

Theorem 3.3 : For every n-vertex tree TI dim(T) = O(1ogn). Moreover, if T has 1 leaves, then dim(T) = O(log1). The bound is tight.

Proof: The bound is tight, e.g., for stars, dim(K1,n-l) = R(1ogn) by Proposition 4.2.

The proof shows that T can be isometrically embed- ded in l c g n with c = &. It is well-known that every n-vertex tree T has a “central” vertex 0, such that each component of T \ (0) has at most ver- tices. Let R, L be two subtrees of T of size 5 9 whose union is T , sharing only the vertex 0. Find isomet- ric embeddings, one for L and one for R in l&’,log”-’.

Such an embedding remains an isometry if all points are translated the same amount. We may thus as- sume that in the embeddings of L and R the vertex 0 is mapped to the origin in Rc’log”-’ . The whole tree is isometrically embedded in a space of one more dimension. The coordinates used to embed R and L are maintained, and the value of the new coordinate is set as follows: For 0 it is zero, for x E V ( t ) this new coordinate is - d T ( o , z ) and for z E V ( R ) it is d T ( 0 , z ) . It is not hard to verify that this is an iso- metric embedding, as claimed.

The upper bound in terms of the number of leaves is obtained by splitting T into two subtrees with a single common vertex, neither of which contains more than

of the leaves of T (see [9] Theorem 2.1’). I

3.2 Near-Isometries

We start by quoting:

Theorem 3.4 : ( Johnson-Lindenstrauss [24], see also [16] ) A n y set of n points in a Euclidean spuce can be mapped t o Rt where t = O ( 9 ) with distortion 5 1 + 6 in the distances. Such a mapping may be found in random polynomial t ime.

Proof: (Rough sketch) Although the original pa- per does not consider computational issues, the proof is algorithmic. Namely, it is shown that an orthogonal projection of the original space (which can be assumed to be n-dimensional) on a random t-dimensional sub- space, almost surely produces the desired mapping. This is because the length of the image of a unit vector under a random projection is strongly concentrated around m. I

Our general results on near-isometric embeddings are summarized in the following theorem:

( Bourgain [7], see also [25], [88] ). Every n-point metric space ( X , d ) can be embedded in an O(1og n)-dimensional Euclidean space with an O(1ogn) distortion.

Let c be the least distortion with which X may be embedded in any Euclidean space (c = O(1ogn) b y Claim 1). There as a polynomial-time algo- ri thm that f o r every E > 0 embeds ( X , d ) in a Euclidean space with distortion < c + E. In ran- dom polynomial-time ( X , d ) m a y be embedded in If(1og”) with distortion O(c).

In random polynomial-time ( X , d ) may be embed- ded in /:(logan) (for any p 2 l), with distortion O(1ogn).

There are n-point metr ic spaces every embedding of which into an lp space, 2 2 p 2 1 of any di- mension has distortion R(1ogn).

Proof: Claim 1 appears mostly for future reference, but is seen to be an immediate corollary of Claim 3 and Theorem 3.4.

To prove Claim 2, let the rows of the matrix M be the images of the points of X under a distortion-c embedding in some Euclidean space. Let, further A = M M t . Clearly, A is positive semidefinite, and for every i # j ,

As in [36], [20], and [26] the ellipsoid algorithm can be invoked to find an E-approximation of c in polynomial time.

The dimension is reduced to O(1ogn) by applying Theorem 3.4.

We now turn to the second Claim. The overall structure of the algorithm follows the

general scheme of Bourgain’s proof: For each cardi- nality k < n which is a power of 2, randomly pick O(1ogn) sets A E V ( G ) of cardinality k. Map ev- ery vertex z to the vector ( d ( z , A ) ) (where d ( x , A ) = min{d(z, y)Jy E A } ) with one coordinate for each A selected. It is shown that this mapping to has almost surely an O(1ogn) distortion.

We now turn to the actual proof. Let B(z,p) =

p } denote the closed and open balls of radius p cen- tered at x . Consider two points x # y E X . Let po = 0, and let pt be the least radius p for which both

{Y E Xld(z, Y) I P I and &, PI = {Y E X l d ( x , Y) <

581

Page 6: The geometry of graphs and some of its algorithmic ...people.ee.duke.edu/~lcarin/GeometryofGraphs.pdf · The geometry of graphs and some of its algorithmic applications Nathan Linial

IB(x,p)( 2 2' and IB(y,p)l 2 2 t . We define pt as long as pt < ad(x,y), and let t be the largest such index. Also let pi+l = v. Observe that B(y,pj) and B(x, p i ) are always disjoint.

Notice that A n i(x, pt ) = 0 U d(x, A ) 2 pt , and A n B(y,pt-l) # 0 e d(y, A ) I p t - l . There- fore if both conditions hold, then

Let us assume that I&x,pt)l < 2t (otherwise we argue for y). On the other hand IB(y,pt-1)1 2 2t-1. Therefore, a random set of size O ( 3 ) has a con- stant probability to both intersect B(y, p t - 1 ) and miss

We randomly select q = O(1og n) sets of cardinality 2', the least nonnegative power of two that is 2 +. Then, with high probability, for each pair x, y, at least & of the sets chosen will intersect B(y, p t - 1 ) and miss

Note that the same applies to pi+l - pi, since again we wish to miss-a set of size < 2i+1, and to intersect a set of size 2 2t.

Therefore for almost every choice of A I , . . . , A, :

M Y , A ) - d(X, A)I 2 pt - Pt-1.

&,pt).

i(x, P i ) .

We do this now for every 1 = 1,2 , . . . , llog nJ and obtain Q = O(log2 n) sets for which:

logn 4 x 7 Y) - - . pi+l 2 logn . - - 10 40 '

The reverse inequality is obtained by observing that Id(x, Ai) - d(y, Ai)l 5 d(x, y) for every A;, whence:

Q C I d ( x , A i ) - d ( y , A i ) I I C.log2n-d(x,y) . i= l

Thus, the mapping which sends every vertex x to the vector (9 I i = 1,2 , . . . , Q ) , embeds G in an O(log2 n)-dimensional space endowed with the 11-norm and has distortion O(1ogn). In fact, for every p 2 1, a proper normalization of this embedding sat- isfies the same statement with respect to the 1, norm.

The only modification is that now x gets mapped to (w I i = 1,2 , . . . ,Q): Let ~ ( x , y) denote the 1,

distance between the image of x and the image of y, i.e.,

But for every i, Id(x, Ai) - d(y, Ai)( 5 d(z, y) whence ~ ( x , y) 5 d(x, y). On the other hand:

by the monotonicity of p t h moment averages.

The proof of the last Claim is deferred to Remark 7.2.1

The following technical corollary will be needed in Sec- tion 7.

Corollary 3.6: Let ( X , d ) be a finite metr ic space and let Y E X . There exists a randomized polynomial-time algorithm that finds an embedding

f o r every x, y E X, and if x, y are both in Y , then also , so that d(X,Y) 2 llP(X) - P(Y)ll : x ~ p g 2 IYI)

llP(X1 - P(Y)ll 2 Q(&) .d(x, Y).

Proof: Follow the proof of Claim 2, picking random subsets of Y , not the whole of X . I

A Hamming space is a metric space which consists of (0 , l ) vectors of the same length, equipped with the Hamming metric.

Corollary 3.7: For every n-point metr ic space ( X , d) there is a Hamming space (Y, p) and a y > 0 such that ( X , d ) can be embedded in ( Y , y . p) with distortion O(1ogn). The bound is tight.

Proof: It suffices to find a constant-distortion em- bedding for every finite subset of ly into a Hamming space. Nothing is changed by adding the same number to all values at some coordinate. Also recall that mul- tiplication by a fixed factor is allowed. Therefore, at the cost of an arbitrarily small distortion the entries of the i-th coordinate may be assumed to be integers, the smallest of which is 0. If the largest i-th coordinate is T , this coordinate is replaced by T new ones, where xi = s is replaced by s new coordinates of 1 followed by T - s coordinates of 0. This latter step adds no distortion, being an isometry into a Hamming space. Again, the tightness result follows from Remark 7.2. 1

582

Page 7: The geometry of graphs and some of its algorithmic ...people.ee.duke.edu/~lcarin/GeometryofGraphs.pdf · The geometry of graphs and some of its algorithmic applications Nathan Linial

3.3 Applications to clustering

It is a recurring situation in pattern-recognition, where one is given a large number of sample points, which are believed to fall into a small number of cat- egories. It is therefore desired to partition the points into a few clusters so that points in the same cluster tend to be much closer than points in distinct clusters. When sampling takes place in a low-dimensional Eu- clidean space, clustering is reasonably easy, but when the dimension is high, or worse still, if the metric is non-Euclidean, reliable clustering is a notoriously dif- ficult problem. See Duda and Hart [15], in particular chapter 6, for a standard reference in this area.

The algorithms we have just described offer a new approach to clustering. These algorithms are currently being practically tested [31] in a project to search for patterns among protein sequences. One pleasing as- pect that already emerges is this - the second algo- rithm in Theorem 3.5 assumes that the distance be- tween any pair of points in the space can be evaluated in a single time unit. In this particular application, the metric space consists of all presently known proteins. Molecular biologists have developed a number of mea- sures to estimate the (functional, structural, evolu- tionary etc.) distance between protein sequences, and some widely available software packages (e.g., FASTA, BLAST) calculate them. At this writing about 36,000 proteins of average length ca. 350 have already been sequenced. For our purposes, a protein is a word in an alphabet of 20 letters (amino-acids) and it takes about a quarter of a second to compute a single distance according to any of the common metrics, using stan- dard software on a typical workstation. A straightfor- ward implementation of the algorithm is therefore in- feasible. The difficulty stems from having to compute d(x, A) for large A’s. A closer observation of the proof shows that if we fail to include the coordinates which correspond to large A’s, the effect is that the distance between close pairs of points (protein sequences) is reduced in the mapping. This is definitely a welcome effect in a clustering algorithm, so what seems to be a problem turns out as a kind of a blessing.

4 Estimating the dimension

4.1 Volume considerations

This method goes back to [12]. Although stated in a different context, that paper shows dim(K,) 2 peg, n1. Here is a sketch of an argument

based on the original ideas and translated to our lan- guage:

Proposition 4.1: dim(K,) = Fog, n1.

Proof: (Sketch) Let K, be isometrically mapped to {XI, . . .,x,} in (Rd,B). Let D = conv{x1,. . .,z,}. We claim that the sets D + zi (i = 1, . . . , n) have dis- joint interiors. Assuming this for a moment, notice that since D is convex, D + xi 2 0 for all i . Hence

and the conclusion follows. To complete the proof, suppose for Contradiction. that (D + zi) n (D + Z j )

has a nonempty interior. This implies that xi - x j is an internal point of D- D. But D - D = conv{za - ZP I n 2 & # /3 2 1) which are all vectors of norm 1, whence D - D B. It fol- lows that zi - xj, a vector of norm 1, is in int(B), a contradiction.

On the other hand dim(&) 5 pog,nl follows by mapping the vertices of K, to the vertices of the Fog, nl-dimensional cube, under 1 , norm. I

Volume considerations yield upper bounds on de- grees and a lower bound on the diameter:

Proposition 4.2 : All vertex degrees in a d- dimensional graph do not exceed 3d - 1. This bound is tight.

Proof: Place a copy of !jB around a vertex v, and around each of its neighbors. The interiors of all these balls are disjoint, and their union is contained in $(v): If w is a neighbor of v and z E $(tu) then 11% - 2uII 5 4 since z E iB(w), and llzu - v11 = 1 by the isometry. Hence 11% - v11 5 11.z - 2uII + llw - 2111 5 f. Comparing volumes we get the desired result. Equal- ity is attained by the grid points under 1, norm. I

Lemma 4.3: diam(G) 2 !j(n* - 1).

Proof: Surround each vertex of G by $B. These balls have disjoint interiors and are contained in (diam(G) + ;)El, centered at an arbitrary vertex of G. The conclusion follows as before. The bound is nearly tight, as we know of graphs with diam(G) - &(ni - 1). I

n . vol(D) = VOl(Ui(D + Zi)) 5 vol(2D) = 2d * vol(D)

4.2 Ranks

In this section we relate isometric dimensions with linear algebra. We start with an alternative definition of a norm. As mentioned above, we are only con- cerned with normed spaces (X,B) where B is a cen- trally symmetric polytope. Associate with each pair

583

Page 8: The geometry of graphs and some of its algorithmic ...people.ee.duke.edu/~lcarin/GeometryofGraphs.pdf · The geometry of graphs and some of its algorithmic applications Nathan Linial

of opposite facets { F , - F } of B, a linear functional 1~ which is identically 1 on F , and -1 on - F . Then

An n x r matrix M implements a graph G if for all i , j maxk ( m i k - mjk( = dG(vi,vj). The following result offers a characterization of dim(G) in terms of matrix ranks:

VZ E x, 11211 = m U F l l F ( z ) l .

Theorem 4.4: dim(G) = min rank(M), where the minamum is over all matrices M that implement G.

Proof: If G is isometrically embeddable in (Rd,B), define A4 via: miF = lF(Vi), where I F is the func- tional corresponding to the pair of opposite facets { F , - F } of B. Clearly, M implements G, and its rank is 5 dim(G).

On the other hand, suppose that M,,, implements G and d = rank(A4). Mapping the vertices of G to the rows of M is an embedding of G to the d-dimensional space L, spanned by the rows of M . The norm is induced by the unit sphere B = Ln[-1, l]', the inter- section of L with the unit cube in Ra. The fact that M implements G implies that the above mapping is an isometry of G into the normed space (L, B). I Here are some applications of the theorem:

Theorem 4.5 : If nl ,n2 , . . . ,nk 2 2, then d im(~n , ,n , , ..., n,) 2 cf1 POgnil - 1.

Proof: Let Ai be the i-th part of V and M be a matrix implementing G. For a, b E Ai, consider a column j where Im,j - mbj( = 2. If z 4 Ai, then d ( z , U) = d(z , b ) = 1, whence m,j = $ . (m,j + mbj). It follows that if i.(m,j+mbj).iis subtracted from the j-th column, all entries not in the rows of Ai become zero. Repeat this step for all columns j that imple- ment the distance between two points from the same part. Next, eliminate all other columns. The resulting matrix Q has rank(&) 5 rank(M) + 1 (elementary op- erations with a single column can increase the rank by 5 1 and eliminating columns can only decrease it). Q is a direct sum Q = @Qi, where +Qi implements I(", , whence rank(Qi) 2 pognil. The theorem follows. I he upper bound dim(~n,, , , , ..., n,) 5 ~ f = ~ Fog nil is shown easily, using 1, norm, as suggested by the proof of the lower bound.

Corollary 4.6: fect matching. Then dim(G) = n.

Proof: Here, it is obvious that the the elementary operations on the columns do not cause a loss of one in the dimension. I

Proposition 4.7: dim(C2,) = m.

Let G be a clique Kzn minus a per-

Proof: The following construction gives an upper bound, with ll norm: The vertices are mapped to the following 2m points:

for i = I , . . . ,m : xi = E',=, et; for i = m + 1 , . . . ,2m - 1 : xi = Ezi=i-m+l e t ;

z2m = 0,

where, as usual, et is the t-th unit vector. For the lower bound, let the matrix A implement

CZm , and let 01 , . . . , vZm be the vertices in cyclic or- der. All indices are taken modulo 2m. Consider a column t where &(vj , vm+j) = m is realized. It has the form = 6; = aj-l,t = 6 + E ; aj+2,t = aj-2,t = 6 + 2q . . . , aj+,,t = 6 + mc, for some real 6 and E E (-1, 1) . _By elementary operations with the column vector 1 it can be transformed so that

= i for i = 0 , . . . ,m. We thus ob- tain an m x m minor whose (r,s)-entry is Ir - S I . This matrix is non-singular , being the distance ma- trix of a path, which is known to be non-singular (e.g., 1351 pp. 64-65). This implies rank(A) 2 m - 1. A more careful analysis shows that the elementary oper- ations with the all-one vector can be avoided, yielding rank(A) 2 m. I

Remark 4.8 : For cycles of odd length: m + 1 2 dim(C2,+1) 2 ?j - 1. The upper bound is achieved with 1, norm: Let {wl, w2,. . . , wm+l} be m+ 1 consecutive vertices in the cycle. Map each ver- tex z to an ( m + 1)-vector, whose i-th coordinate is d(z,wi). As for a lower bound, the above argument yields only dim(Ca,+l) 2 - 1, though probably dim(C,) = 151 for all n. I

Consequently:

Corollary 4.9: dim(m-Cube) = m.

Proof: The m-Cube embeds isometrically in I F . The lower bound follows from the fact that the m-Cube contains a 2m-Cycle as an isometric subgraph. I

Consequently the infinite cubic grid in R" has dimension m. Moreover, for the part of grid G = [l, .., nIm, not only does dim(G) = m, but also dim,(G) = m for any c 5 n k . This bound can be obtained using volume arguments as described in a previous section.

The slabbing dimension of a finite family of convex bodies IC in Rd is the least dimension of a linear space L which intersects every K E IC.

Associate with a connected graph G on n ver- tices, the polyhedron P = PG R" = {z E R" with IZi-zj I 5 d(vi, v j ) for all 1 5 i , j 5 n}. Alternatively, P = {Z E R" with (2, - zjl 5 l , V [ v i , v j ] E E(G)}.

=

584

Page 9: The geometry of graphs and some of its algorithmic ...people.ee.duke.edu/~lcarin/GeometryofGraphs.pdf · The geometry of graphs and some of its algorithmic applications Nathan Linial

Clearly, P is a centrally symmetric prism. For each i, j the facets F$ and F& are determined by the equa- tion xi - xj = f d(vi, U,). Let 3 = 3 G consist of all such faces. (In fact, because of the central symmetry, it would suffice to consider F$ alone.)

Theorem 4.10: The stabbing dimension of 3 G co- incides with the isometric dimension of G.

Proof: Suppose that L “stabs” all the faces in FG, and choose for each pair i , j a vector in L n Fi,j. Let the matrix M have these vectors as columns. Clearly M implements G; by Theorem 4.4, dim(G) 5 dim(L).

On the other hand, given a matrix of minimum rank implementing G, define L to be the span of its columns. Clearly L meets all the required facets, and dim(L) = dim(G). I

The case when G is a clique has an interesting geo- metric implication:

Theorem 4.11: Let C be the cube [ - f , ; I m . The stabbing dimension of any family of n paimuise disjoint faces { P I , . . . , F,} of C is at least Fogz n1.

Proof: Let L be a linear space that meets all Fi. Choose points v; E L n Fi, and form an n x m matrix M whose rows are the vi’s. Since the faces are disjoint, for each two rows i # j there is a column 1 where U;,/ = -; and vj,/ = $, or vice versa. Since all entries of M are in [-f, f ] , M implements the n-Clique.

By Theorem 4.4 and Proposition 4.1 rank(M) 2 Pogzn], whence dimL = rankM 2 Pogzn] as claimed. I The remaining part of the section concerns dim(G).

Theorem 4.12: dim(G) is the least number of columns in a matrix

implementing G. e dim(G) equals half the least number of faces of the unit ball of a normed space in which G can be isomet- rically embedded. I Here are some examples demonstrating the conve- nience of working with dim(G):

Let M realize I<,, and recall that A4 may be assumed to have integer entries. Hence, each column in M realizes the distance between every x E A and y 6 A for some A C V . Namely, an isometric embedding of Kn into 15 is equivalent to covering E(K,) by d complete bipartite graphs and it is well known that d 2 pogz n1, as claimed. A similar argument shows that dim(linegraph(K,)) = @(log n).

Let M implement the d-Cube and let column t realize

(i) dim(^,) = Peg, n1.

(ii) =(&Cube) = 2d-1.

d(x, y) = d for some antipodal pair x , y. Without loss of generality mt, = d(x,z). Hence every antipodal pair requires a separate column. At the same time this is just a description of a matrix implementing the d-Cube with 2d-1 columns.

5 Separators

Theorem 5.1: Let dim,(G) = d and assume that c . d = .(.a). Then G has a set S of O ( c . d . vertices which separates the graph, so that no compo- nent of G \ S has more than (1 - &+o(l))n vertices.

Proof: Given a distortion-c embedding of G in (Rd, B), our approach is this: Find two parallel hyper- planes H I , H2 in W d at distance 1 (distance is taken in the B-norm). S is the set of vertices which are embed- ded in the closed slab between the two hyperplanes. VI is the set of vertices embedded strictly above H1

and in Vz are those embedded strictly below Hz. That S separates VI and Vz is obvious. What we need is to construct H1 and H z , so that,

(SI = O(c . d . nl-3)

and 1

IVll, lvzl 5 (1 - dS1 + o(1)) 11.

The proof uses a beautiful idea from [40] which starts from the following well known consequence of Helly’s Theorem [50]:

Proposition 5.2: For any set V of n points in W d there exists a centroid 0 such that every closed half- space determined by a hyperplane passing through 0 contaans at least & points of V .

By translation, 0 may be assumed to be the origin. Partition the points in V according to their distance

from 0. First we will show that no slab contains “too many” points of V , that are “near” the origin. Then we show that on a random choice of H I , and H2 = -HI, the expected number of points in V which are “far” from the origin and fall in the slab is small.

It is a well known fact (e.g., [42]) that every d-dimensional norm may be approximated with dis- tortion O(&) by an affine image of the Euclidean norm. That is, we may assume that G is embedded in 1; with distortion O(c . A).

So we should look for a closed slab H of (Euclidean) width c . containing “not too many” points of V . Let n1 = #{x : x E V n H, 1 1 ~ 1 1 ~ < Ro}.

585

Page 10: The geometry of graphs and some of its algorithmic ...people.ee.duke.edu/~lcarin/GeometryofGraphs.pdf · The geometry of graphs and some of its algorithmic applications Nathan Linial

Lemma 5.3 : Let R ~ = o ( c . & ) . Then n1 5 O(c . d . (2Ro + l)d-l).

Proof: All distances in G are 2 1, so the same holds also for the Euclidean distance of the images. Therefore, if we locate a $B around each point in V n H we get nl spheres with disjoint interiors. Since Ro = O(c . & , they all reside inside a cylinder of height O(c .d and base a (d - 1)-dimensional ball of radius (Ro + 3). Comparing volumes we obtain:

Where vi is the volume of the unit ball in R'. Recall that vzt = $ and ~ 2 t + l = 1,3,,,(2t+l), and in particu- lar, = O(fi). Consequently: nl 5 O(c . d . (2Ro + l)d-l). I

Now, we wish to estimate the probability for a re- mote point z (i.e., 11x112 2 &) to belong to a randomly chosen slab. A slab is determined by the unit vector perpendicular to its boundary and our choice is by the uniform distribution on Sd-'.

Lemma 5.4: Let z E Rd, 11~112 >_ Ro. Then Pr(x E H) 5 0 (e). Proof: Associate with each slab H the two points on

in the directions of the two unit vectors perpen- dicular to the hyperplanes of H. Slabs have width c . &, so the points associated to slabs containing x form a symmetric stripe of width (2.c.&), on Sd- ' . Therefore, the desired probability is the ratio between the surface area of this stripe and the surface area of the whole sphere. We recall the following fact:

Remark 5.5: Let C be a measurable subset of §d-l , and let CY be the ((d - 1)-dimensional) measure of C. Let u(C) = {y I y = Xz for some x E C and 1 2 X 2 0}, the cone with base C and apex at the origin. Then the (d-dimensional) measure of u(C) is 5. In particular, the surface area of Sd-' is d . vd. I

We need to evaluate the surface area of C, the part of the stripe of width (2 . c . &), that is on Sd-l. By the previous remark, this surface area equals d . vol(u(C)). Assume that C is symmetric with re- spect to the hyperplane Zd = 0. Then u(C) E {(%I, ..., z d ) with ~ ~ - ' z ~ 5 1 and lZdl 5 &}, and

as the volume of this cylinder is 2 . g V d - 1 :

2'+'.n'

Sd- 1

Thus 122, the expected number of remote points 2: E V which belong to H, satisfies 122 = O ( e ) . We optimize, by selecting Ro so that nl - 712. Then

or

This yields nl + 122 = O ( c . d . n l - i ) , for the expected number of points in the slab.

The requirements Ro = O(c .a) and Ro = O(n*) are consistent, since c . d = o(n3) was assumed. I

The centroid can be found in time linear in n and dd (see [37]). Therefore, given an embedding of G, the proof translates to a randomized polynomial time algorithm to find such a separator, provided that d = O( e). It is interesting to observe that in [39] an essentialfy similar separation is obtained, although both the setting and the methods are different.

R~ = O(n4).

6 A geometric perspective on low- diameter decompositions of graphs

Following [34] a decomposition of a graph G = (V, E ) is a partition of the vertex set into subsets (called blocks). The diameter of the decomposition is the least r such that any two vertices belonging to the same connected component of a block are at distance 5 T in the graph.

Theorem 6.1: Let G be a graph with 2 = dim,(G), d = dim,(G). Then

G can be decomposed t o z+ 1 blocks, each of diam- eter 5 Z C ~ .

e G can be decomposed t o d + 1 blocks, each of diam- eter 5 2cd2.

Remark 6.2: Combined with Theorem 3.5 we ob- tain a decomposition to O(1ogn) blocks of diame- ter O(10g3 n), a result slightly inferior to the optimal (logn,logn) from [34]. I

Proof: We prove the case c = 1, the general case then follows easily. Throughout the proof we use 1, norm. The key to the proof is the following universal tiling of Rd: Consider Zd, and define Ki, i = 0 , . . . , d - 1 as the neighborhood of the i-dimensional faces

of its cubes (i.e., KO consists of radius % cubes

centered at the grid points, K1 is the % neighbor- hood of the edges of the grid, etc.). Define TO = KO,

- - -

586

Page 11: The geometry of graphs and some of its algorithmic ...people.ee.duke.edu/~lcarin/GeometryofGraphs.pdf · The geometry of graphs and some of its algorithmic applications Nathan Linial

- = Ki - UjZiKj for i = 1 , . . . , d - 1. Finally, T;

is the remaining part of Rd. It is not hard to check now that each Ti is a union of disjoint “bricks”, each of diameter < 1, and that the distance between any two such bricks is 2 A.

The first statement in the theorem is now immedi- ate: embed the graph in Rd, and consider the tiling as above, magnified by factor 22 + 6. Each G n Ti is a proper block.

The proof of the second statement is slightly more complicated, since one has to correlate between the d-dimensional grid and an arbitrary d-dimensional norm. We need a small distortion approximation of B (or a linear transformation thereof) by a cube. Dis- tortion d is attainable: first approximate B by a Eu- clidean unit ball (actually, the Lowner-John ellipsoid, see [42]), then approximate the unit ball by a unit cube, both distortions being 5 4. By recent work of Giannopoulos [19] an O(d0.859) distortion is attain- able.

As before, we embed G in R d , and superimpose on it the (unit) lattice that approximates B to a factor of d, magnified by 2d2+c. It is easy to check that, again, each Ti defines a proper block.

If c > 1, the only change is that the diameter of sets covered by a single brick may be multiplied by c. I

-

2 4

Remark 6.3: We have covered Rd with {TO,. . . , T d } , where each Ti is a union of compact sets of diameter < 1, any two of which are at least $ apart. The con- struction is nearly optimal in two respects:

It is impossible to cover Rd with fewer than d + 1 sets each of which is the disjoint union of compact sets whose diameters are bounded from above and whose mutual distances are bounded away from zero. This follows, e.g., from Lemma 3.4 in [34].

We next show that for any cover as above, there are two sets in the same family whose distance does not exceed O ( y ) . Indeed, there must be a Ti with upper density at least l/d. Let T* = Ti + mB (Minkowski Sum), where 2m is the least distance be- tween two connected components of T,. If K, is a typical connected component of Ti, then the sets K, + mB have disjoint interiors. Also Ti has up- per density at least l /d in R d , and therefore also in K* = U(K, + mB). By the Brunn-Minkowski in- equality (see [42]):

vol(K, + mB) (vol(K,)’/d + ~ o l ( m B ) l / ~ ) ~ vol(K,) vol(K,)

We used the fact that in any norm vol(B) 2 vol(K) whenever diam(K) 5 2. Also by the den- sity properties of T,, there is an index (Y for which d 2 -1 2 (1 + m)d which implies

m = O ( y ) , as claimed. We do not know what the bound on m is for various

norms. A plausible guess would be that m = O(l/d) for every norm, and that this is tight for I , . I

7 Multicommodity flows via low-distortion embeddings

We briefly recall some definitions about multicom- modity network flows. G is an undirected n-vertex graph, with a capacity C, 2 0 associated with ev- ery edge e . There are k pairs of (source-sink) nodes ( s , , t , ) , and for each such pair a distinct commodity and a demand D , 2 0 are associated. For simplicity of notation we let Ci,j = 0 for all non-edges ( i , j ) . As usual, flows have to satisfy conservation of matter, and the total flow through an edge must not exceed the capacity of the edge. Maximal multicommodity flow problems come in a number of flavors, and we concentrate on the following version: Find maxflow - the largest A for which it is possible to simultaneously flow AD, between s, and t , for all p.

A trivial upper bound on X is attained by consider- ing cuts in G. For S c V let Cup(S) be the sum of the capacities of the edges connecting S and S. Also let Dem(S) be the sum of the demands between source- sink pairs separated by S (i.e., IS n {s , , t , } l = 1). Obviously, X 5 a for every S.

In the case of a single commodity ( k = l), the max-flow min-cut theorem is easily seen to say that A equals the minimum of

It came as a pleasant surprise when Leighton and Rao showed that in some cases the gap between the ma-flow and the min-cut cannot be too big. We show that this gap equals the least distortion with which a certain metric associated with the network can be embedded in 11. This approach yields a unified approach to Leighton-Rao’s [29], and the subsequent [27] and [18]. The result for non-unit demands is new.

Theorem 7.1: There is a randomized polynomial- time algorithm which given a network G = (V, E , C )

3After the completion of this article, Yonatan Aumann and Yuval %bani informed us that they, too, proved this Theorem

over S c V .

587

Page 12: The geometry of graphs and some of its algorithmic ...people.ee.duke.edu/~lcarin/GeometryofGraphs.pdf · The geometry of graphs and some of its algorithmic applications Nathan Linial

and the demands for k source-sank pairs, finds an S c V for which

Cap(S) < O(1og k) . maxjlow. Dem(S) -

Proof: The optimal A is the maximum of a certain linear program. By LP duality (e.g., [18]):

where the minimum is over all metrics d on G. We first apply Corollary 3.6 to the minimizing metric with X = V ( G ) and Y = {s,}i=, U { t , } ;= , . The vertices of G are thus mapped to points (21,. . ., xn} in R" where [[xi - xj(l1 5 di,, for all i , j and I l t s p -xt,II1 2 n(da,,tp/logk) for p = 1, ..., k. Therefore,

Since we are dealing with the IF metric we may con-

Let i; be an index where the minimum is achieved. We claim that no generality is lost in assuming that all xi,^ are in (0, l}, whence

where S = {a'lxi,~ = 1). To justify the assumption xi,^ E {0,1} for ev-

ery i l we argue that for any real ~ i j = aji, bi, = b j i , E. . a .IZ - z j l

real 2's) can be attained with all .zi E {O,l} . This is shown by a variational argument: If the 2's take exactly two values, one value can be replaced by zero and the other by one without affecting the expression. Otherwise, let s > t > U be three values taken by

1 5 a' # j 5 n the minimum of L* (over *#3

* based on a previous version of this paper which did not yet include the present Section.

z's. Fixing all other values, and letting t vary over the interval [u,s], the expression is the ratio of two linear functions in t . Therefore, all 2's which equal t can be changed to either s or U without increasing the expression. This procedure is followed until only two values remain.

The algorithm is a straight implementation of the proofs. First solve the linear program A = min(Eifj Ci,j d i j under the condition E:=, D , . d.p,tp = 1, where d is a metric on G}. Ap- proximate the optimizing metric in 11 as in Theorem 3.5 and Corollarv 3.6. Consider the index i: which

expression using the above variational procedure to find a near-optimal cut. I

The proof shows that the max-flow min-cut gap is accounted for by the distortion in approximating a certain metric by 11 norm. In those cases where distor- tion smaller than logk will do, better bounds follow for the multicommodity flow problem. For example - suppose the n-point metric space defined by the opti- mal d is isometrically embeddable in Ra for some small s. Then, as mentioned in the end of the proof of Theo- rem 6.1, d may be approximated by 1; with distortion O(s), yielding a better bound than in the general case.

Remark 7.2: Theorem 3.5 shows that every metric on n points can be embedded in an 11 with distortion O(1ogn). As observed in [29] the mu-flow min-cut gap can get as big as Jz(1og n) (all-pairs, unit-demand flow problem on a bounded-degree expander, where all capacities are one). Consequently this bound is tight. The same conclusion holds also for embeddings into 1, for 2 2 p 2 1, because in this range, every finite dimensional 1, space can be embedded in 11 with a constant distortion ([30] p. 182). I

Two commodities

That max-flow=min-cut for two commodities ([23] and [45]) can be shown as follows: Let d be the met- - -, E,,, C i , j . d i , j ric for which X =

to the point

. Map every vertex x

and let D be the I , metric E:=, Dw%b4I

among these points. If we replace d by D , the numer- ator can only decrease, while the denominator stays unchanged, whence there is no loss in assuming d to be (a restriction of) the 1% metric. It is not hard to see that the linear mapping ~ ( 2 1 , z2) = (w, y ) sat- isfies 1 1 ~ ( z l , z ~ ) - - ( ~ l , ~ a ) l l l = Il(21, z z ) - ( ~ i , w ) l l ~ . An application of T thus allows us to assume that the metric d is, in fact (a restriction of) 11. From here on,

588

Page 13: The geometry of graphs and some of its algorithmic ...people.ee.duke.edu/~lcarin/GeometryofGraphs.pdf · The geometry of graphs and some of its algorithmic applications Nathan Linial

the proof of Theorem 7.1 can be followed to derive our claim.

8 Further problems

It is an intriguing idea that large diameters in graphs can be essentially attributed to low- dimensionality. The easy converse is our Lemma 4.3. Attempts to make this statement precise were, in fact, the initial motivation for this research. A plausible conjecture along these lines was formulated with the help of L. Levin. We are grateful for his permis sion to include it here. Let Z& denote the graph of the d-dimensional lattice with 1, metric. Define the growth rate p(G) of a graph G as the maximum (over all choices of r and x) of .wl (where B ( x , r ) is the r-ball around z).

Conjecture 8.1: Let G have growth rate p = p(G). Then Z z ( p ) contains a (not necessarily induced) sub- graph isomorphic to G. I By a standard counting argument, fewer than p(G)

dimensions will not suffice. We have verified the conjecture for the d-Cube and

for regular trees. Note that p(G) may be much smaller than any di-

mension in which G may be embedded nearly isomet- rically:

Example 8.2: Construct a tree of depth 2m as fol- lows: For m 2 i 2 1 level i contains exactly (i + 1)2 vertices. Each vertex in level i has at least one child and the 2i + 3 vertices in a randomly selected set have two children. Vertices in the last m levels have ex- actly one child each. The number of vertices in this tree is n = 0(m3). Now, while p(G) is a constant, a near isometric embedding of G requires dimension Q(logn), since the distance between any two of the (m + 1)2 leaves is between 2m and 4m. The conclu- sion follows from the standard volume argument. I

A question raised in [7] and [3] is to estimate the least $ = $,(n) so that every n-point metric space can be embedded in a $-dimensional normed space with distortion c. An obvious general question is

Problem 8.3 What is the computational complexity of deciding whether dim(G) = d? Similarly for dim(G) and dim,( G) . I It is not hard to see, e.g., by Proposition 4.2, that almost all n-vertex graphs have dimension at least Q(1og n). We would like to know

Problem 8.4: What is the typical dimension of an n-vertex graph? I We suspect the answer to be linear or nearly lin- ear. The situation for near-isometries is quite differ- ent. Since almost all n-vertex graphs have diameter 2, and since dim(K,) = P0g2n], almost all graphs satisfy dim2(G) = O(logn1. By the method of [3], diml+,(G) 5 c, log n diam (G), for every > 0, so for almost all graphs, diml+,(G) 5 4c,logn. On the other hand dim(G) = Q(n) for most graphs, be- cause almost surely no single coordinate can imple- ment the distance between more than n pairs x, y with d(x,y) = 2, but almost surely there are Q(n2) such distances in a random G.

We are only starting to understand the role of girth in this field (but see [43]), and offer:

Problem 8.5: Let all vertices in G have degree 2 3. Does every embedding of G in a Euclidean space (of any dimension) have distortion Q(girth(G))? I

9 Acknowledgments

The authors wish to thank Mike Saks, Micha A. Per- les and Gil Kalai for stimulating discussions. Thanks are also due to Alex Samorodnitsky. We are partic- ularly thankful to Leonid Levin for formulating Con- jecture 8.1.

References

E. M. Andreev, Convex polyhedra in LobaEevskiY spaces, Mat. Sb. (N.S.) 81 (123) (1970), 445-478. English translation: Math. USSR Sb. 10 (1970), 413-440.

E. M. Andreev, Convex polyhedra of finite vol- ume in LobaEevskii space, Mat. Sb. (N.S.) 83 (125) (1970), 256-260. English translation: Math. USSR Sb. 12 (1970), 255-259.

J. Arias-de-Reyna and L. Rodrigues-Piazza, Fi- nite metric spaces needing high dimension for Lipschitz embeddings in Banach spaces, Israel J . Math. 79 (1992), 103-111.

B. Awerbuch and D. Peleg, Sparse partitions, FOCS 31 (1990), 503-513.

L. Babai and P. Frankl, “Linear Algebra Methods in Combinatorics”, Preliminary Version 2, De- partment of Computer Science, The University of Chicago, Chicago, 1992.

589

Page 14: The geometry of graphs and some of its algorithmic ...people.ee.duke.edu/~lcarin/GeometryofGraphs.pdf · The geometry of graphs and some of its algorithmic applications Nathan Linial

[6] K. Ball, Isometric embedding in I,-spaces, Europ. J . Combinatorics 11 (1990), 305-31 1.

[7] J . Bourgain, On Lipschitz embedding of finite metric spaces in Wilbert space, Israel J . Math. 52 (1985), 46-52.

[8] L. Carroll, “Through the Looking-Glass and what Alice Found There”, Chapter 6, Pan Books, London, 1947.

[9] F. R. K. Chung, Separator theorems and their ap- plications, in “Paths, Flows, and VLSI-Layout” (B. Korte, L. Lov&z, H. J . Promel, and A. Schri- jver eds.) Springer, Berlin - New York, 1990, 17- 34.

[lo] E. Cohen, Polylog-time and near-linear work approximation scheme for undirected shortest paths, STOC 26 (1994), 16-26.

[ll] L. J . Cowen, On local representations of graphs and networks, PhD dissertation, MIT/ LCS/ TR- 573, 1993.

[12] L. Danzer and B. Griinbaum, Uber zwei Prob- leme beziiglich konvexer Korper von P. Erdos und von V. L. Klee, Math. Zeitschr. 79 (1962), 95-99.

[13] M. Deza and M. Laurent, Applications of cut polyhedra, Liens - 92 - 18, September 1992.

[14] M. Deza and H. Maehara, Metric transforms and Euclidean embeddings, Trans. AMS 317 (1990), 661-671.

[15] R.O. Duda and P.E. Hart, “Pattern Classification and Scene Analysis”, John Wiley and Sons, New York, 1973.

[16] P. Frankl and H. Maehara, The Johnson- Lindenstrauss lemma and the sphericity of some graphs, J . Comb. Th. B 44 (1988), 355-362.

[17] P. Frankl and H. Maehara, On the contact di- mension of graphs, Discrete and Computational Geometry 3 (1988), 89-96.

[18] N. Garg, V. V. Vazirani, and M. Yannakakis, Ap- proximate max-flow min-(mu1ti)cut theorems and their applications, STOC 25 (1993), 698-707.

[19] A. A. Giannopoulos, On the Banach-Mazur dis- tance to the cube, to appear in “Geometric As- pects of Functional Analysis”.

[20] M. X. Goemans and D. P. Williamson, .878- Approximation algorithms for MAX CUT and MAX SSAT, STOC 26 (1994), 422-431.

[21] R. L. Graham and P. M. Winkler, On isometric embeddings of graphs, Trans. AMS 288 (1985), 527-536.

[22] W. Holsztynski, R” as a universal metric space, Abstract 78T-G56, Notices AMS 25 (1978), A- 367.

[23] T. C. Hu, Multicommodity network flows, Oper- ations Research 11 (1963), 344-360.

[24] W. B. Johnson and J . Lindenstrauss, Extensions of Lipschitz mappings into a Hilbert space, Con- temporary Mathematics 26 (1984), 189-206.

[25] W. B. Johnson, J . Lindenstrauss, and G. Schecht- man, On Lipschitz embedding of finite metric spaces in low dimensional normed spaces, in “Geometric Aspects of Functional Analysis”, (J. Lindenstrauss and V. Milman eds.) LNM 1267, Springer, Berlin - New York, 1987, 177-184.

[26] D. Karger, R. Motwani, and M. Sudan, Approx- imate graph coloring by semidefinite program- ming, these ,proceedings.

[27] P. Klein, A. Agrawal, R. Ravi, and S. Rao, Approximation through multicommodity flow, FOCS 31 (1990), 726-737.

[28] P. Koebe, Kontaktprobleme der konformen Ab- bildung, Berichte Verhande. Sachs. Akad. Wiss. Leipzig, Math.-Phys. Klasse 88 (1936), 141-164.

[29] T. Leighton and S. Rao, An approximate max- flow min-cut theorem for uniform multicommod- ity flow problems with applications to approxi- mation algorithms, FOCS 29 (1988), 422-431.

[30] J . Lindenstrauss and L. Tzafriri, “Classical Ba- nach Spaces 11”, Springer, Berlin - New York, 1979.

[31] M. Linial, N. Linial, H. Margalit, N. Tishby, and G. Yona, Work in progress, 1994.

[32] N. Linial, Local-global phenomena in graphs, Combinatorics, Probability and Computing 2 (1993), 491-503.

[33] N . Linial, D. Peleg, Yu. Rabinovich, and M. Saks, Sphere packing and local majorities in graphs, The 2nd Israel Symp. on Theory and Comput- ing Systems (1993), 141-149.

590

Page 15: The geometry of graphs and some of its algorithmic ...people.ee.duke.edu/~lcarin/GeometryofGraphs.pdf · The geometry of graphs and some of its algorithmic applications Nathan Linial

[34] N . Linial and M. Saks, Low diameter graph de- compositions, SODA 2 (1991), 320-330. Journal version: Combinatorica 13 (1993), 441-454.

[35] J . H. van Lint and R. M. Wilson, “A Course in Combinatorics”, Cambridge University Press, Cambridge, 1992.

[36] L. Lovkz, On the Shannon capacity of a graph, IEEE Trans. Inf. Th. 25 (1979) 1-7.

[37] J . Matougek, Computing the center of planar point sets, in “Discrete and computational Ge- ometry: papers from the DIMACS special year”, (J. E. Goodman, R. Pollack, and W. Steiger eds.) AMs, Providence, 1991, 221-230.

[38] J. Matougek, Note on bi-Lipschitz embeddings into normed spaces, Comment. Math. Univ. Car- olinae 33 (1992), 51-55.

[39] G. L. Miller, S-H. Teng, and S. A. Vavasis, A unified geometric approach to graph separators, FOCS 32 (1991), 538-547.

[40] G. L. Miller and W. Thurston, Separators in two and three dimensions, STOC 22 (1990), 300-309.

[41] D. Peleg and A. Schaffer, Graph spanners, J . Graph Theory 13 (1989), 99-116.

[42] G. Pisier, “The Volume of Convex Bodies and Banach Space Geometry”, Cambridge University Press, Cambridge, 1989.

[43] Yu. Rabinovich and R. Raz, On embeddings of fi- nite metric spaces in graphs with a bounded num- ber of edges, in preparation.

[44] J . Reiterman, V. Rodl, and E. Sifiajov6, Geomet- rical embeddings of graphs, Discrete Mathemat- ics 74 (1989), 291-319.

[45] B. Rothschild and A. Whinston, Feasibility of two-commodity network flows, Operations Re- search 14 (1966), 1121-1129.

[46] D. D. Sleator, R. E. Tarjan, and W. P. Thurston, Rotation distance, triangulations, and hyperbolic geometry, J . AMS 1 (1988), 647-681.

[47] W. P. Thurston, The finite Riemann mapping theorem, invited address, International Sympo- sium in Celebration of the Proof of the Bieber- bach Conjecture, Purdue University, 1985.

[48] P. M. Winkler, Proof of the squashed cube con- jecture, Combinatorica 3 (1983), 135-139.

[49] H . S. Witsenhausen, Minimum dimension embed- ding of finite metric spaces, J . Comb. Th. A 42 (1986), 184-199.

[50] I. M. Yaglom and V. G. Boltyanskii, “Convex Fig- ures”, Holt, Rinehart and Winston, New York, 1961.

591