ABSTRACT
This thesis is an accumulation of work regarding a class of constrained Euclidean
Distance Matrix (EDM) based optimization models and corresponding numerical
approaches. EDM-based optimization is powerful for processing distance information,
which appears in diverse applications arising from a wide range of fields, and from
which the motivation for this work comes. Those problems usually involve minimizing
the error of distance measurements as well as satisfying some Euclidean distance
constraints, which may present an enormous challenge to existing algorithms. In this
thesis, we focus on problems with two different types of constraints. The first consists
of spherical constraints, which come from spherical data representation; the other has
a large number of bound constraints, which come from wireless sensor network
localization.
For spherical data representation, we reformulate the problem as a Euclidean
distance matrix optimization problem with a low-rank constraint. We then propose
an iterative algorithm that uses a quadratically convergent Newton-CG method
at each step. We study fundamental issues including constraint nondegeneracy
and the nonsingularity of the generalized Jacobian that ensure the quadratic
convergence of the Newton method. We use some classic examples from spherical
multidimensional scaling to demonstrate the flexibility of the algorithm in
incorporating various constraints.
For wireless sensor network localization, we set up a convex optimization model
using EDM which integrates connectivity information as lower and upper bounds
on the elements of the EDM, resulting in an EDM-based localization scheme that
possesses both efficiency and robustness in dealing with flip ambiguity under high
levels of noise in the distance measurements and irregular topology of the network
of moderate size in question.
Dinesh Gupta, Research Scholar, OPJS University, Rajasthan.
Dr. Amit Jain, Professor, GITM, Gurgaon.
EUCLIDEAN DISTANCE MATRIX (EDM) BASED
OPTIMIZATION MODELS
JASC: Journal of Applied Science and Computations
Volume VI, Issue VI, JUNE/2019
ISSN NO: 1076-5131
Page No:3415
Introduction
In this thesis, we focus on designing algorithms for a class of Euclidean Distance
Matrix (EDM) based optimization problems. In particular, we are interested in
EDM-based optimization problems with two types of constraints: spherical con-
straints and bound constraints. Let {x1, . . . ,xn} be n points in IRr, where r > 0
is known as the embedding dimension of those points. The primary information
that is available for those points is the measured Euclidean distances among them
dij ≈ ‖xi − xj‖, for some pairs (xi,xj), (1.1)
which may be incomplete or noisy, or both. The aim of EDM-based optimization is
to recover the (relative or global) coordinates of these points in a target space IRr
purely based on those available distances. Such problems are usually encountered
with cone constraints and rank constraints, which bring nonsmoothness and
nonconvexity to the optimization model, so algorithms need to be designed to
solve the problems with specific constraints accurately and efficiently.
This chapter is split into three sections. In Section 1.1, we cover the background on
the Euclidean Distance Matrix, which is the fundamental concept of our modelling
process and algorithm design. In Section 1.2, we give an introduction to the
semismooth Newton method, which is our main approach to dealing with spherical
constraints. In Section 1.3, we cover a novel convergent Alternating Direction
Method of Multipliers (ADMM) which allows us to deal with a large number of
bound constraints in conic programming.
1.1 Background on Euclidean Distance Matrix
Let Sn denote the space of n× n symmetric matrices equipped with the standard
inner product 〈A,B〉 = Tr(AB) for A,B ∈ Sn. Let ‖ · ‖ denote the induced
Frobenius norm. Let Sn+ denote the cone of positive semidefinite matrices in Sn
(often abbreviated as X ⪰ 0 for X ∈ Sn+). The so-called hollow subspace Snh is
defined by (“:=” means define)
Snh := {A ∈ Sn : diag(A) = 0} ,
where diag(A) is the vector formed by the diagonal elements of A. For subsets α,
β of {1, . . . , n}, denote Aαβ as the submatrix of A indexed by α and β (α for rows
and β for columns). Aα denotes the submatrix consisting of columns of A indexed
by α, and |α| is the cardinality of α. Throughout the thesis, vectors are treated
as column vectors. For example, xT is a row vector for x ∈ IRn. The vector e is
the vector of all ones and I denotes the identity matrix, whose dimension is clear
from the context. When it is necessary, we use In to indicate its dimension n.
Let ei denote the ith unit vector, which is the ith column of I. We also need the
following two important linear transformations.
The first one is Householder transformations, which are orthogonal transforma-
tions that describe reflections about hyperplanes containing the origin. Let
v := [1, . . . , 1, 1 + √n]T = e + √n en. Then

Q = In − (2/(vTv)) vvT

is the Householder transformation that maps e ∈ IRn to the vector
[0, . . . , 0, −√n]T ∈ IRn.
The second one is the geometric centering transformation, which centers a set of
points at their geometric center. Consider a collection of n points in IRr, ascribed
to the columns of the matrix X ∈ IRr×n, X = [x1,x2, . . . ,xn], xi ∈ IRr. The centroid
is the mean of all the points
xc = (1/n) ∑_{i=1}^{n} xi = (1/n) Xe.

By subtracting this vector from all the points in the set, we have the set of
centralized points

Xc = X − xc eT = X (In − (1/n) eeT).

Then the geometric centering transformation is defined as

J := In − (1/n) eeT. (1.2)

We often use the following properties:

J² = J,  Q² = I  and  J = Q [ In−1 0 ; 0 0 ] Q. (1.3)
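These two transformations and the identities in (1.3) can be checked numerically. The following sketch (using numpy, with n = 5 chosen arbitrarily) is illustrative only and is not part of the original text:

```python
import numpy as np

n = 5
e = np.ones((n, 1))

# Householder vector v = e + sqrt(n) * e_n, and Q = I - 2 v v^T / (v^T v)
v = e.copy()
v[-1, 0] += np.sqrt(n)
Q = np.eye(n) - 2.0 * (v @ v.T) / (v.T @ v)

# Geometric centering matrix J = I - (1/n) e e^T
J = np.eye(n) - (e @ e.T) / n

# Properties (1.3): J^2 = J, Q^2 = I, and J = Q diag(I_{n-1}, 0) Q
Dblk = np.zeros((n, n))
Dblk[:n - 1, :n - 1] = np.eye(n - 1)
assert np.allclose(J @ J, J)
assert np.allclose(Q @ Q, np.eye(n))
assert np.allclose(J, Q @ Dblk @ Q)

# Q maps e to [0, ..., 0, -sqrt(n)]^T
print(np.round((Q @ e).ravel(), 6))
```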
1.1.1 Squared Euclidean Distance Matrix
A matrix D is a (squared) EDM if D ∈ Snh and there exist points {x1, . . . ,xn} in
IRr such that Dij = ‖xi − xj‖2 for i, j = 1, . . . , n. IRr is often referred to as the
embedding space and r is the embedding dimension when it is the smallest such
r. Consider the following example of EDM for the case n = 3.
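As a small numerical illustration of this definition (assuming numpy; the three points below are our own choice, not the thesis's original example), a squared EDM for n = 3 can be built directly from coordinates:

```python
import numpy as np

# Three illustrative points in IR^2 (hypothetical data, chosen for the example)
X = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])  # columns are x1, x2, x3

# Build the squared EDM: D_ij = ||x_i - x_j||^2
G = X.T @ X                      # Gram matrix
d = np.diag(G)
D = d[:, None] - 2.0 * G + d[None, :]

print(D)
# D is hollow (zero diagonal) and symmetric, as the definition requires
assert np.allclose(np.diag(D), 0) and np.allclose(D, D.T)
```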
1.1.2 Characterizations of EDM
It is well-known that a matrix D ∈ Sn is an EDM if and only if
D ∈ Snh and J(−D)J ⪰ 0. (1.4)
The origin of this result can be traced back to Schoenberg (1935) and an inde-
pendent work by Young and Householder (1938). See also Gower (1985) for a
nice derivation of (1.4). Moreover, the corresponding embedding dimension is
r = rank(JDJ).
From the definition in (1.2), it is noted that the matrix J , when treated as an
operator, is the orthogonal projection onto the subspace e⊥ := {x ∈ IRn : eTx =
0}. Characterization (1.4) simply means that D is an EDM if and only if D ∈ Snh
and D is negative semidefinite on the subspace e⊥:
−D ∈ Kn+ := {A ∈ Sn : xTAx ≥ 0, ∀ x ∈ e⊥}.
It follows that Kn+ is a closed convex cone (known as the almost positive
semidefinite cone). This opens the door to using conic programming in dealing
with distance-related problems. Let ΠKn+(D) denote the orthogonal projection of D ∈ Sn
onto Kn+:
ΠKn+(D) := arg min ‖D − Y ‖ s.t. Y ∈ Kn+.
A nice property is that this projection can be done through the orthogonal pro-
jection onto the positive semidefinite cone Sn+ and is due to Gaffke and Mathar
(1989)
ΠKn+(D) = D + ΠSn+(−JDJ) ∀ D ∈ Sn. (1.5)
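Formula (1.5) reduces the projection onto Kn+ to a single projection onto the positive semidefinite cone, which is computable by one eigenvalue decomposition. A minimal numpy sketch (the function names are ours, not from the thesis):

```python
import numpy as np

def proj_psd(A):
    """Orthogonal projection onto the positive semidefinite cone S^n_+."""
    w, V = np.linalg.eigh((A + A.T) / 2)
    return (V * np.maximum(w, 0)) @ V.T

def proj_Kn_plus(D):
    """Projection onto K^n_+ via the Gaffke-Mathar formula (1.5)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return D + proj_psd(-J @ D @ J)

# Sanity check on a random symmetric matrix: the projection P satisfies
# x^T P x >= 0 for all x orthogonal to e, i.e. J P J is positive semidefinite.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
A = A + A.T
P = proj_Kn_plus(A)
n = A.shape[0]
J = np.eye(n) - np.ones((n, n)) / n
w = np.linalg.eigvalsh(J @ P @ J)
print(w.min())  # should be nonnegative up to roundoff
```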
The other formula for computing ΠKn+ is due to Hayden and Wells (1988, Thm.
2.1):

D ∈ Kn+ ⇐⇒ QDQ = [ D̄ d ; dT d0 ] with D̄ ∈ Sn−1+, d ∈ IRn−1, d0 ∈ IR, (1.6)

and

ΠKn+(D) = Q [ ΠSn−1+(D̄) d ; dT d0 ] Q, ∀ D ∈ Sn. (1.7)

Because of (1.7), the cone Kn+ can be described as follows:

Kn+ = { Q [ Z z ; zT z0 ] Q : Z ∈ Sn−1+, z ∈ IRn−1, z0 ∈ IR }. (1.8)

Its polar cone (Kn+)◦ is then given by

(Kn+)◦ = { Q [ Z 0 ; 0 0 ] Q : Z ∈ −Sn−1+ }. (1.9)
We will use (1.5) for the implementation of our algorithm and (1.8) and (1.9) for
theoretical analysis.
1.1.3 Coordinate recovery from an EDM
In this section, we introduce the process for recovering the coordinates of points
from an EDM. If D is an EDM, from the definition introduced in Section 1.1.1,

Dij = ‖xi − xj‖2 = (xi − xj)T (xi − xj) = xTi xi − 2xTi xj + xTj xj.

Let X ∈ IRr×n, X = [x1,x2, . . . ,xn] be a collection of n points; then

D = e diag(XTX)T − 2XTX + diag(XTX)eT , (1.10)
which gives an explicit relation between the coordinates of the points in X and the EDM D.
Define the matrix

G := XTX, (1.11)

which is known as the Gram matrix. From Gower (1982), the set of coordinates
can be obtained through the decomposition

−(1/2) JDJ = XTX. (1.12)

We note that the decomposition is possible because the matrix (−JDJ) is positive
semidefinite according to (1.4).
The results in (1.4) and (1.12) hold when D is a true EDM. What should one do
if D is not a true EDM? The most popular method is classical Multidimensional
Scaling (cMDS) (Cox and Cox, 2000; Borg and Groenen, 2005), which simply
computes the nearest positive semidefinite matrix to (−JDJ), obtained
through the following optimization:

minY ‖J(Y − D)J‖2 s.t. −JY J ⪰ 0 and Y ∈ Snh. (1.13)

The optimal solution is just the orthogonal projection of (−JDJ) onto Sn+ and is
denoted by ΠSn+(−JDJ). cMDS then uses this projection in place of (−JDJ) in
(1.12) to get the embedding points in X. This method is also known as principal
coordinate analysis by Gower (1966). We summarize the cMDS algorithm as
Algorithm 1. We need to point out here that cMDS works well when D is close to
a true EDM. Otherwise it may perform poorly in terms of embedding quality due
to the rank of the Gram matrix being too high.
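The cMDS procedure just described can be sketched as follows (a numpy sketch of the standard steps, since Algorithm 1 itself is stated elsewhere; `cmds` is our own name):

```python
import numpy as np

def cmds(D, r):
    """Classical MDS sketch: embed a (possibly non-EDM) dissimilarity
    matrix D into IR^r via the projection of -J D J / 2 onto S^n_+."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D @ J               # doubly centered matrix, cf. (1.12)
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:r]      # r largest eigenvalues
    w_r = np.maximum(w[idx], 0)        # drop any negative spectrum
    return (V[:, idx] * np.sqrt(w_r)).T  # r x n coordinate matrix X

# Round trip: a true EDM should be reproduced exactly (up to a rigid motion).
X0 = np.array([[0.0, 3.0, 0.0, 3.0],
               [0.0, 0.0, 4.0, 4.0]])
G = X0.T @ X0
d = np.diag(G)
D = d[:, None] - 2.0 * G + d[None, :]
X = cmds(D, 2)
G2 = X.T @ X
d2 = np.diag(G2)
D2 = d2[:, None] - 2.0 * G2 + d2[None, :]
assert np.allclose(D, D2)
```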
Best Euclidean distance embedding on a sphere
In this chapter, we mainly discuss a class of EDM-based optimization problems with
spherical constraints for data representation on a sphere of unknown radius. This
problem arises in various disciplines such as Statistics (spatial data representation),
Psychology (constrained multidimensional scaling), and Computer Science
(machine learning and pattern recognition). The best representation often needs
to minimize a distance function of the data on a sphere as well as to satisfy some
Euclidean distance constraints. As discussed in Section 1.1.4, those spherical and
Euclidean distance constraints present an enormous challenge to existing
algorithms. In this chapter, we introduce a reformulation of the problem as an
EDM-based optimization problem with a low-rank constraint. We then propose an
iterative algorithm that uses a quadratically convergent Newton-CG method at
each step. We study fundamental issues including constraint nondegeneracy and
the nonsingularity of the generalized Jacobian that ensure the quadratic convergence
of the Newton method. We use some classic examples from spherical multidimensional
scaling to demonstrate the flexibility of the algorithm in incorporating
various constraints.
The section is organized as follows. In Section 2.1, we give a background and
literature review of the spherical data representation problem. In Section 2.2, we
first argue that when the EDM is used to formulate the problem, it is necessary
to introduce a new point to represent the center of the sphere. This is due to a
special property arising from embedding an EDM. The algorithmic framework that
we use for the obtained non-convex matrix optimization problem is closely related
to the majorized penalty method of Gao and Sun (2010) for the nearest low-rank
correlation matrix problem. One of the key elements in this type of method is that
the subproblems are convex. Those convex problems are structurally similar to
a convex relaxation of the original matrix optimization problem and they all can
be solved by a quadratically convergent Newton-CG method. We establish that
this is the case for our problem by studying the challenging issue of constraint
nondegeneracy, which further ensures the nonsingularity of generalized Jacobian
used by the Newton-CG method. Those results can be found in Section 2.3 and
ensure that the extension of the majorization method of Gao and Sun (2010)
to our problem is complete. The algorithm is presented in Section 2.4 and its
key convergent results are stated without detailed proofs as they can be proved
similarly as in Gao and Sun (2010). Section 2.5 aims to demonstrate a variety
of applications from classical MDS to the circle fitting problem. The numerical
performance is highly satisfactory with those applications.
2.1 Introduction to spherical data representation
The problem that we are mainly concerned with is placing n points {x1, . . . ,xn}
in the best way on a sphere in IRr. The primary information that we use is an
incomplete or complete set of pairwise Euclidean distances (often noisy) among
the n points. In such a setting, IRr is often a low-dimensional space (e.g., r takes
2 or 3 for data visualization) and is known as the embedding space. The center
of the sphere is unknown. For some applications, the center can be put at origin
in IRr. Furthermore, the radius of the sphere is also unknown. In our matrix
optimization formulation of the problem, we treat both the center and the radius
as unknown variables. We develop a fast numerical method for this problem and
present a few interesting applications taken from the existing literature.
The problem described above has long appeared in the constrained Multi-Dimensional
Scaling (MDS) when r ≤ 3, which is mainly for the purpose of data visualization,
see Cox and Cox (2000, Sect. 4.6) and Borg and Groenen (2005, Sect. 10.3) for
more details. In particular, it is known as the spherical MDS when r = 3 and the
circular MDS when r = 2. Most numerical methods in this area take advantage
of r being 2 or 3. For example, two of the earliest circular MDS methods were by
Borg and Lingoes (1980) and Lee and Bentler (1980), where they introduced a new point
x0 ∈ IRr as the center of the sphere (i.e., circles in their case) and further forced
the following constraints to hold:
D01 = D02 = · · · = D0n.
Here D0j = ‖x0−xj‖, j = 1, . . . , n are the Euclidean distances between the center
x0 and the other n points. In their models, the variables are the coordinates of
the (n + 1) points in IRr. In Borg and Lingoes (1980), the optimal criterion was
a stress function widely used in MDS literature (see Borg and Groenen (2005,
Chp. 3)), whereas Lee and Bentler (1980) used a least square loss function as its
optimal criterion.
In the spherical MDS of Cox and Cox (1991), Cox and Cox placed the center of
the sphere at origin and represented the n points by their spherical coordinates.
Moreover, they also argued for the Euclidean distance to be used over the seem-
ingly more appropriate geodesic distance on the sphere. This is particularly the
case when the order of the distances among the n points is more important than
the magnitude of their actual distances. For the accurate relationship between
Euclidean distance and the geodesic distance on a sphere, see Pekalska and Duin
(2005, Thm. 3.23), which is credited to Schoenberg (1937). A recent method
known as MDS on a quadratic surface (MDS-Q) was proposed by De Leeuw and
Mair (2009), where geodesic distances were used. As noted in De Leeuw and Mair
(2009, p. 12), ”geodesic MDS-Q, however, seems limited for now to spheres in any
dimension, with the possible exception of ellipses and parabolas in IR2”. For the
spherical case, MDS-Q places the center at origin and the variables are the radius
and the coordinates of the n points on the sphere. The Euclidean distances were
then converted to the corresponding geodesic distances. The optimal criterion is
a weighted least square loss function.
When the center of the sphere is placed at origin, any point on the sphere satisfies
the spherical constraint of the type ‖x‖ = R, where x ∈ IRr and R is the radius.
Optimization with spherical constraints has recently attracted much attention of
researchers, see, e.g., Malick (2007); Ling et al. (2010); Gao (2010); Gao and Sun
(2010); Li and Qi (2011); Zhou et al. (2012) and the references therein. Such
a problem can be cast as a more general optimization problem over the Stiefel
manifold (Wen and Yin, 2013; Jiang and Dai, 2014). One important example is
the nearest low-rank correlation matrix problem, where the unit diagonal of the
correlation matrix yields the spherical constraints (Gao and Sun, 2010; Li and Qi,
2011; Wen and Yin, 2013; Jiang and Dai, 2014). It is noted that the sequential
second-order methods in Gao and Sun (2010); Li and Qi (2011) as well as the
feasibility-preserving methods in Wen and Yin (2013); Jiang and Dai (2014) all
rely on the fact that the radius is known (e.g., R = 1). This is in contrast to our
problem where R is a variable.
2.2 EDM-based optimization formulation
The available information for us to find n points {x1, . . . ,xn} embedded on a
sphere in IRr is the set of approximate (squared) Euclidean distances among the
n points:
D0ij ≈ ‖xi − xj‖2, i, j = 1, . . . , n.
Denote the center of the sphere by xn+1 (the (n+ 1)th point) and its radius by R.
Since the n points are placed on the sphere, we must have
‖xj − xn+1‖ = R, j = 1, . . . , n.
Although we do not know the exact magnitude of R, we can be sure that twice
the radius cannot be bigger than the diameter of the data set:
2R ≤ dmax := max_{i,j} √(D0ij).
We therefore define the approximate distance matrix D ∈ Sn+1 by (only upper
part of D is defined)
Dij =
(1/4) dmax²  if i = 1, . . . , n and j = n + 1,
D0ij  if 1 ≤ i < j ≤ n,
0  if i = j.
(2.1)
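The construction in (2.1) can be sketched in code as follows (a numpy sketch; the function name and the small sample D0 are ours, for illustration only):

```python
import numpy as np

def augmented_D(D0):
    """Build the (n+1) x (n+1) matrix D of (2.1) from the squared
    dissimilarities D0 among the n points.  The extra row/column
    corresponds to the unknown center, with (d_max / 2)^2 as the
    initial guess for the squared radius."""
    n = D0.shape[0]
    dmax = np.sqrt(D0.max())
    D = np.zeros((n + 1, n + 1))
    D[:n, :n] = D0                       # original squared dissimilarities
    D[:n, n] = D[n, :n] = 0.25 * dmax**2  # center-to-point entries
    return D

D0 = np.array([[0.0, 4.0, 2.0],
               [4.0, 0.0, 2.0],
               [2.0, 2.0, 0.0]])
D = augmented_D(D0)
print(D[0, 3])  # (d_max / 2)^2 = (2 / 2)^2 = 1.0
```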
The elements in D are approximate Euclidean distances among the (n+ 1) points
{x1, . . . ,xn+1}. But D may not be a true EDM. Our purpose is to find the nearest
EDM Y to D such that the embedding dimension of Y is r and its embedding
points {x1, . . . ,xn} are on a sphere centered at xn+1. The resulting matrix opti-
mization model is then given by
min_{Y ∈ Sn+1} (1/2)‖Y − D‖²
s.t. Y ∈ Sn+1h, −Y ∈ Kn+1+, rank(JY J) ≤ r,
Y1(n+1) = Yj(n+1), j = 2, . . . , n.
(2.2)
Once we find the nearest EDM Y, i.e., the one whose total deviation from D is
smallest, we can combine it with the classical MDS Algorithm 1 to get the positions
of the n embedding points.
Problem (2.2) is always feasible (e.g., the zero matrix is feasible). The feasible
region is closed and the objective function is coercive. Let Y be its optimal so-
lution. The first group of constraints in (2.2) implies that Y is an EDM with an
embedding dimension not greater than r. If r < n (i.e., rank(JY J) < n), the
problem is nonconvex. If r = n, then we can drop the rank constraint so that the
problem is convex. This is due to the fact that any EDM of size (n+ 1)× (n+ 1)
has an embedding dimension not greater than (n + 1 − 1) = n. One can easily
check that 0 is always an eigenvalue of JY J and e is the corresponding eigenvec-
tor. Therefore, the rank constraint is automatically satisfied if r = n. The second
group of constraints in (2.2) means that the distances from xi, i = 1, . . . , n to
xn+1 are equal. Hence, {x1, . . . ,xn} lie on a sphere centered at xn+1. We call the
constraints Y1(n+1) = Yj(n+1), j = 2, . . . , n spherical constraints and we note that
they are linear. This is in contrast to the nonlinear formulation of the spherical
constraints in the previous studies (Borg and Lingoes, 1980; Lee and Bentler, 1980;
Cox and Cox, 1991; De Leeuw and Mair, 2009).
Regarding model (2.2), we have the following two remarks.
Remark 2.1. The idea of introducing a variable representing the center (i.e., one
more dimension in our formulation) is similar to that of Borg and Lingoes (1980);
Lee and Bentler (1980), whose main purpose was for the case r = 2 and the
variables of the optimization problems are the coordinates of the points concerned.
Our model is more general for arbitrary r and is conducive to (second-order)
algorithmic development because the spherical constraints are linear. Furthermore,
as introduced in Section 1.1.3, the actual embedding is left out as a separate issue,
which can be done by Algorithm 1, possibly through Procrustes analysis.
Remark 2.2. The following reasoning further justifies why it is necessary to intro-
duce a new point for the center of the sphere. Let D0 denote the true squared
Euclidean distance matrix among n points on a sphere. From Gower (1982), the
decomposition
−(1/2) JD0J = XTX with X ∈ IRr×n, (2.3)
would provide a set of points {xi : i = 1, . . . , n} such that the distances in D0 are
recovered through D0ij = ‖xi − xj‖2. In order for those points to lie on a sphere
centered at origin, it is necessary and sufficient to enforce the constraints
‖x1‖ = ‖x2‖ = · · · = ‖xn‖. (2.4)
We note that
‖xi‖² = eiT (XTX) ei = −(1/2) eiT JD0J ei
= D0ii + (1/(2n)) (eiT D0 e + eT D0 ei) − (eT D0 e)/(2n²)
= (1/(2n)) 〈D0, Ai〉 − (eT D0 e)/(2n²),

where Ai := ei eT + e eiT. The spherical constraints are then equivalent to
〈D0, A1 − Ai〉 = 0, i = 2, · · · , n,
which are linear in the Euclidean distance matrix D0. It seems that there is no
need to introduce a new point to represent the center of the sphere. However, there
is a potential conflict in this seemingly correct argument. We note that there is
an implicit constraint we ignored. In (2.3), the embedding points in X have to
satisfy the centralization condition (because of the projection matrix J)
Xe = 0. (2.5)
A potential conflict is that the constraints (2.4) and (2.5) may contradict each
other. Such a contradiction can be verified through the following example. Let
D0 be from the three points on the unit circle centered at the origin:

x1 = (1, 0)T , x2 = (−1, 0)T , x3 = (0, 1)T .

There exists no X ∈ IR2×3 that satisfies (2.3) (hence (2.5)) and (2.4). If we now
define D by (2.1) and solve problem (2.2), we obtain the following 4 embedding
points:
z1 = (−1, 0.25)T , z2 = (1, 0.25)T , z3 = (0,−0.75)T , z4 = (0, 0.25)T .
The first three points are on the unit circle centered at z4. The original three
points x1, x2 and x3 can be obtained through the simple shift xi = zi − z4 (the
simplest Procrustes analysis). This example shows that it is necessary to introduce
a new point to represent the center in order to remove the potential confliction in
representing the spherical constraints as linear equations.
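The conflict in this example can also be checked numerically. Assuming numpy, the following sketch computes the centered Gram matrix from (2.3) for the three points and shows that the norms required to be equal in (2.4) are not:

```python
import numpy as np

# The three points of the example: on the unit circle centered at the origin.
X = np.array([[1.0, -1.0, 0.0],
              [0.0,  0.0, 1.0]])
n = X.shape[1]

# Their squared EDM D0 and the centered Gram matrix -J D0 J / 2 from (2.3).
G = X.T @ X
d = np.diag(G)
D0 = d[:, None] - 2.0 * G + d[None, :]
J = np.eye(n) - np.ones((n, n)) / n
Gc = -0.5 * J @ D0 @ J

# diag(Gc) gives ||x_i||^2 for the *centralized* points (those satisfy Xe = 0);
# constraint (2.4) would require these to be all equal -- they are not.
norms = np.diag(Gc)
print(np.round(norms, 4))  # -> [1.1111 1.1111 0.4444]
assert not np.allclose(norms, norms[0])
```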
We now reformulate (2.2) in a more conventional format. By replacing Y by (−Y )
(in order to get rid of the minus sign before Kn+1+ ), we obtain
min_{Y ∈ Sn+1} (1/2)‖Y + D‖²
s.t. Y ∈ Sn+1h, Y ∈ Kn+1+, rank(JY J) ≤ r,
Y1(n+1) = Yj(n+1), j = 2, . . . , n.
Define three linear mappings A1 : Sn+1 → IRn+1, A2 : Sn+1 → IRn−1 and A :
Sn+1 → IR2n respectively by
A1(Y ) := diag(Y ), A2(Y ) := (Y1(n+1) − Yj(n+1))_{j=2,...,n} and A(Y ) := (A1(Y ); A2(Y )).
It follows that solving (2.2) is equivalent to solving the following problem:

min_{Y ∈ Sn+1} (1/2)‖Y + D‖²
s.t. A(Y ) = 0, Y ∈ Kn+1+,
rank(JY J) ≤ r.
(2.6)
We note that without the spherical constraints A2(Y ) = 0, the problem reduces
to the problem (1.21) studied in Qi and Yuan (2014). However, with the spherical
constraints, the analysis in Qi and Yuan (2014), especially for the semismooth
Newton-CG method developed in Qi (2013); Qi and Yuan (2014) is not valid any
more because it heavily depends on the simple structure of the diagonal constraints
A1(Y ) = 0. One of our main tasks in this section is to develop more general
analysis that covers the spherical constraints.
Figure 2.1: Comparison between the two circular fittings of Ekman’s 14 color
problem with and without pole constraints: (a) circular fitting without any
constraints (“Ekman Color Example”); (b) circular fitting with pole constraints
(“Wheel Representation of Ekman Color Example”). Points are labeled by their
wavelengths (434–674).
Figure 2.1 (a) (the radius is R = 0.5354) is the resulting circular representation by
FITS with colors appearing on the circle one by one in order of their wavelength.
This figure is similar to De Leeuw and Mair (2009, Fig. 2), where more comments
on this example can be found. A pair of colors (i, j) are said to oppose each
other if their distance equals the diameter of the circle. That is,

Yij = 4Y1(n+1), (2.58)

which means that the squared distance between opposing colors is four times the
squared radius. This type of constraint is called a “pole constraint”. An interesting
feature is that when the first 7 colors are set to oppose the remaining
7 colors, the resulting circular representation appears as a nice wheel, without
the order of the colors having changed; see Figure 2.1(b) (the radius is 0.5310).
Practitioners in Psychology may have new interpretation of such nice representa-
tion. We emphasize that our method can easily include the pole constraints and
other linear constraints without any technical difficulties. We are not aware of any
existing methods that can directly handle those extra constraints.
(E2) Trading globe. The data in this example was first mapped to a sphere (r =
Figure 2.2: Spherical representation for trading data in 1986 between countries
{Argentina, Australia, Brazil, Canada, China, Czechoslovakia, East Germany,
Egypt, France, Hungary, India, Italy, Japan, New Zealand, Poland, Sweden,
UK, USA, USSR, West Germany}.
3) in Cox and Cox (1991) and was recently tested in De Leeuw and Mair (2009).
The data was originally taken from the New Geographical Digest (1986) on which
countries traded with other countries. For 20 countries the main trading partners
are dichotomously scored (1 means trade performed, 0 trade not performed) as
shown in Table 2.2. Based on this dichotomous matrix X the distance matrix D0
is computed using the squared Jaccard coefficient (computed by the Matlab built-
in function pdist(X, ’jaccard’)). The most intuitive MDS approach is to project
the resulting distances to a sphere which gives a “trading globe”.
In Figure 2.2 (R = 0.5428), the countries were projected onto a globe with the
shaded points being on the other side of the sphere. The figure is from the de-
fault viewpoint of Matlab. It is interesting to point out that obvious clusters of
countries can be observed. For example, on the top left is the cluster of Com-
monwealth nations (Australia, Canada, India, and New Zealand). On the bottom
right is the cluster of western allies (UK, US, and West Germany) with Japan
[Table 2.2: Nations’ trading data from the New Geographical Digest (1986); a
20 × 20 dichotomous matrix of main trading partners for the 20 countries listed
in Figure 2.2.]
not far above them. On the north pole is China, which reflects its isolated
trading situation back in 1986. On the backside is the cluster of countries headed
by USSR. On the left backside is the cluster of Brazil, Argentina, and Egypt. We
note that this figure appears different from those in Cox and Cox (1991); De Leeuw
and Mair (2009), mainly because each used a different method on a different
(nonconvex) model of the spherical embedding of the data.
(E3) 3D Map of global cities in HA30 data set. HA30 is a dataset of spherical
distances among 30 global cities, measured in hundreds of miles and selected by
Hartigan (1975) from the World Almanac, 1966. It also provides XYZ coordinates
of those cities. In order to use FITS, we first convert the spherical distances to
Euclidean distances through the formula: dij := 2R sin(sij/(2R)) where sij is the
spherical distance between city i and city j and R = 39.59 (hundreds of miles) is
the Earth radius (see Pekalska and Duin (2005, Thm. 3.23)). The initial matrix D0
consists of the squared distances d²ij. It is observed that the matrix (−JD0J) has
15 positive eigenvalues, 14 negative eigenvalues and 1 zero eigenvalue. Therefore,
the original spherical distances are not accurate and contain large errors, and
FITS is needed to correct them. We plot the resulting coordinates of the 30 cities
in Figure 2.3. One of the remarkable features is that FITS is
able to recover the Earth radius with high accuracy R = 39.5916.
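The conversion formula dij = 2R sin(sij/(2R)) used above can be sketched as follows (numpy; the function name is ours):

```python
import numpy as np

R = 39.59  # Earth radius in hundreds of miles, as used in the text

def chordal_from_spherical(s, R):
    """Convert a great-circle (spherical) distance s to the straight-line
    (chordal) Euclidean distance via d = 2 R sin(s / (2R))."""
    return 2.0 * R * np.sin(s / (2.0 * R))

# Two antipodal points: spherical distance pi * R, chordal distance 2R.
s = np.pi * R
print(round(chordal_from_spherical(s, R) / (2 * R), 6))  # -> 1.0
```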
We now assess the quality of the spherical embedding in Figure 2.3 through a
Procrustes analysis introduced in Section 1.1.3. The optimal objective f in (1.14)
is f = 0.2782. This small error is probably due to the fact that the radius used
in HA30 is 39.59 in contrast to our 39.5916. This small value also confirms the
good quality of the embedding from FITS when compared to the solution in HA30.
(E4) Circle fitting. The problem of circle fitting has recently been studied in
Figure 2.3: Spherical embedding of HA30 data set with radius R = 39.5916.
Beck and Pan (2012), where more references on the topic can be found. Let points
{ai}ni=1 with ai ∈ IRr be given. The problem is to find a circle with center x ∈ IRr
and radius R such that the points stay as close to the circle as possible. Two
criteria were considered in Beck and Pan (2012):
min_{x, R} f1 = ∑_{i=1}^{n} (‖ai − x‖ − R)² (2.59)

and

min_{x, R} f2 = ∑_{i=1}^{n} (‖ai − x‖² − R²)². (2.60)
Problem (2.60) is much easier to solve than (2.59). But the key numerical message
in Beck and Pan (2012) is that (2.59) may produce far better geometric fitting
than (2.60). This was demonstrated through the following example (Beck and Pan,
2012, Example 5.3):
a1 = (1, 9)T , a2 = (2, 7)T , a3 = (5, 8)T , a4 = (7, 7)T , a5 = (9, 5)T , a6 = (3, 7)T .
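The two criteria (2.59) and (2.60) are straightforward to evaluate on these points; the following numpy sketch (our own function names, with an arbitrary trial center and radius chosen for illustration) shows both:

```python
import numpy as np

# The six data points of Beck and Pan (2012, Example 5.3); columns = points
A = np.array([[1.0, 2.0, 5.0, 7.0, 9.0, 3.0],
              [9.0, 7.0, 8.0, 7.0, 5.0, 7.0]])

def f1(x, R):
    """Geometric criterion (2.59): squared distance-to-circle errors."""
    return np.sum((np.linalg.norm(A - x[:, None], axis=0) - R) ** 2)

def f2(x, R):
    """Algebraic criterion (2.60): squared-distance residuals."""
    return np.sum((np.sum((A - x[:, None]) ** 2, axis=0) - R ** 2) ** 2)

# Evaluate both at an arbitrary trial center/radius (illustrative values only).
x, R = np.array([4.5, 7.0]), 3.5
print(round(f1(x, R), 4), round(f2(x, R), 4))
```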
Model (2.60) produces a very small circle, not truly reflecting the geometric layout
of the data.
The Euclidean distance embedding studied in this paper provides an alternative
model. Let D0ij = ‖ai − aj‖² for i, j = 1, . . . , n, with n = 6 and r = 2 in this
example. Let Y be the final distance matrix from FITS and the embedding points in X
be obtained from (1.12). The first 6 columns {xi}_{i=1}^{6} of X correspond to the
known points {ai}_{i=1}^{6}. The last column x7 is the center. The points
{xi}_{i=1}^{6} are on the circle centered at x7 with radius R (R = √Y_{1(n+1)}).
We need to match {xi}_{i=1}^{6} to {ai}_{i=1}^{6} so that the known points stay as
close to the circle as possible. This can be done through the orthogonal Procrustes
problem (1.14).
We first centralize both sets of points. Let

a0 := (1/n) ∑_{i=1}^{n} ai,  ai := ai − a0,  and  x0 := (1/n) ∑_{i=1}^{n} xi,  xi := xi − x0,  i = 1, . . . , n.
Let A be the matrix whose columns are ai and Z the matrix whose columns are xi,
for i = 1, . . . , n. Solve the orthogonal Procrustes problem (1.14) to get P = UV^T. The
resulting points are
zi := Pxi + a0, i = 1, . . . , n
and the new center, denoted by zn+1, is
zn+1 := P (xn+1 − x0) + a0.
It can be verified that the points {zi}ni=1 are on the circle centered at zn+1 with
radius R. That is
‖zi − zn+1‖2 = ‖P (xi − xn+1)‖2 = ‖xi − xn+1‖2 = R2.
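The centralization and Procrustes steps above can be sketched as follows. The function name and the column-wise storage convention are our own choices; (1.14) is realized through the standard SVD-based solution P = UV^T:

```python
import numpy as np

def map_embedding(X, A):
    """Align embedding points to known points via orthogonal Procrustes.

    X: (r, n+1) array whose first n columns are the embedded points x_i
       and whose last column is the embedded circle center x_{n+1}.
    A: (r, n) array of the known points a_i.
    Returns (Z, z_center): the aligned points z_i = P(x_i - x0) + a0 and
    the aligned center z_{n+1} = P(x_{n+1} - x0) + a0.
    """
    r, n = A.shape
    a0 = A.mean(axis=1, keepdims=True)           # centroid of the a_i
    x0 = X[:, :n].mean(axis=1, keepdims=True)    # centroid of the x_i
    Abar = A - a0
    Zbar = X[:, :n] - x0
    # Orthogonal Procrustes (1.14): P = U V^T from the SVD of Abar Zbar^T.
    U, _, Vt = np.linalg.svd(Abar @ Zbar.T)
    P = U @ Vt
    Z = P @ Zbar + a0
    z_center = P @ (X[:, n:] - x0) + a0
    return Z, z_center
```

Because P is orthogonal, ‖z_i − z_{n+1}‖ = ‖x_i − x_{n+1}‖, so the aligned points stay on a circle of the same radius R, as verified above.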
This circle is the best circle from model (2.2) and is plotted in Figure 2.4 with the
Figure 2.4: Circle fitting of 6 points with R = 6.5673. The known points and
their corresponding points on the circle by FITS are linked by a line.
pair of points {ai, zi} linked by a line. When the obtained center x = zn+1
and R are substituted into (2.59), we get f1 = 3.6789, not far from the value
f1 = 3.1724 reported in Beck and Pan (2012). The circle fits the original data
reasonably well. The model used by Beck and Pan (2012) is nonconvex and the
resulting f1 depends heavily on a good starting point, whereas our algorithm
solves a sequence of convex relaxations that do not rely on one. We complete this
example by noting a common feature between our model (2.2) and the squared
least-squares model (2.60): both use squared distances. The key difference is
that (2.2) uses all available pairwise squared distances among the ai, rather
than only those from the ai to the center x as in (2.60).
(E5) Synthetic data. In this part we generate random data to test the per-
formance of Algorithm 7 with growing dimension. Two types of data are used,
as shown in Figure 2.5. The first contains points randomly distributed on
a sphere and the second contains points on a circle, both of radius 1; note that
the radius is not given to our algorithm as input. The noise in the distance
information is generated following a standard framework:

D̂ij = Dij × |1 + nf × randn|,  i, j = 1, . . . , n,    (2.61)
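A minimal sketch of the noise model (2.61) is given below. Since the formula does not specify how symmetry of the measurement matrix is preserved, this sketch perturbs each pair once and symmetrizes; that choice, and the function name, are our own assumptions:

```python
import numpy as np

def noisy_edm(X, nf, rng=None):
    """Generate noisy pairwise distance measurements following (2.61).

    X  : (n, r) array of ground-truth positions.
    nf : noise factor in [0, 1].
    Returns a symmetric matrix of perturbed distances with zero diagonal;
    the noise is drawn once per pair {i, j} so the output stays symmetric
    (an assumption of this sketch).
    """
    if rng is None:
        rng = np.random.default_rng()
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt(np.sum(diff ** 2, axis=2))        # true pairwise distances
    noise = np.abs(1.0 + nf * rng.standard_normal((n, n)))
    noise = np.triu(noise, 1)                     # one draw per pair
    noise = noise + noise.T                       # symmetrize
    return D * noise
```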
(a) Points randomly distributed on a sphere
(b) Points randomly distributed on a circle
Figure 2.5: Synthetic data with n = 200 points randomly distributed
where Dij is the true Euclidean distance between points xi and xj, 0 ≤ nf ≤ 1
is the noise factor, and randn is a standard normal random variable. The accuracy
of the estimated positions is measured by the root mean square distance (RMSD)

RMSD := (1/√n) ( ∑_{i=1}^{n} ‖x̂i − xi‖² )^{1/2},    (2.62)

where x̂i is the estimated position and xi the ground-truth position. Note that
without knowing some point positions in advance, only relative positions can be
found. To calculate the RMSD, we apply the Procrustes analysis (1.14) to all
estimated points to obtain global coordinates.
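The RMSD computation, including the Procrustes alignment (1.14) used to recover global coordinates, can be sketched as follows (the function name is ours, and positions are assumed to be stored row-wise):

```python
import numpy as np

def rmsd(X_est, X_true):
    """RMSD (2.62) after aligning the estimated points to the ground truth
    by an orthogonal Procrustes transform, since only relative positions
    are recoverable.  Both arguments are (n, r) arrays of positions.
    """
    n = X_true.shape[0]
    ce, ct = X_est.mean(axis=0), X_true.mean(axis=0)
    E, T = X_est - ce, X_true - ct                # centralize both sets
    U, _, Vt = np.linalg.svd(T.T @ E)
    P = U @ Vt                                    # optimal orthogonal map
    aligned = E @ P.T + ct                        # mapped estimates
    return np.sqrt(np.sum((aligned - X_true) ** 2)) / np.sqrt(n)
```

By construction, an estimate that differs from the truth only by a rigid motion gives an RMSD of (numerically) zero.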
To test the computational efficiency of Algorithm 7, data sets with the number of
points ranging from 100 to 1000 are used. The noise factor nf is set to 1. The time
and accuracy results for the sphere data are listed in Table 2.3.
The first column is the number of data points generated. The second column is
the number of convex subproblems solved in step (2) of Algorithm 7. The third
column contains the total number of iterations of the semismooth Newton method
(2.44) across all subproblems. We can see that as the problem scale grows, the
RMSD decreases, since the radius of the sphere remains 1 and the density of points
Table 2.3: Execution Time and Quality Results on Sphere Data
n      Subproblems  Total iterations  RMSD      Time (s)
100    5            25                4.05E-02    5.80
200    5            26                2.89E-02   11.90
300    5            31                2.39E-02   23.05
400    5            34                2.09E-02   41.14
500    5            35                1.91E-02   57.75
600    5            38                1.75E-02   86.01
700    5            40                1.64E-02  103.57
800    5            41                1.56E-02  144.18
900    5            43                1.47E-02  170.94
1000   5            44                1.41E-02  209.07
is increasing. For small-scale problems, our method takes only seconds to reach a
result with RMSD around 10−2. For the problem with 1000 points, Algorithm 7 takes
around 3 minutes. Similar observations hold for the circle data shown in Table 2.4.
Table 2.4: Execution Time and Quality Results on Circle Data
n      Subproblems  Total iterations  RMSD      Time (s)
100    5            24                2.55E-02    5.13
200    5            28                2.06E-02   11.97
300    5            33                1.69E-02   22.15
400    5            35                1.45E-02   38.11
500    5            37                1.32E-02   58.39
600    5            40                1.20E-02   77.61
700    5            40                1.13E-02  100.70
800    5            40                1.07E-02  124.75
900    5            41                1.04E-02  144.65
1000   5            44                1.01E-02  186.20
To test the influence of the noise factor nf on the accuracy of FITS, we vary nf
from 0.1 to 0.5; the resulting RMSD on the sphere and circle data is depicted in
Figure 2.6. We can see that our algorithm achieves a better RMSD on the circle data
than on the sphere data, and the RMSD on both data sets stays below 20% of the
radius even when the noise factor is as large as 0.5.
Figure 2.6: Variation of RMSD with the noise factor nf
2.6 Summary
In this section, we proposed a matrix optimization approach to the problem of
Euclidean distance embedding on a sphere. We applied the majorized penalty
method of Gao and Sun (2010) to the resulting matrix problem. A key feature we
exploited is that all subproblems to be solved share a common set of Euclidean
distance constraints with a simple distance objective function. We showed that
such problems can be efficiently solved by the Newton-CG method, which is proved
to be quadratically convergent under constraint nondegeneracy.
Constraint nondegeneracy is a difficult constraint qualification to analyze. We
proved it under a weak condition for our problem. We illustrated in Example
2.1 that this condition holds everywhere except at one point (t = 0), so that
constraint nondegeneracy is satisfied for t ≠ 0. For the case t = 0, we can verify
(by checking Lemma 2.9) that constraint nondegeneracy also holds. This motivates
the open question of whether constraint nondegeneracy holds under a weaker
condition.
In the numerical part, we used 4 existing embedding problems on a sphere to
demonstrate the variety of applications to which the developed algorithm can be
applied. The first two examples are from classical MDS, and new features (the wheel
representation for E1 and new clusters for E2) are revealed. For E3, despite the
large noise in the initial distance matrix, our method is remarkably able to recover
the Earth radius and to produce an accurate mapping of the 30 global cities on the
sphere. The last example differs from the others in that its inputs are the
coordinates of known points rather than a distance matrix; finding the best fitting
circle requires localizing its center and radius. The resulting visualizations are
very satisfactory for all the examples. Since those examples are of small scale, our
method took less than 1 second to find the optimal embedding, and we therefore omit
timing details.