ABSTRACT
This thesis is an accumulation of work regarding a class of constrained Euclidean
Distance Matrix (EDM) based optimization models and corresponding numerical
approaches. EDM-based optimization is powerful for processing distance information,
which appears in diverse applications arising from a wide range of fields, and from
which the motivation for this work comes. Those problems usually involve minimizing
the error of distance measurements as well as satisfying some Euclidean distance
constraints, which may present an enormous challenge to existing algorithms. In this
thesis, we focus on problems with two different types of constraints. The first consists
of spherical constraints, which come from spherical data representation; the other has
a large number of bound constraints, which come from wireless sensor network
localization.
For spherical data representation, we reformulate the problem as a Euclidean
distance matrix optimization problem with a low-rank constraint. We then propose
an iterative algorithm that uses a quadratically convergent Newton-CG method
at each step. We study fundamental issues including constraint nondegeneracy
and the nonsingularity of the generalized Jacobian that ensure the quadratic
convergence of the Newton method. We use some classic examples from spherical
multidimensional scaling to demonstrate the flexibility of the algorithm in
incorporating various constraints.
For wireless sensor network localization, we set up a convex optimization model
using EDM which integrates connectivity information as lower and upper bounds
on the elements of the EDM, resulting in an EDM-based localization scheme that
possesses both efficiency and robustness in dealing with flip ambiguity under high
levels of noise in the distance measurements and irregular topology of the network
of moderate size in question.
Dinesh Gupta, Research Scholar, OPJS University, Rajasthan.
Dr. Amit Jain, Professor, GITM, Gurgaon.
EUCLIDEAN DISTANCE MATRIX (EDM) BASED
OPTIMIZATION MODELS
JASC: Journal of Applied Science and Computations
Volume VI, Issue VI, JUNE/2019
ISSN NO: 1076-5131
Page No:3415
Introduction
In this thesis, we focus on designing algorithms for a class of Euclidean Distance
Matrix (EDM) based optimization problems. In particular, we are interested in
EDM-based optimization problems with two types of constraints: spherical con-
straints and bound constraints. Let {x1, . . . ,xn} be n points in IRr, where r > 0
is known as the embedding dimension of those points. The primary information
that is available for those points is the measured Euclidean distances among them
dij ≈ ‖xi − xj‖, for some pairs (xi,xj), (1.1)
which may be incomplete or noisy, or both. The aim of EDM-based optimization is
to recover the (relative or global) coordinates of these points in a target space IRr
purely based on those available distances. Such problems are usually encountered
with cone constraints and rank constraints, which bring nonsmoothness and
nonconvexity to the optimization model, so algorithms need to be designed to
solve the problems with specific constraints accurately and efficiently.
This chapter is split into three sections. In Section 1.1, we cover the background on
the Euclidean Distance Matrix, which is the fundamental concept of our modelling
process and algorithm design. In Section 1.2, we give an introduction to the
semismooth Newton method, which is our main approach to dealing with spherical
constraints. In Section 1.3, we cover a novel convergent Alternating Direction
Method of Multipliers (ADMM) which allows us to deal with a large number of
bound constraints in conic programming.
1.1 Background on Euclidean Distance Matrix
Let Sn denote the space of n× n symmetric matrices equipped with the standard
inner product 〈A,B〉 = Tr(AB) for A,B ∈ Sn. Let ‖ · ‖ denote the induced
Frobenius norm. Let Sn+ denote the cone of positive semidefinite matrices in Sn
(often abbreviated as X ⪰ 0 for X ∈ Sn+). The so-called hollow subspace Snh is
defined by (“:=” means define)
Snh := {A ∈ Sn : diag(A) = 0} ,
where diag(A) is the vector formed by the diagonal elements of A. For subsets α,
β of {1, . . . , n}, denote Aαβ as the submatrix of A indexed by α and β (α for rows
and β for columns). Aα denotes the submatrix consisting of columns of A indexed
by α, and |α| is the cardinality of α. Throughout the thesis, vectors are treated
as column vectors. For example, xT is a row vector for x ∈ IRn. The vector e is
the vector of all ones and I denotes the identity matrix, whose dimension is clear
from the context. When it is necessary, we use In to indicate its dimension n.
Let ei denote the ith unit vector, which is the ith column of I. We also need the
following two important linear transformations.
The first one is Householder transformations, which are orthogonal transforma-
tions that describe reflections about hyperplanes containing the origin. Let
v := [1, . . . , 1, 1 + √n]T = e + √n en. Then

Q = In − (2/(vTv)) vvT

is the Householder transformation that maps e ∈ IRn to the vector
[0, . . . , 0, −√n]T ∈ IRn.
The second one is the geometric centering transformation, which centers a set of
points at their geometric center. Consider a collection of n points in IRr, ascribed
to the columns of the matrix X ∈ IRr×n, X = [x1,x2, . . . ,xn], xi ∈ IRr. The centroid
is the mean of all the points
xc = (1/n) ∑_{i=1}^{n} xi = (1/n) Xe.

By subtracting this vector from all the points in the set, we have the set of
centralized points

Xc = X − xc eT = X (In − (1/n) eeT).

Then the geometric centering transformation is defined as

J := In − (1/n) eeT. (1.2)

We often use the following properties:

J² = J,  Q² = I  and  J = Q [ In−1 0 ; 0 0 ] Q. (1.3)
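These two transformations and the identities in (1.3) can be checked numerically. The following sketch (using numpy, with n = 5 chosen arbitrarily) is illustrative only and is not part of the original text:

```python
import numpy as np

n = 5
e = np.ones((n, 1))

# Householder vector v = e + sqrt(n) * e_n, and Q = I - 2 v v^T / (v^T v)
v = e.copy()
v[-1, 0] += np.sqrt(n)
Q = np.eye(n) - 2.0 * (v @ v.T) / (v.T @ v)

# Geometric centering matrix J = I - (1/n) e e^T
J = np.eye(n) - (e @ e.T) / n

# Properties (1.3): J^2 = J, Q^2 = I, and J = Q diag(I_{n-1}, 0) Q
Dblk = np.zeros((n, n))
Dblk[:n - 1, :n - 1] = np.eye(n - 1)
assert np.allclose(J @ J, J)
assert np.allclose(Q @ Q, np.eye(n))
assert np.allclose(J, Q @ Dblk @ Q)

# Q maps e to [0, ..., 0, -sqrt(n)]^T
print(np.round((Q @ e).ravel(), 6))
```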
1.1.1 Squared Euclidean Distance Matrix
A matrix D is a (squared) EDM if D ∈ Snh and there exist points {x1, . . . ,xn} in
IRr such that Dij = ‖xi − xj‖2 for i, j = 1, . . . , n. IRr is often referred to as the
embedding space and r is the embedding dimension when it is the smallest such
r. Consider the following example of EDM for the case n = 3.
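As a small numerical illustration of this definition (assuming numpy; the three points below are our own choice, not the thesis's original example), a squared EDM for n = 3 can be built directly from coordinates:

```python
import numpy as np

# Three illustrative points in IR^2 (hypothetical data, chosen for the example)
X = np.array([[0.0, 1.0, 0.0],
              [0.0, 0.0, 2.0]])  # columns are x1, x2, x3

# Build the squared EDM: D_ij = ||x_i - x_j||^2
G = X.T @ X                      # Gram matrix
d = np.diag(G)
D = d[:, None] - 2.0 * G + d[None, :]

print(D)
# D is hollow (zero diagonal) and symmetric, as the definition requires
assert np.allclose(np.diag(D), 0) and np.allclose(D, D.T)
```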
1.1.2 Characterizations of EDM
It is well-known that a matrix D ∈ Sn is an EDM if and only if
D ∈ Snh and J(−D)J ⪰ 0. (1.4)
The origin of this result can be traced back to Schoenberg (1935) and an inde-
pendent work by Young and Householder (1938). See also Gower (1985) for a
nice derivation of (1.4). Moreover, the corresponding embedding dimension is
r = rank(JDJ).
From the definition in (1.2), it is noted that the matrix J , when treated as an
operator, is the orthogonal projection onto the subspace e⊥ := {x ∈ IRn : eTx =
0}. Characterization (1.4) simply means that D is an EDM if and only if D ∈ Snh
and D is negative semidefinite on the subspace e⊥:
−D ∈ Kn+ := {A ∈ Sn : xTAx ≥ 0, ∀ x ∈ e⊥}.
It follows that Kn+ is a closed convex cone (known as the almost positive
semidefinite cone). This opens the door to using conic programming in dealing
with distance-related problems. Let ΠKn+(D) denote the orthogonal projection of D ∈ Sn
onto Kn+:
ΠKn+(D) := arg min ‖D − Y ‖ s.t. Y ∈ Kn+.
A nice property is that this projection can be done through the orthogonal pro-
jection onto the positive semidefinite cone Sn+ and is due to Gaffke and Mathar
(1989)
ΠKn+(D) = D + ΠSn+(−JDJ) ∀ D ∈ Sn. (1.5)
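Formula (1.5) reduces the projection onto Kn+ to a single projection onto the positive semidefinite cone, which is computable by one eigenvalue decomposition. A minimal numpy sketch (the function names are ours, not from the thesis):

```python
import numpy as np

def proj_psd(A):
    """Orthogonal projection onto the positive semidefinite cone S^n_+."""
    w, V = np.linalg.eigh((A + A.T) / 2)
    return (V * np.maximum(w, 0)) @ V.T

def proj_Kn_plus(D):
    """Projection onto K^n_+ via the Gaffke-Mathar formula (1.5)."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return D + proj_psd(-J @ D @ J)

# Sanity check on a random symmetric matrix: the projection P satisfies
# x^T P x >= 0 for all x orthogonal to e, i.e. J P J is positive semidefinite.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
A = A + A.T
P = proj_Kn_plus(A)
n = A.shape[0]
J = np.eye(n) - np.ones((n, n)) / n
w = np.linalg.eigvalsh(J @ P @ J)
print(w.min())  # should be nonnegative up to roundoff
```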
The other formula for computing ΠKn+ is due to Hayden and Wells (1988, Thm.
2.1):

D ∈ Kn+ ⇐⇒ QDQ = [ D̄ d ; dT d0 ] with D̄ ∈ Sn−1+, d ∈ IRn−1, d0 ∈ IR, (1.6)

and

ΠKn+(D) = Q [ ΠSn−1+(D̄) d ; dT d0 ] Q, ∀ D ∈ Sn. (1.7)

Because of (1.7), the cone Kn+ can be described as follows:

Kn+ = { Q [ Z z ; zT z0 ] Q : Z ∈ Sn−1+, z ∈ IRn−1, z0 ∈ IR }. (1.8)

Its polar cone (Kn+)◦ is then given by

(Kn+)◦ = { Q [ Z 0 ; 0 0 ] Q : Z ∈ −Sn−1+ }. (1.9)
We will use (1.5) for the implementation of our algorithm and (1.8) and (1.9) for
theoretical analysis.
1.1.3 Coordinate recovery from an EDM
In this section, we introduce the process for recovering the coordinates of points
from an EDM. If D is an EDM, from the definition introduced in Section 1.1.1,

Dij = ‖xi − xj‖2 = (xi − xj)T (xi − xj) = xTi xi − 2xTi xj + xTj xj.

Let X ∈ IRr×n, X = [x1,x2, . . . ,xn] be a collection of n points; then

D = e diag(XTX)T − 2XTX + diag(XTX)eT , (1.10)
which gives an explicit relation between the coordinates of the points in X and the EDM D.
Define the matrix

G := XTX, (1.11)

which is known as the Gram matrix. From Gower (1982), the set of coordinates
can be obtained through the decomposition

−(1/2) JDJ = XTX. (1.12)

We note that the decomposition is possible because the matrix (−JDJ) is positive
semidefinite according to (1.4).
The results in (1.4) and (1.12) hold when D is a true EDM. What should one do
if D is not a true EDM? The most popular method is classical Multidimensional
Scaling (cMDS) (Cox and Cox, 2000; Borg and Groenen, 2005), which simply
computes the nearest positive semidefinite matrix to (−JDJ), obtained
through the following optimization:

minY ‖J(Y − D)J‖2 s.t. −JY J ⪰ 0 and Y ∈ Snh. (1.13)

The optimal solution is just the orthogonal projection of (−JDJ) onto Sn+ and is
denoted by ΠSn+(−JDJ). cMDS then uses this projection in place of (−JDJ) in
(1.12) to get the embedding points in X. This method is also known as principal
coordinate analysis by Gower (1966). We summarize the cMDS algorithm as
Algorithm 1. We need to point out here that cMDS works well when D is close to
a true EDM. Otherwise it may perform poorly in terms of embedding quality due
to the rank of the Gram matrix being too high.
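The cMDS procedure just described can be sketched as follows (a numpy sketch of the standard steps, since Algorithm 1 itself is stated elsewhere; `cmds` is our own name):

```python
import numpy as np

def cmds(D, r):
    """Classical MDS sketch: embed a (possibly non-EDM) dissimilarity
    matrix D into IR^r via the projection of -J D J / 2 onto S^n_+."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ D @ J               # doubly centered matrix, cf. (1.12)
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:r]      # r largest eigenvalues
    w_r = np.maximum(w[idx], 0)        # drop any negative spectrum
    return (V[:, idx] * np.sqrt(w_r)).T  # r x n coordinate matrix X

# Round trip: a true EDM should be reproduced exactly (up to a rigid motion).
X0 = np.array([[0.0, 3.0, 0.0, 3.0],
               [0.0, 0.0, 4.0, 4.0]])
G = X0.T @ X0
d = np.diag(G)
D = d[:, None] - 2.0 * G + d[None, :]
X = cmds(D, 2)
G2 = X.T @ X
d2 = np.diag(G2)
D2 = d2[:, None] - 2.0 * G2 + d2[None, :]
assert np.allclose(D, D2)
```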
Best Euclidean distance embedding on a sphere
In this chapter, we mainly discuss a class of EDM-based optimization problems with
spherical constraints for data representation on a sphere of unknown radius. This
problem arises in various disciplines such as Statistics (spatial data representation),
Psychology (constrained multidimensional scaling), and Computer Science
(machine learning and pattern recognition). The best representation often needs
to minimize a distance function of the data on a sphere as well as to satisfy some
Euclidean distance constraints. As discussed in Section 1.1.4, those spherical and
Euclidean distance constraints present an enormous challenge to existing
algorithms. In this chapter, we introduce a reformulation of the problem as an
EDM-based optimization problem with a low-rank constraint. We then propose an
iterative algorithm that uses a quadratically convergent Newton-CG method at
each step. We study fundamental issues including constraint nondegeneracy and
the nonsingularity of the generalized Jacobian that ensure the quadratic convergence
of the Newton method. We use some classic examples from spherical multidimensional
scaling to demonstrate the flexibility of the algorithm in incorporating
various constraints.
The section is organized as follows. In Section 2.1, we give a background and
literature review of the spherical data representation problem. In Section 2.2, we
first argue that when the EDM is used to formulate the problem, it is necessary
to introduce a new point to represent the center of the sphere. This is due to a
special property arising from embedding an EDM. The algorithmic framework that
we use for the obtained non-convex matrix optimization problem is closely related
to the majorized penalty method of Gao and Sun (2010) for the nearest low-rank
correlation matrix problem. One of the key elements in this type of method is that
the subproblems are convex. Those convex problems are structurally similar to
a convex relaxation of the original matrix optimization problem and they all can
be solved by a quadratically convergent Newton-CG method. We establish that
this is the case for our problem by studying the challenging issue of constraint
nondegeneracy, which further ensures the nonsingularity of generalized Jacobian
used by the Newton-CG method. Those results can be found in Section 2.3 and
ensure that the extension of the majorization method of Gao and Sun (2010)
to our problem is complete. The algorithm is presented in Section 2.4 and its
key convergent results are stated without detailed proofs as they can be proved
similarly as in Gao and Sun (2010). Section 2.5 aims to demonstrate a variety
of applications from classical MDS to the circle fitting problem. The numerical
performance is highly satisfactory with those applications.
2.1 Introduction to spherical data representation
The problem that we are mainly concerned with is placing n points {x1, . . . ,xn}
in the best way on a sphere in IRr. The primary information that we use is an
incomplete or complete set of pairwise Euclidean distances (often noisy) among
the n points. In such a setting, IRr is often a low-dimensional space (e.g., r takes
2 or 3 for data visualization) and is known as the embedding space. The center
of the sphere is unknown. For some applications, the center can be put at origin
in IRr. Furthermore, the radius of the sphere is also unknown. In our matrix
optimization formulation of the problem, we treat both the center and the radius
as unknown variables. We develop a fast numerical method for this problem and
present a few interesting applications taken from the existing literature.
The problem described above has long appeared in the constrained Multi-Dimensional
Scaling (MDS) when r ≤ 3, which is mainly for the purpose of data visualization,
see Cox and Cox (2000, Sect. 4.6) and Borg and Groenen (2005, Sect. 10.3) for
more details. In particular, it is known as the spherical MDS when r = 3 and the
circular MDS when r = 2. Most numerical methods in this area take advantage
of r being 2 or 3. For example, two of the earliest circular MDS methods were by
Borg and Lingoes (1980) and Lee and Bentler (1980), where they introduced a new point
x0 ∈ IRr as the center of the sphere (i.e., circles in their case) and further forced
the following constraints to hold:
D01 = D02 = · · · = D0n.
Here D0j = ‖x0−xj‖, j = 1, . . . , n are the Euclidean distances between the center
x0 and the other n points. In their models, the variables are the coordinates of
the (n + 1) points in IRr. In Borg and Lingoes (1980), the optimal criterion was
a stress function widely used in MDS literature (see Borg and Groenen (2005,
Chp. 3)), whereas Lee and Bentler (1980) used a least square loss function as its
optimal criterion.
In the spherical MDS of Cox and Cox (1991), Cox and Cox placed the center of
the sphere at origin and represented the n points by their spherical coordinates.
Moreover, they also argued for the Euclidean distance to be used over the seem-
ingly more appropriate geodesic distance on the sphere. This is particularly the
case when the order of the distances among the n points is more important than
the magnitude of their actual distances. For the accurate relationship between
Euclidean distance and the geodesic distance on a sphere, see Pekalska and Duin
(2005, Thm. 3.23), which is credited to Schoenberg (1937). A recent method
known as MDS on a quadratic surface (MDS-Q) was proposed by De Leeuw and
Mair (2009), where geodesic distances were used. As noted in De Leeuw and Mair
(2009, p. 12), ”geodesic MDS-Q, however, seems limited for now to spheres in any
dimension, with the possible exception of ellipses and parabolas in IR2”. For the
spherical case, MDS-Q places the center at origin and the variables are the radius
and the coordinates of the n points on the sphere. The Euclidean distances were
then converted to the corresponding geodesic distances. The optimal criterion is
a weighted least square loss function.
When the center of the sphere is placed at origin, any point on the sphere satisfies
the spherical constraint of the type ‖x‖ = R, where x ∈ IRr and R is the radius.
Optimization with spherical constraints has recently attracted much attention of
researchers, see, e.g., Malick (2007); Ling et al. (2010); Gao (2010); Gao and Sun
(2010); Li and Qi (2011); Zhou et al. (2012) and the references therein. Such
a problem can be cast as a more general optimization problem over the Stiefel
manifold (Wen and Yin, 2013; Jiang and Dai, 2014). One important example is
the nearest low-rank correlation matrix problem, where the unit diagonal of the
correlation matrix yields the spherical constraints (Gao and Sun, 2010; Li and Qi,
2011; Wen and Yin, 2013; Jiang and Dai, 2014). It is noted that the sequential
second-order methods in Gao and Sun (2010); Li and Qi (2011) as well as the
feasibility-preserving methods in Wen and Yin (2013); Jiang and Dai (2014) all
rely on the fact that the radius is known (e.g., R = 1). This is in contrast to our
problem where R is a variable.
2.2 EDM-based optimization formulation
The available information for us to find n points {x1, . . . ,xn} embedded on a
sphere in IRr is the set of approximate (squared) Euclidean distances among the
n points:
D0ij ≈ ‖xi − xj‖2, i, j = 1, . . . , n.
Denote the center of the sphere by xn+1 (the (n+ 1)th point) and its radius by R.
Since the n points are placed on the sphere, we must have
‖xj − xn+1‖ = R, j = 1, . . . , n.
Although we do not know the exact magnitude of R, we can be sure that twice
the radius cannot be bigger than the diameter of the data set:
2R ≤ dmax := max_{i,j} √(D0ij).
We therefore define the approximate distance matrix D ∈ Sn+1 by (only upper
part of D is defined)
Dij =
(1/4) dmax²  if i = 1, . . . , n and j = n + 1,
D0ij  if 1 ≤ i < j ≤ n,
0  if i = j.
(2.1)
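The construction in (2.1) can be sketched in code as follows (a numpy sketch; the function name and the small sample D0 are ours, for illustration only):

```python
import numpy as np

def augmented_D(D0):
    """Build the (n+1) x (n+1) matrix D of (2.1) from the squared
    dissimilarities D0 among the n points.  The extra row/column
    corresponds to the unknown center, with (d_max / 2)^2 as the
    initial guess for the squared radius."""
    n = D0.shape[0]
    dmax = np.sqrt(D0.max())
    D = np.zeros((n + 1, n + 1))
    D[:n, :n] = D0                       # original squared dissimilarities
    D[:n, n] = D[n, :n] = 0.25 * dmax**2  # center-to-point entries
    return D

D0 = np.array([[0.0, 4.0, 2.0],
               [4.0, 0.0, 2.0],
               [2.0, 2.0, 0.0]])
D = augmented_D(D0)
print(D[0, 3])  # (d_max / 2)^2 = (2 / 2)^2 = 1.0
```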
The elements in D are approximate Euclidean distances among the (n+ 1) points
{x1, . . . ,xn+1}. But D may not be a true EDM. Our purpose is to find the nearest
EDM Y to D such that the embedding dimension of Y is r and its embedding
points {x1, . . . ,xn} are on a sphere centered at xn+1. The resulting matrix opti-
mization model is then given by
min_{Y ∈ Sn+1} (1/2)‖Y − D‖²
s.t. Y ∈ Sn+1h, −Y ∈ Kn+1+, rank(JY J) ≤ r,
Y1(n+1) = Yj(n+1), j = 2, . . . , n.
(2.2)
Once we find the nearest EDM Y, i.e., the one whose total deviation from D is
smallest, we can combine it with the classical MDS Algorithm 1 to get the positions
of the n embedding points.
Problem (2.2) is always feasible (e.g., the zero matrix is feasible). The feasible
region is closed and the objective function is coercive. Let Y be its optimal so-
lution. The first group of constraints in (2.2) implies that Y is an EDM with an
embedding dimension not greater than r. If r < n (i.e., rank(JY J) < n), the
problem is nonconvex. If r = n, then we can drop the rank constraint so that the
problem is convex. This is due to the fact that any EDM of size (n+ 1)× (n+ 1)
has an embedding dimension not greater than (n + 1 − 1) = n. One can easily
check that 0 is always an eigenvalue of JY J and e is the corresponding eigenvec-
tor. Therefore, the rank constraint is automatically satisfied if r = n. The second
group of constraints in (2.2) means that the distances from xi, i = 1, . . . , n to
xn+1 are equal. Hence, {x1, . . . ,xn} lie on a sphere centered at xn+1. We call the
constraints Y1(n+1) = Yj(n+1), j = 2, . . . , n spherical constraints and we note that
they are linear. This is in contrast to the nonlinear formulation of the spherical
constraints in the previous studies (Borg and Lingoes, 1980; Lee and Bentler, 1980;
Cox and Cox, 1991; De Leeuw and Mair, 2009).
Regarding model (2.2), we have the following two remarks.
Remark 2.1. The idea of introducing a variable representing the center (i.e., one
more dimension in our formulation) is similar to that of Borg and Lingoes (1980);
Lee and Bentler (1980), whose main purpose was for the case r = 2 and the
variables of the optimization problems are the coordinates of the points concerned.
Our model is more general for arbitrary r and is conducive to (second-order)
algorithmic development because the spherical constraints are linear. Furthermore,
as introduced in Section 1.1.3, the actual embedding is left out as a separate issue,
which can be done by Algorithm 1, possibly through Procrustes analysis.
Remark 2.2. The following reasoning further justifies why it is necessary to intro-
duce a new point for the center of the sphere. Let D0 denote the true squared
Euclidean distance matrix among n points on a sphere. From Gower (1982), the
decomposition
−(1/2) JD0J = XTX with X ∈ IRr×n, (2.3)
would provide a set of points {xi : i = 1, . . . , n} such that the distances in D0 are
recovered through D0ij = ‖xi − xj‖2. In order for those points to lie on a sphere
centered at origin, it is necessary and sufficient to enforce the constraints
‖x1‖ = ‖x2‖ = · · · = ‖xn‖. (2.4)
We note that
‖xi‖² = eiT (XTX) ei = −(1/2) eiT JD0J ei
= D0ii + (1/(2n)) (eiT D0 e + eT D0 ei) − (eT D0 e)/(2n²)
= (1/(2n)) 〈D0, Ai〉 − (eT D0 e)/(2n²),

where Ai := ei eT + e eiT. The spherical constraints are then equivalent to
〈D0, A1 − Ai〉 = 0, i = 2, · · · , n,
which are linear in the Euclidean distance matrix D0. It seems that there is no
need to introduce a new point to represent the center of the sphere. However, there
is a potential conflict in this seemingly correct argument. We note that there is
an implicit constraint we ignored. In (2.3), the embedding points in X have to
satisfy the centralization condition (because of the projection matrix J)
Xe = 0. (2.5)
A potential conflict is that the constraints (2.4) and (2.5) may contradict each
other. Such a contradiction can be verified through the following example. Let
D0 be from the three points on the unit circle centered at the origin:

x1 = (1, 0)T , x2 = (−1, 0)T , x3 = (0, 1)T .

There exists no X ∈ IR2×3 that satisfies (2.3) (hence (2.5)) and (2.4). If we now
define D by (2.1) and solve problem (2.2), we obtain the following 4 embedding
points:
z1 = (−1, 0.25)T , z2 = (1, 0.25)T , z3 = (0,−0.75)T , z4 = (0, 0.25)T .
The first three points are on the unit circle centered at z4. The original three
points x1, x2 and x3 can be obtained through the simple shift xi = zi − z4 (the
simplest Procrustes analysis). This example shows that it is necessary to introduce
a new point to represent the center in order to remove the potential confliction in
representing the spherical constraints as linear equations.
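The conflict in this example can also be checked numerically. Assuming numpy, the following sketch computes the centered Gram matrix from (2.3) for the three points and shows that the norms required to be equal in (2.4) are not:

```python
import numpy as np

# The three points of the example: on the unit circle centered at the origin.
X = np.array([[1.0, -1.0, 0.0],
              [0.0,  0.0, 1.0]])
n = X.shape[1]

# Their squared EDM D0 and the centered Gram matrix -J D0 J / 2 from (2.3).
G = X.T @ X
d = np.diag(G)
D0 = d[:, None] - 2.0 * G + d[None, :]
J = np.eye(n) - np.ones((n, n)) / n
Gc = -0.5 * J @ D0 @ J

# diag(Gc) gives ||x_i||^2 for the *centralized* points (those satisfy Xe = 0);
# constraint (2.4) would require these to be all equal -- they are not.
norms = np.diag(Gc)
print(np.round(norms, 4))  # -> [1.1111 1.1111 0.4444]
assert not np.allclose(norms, norms[0])
```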
We now reformulate (2.2) in a more conventional format. By replacing Y by (−Y )
(in order to get rid of the minus sign before Kn+1+ ), we obtain
min_{Y ∈ Sn+1} (1/2)‖Y + D‖²
s.t. Y ∈ Sn+1h, Y ∈ Kn+1+, rank(JY J) ≤ r,
Y1(n+1) = Yj(n+1), j = 2, . . . , n.
Define three linear mappings A1 : Sn+1 → IRn+1, A2 : Sn+1 → IRn−1 and A :
Sn+1 → IR2n respectively by
A1(Y ) := diag(Y ), A2(Y ) := (Y1(n+1) − Yj(n+1))_{j=2,...,n} and A(Y ) := (A1(Y ); A2(Y )).
It follows that solving (2.2) is equivalent to solving the following problem:

min_{Y ∈ Sn+1} (1/2)‖Y + D‖²
s.t. A(Y ) = 0, Y ∈ Kn+1+,
rank(JY J) ≤ r.
(2.6)
We note that without the spherical constraints A2(Y ) = 0, the problem reduces
to the problem (1.21) studied in Qi and Yuan (2014). However, with the spherical
constraints, the analysis in Qi and Yuan (2014), especially for the semismooth
Newton-CG method developed in Qi (2013); Qi and Yuan (2014) is not valid any
more because it heavily depends on the simple structure of the diagonal constraints
A1(Y ) = 0. One of our main tasks in this section is to develop more general
analysis that covers the spherical constraints.
Figure 2.1: Comparison between the two circular fittings of Ekman’s 14 color
problem with and without pole constraints: (a) circular fitting without any
constraints (“Ekman Color Example”); (b) circular fitting with pole constraints
(“Wheel Representation of Ekman Color Example”). Points are labeled by their
wavelengths (434–674).
Figure 2.1 (a) (the radius is R = 0.5354) is the resulting circular representation by
FITS with colors appearing on the circle one by one in order of their wavelength.
This figure is similar to De Leeuw and Mair (2009, Fig. 2), where more comments
on this example can be found. A pair of colors (i, j) are said to oppose each
other if their distance equals the diameter of the circle. That is,

Yij = 4Y1(n+1), (2.58)

which means that the squared distance between opposing colors is four times the
squared radius. This type of constraint is called a “pole constraint”. An interesting
feature is that when the first 7 colors are set to oppose the remaining
7 colors, the resulting circular representation appears as a nice wheel, without
the order of the colors having changed; see Figure 2.1(b) (the radius is 0.5310).
Practitioners in Psychology may have new interpretation of such nice representa-
tion. We emphasize that our method can easily include the pole constraints and
other linear constraints without any technical difficulties. We are not aware of any
existing methods that can directly handle those extra constraints.
(E2) Trading globe. The data in this example was first mapped to a sphere (r =
Figure 2.2: Spherical representation for trading data in 1986 between countries
{Argentina, Australia, Brazil, Canada, China, Czechoslovakia, East Germany,
Egypt, France, Hungary, India, Italy, Japan, New Zealand, Poland, Sweden,
UK, USA, USSR, West Germany}.
3) in Cox and Cox (1991) and was recently tested in De Leeuw and Mair (2009).
The data was originally taken from the New Geographical Digest (1986) on which
countries traded with other countries. For 20 countries the main trading partners
are dichotomously scored (1 means trade performed, 0 trade not performed) as
shown in Table 2.2. Based on this dichotomous matrix X the distance matrix D0
is computed using the squared Jaccard coefficient (computed by the Matlab built-
in function pdist(X, ’jaccard’)). The most intuitive MDS approach is to project
the resulting distances to a sphere which gives a “trading globe”.
In Figure 2.2 (R = 0.5428), the countries were projected onto a globe with the
shaded points being on the other side of the sphere. The figure is from the de-
fault viewpoint of Matlab. It is interesting to point out that obvious clusters of
countries can be observed. For example, on the top left is the cluster of Com-
monwealth nations (Australia, Canada, India, and New Zealand). On the bottom
right is the cluster of western allies (UK, US, and West Germany) with Japan
[Table 2.2: Nations’ trading data from the New Geographical Digest (1986); a
20 × 20 dichotomous matrix of main trading partners for the 20 countries listed
in Figure 2.2.]
not far above them. On the north pole is China, which reflects its isolated
trading situation back in 1986. On the backside is the cluster of countries headed
by USSR. On the left backside is the cluster of Brazil, Argentina, and Egypt. We
note that this figure appears different from those in Cox and Cox (1991); De Leeuw
and Mair (2009), mainly because each used a different method on a different
(nonconvex) model of the spherical embedding of the data.
(E3) 3D Map of global cities in HA30 data set. HA30 is a dataset of spherical
distances among 30 global cities, measured in hundreds of miles and selected by
Hartigan (1975) from the World Almanac, 1966. It also provides XYZ coordinates
of those cities. In order to use FITS, we first convert the spherical distances to
Euclidean distances through the formula: dij := 2R sin(sij/(2R)) where sij is the
spherical distance between city i and city j and R = 39.59 (hundreds of miles) is
the Earth radius (see Pekalska and Duin (2005, Thm. 3.23)). The initial matrix D0
consists of the squared distances d²ij. It is observed that the matrix (−JD0J) has
15 positive eigenvalues, 14 negative eigenvalues and 1 zero eigenvalue. Therefore,
the original spherical distances are not accurate and contain large errors, and
FITS is needed to correct them. We plot the resulting coordinates of the 30 cities
in Figure 2.3. One of the remarkable features is that FITS is
able to recover the Earth radius with high accuracy R = 39.5916.
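The conversion formula dij = 2R sin(sij/(2R)) used above can be sketched as follows (numpy; the function name is ours):

```python
import numpy as np

R = 39.59  # Earth radius in hundreds of miles, as used in the text

def chordal_from_spherical(s, R):
    """Convert a great-circle (spherical) distance s to the straight-line
    (chordal) Euclidean distance via d = 2 R sin(s / (2R))."""
    return 2.0 * R * np.sin(s / (2.0 * R))

# Two antipodal points: spherical distance pi * R, chordal distance 2R.
s = np.pi * R
print(round(chordal_from_spherical(s, R) / (2 * R), 6))  # -> 1.0
```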
We now assess the quality of the spherical embedding in Figure 2.3 through a
Procrustes analysis introduced in Section 1.1.3. The optimal objective f in (1.14)
is f = 0.2782. This small error is probably due to the fact that the radius used
in HA30 is 39.59 in contrast to our 39.5916. This small value also confirms the
good quality of the embedding from FITS when compared to the solution in HA30.
(E4) Circle fitting. The problem of circle fitting has recently been studied in
Figure 2.3: Spherical embedding of HA30 data set with radius R = 39.5916.
Beck and Pan (2012), where more references on the topic can be found. Let points
{ai}ni=1 with ai ∈ IRr be given. The problem is to find a circle with center x ∈ IRr
and radius R such that the points stay as close to the circle as possible. Two
criteria were considered in Beck and Pan (2012):
min_{x, R} f1 = ∑_{i=1}^{n} (‖ai − x‖ − R)² (2.59)

and

min_{x, R} f2 = ∑_{i=1}^{n} (‖ai − x‖² − R²)². (2.60)
Problem (2.60) is much easier to solve than (2.59). But the key numerical message
in Beck and Pan (2012) is that (2.59) may produce far better geometric fitting
than (2.60). This was demonstrated through the following example (Beck and Pan,
2012, Example 5.3):
a1 = (1, 9)T , a2 = (2, 7)T , a3 = (5, 8)T , a4 = (7, 7)T , a5 = (9, 5)T , a6 = (3, 7)T .
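The two criteria (2.59) and (2.60) are straightforward to evaluate on these points; the following numpy sketch (our own function names, with an arbitrary trial center and radius chosen for illustration) shows both:

```python
import numpy as np

# The six data points of Beck and Pan (2012, Example 5.3); columns = points
A = np.array([[1.0, 2.0, 5.0, 7.0, 9.0, 3.0],
              [9.0, 7.0, 8.0, 7.0, 5.0, 7.0]])

def f1(x, R):
    """Geometric criterion (2.59): squared distance-to-circle errors."""
    return np.sum((np.linalg.norm(A - x[:, None], axis=0) - R) ** 2)

def f2(x, R):
    """Algebraic criterion (2.60): squared-distance residuals."""
    return np.sum((np.sum((A - x[:, None]) ** 2, axis=0) - R ** 2) ** 2)

# Evaluate both at an arbitrary trial center/radius (illustrative values only).
x, R = np.array([4.5, 7.0]), 3.5
print(round(f1(x, R), 4), round(f2(x, R), 4))
```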
Model (2.60) produces a very small circle, not truly reflecting the geometric layout
of the data.
The Euclidean distance embedding studied in this paper provides an alternative
model. Let D0ij = ‖ai − aj‖² for i, j = 1, . . . , n, with n = 6 and r = 2 in this
example. Let Y be the final distance matrix from FITS and the embedding points in X
be obtained from (1.12). The first 6 columns {xi}_{i=1}^{6} of X correspond to the
known points {ai}_{i=1}^{6}. The last column x7 is the center. The points
{xi}_{i=1}^{6} are on the circle centered at x7 with radius R (R = √Y_{1(n+1)}).
We need to match {xi}_{i=1}^{6} to {ai}_{i=1}^{6} so that the known points stay as
close to the circle as possible. This can be done through the orthogonal Procrustes
problem (1.14).
We first centralize both sets of points. Let

a0 := (1/n) ∑_{i=1}^{n} ai,  ai := ai − a0,  and  x0 := (1/n) ∑_{i=1}^{n} xi,  xi := xi − x0,  i = 1, . . . , n.
Let A be the matrix whose columns are ai and Z the matrix whose columns are xi,
for i = 1, . . . , n. Solve the orthogonal Procrustes problem (1.14) to get P = UV^T. The
resulting points are
zi := Pxi + a0, i = 1, . . . , n
and the new center, denoted by zn+1, is
zn+1 := P (xn+1 − x0) + a0.
It can be verified that the points {zi}ni=1 are on the circle centered at zn+1 with
radius R. That is
‖zi − zn+1‖2 = ‖P (xi − xn+1)‖2 = ‖xi − xn+1‖2 = R2.
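The centralization and Procrustes steps above can be sketched as follows. The function name and the column-wise storage convention are our own choices; (1.14) is realized through the standard SVD-based solution P = UV^T:

```python
import numpy as np

def map_embedding(X, A):
    """Align embedding points to known points via orthogonal Procrustes.

    X: (r, n+1) array whose first n columns are the embedded points x_i
       and whose last column is the embedded circle center x_{n+1}.
    A: (r, n) array of the known points a_i.
    Returns (Z, z_center): the aligned points z_i = P(x_i - x0) + a0 and
    the aligned center z_{n+1} = P(x_{n+1} - x0) + a0.
    """
    r, n = A.shape
    a0 = A.mean(axis=1, keepdims=True)           # centroid of the a_i
    x0 = X[:, :n].mean(axis=1, keepdims=True)    # centroid of the x_i
    Abar = A - a0
    Zbar = X[:, :n] - x0
    # Orthogonal Procrustes (1.14): P = U V^T from the SVD of Abar Zbar^T.
    U, _, Vt = np.linalg.svd(Abar @ Zbar.T)
    P = U @ Vt
    Z = P @ Zbar + a0
    z_center = P @ (X[:, n:] - x0) + a0
    return Z, z_center
```

Because P is orthogonal, ‖z_i − z_{n+1}‖ = ‖x_i − x_{n+1}‖, so the aligned points stay on a circle of the same radius R, as verified above.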
This circle is the best circle from model (2.2) and is plotted in Figure 2.4 with the
Figure 2.4: Circle fitting of 6 points with R = 6.5673. The known points and
their corresponding points on the circle by FITS are linked by a line.
pair of points {ai, zi} linked by a line. When the obtained center x = zn+1
and R are substituted into (2.59), we get f1 = 3.6789, not far from the value
f1 = 3.1724 reported in Beck and Pan (2012). The circle fits the original data
reasonably well. The model used by Beck and Pan (2012) is nonconvex and the
resulting f1 depends heavily on a good starting point, whereas our algorithm
solves a sequence of convex relaxations that do not rely on one. We complete this
example by noting a common feature between our model (2.2) and the squared
least-squares model (2.60): both use squared distances. The key difference is
that (2.2) uses all available pairwise squared distances among the ai, rather
than only those from the ai to the center x as in (2.60).
(E5) Synthetic data. In this part we generate random data to test the per-
formance of Algorithm 7 with growing dimension. Two types of data are used,
as shown in Figure 2.5. The first contains points randomly distributed on
a sphere and the second contains points on a circle, both of radius 1; note that
the radius is not given to our algorithm as input. The noise in the distance
information is generated following a standard framework:

D̂ij = Dij × |1 + nf × randn|,  i, j = 1, . . . , n,    (2.61)
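A minimal sketch of the noise model (2.61) is given below. Since the formula does not specify how symmetry of the measurement matrix is preserved, this sketch perturbs each pair once and symmetrizes; that choice, and the function name, are our own assumptions:

```python
import numpy as np

def noisy_edm(X, nf, rng=None):
    """Generate noisy pairwise distance measurements following (2.61).

    X  : (n, r) array of ground-truth positions.
    nf : noise factor in [0, 1].
    Returns a symmetric matrix of perturbed distances with zero diagonal;
    the noise is drawn once per pair {i, j} so the output stays symmetric
    (an assumption of this sketch).
    """
    if rng is None:
        rng = np.random.default_rng()
    n = X.shape[0]
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt(np.sum(diff ** 2, axis=2))        # true pairwise distances
    noise = np.abs(1.0 + nf * rng.standard_normal((n, n)))
    noise = np.triu(noise, 1)                     # one draw per pair
    noise = noise + noise.T                       # symmetrize
    return D * noise
```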
(a) Points randomly distributed on a sphere
(b) Points randomly distributed on a circle
Figure 2.5: Synthetic data with n = 200 points randomly distributed
where Dij is the true Euclidean distance between points xi and xj, 0 ≤ nf ≤ 1
is the noise factor, and randn is a standard normal random variable. The accuracy
of the estimated positions is measured by the root mean square distance (RMSD)

RMSD := (1/√n) ( ∑_{i=1}^{n} ‖x̂i − xi‖² )^{1/2},    (2.62)

where x̂i is the estimated position and xi the ground-truth position. Note that
without knowing some point positions in advance, only relative positions can be
found. To calculate the RMSD, we apply the Procrustes analysis (1.14) to all
estimated points to obtain global coordinates.
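The RMSD computation, including the Procrustes alignment (1.14) used to recover global coordinates, can be sketched as follows (the function name is ours, and positions are assumed to be stored row-wise):

```python
import numpy as np

def rmsd(X_est, X_true):
    """RMSD (2.62) after aligning the estimated points to the ground truth
    by an orthogonal Procrustes transform, since only relative positions
    are recoverable.  Both arguments are (n, r) arrays of positions.
    """
    n = X_true.shape[0]
    ce, ct = X_est.mean(axis=0), X_true.mean(axis=0)
    E, T = X_est - ce, X_true - ct                # centralize both sets
    U, _, Vt = np.linalg.svd(T.T @ E)
    P = U @ Vt                                    # optimal orthogonal map
    aligned = E @ P.T + ct                        # mapped estimates
    return np.sqrt(np.sum((aligned - X_true) ** 2)) / np.sqrt(n)
```

By construction, an estimate that differs from the truth only by a rigid motion gives an RMSD of (numerically) zero.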
To test the computational efficiency of Algorithm 7, data sets with the number of
points ranging from 100 to 1000 are used. The noise factor nf is set to 1. The time
and accuracy results for the sphere data are listed in Table 2.3.
The first column is the number of data points generated. The second column is
the number of convex subproblems solved in step (2) of Algorithm 7. The third
column contains the total number of iterations of the semismooth Newton method
(2.44) across all subproblems. We can see that as the problem scale grows, the
RMSD decreases, since the radius of the sphere remains 1 and the density of points
Table 2.3: Execution Time and Quality Results on Sphere Data
n      Subproblems  Total iterations  RMSD      Time (s)
100    5            25                4.05E-02    5.80
200    5            26                2.89E-02   11.90
300    5            31                2.39E-02   23.05
400    5            34                2.09E-02   41.14
500    5            35                1.91E-02   57.75
600    5            38                1.75E-02   86.01
700    5            40                1.64E-02  103.57
800    5            41                1.56E-02  144.18
900    5            43                1.47E-02  170.94
1000   5            44                1.41E-02  209.07
is increasing. For small-scale problems, our method takes only seconds to reach a
result with RMSD around 10−2. For the problem with 1000 points, Algorithm 7 takes
around 3 minutes. Similar observations hold for the circle data shown in Table 2.4.
Table 2.4: Execution Time and Quality Results on Circle Data
n      Subproblems  Total iterations  RMSD      Time (s)
100    5            24                2.55E-02    5.13
200    5            28                2.06E-02   11.97
300    5            33                1.69E-02   22.15
400    5            35                1.45E-02   38.11
500    5            37                1.32E-02   58.39
600    5            40                1.20E-02   77.61
700    5            40                1.13E-02  100.70
800    5            40                1.07E-02  124.75
900    5            41                1.04E-02  144.65
1000   5            44                1.01E-02  186.20
To test the influence of the noise factor nf on the accuracy of FITS, we vary nf
from 0.1 to 0.5; the resulting RMSD on the sphere and circle data is depicted in
Figure 2.6. We can see that our algorithm achieves a better RMSD on the circle data
than on the sphere data, and the RMSD on both data sets stays below 20% of the
radius even when the noise factor is as large as 0.5.
Figure 2.6: Variation of RMSD with the noise factor nf
2.6 Summary
In this section, we proposed a matrix optimization approach to the problem of
Euclidean distance embedding on a sphere. We applied the majorized penalty
method of Gao and Sun (2010) to the resulting matrix problem. A key feature we
exploited is that all subproblems to be solved share a common set of Euclidean
distance constraints with a simple distance objective function. We showed that
such problems can be efficiently solved by the Newton-CG method, which is proved
to be quadratically convergent under constraint nondegeneracy.
Constraint nondegeneracy is a difficult constraint qualification to analyze. We
proved it under a weak condition for our problem. We illustrated in Example
2.1 that this condition holds everywhere except at one point (t = 0), so that
constraint nondegeneracy is satisfied for t ≠ 0. For the case t = 0, we can verify
(by checking Lemma 2.9) that constraint nondegeneracy also holds. This motivates
the open question of whether constraint nondegeneracy holds under a weaker
condition.
In the numerical part, we used 4 existing embedding problems on a sphere to
demonstrate the variety of applications to which the developed algorithm can be
applied. The first two examples are from classical MDS, and new features (the wheel
representation for E1 and new clusters for E2) are revealed. For E3, despite the
large noise in the initial distance matrix, our method is remarkably able to recover
the Earth radius and to produce an accurate mapping of the 30 global cities on the
sphere. The last example differs from the others in that its inputs are the
coordinates of known points rather than a distance matrix; finding the best fitting
circle requires localizing its center and radius. The resulting visualizations are
very satisfactory for all the examples. Since those examples are of small scale, our
method took less than 1 second to find the optimal embedding, and we therefore omit
timing details.