joint intrinsic and extrinsic similarity for recognition ... · joint intrinsic and extrinsic...
TRANSCRIPT
Joint intrinsic and extrinsic similarity for
recognition of non-rigid shapes
Alexander Bronstein, Michael Bronstein and Ron Kimmel
March 12, 2007
Abstract
This paper explores the problem of similarity criteria between non-
rigid shapes. Broadly speaking, such criteria are divided into intrinsic
and extrinsic, the first referring to the metric structure of the objects
and the latter to the Euclidean coordinates of the points of which the
objects comprises. Both criteria have their advantages and disadvan-
tages; extrinsic similarity is sensitive to non-rigid deformations of the
shapes, while intrinsic similarity is sensitive to topological noise. We
present an approach unifying both criteria in a single measure. We
consider the tradeoff between the extrinsic and intrinsic similarity and
use it as a set-valued “distance” related to the notion of Pareto opti-
mality in economics. Numerical results demonstrate the efficiency of
our approach in cases where using solely extrinsic or intrinsic criteria
would fail.
1 Introduction
Most of us are familiar with the childhood Rock, Paper, Scissors game, in
which the players bend their fingers in different ways to make the hand re-
semble one of the three objects (rock, represented by a closed fist; paper –
by an open palm; or scissors – by the extended index and middle fingers). In
1
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
the context of pattern recognition, this example demonstrates the difficulty
of defining the similarity of non-rigid shapes: one may recognize the hand
shapes as objects they intend to imitate, while others may say that all these
“objects” are actually just deformations of the human hand. Using geometric
terminology, the first similarity criterion is extrinsic, i.e., looks at the prop-
erties of the shape related to the way it is embedded in the ambient space.
The second criterion, on the other hand, looks at the intrinsic properties of
the shape, described by its metric structure.
Extrinsic similarity of shapes has been widely addressed in computer vi-
sion, pattern recognition and computational geometry literature. Most of
the papers in these fields, either implicitly or explicitly, look at the prob-
lem of shape similarity from the extrinsic point of view (see, for example,
[22, 40, 36]). Here, while considering extrinsic similarity, we focus mainly on
a classical result for rigid object matching, called the iterative closest point
(ICP) method, introduced by Chen and Medioni [24] and Besl and McKay
[2]. ICP and its numerous variations [60, 42, 47] try to find a Euclidean
transformation between two surfaces, minimizing an extrinsic distance be-
tween them, usually a variant of the Hausdorff distance. In some sense, one
can think of ICP as finding the best possible rigid alignment between the
surfaces.
Another important class of extrinsic similarity methods is based on high-
order moments [56, 35, 32, 55]. The shape’s extrinsic geometry is represented
by a vector of its moments and these vectors are compared. Conceptually,
one can think of such methods as of a Fourier-like representation, where only
low-frequency components are used.
On the other hand, methods for the computation of intrinsic similarity
of shapes started penetrating into the computer vision community relatively
late. As the precursor of a recent trend in non-rigid shape analysis, we
can consider the paper of Schwartz et al. [54], in which a method for the
representation of the intrinsic geometry of the cortical surface of the brain
using multidimensional scaling (MDS) was presented. MDS is a family of
2
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
algorithms [3] commonly used in dimensionality reduction and data analysis
[51, 27, 57] and graph representation [45]. The idea of Schwartz et al. was ex-
tended by Elad and Kimmel [29], who proposed a non-rigid shape recognition
method based on Euclidean embeddings. The key idea is to map the metric
structure of the surfaces to a low-dimensional Euclidean space and compare
the resulting images (called canonical forms) in this space.1 Grossman et al.
extended this method to objects represented as voxels [34]. Applied to the
problem of three-dimensional face recognition, the canonical forms method
proved to be very efficient in being able to recognize identity of people while
being insensitive to their facial expressions [10, 19, 13]. Ling and Jacobs
used this method for recognition of articulated two-dimensional shapes [44].
Other applications were in texture mapping and object morphing [61, 15, 9],
mesh segmentation [38] and image matching [43].
One of the main disadvantages of the canonical forms is the fact that they
can represent the intrinsic geometry of the objects only approximately, as it
is generally impossible to find an isometry between a non-flat surface and a
flat Euclidean space. It was shown empirically in [58, 12, 11, 18] that using
spaces with non-Euclidean geometry, it is possible to obtain more accurate
representations. Memoli and Sapiro [46] showed how the representation er-
ror can be theoretically avoided by using to the Gromov-Hausdorff distance,
introduced by Mikhail Gromov in [33]. The Gromov-Hausdorff distance is
invariant to isometries, which makes it a useful tool for intrinsic shape com-
parison. Unfortunately, the computation of the Gromov-Hausdorff distance is
intractable, as its discretization results in a combinatorial problem. For this
reason, Memoli and Sapiro had to resort to an approximation, computing
1The the measurement of pairwise geodesic distances and the solution of the underlying
MDS problem are the two most computationally-intensive components of the canonical
forms method. The measurement of geodesic distance can be performed very efficiently
using the recently proposed parallel fast marching method [8]. The solution of MDS prob-
lem can be carried out efficiently using the multigrid framework [20, 21] or the vector
extrapolation methods [50]. This makes possible real-time applications like face recogni-
tion.
3
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
a different distance related to the Gromov-Hausdorff distance by a proba-
bilistic bound, rather than the Gromov-Hausdorff distance itself. Using the
fact that the Gromov-Hausdorff distance can be related to the distortion
of embedding one surface into another, we proposed an efficient and accu-
rate method for the computation of the Gromov-Hausdorff distance based
on a numerical core similar to MDS. This algorithm was named generalized
MDS (GMDS) [16, 14], and can be thought as a natural extension of pre-
vious works on isometric embedding into non-Euclidean spaces. GMDS was
used for the comparison of two-dimensional [7, 4] and three-dimensional [16]
shapes, face recognition [17], as well as problems of texture mapping and
shape manipulation [15, 9].
Another class of intrinsic similarity methods is based on the analysis of
spectral properties of the Laplace-Beltrami operator. Reuter et al. used
Laplace-Beltrami spectra, referring to them as “shape DNA”, for the char-
acterization of surfaces [49]. Since the Laplace-Beltrami operator is an in-
trinsic characteristic of the surface, it is insensitive to isometric deformations
[25, 48]. Intrinsic similarity criteria based on spectral analysis are intimately
related to methods used in manifold learning and data analysis; see e.g.
[52, 59, 1, 31, 26]. It appears, however, that such criteria are able to iden-
tify isospectral rather than isometric surfaces, and isospectrality is a weaker
property (two surfaces can be isospectral but not isometric, which implies,
following the beautiful expression of Kac [37], that “we cannot hear the shape
of the drum” [30]).
The choice of whether to use intrinsic or extrinsic similarity depends sig-
nificantly on the application. Giving the problem of face recognition as an
example, the intrinsic geometry captures the identity, while the extrinsic one
captures the expression. The drawback of extrinsic similarity is its sensitivity
to non-rigid deformations. Using our example, a gesture of the human hand
can be extrinsically more similar to a rock or scissors rather than another
hand. This make extrinsic criteria unsuitable for the analysis of non-rigid
objects possessing a high degree of flexibility. The intrinsic criterion, on the
4
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
Figure 1: Illustration of the difference between intrinsic and extrinsic similarity:
right and middle shapes are extrinsically similar while being intrinsically dissimilar
(the two shapes have different topology, since the fingers touch). Left and middle
shapes are intrinsically similar (one can be obtained from the other by means of a
near-isometric deformation), but extrinsically dissimilar.
other hand, is insensitive to deformations which can be approximated by
isometries, since isometries preserve the metric structure of the shape. How-
ever, intrinsic similarity is sensitive to topological noise and can be problem-
atic in cases where the objects are partially missing (for example, connecting
the fingers of the hand results in a significantly different intrinsic geometry
[6]). Consequently, it appears that two semantically similar shapes can be
substantially different both in their extrinsic and intrinsic geometries.
In this paper, we propose an approach for computing the similarity be-
tween non-rigid shapes trying to take the advantages of both intrinsic and
extrinsic similarity criteria, while avoiding their shortcomings. As an illus-
tration, one can think of fitting a rubber glove onto a hand. The extent to
which the rubber surface is stretched represents the intrinsic geometry dis-
tortion. The fit quality, or in other words, how close the glove is to the hand
5
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
surface, represents the extrinsic distance between the two objects. For two-
dimensional shapes, the motivation is the same, though less visual. Finding
an optimal tradeoff between the two distortions can be posed as a multicri-
terion optimization problem and related to the notion of Pareto optimality.2
The set of all Pareto optimal solutions can be represented as a set-valued
similarity criterion, which conceals much richer information than each of the
intrinsic and extrinsic criteria separately.
The rest of this paper is organized as follows. In Section 2, we intro-
duce the mathematical background and formulate the standard approaches
to measuring intrinsic and extrinsic similarity. In Section 3, we present our
approach for computing the joint intrinsic-extrinsic similarity using the for-
malism of Pareto optimality. In Section 4, we show the numerical framework
for computing this distance. Section 5 is dedicated to experimental results.
Section 6 concludes the paper.
2 Mathematical foundations
We model a non-rigid object as a pair (S, dS), where S is a two-dimensional
smooth compact connected and complete Riemannian manifold (possibly
with boundary), and dS : S × S → R is the geodesic metric measuring the
lengths of the shortest paths on the manifold, induced by the Riemannian
structure. For brevity of notation, we will write simply S, implying (S, dS).
Without loss of generality, we assume that our objects are embedded into R3.
The Euclidean coordinates of S can be represented by the map XS : S → R3,
where XS(s) is a vector of Euclidean coordinates of the point s ∈ S.
We broadly refer to properties described in terms of the metric dS as to
the intrinsic geometry of S, and to properties associated with the Euclidean
2Such an approach was first introduced in [5, 6] as a generalization of the approach
of Latecki et al. [41] in a more restricted setting of the problem, in which an intrinsic
criterion was used to measure partial similarity of shapes, and the tradeoff was between
the sizes of the parts cropped out of the surfaces (referred to as partiality) and the intrinsic
similarity between them.
6
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
coordinatesX as the extrinsic geometry. From the intrinsic point of view, two
objects S and Q are similar (isometric) if there exists a bijective distance-
preserving map (isometry) between S and Q. An isometry from S to itself is
called a self-isometry. The collection of all the self-isometries forms a group
with the function composition operator and is referred to as the isometric
group, denoted by Iso(S). More generally, S and Q are said to be ǫ-isometric
if there exists an ǫ-surjective map ϕ : S → Q (i.e., dQ(q, f(S)) ≤ ǫ for all
q ∈ Q), which has distortion
disϕ = sups,s′∈S
|dS(s, s′) − dQ(ϕ(s), ϕ(s′))| = ǫ.
Such an ϕ is called an ǫ-isometry. Isometries are different from ǫ-isometries.
Particularly, an isometry is always bi-Lipschitz continuous [23], which is not
necessarily true for an ǫ-isometry. If we relax the requirement of ǫ-surjectivity
by demanding that ϕ has only disϕ ≤ ǫ, we refer to such ϕ as an ǫ-isometric
embedding.
2.1 Extrinsic distances
The basic tool in extrinsic shape comparison is the Hausdorff distance, mea-
suring the distance between two sets of points X and Y in R3,
dR3
H (X,Y ) = max
supx∈X
dR3(x, Y ), supy∈Y
dR3(y,X)
,
where dR3(y,X) = infx∈X ‖y − x‖2 denotes the distance between the set X
and the point y. A non-symmetric version of the Hausdorff distance,
dR3
NH(X,Y ) = supx∈X
dR3(x, Y ), (1)
is often preferred since it allows for partial comparison of surfaces. The
Hausdorff distance (both its symmetric and non-symmetric versions) regards
the objects as sets of points in R3, is completely extrinsic, and consequently,
is sensitive to deformations of the shapes. It depends not only on non-rigid
7
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
but also on rigid transformations of shapes; for example, translating one
shape with respect to another changes the Hausdorff distance.
ICP-type methods try to get rid of this alignment ambiguity by minimiz-
ing the Hausdorff distance between the extrinsic coordinates XS and XQ of
the surfaces S and Q over all the isometries in R3 (the isometries in R
3 are
rigid transformations, including rotations, translations and reflections),
dICP(S,Q) = infi∈Iso(R3)
dR3
H (i XS , XQ).
Though theoretically a global minimum should be searched, in practice, local
optimization methods are usually employed, and thus, the ICP algorithm is
usually liable to converge to a suboptimal solution (local minimum). Though
resolving the rigid transformations ambiguity, ICP methods are still sensitive
to non-rigid deformations.
Another approach for extrinsic geometry comparison is based on geomet-
ric moment computation. For a surface in R3, a (p, q, r)-moment is given
by
mpqr(S) =
∫
S
(X1S(s))p(X2
S(s))q(X3S(s))rds, (2)
where X iS denotes the ith coordinate of XS . First and second order moments
(i.e., moments with p+q+r = 1 and 2) are responsible for surface alignment.
The vector (m100,m010,m001)T is the center of gravity of the surface extrinsic
coordinates XS . The matrix
Σ =
m200 m110 m101
m110 m020 m011
m101 m011 m002
(3)
contains information about the variance of XS : its eigenvectors are the di-
rections in which the variance of XS is the maximum. The surface can be
aligned by performing a translation which eliminates the first order moments
and applying a rotation which eliminates the mixed second order moments
(off-diagonal elements of Σ). In order to compare two surfaces, each of them
8
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
is first aligned in this way, and then a Hausdorff-like distance is computed
between the aligned extrinsic coordinates.
It is possible to use high-order geometric moments to compare the sur-
faces. High order moments can be regarded as decomposition coefficients of
the extrinsic coordinates of S in some basis. This can be seen by rewriting
the definition of mpqr as
mpqr(S) =
∫
R3
φpqr(x)ξ(x)dx = 〈φpqr, ξ〉, (4)
where φpqr(x) = (x1)p(x2)q(x3)r, and ξ(x) is the characteristic (indicator)
function of S, defined here a superposition of Dirac’s delta functions, tak-
ing the value of “infinity” for x ∈ XS and zero elsewhere in R3. The set
ψpqr∞p,q,r=0 spans the square integrable functions on R
3, hence, the set of
coefficients mpqr∞p,q,r=0 is a unique descriptor of a given surface, invariant
to rigid isometries if proper alignment is performed. Such a description is
complete, that is, at least theoretically, one can recover the surface from
mpqr∞p,q,r=0. The difference between two sets of coefficients corresponding
to two surfaces can be used as an extrinsic similarity criterion,
dMOM(S,Q) =∑
p,q,r
|mpqr(S) −mpqr(Q)|2. (5)
In practice, we can compute only a finite number of moments up to some
order p + q + r = P . Theoretically, the more mpqr’s we take, the better we
can describe our object, however, since high powers tend to amplify noise,
computation of high-order moments is numerically unstable.
2.2 Intrinsic distances
Canonical form-type methods attempt to formulate the problem of compari-
son of intrinsic geometry (invariant to isometric deformation) as the problem
of extrinsic geometry comparison, which is relatively easy to compute. The
approach consists of two stages. First, an extrinsic representation of the
intrinsic geometry of the shapes in some common metric space (X, dX) is
9
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
constructed by means of isometric embedding, attempting to find two maps
ϕ : S → X and ψ : Q → X with minimum distortions dis ϕ and dis ψ.
The embedding, in a sense, allows to “undo” all the isometric deformations
of the objects (though, some degree of ambiguity stemming from isometries
in X still remains). Typically, X is selected as the Euclidean space, though
other choices are possible [28, 58, 12, 11, 18]. There are a few criteria for the
embedding space choice. First, it is desired that X is homogeneous and its
isometric group is simple, in order to reduce the number of degrees of freedom
in the definition of the canonical forms. Secondly, an analytic expression for
dX allows a numerically relatively simple computation using MDS algorithms
[3]. Finally, it appears that for some classes of shapes, certain embedding
spaces are generally more suitable (see, for example, experimental results in
[12, 11, 18]).
The second stage is extrinsic comparison of the canonical forms ϕ(S)
and ψ(Q), using, for example, the ICP distance dICP(ϕ(S), ψ(Q)). Since
achieving a true isometric embedding is usually impossible [45], the canonical
forms are only an approximate representation of the intrinsic geometry of the
shapes.
The problem of inaccuracy introduced by the embedding into X can be
resolved if we do not assume a given embedding space, but instead, include X
as a variable into the optimization. We can always find a sufficiently compli-
cated metric space into which both S and Q can be embedded isometrically,
and compare the images using the Hausdorff distance,
dGH(S,Q) = infX
ϕ:S→X
ψ:Q→X
dX
H(ϕ(S), ψ(Q)),
(here ϕ and ψ are assumed to be isometric embeddings). Such an approach
is commonly referred to as Gromov-Hausdorff distance [33]. For compact
surfaces, the Gromov-Hausdorff distance can be expressed in terms of the
10
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
distortion obtained by embedding one surface into another,
dGH(S,Q) =1
2inf
ϕ:S→Q
ψ:S→Q
maxdisϕ, disψ, dis (ϕ, ψ).
Here, dis (ϕ, ψ) = sups∈S,q∈Q |dS(s, ψ(q))− dQ(q, ϕ(s))|. The computation of
the distortions is performed using GMDS, a procedure similar in its spirit
to MDS, but not limited to spaces with analytically expressed geodesic dis-
tances.
The Gromov-Hausdorff distance is a metric on the quotient space of non-
rigid objects under the isometry relation. Particularly, this implies that
dGH(S,Q) = 0 if and only if S and Q are isometric. More generally, if
dGH(S,Q) ≤ ǫ, then S and Q are 2ǫ-isometric and conversely, if S and Q are
ǫ-isometric, then dGH(S,Q) ≤ 2ǫ [23].
3 Combining intrinsic and extrinsic similari-
ties
Let us now return to our example of glove fitting. Assume that Q is the
hand surface, and S is the glove we wish to fit. Our goal is to find such a
deformation of S, denoted hereinafter by S ′ = f(S), that is similar the most
to Q in the extrinsic sense, yet, at the same time preserves its intrinsic geom-
etry as much as possible. In a generic setting, we assume that Q (referred
to as probe) is obtained from S (model) by means of some deformation (not
necessarily isometric). We allow for the possibility that the topology of Q is
different from that of S due to the presence of noise or as the result of the
deformation (e.g, in the glove fitting example, the fingers of the hand can
touch each other).
In order to quantify the intrinsic and extrinsic similarity, we define a
generic extrinsic distance dE, and a generic intrinsic distance dI. We require
that dE(S,Q) = 0 if S and Q are extrinsically similar (i.e., XS = XQ)
and dI(S,Q) = 0 if S and Q are intrinsically similar (isometric). Since the
11
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
deformation f gives a one-to-one correspondence between S and S ′, we can
define the mapping ϕ : S → S ′ copying s ∈ S to f(s) on S ′. Fixing ϕ
and setting ψ = ϕ−1 in the Gromov-Hausdorff distance, yields the following
intrinsic distance,
dI(S,S′) = dis ϕ = sup
s,s′∈S
|dS(s, s′) − dS′(ϕ(s), ϕ(s′))|. (6)
Note that the correspondence between the surfaces is now fixed and does not
participate anymore in the minimization. dI(S,S′) defined this way measures
the distortion in the intrinsic geometry of S introduced by the extrinsic
deformation f . However, the geodesic distances in S ′ have to be re-computed
every time the deformation changes.
As the extrinsic distance, we use the non-symmetric Hausdorff distance,
which can be expressed as follows,
dE(Q,S ′) = dR3
NH(XQ, XS′) = sups′∈S′
‖XS′(s′) −XQ(q∗(s′))‖2, (7)
where q∗(s′) = arg minq∈Q ‖XS′(s′) − XQ(q)‖2 is the closest point to s′ on
Q. We write q∗ assuming that the infimum is achieved. Since in practice we
work with discrete objects, this assumption always holds. The set of closest
points q∗(S ′) has to be recomputed every time the deformation changes. The
lack of symmetry of dE(Q,S ′) allows for partial matching.
3.1 Pareto optimality and set-valued distances
Since it is usually impossible to say which of the two criteria is more impor-
tant, we judge the similarity as a tradeoff between them: the joint intrinsic-
extrinsic similarity between two shapes can be quantitatively represented by
the extent to which we have to modify the extrinsic geometry in order to
make the two shapes intrinsically similar, or alternatively, extent to which
we have to modify the intrinsic geometry in order to make the two shapes
extrinsically similar. This, in turn, can be formulated as a multicriterion
optimization problem, in which we bring to minimum with respect to S ′ the
12
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
vector objective function Φ(S ′) = (dI(S′,S), dE(S ′,Q)). Unlike optimiza-
tion with a scalar objective, we cannot define unambiguously the minimum
of Φ, since there does not exist a total order relation between the criteria
– we cannot say, for example, whether it is better to have Φ = (0.5, 1) or
Φ = (1, 0.5). At the same time, we have no doubt that Φ = (0.5, 0.5) is
better than Φ = (1, 1), since both criteria have smaller values. This allows
us define a minimizer of our vector objective Φ as an S∗, such that there is
no any other S ′ for which Φ(S ′) is better that Φ(S∗) (which can be expressed
as a vector inequality Φ(S∗) > Φ(S ′)). Such an S∗ is called non-inferior or
Pareto optimal. Pareto optimum is not unique; we denote by Ω∗ the set of
S∗ that satisfy the above relation. The corresponding values of Φ(Ω∗) are
referred to as a Pareto frontier and can be visualized as a planar curve.
In [12, 6], Bruckstein and the authors proposed considering the entire
Pareto frontier as a criterion of similarity, of which we can think as a gener-
alized, set-valued distance. In our case, this distance measures the tradeoff
between the intrinsic and extrinsic similarity. We denote dP(S,Q) = Φ(Ω∗)
and call it the Pareto distance. Our approach generalizes the similarity cri-
teria based purely on extrinsic or intrinsic geometry. Asserting dI(S,S′) = 0,
our approach in a sense can be considered as a generalization of ICP, since
we now allow for non-rigid isometries and not only for the rigid ones. If we
further restrict the family of transformations to the rigid ones, the traditional
ICP method is obtained.3 On the other hand, if we require dE(Q,S ′) = 0,
the probe is forced to be attached to the model surface, which boils down to
a problem similar to GMDS.
3.2 Salukwadze distances
As we argued in [12, 6], a set-valued distance may be inconvenient to use
since only a partial order relation exists between the Pareto frontiers – given
two set-valued distances, we cannot in general say which of them is “smaller”,
3Polyhedral surfaces are known to be rigid, hence, in this case, the only isometries are
the rigid ones. Allowing for dI(S,S ′) > 0 makes the class of deformations non-trivial.
13
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
unless one curve is entirely above or below the other. In multicriterion op-
timization theory, there is a known way to convert a vector optimization
problem into a scalar one. It has been researched mainly in the field of op-
timal control [53] by Salukwadze. Among all the Pareto optima, we cannot
prefer any since they are non-comparable: we cannot say which Pareto opti-
mum is better. However, ideally we want to bring both of our criteria to zero,
that is, achieve some “utopia point” (0, 0). Salukwadze proposed choosing a
single point on the Pareto frontier the closest to the utopia point, in sense of
some distance.
Using this idea, we define the Salukwadze distance as
dSAL(S,Q) = inf(ǫ1,ǫ2)∈dP(S,Q)
‖(ǫ1, ǫ2)‖R2+, (8)
where ‖ · ‖R2+
is some norm on R2+. In the following, we will consider a family
of norms ‖(ǫ1, ǫ2)‖λ = ǫ1 + λǫ2, for λ > 0. The Salukwadze distance in this
case can be written as
dSAL(S,Q) = minS′
dI(S,S′) + λdE(S ′,Q). (9)
We can intuitively think of dSAL as of a tradeoff between the two criteria.
Different selections of the multiplier λ attribute importance either to dI or
dE, and gives us different points on the Pareto frontier.
4 Numerical framework
For practical computations, we work with discretized shapes. The surface S
is sampled at N points SN = s1, ..., sN ⊆ S, constituting an r-covering,
i.e., S =⋃N
n=1BS(sn, r) (here, BS(s, r) is a metric ball of radius r around
s). The extrinsic coordinates of SN are represented as an N × 3 matrix
XSN, each row of which corresponds to a point XS(si) ∈ R
3. The discrete
shape is represented as a triangular mesh; each triangle is a triplet of indices
of vertices belonging to it. The maximum length of an edge is r. Vertices
connected by an edge (or in other words, belonging to the same triangle) are
14
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
said to be adjacent ; we describe the adjacency by the set E of all adjacent
pairs of vertices in SN . The geodesic distances on SN are approximated using
the fast marching method [39], forming an N ×N matrix DSN.
Assuming the deformed surface S ′N = f(SN) maintains the connectivity
of SN , we can formulate the following minimization problem with respect to
the N × 3 matrix X of the extrinsic coordinates of S ′N :
dSAL(SN ,QM) = minX
1
N2
N∑
i,j=1
(dSN(si, sj) − dij(X))2 +
λ
N
N∑
i=1
d2(xi,QM)
(10)
where dij(X) = dS′
N(s′i, s
′j) denote the geodesic distances on S ′
N , and d(xi,QM)
denotes the Euclidean distance from the point xi to the discretized surface Q.
The first term of the above cost function is the discretization of dI, whereas
the second term is the discretization of dE. In the sequel, we show how to
compute these two terms and their derivatives with respect to X, required
for the minimization of (10).
4.1 Intrinsic distance computation
The main challenge in the computation of the intrinsic distance term dI is
the necessity to evaluate the geodesic distances on S ′N and their derivatives
with respect to the extrinsic geometry of S ′N changing at each iteration of
the minimization algorithm. The simplest remedy would be to modify the
intrinsic term by restricting i and j to be neighboring points only,
dI ≈1
|E|
∑
(i,j)∈E
(dSN(si, sj) − dS′
N(s′i, s
′j))
2. (11)
This way, we use only the local distances on S ′N , which can be approximated
as the Euclidean distances dS′
N(s′i, s
′j) = ‖xi−xj‖. However, such a modifica-
tion makes dI significantly less sensible to large deformations of SN . Indeed,
many deformations change the local distance only slightly, while introducing
large distortion to larger distances.
15
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
In order to penalize for such deformations of SN , we need to approximate
the full matrix of geodesic distances on S ′N . We first define the matrix D(X)
of local distances, whose elements are, as before,
dij(X) =
‖xi − xj‖ : (i, j) ∈ E
0 : (i, j) /∈ E.(12)
Using the Dijkstra algorithm, we compute the set of shortest paths between
all pairs of points (i, j). For example, let Pij = (i, i1), (i1, i2), ..., (in−1, in), (in, j) ⊂
E be the shortest path between the points i and j. Its length is given by
L(Pij) = di,i1 + di1,i2 + ... + din,j, which is a linear combination of the el-
ements of D(X). We can therefore “complete” the missing entries in the
matrix D(X) by defining the matrix of global distances
D(X) = I(D(X))D(X), (13)
where I is a sparse fourth order tensor, with the elements Iijkl = 1 if the
edge (k, l) is contained in the shortest path Pij, and 0 otherwise. Note that I
depends on the connectivity E, which is assumed to be fixed, and the matrix
of local distances D, which, in turn, depends on X.
It is straightforward to verify that the entries of D(X) and D(X) coincide
for all (i, j) ∈ E. The global distance matrix D(X) constitutes an approxi-
mation of the geodesic distances on the surface S ′N , dij ≈ dS′
N(s′i, s
′j), while
having a simple linear form in terms of the local distances. A consistent and
more accurate estimate of dS′
Ncan be produced by replacing the Dijkstra
algorithm with Fast Marching and allowing the shortest paths to pass on
the faces of the mesh representing S ′N . However, such an approach results in
more elaborate expressions and will not be discussed here.
In order to compute the derivative of D(X) with respect to X, we assume
that a small perturbation dX of X does not change the connectivity of the
points on S ′N , and as the results, the trajectory of the shortest paths between
the points on S ′N remains constant (though their length may change). Thus,
we may write
D(X + dX) = I(D(X + dX))D(X + dX) = I(D(X))D(X + dX),
16
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
and compute the derivative of D(X) as the derivative of the linear form
I(X)D(X). If I(D(X)) 6= I(D(X+dX)), the assumption does not hold and
the derivative of D usually does not exist. Yet, the derivative of I(X)D(X)
belongs to the sub-gradient set of D at the point X. This is sufficient for
many minimization algorithms to work correctly.
The intrinsic distance term can be readily written in terms of D as the
Frobenius norm
dI(X) =1
N2‖D(X) −DSN
‖2F
=1
N2trace((D(X) −DSN
)T(D(X) −DSN)). (14)
Its derivative with respect to X is given by
∂dI(X)
∂X=
2
N2(D(X) −DSN
)T∂D(X)
∂X, (15)
where
∂dij(X)
∂X=
∑
k,l
Iijkl∂dkl(X)
∂X, (16)
and the elements of ∂dkl(X)∂X
are given by
∂dkl∂xmn
=1
dkl
xmk − xml : n = k
xml − xmk : n = l
0 : n 6= k, l
(17)
for m = 1, 2, 3.
4.2 Extrinsic distance computation
The computation of the extrinsic distance is similar to the one used in ICP
algorithms, where the main difficulty arises from the need to re-compute the
closest points each time the extrinsic geometry of S ′N changes. The extrinsic
distance term can be written as
dE(X) =1
Ntrace ((X − Y ∗(X))(X − Y ∗(X))T) (18)
17
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
where Y ∗(X) denotes the N×3 matrix, whose i-th row y∗i is the closest point
on Q corresponding to xi. The closest points y∗i are computed as a weighted
average of the points on Q, which are the closest to xi. The weights are
selected in inverse proportion to the distance from xi.
In ICP algorithms, it is common to assume that Y ∗(X + dX) ≈ Y ∗(X).
By fixing Y ∗, dE(X) becomes a simple quadratic function, and its derivative
can be written as
∂dE(X)
∂X=
2
N(X − Y ∗(X))T. (19)
4.3 Iterative minimization algorithm
For X in the neighborhood of some X0, the cost function that needs to be
minimized can be approximated as
σ(X) ≈1
N2trace (D(X)TD(X) − 2DT
SND(X) +DT
SNDSN
)
+λ
Ntrace (XXT − 2Y ∗(X0)X
T + Y ∗T(X0)Y∗(X0)), (20)
where D(X) = I(X0)D(X).Like in ICP, after finding a new X which de-
creases σ(X), the closest points Y ∗ and the operator I are updated. The
iterative minimization algorithm can be summarized as follows:
Compute the closest points Y ∗(X).1
Compute the shortest paths between all pairs of points on S ′N and2
assemble I.
Update X by ∆X such that X + ∆X sufficiently decreasing the cost3
function (20).
If the change in X is small, stop. Else, go to Step 1.4
The update of X in Step 3 can be safeguarded by evaluating the true cost
function (with I(X+∆X) and Y ∗(X+∆X) instead of I(X) and Y ∗(X)). In
our implementation, no safeguard was used, and the minimization in Step 3
was done using conjugate gradients.
18
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
The initialization of the algorithm can be done in several ways, the sim-
plest of which is X = XSN. This choice works well when the extrinsic dissim-
ilarity between S and Q is not too large; for large dE(S,Q), the algorithm
will suffer from poor convergence similar to most ICP methods. Another
choice is to initialize X by the corresponding points on QM resulting from
the solution of the GMDS problem. This choice is suitable for objects having
sufficiently similar intrinsic geometries, making the intrinsic correspondence
computed by GMDS meaningful.
5 Results
5.1 Discriminative power
In order to demonstrate the discriminative power of the proposed approach,
we measured the joint intrinsic-extrinsic similarity quantified in terms of
the Salukwadze distance dSAL between a unit spherical patch and spherical
patches of varying radii. The surfaces were discretized at 150 points. Figure 2
shows that our distance is capable of distinguishing between patches with
curvature varying by as little as 0.5%. A change in curvature by 1% shows a
one order of magnitude increase in the measured distance.
5.2 Topology changing deformations
In this experiment, we used the data set of four objects with four non-rigid
deformations. While the first three deformations of each object were nearly
isometric, the fourth one introduced topological changes modeled by welding
points on the mesh (Figure 3).
Three types of distances were computed: a non-symmetric approxima-
tion of the Gromov-Hausdorff distance, a non-symmetric approximation of
the Hausdorff distance, and the Salukwadze distance dSAL. The Gromov-
Hausdorff distance was computed using GMDS with 100 points embedded
into a surface represented with 1000 points. The Hausdorff distance was
19
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
Figure 2: Normalized Salukwadze distance dSAL computed between a unit spher-
ical patch and spherical patches of varying radii.
computed using ICP with meshes sampled at 1000 points. The Salukwadze
distance was computed on meshes sampled at 100 points. From Figures 4–6,
it appears that while the intrinsic similarity is nearly invariant to isometric
deformations, it is sensitive to topological changes, which introduce signifi-
cant distortions in the intrinsic geometry. On the other hand, the extrinsic
geometry is insensitive to topological changes, yet sensitive to intrinsic de-
formation. The joint similarity criterion combines both advantages, and is
insensitive to both isometries and topological changes.
6 Conclusions
We presented a new approach for the computation of non-rigid shape sim-
ilarity as a tradeoff between extrinsic and intrinsic similarity criteria. Our
approach can be illustratively presented as deforming one shape in order to
make it the most similar to another from the extrinsic point of view, while
trying to preserve as much as possible its intrinsic geometry. The joint intrin-
sic and extrinsic similarity appears to be advantageous over traditional purely
extrinsic or intrinsic similarity criteria. While extrinsic similarity is sensitive
20
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
to strong non-rigid deformations of the shapes and intrinsic similarity is sen-
sitive to topology changes resulting from noise or non-rigid deformations, our
joint similarity criterion allows to gracefully handle such problems. Exper-
imental results prove that it can be used in situations where intrinsic and
extrinsic similarities fail.
From the perspective of prior work, our approach can be thought as a
generalization of ICP and GMDS methods. ICP methods compute extrin-
sic similarity of shapes by means of finding a rigid isometry minimizing the
extrinsic distance between them. Our method extends the class of transfor-
mations, allows for non-rigid isometries and near-isometries. Using GMDS,
intrinsic similarity of shapes is computed as the degree of distortion when
trying to embed one shape into another. This problem can be regarded as a
particular case of our approach in which we constraint the extrinsic distance
between the two surfaces to be zero. The joint similarity is also related to
our previous papers with Alfred Bruckstein on set-valued distances, since its
computation can be formulated as a multicriterion optimization problem, in
which we try to simultaneously minimize the intrinsic and extrinsic distances.
The numerical framework presented in this paper extends beyond shape
similarity problems. As a byproduct of our similarity computation, we obtain
non-rigid alignment of two shapes. Additional potential applications are
inverse problems arising in shape reconstruction. Our intrinsic and extrinsic
distances can be used as priors for regularization of such problems.
References
[1] M. Belkin and P. Niyogi, Laplacian eigenmaps for dimensionality re-
duction and data representation, Neural Computation 13 (2003), 1373–
1396.
[2] P. J. Besl and N. D. McKay, A method for registration of 3D shapes,
IEEE Trans. PAMI 14 (1992), 239–256.
21
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
[3] I. Borg and P. Groenen, Modern multidimensional scaling - theory and
applications, Springer-Verlag, Berlin Heidelberg New York, 1997.
[4] A. M. Bronstein, M. M. Bronstein, A. M. Bruckstein, and R. Kimmel,
Analysis of two-dimensional non-rigid shapes, IJCV, submitted.
[5] , Paretian similarity of non-rigid objects, Scale Space, accepted.
[6] , Partial similarity of objects, or how to compare a centaur to a
horse, IEEE Trans. PAMI, submitted.
[7] , Matching two-dimensional articulated shapes using generalized
multidimensional scaling, Proc. AMDO, 2006, pp. 48–57.
[8] A. M. Bronstein, M. M. Bronstein, Y. S. Devir, R. Kimmel, and O. We-
ber, Parallel algorithms for approximation of distance maps on paramet-
ric surfaces, Proc. ACM SIGGRAPH (2007), submitted.
[9] A. M. Bronstein, M. M. Bronstein, and R. Kimmel, Calculus of non-rigid
surfaces for geometry and texture manipulation, IEEE Trans. Visualiza-
tion and Computer Graphics, accepted.
[10] , Expression-invariant 3D face recognition, Proc. AVBPA, Lec-
ture Notes on Computer Science, no. 2688, Springer, 2003, pp. 62–69.
[11] , Expression-invariant face recognition via spherical embedding,
Proc. IEEE International Conf. Image Processing (ICIP), vol. 3, 2005,
pp. 756–759.
[12] , On isometric embedding of facial surfaces into S3, Proc. 5th In-
ternation Conf. Scale Space and PDE Methods in Computer Vision, Lec-
ture Notes in Comp. Science, no. 3459, Springer Verlag, 2005, pp. 622–
631.
[13] , Three-dimensional face recognition, IJCV 64 (2005), no. 1, 5–
30.
22
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
[14] , Efficient computation of isometry-invariant distances between
surfaces, SIAM Journal Scientific Computing 28 (2006), 1812–1836.
[15] , Face2face: an isometric model for facial animation, Proc.
AMDO, 2006, pp. 38–47.
[16] , Generalized multidimensional scaling: a framework for
isometry-invariant partial surface matching, Proc. National Academy
of Sciences 103 (2006), no. 5, 1168–1172.
[17] , Robust expression-invariant face recognition from partially
missing data, Proc. European Conf. Computer Vision (ECCV), 2006,
pp. 396–408.
[18] , Expression-invariant representation of faces, IEEE Trans. Im-
age Processing 16 (2007), no. 1, 188–197.
[19] A. M. Bronstein, M. M. Bronstein, R. Kimmel, and A. Spira, Face recog-
nition from facial surface metric, Proc. 8th European Conf. Computer
Vision, Lecture Notes in Comp. Science, no. 3023, Springer Verlag, 2004,
pp. 225–237.
[20] M. M. Bronstein, A. M. Bronstein, R. Kimmel, and I. Yavneh, A multi-
grid approach for multi-dimensional scaling, Proc. 12th Copper Moun-
tain Conference on Multigrid Methods, 2005, Best student paper award.
[21] , Multigrid multidimensional scaling, Numerical Linear Algebra
with Applications 13 (2006), 149–171.
[22] A.M. Bruckstein, N. Katzir, M. Lindenbaum, and M. Porat, Similarity-
invariant signatures for partially occluded planar shapes, IJCV 7 (1992),
no. 3, 271–285.
[23] D. Burago, Y. Burago, and S. Ivanov, A course in metric geometry,
Graduate studies in mathematics, vol. 33, American Mathematical So-
ciety, 2001.
23
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
[24] Y. Chen and G. Medioni, Object modeling by registration of multiple
range images, Proc. IEEE Conference on Robotics and Automation,
1991.
[25] F. R. K. Chung, Spectral graph theory, Am. Math. Soc., Providence, RI,
1997.
[26] R. R. Coifman, S. Lafon, A. B. Lee, M. Maggioni, B. Nadler, F. Warner,
and S. W. Zucker, Geometric diffusions as a tool for harmonic analysis
and structure definition of data: Diffusion maps, Proc. National Acad-
emy of Sciences 102 (2005), no. 21, 7426–7431.
[27] D. Donoho and C. Grimes, Hessian eigenmaps: Locally linear embed-
ding techniques for high-dimensional data, Proc. National Academy of
Science, vol. 100, 2003, pp. 5591–5596.
[28] A. Elad and R. Kimmel, Geometric methods in bio-medical image
processing, vol. 2191, ch. Spherical flattening of the cortex surface,
pp. 77–89, Springer-Verlag, Berlin Heidelberg New York, 2002.
[29] , On bending invariant signatures for surfaces, IEEE Trans.
PAMI 25 (2003), no. 10, 1285–1295.
[30] C. Gordon, D. L. Webb, and S. Wolpert, One cannot hear the shape of
the drum, Bulletin Am. Math. Soc. 27 (1992), no. 1, 134–138.
[31] C. Grimes and D. L. Donoho, Hessian eigenmaps: Locally linear embed-
ding techniques for high-dimensional data, Proc. National Academy of
Sciences 100 (2003), no. 10, 5591–5596.
[32] H. Groemer, Geometric applications of fourier series and spherical har-
monics, Cambridge University Press, New York, 1996.
[33] M. Gromov, Structures metriques pour les varietes riemanniennes,
Textes Mathematiques, no. 1, 1981.
24
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
[34] R. Grossman, N. Kiryati, and R. Kimmel, Computational surface flat-
tening: a voxel-based approach, IEEE Trans. PAMI 24 (2002), no. 4,
433–441.
[35] B. Horn, Closed-form solution of absolute orientation using unit quater-
nions, Journal of the Optical Society of America A 4 (1987), 629–642.
[36] D. Jacobs, D. Weinshall, and Y. Gdalyahu, Class representation and
image retrieval with non-metric distances, IEEE Trans. PAMI 22 (2000),
583–600.
[37] M. Kac, Can one hear the shape of a drum?, Amer. Math. Monthly 73
(1966), 1–23.
[38] S. Katz, G. Leifman, and A. Tal, Mesh segmentation using feature point
and core extraction, The Visual Computer 21 (2005), no. 8, 649–658.
[39] R. Kimmel and J. A. Sethian, Computing geodesic paths on manifolds,
Proc. National Academy of Sciences 95 (1998), no. 15, 8431–8435.
[40] L. J. Latecki and R. Lakamper, Shape similarity measure based on cor-
respondence of visual parts, IEEE Trans. PAMI 22 (2000), no. 10, 1185–
1190.
[41] L.J. Latecki, R. Lakaemper, and D. Wolter, Optimal Partial Shape Sim-
ilarity, Image and Vision Computing 23 (2005), 227–236.
[42] S. Leopoldseder, H. Pottman, and H. Zhao, The d2-tree: A hierarchical
representation of the squared distance function, Tech. report, Institute
of Geometry, Vienna University of Technology, 2003.
[43] H. Ling and D. Jacobs, Deformation invariant image matching, Proc.
ICCV, 2005.
[44] , Using the inner-distance for classification of articulated shapes,
Proc. CVPR, 2005.
25
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
[45] N. Linial, E. London, and Y. Rabinovich, The geometry of graphs and
some its algorithmic applications, Combinatorica 15 (1995), 333–344.
[46] F. Memoli and G. Sapiro, A theoretical and computational framework
for isometry invariant recognition of point cloud data, IMA preprint
series 1980, University of Minnesota, Minneapolis, MN 55455, USA,
June 2004.
[47] N. J. Mitra, N. Gelfand, H. Pottmann, and L. Guibas, Registration of
point cloud data from a geometric optimization perspective, Proc. Euro-
graphics Symposium on Geometry Processing, 2004, pp. 23–32.
[48] B. Mohar, Graph theory, combinatorics, and applications, vol. 2, ch. The
Laplacian spectrum of graphs, pp. 871–898, Wiley, 1991.
[49] M. Reuter, F.-E. Wolter, and N. Peinecke, Laplacebeltrami spectra as
shape-dna of surfaces and solids, Computer-Aided Design 38 (2006),
342–366.
[50] G. Rosman, M. M. Bronstein, A. M. Bronstein, and R. Kimmel, Topolog-
ically constrained isometric embedding, Human Motion - Understanding,
Modeling, Capture and Animation. 13th Workshop, 2006.
[51] S. T. Roweis and L. K. Saul, Nonlinear dimensionality reduction by
locally linear embedding, Science 290 (2000), no. 5500, 2323–2326.
[52] , Nonlinear dimensionality reduction by locally linear embedding,
Science 290 (2000), no. 5500, 2323–2326.
[53] M. E. Salukwadze, Vector-valued optimization problems in control the-
ory, Academic Press, 1979.
[54] E. L. Schwartz, A. Shaw, and E. Wolfson, A numerical solution to the
generalized mapmaker’s problem: flattening nonconvex polyhedral sur-
faces, IEEE Trans. PAMI 11 (1989), 1005–1008.
26
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
[55] A. Tal, M. Elad, and S. Ar, Content based retrieval of VRML objects -
an iterative and interactive approach, Proc. Eurographics Workshop on
Multimedia, 2001.
[56] M. R. Teague, Image analysis via the general theory of moments, Journal
of the Optical Society of America 70 (1979), 920–930.
[57] J. B. Tennenbaum and V. de Silva abd J. C Langford, A global geometric
framework for nonlinear dimensionality reduction, Science 290 (2000),
no. 5500, 2319–2323.
[58] J. Walter and H. Ritter, On interactive visualization of high-dimensional
data using the hyperbolic plane, Proc. ACM SIGKDD Int. Conf. Knowl-
edge Discovery and Data Mining, 2002, pp. 123–131.
[59] Z. Zhang and H. Zha, Principal manifolds and nonlinear dimension re-
duction via local tangent space alignment, Technical Report CSE-02-019,
Department of Computer Science and Engineering, Pennsylvania State
University, 2002.
[60] Z. Y. Zhang, Iterative point matching for registration of free-form curves
and surfaces, IJCV 13 (1994), no. 2, 119–152.
[61] G. Zigelman, R. Kimmel, and N. Kiryati, Texture mapping using surface
flattening via multi-dimensional scaling, IEEE Trans. Visualization and
computer graphics 9 (2002), no. 2, 198–207.
27
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
Figure 3: The set objects used in the second experiment. First three columns:
different nearly-isometric deformations of four objects. Fourth column: deforma-
tion from the third column with topology changes modeled by welding the mesh
in points indicated by red circles.
28
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
Figure 4: Visualization of the intrinsic similarity computed for the objects from
Figure 3 using GMDS.
29
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
Figure 5: Visualization of the extrinsic similarity computed for the objects from
Figure 3 using ICP.
30
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007
Figure 6: Visualization of the joint similarity computed for the objects from
Figure 3.
31
Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007