joint intrinsic and extrinsic similarity for recognition ... · joint intrinsic and extrinsic...

Joint intrinsic and extrinsic similarity for

recognition of non-rigid shapes

Alexander Bronstein, Michael Bronstein and Ron Kimmel

March 12, 2007

Abstract

This paper explores the problem of similarity criteria between non-

rigid shapes. Broadly speaking, such criteria are divided into intrinsic

and extrinsic, the first referring to the metric structure of the objects

and the latter to the Euclidean coordinates of the points of which the

objects comprises. Both criteria have their advantages and disadvan-

tages; extrinsic similarity is sensitive to non-rigid deformations of the

shapes, while intrinsic similarity is sensitive to topological noise. We

present an approach unifying both criteria in a single measure. We

consider the tradeoff between the extrinsic and intrinsic similarity and

use it as a set-valued “distance” related to the notion of Pareto opti-

mality in economics. Numerical results demonstrate the efficiency of

our approach in cases where using solely extrinsic or intrinsic criteria

would fail.

1 Introduction

Most of us are familiar with the childhood Rock, Paper, Scissors game, in

which the players bend their fingers in different ways to make the hand re-

semble one of the three objects (rock, represented by a closed fist; paper –

by an open palm; or scissors – by the extended index and middle fingers). In

1

Technion - Computer Science Department - Technical Report CIS-2007-01 - 2007

the context of pattern recognition, this example demonstrates the difficulty

of defining the similarity of non-rigid shapes: one may recognize the hand

shapes as objects they intend to imitate, while others may say that all these

“objects” are actually just deformations of the human hand. Using geometric

terminology, the first similarity criterion is extrinsic, i.e., looks at the prop-

erties of the shape related to the way it is embedded in the ambient space.

The second criterion, on the other hand, looks at the intrinsic properties of

the shape, described by its metric structure.

Extrinsic similarity of shapes has been widely addressed in computer vi-

sion, pattern recognition and computational geometry literature. Most of

the papers in these fields, either implicitly or explicitly, look at the prob-

lem of shape similarity from the extrinsic point of view (see, for example,

[22, 40, 36]). Here, while considering extrinsic similarity, we focus mainly on

a classical result for rigid object matching, called the iterative closest point

(ICP) method, introduced by Chen and Medioni [24] and Besl and McKay

[2]. ICP and its numerous variations [60, 42, 47] try to find a Euclidean

transformation between two surfaces, minimizing an extrinsic distance be-

tween them, usually a variant of the Hausdorff distance. In some sense, one

can think of ICP as finding the best possible rigid alignment between the

surfaces.

Another important class of extrinsic similarity methods is based on high-

order moments [56, 35, 32, 55]. The shape’s extrinsic geometry is represented

by a vector of its moments and these vectors are compared. Conceptually,

one can think of such methods as of a Fourier-like representation, where only

low-frequency components are used.

On the other hand, methods for the computation of intrinsic similarity

of shapes started penetrating into the computer vision community relatively

late. As the precursor of a recent trend in non-rigid shape analysis, we

can consider the paper of Schwartz et al. [54], in which a method for the

representation of the intrinsic geometry of the cortical surface of the brain

using multidimensional scaling (MDS) was presented. MDS is a family of

2


algorithms [3] commonly used in dimensionality reduction and data analysis

[51, 27, 57] and graph representation [45]. The idea of Schwartz et al. was ex-

tended by Elad and Kimmel [29], who proposed a non-rigid shape recognition

method based on Euclidean embeddings. The key idea is to map the metric

structure of the surfaces to a low-dimensional Euclidean space and compare

the resulting images (called canonical forms) in this space.1 Grossman et al.

extended this method to objects represented as voxels [34]. Applied to the

problem of three-dimensional face recognition, the canonical forms method

proved to be very efficient in being able to recognize identity of people while

being insensitive to their facial expressions [10, 19, 13]. Ling and Jacobs

used this method for recognition of articulated two-dimensional shapes [44].

Other applications were in texture mapping and object morphing [61, 15, 9],

mesh segmentation [38] and image matching [43].

One of the main disadvantages of the canonical forms is the fact that they

can represent the intrinsic geometry of the objects only approximately, as it

is generally impossible to find an isometry between a non-flat surface and a

flat Euclidean space. It was shown empirically in [58, 12, 11, 18] that using

spaces with non-Euclidean geometry, it is possible to obtain more accurate

representations. Memoli and Sapiro [46] showed how the representation er-

ror can be theoretically avoided by using to the Gromov-Hausdorff distance,

introduced by Mikhail Gromov in [33]. The Gromov-Hausdorff distance is

invariant to isometries, which makes it a useful tool for intrinsic shape com-

parison. Unfortunately, the computation of the Gromov-Hausdorff distance is

intractable, as its discretization results in a combinatorial problem. For this

reason, Memoli and Sapiro had to resort to an approximation, computing

1The the measurement of pairwise geodesic distances and the solution of the underlying

MDS problem are the two most computationally-intensive components of the canonical

forms method. The measurement of geodesic distance can be performed very efficiently

using the recently proposed parallel fast marching method [8]. The solution of MDS prob-

lem can be carried out efficiently using the multigrid framework [20, 21] or the vector

extrapolation methods [50]. This makes possible real-time applications like face recogni-

tion.

3


a different distance related to the Gromov-Hausdorff distance by a proba-

bilistic bound, rather than the Gromov-Hausdorff distance itself. Using the

fact that the Gromov-Hausdorff distance can be related to the distortion

of embedding one surface into another, we proposed an efficient and accu-

rate method for the computation of the Gromov-Hausdorff distance based

on a numerical core similar to MDS. This algorithm was named generalized

MDS (GMDS) [16, 14], and can be thought as a natural extension of pre-

vious works on isometric embedding into non-Euclidean spaces. GMDS was

used for the comparison of two-dimensional [7, 4] and three-dimensional [16]

shapes, face recognition [17], as well as problems of texture mapping and

shape manipulation [15, 9].

Another class of intrinsic similarity methods is based on the analysis of

spectral properties of the Laplace-Beltrami operator. Reuter et al. used

Laplace-Beltrami spectra, referring to them as “shape DNA”, for the char-

acterization of surfaces [49]. Since the Laplace-Beltrami operator is an in-

trinsic characteristic of the surface, it is insensitive to isometric deformations

[25, 48]. Intrinsic similarity criteria based on spectral analysis are intimately

related to methods used in manifold learning and data analysis; see e.g.

[52, 59, 1, 31, 26]. It appears, however, that such criteria are able to iden-

tify isospectral rather than isometric surfaces, and isospectrality is a weaker

property (two surfaces can be isospectral but not isometric, which implies,

following the beautiful expression of Kac [37], that “we cannot hear the shape

of the drum” [30]).

The choice of whether to use intrinsic or extrinsic similarity depends sig-

nificantly on the application. Giving the problem of face recognition as an

example, the intrinsic geometry captures the identity, while the extrinsic one

captures the expression. The drawback of extrinsic similarity is its sensitivity

to non-rigid deformations. Using our example, a gesture of the human hand

can be extrinsically more similar to a rock or scissors rather than another

hand. This make extrinsic criteria unsuitable for the analysis of non-rigid

objects possessing a high degree of flexibility. The intrinsic criterion, on the

4


Figure 1: Illustration of the difference between intrinsic and extrinsic similarity:

right and middle shapes are extrinsically similar while being intrinsically dissimilar

(the two shapes have different topology, since the fingers touch). Left and middle

shapes are intrinsically similar (one can be obtained from the other by means of a

near-isometric deformation), but extrinsically dissimilar.

other hand, is insensitive to deformations which can be approximated by

isometries, since isometries preserve the metric structure of the shape. How-

ever, intrinsic similarity is sensitive to topological noise and can be problem-

atic in cases where the objects are partially missing (for example, connecting

the fingers of the hand results in a significantly different intrinsic geometry

[6]). Consequently, it appears that two semantically similar shapes can be

substantially different both in their extrinsic and intrinsic geometries.

In this paper, we propose an approach for computing the similarity be-

tween non-rigid shapes trying to take the advantages of both intrinsic and

extrinsic similarity criteria, while avoiding their shortcomings. As an illus-

tration, one can think of fitting a rubber glove onto a hand. The extent to

which the rubber surface is stretched represents the intrinsic geometry dis-

tortion. The fit quality, or in other words, how close the glove is to the hand

5


surface, represents the extrinsic distance between the two objects. For two-

dimensional shapes, the motivation is the same, though less visual. Finding

an optimal tradeoff between the two distortions can be posed as a multicri-

terion optimization problem and related to the notion of Pareto optimality.2

The set of all Pareto optimal solutions can be represented as a set-valued

similarity criterion, which conceals much richer information than each of the

intrinsic and extrinsic criteria separately.

The rest of this paper is organized as follows. In Section 2, we intro-

duce the mathematical background and formulate the standard approaches

to measuring intrinsic and extrinsic similarity. In Section 3, we present our

approach for computing the joint intrinsic-extrinsic similarity using the for-

malism of Pareto optimality. In Section 4, we show the numerical framework

for computing this distance. Section 5 is dedicated to experimental results.

Section 6 concludes the paper.

2 Mathematical foundations

We model a non-rigid object as a pair (S, dS), where S is a two-dimensional

smooth compact connected and complete Riemannian manifold (possibly

with boundary), and dS : S × S → R is the geodesic metric measuring the

lengths of the shortest paths on the manifold, induced by the Riemannian

structure. For brevity of notation, we will write simply S, implying (S, dS).

Without loss of generality, we assume that our objects are embedded into R3.

The Euclidean coordinates of S can be represented by the map XS : S → R3,

where XS(s) is a vector of Euclidean coordinates of the point s ∈ S.

We broadly refer to properties described in terms of the metric dS as to

the intrinsic geometry of S, and to properties associated with the Euclidean

2Such an approach was first introduced in [5, 6] as a generalization of the approach

of Latecki et al. [41] in a more restricted setting of the problem, in which an intrinsic

criterion was used to measure partial similarity of shapes, and the tradeoff was between

the sizes of the parts cropped out of the surfaces (referred to as partiality) and the intrinsic

similarity between them.

6


coordinatesX as the extrinsic geometry. From the intrinsic point of view, two

objects S and Q are similar (isometric) if there exists a bijective distance-

preserving map (isometry) between S and Q. An isometry from S to itself is

called a self-isometry. The collection of all the self-isometries forms a group

with the function composition operator and is referred to as the isometric

group, denoted by Iso(S). More generally, S and Q are said to be ǫ-isometric

if there exists an ǫ-surjective map ϕ : S → Q (i.e., dQ(q, f(S)) ≤ ǫ for all

q ∈ Q), which has distortion

disϕ = sups,s′∈S

|dS(s, s′) − dQ(ϕ(s), ϕ(s′))| = ǫ.

Such an ϕ is called an ǫ-isometry. Isometries are different from ǫ-isometries.

Particularly, an isometry is always bi-Lipschitz continuous [23], which is not

necessarily true for an ǫ-isometry. If we relax the requirement of ǫ-surjectivity

by demanding that ϕ has only disϕ ≤ ǫ, we refer to such ϕ as an ǫ-isometric

embedding.

2.1 Extrinsic distances

The basic tool in extrinsic shape comparison is the Hausdorff distance, mea-

suring the distance between two sets of points X and Y in R3,

dR3

H (X,Y ) = max

supx∈X

dR3(x, Y ), supy∈Y

dR3(y,X)

,

where dR3(y,X) = infx∈X ‖y − x‖2 denotes the distance between the set X

and the point y. A non-symmetric version of the Hausdorff distance,

dR3

NH(X,Y ) = supx∈X

dR3(x, Y ), (1)

is often preferred since it allows for partial comparison of surfaces. The

Hausdorff distance (both its symmetric and non-symmetric versions) regards

the objects as sets of points in R3, is completely extrinsic, and consequently,

is sensitive to deformations of the shapes. It depends not only on non-rigid

7


but also on rigid transformations of shapes; for example, translating one

shape with respect to another changes the Hausdorff distance.

ICP-type methods try to get rid of this alignment ambiguity by minimiz-

ing the Hausdorff distance between the extrinsic coordinates XS and XQ of

the surfaces S and Q over all the isometries in R3 (the isometries in R

3 are

rigid transformations, including rotations, translations and reflections),

dICP(S,Q) = infi∈Iso(R3)

dR3

H (i XS , XQ).

Though theoretically a global minimum should be searched, in practice, local

optimization methods are usually employed, and thus, the ICP algorithm is

usually liable to converge to a suboptimal solution (local minimum). Though

resolving the rigid transformations ambiguity, ICP methods are still sensitive

to non-rigid deformations.

Another approach for extrinsic geometry comparison is based on geomet-

ric moment computation. For a surface in R3, a (p, q, r)-moment is given

by

mpqr(S) =

∫

S

(X1S(s))p(X2

S(s))q(X3S(s))rds, (2)

where X iS denotes the ith coordinate of XS . First and second order moments

(i.e., moments with p+q+r = 1 and 2) are responsible for surface alignment.

The vector (m100,m010,m001)T is the center of gravity of the surface extrinsic

coordinates XS . The matrix

Σ =

m200 m110 m101

m110 m020 m011

m101 m011 m002

(3)

contains information about the variance of XS : its eigenvectors are the di-

rections in which the variance of XS is the maximum. The surface can be

aligned by performing a translation which eliminates the first order moments

and applying a rotation which eliminates the mixed second order moments

(off-diagonal elements of Σ). In order to compare two surfaces, each of them

8


is first aligned in this way, and then a Hausdorff-like distance is computed

between the aligned extrinsic coordinates.

It is possible to use high-order geometric moments to compare the sur-

faces. High order moments can be regarded as decomposition coefficients of

the extrinsic coordinates of S in some basis. This can be seen by rewriting

the definition of mpqr as

mpqr(S) =

∫

R3

φpqr(x)ξ(x)dx = 〈φpqr, ξ〉, (4)

where φpqr(x) = (x1)p(x2)q(x3)r, and ξ(x) is the characteristic (indicator)

function of S, defined here a superposition of Dirac’s delta functions, tak-

ing the value of “infinity” for x ∈ XS and zero elsewhere in R3. The set

ψpqr∞p,q,r=0 spans the square integrable functions on R

3, hence, the set of

coefficients mpqr∞p,q,r=0 is a unique descriptor of a given surface, invariant

to rigid isometries if proper alignment is performed. Such a description is

complete, that is, at least theoretically, one can recover the surface from

mpqr∞p,q,r=0. The difference between two sets of coefficients corresponding

to two surfaces can be used as an extrinsic similarity criterion,

dMOM(S,Q) =∑

p,q,r

|mpqr(S) −mpqr(Q)|2. (5)

In practice, we can compute only a finite number of moments up to some

order p + q + r = P . Theoretically, the more mpqr’s we take, the better we

can describe our object, however, since high powers tend to amplify noise,

computation of high-order moments is numerically unstable.

2.2 Intrinsic distances

Canonical form-type methods attempt to formulate the problem of compari-

son of intrinsic geometry (invariant to isometric deformation) as the problem

of extrinsic geometry comparison, which is relatively easy to compute. The

approach consists of two stages. First, an extrinsic representation of the

intrinsic geometry of the shapes in some common metric space (X, dX) is

9


constructed by means of isometric embedding, attempting to find two maps

ϕ : S → X and ψ : Q → X with minimum distortions dis ϕ and dis ψ.

The embedding, in a sense, allows to “undo” all the isometric deformations

of the objects (though, some degree of ambiguity stemming from isometries

in X still remains). Typically, X is selected as the Euclidean space, though

other choices are possible [28, 58, 12, 11, 18]. There are a few criteria for the

embedding space choice. First, it is desired that X is homogeneous and its

isometric group is simple, in order to reduce the number of degrees of freedom

in the definition of the canonical forms. Secondly, an analytic expression for

dX allows a numerically relatively simple computation using MDS algorithms

[3]. Finally, it appears that for some classes of shapes, certain embedding

spaces are generally more suitable (see, for example, experimental results in

[12, 11, 18]).

The second stage is extrinsic comparison of the canonical forms ϕ(S)

and ψ(Q), using, for example, the ICP distance dICP(ϕ(S), ψ(Q)). Since

achieving a true isometric embedding is usually impossible [45], the canonical

forms are only an approximate representation of the intrinsic geometry of the

shapes.

The problem of inaccuracy introduced by the embedding into X can be

resolved if we do not assume a given embedding space, but instead, include X

as a variable into the optimization. We can always find a sufficiently compli-

cated metric space into which both S and Q can be embedded isometrically,

and compare the images using the Hausdorff distance,

dGH(S,Q) = infX

ϕ:S→X

ψ:Q→X

dX

H(ϕ(S), ψ(Q)),

(here ϕ and ψ are assumed to be isometric embeddings). Such an approach

is commonly referred to as Gromov-Hausdorff distance [33]. For compact

surfaces, the Gromov-Hausdorff distance can be expressed in terms of the

10


distortion obtained by embedding one surface into another,

dGH(S,Q) =1

2inf

ϕ:S→Q

ψ:S→Q

maxdisϕ, disψ, dis (ϕ, ψ).

Here, dis (ϕ, ψ) = sups∈S,q∈Q |dS(s, ψ(q))− dQ(q, ϕ(s))|. The computation of

the distortions is performed using GMDS, a procedure similar in its spirit

to MDS, but not limited to spaces with analytically expressed geodesic dis-

tances.

The Gromov-Hausdorff distance is a metric on the quotient space of non-

rigid objects under the isometry relation. Particularly, this implies that

dGH(S,Q) = 0 if and only if S and Q are isometric. More generally, if

dGH(S,Q) ≤ ǫ, then S and Q are 2ǫ-isometric and conversely, if S and Q are

ǫ-isometric, then dGH(S,Q) ≤ 2ǫ [23].

3 Combining intrinsic and extrinsic similari-

ties

Let us now return to our example of glove fitting. Assume that Q is the

hand surface, and S is the glove we wish to fit. Our goal is to find such a

deformation of S, denoted hereinafter by S ′ = f(S), that is similar the most

to Q in the extrinsic sense, yet, at the same time preserves its intrinsic geom-

etry as much as possible. In a generic setting, we assume that Q (referred

to as probe) is obtained from S (model) by means of some deformation (not

necessarily isometric). We allow for the possibility that the topology of Q is

different from that of S due to the presence of noise or as the result of the

deformation (e.g, in the glove fitting example, the fingers of the hand can

touch each other).

In order to quantify the intrinsic and extrinsic similarity, we define a

generic extrinsic distance dE, and a generic intrinsic distance dI. We require

that dE(S,Q) = 0 if S and Q are extrinsically similar (i.e., XS = XQ)

and dI(S,Q) = 0 if S and Q are intrinsically similar (isometric). Since the

11


deformation f gives a one-to-one correspondence between S and S ′, we can

define the mapping ϕ : S → S ′ copying s ∈ S to f(s) on S ′. Fixing ϕ

and setting ψ = ϕ−1 in the Gromov-Hausdorff distance, yields the following

intrinsic distance,

dI(S,S′) = dis ϕ = sup

s,s′∈S

|dS(s, s′) − dS′(ϕ(s), ϕ(s′))|. (6)

Note that the correspondence between the surfaces is now fixed and does not

participate anymore in the minimization. dI(S,S′) defined this way measures

the distortion in the intrinsic geometry of S introduced by the extrinsic

deformation f . However, the geodesic distances in S ′ have to be re-computed

every time the deformation changes.

As the extrinsic distance, we use the non-symmetric Hausdorff distance,

which can be expressed as follows,

dE(Q,S ′) = dR3

NH(XQ, XS′) = sups′∈S′

‖XS′(s′) −XQ(q∗(s′))‖2, (7)

where q∗(s′) = arg minq∈Q ‖XS′(s′) − XQ(q)‖2 is the closest point to s′ on

Q. We write q∗ assuming that the infimum is achieved. Since in practice we

work with discrete objects, this assumption always holds. The set of closest

points q∗(S ′) has to be recomputed every time the deformation changes. The

lack of symmetry of dE(Q,S ′) allows for partial matching.

3.1 Pareto optimality and set-valued distances

Since it is usually impossible to say which of the two criteria is more impor-

tant, we judge the similarity as a tradeoff between them: the joint intrinsic-

extrinsic similarity between two shapes can be quantitatively represented by

the extent to which we have to modify the extrinsic geometry in order to

make the two shapes intrinsically similar, or alternatively, extent to which

we have to modify the intrinsic geometry in order to make the two shapes

extrinsically similar. This, in turn, can be formulated as a multicriterion

optimization problem, in which we bring to minimum with respect to S ′ the

12


vector objective function Φ(S ′) = (dI(S′,S), dE(S ′,Q)). Unlike optimiza-

tion with a scalar objective, we cannot define unambiguously the minimum

of Φ, since there does not exist a total order relation between the criteria

– we cannot say, for example, whether it is better to have Φ = (0.5, 1) or

Φ = (1, 0.5). At the same time, we have no doubt that Φ = (0.5, 0.5) is

better than Φ = (1, 1), since both criteria have smaller values. This allows

us define a minimizer of our vector objective Φ as an S∗, such that there is

no any other S ′ for which Φ(S ′) is better that Φ(S∗) (which can be expressed

as a vector inequality Φ(S∗) > Φ(S ′)). Such an S∗ is called non-inferior or

Pareto optimal. Pareto optimum is not unique; we denote by Ω∗ the set of

S∗ that satisfy the above relation. The corresponding values of Φ(Ω∗) are

referred to as a Pareto frontier and can be visualized as a planar curve.

In [12, 6], Bruckstein and the authors proposed considering the entire

Pareto frontier as a criterion of similarity, of which we can think as a gener-

alized, set-valued distance. In our case, this distance measures the tradeoff

between the intrinsic and extrinsic similarity. We denote dP(S,Q) = Φ(Ω∗)

and call it the Pareto distance. Our approach generalizes the similarity cri-

teria based purely on extrinsic or intrinsic geometry. Asserting dI(S,S′) = 0,

our approach in a sense can be considered as a generalization of ICP, since

we now allow for non-rigid isometries and not only for the rigid ones. If we

further restrict the family of transformations to the rigid ones, the traditional

ICP method is obtained.3 On the other hand, if we require dE(Q,S ′) = 0,

the probe is forced to be attached to the model surface, which boils down to

a problem similar to GMDS.

3.2 Salukwadze distances

As we argued in [12, 6], a set-valued distance may be inconvenient to use

since only a partial order relation exists between the Pareto frontiers – given

two set-valued distances, we cannot in general say which of them is “smaller”,

3Polyhedral surfaces are known to be rigid, hence, in this case, the only isometries are

the rigid ones. Allowing for dI(S,S ′) > 0 makes the class of deformations non-trivial.

13


unless one curve is entirely above or below the other. In multicriterion op-

timization theory, there is a known way to convert a vector optimization

problem into a scalar one. It has been researched mainly in the field of op-

timal control [53] by Salukwadze. Among all the Pareto optima, we cannot

prefer any since they are non-comparable: we cannot say which Pareto opti-

mum is better. However, ideally we want to bring both of our criteria to zero,

that is, achieve some “utopia point” (0, 0). Salukwadze proposed choosing a

single point on the Pareto frontier the closest to the utopia point, in sense of

some distance.

Using this idea, we define the Salukwadze distance as

dSAL(S,Q) = inf(ǫ1,ǫ2)∈dP(S,Q)

‖(ǫ1, ǫ2)‖R2+, (8)

where ‖ · ‖R2+

is some norm on R2+. In the following, we will consider a family

of norms ‖(ǫ1, ǫ2)‖λ = ǫ1 + λǫ2, for λ > 0. The Salukwadze distance in this

case can be written as

dSAL(S,Q) = minS′

dI(S,S′) + λdE(S ′,Q). (9)

We can intuitively think of dSAL as of a tradeoff between the two criteria.

Different selections of the multiplier λ attribute importance either to dI or

dE, and gives us different points on the Pareto frontier.

4 Numerical framework

For practical computations, we work with discretized shapes. The surface S

is sampled at N points SN = s1, ..., sN ⊆ S, constituting an r-covering,

i.e., S =⋃N

n=1BS(sn, r) (here, BS(s, r) is a metric ball of radius r around

s). The extrinsic coordinates of SN are represented as an N × 3 matrix

XSN, each row of which corresponds to a point XS(si) ∈ R

3. The discrete

shape is represented as a triangular mesh; each triangle is a triplet of indices

of vertices belonging to it. The maximum length of an edge is r. Vertices

connected by an edge (or in other words, belonging to the same triangle) are

14


said to be adjacent ; we describe the adjacency by the set E of all adjacent

pairs of vertices in SN . The geodesic distances on SN are approximated using

the fast marching method [39], forming an N ×N matrix DSN.

Assuming the deformed surface S ′N = f(SN) maintains the connectivity

of SN , we can formulate the following minimization problem with respect to

the N × 3 matrix X of the extrinsic coordinates of S ′N :

dSAL(SN ,QM) = minX

1

N2

N∑

i,j=1

(dSN(si, sj) − dij(X))2 +

λ

N

N∑

i=1

d2(xi,QM)

(10)

where dij(X) = dS′

N(s′i, s

′j) denote the geodesic distances on S ′

N , and d(xi,QM)

denotes the Euclidean distance from the point xi to the discretized surface Q.

The first term of the above cost function is the discretization of dI, whereas

the second term is the discretization of dE. In the sequel, we show how to

compute these two terms and their derivatives with respect to X, required

for the minimization of (10).

4.1 Intrinsic distance computation

The main challenge in the computation of the intrinsic distance term dI is

the necessity to evaluate the geodesic distances on S ′N and their derivatives

with respect to the extrinsic geometry of S ′N changing at each iteration of

the minimization algorithm. The simplest remedy would be to modify the

intrinsic term by restricting i and j to be neighboring points only,

dI ≈1

|E|

∑

(i,j)∈E

(dSN(si, sj) − dS′

N(s′i, s

′j))

2. (11)

This way, we use only the local distances on S ′N , which can be approximated

as the Euclidean distances dS′

N(s′i, s

′j) = ‖xi−xj‖. However, such a modifica-

tion makes dI significantly less sensible to large deformations of SN . Indeed,

many deformations change the local distance only slightly, while introducing

large distortion to larger distances.

15


In order to penalize for such deformations of SN , we need to approximate

the full matrix of geodesic distances on S ′N . We first define the matrix D(X)

of local distances, whose elements are, as before,

dij(X) =

‖xi − xj‖ : (i, j) ∈ E

0 : (i, j) /∈ E.(12)

Using the Dijkstra algorithm, we compute the set of shortest paths between

all pairs of points (i, j). For example, let Pij = (i, i1), (i1, i2), ..., (in−1, in), (in, j) ⊂

E be the shortest path between the points i and j. Its length is given by

L(Pij) = di,i1 + di1,i2 + ... + din,j, which is a linear combination of the el-

ements of D(X). We can therefore “complete” the missing entries in the

matrix D(X) by defining the matrix of global distances

D(X) = I(D(X))D(X), (13)

where I is a sparse fourth order tensor, with the elements Iijkl = 1 if the

edge (k, l) is contained in the shortest path Pij, and 0 otherwise. Note that I

depends on the connectivity E, which is assumed to be fixed, and the matrix

of local distances D, which, in turn, depends on X.

It is straightforward to verify that the entries of D(X) and D(X) coincide

for all (i, j) ∈ E. The global distance matrix D(X) constitutes an approxi-

mation of the geodesic distances on the surface S ′N , dij ≈ dS′

N(s′i, s

′j), while

having a simple linear form in terms of the local distances. A consistent and

more accurate estimate of dS′

Ncan be produced by replacing the Dijkstra

algorithm with Fast Marching and allowing the shortest paths to pass on

the faces of the mesh representing S ′N . However, such an approach results in

more elaborate expressions and will not be discussed here.

In order to compute the derivative of D(X) with respect to X, we assume

that a small perturbation dX of X does not change the connectivity of the

points on S ′N , and as the results, the trajectory of the shortest paths between

the points on S ′N remains constant (though their length may change). Thus,

we may write

D(X + dX) = I(D(X + dX))D(X + dX) = I(D(X))D(X + dX),

16


and compute the derivative of D(X) as the derivative of the linear form

I(X)D(X). If I(D(X)) 6= I(D(X+dX)), the assumption does not hold and

the derivative of D usually does not exist. Yet, the derivative of I(X)D(X)

belongs to the sub-gradient set of D at the point X. This is sufficient for

many minimization algorithms to work correctly.

The intrinsic distance term can be readily written in terms of D as the

Frobenius norm

dI(X) =1

N2‖D(X) −DSN

‖2F

=1

N2trace((D(X) −DSN

)T(D(X) −DSN)). (14)

Its derivative with respect to X is given by

∂dI(X)

∂X=

2

N2(D(X) −DSN

)T∂D(X)

∂X, (15)

where

∂dij(X)

∂X=

∑

k,l

Iijkl∂dkl(X)

∂X, (16)

and the elements of ∂dkl(X)∂X

are given by

∂dkl∂xmn

=1

dkl

xmk − xml : n = k

xml − xmk : n = l

0 : n 6= k, l

(17)

for m = 1, 2, 3.

4.2 Extrinsic distance computation

The computation of the extrinsic distance is similar to the one used in ICP

algorithms, where the main difficulty arises from the need to re-compute the

closest points each time the extrinsic geometry of S ′N changes. The extrinsic

distance term can be written as

dE(X) =1

Ntrace ((X − Y ∗(X))(X − Y ∗(X))T) (18)

17


where Y ∗(X) denotes the N×3 matrix, whose i-th row y∗i is the closest point

on Q corresponding to xi. The closest points y∗i are computed as a weighted

average of the points on Q, which are the closest to xi. The weights are

selected in inverse proportion to the distance from xi.

In ICP algorithms, it is common to assume that Y ∗(X + dX) ≈ Y ∗(X).

By fixing Y ∗, dE(X) becomes a simple quadratic function, and its derivative

can be written as

∂dE(X)

∂X=

2

N(X − Y ∗(X))T. (19)

4.3 Iterative minimization algorithm

For X in the neighborhood of some X0, the cost function that needs to be

minimized can be approximated as

σ(X) ≈1

N2trace (D(X)TD(X) − 2DT

SND(X) +DT

SNDSN

)

+λ

Ntrace (XXT − 2Y ∗(X0)X

T + Y ∗T(X0)Y∗(X0)), (20)

where D(X) = I(X0)D(X).Like in ICP, after finding a new X which de-

creases σ(X), the closest points Y ∗ and the operator I are updated. The

iterative minimization algorithm can be summarized as follows:

Compute the closest points Y ∗(X).1

Compute the shortest paths between all pairs of points on S ′N and2

assemble I.

Update X by ∆X such that X + ∆X sufficiently decreasing the cost3

function (20).

If the change in X is small, stop. Else, go to Step 1.4

The update of X in Step 3 can be safeguarded by evaluating the true cost

function (with I(X+∆X) and Y ∗(X+∆X) instead of I(X) and Y ∗(X)). In

our implementation, no safeguard was used, and the minimization in Step 3

was done using conjugate gradients.

18


The initialization of the algorithm can be done in several ways, the sim-

plest of which is X = XSN. This choice works well when the extrinsic dissim-

ilarity between S and Q is not too large; for large dE(S,Q), the algorithm

will suffer from poor convergence similar to most ICP methods. Another

choice is to initialize X by the corresponding points on QM resulting from

the solution of the GMDS problem. This choice is suitable for objects having

sufficiently similar intrinsic geometries, making the intrinsic correspondence

computed by GMDS meaningful.

5 Results

5.1 Discriminative power

In order to demonstrate the discriminative power of the proposed approach,

we measured the joint intrinsic-extrinsic similarity quantified in terms of

the Salukwadze distance dSAL between a unit spherical patch and spherical

patches of varying radii. The surfaces were discretized at 150 points. Figure 2

shows that our distance is capable of distinguishing between patches with

curvature varying by as little as 0.5%. A change in curvature by 1% shows a

one order of magnitude increase in the measured distance.

5.2 Topology changing deformations

In this experiment, we used the data set of four objects with four non-rigid

deformations. While the first three deformations of each object were nearly

isometric, the fourth one introduced topological changes modeled by welding

points on the mesh (Figure 3).

Three types of distances were computed: a non-symmetric approxima-

tion of the Gromov-Hausdorff distance, a non-symmetric approximation of

the Hausdorff distance, and the Salukwadze distance dSAL. The Gromov-

Hausdorff distance was computed using GMDS with 100 points embedded

into a surface represented with 1000 points. The Hausdorff distance was

19


Figure 2: Normalized Salukwadze distance dSAL computed between a unit spher-

ical patch and spherical patches of varying radii.

computed using ICP with meshes sampled at 1000 points. The Salukwadze

distance was computed on meshes sampled at 100 points. From Figures 4–6,

it appears that while the intrinsic similarity is nearly invariant to isometric

deformations, it is sensitive to topological changes, which introduce signifi-

cant distortions in the intrinsic geometry. On the other hand, the extrinsic

geometry is insensitive to topological changes, yet sensitive to intrinsic de-

formation. The joint similarity criterion combines both advantages, and is

insensitive to both isometries and topological changes.

6 Conclusions

We presented a new approach for the computation of non-rigid shape sim-

ilarity as a tradeoff between extrinsic and intrinsic similarity criteria. Our

approach can be illustratively presented as deforming one shape in order to

make it the most similar to another from the extrinsic point of view, while

trying to preserve as much as possible its intrinsic geometry. The joint intrin-

sic and extrinsic similarity appears to be advantageous over traditional purely

extrinsic or intrinsic similarity criteria. While extrinsic similarity is sensitive

20


to strong non-rigid deformations of the shapes and intrinsic similarity is sen-

sitive to topology changes resulting from noise or non-rigid deformations, our

joint similarity criterion allows to gracefully handle such problems. Exper-

imental results prove that it can be used in situations where intrinsic and

extrinsic similarities fail.

From the perspective of prior work, our approach can be thought as a

generalization of ICP and GMDS methods. ICP methods compute extrin-

sic similarity of shapes by means of finding a rigid isometry minimizing the

extrinsic distance between them. Our method extends the class of transfor-

mations, allows for non-rigid isometries and near-isometries. Using GMDS,

intrinsic similarity of shapes is computed as the degree of distortion when

trying to embed one shape into another. This problem can be regarded as a

particular case of our approach in which we constraint the extrinsic distance

between the two surfaces to be zero. The joint similarity is also related to

our previous papers with Alfred Bruckstein on set-valued distances, since its

computation can be formulated as a multicriterion optimization problem, in

which we try to simultaneously minimize the intrinsic and extrinsic distances.

The numerical framework presented in this paper extends beyond shape

similarity problems. As a byproduct of our similarity computation, we obtain

non-rigid alignment of two shapes. Additional potential applications are

inverse problems arising in shape reconstruction. Our intrinsic and extrinsic

distances can be used as priors for regularization of such problems.

References

[1] M. Belkin and P. Niyogi, Laplacian eigenmaps for dimensionality re-

duction and data representation, Neural Computation 13 (2003), 1373–

1396.

[2] P. J. Besl and N. D. McKay, A method for registration of 3D shapes,

IEEE Trans. PAMI 14 (1992), 239–256.

21


[3] I. Borg and P. Groenen, Modern multidimensional scaling - theory and

applications, Springer-Verlag, Berlin Heidelberg New York, 1997.

[4] A. M. Bronstein, M. M. Bronstein, A. M. Bruckstein, and R. Kimmel,

Analysis of two-dimensional non-rigid shapes, IJCV, submitted.

[5] , Paretian similarity of non-rigid objects, Scale Space, accepted.

[6] , Partial similarity of objects, or how to compare a centaur to a

horse, IEEE Trans. PAMI, submitted.

[7] , Matching two-dimensional articulated shapes using generalized

multidimensional scaling, Proc. AMDO, 2006, pp. 48–57.

[8] A. M. Bronstein, M. M. Bronstein, Y. S. Devir, R. Kimmel, and O. We-

ber, Parallel algorithms for approximation of distance maps on paramet-

ric surfaces, Proc. ACM SIGGRAPH (2007), submitted.

[9] A. M. Bronstein, M. M. Bronstein, and R. Kimmel, Calculus of non-rigid

surfaces for geometry and texture manipulation, IEEE Trans. Visualiza-

tion and Computer Graphics, accepted.

[10] , Expression-invariant 3D face recognition, Proc. AVBPA, Lec-

ture Notes on Computer Science, no. 2688, Springer, 2003, pp. 62–69.

[11] , Expression-invariant face recognition via spherical embedding,

Proc. IEEE International Conf. Image Processing (ICIP), vol. 3, 2005,

pp. 756–759.

[12] , On isometric embedding of facial surfaces into S3, Proc. 5th In-

ternation Conf. Scale Space and PDE Methods in Computer Vision, Lec-

ture Notes in Comp. Science, no. 3459, Springer Verlag, 2005, pp. 622–

631.

[13] , Three-dimensional face recognition, IJCV 64 (2005), no. 1, 5–

30.

22


[14] , Efficient computation of isometry-invariant distances between

surfaces, SIAM Journal Scientific Computing 28 (2006), 1812–1836.

[15] , Face2face: an isometric model for facial animation, Proc.

AMDO, 2006, pp. 38–47.

[16] , Generalized multidimensional scaling: a framework for

isometry-invariant partial surface matching, Proc. National Academy

of Sciences 103 (2006), no. 5, 1168–1172.

[17] , Robust expression-invariant face recognition from partially

missing data, Proc. European Conf. Computer Vision (ECCV), 2006,

pp. 396–408.

[18] , Expression-invariant representation of faces, IEEE Trans. Im-

age Processing 16 (2007), no. 1, 188–197.

[19] A. M. Bronstein, M. M. Bronstein, R. Kimmel, and A. Spira, Face recog-

nition from facial surface metric, Proc. 8th European Conf. Computer

Vision, Lecture Notes in Comp. Science, no. 3023, Springer Verlag, 2004,

pp. 225–237.

[20] M. M. Bronstein, A. M. Bronstein, R. Kimmel, and I. Yavneh, A multi-

grid approach for multi-dimensional scaling, Proc. 12th Copper Moun-

tain Conference on Multigrid Methods, 2005, Best student paper award.

[21] , Multigrid multidimensional scaling, Numerical Linear Algebra

with Applications 13 (2006), 149–171.

[22] A.M. Bruckstein, N. Katzir, M. Lindenbaum, and M. Porat, Similarity-

invariant signatures for partially occluded planar shapes, IJCV 7 (1992),

no. 3, 271–285.

[23] D. Burago, Y. Burago, and S. Ivanov, A course in metric geometry,

Graduate studies in mathematics, vol. 33, American Mathematical So-

ciety, 2001.

23


[24] Y. Chen and G. Medioni, Object modeling by registration of multiple

range images, Proc. IEEE Conference on Robotics and Automation,

1991.

[25] F. R. K. Chung, Spectral graph theory, Am. Math. Soc., Providence, RI,

1997.

[26] R. R. Coifman, S. Lafon, A. B. Lee, M. Maggioni, B. Nadler, F. Warner,

and S. W. Zucker, Geometric diffusions as a tool for harmonic analysis

and structure definition of data: Diffusion maps, Proc. National Acad-

emy of Sciences 102 (2005), no. 21, 7426–7431.

[27] D. Donoho and C. Grimes, Hessian eigenmaps: Locally linear embed-

ding techniques for high-dimensional data, Proc. National Academy of

Science, vol. 100, 2003, pp. 5591–5596.

[28] A. Elad and R. Kimmel, Geometric methods in bio-medical image

processing, vol. 2191, ch. Spherical flattening of the cortex surface,

pp. 77–89, Springer-Verlag, Berlin Heidelberg New York, 2002.

[29] , On bending invariant signatures for surfaces, IEEE Trans.

PAMI 25 (2003), no. 10, 1285–1295.

[30] C. Gordon, D. L. Webb, and S. Wolpert, One cannot hear the shape of

the drum, Bulletin Am. Math. Soc. 27 (1992), no. 1, 134–138.

[31] C. Grimes and D. L. Donoho, Hessian eigenmaps: Locally linear embed-

ding techniques for high-dimensional data, Proc. National Academy of

Sciences 100 (2003), no. 10, 5591–5596.

[32] H. Groemer, Geometric applications of fourier series and spherical har-

monics, Cambridge University Press, New York, 1996.

[33] M. Gromov, Structures metriques pour les varietes riemanniennes,

Textes Mathematiques, no. 1, 1981.

24


[34] R. Grossman, N. Kiryati, and R. Kimmel, Computational surface flat-

tening: a voxel-based approach, IEEE Trans. PAMI 24 (2002), no. 4,

433–441.

[35] B. Horn, Closed-form solution of absolute orientation using unit quater-

nions, Journal of the Optical Society of America A 4 (1987), 629–642.

[36] D. Jacobs, D. Weinshall, and Y. Gdalyahu, Class representation and

image retrieval with non-metric distances, IEEE Trans. PAMI 22 (2000),

583–600.

[37] M. Kac, Can one hear the shape of a drum?, Amer. Math. Monthly 73

(1966), 1–23.

[38] S. Katz, G. Leifman, and A. Tal, Mesh segmentation using feature point

and core extraction, The Visual Computer 21 (2005), no. 8, 649–658.

[39] R. Kimmel and J. A. Sethian, Computing geodesic paths on manifolds,

Proc. National Academy of Sciences 95 (1998), no. 15, 8431–8435.

[40] L. J. Latecki and R. Lakamper, Shape similarity measure based on cor-

respondence of visual parts, IEEE Trans. PAMI 22 (2000), no. 10, 1185–

1190.

[41] L.J. Latecki, R. Lakaemper, and D. Wolter, Optimal Partial Shape Sim-

ilarity, Image and Vision Computing 23 (2005), 227–236.

[42] S. Leopoldseder, H. Pottman, and H. Zhao, The d2-tree: A hierarchical

representation of the squared distance function, Tech. report, Institute

of Geometry, Vienna University of Technology, 2003.

[43] H. Ling and D. Jacobs, Deformation invariant image matching, Proc.

ICCV, 2005.

[44] , Using the inner-distance for classification of articulated shapes,

Proc. CVPR, 2005.

25


[45] N. Linial, E. London, and Y. Rabinovich, The geometry of graphs and

some its algorithmic applications, Combinatorica 15 (1995), 333–344.

[46] F. Memoli and G. Sapiro, A theoretical and computational framework

for isometry invariant recognition of point cloud data, IMA preprint

series 1980, University of Minnesota, Minneapolis, MN 55455, USA,

June 2004.

[47] N. J. Mitra, N. Gelfand, H. Pottmann, and L. Guibas, Registration of

point cloud data from a geometric optimization perspective, Proc. Euro-

graphics Symposium on Geometry Processing, 2004, pp. 23–32.

[48] B. Mohar, Graph theory, combinatorics, and applications, vol. 2, ch. The

Laplacian spectrum of graphs, pp. 871–898, Wiley, 1991.

[49] M. Reuter, F.-E. Wolter, and N. Peinecke, Laplacebeltrami spectra as

shape-dna of surfaces and solids, Computer-Aided Design 38 (2006),

342–366.

[50] G. Rosman, M. M. Bronstein, A. M. Bronstein, and R. Kimmel, Topolog-

ically constrained isometric embedding, Human Motion - Understanding,

Modeling, Capture and Animation. 13th Workshop, 2006.

[51] S. T. Roweis and L. K. Saul, Nonlinear dimensionality reduction by

locally linear embedding, Science 290 (2000), no. 5500, 2323–2326.

[52] , Nonlinear dimensionality reduction by locally linear embedding,

Science 290 (2000), no. 5500, 2323–2326.

[53] M. E. Salukwadze, Vector-valued optimization problems in control the-

ory, Academic Press, 1979.

[54] E. L. Schwartz, A. Shaw, and E. Wolfson, A numerical solution to the

generalized mapmaker’s problem: flattening nonconvex polyhedral sur-

faces, IEEE Trans. PAMI 11 (1989), 1005–1008.

26


[55] A. Tal, M. Elad, and S. Ar, Content based retrieval of VRML objects -

an iterative and interactive approach, Proc. Eurographics Workshop on

Multimedia, 2001.

[56] M. R. Teague, Image analysis via the general theory of moments, Journal

of the Optical Society of America 70 (1979), 920–930.

[57] J. B. Tennenbaum and V. de Silva abd J. C Langford, A global geometric

framework for nonlinear dimensionality reduction, Science 290 (2000),

no. 5500, 2319–2323.

[58] J. Walter and H. Ritter, On interactive visualization of high-dimensional

data using the hyperbolic plane, Proc. ACM SIGKDD Int. Conf. Knowl-

edge Discovery and Data Mining, 2002, pp. 123–131.

[59] Z. Zhang and H. Zha, Principal manifolds and nonlinear dimension re-

duction via local tangent space alignment, Technical Report CSE-02-019,

Department of Computer Science and Engineering, Pennsylvania State

University, 2002.

[60] Z. Y. Zhang, Iterative point matching for registration of free-form curves

and surfaces, IJCV 13 (1994), no. 2, 119–152.

[61] G. Zigelman, R. Kimmel, and N. Kiryati, Texture mapping using surface

flattening via multi-dimensional scaling, IEEE Trans. Visualization and

computer graphics 9 (2002), no. 2, 198–207.

27


Figure 3: The set objects used in the second experiment. First three columns:

different nearly-isometric deformations of four objects. Fourth column: deforma-

tion from the third column with topology changes modeled by welding the mesh

in points indicated by red circles.

28


Figure 4: Visualization of the intrinsic similarity computed for the objects from

Figure 3 using GMDS.

29


Figure 5: Visualization of the extrinsic similarity computed for the objects from

Figure 3 using ICP.

30


Figure 6: Visualization of the joint similarity computed for the objects from

Figure 3.

31


joint intrinsic and extrinsic similarity for recognition ... · joint intrinsic and extrinsic...

Documents