
JMIV manuscript No.(will be inserted by the editor)

From a Non-Local Ambrosio-Tortorelli Phase Field to a Randomized Part Hierarchy Tree

Sibel Tari · Murat Genctav

Received: date / Accepted: date

Abstract In its most widespread imaging and vision applications, the Ambrosio and Tortorelli (AT) phase field is a technical device for applying gradient descent to the Mumford and Shah simultaneous segmentation and restoration functional or its extensions. As such, it forms a diffuse alternative to sharp interfaces or level sets and parametric techniques. The functionality of the AT field, however, is not limited to segmentation and restoration applications. We demonstrate the possibility of coding parts (features that are higher level than edges and boundaries) after incorporating higher level influences via distances and averages. The parts, iteratively extracted using the level curves with double point singularities, are organized as a proper binary tree. Inconsistencies due to non-generic configurations for level curves as well as due to visual changes such as occlusion are successfully handled once the tree is endowed with a probabilistic structure. As a proof of concept, we present 1) the most probable configurations from our randomized trees; and 2) correspondence matching results between illustrative shape pairs.

The work is a significant step towards establishing exponentially decaying diffuse distance fields as bridges between low level visual processing and shape computations.

Keywords bridging low level and high level vision · shape computation · screened Poisson PDE · implicit representations · linear model for reaction-diffusion

A preliminary conference version introducing the randomized part hierarchy tree appeared in SSVM 2011. The non-local field was first presented in [48].

S. Tari and M. Genctav
Middle East Technical University
06800 Ankara, Turkey
Tel.: +90-312-2105554
E-mail: [email protected]

1 Introduction

The phase field of Ambrosio and Tortorelli [1], serving as a continuous indicator for the boundary/non-boundary state at every image point, has proven to be a powerful tool in variational image and shape analysis, where formulations typically involve both region and boundary terms. Some examples are [8, 16, 18, 24, 34, 38]. In computer vision, the Ambrosio and Tortorelli (AT) field first appeared in [43] merely as a technical device to apply gradient descent to the well-known segmentation energy of Mumford and Shah [32]. Over the years, however, it has been extended in numerous ways to address a rich variety of visual applications. Earlier works include Shah et al [46], Pien et al [37], Proesman et al [39], March and Dozio [29], Teboul et al [53]. During the last couple of years, we have witnessed an increasing number of promising works modifying or extending AT based models, including stochastic extension [34, 33] and incorporation of contextual feedback [18, 19].

In a nutshell, the AT phase field is a minimizer of an energy composed of two competing terms. One term favors configurations that take values close to either 0 or 1 (separation into boundary/non-boundary phases) and the other term encourages local interaction in the domain by penalizing spatial inhomogeneity. The AT field is given by

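$$\arg\min_{v_\rho} \iint_\Omega \Big[ \underbrace{\frac{1}{\rho}\,(v_\rho(\mathbf{x}) - 1)^2}_{\text{boundary/interior separation}} + \underbrace{\rho\,|\nabla v_\rho(\mathbf{x})|^2}_{\text{local interaction}} \Big]\,dx\,dy \quad \text{with } v_\rho(\mathbf{x}) = 0 \text{ for } \mathbf{x} = (x, y) \in \partial\Omega \tag{1}$$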

Here, $\Omega \subset \mathbb{R}^2$ is an open set denoting a shape and $\partial\Omega$ is its boundary; $v_\rho$ is the field defined over the compact set $\Omega \cup \partial\Omega$ and parametrized by $\rho$, which controls the relative influences of the two terms. (Note for later reference that the constant 1 is to be interpreted as the restriction of the characteristic function $\chi_\Omega$ to the interior.)

As the parameter $\rho$ tends to 0, the separation term is strongly emphasized; consequently, the field tends to the characteristic function $1 - \chi_{\partial\Omega}$ of the boundary set $\partial\Omega$ and the energy to twice the boundary length, following the $\Gamma$-convergence framework [10]. This feature of the AT field makes it convenient to integrate region and boundary terms in image segmentation formulations, when the main task is to detect shapes of objects and separate them from the background. For example, the energy,

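$$E(u, v) = \iint_R \left[ \beta\,(u(\mathbf{x}) - g(\mathbf{x}))^2 + \alpha\,\big(v^2\,|\nabla u(\mathbf{x})|^2\big) + \frac{1}{2}\left( \rho\,|\nabla v_\rho(\mathbf{x})|^2 + \frac{(1 - v_\rho(\mathbf{x}))^2}{\rho} \right) \right] dx\,dy \tag{2}$$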

serves as a basic model for simultaneous edge preserving smoothing and segmentation, where $R \subset \mathbb{R}^2$ is a connected, bounded, open subset denoting the image domain, $g$ is an image defined on $R$, $u$ is the smoothed image, and $\alpha$ and $\beta$ are scale space parameters determining contrast and blurring. As the interaction parameter $\rho$ tends to 0, the energy in (2) tends to the Mumford-Shah energy:

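$$E_{MS}(u, \Gamma) = \iint_R \beta\,(u(\mathbf{x}) - g(\mathbf{x}))^2\,dx\,dy + \iint_{R \setminus \Gamma} \alpha\,|\nabla u(\mathbf{x})|^2\,dx\,dy + \mathcal{H}^1(\Gamma) \tag{3}$$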

where $\Gamma \subset R$ is the edge set segmenting $R$ and $\mathcal{H}^1(\cdot)$ is the one dimensional Hausdorff measure.

In its most widespread imaging and vision applications, the AT field is a technical device to approximate Mumford-Shah or its variants. As such, it forms a diffuse alternative to sharp interfaces or level sets [13] and parametric techniques [14]. The functionality of the AT field, however, is not limited to segmentation/restoration applications. Indeed, the minimizer of (1) is the solution of

$$\left( \Delta - \frac{1}{\rho^2} \right) v_\rho(\mathbf{x}) = -\frac{1}{\rho^2} \tag{4}$$

subject to the homogeneous Dirichlet condition on the boundary $\partial\Omega$. The above Partial Differential Equation (PDE) is called screened or damped Poisson, and its solution may be interpreted as a diffuse distance transform [52]. Note that

$$1 - v_\rho(\mathbf{x}) \approx e^{-d(\mathbf{x})/\rho}$$

That is, $v_\rho$ is a monotone function of the distance to the boundary, approximately exponential. Its diffuse character is better reflected in the following relation (which holds based on [32, Appendix]):

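$$1 - v_\rho(\mathbf{x}) = -\rho \left( 1 + \frac{\rho\,\kappa(\mathbf{x})}{2} \right) \frac{\partial v}{\partial n}(\mathbf{x}) + O(\rho^3) \tag{5}$$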

where $\kappa(\mathbf{x})$ is the curvature of the level curve of $v_\rho$ passing through the point $\mathbf{x}$ and $n$ is the direction of the normal. In Fig. 1, the level curves of the field $v$ for a cat shape are shown. This reciprocity between the gradient of the field and the level curve curvature has been exploited in Tari and Shah [51] and Tari, Shah, and Pien [52] to extend the ability of the field to code the medial axis, a feature that is at a higher level than an edge or a shape boundary. A parameterless form of (4) has been independently proposed by Gorelick et al [22] as a random walk distance, and successfully used for representing silhouettes.
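The exponential approximation can be checked numerically in a 1-D setting. The sketch below is illustrative only: it assumes an interval $(0, L)$ in place of a planar shape, for which (4) with homogeneous Dirichlet conditions has the closed-form solution $v_\rho(x) = 1 - \cosh((x - L/2)/\rho) / \cosh(L/(2\rho))$. The function names and the values of $L$ and $\rho$ are hypothetical choices made for the illustration.

```python
import math

def at_field_1d(x, length, rho):
    """Closed-form solution of v'' - v/rho**2 = -1/rho**2 on (0, length)
    with v(0) = v(length) = 0 (a 1-D analogue of the screened Poisson PDE)."""
    return 1.0 - math.cosh((x - length / 2.0) / rho) / math.cosh(length / (2.0 * rho))

def max_gap_to_exponential(length, rho, samples=201):
    """Largest deviation between 1 - v and exp(-d/rho), where d is the
    distance from the sample point to the nearest endpoint."""
    worst = 0.0
    for k in range(samples):
        x = length * k / (samples - 1)
        d = min(x, length - x)
        gap = abs((1.0 - at_field_1d(x, length, rho)) - math.exp(-d / rho))
        worst = max(worst, gap)
    return worst

# rho on the order of a pixel: 1 - v is essentially an exponential of the distance.
small_rho_gap = max_gap_to_exponential(length=100.0, rho=2.0)
# rho on the order of the shape width: the two boundary layers interact.
large_rho_gap = max_gap_to_exponential(length=100.0, rho=50.0)
print(small_rho_gap, large_rho_gap)
```

For small $\rho$ the gap is negligible, whereas for $\rho$ comparable to the interval length it is clearly visible, which is consistent with reading the relation above as an approximation rather than an identity.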

Fig. 1 The level curves of $v_\rho$ for a cat, based on [52]. In the left column, $\rho$ is on the order of the pixel size whereas in the right column it is on the order of the shape width. As $\rho$ increases, the level curves get smoother. Observe that the level curves of the field mimic the evolution of the shape boundary with a speed proportional to the level curve curvature. Even though the PDE (4) is linear, the behavior of the level curves of its solution is governed by a non-linear equation, reaction-diffusion [52].

One motivation for the work in [52] may be related to a neurophysiological observation by Lee [28, 27] that the cells of the primary visual cortex V1 respond in different ways in different time intervals after visual stimulus. In line with a more traditional view, early responses of cells exhibit local edge detector characteristics, i.e., they are sensitive to the physical discontinuities in a visual input. At later stages, however, these early local responses evolve to reflect higher order shape features. Specifically, indications of shape computation are found in the primary visual cortex: for certain stimuli, cells become active along medial locations of an object.

In the long run, AT-like implicit intermediate level representations may serve as a basis to computationally understand this neurophysiological observation. They may provide a bridge between the high level process of shape abstraction and low level processes of smoothing and edge detection. It is therefore not surprising that equivalent fields have been independently re-invented over the last decade. Most recently, Zucker [15, 56] promoted the same PDE with a scaled right hand side for intermediate level vision based on biological motivations. Notice that scaling the right hand side merely scales the solution, but does not change the geometry of the level curves. After scaling the right hand side by $\rho^2$, one may even explore letting the interaction parameter $\rho$ tend to $\infty$ [3] to obtain a parameterless form. This yields, as in Gorelick et al [22], the Poisson PDE:

$$\Delta v_\rho(\mathbf{x}) = -1 \tag{6}$$

subject to homogeneous Dirichlet condition.
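The scaling remark can be made explicit in one worked step. This is a sketch that assumes only the linearity of (4); $u_\rho$ is an auxiliary symbol introduced here for the rescaled field and $c$ is an arbitrary level:

```latex
u_\rho := \rho^2\, v_\rho
\;\Longrightarrow\;
\Big( \Delta - \tfrac{1}{\rho^2} \Big) u_\rho
  = \rho^2 \Big( \Delta - \tfrac{1}{\rho^2} \Big) v_\rho
  = -1,
\qquad
\{ u_\rho = c \} = \{ v_\rho = c / \rho^2 \}.
```

Every level curve of $u_\rho$ is therefore a level curve of $v_\rho$, and formally letting $\rho \to \infty$ in the operator on the left drops the screening term, leaving the Poisson equation (6).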

Our contribution. Going beyond the basic AT field and its adaptation in [52], we further the ability of the AT field in coding features that are higher level than edges and boundaries. Our basic tool is a new field which can be cast as a minimizer of the AT energy augmented with a non-local term. This non-local term is the squared average of the field. Thus, it forces the new minimizer to acquire both positive and negative values. Interestingly, positive and negative values cluster in a way that makes parts emerge. A preview is provided in Fig. 2. For ease of color visualization, the absolute value of the field is used. Compare the level curves in Fig. 2 to those of the AT field and observe how parts emerge without requiring any features to be explicitly estimated.

After discussing the formulation of the new field, we propose a new data structure which we call the Randomized Part Hierarchy Tree. It fully exploits the geometry and topology of the level curves of the new field in a robust manner. The Randomized Part Hierarchy Tree is a part hierarchy tree, say, a Region Adjacency Graph for the level curves of the new field passing through saddle points, but endowed with a probabilistic structure to achieve robustness. The family of level curves of the field can be easily given a tree structure since they are ordered by the inclusion relation.

The Randomized Part Hierarchy Tree is a rich structure, and a natural question is which sample from a randomized tree is the best one. Samples with high probabilities may be ideal choices in the absence of any other bias. But, of course, the choice of the best sample depends on the task. Thus, as a proof of concept, we present not only the most probable trees for a given shape but also pairs of trees that yield the best correspondence matching results between illustrative pairs of shapes. We employ pairwise matching as a sample task to guide the task dependent choice of a sample from a randomized tree. Our experiments with illustrative shape pairs demonstrate the potential of the new representation for shape matching. Shape retrieval, however, is out of the scope of this paper. It is an active and growing field with many interesting works, including Kontschieder et al [26], Rosman et al [42], Bai et al [5], Yang et al [54]. Ultimate success in shape retrieval, especially from large data sets, is to a large extent determined by the choice of clever clustering schemes as well as optimum feature weight selection; we address neither. Our focus is the exploration of the AT field as a means for shape computation; it is more of a step toward establishing AT-like implicit diffuse representations as bridges between low level and high level vision, combining shape abstraction with image processing, in the spirit of both Mumford and Shah [32] and Tari, Shah, and Pien [52].

Fig. 2 The new field. Observe how parts emerge, without requiring any features to be explicitly estimated. The upper row depicts level curves for the cat and a hand shape. Compare them to those of the AT field in Fig. 1. For ease of color visualization, the absolute value of the field is used. The new field attains zero (dark blue color) not only on the boundary but also in the interior. The zero locus in the interior separates the shape into two regions: the peripheral and the central. Inside both regions, the level curves are analogous to those of the AT field. The bottom rows depict the absolute values of the field for a variety of shapes including the cat.


More on related work. Hierarchical subdivision of a domain using the level curves of functions defined on the domain is a well rooted and widely applied practice in image and shape analysis, e.g., Ballester et al [7], Reuter [40], Meyer [30], Bajaj et al [6]. The concept goes back to Morse [31]. Hierarchical representation of the functions is achieved by the persistence pairs of the Morse-Smale graph (see [9, 17]). Here, the choice of domain function makes a difference. For example, Reuter [40] uses the eigenfunctions of the Laplace-Beltrami operator. Partitioning is achieved via level sets passing through important saddles of the eigenfunctions. Concepts from persistence theory [9, 17] are employed for simplification. Our approach is quite different from persistence graph based ones. Firstly, in our case, the new field does most of the work. Secondly, instead of forming a persistence graph, we form a soft saliency measure for each partition based on the closeness of two consecutive saddles ordered by inclusion. Thirdly, this saliency measure is merely used for assigning probabilities; the partition trees are not simplified. Randomization achieves robustness not by simplification based on persistence but by retaining multiple possibilities. Stabilization based on retaining multiple possibilities is a very useful idea which has also been used in other contexts, e.g., [12] and [55].

The Euler-Lagrange equation for the AT energy, (4), suggests an alternative interpretation of the AT field: the AT field is the projection of the shape indicator function (i.e., the constant function on the right hand side) to the eigenspace of the Dirichlet operator $\Delta - \frac{1}{\rho^2}$, or $\Delta$ in the parameterless case. This observation, by itself, may seem insignificant. However, it does say that the AT field is a summary representation of the Dirichlet Laplace spectrum. As such, it has a connection to a recent popular shape representation scheme called Heat Kernel Signature (HKS) [47] or Auto Diffusion Function (ADF) [20]. Both the AT field and HKS/ADF are scalar valued functions defined over a set representing the shape. Whereas for HKS/ADF this set is a manifold representing the boundary of the shape, it is the shape interior for the AT field. Relatedly, Aubry, Schlickewei, and Cremers [4] classify Tari and Shah [51] and Gorelick et al [22] as precursors of the use of the Dirichlet Laplace operator in the Euclidean setting.

Following Buades et al [11], non-local terms widely appear in recent variational formulations, e.g., Gilboa et al [21], Peng et al [36], Jin et al [23], Jung and Vese [24], Jung et al [25]. Typically, local image derivatives are replaced by non-local ones to model self-similarity based denoising. The UCLA group also proposed non-local versions of the Ambrosio-Tortorelli approximations of the Mumford-Shah [32] and Shah [44] functionals (Jung and Vese [24], Jung et al [25]). Our non-local modification is quite different from these works. Instead of modifying image derivatives, we modify the phase field itself. This is achieved by adding the integral over the shape domain of the squared average of the phase field to the phase field energy. This average term should not be confused with the stochastic extension in Patz et al [34]. They retain the Mumford-Shah functional as it is but replace the images with their expectations in order to cope with uncertainty. We, on the other hand, add a term which pushes the phase field towards a zero mean.

Our preliminary work on the Randomized Part Hierarchy Tree has been presented in [50], and an earlier construction of the field from a discrete perspective, along with preliminary single layer decompositions, in [48, 49]. We generalize our constructions to cases where the shape boundary is not given in closed binary form with the help of the Rosin and West transform [41].

2 The Non-local Energy and Its Minimizer

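Let us consider

$$\arg\min_{\omega} \iint_\Omega \left[ \rho \left( |\nabla\omega(\mathbf{x})|^2 + (\mathrm{Avg}(\omega(\mathbf{x})))^2 \right) + \frac{1}{\rho}\,(\omega(\mathbf{x}) - f(\mathbf{x}))^2 \right] dx\,dy \quad \text{with } \omega(\mathbf{x}) = 0 \text{ for } \mathbf{x} = (x, y) \in \partial\Omega \tag{7}$$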

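Here, $f(\mathbf{x})$ is a monotone function of the distance to $\partial\Omega$, e.g., the usual distance transform, and $\mathrm{Avg}(\omega(\mathbf{x}))$ is the average of $\omega$ given by

$$\frac{1}{|\Omega|} \iint_\Omega \omega(\mathbf{x})\,dx\,dy.$$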

Following the same rationale as in [52, 45, 2, 3], the interaction parameter $\rho$ will be chosen large enough (more on this later). Hence, $\omega$ will not be parameterized by $\rho$. Notice that the new energy (7), which is composed of three terms, is obtained by modifying the AT energy in two respects. First, the interaction term of (1) is additively augmented with a quadratic cost: $(\mathrm{Avg}(\omega(\mathbf{x})))^2$. This non-local cost pushes the minimizer towards acquiring a low average, the average being computed over the entire domain $\Omega$. Second, the constant 1, interpreted as the restriction of the domain indicator function $\chi_\Omega(\mathbf{x})$ to the interior, is replaced by a weighted domain indicator $f(\mathbf{x})$, which is an increasing function of the shortest distance to the boundary, e.g., the distance transform. This pushes the minimizer towards acquiring larger values in central locations as compared to peripheral ones.


At first glance, $\iint_\Omega (\mathrm{Avg}(\omega(\mathbf{x})))^2\,dx\,dy$ seems to favor spatial homogeneity. Obviously, its minimizer subject to the homogeneous Dirichlet boundary condition is the flat function $\omega = 0$. But this term is also minimized by an oscillating function whose positive and negative values add up to zero. In this respect, it is a separation term. For the new term to act as a separation term, however, an external inhomogeneity is required. When a term is added to favor sign changes, the smoothness term $|\nabla\,\cdot\,|^2$ forces the locations of identical sign to form spatial proximity groups, which prevents wild oscillations.

Now the issue is how to influence $\omega$ in order to make emerging spatial proximity groups perceptually meaningful. Indeed, the purpose of the second modification, i.e., replacing the domain indicator function $\chi_\Omega(\mathbf{x})$ with a weighted one, is to influence the spatial grouping of locations of equal sign in a particular way, such that sign changes separate gross structures from boundary details. The upper zero level set $\Omega^+ = \{ (x, y) \in \Omega \mid \omega(x, y) > 0 \}$ covers the central structure, while the lower zero level set $\Omega^- = \{ (x, y) \in \Omega \mid \omega(x, y) < 0 \}$ covers the peripheral one, including limbs, protrusions, boundary texture and noise.

Some comments on the term $(\omega(\mathbf{x}) - f(\mathbf{x}))^2$: Just like $(v(\mathbf{x}) - \chi_\Omega(\mathbf{x}))^2$, it is a phase separation term. The imposed phases, however, are the level curves of the distance transform. Ignoring the non-local term of the energy, the minimizer of the remaining two terms,

$$\arg\min_{\omega} \iint_\Omega |\nabla\omega(\mathbf{x})|^2 + (\omega(\mathbf{x}) - f(\mathbf{x}))^2,$$

is geometrically equivalent to the AT field. This is because the level curves of the AT function in (1) are equivalent to the level curves of a smooth distance transform [52]. Thus, replacing the domain indicator function with a weighted one merely scales $\omega$ without qualitatively affecting the geometry of its level curves. The choice of the weighted indicator function is critical: it implicitly codes medialness. When all three terms are considered together, the minimizer tends to have positive values at central locations and negative values at peripheral locations, because the penalty incurred by assigning negative values to central locations with higher positive $f$ values is higher than the penalty incurred by assigning negative values to locations with lower $f$ values.

Similar to the AT function, the new minimizer is a compromise between inhomogeneity and homogeneity. The inhomogeneity is forced not only externally (by $f$) but also internally (by the non-local term).

A side comment is that an alternative strategy could have been to make the original phase separation term space variant by multiplying it with a medialness dependent coefficient. We prefer using a weighted indicator: in our framework, $\omega$ is the best approximation of an externally provided function subject to internal regularities imposed by the remaining two terms.

The parameter $\rho$ should be chosen large enough that the attachment to the external inhomogeneity does not dominate over the tendency to interact. Indeed, in the absence of the third term, a good practice is to choose $\rho$ at least on the order of the maximum thickness for the diffusive effect of $|\nabla\,\cdot\,|^2$ to influence the entire shape; see [52, Fig. 1]. The same argument holds here, too, since the effect of the third term is to partition $\Omega$ into subdomains within which $\omega$ is morphologically similar to the AT function. Additionally, notice that the expression responsible for sign change, $\iint_\Omega \omega(\mathbf{x})\,dx\,dy$, has already been normalized by $1/|\Omega|$. Hence, $\rho$ should be larger than $|\Omega|$.

In Fig. 3, an illustration for a 1-D case is given: $\omega$ is plotted for four different values of $\rho$ ranging between $|\Omega|$ and $0.5\,|\Omega|$. The locations of the extrema and the zero crossings remain the same unless $\rho$ is significantly smaller than $|\Omega|$. Naturally, $\omega$ gets flatter as $\rho$ increases, but the flattening can be avoided by scaling either $f$ or $\omega$.

Fig. 3 $\omega$ for an interval; $\rho$ varying from $|\Omega|$ to $0.5\,|\Omega|$.
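A 1-D experiment in the spirit of Fig. 3 can be sketched as follows. This is a minimal illustration under stated assumptions, not necessarily the numerical scheme used for the figures: it assumes the stationarity condition obtained from the first variation of (7), $-\omega'' + \omega/\rho^2 + \mathrm{Avg}(\omega) = f/\rho^2$ with $\omega = 0$ at both endpoints, discretized on unit-spaced samples, with $\mathrm{Avg}$ approximated by the sample mean and $f$ taken as the distance to the nearest endpoint. The helper names and the choice $\rho = |\Omega|$ are hypothetical.

```python
def solve_tridiagonal(off, diag, rhs):
    """Thomas algorithm for a tridiagonal system with constant diagonal
    entry `diag` and constant off-diagonal entry `off`."""
    n = len(rhs)
    c = [0.0] * n  # modified super-diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = off / diag
    d[0] = rhs[0] / diag
    for i in range(1, n):
        denom = diag - off * c[i - 1]
        c[i] = off / denom
        d[i] = (rhs[i] - off * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x

def nonlocal_field_1d(n, rho):
    """Solve -w'' + w/rho**2 + mean(w) = f/rho**2 on n interior samples
    (unit spacing, w = 0 at both ends). The mean term is a rank-one
    perturbation, so two tridiagonal solves are enough."""
    f = [float(min(i, n + 1 - i)) for i in range(1, n + 1)]  # distance to nearest endpoint
    diag = 2.0 + 1.0 / rho ** 2
    w_f = solve_tridiagonal(-1.0, diag, [fi / rho ** 2 for fi in f])  # response to f
    w_1 = solve_tridiagonal(-1.0, diag, [1.0] * n)                    # response to a constant
    mean_f = sum(w_f) / n
    mean_1 = sum(w_1) / n
    m = mean_f / (1.0 + mean_1)  # equals mean(w) of the combined solution
    return [a - m * b for a, b in zip(w_f, w_1)], f

n = 99                      # interior samples of an interval of length n + 1
rho = float(n + 1)          # rho chosen equal to |Omega|, as an assumption
omega, f = nonlocal_field_1d(n, rho)
print(omega[0], omega[n // 2], sum(omega) / n)
```

Under these assumptions the computed $\omega$ is negative next to the endpoints, positive at the center, and has a mean that is small compared to its maximum, which mirrors the central/peripheral sign pattern described in the text.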

Similarly in 2-D, the geometry of the level curves is robust as long as $\rho$ is chosen suitably large. Some illustrations have already been depicted in Fig. 2. For each shape, zero level curves separate central and peripheral structures in the form of upper and lower zero level sets: $\Omega^+$ and $\Omega^-$.

$\Omega^-$, the peripheral structure, includes all the detail: limbs, protrusions, and boundary texture or noise. In contrast, $\Omega^+$, the central structure, is a very coarse blob-like form. It can even be thought of as an interval estimate of the center, whereas the centroid is the point estimate. Most commonly, $\Omega^+$ is a simply connected set. Of course, it may also be either disconnected or multiply connected. For instance, it is disconnected for a dumbbell-like shape (two blobs of comparable radii combined through a thin neck), and multiply connected for an annulus formed by two concentric circles. Indeed, the annulus gets split into three concentric rings where the middle ring is $\Omega^+$. For many shapes, however, $\Omega^+$ is a simply connected set:

Firstly, shapes obtained by protruding a blob as well as shapes whose peripheral parts are smaller or thinner than their main parts always have a simply connected $\Omega^+$. This is because when the width of a part is small, the highest value of $f$ inside the part is small. Thus, the local contribution to $(\omega - f)^2$ incurred by negative values is less significant for such a part as compared to locations with higher positive values of $f$. As a result, $\omega$ tends to attain negative values on narrow or small parts as well as on protrusions. Shapes with holes may also have a simply connected $\Omega^+$, as long as the holes are far from the center.

Secondly, even a dumbbell-like shape may have a simply connected $\Omega^+$. This happens if the joining area, namely the neck, is wide enough. Nevertheless, this does not cause any representational instability: whereas the $\Omega^+$ for a blob-like shape has a unique maximum located roughly at its centroid, the $\Omega^+$ for a dumbbell-like shape has two local maxima indicating two bodies. Each body is captured by a connected component of an upper level set whose bounding curve passes through a saddle point. At a saddle point $p$ such that $\omega(p) = s$, the $s$-level curve has a double point singularity, i.e., it forms a cross. Hence, the upper level set $\Omega_s = \{ (x, y) \in \Omega^+ \mid \omega(x, y) > s \}$ yields two disjoint connected components capturing the two parts of the central structure.
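The splitting of an upper level set at a saddle can be illustrated on a grid. The sketch below is a toy example only: the field is a hypothetical sum of two Gaussian bumps standing in for $\omega$ on a dumbbell-like central structure (it is not a solution of (7)), and `count_components` is an illustrative helper that uses 4-connectivity.

```python
import math
from collections import deque

def count_components(field, keep):
    """Count 4-connected components of the grid cells whose value satisfies `keep`."""
    rows, cols = len(field), len(field[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for i in range(rows):
        for j in range(cols):
            if seen[i][j] or not keep(field[i][j]):
                continue
            count += 1                      # new component found, flood-fill it
            seen[i][j] = True
            queue = deque([(i, j)])
            while queue:
                a, b = queue.popleft()
                for da, db in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    p, q = a + da, b + db
                    if (0 <= p < rows and 0 <= q < cols
                            and not seen[p][q] and keep(field[p][q])):
                        seen[p][q] = True
                        queue.append((p, q))
    return count

# Hypothetical stand-in field: two bumps centred at (-6, 0) and (6, 0).
xs = range(-15, 16)
ys = range(-10, 11)
field = [[math.exp(-((x - 6) ** 2 + y ** 2) / 18.0)
          + math.exp(-((x + 6) ** 2 + y ** 2) / 18.0)
          for x in xs] for y in ys]
saddle_value = 2.0 * math.exp(-2.0)  # field value at the midpoint (0, 0), about 0.27

below = count_components(field, lambda w: w > 0.2)  # threshold below the saddle value
above = count_components(field, lambda w: w > 0.5)  # threshold above the saddle value
print(below, above)
```

For a threshold below the saddle value the upper level set is a single component; once the threshold exceeds the saddle value it separates into two, one per bump, which is the binary split described above.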

In contrast to $\Omega^+$, the peripheral structure $\Omega^-$ is never simply connected. Indeed, its hole(s) are carved by $\Omega^+$. It is also possible that $\Omega^-$ is disconnected. For instance, for an annulus, it is two concentric rings. Additionally, $\Omega^-$ may be disconnected when there are several elongated limbs organized around a rather small central body, e.g., a palm tree. $\Omega^+$, being small, is tolerated to grow and reach the most concave parts of the shape boundary, creating a split of $\Omega^-$ by the zero-level curve. Similar to those in $\Omega^+$, the level curves in $\Omega^-$ that pass through saddle points provide further partitioning. The partitions are in the form of lower level sets $\Omega_s = \{ (x, y) \in \Omega^- \mid \omega(x, y) < s \}$.

To sum up, within both $\Omega^+$ and $\Omega^-$, nested open sets (upper level sets inside $\Omega^+$ and lower level sets inside $\Omega^-$) characterize the domain. The level curves bounding the level sets are either closed curves or closed curves with crossing points. The ones with crossing points are of particular interest because the respective level sets are partitioned at those points into two distinct connected components. A crossing of a level curve occurs at a saddle point of $\omega$. Of course, each lower level set may contain other saddle points. Consequently, the partitioning is binary and iterative and determined by the order of saddle points.

It is not generically possible that a level curve has singular points of higher order because such singular points are unstable and may be removed by a slight change in $\omega$. It is also highly unlikely that a connected component of an $s$-level curve has two distinct crossing points. This issue is tackled in §3.1 via randomization.

If the input is not a silhouette. Even though the construction of the $\omega$ field has been motivated for bounded connected open sets, it can be generalized to a wider class of input images. In order not to interrupt the flow, we postpone this topic to §5, where we will discuss the computation of the field for three illustrative images shown in Fig. 4.

If the input is in higher dimensions. There is nothing that prevents the generalization of the field construction (i.e., the proposed non-local modification) to arbitrary dimensions. Of course, different types of saddle points will appear. Again, in order not to interrupt the flow, we postpone this issue to the Appendix, where we provide the details of saddle point detection.

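Minimizing the new energy. To find the minimizer of (7), let us consider the Gateaux variation

$$\delta E(\omega, \nu) := \lim_{\epsilon \to 0} \frac{E(\omega + \epsilon\nu) - E(\omega)}{\epsilon} = \left. \frac{\partial}{\partial\epsilon} E(\omega + \epsilon\nu) \right|_{\epsilon = 0}$$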

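for all test functions $\nu$ properly prescribed on the boundary. Setting $\delta E(\omega, \nu) = 0$ yields

$$\iint_\Omega \left[ \left( \frac{1}{\rho^2}\,(\omega - f) - \Delta\omega \right) \nu + \mathrm{Avg}(\omega)\,\mathrm{Avg}(\nu) \right] dx\,dy = 0 \tag{8}$$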

where

$$\mathrm{Avg}(g) = \frac{1}{|\Omega|} \iint_\Omega g\,dx\,dy.$$

The condition is not intrinsic, but it is satisfied for all test functions if both of the following hold:

$$\frac{1}{|\Omega|} \iint_\Omega \omega(x, y)\,dx\,dy = 0 \tag{9a}$$

$$-\Delta\omega(x, y) + \frac{1}{\rho^2}\,\omega(x, y) - \frac{1}{\rho^2}\,f(x, y) = 0 \tag{9b}$$
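For readers who want the intermediate step, (8) can be recovered from (7) by a first-order expansion in $\epsilon$. The following is a sketch of that routine computation, assuming $\nu = 0$ on $\partial\Omega$ so that the boundary term of the integration by parts vanishes, and dividing out the common factor $2\rho$:

```latex
\begin{aligned}
E(\omega + \epsilon\nu) - E(\omega)
  &= 2\epsilon \iint_\Omega \Big[ \rho\,\nabla\omega\cdot\nabla\nu
     + \rho\,\mathrm{Avg}(\omega)\,\mathrm{Avg}(\nu)
     + \tfrac{1}{\rho}\,(\omega - f)\,\nu \Big]\,dx\,dy + O(\epsilon^2), \\
\iint_\Omega \nabla\omega\cdot\nabla\nu \,dx\,dy
  &= -\iint_\Omega (\Delta\omega)\,\nu \,dx\,dy, \\
\frac{1}{2\rho}\,\delta E(\omega, \nu)
  &= \iint_\Omega \Big[ \Big( \tfrac{1}{\rho^2}\,(\omega - f) - \Delta\omega \Big)\,\nu
     + \mathrm{Avg}(\omega)\,\mathrm{Avg}(\nu) \Big]\,dx\,dy.
\end{aligned}
```

Since $\iint_\Omega \mathrm{Avg}(\omega)\,\mathrm{Avg}(\nu)\,dx\,dy = \mathrm{Avg}(\omega) \iint_\Omega \nu\,dx\,dy$, the non-local term acts on every test function only through its integral.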


Fig. 4 The shape may not be given in the form of a bounded connected open domain. Computing $\omega$ from these images will be discussed in §5.

Fig. 5 An initial part hierarchy tree (leftmost) and its possible re-organizations. A random sample from the given hierarchy tree should be in one of the four forms. See the text.

3 Randomized Part Hierarchy Tree

Since the partitioning of either of the two structures ($\Omega^+$ and $\Omega^-$) is iterative and binary, the parts can be organized in the form of a proper binary tree starting from the second level. Let the root node hold the shape, and its children (i.e., the upper and lower zero level sets of $\omega$) hold disjoint regions capturing either a central or a peripheral structure. Suppose that the central and peripheral structures are composed of $N_c$ and $N_p$ disjoint sets, respectively. We can enumerate the nodes holding these sets as $11, 12, \cdots, 1N_c$ for $\Omega^+$, and as $21, 22, \cdots, 2N_p$ for $\Omega^-$. This is the second level of the tree and the first level of our partitioning. Even though the root may have more than two children, each subtree starting from its children is a proper binary tree. This is because any split inside an $\Omega^+$ or an $\Omega^-$ occurs at saddle points. Each connected component of the second level, as well as each of its descendants, either gets split into two level sets or remains as it is. We call this hierarchical organization the initial part hierarchy tree. It is analogous to the Region Adjacency Graph for a binary image, a common data structure in image processing. A hypothetical initial part hierarchy tree is illustrated in the leftmost column of Fig. 5. In a real example, the nodes hold properties of the respective level sets that are selected depending on the application.

Binary splits according to saddle points ultimately produce a collection of leaf parts (parts stored at leaf nodes) covering the shape. These leaf parts are considerably robust with respect to visual transformations and within category variations. The hierarchical order and granularity of parts, however, are typically not. For instance, a weak saddle is easily removed when the shape is slightly smoothed. Likewise, certain non-generic configurations such as level curves with spatially distinct saddle points or triple point singularities cannot occur. Indeed, such configurations are easily replaced by one of the corresponding generic configurations. These generic configurations may differ from shape to shape within a category. Furthermore, when a shape is occluded by another shape, new spurious peripheral parts due to the occluder cause a shift in the hierarchical positions of some parts. Thus, on the one hand, all these make the part hierarchy tree an unstable representation.

On the other hand, the leaf nodes alone are not completely free from unintuitive over and under partitioning. See Fig. 6, the head of the second cat and the front legs of the first horse. These kinds of instabilities due to initial partitioning can only be handled if the leaf parts are considered within the entire context of a part hierarchy tree.

The good thing is that the relative values of $\omega$ at two saddle points prompting any two consecutive splits are very stable indicators of the hierarchical structure. Therefore, we use the difference between the values of two successive saddle points as a measure of the saliency of the partitioning prompted by the latter saddle point. Our saliency criterion requires two consecutive saddle points (note that converting a single saddle point value to a tree depth by discretization would bring back the previous robustness issue). For normalization, the saliency measure is to be converted to a probability measure for the respective node. Considering the probability measures for all the nodes, the initial part hierarchy tree can be endowed with a probabilistic structure from which possible re-organizations of the initial hierarchy tree are to be sampled. We call this new data structure the Randomized Part Hierarchy Tree. In contrast to an initial part hierarchy tree, a random sample from a randomized part hierarchy tree is not necessarily a proper binary tree. Below, we give the details of our randomization procedure.

Fig. 6 Part structure at leaf level. For each shape, the central structure is shown in whitish gray. To a large extent, leaf nodes are consistent with respect to visual transformations and inter-category variations, but not completely free from unintuitive over and under partitioning (the head of the second cat and the front legs of the first horse). These kinds of instabilities due to initial partitioning are to be handled by considering the leaf parts within the entire context of a part hierarchy tree.

3.1 Randomization

The randomization starts from level 3 nodes and prop-agates through their children. Recall that this is thefirst level of nodes that are created by a saddle point.For each pair of siblings, there are two possible events:The pair of siblings either maintains their depth (nochange in the local tree structure) or assumes the depthof their parent (change in the local tree structure). Inthe latter case, the sibling nodes replace the parent,each becoming a child of the grandparent. The prob-abilities of the two events are derived from a quantitythat we denote by ω. It is a property of a split; thatis, ω values of any two siblings are equal. We defineω as the difference between the saddle point values ofa node and its parent, normalized by the saddle pointvalue of the node. Because the magnitude of the saddlepoint value of a node is always greater than that of its

parent, 0 < ω ≤ 1. The equality is attained at thethird level.

A small value of ω implies that the consecutive saddle points are close in value. Therefore, a slight change in their values may change their order and hence the local tree structure. In contrast, a large value of ω implies that the consecutive saddle points are well separated. Therefore, the local structure is stable. We require that the probability p that a local structure change is necessary approaches 1 as ω approaches its smallest possible value, which is 0. Conversely, p should approach 0 as ω approaches its largest possible value. The function $e^{-c\omega}$ is a good candidate for estimating p. We have set c = 4, so the probability of re-organization when ω is the largest is less than 2%, since $e^{-4} \approx 0.018$.
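As a small illustration of this mapping, the following Python sketch converts ω into a re-organization probability. Only the functional form and the value c = 4 come from the text; the function name `reorganization_probability` and the range check are assumptions made for the example.

```python
import math


def reorganization_probability(omega, c=4.0):
    """Probability that a sibling pair replaces its parent, for omega in (0, 1]."""
    if not 0.0 < omega <= 1.0:
        raise ValueError("omega is expected to lie in (0, 1]")
    return math.exp(-c * omega)


# Largest possible omega (a well separated split) and two smaller values.
for omega in (1.0, 0.301, 0.128):
    print(omega, round(reorganization_probability(omega), 3))
```

Small ω values give probabilities close to 1, so nearly tied saddle points are almost always merged into a single level.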

Let us consider the hypothetical initial part hierarchy tree previously shown in Fig. 5 (left). Assume that ω = 0.301 for nodes 211 and 212, so that $p = e^{-4 \times 0.301} \approx 0.3$. With probability (1 − p) = 0.7, the local structure is preserved, whereas with probability p = 0.3 nodes 211 and 212 replace their parent and become children of their grandparent, the root. Assume that ω = 0.128 for 2111 and 2112, so that $q = e^{-4 \times 0.128} \approx 0.6$. Then with probability (1 − q) = 0.4, the local structure is preserved, whereas with probability q = 0.6 nodes 2111 and 2112 replace their parent and become children of their grandparent; the grandparent is either node 21 with probability (1 − p) = 0.7 or the root with probability p = 0.3. Thus, there are four possible organizations, as shown in Fig. 5: with probability (1 − p)(1 − q) = 0.28, the entire structure is preserved and the organization takes the form of the first (leftmost) tree; with probability (1 − p)q = 0.42, the organization takes the form of the second tree; with probability p(1 − q) = 0.12, the organization takes the form of the third tree; with probability pq = 0.18, the organization takes the form of the fourth (rightmost) tree.
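The four probabilities above can be reproduced with a short Python sketch. It assumes, as the products in the text imply, that the lift decisions of different splits are independent; the function name `enumerate_organizations` and the tuple encoding of decisions are hypothetical.

```python
from itertools import product


def enumerate_organizations(lift_probs):
    """Map every tuple of lift decisions (True = siblings replace parent) to its probability."""
    outcomes = {}
    for decisions in product((False, True), repeat=len(lift_probs)):
        prob = 1.0
        for lifted, p in zip(decisions, lift_probs):
            prob *= p if lifted else 1.0 - p
        outcomes[decisions] = prob
    return outcomes


# p = 0.3 for the split into 211/212, q = 0.6 for the split into 2111/2112.
table = enumerate_organizations([0.3, 0.6])
for decisions, prob in sorted(table.items(), key=lambda kv: -kv[1]):
    print(decisions, round(prob, 2))
```

With k independent splits the table has 2^k entries, so exhaustive enumeration is only practical for small trees; larger trees are better handled by sampling.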

Depending on the application, various properties related to seed parts, i.e., the level sets passing through saddle points, may be extracted. For instance, for each seed part, we extract a region enclosing it. Since boundaries of level sets serving as seed parts pass through saddle points, an enclosing region is easily obtained as a morphological watershed zone whose seed is the respective level set. From now on, we refer to level sets as seed parts, and to watershed regions as enclosing parts or simply parts. Both to keep illustrations simple and to keep resource requirements low, we require that each enclosing part neighbor the central structure, and that both seed parts and enclosing parts have a certain size. Implementation-wise, splits are performed only through the saddle points that reside on the boundaries of the watershed zones touching the closures of central structures, and any split producing a seed part that is less than 0.05% of the shape or an enclosing part that is less than 0.5% of the shape is ignored.

In the next two figures, a realization using the cheetah is illustrated. The first illustration, Fig. 7, is the initial part hierarchy tree, where each node depicts the restriction of the cheetah image to a respective enclosing part. As there are four parent nodes other than the root, there are in total $2^4 = 16$ possible re-organizations. Four of them are depicted in Fig. 8. Observe that the third re-organization coincides with the initial part hierarchy tree.
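A minimal Python sketch of drawing one re-organization from a randomized tree is given below. It is not the authors' implementation: the dictionary-based tree encoding, the function name `sample_reorganization`, and the example tree (modeled loosely on the hypothetical tree discussed for Fig. 5) are assumptions. Each non-root parent node is dropped with probability $e^{-c\omega}$, in which case its children are attached to the nearest surviving ancestor.

```python
import math
import random


def sample_reorganization(children, omega, root, c=4.0, rng=random):
    """Return a sampled tree as a dict mapping each kept node to its list of children.

    children: dict node -> list of children in the initial (proper binary) tree.
    omega:    dict non-root parent node -> omega of the split that created its children.
    """
    sample = {root: []}

    def place(node, kept_parent):
        kids = children.get(node, [])
        lifted = bool(kids) and rng.random() < math.exp(-c * omega[node])
        if lifted:
            # The node is dropped; its children take its place under kept_parent.
            for kid in kids:
                place(kid, kept_parent)
        else:
            sample[kept_parent].append(node)
            sample[node] = []
            for kid in kids:
                place(kid, node)

    for kid in children.get(root, []):
        place(kid, root)
    return sample


# Hypothetical initial tree: omega is stored at the parent whose children form the split.
initial = {"root": ["11", "21"], "21": ["211", "212"], "211": ["2111", "2112"]}
omegas = {"21": 0.301, "211": 0.128}
print(sample_reorganization(initial, omegas, "root", rng=random.Random(0)))
```

Because a dropped node passes its children upward, a sample need not be a binary tree, which matches the remark that samples are not necessarily proper binary trees.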

The randomized part hierarchy tree provides a graded part representation, retaining multiple possibilities; this brings stability. A natural question, however, is which criteria to use for choosing one or a few of these possible arrangements of parts, i.e., the samples from a randomized tree.

4 Which samples?

The best sample(s) depends on the task. For example, if the goal is to find corresponding parts between two given shapes of similar structures, the best samples could be the ones that yield the best match. Postponing this issue until §4.1, we now show the most probable samples from the randomized trees of illustrative shapes. We do so because, in the absence of any task-dependent bias, the samples with higher probabilities are expected to be more intuitive than the ones with lower probabilities.

Let us first explain how we are going to visualize our trees from now on in a more informative way than we have done in previous visualizations, i.e., for the cheetah. The new visualization makes it possible to deduce the full binary partition history (as conveyed by an initial hierarchy tree) based on the knowledge of any random sample from the randomized tree. At each node, the enclosing part is depicted in dark gray, and the neighboring part in light gray. The neighboring part gives a clue on both the level and the context in which a respective leaf part is obtained. Using node numbers and neighboring parts at each leaf node in a sample re-organization, one can deduce the initial partitioning structure, even in the absence of the initial part hierarchy tree. At each node, the seed part is overlaid in black on the dark gray enclosing part. For compound parts stored at levels closer to the root, seed and enclosing parts almost overlap. In contrast, for finer parts (e.g. leaf parts), the discrepancy between the seed and enclosing parts is higher. The discrepancy is also higher for more protruding parts. Therefore, at some nodes, the seed part 1) may be too small to be visible; or 2) may get nearly as large as the enclosing part, thus occluding the latter.

Fig. 7 The initial part hierarchy tree for the cheetah. It is a proper binary tree.

Fig. 8 Four possible re-organizations in the order of decreasing probabilities ≈ 0.38, 0.28, 0.15 and 0.10. The third one coincides with the initial part hierarchy tree.

For our illustrations, we have picked the hand category, for which we have a good reason. It readily suggests a commonly agreed upon hierarchical organization among the five fingers. On one hand, the first split is expected to be binary, separating the thumb from the remaining four fingers. On the other hand, the next split is expected to be neither binary nor ternary but quaternary: simultaneously separating the four fingers. A quaternary split is not generic. Thus, the hand is a simple case containing non-generic splits.

Let us first examine Fig. 9. As expected, the thumb is separated from the remaining four fingers, and then the pinky (21211) from the remaining three. Note that the node holding the remaining three fingers has been eliminated in the most probable re-organization. Thirdly, the ring finger 212121 gets separated from the middle and pointing fingers. Finally, the middle and pointing fingers get separated, producing four individual fingers. Let us now examine the trees in Fig. 10. Again, in both cases, the first split is between the thumb and the remaining four, and the next between the pinky and the remaining three. But then the pointing finger (not the ring finger) gets separated, leaving the middle two. In all three cases, the four fingers get separated at different levels ranging from three to six. For this, there is no intuitive reason.

Nevertheless, in the most probable samples, all four fingers are brought to the same level, which is compatible with our intuition. Moreover, the probabilities for the most probable trees are significantly higher than the remaining ones for each of the three hand shapes. For example, for the first hand, the chance of any organization in which the four fingers reside at different levels in the hierarchy is less than 6%. In the other two cases, the pinky is a bit separate from the remaining three fingers. Hence, the chance of an organization in which the four fingers reside at different levels in the hierarchy is a bit higher. Nevertheless, the chance of an organization in which the remaining three fingers reside at different levels in the hierarchy is again less than 6%.

Finally, let us examine Fig. 11. Here, the hand shape is drawn in a way that visually forces stronger pairwise grouping of 1) the pointing finger and the middle one; 2) the pinky and the ring finger. The top four probabilities are closer (0.25, 0.23, 0.14, 0.12) and none of the corresponding organizations contradicts the pairwise grouping. In the figure, only the second and third most probable organizations are depicted. The third most probable organization (on the right) better reflects the pairwise grouping of the fingers on the left and the right.

Clearly, more likely hierarchies are more intuitive and consistent with respect to visual transformations. Still, we believe that it is important to retain the entire tree with its probabilistic structure and let the choice be dictated by the problem. Therefore, we experiment with a specific task as a proof of concept. As also validated by our experiments in §4.1, the best organization depends not only on the shape itself but also on the other shapes that it is compared to.

4.1 Matching Experiments

Basic procedure The input to our pairwise shape matching process is the respective randomized part hierarchy trees of the two shapes to be matched. Each randomized part hierarchy tree is independently sampled several times, forming sample tree pairs. Each sample tree pair is independently matched. The sample pair that yields the best correspondence is called the winning pair. Each of the two members of the winning pair is called the winning re-organization of the respective shape.
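The basic procedure can be sketched in Python as a simple search over independently drawn sample pairs. The names `find_winning_pair`, `sample_a`, `sample_b` and `match_score` are hypothetical placeholders: the two samplers stand for draws from the respective randomized trees, and the scoring callable stands for whatever tree matcher is used.

```python
def find_winning_pair(sample_a, sample_b, match_score, n_trials=50):
    """Draw n_trials sample pairs and return (score, tree_a, tree_b) of the best one.

    sample_a, sample_b: zero-argument callables, each returning one sampled tree.
    match_score:        callable(tree_a, tree_b) -> number, higher meaning a better match.
    """
    best = None
    for _ in range(n_trials):
        tree_a, tree_b = sample_a(), sample_b()
        score = match_score(tree_a, tree_b)
        if best is None or score > best[0]:
            best = (score, tree_a, tree_b)
    return best


# Toy usage with stand-in "trees" (plain integers) and a stand-in score.
draws = iter([1, 2, 3])
result = find_winning_pair(lambda: next(draws), lambda: 0,
                           lambda a, b: -abs(a - 2), n_trials=3)
print(result)
```

Since the trials are independent, the loop body can be distributed across workers without changing the result.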

In order to clearly and objectively assess the power of the proposed data structure, we avoid any feature optimization. The focus is merely on the flexible organizational hierarchy. With this rationale, we use two of the simplest properties of the parts: the maximum ω value inside the part and the part area. For fine parts (leaves), the maximum value of ω inside a part is related to the part width. For compound parts, this feature is related to the width of the widest leaf part for which the compound part is an ancestor. Both features (the maximum ω value and the area) need to be normalized to make them insensitive to scaling and to minimize feature bias. The first feature is normalized by dividing it by the maximum value of the ω function within the shape domain, and the second feature by dividing it by the area of the central part. Experiments show that the range of the normalized second feature differences is more than twice that of the normalized first. In order to eliminate bias, the second feature differences are halved. Then each feature difference x is converted to a similarity value in the range (0, 1] via an exponential decay $e^{-4x}$. The final similarity is taken as the average of the normalized feature similarities.
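The similarity computation described above can be sketched in Python as follows. The function and parameter names are hypothetical, the inputs are assumed to be already normalized as described, and the absolute difference is assumed as the feature difference.

```python
import math


def part_similarity(omega_max_a, area_a, omega_max_b, area_b):
    """Similarity in (0, 1] between two parts, given their normalized features."""
    d_omega = abs(omega_max_a - omega_max_b)
    d_area = abs(area_a - area_b) / 2.0  # halved to balance the two feature ranges
    sim_omega = math.exp(-4.0 * d_omega)
    sim_area = math.exp(-4.0 * d_area)
    return 0.5 * (sim_omega + sim_area)


print(part_similarity(0.5, 1.0, 0.5, 1.0))    # identical parts
print(part_similarity(0.5, 1.0, 0.25, 0.5))   # differing parts
```

Identical feature pairs give the maximum similarity of 1, and the value decays toward 0 as either feature difference grows.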

Tree matching There are several methods for matching two trees. As the key idea here is the randomization, we do not dwell on the choice of matching algorithm, and do not attempt to devise clever similarity measures, similarity re-ranking or clever classifications, as these improvements can be studied independently. In the current implementation, the pairwise matching problem itself is formulated as finding a maximal clique in the joint association graph of the pair of trees to be matched, for which we adopt an approximate algorithm from [35]. Further details are given in the Appendix.

Fig. 9 The most probable sample for a hand shape. Consistent with our intuition, all four fingers are at the same level. At each node, the enclosing part is depicted in dark gray, and the respective seed part in black. The neighboring part (in light gray) is also shown. The neighboring part gives a clue on both the level and the context in which a respective leaf part is obtained. The initial partitioning tree can be deduced using node numbers and neighboring parts at each leaf node.

Fig. 10 The most probable samples for two more hand shapes. Again, all four fingers are at the same level.

Fig. 11 Another hand shape where a pairwise grouping between the left and the right fingers is visually forced. The second and the third most probable samples.

Fig. 12 The winning match. See the text for discussion.

Illustrative examples Our first example is a matching between a human silhouette and its occluded version. The winning match is shown in Fig. 12. We have two remarks:

First, in both of the initial part hierarchy trees, the arms on the left reside at a different level than the arms on the right, without any intuitive reason. This is readily revealed by their five digit versus four digit node numbers: 21111 versus 2112. (Recall that an initial part hierarchy tree can be inferred from any sample due to our informative visualization.)

Ideally, almost symmetric upper bodies (nodes 211 for both trees) should contain two distinct saddle points p1 and p2 such that ω(p1) = ω(p2) = s. That is, two distinct saddle points on a single s-level curve should simultaneously yield three nodes: the arm on the left, the head, and the arm on the right. As discussed earlier, certain configurations including this one are not generic; even the slightest perturbation imposes a strict order on the saddle points. As a result, one of the arms along with the head gets separated from the other arm, and then the head-arm combination gets separated. For both shapes, the saddle point value separating the combination of the head and the arm on the left from the arm on the right is very close to the saddle point value separating the head from the arm on the left. For the first shape, the respective saddle point values after normalization with respect to the global maximum of ω are −0.683 and −0.687, while the saddle point value separating the entire upper body from the entire lower body is −0.053. Clearly, the hierarchical order between the upper body and its children is much stronger than the hierarchical order among its children. Even though an arm re-organization is not necessary for finding the correct part correspondences, since the structures of the upper bodies are already the same for the two initial trees, the left and right arms are brought to the same level. This is because the probabilities of retaining the initial binary local structures are very low due to the extreme closeness of the consecutive saddle point values. Indeed, the probability of the winning organization for the un-occluded human is ≈ 0.91. For the occluded human, all of the highly probable samples exhibit the same preferred organization for the upper body. On the other hand, the probability of the winning organization for the occluded human is very low (≈ 0.01), due to the occluding stick affecting the lower body (to be discussed next). Here, the important point is that a re-organization which performs the best with respect to a specific task is selected regardless of its isolated likelihood. The representation of an individual shape is influenced by the shape it is being compared to.

Second, notice that the legs of the occluded human are at the sixth level, whereas the legs of the un-occluded one are at the fourth, as revealed by their node numbers. This is due to the influence of two additional parts belonging to the occluding stick, and poses a challenge for the matching process. Nevertheless, the legs are brought to the same level, as are all of the corresponding sub-parts. Consequently, correct associations are found: 11 ⇔ 11 (central regions), 211 ⇔ 211 (upper bodies), 21111 ⇔ 21111 (arms on the left), 21112 ⇔ 21112 (heads), 2112 ⇔ 2112 (arms on the right), 212 ⇔ 21212 (lower bodies), 2121 ⇔ 212121 (legs on the left), 2122 ⇔ 212122 (legs on the right). Each leaf of the first tree is matched to a leaf of the second tree (Fig. 13). We remark that the winning organization of the lower body is highly unlikely when the occluded shape is considered in isolation.

Fig. 13 Leaf level illustration of the matching in Fig. 12.

The second example is a matching between a cat and a horse (Fig. 14). As a consequence of a weak saddle marked by the arrow, the front legs of the horse are not partitioned into individual legs. This granularity inconsistency is resolved by matching the respective leaf node of the horse tree to a non-leaf node of the cat tree, the parent of two leaves holding the front legs of the cat. The compound part stored at a non-leaf node is circled. In this example, the winning organizations coincide with the respective initial part hierarchy trees. For the cat, this is also the most probable organization, whereas for the horse it is the second most probable one.

The third example is a matching between the previous cat and another (Fig. 15 (a)). In this case there are three weak saddles resulting in three unintuitive partitions for the second cat: firstly, its head is fragmented; secondly, its rear body goes through a spurious division causing an erroneous shift in the levels of its sub-parts; thirdly, its fourth leg is not separated from its tail. Nevertheless, the winning match contains all of the correct associations. The rear body and its parts for the second cat are properly lifted one level up; consequently, the correct associations of the parts of the rear bodies are found successfully. The head of the first cat matches the parent of the two leaves holding two unintuitive parts of the head of the second cat. The two head fragments of the second cat as well as the fourth leg and the tail of the first cat are correctly excluded from the winning set of associations, as there are no corresponding parts in the other tree.

Finally, the fourth example (Fig. 15 (b)) is a matching between the previous horse and another. In addition to the previously discussed granularity inconsistency in the first horse, there are several other inconsistencies which are not noticeable at the leaf level presentation. For instance, the rear body of the first horse firstly splits into the fourth leg-tail combination and the third leg, and then the fourth leg is separated from the tail. On the other hand, the rear body of the second horse, after a spurious division, gets separated into rear legs and tail, and then the two legs are separated. Consequently, a two level difference occurs between the green colored third leg of the first horse (with a four digit node number) and the orange third leg of the second horse (with a six digit node number). Despite granularity and order inconsistencies, all of the parts are correctly matched. The winning re-organization for the second horse is shown in Fig. 16. Interestingly, the winning organization for the horse has quite a low probability, ≈ 0.01. Yet it wins in a scope defined by not only the horse but also the cat.

4.2 Run-time

Experiments are conducted on a 2009 model MacBook Pro with a single 2.8 GHz Intel Core Duo processor and 4 GB of memory. The code is written in the object-oriented Matlab environment without any coding optimization. The average run time for sampling from a randomized tree is observed to be 0.03 to 0.1 seconds. The average run time for matching is observed to be 0.7 to 4 seconds. Typically, 50 to 100 random samples are generated, yielding total run times on the order of 1 to 2 minutes for each matching task on a single processor. Of course, the trials, being completely independent, can be run in parallel. It is also possible to migrate the code to C/C++.

5 Real Images

There are several options when dealing with real images. One option is to couple the computation of the field ω with lower level tasks, namely smoothing and segmentation. This can be achieved via a coupled set of PDEs, and is exactly what has been done by Tari, Shah, and Pien [52] for computing skeletons from real images, or by Proesmans, Pauwels, and van Gool in [39] for solving a variety of visual tasks. A slightly simpler strategy, however, is to start with a fuzzy (i.e. real valued) edge map, then compute a monotone function of the distance from this fuzzy edge map (which could also be a diffuse function), and then, using this function as f in (7), compute the field ω. Note that a rough estimate of a monotone function of a distance transform is all that is required, and such a function is readily provided either by the coupled PDEs in Tari et al. [52] or by the Salience Distance Transform in Rosin and West [41].

In Fig. 17, we depict the results obtained using the latter strategy for the cheetah image from Fig. 4. The image is first processed with an AT based coupled PDE scheme [18], producing three alternative fuzzy edge maps by varying scale space parameters and region of interest (the bottom row). Then the Rosin and West Salience Distance Transforms of the edge maps are computed (the middle row). Finally, by replacing the function f in (7) with the respective Salience Distance Transforms, three ω fields are computed (the top row). The computed fields are consistent with what one expects for the shape of the given cheetah. Observe that its two ears are captured by peripheral level sets. In the next figure (Fig. 18, the left and the middle), we depict the leaf level partitioning obtained using the third ω field. For comparison, the leaf level partitioning for a silhouette obtained by thresholding a smoothed image obtained using the coupled PDE scheme in [18] is also shown in Fig. 18 (the right).

Fig. 14 A granularity inconsistency. Due to the weak saddle marked by the arrow, the front legs of the horse are not partitioned into individual legs. Nevertheless, the leaf node holding the pair of front legs is correctly associated to a non-leaf node of the cat, the parent of two leaves each holding a front leg.

Fig. 15 Two more cases, (a) and (b), involving both level and granularity inconsistencies.

Fig. 16 The winning re-organization for the second horse. Observe how two spurious splits are made irrelevant: 1) the head (21111) is shifted up to the level of the front legs (2112); 2) the rear body (2122) is shifted up to the level of the front body (211). Even though the resulting organization has quite a low probability, ≈ 0.01, it wins in a scope defined by not only itself but also the other cat.

Fig. 17 The field ω computed from the cheetah image: (bottom) un-thresholded edge fragments computed at three different settings of scale and contrast parameters; (middle) Rosin and West; (top) ω. See the text for details.

It is also possible that a shape is indicated by a collection of disconnected boundary segments. Such incomplete boundaries may occur either as a side effect of binarizing edge maps computed from real images or due to virtual contours, e.g., the circle of isolated dots in Fig. 4. The field ω can still be computed, provided that these boundary fragments are embedded in a larger domain which serves as the closure of the shape domain. In Fig. 19, the field ω for the circle of isolated dots is depicted. The first column depicts |ω| and the second the level curves of |ω|. Notice how Ω+ as well as the level curves around it capture the perceived circular formation. The next figure (Fig. 20) depicts the field |ω| and its level curves computed for a binary edge map for the bear. The binary edge map is obtained by thresholding the AT phase field computed using the method in [18]. Despite the incomplete boundary, the ears are clearly captured. The leaf level partitioning is shown in Fig. 21, where the shape boundary is shown in black.

6 Summary and Discussion

An unconventional application of the AT phase field is presented after incorporating higher level influences with the help of lower level constructs such as distances and global averages. The new field explicitly codes parts, i.e., features that are higher level than edges or boundaries. Within each part, successive level curves are related by a motion under curvature. A simple partitioning scheme which fully exploits the geometry of the level curves of the new field is developed. In order to cope with representational instabilities due to noise and visual transformations, the partitioning hierarchy is stored in a tree endowed with a probabilistic structure. This new data structure is called the Randomized Part Hierarchy Tree. The idea is to achieve stability by virtue of retaining multiple possibilities, rather than by persistence based simplification. Any sample from the randomized tree of a shape is a possible hierarchical organization of the parts of the shape. Typically, samples with higher probabilities are more intuitive than the ones with lower probabilities and thus serve as the best choices in the absence of any task dependent bias. We have also experimented with pairwise shape matching as a specific task. Illustrative experiments indicate the potential of the new field and the randomized hierarchy tree for shape matching.

Our work is a step toward establishing AT-like implicit diffuse representations as bridges combining shape abstraction with early visual processes, sharing the spirit of both [32] and [52]. Instead of coupling the field computation with smoothing/segmentation in a coupled PDE framework as in [52], we compute the weighted forcing function f in (7) from raw images via their un-thresholded real valued edge maps. The rest naturally follows.

As a final remark, quite interesting and non-linear behavior is obtained despite the fact that the equations are linear.

Acknowledgements This work has been partially funded by TUBITAK grant 112E208, the Alexander von Humboldt Foundation, and a TUBITAK-BIDEB fellowship.


Fig. 18 Leaf parts obtained from the leftmost ω field of Fig. 17. For comparison, the leaves for a cheetah silhouette obtained by thresholding a smoothed image obtained with the coupled PDEs in [18] are also shown (the right).

Fig. 19 The field ω when the boundary is not a closed curve. Notice how Ω+ as well as the level curves around it capture the perceived circular formation.

Fig. 20 Field ω from a binary edge map. The boundary is not completely specified, but features such as the ears are captured.

Fig. 21 The leaf level partitioning for the bear. Colors are adjusted for better visibility. The incomplete shape boundary is shown in black.

Appendix A

Field computation To keep the implementation simple, we combine (9a) and (9b), and multiply the inhomogeneity function f by ρ², as scaling affects neither the geometrical nor the topological features of the level curves. In the discrete setting this gives

\[
L^{*}\omega_{i,j} - \frac{1}{\rho^{2}}\,\omega_{i,j} - \frac{1}{|\Omega|}\sum_{(k,l)\in\Omega}\omega_{k,l} + f_{i,j} = 0 \tag{10}
\]

where $L^{*}$ denotes the discrete Laplace operator and $\omega_{i,j} \approx \omega(x = i \cdot h_x,\; y = j \cdot h_y)$, with $h_x$ and $h_y$ the spatial discretization step sizes, which are taken as the pixel width. Based on our discussions in §2, we set ρ² = |Ω|. Next, we define a relaxed scheme:

\[
\omega^{n+1}_{i,j} = \omega^{n}_{i,j} + \tau A
\]

where A is the left hand side of (10) and τ is the relaxation parameter, selected smaller than $\frac{|\Omega|}{4|\Omega|+2}$. When implemented in parallel, the $\omega_{i,j}$ values required for A need to be taken from the nth step; if, however, the values are being updated sequentially, updated values are used for faster convergence.
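The relaxed scheme can be sketched in plain Python as below. This is an illustrative sketch rather than the authors' Matlab code: the function names `residual` and `relax`, the boolean-grid encoding of Ω, the zero initial guess, the safety factor on τ, and in particular the treatment of pixels outside Ω as having field value 0 are all assumptions, since the boundary conditions of (9a) and (9b) are not restated in this appendix. The update is the parallel variant, using only values from the nth step.

```python
def residual(w, f, inside):
    """Left hand side A of (10) at every pixel of the domain (0 outside it)."""
    rows, cols = len(w), len(w[0])
    area = sum(sum(row) for row in inside)  # |Omega| in pixels
    rho2 = float(area)                      # rho^2 = |Omega|
    mean = sum(w[i][j] for i in range(rows) for j in range(cols) if inside[i][j]) / area

    def val(i, j):
        # Assumption: the field is 0 outside the domain.
        if 0 <= i < rows and 0 <= j < cols and inside[i][j]:
            return w[i][j]
        return 0.0

    a = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if inside[i][j]:
                lap = (val(i - 1, j) + val(i + 1, j) + val(i, j - 1) + val(i, j + 1)
                       - 4.0 * w[i][j])
                a[i][j] = lap - w[i][j] / rho2 - mean + f[i][j]
    return a


def relax(f, inside, n_iter=1500, safety=0.9):
    """Iterate w <- w + tau * A with tau below |Omega| / (4 |Omega| + 2)."""
    rows, cols = len(f), len(f[0])
    area = sum(sum(row) for row in inside)
    tau = safety * area / (4.0 * area + 2.0)
    w = [[0.0] * cols for _ in range(rows)]
    for _ in range(n_iter):
        a = residual(w, f, inside)
        w = [[w[i][j] + tau * a[i][j] for j in range(cols)] for i in range(rows)]
    return w


# Toy 8 x 8 square domain with a constant forcing term.
n = 8
inside = [[True] * n for _ in range(n)]
f = [[1.0] * n for _ in range(n)]
w = relax(f, inside)
print(max(abs(v) for row in residual(w, f, inside) for v in row))
```

The bound on τ is the reciprocal of the diagonal coefficient of (10) when ρ² = |Ω|, namely 4 + 2/|Ω|, which is why the step size is tied to the domain area.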

Saddle point detection For locating saddle points, we do not rely on indefiniteness of the Hessian. Instead, we find watershed regions by calling Matlab's watershed routine, which uses Meyer's method [30]. We eliminate all those watershed boundaries that do not neighbor Ω+. On each of the remaining watershed boundaries, the saddle point is the minimum of the restriction of ω to the respective watershed boundary. Typically, the considered watershed boundaries extend from Ω+ to the shape boundary. It may be possible, however, that a watershed boundary touching Ω+ bifurcates before reaching the shape boundary. In this case there are indeed two watershed boundaries; the respective saddle points are given by the respective minima after the bifurcation.

Tree matching Let $(V_i, E_i)$, i = 1, 2 be two rooted trees. Let $k, l \in V_1$ and $m, n \in V_2$ be distinct nodes of the respective trees. The tree association graph of the two trees is the graph (V, E) where $V := V_1 \times V_2$, and the graph nodes $(k, m) \in V$ and $(l, n) \in V$ are adjacent when the connectivity between k and l is equivalent to that of m and n. Specifically, we say $(k, m) \in V$ and $(l, n) \in V$ are adjacent if level(k) − level(l) = level(m) − level(n) and the length of the path from k to l in the first tree is the same as the length of the path from m to n in the second tree.
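The adjacency test of the association graph can be sketched in Python as follows. The parent-dictionary encoding of a rooted tree, the helper names, and the convention that the root is at level 1 are assumptions made for this example.

```python
def chain_to_root(parent, node):
    """List of nodes from `node` up to the root (parent[root] is None)."""
    chain = [node]
    while parent[chain[-1]] is not None:
        chain.append(parent[chain[-1]])
    return chain


def level(parent, node):
    """Depth of a node, counting the root as level 1."""
    return len(chain_to_root(parent, node))


def path_length(parent, a, b):
    """Number of edges on the path between a and b, via their lowest common ancestor."""
    steps_b = {node: k for k, node in enumerate(chain_to_root(parent, b))}
    for steps_a, node in enumerate(chain_to_root(parent, a)):
        if node in steps_b:
            return steps_a + steps_b[node]
    raise ValueError("nodes are not in the same tree")


def adjacent(parent1, parent2, km, ln):
    """True if association nodes (k, m) and (l, n) are adjacent."""
    (k, m), (l, n) = km, ln
    if k == l or m == n:
        return False  # k, l and m, n must be distinct
    same_level_gap = (level(parent1, k) - level(parent1, l)
                      == level(parent2, m) - level(parent2, n))
    same_path = path_length(parent1, k, l) == path_length(parent2, m, n)
    return same_level_gap and same_path


# Two small hypothetical trees.
t1 = {"r": None, "a": "r", "b": "r"}
t2 = {"R": None, "A": "R", "B": "R", "C": "A"}
print(adjacent(t1, t2, ("a", "A"), ("b", "B")))
print(adjacent(t1, t2, ("a", "C"), ("b", "B")))
```

Comparing both the level gap and the path length keeps pairs of associations from mixing ancestor and sibling relations across the two trees.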

Defining the equivalence between two sets of nodes in the respective trees by comparing levels and path lengths, there exists a bijection between maximal subtree isomorphisms and maximal cliques of the association graph of the two trees; i.e., tree matching is equivalent to finding the maximal clique in the association graph. If the trees are attributed, e.g. in our case $(V_i, E_i, \alpha)$ where α is a function that assigns an attribute vector $[\alpha^{(1)}(u), \alpha^{(2)}(u)]^{T}$ to each node u in either tree, then the subtree isomorphism with the largest similarity is called the maximum similarity subtree isomorphism. In this case, the weighted association graph is the weighted graph (V, E, c) such that c(z) for $z \equiv (u, v)$, $z \in V$, $u \in V_1$ and $v \in V_2$ is defined via a similarity measure sim(·, ·) in the attribute space: c(z) = sim(α(u), α(v)). The attributes and the similarity measure are calculated as described in §4.1.

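Suitably defining a weight matrix M using the node weights c(·), the global maximizer of $x^{T} M x$ gives the maximum weight clique; the maximization is solved iteratively:

\[
x^{n+1}_{i} = x^{n}_{i}\,\frac{(M x^{n})_{i}}{(x^{n})^{T} M x^{n}} \tag{11}
\]

where n is the iteration variable. The matrix $M = (m_{ij})$ is given via a matrix $B = (b_{ij})$ as follows:

\[
m_{ij} = \max_{i,j}(b_{ij}) - b_{ij}
\]

where

\[
b_{ij} =
\begin{cases}
0 & \text{if } i \neq j \text{ and node } i \text{ is adjacent to node } j \\[4pt]
\dfrac{1}{2c(u_i)} & \text{if } i = j \\[8pt]
\dfrac{1}{2c(u_i)} + \dfrac{1}{2c(u_j)} & \text{otherwise}
\end{cases}
\]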
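The construction of M and the iteration (11) can be sketched in plain Python as follows. The function names, the representation of adjacency as a set of index pairs, the uniform starting vector and the fixed iteration count are assumptions made for this example; they are not specified in the text.

```python
def build_m(weights, adjacent_pairs):
    """Build M = max(B) - B from node weights c(u_i) and a set of adjacent index pairs."""
    n = len(weights)
    b = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                b[i][j] = 1.0 / (2.0 * weights[i])
            elif (i, j) in adjacent_pairs or (j, i) in adjacent_pairs:
                b[i][j] = 0.0
            else:
                b[i][j] = 1.0 / (2.0 * weights[i]) + 1.0 / (2.0 * weights[j])
    b_max = max(max(row) for row in b)
    return [[b_max - value for value in row] for row in b]


def replicator(m, n_iter=1000):
    """Iterate x_i <- x_i (M x)_i / (x^T M x), starting from the uniform vector."""
    n = len(m)
    x = [1.0 / n] * n
    for _ in range(n_iter):
        mx = [sum(m[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(x[i] * mx[i] for i in range(n))
        if total == 0.0:
            break  # degenerate case: M vanishes on the support of x
        x = [x[i] * mx[i] / total for i in range(n)]
    return x


# Three association nodes: 0 and 1 are adjacent, 2 is isolated.
weights = [2.0, 1.0, 1.0]
x = replicator(build_m(weights, {(0, 1)}))
print([round(v, 4) for v in x])
```

In this toy graph the iteration concentrates the mass on the clique {0, 1} in proportion to the node weights, and drives the entry of the isolated node toward zero.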

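Let the maximum weighted clique be $C \subset V$. The solution $x^{*}$ to the maximization problem (via the iterative scheme (11)) is expected to be

\[
x^{*}_{i} =
\begin{cases}
\dfrac{c(u_i)}{\sum_{u_j \in C} c(u_j)} & \text{if } u_i \in C \\[8pt]
0 & \text{otherwise}
\end{cases}
\tag{12}
\]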

The iterative scheme (11) returns an approximation to the limit vector $x^{*}$ in (12). What remains is how to interpret this vector. The paper [35] provides no suggestion on this. We adopt the following strategy instead of simply thresholding.

We start with an empty clique. Then, starting with the node with the highest x value, we gradually add nodes to the clique in the order of decreasing x value. After each inclusion we compute the expected vector using (12) and check the difference between this estimate and the actual vector returned by the iterative scheme (11). We keep adding nodes until the inclusion of nodes no longer decreases the difference.
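This interpretation strategy can be sketched in Python as below. The Euclidean norm as the measure of difference and the function name `extract_clique` are assumptions; the text does not state which norm is used.

```python
def extract_clique(x, weights):
    """Greedily grow a clique in order of decreasing x until the fit to (12) stops improving."""
    order = sorted(range(len(x)), key=lambda i: -x[i])
    clique, best_diff = [], float("inf")
    for i in order:
        candidate = clique + [i]
        total = sum(weights[j] for j in candidate)
        # Expected vector from (12) for the candidate clique.
        expected = [weights[j] / total if j in candidate else 0.0 for j in range(len(x))]
        diff = sum((e - v) ** 2 for e, v in zip(expected, x)) ** 0.5
        if diff >= best_diff:
            break  # adding this node no longer decreases the difference
        clique, best_diff = candidate, diff
    return clique


print(extract_clique([0.5, 0.5, 0.0], [1.0, 1.0, 1.0]))
print(extract_clique([0.66, 0.33, 0.01], [2.0, 1.0, 1.0]))
```

Unlike a fixed threshold, this stopping rule adapts to the node weights, since the expected entries in (12) depend on which nodes are already included.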

References

1. Ambrosio L, Tortorelli V (1990) On the approxima-tion of functionals depending on jumps by ellipticfunctionals via Γ -convergence. Commun Pure ApplMath 43(8):999–1036

2. Aslan C, Tari S (2005) An axis-based representa-tion for recognition. In: ICCV, pp 1339–1346

3. Aslan C, Erdem A, Erdem E, Tari S (2008) Discon-nected skeleton: Shape at its absolute scale. IEEET Pattern Anal 30(12):2188–2203

4. Aubry M, Schlickewei U, Cremers D (2011) Thewave kernel signature: A quantum mechanical ap-proach to shape analysis. In: ICCV - Workshop onDynamic Shape Capture and Analysis

5. Bai X, Wang B, Yao C, Liu W, Tu Z (2012) Co-transduction for shape retrieval. IEEE T ImageProcess 21(5):2747 –2757

6. Bajaj CL, Pascucci V, Schikore DR (1997) The con-tour spectrum. In: Proceedings of the 8th confer-ence on Visualization

7. Ballester C, Caselles V, Igual L, Garrido L (2007)Level lines selection with variational models forsegmentation and encoding. J Math Imaging Vis27(1):5–27

8. Bar L, Sochen N, Kiryati N (2006) Image deblur-ring in the presence of impulsive noise. Int J Com-put Vision 70(3):279–298

9. Biasotti S, Cerri A, Frosini P, Giorgi D, Landi C(2008) Multidimensional size functions for shapecomparison. J Math Imaging Vis 32(2):161–179

10. Braides A (1998) Approximation of Free-discontinuity Problems. Lecture Notes in Mathe-matics, Vol. 1694, Springer-Verlag

11. Buades A, Coll B, Morel JM (2005) A non-localalgorithm for image denoising. In: CVPR, pp 60–65

12. Burgeth B, Weickert J, Tari S (2006) Minimallystochastic schemes for singular diffusion equations.


In: Tai XC, Lie KA, Chan TF, Osher S (eds) ImageProcessing Based on Partial Differential Equations,Mathematics and Visualization, Springer BerlinHeidelberg, pp 325–339

13. Chan T, Vese L (2001) Active contours withoutedges. IEEE T Image Process 10(2):266 –277

14. Cremers D, Tischhauser F, Weickert J, SchnorrC (2002) Diffusion snakes: Introducing statisti-cal shape knowledge into the Mumford-Shah func-tional. Int J Comput Vision 50(3):295–313

15. Dimitrov P, Lawlor M, Zucker S (2011) Distanceimages and intermediate-level vision. In: SSVM,Springer-Verlag, pp 653–664

16. Droske M, Rumpf M (2007) Multi scale joint seg-mentation and registration of image morphology.IEEE T Pattern Anal 29(12):2181–2194

17. Edelsbrunner H, Letscher D, Zomorodian A (2002)Topological persistence and simplification. DiscreteComput Geom 28:511–533

18. Erdem E, Tari S (2009) Mumford-Shah regular-izer with contextual feedback. J Math Imaging Vis33(1):67–84

19. Erdem E, Sancar-Yilmaz A, Tari S (2007)Mumford-Shah regularizer with spatial coherence.In: SSVM, Springer-Verlag, pp 545–555

20. Gebal K, Bærentzen JA, Aanæs H, Larsen R (2009)Shape analysis using the Auto Dinfusion Function.Comput Graph Forum 28:1405–1413

21. Gilboa G, Darbon J, Osher S, Chan T (2006) Non-local convex functionals for image regularization.UCLA CAM-report 06-57

22. Gorelick L, Galun M, Sharon E, Basri R, Brandt A(2006) Shape representation and classification us-ing the poisson equation. IEEE T Pattern Anal28(12):1991–2005

23. Jin Y, Jost J, Wang G (2012) A nonlocal versionof the Osher-Sol-Vese model. J Math Imaging Vis44:99–113

24. Jung M, Vese L (2009) Nonlocal variational imagedeblurring models in the presence of gaussian orimpulse noise. In: SSVM, Springer-Verlag, pp 401–412

25. Jung M, Bresson X, Chan T, Vese L (2009) Colorimage restoration using nonlocal Mumford-Shahregularizers. In: EMMCVPR, Springer-Verlag, pp373–387

26. Kontschieder P, Donoser M, Bischof H (2010) Be-yond pairwise shape similarity analysis. In: ACCV2009, Lecture Notes in Computer Science, vol 5996,Springer, pp 655–666

27. Lee TS, Yuille A (2007) Efficient coding of visualscenes by grouping and segmentation. In: Doya K,Ishii S, Pouget A, Rao R (eds) Bayesian Brain:

Probabilistic Approaches to Neural Coding, MITPress, chap 8, pp 141–185

28. Lee TS, Mumford D, Romero R, Lamme VA (1998)The role of the primary visual cortex in higher levelvision. Vision Res 38(15-16):2429–2454

29. March R, Dozio M (1997) A variational method forthe recovery of smooth boundaries. Image VisionComput 15(9):705–712

30. Meyer F (1994) Topographic distance and water-shed lines. Signal Process 38:113–125

31. Morse SP (1969) Concepts of use in contour mapprocessing. Commun ACM 12(3):147–152

32. Mumford D, Shah J (1989) Optimal approxima-tions by piecewise smooth functions and associatedvariational problems. Commun Pure App Math42:577–685

33. Patz T, Preusser T (2010) Ambrosio-Tortorellisegmentation of stochastic images. In: ECCV,Springer-Verlag, pp 254–267

34. Patz T, Kirby R, Preusser T (2012) Ambrosio-tortorelli segmentation of stochastic images: Modelextensions, theoretical investigations and numericalmethods. International Journal of Computer Visionpp 1–23, DOI doi:10.1007/s11263-012-0578-8

35. Pelillo M, Siddiqi K, Zucker S (1999) MatchingHierarchical Structures Using Association Graphs.IEEE T Pattern Anal 21(11):1105-1120

36. Peng T, Jermyn I, Prinet V, Zerubia J (2010) Ex-tended phase field higher-order active contour mod-els for networks. Int J Comput Vision 88(1):111–128

37. Pien H, Desai M, Shah J (1997) Segmentation ofMR images using curve evolution and prior infor-mation. Int J Pattern Recogn 11(8):1233–1245

38. Preußer T, Droske M, Garbe C, Rumpf M, TeleaA (2007) A phase field method for joint denois-ing, edge detection and motion estimation. SIAMJ Appl Math 68(3):599–618

39. Proesman M, Pauwels E, van Gool L (1994) Cou-pled geometry-driven diffusion equations for low-level vision. In: Romeny B (ed) Geometry DrivenDiffusion in Computer Vision, Lecture Notes inComputer Science, Kluwer

40. Reuter M (2010) Hierarchical shape segmenta-tion and registration via topological features ofLaplace-Beltrami eigenfunctions. Int J Comput Vi-sion 89(2):287–308

41. Rosin PL, West G (1995) Salience distance trans-forms. Graph Model Im Proc 57(6):483–521

42. Rosman G, Bronstein MM, Bronstein AM, Kim-mel R (2010) Nonlinear dimensionality reductionby topologically constrained isometric embedding.Int J Comput Vision 89(1):56–68

20 Tari and Genctav

43. Shah J (1991) Segmentation by nonlinear diffusion.In: CVPR, pp 202–207

44. Shah J (1996) A common framework for curve evo-lution, segmentation and anisotropic diffusion. In:CVPR, pp 136–142

45. Shah J (2005) Skeletons and segmentationof shapes. Tech. rep., Northeastern University,http://www.math.neu.edu/ shah/publications.html

46. Shah J, Pien H, Gauch J (1996) Recovery of shapesof surfaces with discontinuities by fusion of shad-ing and range data within a variational framework.IEEE T Image Process 5(8):1243–1251

47. Sun J, Ovsjanikov M, Guibas L (2009) A con-cise and provably informative multi-scale signature-based on heat diffusion. In: Comput. Graph. Forum

48. Tari S (2009) Hierarchical shape decomposition vialevel sets. In: ISMM, Springer-Verlag, pp 215–225

49. Tari S (2013) Fluctuating distance fields. In: BreussM, Bruckestein A, Maragos P (eds) Innovations inShape Analysis - Proceedings of Dagstuhl Work-shop, Mathematics and Visualization, Springer

50. Tari S, Genctav M (2011) From a modifiedAmbrosio-Tortorelli to a randomized part hierar-chy tree. In: SSVM, Springer-Verlag, pp 267–278

51. Tari S, Shah J (1998) Local symmetries of shapesin arbitrary dimension. In: ICCV, pp 1123–1128

52. Tari S, Shah J, Pien H (1997) Extraction of shapeskeletons from grayscale images. Comput Vis ImageUnd 66(2):133–146

53. Teboul S, Blanc-Fraud L, Aubert G, Barlaud M(1998) Variational approach for edge preservingregularization using coupled PDE’s. IEEE T Im-age Process 7:387–397

54. Yang X, Bai X, Koknar-Tezel S, Latecki J (2012)Densifying distance spaces for shape and image re-trieval. J Math Imaging Vis DOI 10.1007/s10851-012-0363-x

55. Zhu SC, Yuille AL (1996) FORMS: A flexible ob-ject recognition and modeling system. Int J Com-put Vision 20(3):187–212

56. Zucker S (2013) Distance Images and the En-closure Field: Applications in Intermediate-LevelComputer and Biological Vision. In: Breuss M,Bruckestein A, Maragos P (eds) Innovations inShape Analysis - Proceedings of Dagstuhl Work-shop, Mathematics and Visualization, Springer
