j math psych

Upload: hayesw

Post on 14-Apr-2018

213 views

Category:

Documents


0 download

TRANSCRIPT

  • 7/27/2019 j Math Psych

    1/4

    Journal of Mathematical Psychology 54 (2010) 334337

    Contents lists available at ScienceDirect

    Journal of Mathematical Psychology

    journal homepage: www.elsevier.com/locate/jmp

    Theoretical note

    Dissimilarity, Quasimetric, Metric

    Ehtibar N. Dzhafarov

    Purdue University, United States

    Swedish Collegium for Advanced Study, United States

    a r t i c l e i n f o

    Article history:

    Received 18 August 2009

    Received in revised form20 January 2010

    Available online 2 March 2010

    Keywords:

    Asymmetric metric

    Categorization

    Discrimination

    Dissimilarity

    Fechnerian Scaling

    Metric

    Quasimetric

    Stimulus space

    Subjective metric

    a b s t r a c t

    Dissimilarity is a function that assigns to every pair of stimuli a nonnegative numbervanishingif andonlyif two stimuli are identical, and that satisfies the following two conditions called the intrinsic uniform

    continuity and thechain property, respectively: it is uniformly continuous with respect to the uniformityit induces, and, given a set of stimulus chains(finite sequencesof stimuli), thedissimilarity between theirinitial andterminalelements convergesto zeroif thechainslength (thesum of the dissimilarities betweentheir successive elements) converges to zero. The four properties axiomatizing this notion are shown tobe mutually independent. Any conventional, symmetric metric is a dissimilarity function. A quasimetric(satisfying all metric axioms except for symmetry) is a dissimilarity function if and only if it is symmetricin the small. It is proposed to reserve the term metric (not necessarily symmetric) for such quasimetrics.A real-valued binary function satisfies the chain property if and only if whenever its value is sufficientlysmall it majorates some quasimetric and converges to zero whenever this quasimetric does. The functionis a dissimilarity function if, in addition, this quasimetric is a metric with respect to which the function isuniformly continuous.

    2010 Elsevier Inc. All rights reserved.

    This note aimsat filling in certain conceptualand terminologicalgaps in the theory of dissimilarity as presented in Dzhafarov andColonius (2007) and Dzhafarov (2008a,b). The axioms definingdissimilarity are proved to be mutually independent. A convenientcriterion is given for the compliance of a function with thechain property (D4 below). Psychophysical applications of thenotion of dissimilarity were previously confined to discrimination

    judgments, same-different and greater-less. Here, an exampleis given of an application of the notion to the categorizationparadigm. As a terminological improvement, it is suggested thatthe term oriented (i.e., asymmetric) metric should be reservedfor quasimetrics (functions satisfying all metric axioms except forsymmetry) which are symmetric in the small: then any metricis a dissimilarity function. A familiarity with the dissimilaritycumulation theory and its main psychophysical application,

    Fechnerian Scaling (at least as they are presented in Dzhafarov &Colonius, 2007), is desirable for understanding the context of thisnote.

    Notation conventions. Let S be a set stimuli (points), denoted byboldface lowercase letter x,y, . . .. A chain, denoted by boldfacecapitals, X,Y, . . ., is a finite sequence of points. The set

    k=0 S

    k ofallchainswith elements inS is denoted byS. It contains the emptychain and one-element chains (identified with their elements, sothat x S is also the chain consisting of x). Concatenations

    Corresponding address: Purdue University, 47906 West Lafayette, IN, UnitedStates.

    E-mail address: [email protected].

    of two or more chains are presented by concatenations of theirsymbols, XY, xYz, etc. Binary real-valued functions S S Rare presented as Dxy, Mxy, xy, . . ..

    Given a chain X = x1, . . . ,xn and a binary (real-valued)function F, the notation FXstands for

    n1i=1

    Fxixi+1,

    with the obvious convention that the quantity is zero ifn is 1 (one-element chain) or 0 (empty chain). The notation FXstands for

    maxi=1,...,n1

    Fxixi+1.

    All limit statements for sequences, such as Fana 0, Fanbn 0,or FXn 0 are implicitly predicated on n .

    If stimuli x,y, etc. are characterized by numerical values of acertain property (e.g., auditory tones are characterized by theirintensity), these numerical values may be denoted by x,y, etc.(possibly with indices or ornaments), and by abuse of notationwe can say x = x, y = y, etc. Analogously, if the stimuli arecharacterized by numerical vectors, we can write x = (x1, . . .xk).

    1. Dissimilarity: Definition and applications

    Definition 1.1. Function D : SS R is a dissimilarity functionif it has the following properties:

    D1 (positivity) Dab > 0 for any distinct a, b S;

    D2 (zero property) Daa = 0 for any a S;

    0022-2496/$ see front matter 2010 Elsevier Inc. All rights reserved.doi:10.1016/j.jmp.2010.01.005

    http://www.elsevier.com/locate/jmphttp://www.elsevier.com/locate/jmpmailto:[email protected]:[email protected]://dx.doi.org/10.1016/j.jmp.2010.01.005http://dx.doi.org/10.1016/j.jmp.2010.01.005mailto:[email protected]://www.elsevier.com/locate/jmphttp://www.elsevier.com/locate/jmp
  • 7/27/2019 j Math Psych

    2/4

    E.N. Dzhafarov / Journal of Mathematical Psychology 54 (2010) 334337 335

    D3 (intrinsic uniform continuity) for any > 0 one can finda > 0such that, for any a, b, a, b S,

    ifDaa < and Dbb < , thenDab Dab < ;

    D4 (chain property) for any > 0 one can find a > 0 such that

    for any chain aXb,

    ifDaXb < , then Dab < .

    Remark 1.2. The intrinsic uniform continuityand chain propertiescan also be presented in terms of sequences, as, respectively,

    if

    Danan 0

    and

    Dbnb

    n 0

    then Danb

    n Danbn 0,

    and

    ifDanXnbn 0 then Danbn 0,

    for all sequences of points {an},

    an,

    {bn},

    bn

    in S and se-quences of chains {Xn} in S. (For the equivalence of these state-ments toD3 andD4 see Dzhafarov & Colonius, 2007, footnote 8.)

    Remark 1.3. A simple but useful observation: in the formulationofD4 for any chain aXb can be replaced with for any chain aXbwith DaXb < m, where m is any positive number. Indeed, the to be found for a given can always be redefined as min {, m}. Inthis class of chains wealsohave aXb < m, where stands for D(see Notation Conventions above). Stated in terms of sequences, if

    ifDanXnbn 0 then anXnbn 0,

    so one can only consider the sequences anXnbn with sufficientlysmall DanXnbn and anXnbn .

    The convergence Dxnyn 0 is easily shown to be anequivalence relation on the set of all infinite sequences in S(Dzhafarov & Colonius, 2007, Theorem 1).

    Theorem 1.4. Each of the properties D1D4 is independent of theremaining three.

    Proof. (1) LetS be a two-element set {a, b}. If

    Dab = Dba = Daa = Dbb = 0,

    thenD1 is violated whileD2D4 hold trivially.

    (2) With the same S, if

    Dab = Dba = Daa = Dbb = 1,

    thenD2 is violated whileD1,D3,D4 hold trivially.

    (3) Let S be represented by R, and let

    Dxy =

    |x y| if |x y| < 1

    |x y| + e|xy| e if |x y| 1.

    Then D obviously satisfies D1D2. To see that D4 holds too,we use Remark 1.3 and only consider chains aXb with aXb

    DaXb < 1. Since DaXb |a b|, wehave then Dab = |a b| < whenever DaXb < = . But D does not satisfyD3, which can beshown by choosing, e.g., a = a = 0 and b = b + (b > 1, > 0):then

    Dab Dab = + eb

    e 1

    considered as a function ofb can be arbitrarily large for any choiceof = Dbb > Daa = 0.

    (4) Let S be represented by a finite interval of reals andDxy = (x y)2. Then D satisfiesD1D3 (obviously) but not D4:subdivide a nondegenerate interval [a, b] within S into n equalparts and observe that the length of the corresponding chain is

    DaXb = n

    a b

    n

    2 0

    as n , while Dab is fixed at (a b)2 > 0.

    The following three examples show how the notion ofdissimilarity appears in the contexts of perceptual discriminationand categorization.

    Example 1.5. Let the stimulus set S be represented by [1, U](e.g., the set of intensities of sound between an absolute thresholdtaken for 1 and an upper threshold U), and let xy be theprobability with which y is judged to be greater than x in some

    respect (e.g., loudness). Let

    xy =

    y x

    x

    ,

    where is a continuously differentiable probability distributionfunction on Rwith (0) = 1

    2and (0) > 0. Then the quantity

    Dxy =

    y x

    x

    1

    2

    is a dissimilarity function.1 Indeed, its compliance withD2 is clear,and D1 follows from the fact that is nondecreasing on R andincreasing in an open neighborhood of zero. To demonstrate theintrinsic uniform continuity, D3, observe that the convergenceDxnx

    n 0 on [1, U] [1, U] is equivalent to the conventional

    convergencexn x

    n

    0, and that on this compact areaDxy is uniformly continuous in the conventional sense. Finally,to demonstrate the chain property, D4, we use Remark 1.3 toconclude that DanXnbn 0 implies (assuming, without loss ofgenerality, an bn)bn

    an

    dx

    x

    1

    2

    =bn

    an

    (0) dxx = (0) log bnan 0,

    and the latter is equivalent to

    Danbn =bn an

    an 0.

    Example 1.6. Consider a set of probability vectors

    S0 =

    (p1, . . . ,pk) [0, 1]k

    :

    pi = 1

    .

    The value of pi can be interpreted as the probability withwhich a stimulus is classified into the ith category among anexhaustive list of k mutually exclusive categories. Two stimulicorresponding to one and the same vector (p1, . . . ,pk) are treatedas indistinguishable, so one can speak of the vector as a stimulus.Define a measure of difference of a point q = (q1, . . . , qk) froma point p = (p1, . . . ,pk) as

    2

    Dpq =

    ki=1

    (qi pi) logqi

    pi.

    Then, for any c ]0, 1k

    [, D is a (symmetric) dissimilarity functionon

    Sc =

    (p1, . . . ,pk) [c, 1 (k 1) c]k :

    pi = 1

    .

    The propertiesD1D2 are verified trivially. The intrinsic uniformcontinuity, D3, follows from the fact that D is uniformlycontinuous in the conventional sense on the compact domainScSc, and that the conventional convergence (in Euclidean norm)pn pn 0 is equivalent to Dpnpn 0. (Note that this will

    1 This is a special case of the mirror-reflection procedure described inDzhafarov and Colonius (1999).

    2 D2pq is the symmetric KullbackLeibler divergence, the original divergencemeasure proposed in Kullback and Leibler (1951). The present example does not

    touchon manyimportant aspects of the relationship between divergencemeasures

    in theinformation geometry andthe dissimilarity cumulation theory: this is a topicfor a separate treatment.

  • 7/27/2019 j Math Psych

    3/4

    336 E.N. Dzhafarov / Journal of Mathematical Psychology 54 (2010) 334337

    no longer be true if we replace pi c with pi 0 or pi > 0 forsome i {1, . . . , k}.)To see that D satisfies the chain property,D4,we use the inequality

    Dpq Vpq =

    ki=1

    |pi qi| ,

    an immediateconsequence of Pinskersinequality (Csiszr, 1967).3

    It follows that for any chain Xwith elements in Sc,

    DpXq Vpq,

    whence the convergence DpnXnqn 0 implies Vpnqn 0,which in turn is equivalent to Dpnq

    n 0.

    Example 1.7. Let S be any finite set of points, and Dxy anyfunction subject toD1D2. ThenD3andD4 are satisfied trivially,and D is a dissimilarity function.

    2. Conditions relating dissimilarities, quasimetrics, and met-

    rics

    Of the four properties defining dissimilarity, the chain property,D4, is the only one not entirely intuitive. It is not always

    obvious how one should go about testing the compliance of afunction with this property. The criterion (necessary and sufficientcondition) given in Theorem 2.5 often proves helpful in this issue.Conveniently and somewhat surprisingly, it turns out to be acriterion for the conjunction of the conditionsD1,D2,D4 ratherthan justD4, and it may even help in dealing with D3.

    We need first to establish terminological clarity in treatingmetric-like functions which may lack symmetry. By simplydropping the symmetry axiom from the list of metric axioms onecreates the notion of a quasimetric.

    Definition 2.1. Function M : SS R is a quasimetric functionif it has the following properties:

    QM1 (positivity) Mab > 0 for any distinct a, b S;

    QM2 (zero property) Maa = 0 for any a S;

    QM3 (triangle inequality) Mab + Mbc Mac for all a, b, c S.

    Any symmetric metric is obviously a quasimetric. If M is aquasimetric then Mxy+ Myx(or max {Mxy, Myx}) is a symmetricmetric.

    The notion of a quasimetric, however, turns out to be toounconstrained to serve as a good formalization for the intuitionof an asymmetric (oriented) metric. In particular, a quasimetricdoes not generally induce a uniformity,4 which means thatgenerally it is not a dissimilarity function (see Dzhafarov &Colonius, 2007).

    Example 2.2. LetS be represented by the interval ]0, 1[, and let

    Qxy = H0 (x y) + (y x) ,

    where H0 (

    a)

    is a version of the Heaviside function (equal 0 fora 0 and equal 1 for a > 0). The function clearly satisfiesQM1QM2, andQM3 follows from

    H0 (u) + H0 (v) H0 (u + v) ,

    for any real u, v. But Q is not a dissimilarity function as it is notintrinsically uniformly continuous: take, e.g., any a = b = b andlet an a+ in the conventional sense. Then Qaa

    n = a

    n a 0

    and Qbb = 0, but Qanb = 1 + b an 1 while Qab = 0.

    3 The familiar form of Pinskers inequality is K2pq =k

    i=1 qi logqipi

    12

    V2pq,

    from which the present form obtains by D2pq = K2pq + K2qp.4 For the notion of uniformity see, e.g., Kelly (1955, Chapter 6). A brief reminder

    of the basic properties of a uniformity and the relation of this notion to that of a

    dissimilarity function can be found in Dzhafarov and Colonius (2007, Sections 2.2and 2.6).

    [Compared to the uniformity induced by a dissimilarity function(Dzhafarov & Colonius, 2007, Section 2.2), one can check that thesets A = {(x,y) : Qxy < } taken for all positive do not forma uniformity basis on S because, for any < 1 and any > 0,A

    1 = {(y,x) : Qxy < } does not include as a subset A .]

    We see that simply dropping the symmetry axiom is not com-pletely innocuous. At the same time the symmetry requirement

    is too stringent both in and without the present context (e.g., indifferential geometry symmetry is usually unnecessary in treatingintrinsic metrics). A satisfactory definition of an asymmetric met-ric can be achieved by replacing the symmetry requirement withsymmetry in the small.

    Definition 2.3. Function M : S S R is a metric (or distancefunction) if it is a quasimetric with the following property:

    M (symmetry in the small) for any > 0 onecan finda > 0 suchthat Mab < implies Mba < , for any a, b S.

    What is important for our purposes is that any metric in thesense of this definition is a dissimilarity function.

    Theorem 2.4. A quasimetric is a dissimilarity function if and only if

    it is a metric.

    Proof. A dissimilarity function satisfies the symmetry in the smallcondition (Dzhafarov & Colonius, 2007, Theorem 1).This provestheonly if part. For the if part, the compliance of a metric M withD1D2 is obvious. By the triangle inequality, for any a, b, a, b S,

    Maa + Mbb Mab Mab

    Maa + Mbb Mab Mab.

    From Definition 2.3, for any > 0 there is a > 0 such that

    max

    Maa, Mbb

    0 one can find a > 0 such that, for any a, b S,

    if Mab < , then Dab < .

    Remark 2.6. A slightly less rigorous but more compact formula-tion of the theorem is this: a real-valued binary function D on S

    satisfiesD

    1,D

    2, andD

    4 if and only if for all a, b S

    with suffi-ciently small Dab the latter majorates a quasimetric Mab such thatManbn 0 implies Danbn 0.

    Proof. If such a quasimetric M exists, D satisfies D1 because ifa = b, then either Dab m > 0 or Dab Mab > 0. D2follows from the observations that (1) for any > 0 we have0 = Maa < , implying Daa < ; and (2) choosing < m wehave Daa Maa = 0. To show the compliance ofD withD4, notethat for any > 0, ifDaXb < min {m, } then

    DaXb MaXb Mab < ,

    implying Dab < .Conversely, ifD satisfiesD1,D2, andD4, then function

    Mab = infXS DaXb

    is a quasimetric. Its compliance withQM3 follows from

    Mab + Mbc = infX,YS

    DaXbYc

    = infZ S

    b is in Z

    DaZc infZS

    DaZc = Mac.

    To see that M satisfiesQM2 andQM1, observe that DaXb 0, forall a, b S and X S. Since S includes the empty chain, we get

    Maa = infXS

    DaXa = 0.

    For a = b,

    Mab = infXS

    DaXb > 0,

    because if the infimum could be zero then, for every > 0,we would be able to find an X with DaXb < , and this wouldcontradictD4 since Dab > 0. To complete the proof it remains toobserve that Dab Mab for all a, b S.

    In practice the quasimetric M of the theorem often can bechosen to be a metric in the sense ofDefinition 2.3. This makes itpossible to speak of the function D as being or not being uniformlycontinuous with respect to (the uniformity induced by) the metricM: D possesses this property ifMana

    n 0 and Mbnb

    n 0 imply

    Danbn Danbn 0.

    Corollary 2.7. If M in Theorem 2.5 is a metric, then D and M inducethesame uniformity, that is,the convergence Dxnyn 0 is equivalent

    to the convergence Mxnyn 0, for any sequences {xn} , {yn} inS. Consequently, if D is uniformly continuous with respect to (the

    uniformity induced by) M, then D is a dissimilarity function.

    Proof. The implication ifMxnyn 0 then Dxnyn 0 holds bythe definition ofM, while the reverse implication follows from Dmajorating Mwhenever the formeris sufficientlysmall. As a result,ifD is uniformly continuous with respect to M then it satisfiesD3

    (and by the previous theorem it satisfies D1,D2,D4).

    Thus, the properties D1, D2, and D4 in Example 1.5could be established by noting that for a sufficiently smallm > 0,

    (a) 12

    majorates k |a| on the interval 1 m 12

    ,

    1

    m + 12

    , where k is any positive number less than the

    minimum (a) on this interval.It would followthen that ifDxy = yxx

    1

    2

    < m (where x,y [1, U]), thenDxy > k

    |y x|

    x

    k

    U|y x| = Mxy,

    the latter being a (symmetric) metric. Since Mxnyn 0 clearlyimplies Dxnyn 0, the properties in question follow byTheorem 2.5. The property D3 obtains by Corollary 2.7: M and

    D are uniformly equivalent (i.e., one of them converges to zero ifand only if so does the other), and it is clear that D is uniformlycontinuous on [1, U] in the conventional sense (which here means,with respect to M).

    In Example 1.6, Pinskers inequality is all one needs to mentionto establish D1, D2, and D4 by Theorem 2.5. D3 follows byCorollary 2.7 using the same reasoning as in the previous example.

    Acknowledgments

    This research has been supported by NSF grant SES 0620446and AFOSR grants FA9550-06-1-0288 and FA9550-09-1-0252 toPurdueUniversity. The author is most grateful to Hans Colonius forcritically reading a draft of this note andsuggesting improvements.

    References

    Csiszr, I. (1967). Information-type measures of difference of probability distribu-tions and indirect observations. Studia Scientiarum Mathematicarum Hungarica,2, 299318.

    Dzhafarov, E. N. (2008a). Dissimilarity cumulation theory in arc-connected spaces.Journal of Mathematical Psychology, 52, 7392;See also Dzhafarov, E. N. (2009). Corrigendum to: Dissimilarity cumulationtheory in arc-connected spaces [J. Math. Psychol. 52 (2008) 7392]. Journalof Mathematical Psychology, 53, 300.

    Dzhafarov, E. N. (2008b). Dissimilarity cumulation theory in smoothly-connectedspaces. Journal of Mathematical Psychology, 52, 93115.

    Dzhafarov, E. N., & Colonius, H. (1999). Fechnerian metrics in unidimensional andmultidimensionalstimulus spaces. Psychonomic Bulletinand Review, 6, 239268.

    Dzhafarov, E. N., & Colonius, H. (2007). Dissimilarity cumulation theory andsubjective metrics. Journal of Mathematical Psychology, 51, 290304.

    Kelly, J. L. (1955). General Topology. Toronto: Van Nostrand.Kullback, S., & Leibler, R. A. (1951). On information and sufficiency. Annals of

    Mathematical Statistics, 22, 7986.