

    Plausible reasoning from spatial observations

Jérôme Lang and Philippe Muller
IRIT - Université Paul Sabatier, F-31062 Toulouse, France

{lang, muller}@irit.fr

Abstract

This article deals with plausible reasoning from incomplete knowledge about large-scale spatial properties. The available information, consisting of a set of pointwise observations, is extrapolated to neighbour points. We use belief functions to represent the influence of the knowledge at a given point on another point; the quantitative strength of this influence decreases when the distance between both points increases. These influences are aggregated using a variant of Dempster's rule of combination taking into account the relative dependence between observations.

1 Introduction

This article aims at handling knowledge about large-scale spatial properties (e.g. soil type, weather), in contexts where this knowledge is only partial, i.e., some piece of information is known only at some given locations of space. We have investigated some means to perform plausible reasoning on this kind of information at any point in the considered space.

Several studies can be related to the question of imprecise knowledge in spatial databases, but they usually consider the question of representing incomplete knowledge about the location of spatial objects (using relational theories, more or less related to the seminal work of [Randell et al., 1992], or using fuzzy locations [Bloch, 2000]), or about vague regions [Cohn and Gotts, 1996], rather than about static properties and their distribution over a given space. [Würbel et al., 2000] apply revision strategies to inconsistency removal in geographical information systems. A completely different line of work, in the robotics literature, deals with map building using occupancy grids (see e.g. [Iyengar and Elfes, 1991]); it will be briefly discussed in Section 5.2.

While plausible reasoning has been applied to a variety of domains, it has rarely been applied to reasoning about spatial information. On the other hand, it has been applied to reasoning about temporal information, which gives some hints about how to do it for spatial information. Plausible reasoning about systems that evolve over time usually consists in assuming that fluents¹ do not change and therefore that their values persist from one time point to the subsequent one, unless the contrary is known (from an observation, for instance) or inferred; this implies some minimization of change. Now, the latter persistence paradigm can be transposed from temporal to spatial reasoning. In the very same line of reasoning, when reasoning about properties in space, it is (often) intuitively satisfactory to assume that, knowing from an observation that a given property φ holds at a given point x, then it holds as well at points "close enough" to x.

What we precisely mean by "close enough" depends on the nature of the region as well as on the property φ involved. Moreover, it is clear that the belief that φ "persists" from point x to point y is gradually decreasing: the closer y to x, the more likely φ observed at x is still true at y. This graduality can be modelled by order relations or by quantitative measures such as probability. However, as we explain in Section 3, pure probabilistic reasoning is not well-suited to this kind of reasoning, unless very specific assumptions are made. We therefore model persistence with the help of belief function theory, also known as the Dempster-Shafer theory. Belief functions (and their duals, plausibility functions) generalize probability measures and enable a clear distinction between randomness and ignorance that probability measures fail to make.

After giving some background on belief functions, we show how to infer plausible conclusions, weighted by belief degrees, from spatial observations. Then, we relate computational experiments, evoke information-theoretic and decision-theoretic issues, and conclude.

¹A fluent is a proposition which evolves over time.


2 Background on belief functions

The Dempster-Shafer theory of evidence [Dempster, 1967] [Shafer, 1976] is a generalization of probability theory enabling an explicit distinction between randomness and ignorance.

Let S be a finite set of possible states of the world (taken to be the set of possible values for a given variable, for the sake of simplicity), one of which corresponds to the real world. A (normalized) mass assignment is a mapping m : 2^S → [0, 1] such that m(∅) = 0 and Σ_{A ⊆ S} m(A) = 1. The condition that m(∅) = 0 is sometimes omitted (see [Smets and Kennes, 1994]): if m(∅) > 0 then we say that m is an unnormalized mass assignment. The interest of having an unnormalized mass assignment is the ability to keep track of a degree of conflict.

The subsets of S with a nonempty mass are called the focal elements of m. m is a simple support function iff there is a nonempty subset A of S and an α ∈ [0, 1] such that m(A) = α and m(S) = 1 − α (by convention, when specifying a mass assignment we omit subsets with an empty mass).

A mass assignment m induces two set functions Bel_m and Pl_m from 2^S to [0, 1]: the belief function Bel_m and the plausibility function Pl_m are defined respectively by: for all B ⊆ S, Bel_m(B) = Σ_{A ⊆ B} m(A) and Pl_m(B) = Σ_{A ∩ B ≠ ∅} m(A). When m is normalized, Bel_m(B) represents the probability of existence of at least one true piece of evidence for B, while Pl_m(B) represents the probability of existence of at least one true piece of evidence which does not contradict B.

When all focal elements are singletons, m can be viewed as a probability distribution on S; in this case Bel_m(A) = Pl_m(A) = Σ_{s ∈ A} m({s}), hence Bel_m and Pl_m coincide and are identical to the probability measure induced by m. Therefore Dempster-Shafer theory generalizes probability theory on finite universes.

The Dempster-Shafer theory of evidence enables an explicit distinction between randomness and ignorance that probability theory cannot².

²This is clear from the following two mass functions: m1({head, tails}) = 1; m2({head}) = m2({tails}) = 1/2. m2 represents a true random phenomenon such as tossing a regular coin, while m1 would correspond to a case where it is not reasonable to define prior probabilities on {head, tails} - imagine for instance that you were just given a parrot with the only knowledge that the two words it knows are "head" and "tails": there is absolutely no reason to postulate that it says "head" and "tails" randomly with probability 1/2 (nor with any other probability); it may well be the case, for instance, that it always says "head". This state of complete ignorance about the outcome of the event is well represented by the neutral mass function m(S) = 1.

Another crucial advantage of the theory of belief functions is that it is well-suited to the combination of information from different sources. The Dempster combination m1 ⊕ m2 of two (normalized) mass functions m1 and m2 on S is defined by

m1 ⊕ m2(A) = (1 / K(m1, m2)) Σ_{X,Y ⊆ S, X ∩ Y = A} m1(X) m2(Y),

where K(m1, m2) = Σ_{X,Y ⊆ S, X ∩ Y ≠ ∅} m1(X) m2(Y) is a normalization factor.

Importantly, this operation is associative, which enables its extension to the combination m1 ⊕ m2 ⊕ ... ⊕ mn of an arbitrary number n of mass assignments. When unnormalization is allowed, we define the unnormalized Dempster combination of two (normalized or not) mass assignments m1 and m2 on S by

m1 ⊕u m2(A) = Σ_{X,Y ⊆ S, X ∩ Y = A} m1(X) m2(Y).

The resulting m1 ⊕u m2(∅) measures the degree of conflict between m1 and m2.

Lastly, in some cases it is needed to transform a mass assignment into a probability distribution. This is the case for instance when performing decision-theoretic tasks. Importantly, this transformation should take place after combination has been performed and not before, as argued in [Smets and Kennes, 1994], who introduce the pignistic transform T(m) of a normalized mass assignment m, being the probability distribution on S defined by: for all s ∈ S, T(m)(s) = Σ_{A ⊆ S, s ∈ A} m(A)/|A|. Alternatives to the pignistic transform for decision making using belief functions are given in [Strat, 1994].
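To make the definitions above concrete, here is a small Python sketch (ours, not the paper's; the data structures and names are illustrative) of mass assignments over a finite frame, the induced belief and plausibility functions, Dempster combination in its normalized and unnormalized forms, and the pignistic transform.

```python
from itertools import product

def belief(m, b):
    """Bel_m(B) = sum of m(A) over nonempty A included in B."""
    return sum(v for a, v in m.items() if a and a <= b)

def plausibility(m, b):
    """Pl_m(B) = sum of m(A) over A intersecting B."""
    return sum(v for a, v in m.items() if a & b)

def combine_unnormalized(m1, m2):
    """Unnormalized Dempster combination; mass on frozenset() is the conflict."""
    out = {}
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        out[a & b] = out.get(a & b, 0.0) + wa * wb
    return out

def combine(m1, m2):
    """Normalized Dempster combination (conflict mass redistributed)."""
    out = combine_unnormalized(m1, m2)
    conflict = out.pop(frozenset(), 0.0)
    return {a: v / (1.0 - conflict) for a, v in out.items()}

def pignistic(m, frame):
    """Pignistic transform T(m)(s) = sum over A containing s of m(A)/|A|."""
    p = {s: 0.0 for s in frame}
    for a, v in m.items():
        for s in a:
            p[s] += v / len(a)
    return p

S = frozenset({"heads", "tails"})
m_ignorance = {S: 1.0}                                   # the parrot: total ignorance
m_random = {frozenset({"heads"}): 0.5, frozenset({"tails"}): 0.5}  # a fair coin
m_support = {frozenset({"heads"}): 0.8, S: 0.2}          # a simple support function

print(belief(m_ignorance, frozenset({"heads"})))         # 0.0
print(plausibility(m_ignorance, frozenset({"heads"})))   # 1.0
print(belief(m_random, frozenset({"heads"})))            # 0.5 (probability-like)
print(combine(m_support, m_random))                      # {'heads'}: ~0.83, {'tails'}: ~0.17
print(pignistic(m_ignorance, S))                         # {'heads': 0.5, 'tails': 0.5}
```

Representing focal elements as frozensets keeps the set operations (inclusion, intersection) that the definitions rely on directly available.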


3 Extrapolation from observations

3.1 Observations

From now on we consider a space E, i.e., a set of "spatial points" (which could be seen as either Euclidean points or atomic regions). E is equipped with a distance³ d.

³Recall that a distance is a mapping d : E × E → R+ such that (i) d(x, y) = 0 if and only if x = y; (ii) d(x, y) = d(y, x) and (iii) d(x, y) + d(y, z) ≥ d(x, z).

We are interested in reasoning on the evolution "in space" of some properties. For the sake of simplicity, the property of interest merely consists of the value of a given variable, whose domain is a finite set S. S is furthermore assumed to be purely qualitative, i.e., S is not a discretized set of numerical values. S may be for instance a set of possible soil types, or a set of weather types. The simplest case is when S is binary, i.e., the property of interest is a propositional variable the truth value of which we are interested in - for instance S = {rain, ¬rain}.

An observation function O is a mapping from Dom(O) ⊆ E to the set of nonempty subsets of S. O intuitively consists of a set of pointwise observations ⟨x, O(x)⟩ where x ∈ E and O(x) is a nonempty subset of S; such a pointwise observation means that it has been observed that the state of the world at point x belongs to O(x). O is said to be complete at x if O(x) is a singleton and trivial at x if O(x) = S. The range R(O) of O is the set of points where a nontrivial observation has been performed, i.e., R(O) = {x ∈ Dom(O) | O(x) ≠ S}.

3.2 Spatial persistence

The question is now how to extrapolate from an observation function O. As explained in the introduction, the spatial persistence principle stipulates that as long as nothing contradicts it, a property observed at a given point is believed to hold at points nearby, with a quantity of belief decreasing with the distance to the observation. This principle is now formally encoded in the framework of belief functions.

Let x be a given point of E, called the focus point. What we are interested in is to infer some new (plausible) beliefs about what holds at x. For this we consider a set of mass assignments {m_{y→x}, y ∈ R(O)} where each m_{y→x} is the simple support function defined by

m_{y→x}(O(y)) = f(O(y), d(x, y))
m_{y→x}(S) = 1 − f(O(y), d(x, y))

where f is a mapping from (2^S \ ∅) × R+ to [0, 1] s.t.

1. f is non-increasing in its second argument, i.e., α ≥ β implies f(obs, α) ≤ f(obs, β);
2. f(obs, α) = 1 if and only if α = 0;⁴
3. f(obs, α) → 0 as α → +∞.

f will be called a decay function. Decay functions for modelling decreasing beliefs over time were first used in [Dean and Kanazawa, 1989].

The intuitive reading of the mass assignment m_{y→x} is the following: the fact that O(y) is observed at y supports the belief that O(y) holds at x as well, to a degree which is all the higher as y is close to x. In particular, if x = y (thus d(x, y) = 0) then O(x) = O(y) has a maximal (and absolute) impact on x while, when y gets too far from x, this impact becomes null. By default (and like [Dean and Kanazawa, 1989] for temporal persistence) we will use exponential decay functions f(obs, δ) = exp(−δ/λ(obs)), where λ(obs) is a strictly positive real number expressing the "persistence power" of the observation obs (such a function is called an exponential decay function). This deserves further comments.

⁴As noticed by a referee, there are intuitive cases where condition 2 above could be weakened.

We first consider the case of complete observations, i.e., obs is a singleton {v}. λ({v}), written λ(v) without any risk of misunderstanding, characterizes the persistence degree of the value v: the higher λ(v), the stronger the spatial persistence of the property V = v. The two limit cases for λ(v) are:

• λ(v) = 0: by passage to the limit we write exp(−δ/λ(v)) = 0 and therefore the property V = v is non-persistent: as soon as d(x, y) > 0, the fact that V = v holds at point y does not support the belief that V = v should hold at x too. As an example, consider the property "the 5th decimal of the temperature at x is even". Clearly, this property is non-persistent (provided that the granularity of space is coarse enough);

• λ(v) = +∞: by passage to the limit we write exp(−δ/λ(v)) = 1 and therefore the property V = v is strongly persistent: as soon as it is true somewhere in space, it is true everywhere in E.
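To make the persistence machinery concrete, here is a minimal Python sketch (not from the paper; the names and the sample numbers are illustrative) of an exponential decay function and of the simple support mass m_{y→x} it induces, as defined above.

```python
import math

def exp_decay(distance, lam):
    """Exponential decay f(obs, delta) = exp(-delta / lambda(obs)).

    lam is the 'persistence power' of the observed value; the larger lam,
    the longer the observation keeps influencing distant points.
    """
    if lam == 0:          # limit case: non-persistent property
        return 1.0 if distance == 0 else 0.0
    if math.isinf(lam):   # limit case: strongly persistent property
        return 1.0
    return math.exp(-distance / lam)

def simple_support(observed_subset, distance, lam, frame):
    """Mass assignment m_{y->x}: mass f on the observed subset and the
    remaining mass (ignorance) on the whole frame S."""
    f = exp_decay(distance, lam)
    masses = {frozenset(frame): 1.0 - f}
    key = frozenset(observed_subset)
    masses[key] = masses.get(key, 0.0) + f
    return masses

S = {"rain", "no_rain"}
# An observation of rain at a point 5 distance units away from the focus
# point, with an (illustrative) persistence power lambda = 10.
m = simple_support({"rain"}, distance=5.0, lam=10.0, frame=S)
print(m)   # ~0.61 on {'rain'}, ~0.39 on the whole frame
```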

How λ(v) is determined depends on the variable V and the value v involved. It may be determined by experience. Considering a point x where V = v is known to hold, the probability of the relevant persistence of V = v from y to x (which may sometimes be understood as the probability of continuous persistence from y to x - this will be discussed later), according to the formula above, is exp(−d(x, y)/λ(v)). In particular, if d_{1/2}(V = v) is the "half persistence" of V = v, i.e., the distance for which the probability of relevant persistence is equal to 1/2, then we have λ(v) = d_{1/2}(V = v)/ln 2.

Now, when the observation obs is not a singleton, its persistence decay function will be taken to be that of its most weakly persistent element, i.e., λ(obs) = min_{v ∈ obs} λ(v).

The critical point is the reference to relevant persistence rather than to simple persistence. Assume that we try to build an approximately valid weather map and that the property rain = true is observed at point x. Clearly, this property being known to have a


significant spatial persistence, this piece of knowledge is strong evidence to believe that it is also raining at a very close point y, such as, say, d(x, y) = 1 km. It is not at all evidence to believe that it is raining at z where d(x, z) = 8000 km; hence, the impact of x on z regarding rain is (almost) zero. This does not mean that the probability of raining at z is (almost) zero. It may well be the case that it is raining at z; but in this case, the fact that it is raining at z is (almost certainly) unrelated to the fact that it is raining at x, because, for instance, the air masses and the pressure at these two points (at the same time) are independent. The impact f(rain, d(x, z)) of x on z regarding rain can be interpreted as the probability that, knowing that it is raining at x, it is also raining at z and these two points are in the same "raining region". Hence the terminology "relevant persistence", which may also be interpreted as "continuous persistence" (i.e., persistence along a continuous path) if we assume moreover that a raining region is self-connected⁵.

⁵And, in a stronger way, as "linearly continuous persistence" if we assume that a raining region is not only self-connected but also convex.

This is where the difference between pure probability and belief functions (recall that the latter generalize probability theory) is the most significant: in a pure probabilistic framework, this impact degree, or probability of relevant persistence, cannot be distinguished from a usual probability degree. If we want to express probabilities of persistence in a pure probabilistic framework, we need a mapping g_rain : R+ → [0, 1] s.t. Prob(Holds(x, rain) | Holds(y, rain)) = g_rain(d(x, y)). This mapping g is different from f. More precisely, g ≥ f holds, and g gets closer and closer to f as d(x, y) gets smaller and smaller; when d(x, y) becomes large (with respect to the persistence degree of rain), the impact f tends to 0 while g tends to the prior probability of raining at x. From this we draw the following conclusion: a pure probabilistic modelling of spatial persistence needs not only some knowledge about how properties persist over space but also a prior probability that the property holds at each point of space; the latter, which may be hard to obtain, is not needed with the belief function modelling of persistence.

The second drawback of a pure probabilistic modelling of spatial persistence is the lack of distinguishability between ignorance and conflict. Suppose (without loss of generality) that the (uniform) prior probability of rain is 1/2. Consider the four points w, x, y, z where x is very close to y and z, halfway between both, and w is very far from y and z. Suppose that it has been observed that it is raining at y and that it is not raining at z. The probability, as well as the belief, that it is raining at x, are very close to 1/2. The explanation of this value is the following: the two pieces of evidence that it is raining at y and not raining at z have a strong impact on x and are in conflict. An analogy with information merging from multiple sources is worthwhile: the rain observed at y and the absence of rain at z can both be considered as information sources, the first one telling that it is raining at x and the second one that it is not, the reliability of the sources being a function of the distance between them and the focus point x. In the absence of a reason to believe one source more than the other, the probability that it is raining at x is 1/2. This has nothing to do with the prior probability of rain: had this prior been 0.25, the probability that it is raining at x would still have been 1/2.

Consider now w as the focus point; w being very far from y and z, their impact is almost zero and the probability of rain at w is (extremely close to) the prior probability of rain, i.e., 1/2. This value of 1/2 is a prior and comes from ignorance rather than from conflict. Therefore, probability cannot distinguish between what happens at x and at w, i.e., it cannot distinguish between conflictual information and lack of information. Belief functions, on the other hand, do make this distinction: while the belief of raining at x would have been close to 1/2, the belief of raining at w, as well as the belief of not raining at w, would have been close to 0. Hence the second conclusion: a pure probabilistic modelling of spatial persistence does not allow for a distinction between conflictual information and lack of information, while the belief function modelling does.

3.3 Combination

Once each observation is translated into a simple support function m_{y→x}, the belief about the value of the variable V is computed by combining all mass assignments m_{y→x} for y ∈ R(O). A first way of combining them consists in applying mere Dempster combination, i.e., m_x = ⊕_{y ∈ R(O)} m_{y→x}.
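The conflict/ignorance distinction discussed above can be checked numerically. The following Python sketch is illustrative only (the support strengths and helper names are ours, not the paper's): it combines two strong, conflicting simple support functions for a focus point x lying between a "rain" and a "no rain" observation, and two very weak ones for a remote focus point w, and prints the resulting beliefs.

```python
from itertools import product

FRAME = frozenset({"rain", "no_rain"})

def simple_support(value, strength):
    """Mass 'strength' on {value}, the rest on the whole frame (ignorance)."""
    return {frozenset({value}): strength, FRAME: 1.0 - strength}

def dempster(m1, m2):
    """Normalized Dempster combination of two mass assignments."""
    combined, conflict = {}, 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            conflict += wa * wb
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

def belief(m, subset):
    return sum(v for k, v in m.items() if k <= subset)

# Focus point x: close to both a 'rain' and a 'no_rain' observation,
# hence two strong but conflicting sources.
m_x = dempster(simple_support("rain", 0.95), simple_support("no_rain", 0.95))
# Focus point w: very far from both observations, hence two very weak sources.
m_w = dempster(simple_support("rain", 0.01), simple_support("no_rain", 0.01))

print(belief(m_x, frozenset({"rain"})))   # ~0.49 (conflict)
print(belief(m_w, frozenset({"rain"})))   # ~0.01 (ignorance)
print(belief(m_w, FRAME - {"rain"}))      # ~0.01 (ignorance)
```

In both cases the pignistic probability of rain is about 1/2; only the belief values reveal whether that 1/2 reflects conflicting evidence or simply the absence of information.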

If one wishes to keep explicit track of the measure of conflict then one may use unnormalized Dempster combination instead. However, a naive use of Dempster combination has a severe drawback. Consider the following space E = {x, y, z, w} where d(x, y) = 1; d(x, w) = d(y, w) = 20; d(z, w) = 20; d(x, z) = d(y, z) = 19, and the observation function O concerning rain: O(x) = O(y) = {true}; O(z) = {false}. The focus point is w. We take an exponential decay function with a uniform λ = 30. The mass assignments m_{x→w}, m_{y→w} and m_{z→w} are the following:


[Figure: the four points of E; x and y are close together, while z lies on the other side of the focus point w, with the distances given above.]

m_{x→w}({true}) = exp(−20/30) ≈ 0.5     m_{x→w}({true, false}) ≈ 0.5
m_{y→w}({true}) = exp(−20/30) ≈ 0.5     m_{y→w}({true, false}) ≈ 0.5
m_{z→w}({false}) = exp(−20/30) ≈ 0.5     m_{z→w}({true, false}) ≈ 0.5

The combination m_w = m_{x→w} ⊕ m_{y→w} ⊕ m_{z→w} yields

m_w({true}) ≈ 0.6     m_w({false}) ≈ 0.2     m_w({true, false}) ≈ 0.2

Clearly, this is not what we expect: x and y being close to each other, the pieces of information that it is raining at x and at y are clearly not independent, and thus the mass assignments m_{x→w} and m_{y→w} should not be combined as if they were independent. On the other hand, when m_{x→w}, m_{y→w} and m_{z→w} are identical to those above but x is no longer close to y, the result m_w = m_{x→w} ⊕ m_{y→w} ⊕ m_{z→w} is intuitively correct.

To remedy this problem, we introduce a discounting factor when combining mass assignments. The discount grows with the dependence between the sources, i.e., with the proximity between the points where the observations have been made.

We use here a method inspired from multi-criteria decision making (where positive or negative interactions between criteria have to be taken into account when aggregating the scores associated to the different criteria). Assuming that E is finite, for X ⊆ E and z ∈ E \ X, we introduce a conditional importance degree μ(z|X) ∈ [0, 1] expressing the importance of the knowledge gathered at point z once the points in X have been taken into account. The quantity 1 − μ(z|X) is therefore a discount due to the dependence between the information at z and the information already gathered.

Intuitively, it is desirable that the further z is from X, the higher μ(z|X). When z is sufficiently far from X, there is no discount and μ(z|X) is taken to be 1. Several choices are possible for μ. In the implementation we chose the following function⁶:

⁶Its intuitive justification, which is based on an analogy with fuzzy measures and interaction indexes in multi-criteria decision making, would be rather long and complicated to explain without introducing several further definitions. We omit it because this is not the main scope of the paper.

For any X ⊆ E and for any z ∈ E \ X, μ(z|X) = min(1, μ(X ∪ {z}) − μ(X)), where μ(∅) = 0, μ(X) = 1 if |X| = 1 and, for any X of cardinality n ≥ 2, μ(X) = 2 − Σ_{{u,v} ⊆ X, u ≠ v} e^{−d(u,v)/λ}, where λ is a positive real number. In particular we have μ(z|∅) = 1 and μ(z|{y}) = 1 − e^{−d(y,z)/λ}. Taking λ = 10, on the example above we have μ(y|{x}) ≈ 0.095 and μ(z|{x}) ≈ 0.85.

Now, the aggregation of the n mass assignments m_{y→x}, y ∈ R(O) \ {x}, with respect to μ is done by the following algorithm. Let x be the focus point and R(O) the points where a nontrivial observation has been performed.

1. Sort the points in R(O) by increasing order of their distance to x, i.e., let L_O(x) be the ordered list (y_1, ..., y_n) where R(O) = {y_1, ..., y_n} and d(x, y_1) ≤ ... ≤ d(x, y_n);

2. for i = 1 to n do
   m'_i(O(y_i)) = 1 − (1 − f(O(y_i), d(x, y_i)))^{μ(y_i | {y_1, ..., y_{i−1}})}
   m'_i(S) = (1 − f(O(y_i), d(x, y_i)))^{μ(y_i | {y_1, ..., y_{i−1}})}

3. compute m_x = ⊕_{i=1..n} m'_i.

This way of combining, by first reranking and then using interaction factors, is reminiscent of the aggregation operator known in multi-criteria decision theory as the Choquet integral. Formal analogies will not be discussed further here.

In practice, it is often the case that each pointwise observation is precise, i.e., O(y_i) = {v_i} for each y_i ∈ R(O). In this case, the above combination operation can be written in a much simpler way, given a few preliminary notations: each m_i has a single focal singleton, with mass a_i = m_i({v_i}); for each value v appearing in the observations, let P_v = {k ∈ [1..n] : m_k({v}) ≠ 0} and N_v = {k ∈ [1..n] : m_k({v}) = 0}. In that case, it is easy to show that combination without discount yields

m({v}) = (1 − Π_{k ∈ P_v} (1 − a_k)) · Π_{k ∈ N_v} (1 − a_k),

whereas combination with discount yields

m({v}) = (1 − Π_{k ∈ P_v} (1 − a_k)^{μ_k}) · Π_{k ∈ N_v} (1 − a_k)^{μ_k}, where μ_k = μ(y_k | {y_1, ..., y_{k−1}}).
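As an illustration (ours, not the paper's implementation: it assumes the distances and decay parameters of the four-point example above, with d(x, y) = 1, distance 20 from each observation to w, λ = 30 for the decay and λ' = 10 for the interaction), the following Python sketch computes the combined mass at the focus point w with plain Dempster combination and with the distance-based discount applied as an exponent, as in the algorithm above.

```python
import math
from itertools import product

S = frozenset({"true", "false"})

def dempster(m1, m2):
    """Unnormalized Dempster combination (mass on the empty set = conflict)."""
    out = {}
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        out[a & b] = out.get(a & b, 0.0) + wa * wb
    return out

def normalize(m):
    conflict = m.get(frozenset(), 0.0)
    return {k: v / (1.0 - conflict) for k, v in m.items() if k}

def support(value, strength):
    return {frozenset({value}): strength, S: 1.0 - strength}

# Observations: rain at x and y (d(x, y) = 1), no rain at z;
# all three points at distance 20 from the focus point w.
lam, lam_interaction = 30.0, 10.0
f = math.exp(-20.0 / lam)                      # ~0.51 for each observation

# Plain combination: x and y are wrongly treated as independent sources.
plain = normalize(dempster(dempster(support("true", f), support("true", f)),
                           support("false", f)))

# Discounted combination (processing order x, y, z): y comes after x, so its
# mass is raised to the power mu(y|{x}) = 1 - exp(-d(x,y)/lambda'), which is
# small because y is very close to x; z is far from both, so no discount.
mu_y = 1.0 - math.exp(-1.0 / lam_interaction)  # ~0.095
f_y = 1.0 - (1.0 - f) ** mu_y                  # heavily discounted support
discounted = normalize(dempster(dempster(support("true", f), support("true", f_y)),
                                support("false", f)))

print(plain)        # ~0.61 on {true}, ~0.20 on {false}, ~0.19 on S
print(discounted)   # ~0.37 on {true}, ~0.32 on {false}, ~0.31 on S
```

The discounted result shows the intended behaviour: the second, nearly redundant observation at y barely strengthens the belief in rain at w.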


4 Experiments

We recall that what we focus on is the plausible extrapolation of information: given a set of observations on E, what is the likelihood of the truth of a formula at a point outside of the set of observations? For the experiments, we used a binary value domain, namely S = {white, black}.

We compute the overall mass assignment for each location x in the space E, by combining the mass assignments induced by every point y in the observation set R(O). We have two courses of action from here. Either we make a plain Dempster combination of all the simple support functions, m_x = ⊕_{y ∈ R(O)} m_{y→x}, or we make a correction based on a Choquet integral applied to the exponents (as explained at the end of Section 3.3), to lower the influence of close concurring observations (for which the independence hypothesis cannot hold). After this combination is performed, we can decide whether to normalize the results (by assigning the mass of the null set to the other possible sets) or to keep a non-zero mass for contradictory information. Keeping un-normalized resulting mass assignments helps visualizing the conflictual regions.

We chose the following experimental framework: we consider a space E of pixels, ordered along two axes, and for which the distance relation is the Euclidean distance. The distance unit is then one, the factors λ(white), λ(black) are uniformly fixed at 3, and we took the same value of 3 for the coefficient of interaction between observations⁷. The best way to illustrate our results would be to assign a color to each pixel, assigning a red intensity to one value, a blue intensity to the other (and eventually a green intensity to the belief in the empty set, if one wants to keep track of the level of contradiction). This way a black pixel reveals no information, and a purple one would reveal conflicting values. Since color is not possible in this article we will show figures in shades of gray. Each observation point will be in black or white, and the shade of gray for each interpolation will be a difference between the combined masses of the two values (normalized). In order to see the observation points, the more likely a point is to have a value close to a black observation, the whiter it is, and conversely. A middle gray will indicate similar levels of both values.

For instance, figure 1 shows the result for three close concurring "black" observations next to a single "white" observation. In one case the information is corrected to take into account the fact that close points are related and do not express independent sources. In the other one we have made a plain Dempster combination. We can see that the three black points combined have an influence similar to a single point. In the limit case where the three points are exactly identical we would have exactly the same result as with only one point (illustrating this would not be very spectacular). Figure 2 nonetheless shows different levels of interpolation varying with the distance between concurring observations, with or without the Choquet-like correction.

Figure 1: Corrected (left) and non-corrected (right) interpolation, with conflicting values and three concurring, close observations

Figure 2: Corrected normalized (left) and non-corrected normalized (right) interpolation, with varying distances between observations with identical values

⁷These settings proved empirically to give visual results that illustrate well the principles we use here. Obviously, these factors should be tailored for specific spatial properties with respect to the scale of the actual observed space. Moreover, other distances could be considered, where the interaction and the persistence of relevance would take into account other factors.
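The pixel-map experiment can be sketched as follows in Python (an illustration under our own assumptions: a tiny grid, a handful of hand-placed observations, λ = 3 for the decay, and no interaction correction); each cell gets the normalized combined masses of black and white, and the printed value is their difference, as in the gray-scale figures.

```python
import math
from itertools import product

S = frozenset({"black", "white"})
LAM = 3.0   # uniform persistence factor, as in the experiments

def support(value, mass):
    return {frozenset({value}): mass, S: 1.0 - mass}

def dempster(m1, m2):
    out = {}
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        out[a & b] = out.get(a & b, 0.0) + wa * wb
    return out

def normalize(m):
    k = 1.0 - m.get(frozenset(), 0.0)
    return {key: v / k for key, v in m.items() if key}

# Hand-placed pointwise observations on a 10x10 grid (illustrative only).
observations = {(2, 2): "black", (2, 3): "black", (7, 7): "white"}

def extrapolate(px, py):
    """Combined, normalized mass assignment at pixel (px, py)."""
    m = {S: 1.0}                               # start from total ignorance
    for (ox, oy), value in observations.items():
        dist = math.hypot(px - ox, py - oy)
        m = dempster(m, support(value, math.exp(-dist / LAM)))
    return normalize(m)

for y in range(10):
    row = []
    for x in range(10):
        m = extrapolate(x, y)
        shade = m.get(frozenset({"black"}), 0.0) - m.get(frozenset({"white"}), 0.0)
        row.append(f"{shade:+.1f}")
    print(" ".join(row))
```

The corrected maps would additionally discount near-duplicate observations before combining, as in Section 3.3.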

5 Information-theoretic and decision-theoretic issues

Plausible information can be very useful in the context of decision-making, when decisions have to be made on the basis of uncertain information.

5.1 Information intensity maps

Our framework can be used to measure the variations of the quantity of information over space. In order to do so, we may compute a probability distribution on S at each point of E, using for instance the pignistic transform, and the information level at each point can then be computed using usual information-


theoretic measures such as entropy⁸. Hence we can build a map where each point x is associated with the entropy of its final probability distribution. Entropy increases as information decreases; in other words, the quantity 1 − H(p) measures the quantity of information of p. Minimal entropy is obtained at points at which a complete observation has been made. Maximal entropy is obtained at points associated with a uniform probability distribution (if any). Note that this uniform probability distribution may come either from conflictual observations or from a lack of information: as explained in Section 3.2, once the combined mass assignment has been transformed into a probability distribution, there is no longer a way to distinguish conflict from lack of knowledge.

This is true independently of the number of values we consider for a spatial fluent, but to illustrate the process, we show in figure 3 the level of information using as before the 2-valued set S = {white, black}. In this case, the quantity of information 1 − H(p) grows with |p(white) − 1/2|. Information is minimal when p(white) = p(black) = 1/2 and maximal when p(white) = 1, p(black) = 0 or p(white) = 0, p(black) = 1. The shade of gray is proportional to |p(white) − 1/2|: this way, a black point corresponds to a low amount of information and a white point to a high one. Again we show the results both with and without correction.

    Figure 3: Corrected normalized (left) and non-corrected normalized (right) level of information
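A minimal Python sketch of the information-intensity computation for one point, assuming a binary frame and a combined mass assignment already obtained as above (the function names and the example masses are ours):

```python
import math

def pignistic(m, frame):
    """Pignistic transform of a normalized mass assignment."""
    p = {s: 0.0 for s in frame}
    for subset, mass in m.items():
        for s in subset:
            p[s] += mass / len(subset)
    return p

def entropy(p):
    return -sum(v * math.log(v) for v in p.values() if v > 0)

frame = frozenset({"white", "black"})
# Example combined mass at some point: conflicting evidence, little ignorance.
m = {frozenset({"white"}): 0.45, frozenset({"black"}): 0.45, frame: 0.10}

p = pignistic(m, frame)
print(p)                # {'white': 0.5, 'black': 0.5}
print(1 - entropy(p))   # low information level (entropy is maximal here)
```

The same 0.5/0.5 pignistic distribution would be obtained from total ignorance (m(S) = 1), which is exactly the indistinguishability discussed in Section 3.2.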

5.2 Decision-theoretic map construction

We are now interested in the following problem: given a set of observations O, where (and what) is it worthwhile to measure next? This problem, already considered in the field of robotics (where it has received a pure probabilistic treatment), is relevant not only for exploring geographical data but also for the question of granularity-dependent representations. Indeed, given a coarse-grain representation of spatial information seen as a set of observations on a larger space, which locations are the most informative when one wants to switch to a finer-grained representation?

⁸We recall that the entropy of a probability distribution p over a finite set S is defined as H(p) = Σ_{s ∈ S} −p(s) ln p(s).

An information intensity map already gives some hints about where it would be interesting to make new measures: measures seem more useful in regions in which the information quantity is low. However, picking the point of E with the lowest amount of information is not sufficient in general. Especially, it is relevant to make a difference between points where nothing is known because the observations are too far, and the ones where there is conflict between observations at points nearby.

If one is interested in choosing one measurement, a classical heuristic is the maximum expected entropy loss⁹. This, however, works well only if (1) only one more measurement has to be made; (2) the measurements have uniform costs; (3) the utility of a gain of information does not depend on the value observed nor on the location of the measurement. The more general problem of determining an optimal measurement policy over a given number of steps can be cast in the framework of Partially Observable Markov Decision Processes. This requires the specification not only of measurement costs but also of a utility function which grows with the global amount of information (and which possibly takes account of the relative importance of some values or of some regions). This point is left for further research, and should be positioned with respect to the recent work of [Kortenkamp et al., 1997] extending the idea of occupancy grids with the use of MDPs.

Once a series of measurements has been done, one may decide either to go on or to stop the measurements; if the quantity of information is considered high enough (relative to the expected cost of new measurements), we can then easily compute a "plausible map" from the result of the combination step, by assigning each point of the space the value with the highest probability, in order to represent the most likely distribution of the spatial property considered. Figure 4 shows the result on a sample observation set, with two different levels of gray for each value. One can again observe that the correction decreases the likeliness of a value near concurring measures. In practice, it would probably be better to decide on a threshold under which the belief in a value is irrelevant before the pignistic transformation: if we know that the belief in value 1 is 0.05 and the belief in value 2 is 0.04 (thus the belief in the set {1, 2} is 0.91), we don't want to assume that it is more likely that value 1 holds, and thus we would like the map to remain undetermined at this point.

⁹This heuristic is widely used in model-based diagnosis when choosing the next test to perform.
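As a sketch of the classical maximum-expected-entropy-loss heuristic mentioned above (our own illustration, not the paper's procedure: the 1-D space, the stub predictor and the function names are assumptions), one can score each candidate location by the expected drop in total map entropy if it were observed, weighting each possible outcome by its currently predicted probability.

```python
import math

POINTS = list(range(10))              # a toy 1-D space
VALUES = ("black", "white")

def predict(observations, p):
    """Stub predictor: probability of each value at point p. Any extrapolation
    model (e.g. the belief-function one above, followed by the pignistic
    transform) could be plugged in; this toy version votes with exponential decay."""
    score = {v: 0.0 for v in VALUES}
    for q, v in observations.items():
        score[v] += math.exp(-abs(p - q) / 3.0)
    total = sum(score.values())
    if total == 0:
        return {v: 1.0 / len(VALUES) for v in VALUES}
    return {v: s / total for v, s in score.items()}

def entropy(dist):
    return -sum(x * math.log(x) for x in dist.values() if x > 0)

def map_entropy(observations):
    return sum(entropy(predict(observations, p)) for p in POINTS)

def expected_entropy_loss(observations, candidate):
    """Expected drop in total map entropy if 'candidate' were measured,
    weighting each possible outcome by its currently predicted probability."""
    current = map_entropy(observations)
    dist = predict(observations, candidate)
    expected = 0.0
    for value, prob in dist.items():
        hypothetical = {**observations, candidate: value}
        expected += prob * map_entropy(hypothetical)
    return current - expected

obs = {1: "black", 8: "white"}
best = max((p for p in POINTS if p not in obs),
           key=lambda p: expected_entropy_loss(obs, p))
print(best)   # the location whose measurement is expected to be most informative
```

As the text notes, this only addresses a single measurement with uniform costs; a measurement policy over several steps would require the POMDP formulation mentioned above.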


Figure 4: Deciding on the most likely map, with (left) or without (right) corrections

6 Conclusion

We have given here a way of using a set of theoretical tools for the study of (partial) spatial information. By modelling intuitions about the likelihood of spatial properties which depend only on distance factors (persistence, influence), we have shown how to infer plausible information from a set of observations. The field is now open to experimental investigations of the various parameters we have introduced in a generic way, as our ideas have been quite easy to implement.

Our way of extrapolating beliefs could be applied to fields other than spatial reasoning. A similar use of belief functions is made in [Denoeux, 1995] for classification and by [Hüllermeier, 2000] for case-based reasoning. However, these frameworks do not consider possible interactions before combining, probably because this issue is less crucial in the contexts they consider than in spatial reasoning.

We think a number of paths can now be followed that would show the relevance of this work for spatial representation and reasoning. First of all, we now need to focus on the intrinsic characteristics of spatial properties that may influence the parameters we have considered here. Persistence is certainly dependent on more factors than mere distance (for instance, rain is influenced by terrain morphology), and it would be useful to isolate which kinds of information could be combined with our framework. The second orientation we have only sketched here is related to spatial decision problems. If we are interested in identifying the extension of a spatial property (let's say the presence of oil in the ground for the sake of argument), it would be useful to take into account information about the possible shape (convex or not) or the possibly bounded size of the observed fluent, as it will influence the location of an interesting (i.e., informative) new observation.

    Acknowledgements: we thank the referees for givingus helpful comments and relevant references.

References

[Bloch, 2000] I. Bloch. Spatial representation of spatial relationship knowledge. In Proceedings of KR-2000, pages 247-260, 2000.

[Cohn and Gotts, 1996] A. G. Cohn and N. M. Gotts. Representing spatial vagueness: A mereological approach. In Proceedings of KR-96, pages 230-241, 1996.

[Dean and Kanazawa, 1989] T. Dean and K. Kanazawa. Persistence and probabilistic projection. IEEE Trans. on Systems, Man and Cybernetics, 19(3):574-585, 1989.

[Dempster, 1967] A. P. Dempster. Upper and lower probabilities induced by a multivalued mapping. Annals of Mathematical Statistics, 38:325-339, 1967.

[Denoeux, 1995] T. Denoeux. A k-nearest neighbour classification rule based on Dempster-Shafer theory. IEEE Trans. on Systems, Man and Cybernetics, 25:804-813, 1995.

[Hüllermeier, 2000] E. Hüllermeier. Similarity-based inference as evidential reasoning. In Proceedings of the 14th European Conference on Artificial Intelligence (ECAI 2000), pages 55-59, 2000.

[Iyengar and Elfes, 1991] S. S. Iyengar and A. Elfes. Autonomous Mobile Robots. IEEE Computer Society Press, 1991.

[Kortenkamp et al., 1997] D. Kortenkamp, R. P. Bonasso, and R. Murphy. Artificial Intelligence based mobile robots. MIT Press, 1997.

[Randell et al., 1992] D. Randell, Z. Cui, and A. Cohn. A spatial logic based on regions and connection. In Proceedings of the 3rd Conference on Principles of Knowledge Representation and Reasoning (KR-92). Morgan Kaufmann, 1992.

[Shafer, 1976] G. Shafer. A mathematical theory of evidence. Princeton University Press, 1976.

[Smets and Kennes, 1994] P. Smets and R. Kennes. The transferable belief model. Artificial Intelligence, 66:191-234, 1994.

[Strat, 1994] T. Strat. Decision analysis using belief functions. In R. Yager and J. Kacprzyk, editors, Advances in the Dempster-Shafer theory of evidence. Wiley, 1994.

[Würbel et al., 2000] E. Würbel, R. Jeansoulin, and O. Papini. Revision: an application in the framework of GIS. In Proceedings of KR-2000, pages 505-515, 2000.
