representation and matching pictorial structuresgrauman/courses/spring2007/...control theory...

26
67 IEEE TRANSACTIONS ON COMPUTERS, VOL. c-22, NO. 1, JANUARY 1973 [4] K. D. Senne and R. S. Bucy, "Digital realization of optimal discrete-time nonlinear estimators," in Proc. 4th Annu. Princeton Conf. Syst. Sci., 1970. [5] R. S. Bucy and K. D. Senne, "Digital synthesis of non-linear filters," Automatica, vol. 7, pp. 287-298, 1971. [6] E. C. Tacker and T. D. Linton, "Digital and hybrid simulation of a discrete-time optimal nonlinear filter," in Proc. 4th Hawaii Int. Conf. Syst. Sci., 1971, pp. 465-467. [7] -, "Digital and hybrid simulation of a Bayes-optimal non- linear filter," Studies in Digital Automata, Louisiana State Univ., Baton Rouge, Air Force Office of Scientific Research Contract F-44620-68-C-0021, Tech. Rep. LSU-T-TR-40, Sept. 1970. [8] R. E. Mortensen, "Mathematical problems of modeling sto- chastic nonlinear dynamic systems," J. Statist. Phys., vol. 1, no. 2, 1969. [9] C. W. Helstrom, "Markov processes and applications," in Com- munication Theory, A. V. Balakrishnan, Ed. New York: McGraw-Hill, 1968. [10] R. S. Bucy, M. J. Merritt, and D. S. Miller, "Hybrid computer synthesis of optimal discrete nonlinear dynamic systems," in Proc. 2nd Symp. Nonlinear Estimation Theory and Its Applica- tions (San Diego, Calif.), 1971. [11] ,"Hybrid synthesis of the optimal discrete nonlinear filter," Stochastics, vol. 1, Jan. 1973. [12] D. S. Miller, Ph.D. dissertation, Dep. Elec. Eng., Univ. South- ern California, Los Angeles, 1972. Edgar C. Tacker (S'59-M'64) was born in Savannah, Tenn., on September 26, 1935. He received the B.S. degree (with distinction) in electrical engineering from the University of Oklahoma, Norman, in 1960, the M.S. degree in electrical engineering from New York University, New York, N. Y., in 1962, and the Ph.D. degree from the University of Florida, Gainesville, in 1964. IN 01!31,11~5~Q~N, ~0 His industrial experience includes two :1_ i 5 years as a Systems Engineer with Bell Tele- 8 phone Laboratories, Inc. He is presently an Associate Professor in the Departments of Electrical and Chemical Engineering, Loui- siana State University, Baton Rouge. He has developed graduate courses in applied func- tional analysis, systems science, and digital and hybrid computation at LSU and Michigan State University, East Lansing. His current research interests are in the areas of stochastic control theory and multilevel system theory, with emphasis on com- putational aspects. He has applied these theories to problems involv- ing the control of chemical processes and interconnected electrical energy systems. Dr. Tacker is a member of Tau Beta Pi, Pi Mu Epsilon, and Eta Kappa Nu. Thomas D. Linton was born in Frost, Ohio, on November 9, 1944. He received the B.S. i degree in electrical engineering from the Uni- versity of Arkansas, Fayetteville, in 1967, and i% the M.S. degree in systems science from Michigan State University, East Lansing, in1969. He is presently working toward the Ph.D. degree in the Department of Electrical Engi- neering, Louisiana State University, Baton Rouge. Mr. Linton is a member of Eta Kappa Nii and Phi Kappa Phi. The Representation and Matching of Pictorial Structures MARTIN A. FISCHLER AND ROBERT A. ELSCHLAGER Abstract-The primary problem dealt with in this paper is the following. Given some description of a visual object, find that object in an actual photograph. Part of the solution to this problem is the specification of a descriptive scheme, and a metric on which to base the decision of "goodness" of matching or detection. We offer a combined descriptive scheme and decision metric which is general, intuitively satisfying, and which has led to promis- ing experimental results. We also present an algorithm which takes the above descriptions, together with a matrix representing the in- tensities of the actual photograph, and then finds the described object in the matrix. The algorithm uses a procedure similar to dynamic programming in order to cut down on the vast amount of computation otherwise necessary. One desirable feature of the approach is its generality. A new programming system does not need to be written for every new description; instead, one just specifies descriptions in terms of a certain set of primitives and parameters. There ate many areas of application: scene analysis and descrip- tion, map matching for navigation and guidance, optical tracking, Manuscript received November 30, 1971; revised Mav 22, 1972, and August 21, 1972. The authors are with the Lockheed Palo Alto Research Labora- tory, Lockheed Missiles & Space Company, Inc., Paln Alto, Calif. 94304. stereo compilation, and image change detection. In fact, the ability to describe, match, and register scenes is basic for almost any image processing task. Index Terms-Dynamic programming, heuristic optimization, picture description, picture matching, picture processing, represen- tation. INTRODUCTION T llHE PRIMARY PROBLEM dealt with in this T paper is the following. Given some description of a visual object, find that object in an actual photo- graph. The object might be simple, such as a line, or complicated, such as an ocean wave, and the description can be linguistic, pictorial, procedural, etc. The actual photograph will be called the "sensed scene," a two- dimensional array of gray-level values, while the object being sought is called the "reference." This ability to find a reference in a sensed scene, or, equivalently, to match or register the images of two scenes, is basic for almost any image processing task. Application to such areas as scene analysis and descrip- tion, map matching for navigation and guidance, optical

Upload: others

Post on 07-Feb-2021

6 views

Category:

Documents


0 download

TRANSCRIPT

  • 67IEEE TRANSACTIONS ON COMPUTERS, VOL. c-22, NO. 1, JANUARY 1973

    [4] K. D. Senne and R. S. Bucy, "Digital realization of optimaldiscrete-time nonlinear estimators," in Proc. 4th Annu. PrincetonConf. Syst. Sci., 1970.

    [5] R. S. Bucy and K. D. Senne, "Digital synthesis of non-linearfilters," Automatica, vol. 7, pp. 287-298, 1971.

    [6] E. C. Tacker and T. D. Linton, "Digital and hybrid simulationof a discrete-time optimal nonlinear filter," in Proc. 4th HawaiiInt. Conf. Syst. Sci., 1971, pp. 465-467.

    [7] -, "Digital and hybrid simulation of a Bayes-optimal non-linear filter," Studies in Digital Automata, Louisiana StateUniv., Baton Rouge, Air Force Office of Scientific ResearchContract F-44620-68-C-0021, Tech. Rep. LSU-T-TR-40, Sept.1970.

    [8] R. E. Mortensen, "Mathematical problems of modeling sto-chastic nonlinear dynamic systems," J. Statist. Phys., vol. 1, no.2, 1969.

    [9] C. W. Helstrom, "Markov processes and applications," in Com-munication Theory, A. V. Balakrishnan, Ed. New York:McGraw-Hill, 1968.

    [10] R. S. Bucy, M. J. Merritt, and D. S. Miller, "Hybrid computersynthesis of optimal discrete nonlinear dynamic systems," inProc. 2nd Symp. Nonlinear Estimation Theory and Its Applica-tions (San Diego, Calif.), 1971.

    [11] ,"Hybrid synthesis of the optimal discrete nonlinear filter,"Stochastics, vol. 1, Jan. 1973.

    [12] D. S. Miller, Ph.D. dissertation, Dep. Elec. Eng., Univ. South-ern California, Los Angeles, 1972.

    Edgar C. Tacker (S'59-M'64) was born in Savannah, Tenn., onSeptember 26, 1935. He received the B.S. degree (with distinction) inelectrical engineering from the University of Oklahoma, Norman, in1960, the M.S. degree in electrical engineering from New YorkUniversity, New York, N. Y., in 1962, and the Ph.D. degree fromthe University of Florida, Gainesville, in 1964.

    IN 01!31,11~5~Q~N, ~0 His industrial experience includes two:1_ i 5years as a Systems Engineer with Bell Tele-

    8 phone Laboratories, Inc. He is presently anAssociate Professor in the Departments ofElectrical and Chemical Engineering, Loui-siana State University, Baton Rouge. He hasdeveloped graduate courses in applied func-tional analysis, systems science, and digitaland hybrid computation at LSU and MichiganState University, East Lansing. His currentresearch interests are in the areas of stochastic

    control theory and multilevel system theory, with emphasis on com-putational aspects. He has applied these theories to problems involv-ing the control of chemical processes and interconnected electricalenergy systems.

    Dr. Tacker is a member of Tau Beta Pi, Pi Mu Epsilon, and EtaKappa Nu.

    Thomas D. Linton was born in Frost, Ohio,on November 9, 1944. He received the B.S.

    i degree in electrical engineering from the Uni-versity of Arkansas, Fayetteville, in 1967, andi% the M.S. degree in systems science fromMichigan State University, East Lansing,in1969.He is presently working toward the Ph.D.

    degree in the Department of Electrical Engi-neering, Louisiana State University, BatonRouge.

    Mr. Linton is a member of Eta Kappa Nii and Phi Kappa Phi.

    The Representation and Matching of Pictorial StructuresMARTIN A. FISCHLER AND ROBERT A. ELSCHLAGER

    Abstract-The primary problem dealt with in this paper is thefollowing. Given some description of a visual object, find that objectin an actual photograph. Part of the solution to this problem is thespecification of a descriptive scheme, and a metric on which to basethe decision of "goodness" of matching or detection.We offer a combined descriptive scheme and decision metric

    which is general, intuitively satisfying, and which has led to promis-ing experimental results. We also present an algorithm which takesthe above descriptions, together with a matrix representing the in-tensities of the actual photograph, and then finds the describedobject in the matrix. The algorithm uses a procedure similar todynamic programming in order to cut down on the vast amount ofcomputation otherwise necessary.

    One desirable feature of the approach is its generality. A newprogramming system does not need to be written for every newdescription; instead, one just specifies descriptions in terms of acertain set of primitives and parameters.

    There ate many areas of application: scene analysis and descrip-tion, map matching for navigation and guidance, optical tracking,

    Manuscript received November 30, 1971; revised Mav 22, 1972,and August 21, 1972.

    The authors are with the Lockheed Palo Alto Research Labora-tory, Lockheed Missiles & Space Company, Inc., Paln Alto, Calif.94304.

    stereo compilation, and image change detection. In fact, the abilityto describe, match, and register scenes is basic for almost anyimage processing task.

    Index Terms-Dynamic programming, heuristic optimization,picture description, picture matching, picture processing, represen-tation.

    INTRODUCTIONTllHE PRIMARY PROBLEM dealt with in thisT paper is the following. Given some description of

    a visual object, find that object in an actual photo-graph. The object might be simple, such as a line, orcomplicated, such as an ocean wave, and the descriptioncan be linguistic, pictorial, procedural, etc. The actualphotograph will be called the "sensed scene," a two-dimensional array of gray-level values, while the objectbeing sought is called the "reference."

    This ability to find a reference in a sensed scene, or,equivalently, to match or register the images of twoscenes, is basic for almost any image processing task.Application to such areas as scene analysis and descrip-tion, map matching for navigation and guidance, optical

  • 1EEE TRANSACTIONS ON COMPUTERS, JANUARY 1973

    tracking, stereo compilation, and image change detec-tion is direct and obvious.There are two basic approaches to solving the image-

    matching problem as defined above.If we possess a precise description of the noise and dis-

    tortion process which defines the mapping between thereference and its image in the sensed scene, we can em-ploy statistical decision theory techniques to derive animage-matching procedure which optimizes some objec-tive criterion (e.g., minimum error, or minimum risk,in determining the best embedding of the reference inthe sensed scene). A typical outcome of such an analysisis the use of a correlation-like matching procedure. How-ever, in most practical problem situations, the requirednoise and distortion model is not available, nor is itfeasible to attempt to construct one. For example, wemight consider all human faces to be perturbed versionsof some single ideal or reference face. However, to at-tempt to define completely a valid noise and distortionmodel for this situation would be a hopeless task.A second and more general approach to the image-

    matching problem bypasses the need for a noise and dis-tortion model by accepting an embedding metric with-out requiring its justification as being equivalent to aminimum error (or minimum risk) procedure. In fact,without a noise and distortion model, there is no the-oretically valid way to derive or predict the error per-formance of a selected procedure prior to its actualapplication. Our primary concern in this paper is withthis latter case.We offer the following two sets of criteria for an em-

    bedding metric not theoretically derived on the basisof error performance. First, it must be successful inapplication (i.e., its observed error performance mustbe acceptable), intuitively satisfying (so we can havesome confidence in its ability to deal with as yet untriedapplications), and general enough so that it can be em-ployed over a wide range of problems without significantmodification. Second, it must be possible to specify acomputationally feasible decision algorithm which se-lects a suitable embedding based on the given em-bedding metric. (The combination of embedding metricand corresponding decision algorithm will be called anembedding model.) In the remainder of this paper, wewill present an embedding metric and correspondingdecision algorithm, present examples of the applicationof this embedding model to a number of distinct prob-lem areas, and show how this embedding model is rele-vant to the problem of scene representation as well as toimage matching.

    AN EMBEDDING METRIC

    To introduce the generic form of our embeddingmetric, and establish its intuitive validity, let us firstconsider the following process. Assume that the refer-ence is an image on a transparent rubber sheet. Wemove this sheet over the sensed image and, at each pos-sible placement, we pull or push on the rubber sheet to

    get the best possible alignment between the referenceimage on the sheet and the underlying sensed image.We evaluate each such embedding both by how good acorrespondence we were able to obtain and by how muchpushing and pulling we had to exert to obtain it.

    Let us now consider a discrete version of the aboveprocess which is both more precise and more reasonablefrom an implementation standpoint. In a specific appli-cation, we might have some information on the rangeof permissible distortions that can occur between thereference and sensed images. For instance, some subsetof the items appearing in the sensed image might alwaysretain their internal shape even though their relativepositions might be subject to change with respect totheir locations in the reference scene. Further, wherechange of relative position is possible, we might be ableto bound the extent of such change; and, finally, wemight like to assign variable "costs" to the differenttypes of change of relative position or relative change insome nongeometric attribute.To achieve these capabilities, we replace the rubber

    sheet by a reference image which is composed of a num-ber of rigid pieces (components) held together by"springs." A rigid piece of the reference image can be assmall as a single resolution cell, or as large as the entirereference image, and corresponds to a single coherententity in the reference image. The springs joining therigid pieces serve both to constrain relative movementand to measure the "cost" of the movement by howmuch they are "stretched." (Typically, the springs willbe highly nonlinear in their behavior.) In determiningthe cost of an embedding, we measure the "tension" oneach spring (the tension can be a function of directionas well as stretch or even a relative change in somelocally defined attribute), and also make a local evalua-tion of how well each coherent piece is embedded as anindependent entity.The above model permits two interesting dichotomies.

    The first dichotomy is the separation of "syntactic" and"semantic" information. The semantic information,which is application dependent, is embodied in thespecific partitioning of the reference into coherentpieces, the placement and cost functions assigned to thesprings, and the cost functions associated with the inde-pendent embedding of the coherent pieces. The syn-tactic information, which is relatively independent ofthe particular application, defines the class of descrip-tions which the algorithm can process. These data areembodied in the limits set on reference decomposition(e.g., number and maximum size of pieces, etc.); in theformats which must be employed to specify the globalconstraints and costs; and in the form of the embeddingmetric which evaluates "global" fit. The separation ofsemantic and syntactic information is essential to per-mit application of the model to a broad range of prob-lem areas without the necessity of making significantchanges in the implementation.The second dichotomy is the separation between the

    68

  • FISCHLER AND ELSCHLAGER: PICTORIAL STRUCTURES

    local and global evaluation functions. The global evalu-ation function, associated with the relative positioningof the coherent pieces as described previously, hasstrong syntactic controls on its form to permit its inte-gration directly into the decision algorithm. This is im-portant because the global evaluation produces the mostsevere combinatorial problems. A local evaluation func-tion, associated with how well a given coherent piece isindependently embedded, is easily changed from prob-lem to problem (based on problem-dependent considera-tions) without requiring any change in the core algo-rithms. Thus, the form of a local evaluation functioncan be a (conventional) correlation function togetherwith a pictorial reference component, or a procedurebased on linguistic concepts together with a formaldescription of a reference component,' or even a seriesof guesses in'serted interactively by a human evaluator.The decoupling of the local evaluation functions fromthe core algorithms provides a great deal of flexibilityin' making changes or improvements in the evaluationfunctions for a given problem, as well as when switchingfrom problem to problem. Further, because of the aboveseparation, the performance of the algorithms (bothlocal and global) can be independently evaluated in adirect and intuitively obvious manner. Such an evalua-tion then permits iterative improvement in performanceby selective alteration in the problem-dependent options.We are now in a position to present formally the pro-

    posed embedding metric. Let the reference be composedof p components (i.e., p coherent, or primitive, pieces).For 1

  • IEEE TRANSACTIONS ON COMPUTERS, JANUARY 1973

    global (intercomponent) constraints, and, if acceptable,then evaluate the embedding metric for the given selec-tion. Obviously, such an approach is completely imprac-tical even for small pictures. For example, a 50X 30 SM(M = 1500) and a 5 X 5 RM (N = 25) would require morethan 1054 selections and evaluations, a hopelessly largenumber. If we assume that the coherent and relationalconstraints provide us with P = 6 nonoverlapping com-ponents, each component sequentially constrained tostay within w = 10 locations referenced to the locationof the previously placed component, then we would stillbe required to perform on the order of 1500 X 105= 1.5 X 108 evaluations. Assuming 10-3 s per evaluation,we would require 1.5 X 105 s or approximately two daysof computation time. It is thus obvious that a moreeffective technique is required.

    Dynamic ProgrammingExpressed in formal terms, the evaluation of the em-

    bedding metric for a typical picture results in a non-linear, integer programming problem with local optimadifferent from the global optimum (i.e., no particularregularity, such as unimodality). The only availableclass of computational procedures for finding the globaloptimum under the above conditions (other than the ex-haustive search techniques discussed earlier) is usuallydesignated by the generic name dynamic programming.DP is a multistage or iterative optimization proce-

    dure which can be described in general terms as follows(see [6] and [7]). We wish to find

    min G(X) = E h(Xi) (3)X iEI

    where X= {X1, X2, , xp4, each xi, 1

  • FISCHLER AND ELSCHLAGER: PICTORIAL STRUCTURES

    table forfk and yk*. The number of lookups required forthe construction of this kth table will be proportional tothe number4 of its rows- [AMD(G)j] and (for each row)the constrained number of feasible locations of the-vari-able being eliminated (denoted Wk, whereWk

  • IEEE TRANSACTIONS ON COMPUTERS, JANUARY 1973

    responding locations of the embedded components aredetermined from Y [see (20)].7

    Given the restriction that the hi in (3) are independentof all xj forj > i [this is consistent with the embeddingmetric as presented in (2) ], then dynamic programmingcan be viewed as a procedure for finding the shortestpath through a graph. We define this graph in the fol-lowing way. The nodes of the graph are arranged in pcolumns, where each node in the ith column is labeledby the values of the variables corresponding to a uniqueset of embeddings for the first i components; there areas many nodes in the ith column as there are uniqueembeddings of the first i components. The length of abranch between a node in column i -1 (with label Xi1)and a node in column i (with label Xi) is hi(Xi). Wedelete branches with infinite length.To determine the shortest path from the root node

    (a single node placed in column zero) to some node incolumn p, we can proceed as follows. In the ith column,for each node, sum all the branch lengths correspondingto the label of the node. We now determine that set ofvariables (xi) which do not appear in any hj for j> i,and call this set of variables Zi (the variables in Zi cor-respond to components in columns ji). We now place the nodes in the ith column intosets, such that, for each set, the nodes are identical intheir labels except for the variables in Zi. For each suchset, we retain the node with the shortest path lengthfrom the root node, and delete all the other nodes in theset as well as those nodes in columnsj>i which branchfrom deleted nodes. It is this pruning process whichgives DP its computational advantage over completeenumeration. After the above set of operations is carriedout through the Pth column, we select the node definingthe shortest path through the tree, and thus the lowestcost embedding for the P components. The LEA differsfrom DP in that in the sequential determination of theshortest path, a maximum of only mi nodes will be re-tained in the ith column (mi is the number of permissiblelocations for embedding the ith component). In pro-cessing the ith column, the nodes are grouped into setssuch that, for each set, the nodes are identical in theirlabels for xi. For each such set we then proceed as in theDP case. Thus, at the ith iteration we save only the mi"best" current embeddings, such that every possible

    7The yi defined in (16) clarify the presentation of the LEA. How-ever, in a computer implementation one need Inot calculate these yi,which are arravs, each element of which is a sequence of locations.Rather, it is only necessary to compute a more restricted funiction Zidefined by

    Zi (Xi) = Xi-lwhere xi- is that value which minimizes the previous equiation

    These zi are arrays, each element of which is a single locatioiirather than a sequence of locations. Then in the compuitations in-volving the Si, if Si depends on more than the two rightmost loca-tions of y_ (xi-,)*xi, the additional locations may be retrieved fromthe Zk, 1

  • FISCHLER AND ELSCHLAGER: PICTORIAL STRUCTURES

    YSM

    44 5 2 8

    3 7 5 1 3

    2 8 1 5 7

    1 4 3 2 4

    1 2 3 4 -_z

    C1 (1, 1)

    c2 (1,1)

    C3 (1, 1)

    C4 (1,1)

    4Q-L C2

    2 3

    = CI =36C2 4

    =C3 =5=C4 =3

    I(z Y) = I SM(z,Y) Cifor I i - 4

    Spring definition when (i, j) = (2, 1) or (i,j) - (4,3)

    Xi - Xj =(Zi - Zj J _ yj) gi (xi- X)1,0 02,0 1

    otherwise

    Spring definition when (i,j) - (4, 1) or (i,j) = (3,2)

    xi-=xj- (zi- zi Yi Yj) g.i(Xi-x.)0,1 00,2 1

    otherwise

    (b)

    Evaluation of g2

    x2 x1 61 s2z2 Y2 z1Y1 I1 12 g21 9224 14 1 2 0 3

    3 4 1 4 1 4 1 6

    2 4 4 4 0

    4 4 2 4 4 4 1

    3 4 2 4 0 6

    2 3 1 3 1 1 0 2

    3 3 1:3 1 3 1

    2 3 1 3 0 4

    4 3 2 3 1 1 1 3

    3 3 5 1 0

    22 1 2 2 3 0 5

    3 2 1 2 2 1 1 4

    2 2 5 1 0

    4 2 2 2 5 3 1

    32 1 3 0 4

    Evaluation of 93

    x3 x2 x s3

    z3 33 z2 Y2 zlyl 13 g32 g2 g32 3 2 4 1 4 0 0 3 3

    3 3 3 4 1 4 4 0 6 10

    4 3 4 4 3 4 2 0 6 8

    2 2 2 3 1 3 4 0 2 6

    2 4 4 1 3

    3 2 3 3 2 3 0 0 4 4

    3 4 0 1 6

    j4 2 4 3 2 3 2 0 3 5

    4 4 2 1 6

    2 1 2 2 2 0 5

    2 3 1 3 2 1 2 5

    3 1 3 2 1 2 3 0 4 7

    3 3 3 1 4

    4 1 4 2 3 2 1 0 4 5

    4 3 1 1 3

    Evaluation of g4 = G

    x4 23 xl 94Z4y4 Z y33 1 Y 6g43 g41 S4 14 g3 G

    1 3 2 3 1 4 0 0 0 4 3 7

    3 3 1 4 1 0 1 4 10 15

    2 3 3 3 1 4 0 2

    4 3 3 4 1 2

    3 3 4 3 3 4 0 0 0 2 8 10

    1 2 2 2 1 3 0 0 0 5 6 11

    3 2 2 3 1 5

    2 2 3 2 2 3 0 0 0 2 4 6

    4 2 2 3 1 0 1 2 5 8

    3 2 4 2 2 3 0 2

    1 1 2 1 1 3 0 1 1 1 5 7

    3 1 1 2 1 0 1 1 7 9

    21 31 12 0 0

    41 32 1 0

    31' 4 1 3 2 0 0 0 1 5 6

    (c)

    Fig. 1. An example illustrating the operation of the linear embedding algorithm. The definitions of x, gij, I, are given on pagesz and y are the components of x; that is, x = (z, y). (a) The sensed image. (b) The reference description. (c) Linear embedding algorithm.

    73

    x= (z Y)

    1,42,43,44,41,32,33,34,31,22,23,24,21,12,13,14,1

    SM (Zty)

    5288751381574324

    (a)

    -[ (2, 3), (3, 3), (3, 2), (2, 2)]

    -I (3, 2), (4, 2), (4, 1)(3, 1)]

  • IEEE TRANSACTIONS ON COMPUTERS, JANUARY 1973

    ponents, or perhaps by someone guessing at what a"good" embedding might actually look like.) Now, em-ploying the LEA (or DP) to place and evaluate thecumulative cost of placement of the components se-quentially, we can eliminate from further considerationany of those placements whose costs exceed the boundestablished by our best trial embedding. It should benoted that this technique is valid only if the cost asso-ciated with the embedding of each component is non-negative. However, this can almost always be the casefor the class of problems we are discussing in this paper.A heuristic embellishment of the above branch-and-

    bound technique would be to use some fraction (say[k/N]') of the bound at the kth stage of an N-stageprocess as the threshold for eliminating a possible se-quence of embeddings.

    Scale and Rotation (S&R) ConsiderationsIn attempting to match or register two images, we

    frequently are faced with the problem of unknown rela-tive scale and orientation. While such variations areconceptually indistinct fromn any of a host of unwantedvariations between the reference and the image, they(S&R) can serve as a vehicle for clarifying some im-portant issues pertaining to the way the LEA is em-ployed.As noted in earlier sections, the embedding process is

    carried out at two levels. First, the components of thereference are searched for as independent entities. Theparticular processes by which these searches are exe-cuted are not a direct issue of concern here; the im-portant point is that, regardless of the search mecha-nism, the outcome of the search for any individual com-ponent is presented to the LEA by a tabulation calledthe local evaluation array [L(EV)A]. Each entry in theL(EV)A corresponds to a possible embedding in theimage of the associated reference component, indexed bythe variables used to define the embedding. The entryconsists of a number related to the probability that the-component is actually present at the "location" specifiedby the indexing variables, and each entry can also con-tain the values of attributes of the component as mea-sured at the indexed location. The LEA has no knowl-edge of the component beyond what is presented to itin the L(EV)A for that component. The purpose of theLEA is to integrate global or structural knowledge withthe information provided in the L(EV)A's to find thebest overall embedding (or embeddings) of the referencein the image. The acceptability of the final embeddingselected by the LEA will thus be dependent on tlhe qual-ity of the information presented in the L(EV)A's,where the extent of this dependence is related to therelative importance of local (component definition) ver-sus global (intercomponent) information for the particu-lar problem. Thus, it is the responsibility of the localevaluation function, in attempting to gather evidenceabout the presence of some given reference component,to be able to deal with the various noise and distortion

    Fig. 2. Reference description of a squiare.

    processes (such as S&R) which might be encountered.To the extent that these same noise and distortion pro-cesses affect the global or structural relationships be-tween the reference components, the LEA provides themachinery necessary to deal with the resulting problems.

    Ability to deal with variations at the global level isaccomplished by defining "attributes" which measure(or estimate) these variations, and then making the"spring" parameters functions of these attributes.Thus, in the case of S&R, if a component Pi has scale

    and rotation attributes Si and 01, the springs (vectors)attached to P1 would be (conceptually) scaled as a func-tion of S1, and rotated as a function of 01. The followingexample illustrates some of the above comments.

    Problem

    Given a two-dimensional region in which there are krandomly oriented and positioned line segments, findthe four-line segments which best approximate a square.Each line is specified by a four-tuple of the type (x, y,0, 1) where the x, y coordinates locate the center of thesegment, 0 specifies the orientation, and I the length ofthe segment. To simplify this example, we will ignorethe detection problem and assume that the given valuesfor each segment are known with probability one. Weassign a cost Ci for each unit of positional disparity be-tween the sides of a candidate square, and a cost C2 foreach degree of rotational disparity between the sides(i.e., sides should meet at right angles).

    Solution

    Consider a single "local evaluation array" consistingof the given list of four-tuples (x, y, 0, 1). We can con-sider x, y to be the "location" indexing variables, and0, 1 to be attributes. All entries have unit probability,entries with zero probability are deleted. The "descrip-tion" of a square is shown schematically in Fig. 2. Each(spring) vector is rigidly attached to its line segment ata fixed angle of 45°. Costs (C1) associated with (x, y) dis-parity between the end of a vector and the center of thenext line segment are circularly symmetric (i.e., the"cost" function increases with distance from the tip ofthe vector) about the tip of the vector. We also assess acost (C2) proportional to the difference in measured

    74

  • FISCHLER AND ELSCHLAGER: PICTORIAL STRUCTURES

    (attribute value) orientation of more or less than 900for sequential line segments.The general form of the LEA, with orientation ad-

    justed springs, and spring costs augmented by attribute(S&R) differences, is adequate to deal with the prob-lem as posed. A minor difficulty arises from the fact thatthe orientation of a line segment is actually two valued(i.e., 0 and 0+1800). If we list each line segment twicein the local evaluation array, once for each of its twoorientations, then the LEA can be applied withoutmodification.We can handle squares of differing size by using the

    line-segment-length attribute in the same manner as theorientation attribute (except that double entries are notrequired here). Spring stretching is augmented by acost proportional to the difference in length attributefor sequential line segments. That is, when the LEAexamines a new line segment as a possible additionalside for a square already partially formed, it comparesthe length of the new line segment with the length of theline segment in the partial square to which the new seg-ment will be attached. The spring between the new andthe old line segments then is stretched (over and aboveany stretching due to angular and positional disparity)by an amount proportional to the difference in theselengths.

    In the above example note that, because each entryin the local evaluation array had either zero (oo cost)or unit (0 cost) probability associated with its occur-rence, the size of this array could be reduced to listingonly those few coordinate combinations associated withthe feasible (nonzero probability) occurrence of a com-ponent (line segment). This procedure can be used inother situations where it is reasonable to reduce all lowprobability entries in the local evaluation array to zero.

    PICTORIAL REPRESENTATION

    A central problem in much of the work concernedwith the computer processing of pictorial data is that ofrepresentation. Since we cannot manipulate the realworld object (itself) within the computer, we attempt toconstruct a representation (or model) which can be usedin place of the actual object and which has the following(somewhat overlapping) properties.

    Complete: Any question of interest which could beresolved by reference to the actual object should also becapable of being resolved by reference to the represen-tation.

    Compact: The representation should be free of infor-mation redundant to the purposes for which it will beused. This is necessary to minimize computer storagerequirements.

    Transformable: Much of the information contained ina representation will be implicit rather than explicit inform. The ability to manipulate easily the representa-tion to extract required information is essential. Forexample, if we represent a picture by an intensity matrixor raster, then a count of the number of isolated objects

    appearing in the picture would be implicit informationwhich could be extracted from the representation afterconsiderable processing. However, if the representationconsisted of the contours of the object appearing in thepicture, then the required count could be obtainedrather simply.

    Incrementally Changeable: If we observe a slightchange in the real world object, it should be a relativelysimple and straightforward task to alter the representa-tion. Further, from the standpoint of image matching,a small change in the real world should require only asmall change in the representation.

    Accuracy and Simplicity of Translation: Given a realworld object, it should be relatively simple to derive anaccurate representation of the object.Over the past ten years or so, much of the work con-

    cerned with pictorial representation has been restrictedto the domain of line type drawings, and the use of for-mal linguistic methods (see [8 ]- [10 ]). Very little successhas been achieved in attempts to extend this work toscenes of terrains, cloud covers, human faces, etc.,which can only be described meaningfully in terms ofpicture components which are not line elements, butwhich are regions with colors, textures, shadings, etc.8

    Perhaps the most serious failing of the linguistic (andsimilar) techniques occurs with respect to the "transla-tion" property. These techniques build a representationby constructing a hierarchy based on picture primitives,assembled into linear expressions employing specifiedrelational forms, and satisfying a set of syntax rules. Theproblem arises from the fact that (usually) the onlydirect correspondence between the actual object and itsrepresentation occurs at the level of the primitive ele-ments (typically points, intensities, and lines), while inpractice there are pieces of a picture that are too in-volved or complicated to describe in terms of theseprimitives. Theoretically, the description is possiblesince the matrix representing the picture is finite. How-ever, such a description would be so complicated that,aside from the difficulty of composing it, there would bea considerable likelihood of error and inaccurate repre-sentation.

    In the previous portions of this paper we have pre-sented a representational scheme for pictures, and wereprimarily concerned with its application to imagematching. We will now show that the representationalscheme has wide general applicability, and avoids manyof the problems of the linguistic approaches. First wenote that it is a hybrid type of representation in that itinvokes symbolic (numerical) elements as well as allow-ing actual picture segments to be part of the representa-tion. The ability to intermix picture segments and sym-bolic data in the same representation greatly simplifies

    8 Specialized systems have been developed, highly tailored tospecific problems, which are exceptions to this assertion; e.g., seeKelly [11]. A number of papers, including Bledsoe [15], [161 and

    * Goldstein and Harmon [17], effectively consider the problem of faceidentification based on feature measurements obtained manually.

    75

  • IEEE TRANSACTIONS ON COMPUTERS, JANUARY 1973

    the translation problem. Where a pictorial concept isdifficult to describe symbolically, we can use an actualpiece of the picture as part of the description. A secondaspect of our representation that simplifies the transla-tion problem is the fact that the components (picturepieces, local evaluation arrays, etc.) and the relationalforms (springs) are two-dimensional rather than one-dimensional entities. We thus avoid the problem of hav-ing to construct a one-dimensional model for a two-dimensional structure.

    Let us now examine some of the other representationalattributes. Incremental changeability follows directlyfrom ease of translation (although the inverse relation-ship would not necessarily hold). Transformability withrespect to image matching has certainly been estab-lished; the fact that the representation is already intwo-dimensional form, with metric, geometrical, andtopological relationships explicitly and quantitativelyexpressed, implies that transformability for many otherapplications is more than adequate.

    In many respects, compactness and transformabilityare antithetical since data in explicit form are usuallymore extensive than the equivalent implicit information.This is the case with our representation. It requires con-siderably more storage than might be required for alinguistic representation.The one area where the linguistic approach has an

    obvious advantage is with respect to completeness. Alinguistic representation can treat nonpictorial informa-tion (e.g., relations between items in a picture and otheritems not visible, but perhaps implied or normallyassociated with the pictured items) in a way that wouldbe extremely difficult in our representation. This addi-tional capability could, of course, be achieved by ap-pending the necessary linguistic machinery to our cur-rent scheme, although the final result might well be amixture of two representations rather than one inte-grated representation.

    EXPERIMENTAL RESULTS

    In order to evaluate the practical implications of thetechniques presented in this paper, we have initiated aprogram involving experiments on a variety of line typedrawings and gray-level imagery. Over 400 experimentshave already been performed with the following generalresults.

    1) On well-defined imagery (i.e., relatively noise-freeand unblurred -pictures), the embeddings produced bythe LEA were almost always in agreement with the bestembedding as predetermined by human evaluators.Where the few deviations did occur, they were reason-able, and usually related to the crude component de-scriptions employed.

    2) On noisy imagery (course resolution, additiverandom and coherent noise), the fall in performanceparalleled the difficulty human evaluators had in locat-

    ing suitable embeddings. Where the components werediscernible in the image, the embeddings were usuallycorrect; those components which were significantlyaltered by the noise were sometimes missed, but thesubstitution error was usually in close proximity to thecorrect embedding location, and, even in error, thecorrect embedding almost always had a score close tothe best score.

    Since the programmed version of the algorithm is stillevolving, and some of the discussed features have notyet been implemented (e.g., the "attribute" feature isnot operational as yet), most of the experiments wereinformal in nature. However, for the purposes of thispaper, two sets of controlled experiments were run andare described below.

    Image-Matching Experiments Using FacesThe majority of the experiments we have run to date

    had human faces as their subject. Reasons for this selec-tion include the following.

    1) The availability of a set of digitized gray-scalepictures containing faces.

    2) A single reference (face) could be tested on all thefaces in the data set. In the case of, say, terrain pictures,a separate reference (or, at best, a unique compositionof standard reference components) is necessary for eachpicture.

    3) Our familiarity with faces and their components(eyes, nose, mouth, etc.) facilitates evaluation of per-formance as noise and distortion are introduced.The data set used in the face experiments consisted of

    15 human faces,9 both men (some with beards) andwomen, digitized to approximately 16 true gray levels,and each face typically was contained in a picture fieldof from 2000 to 3000 resolution elements. Using a refer-ence as shown schematically in Fig. 3(a), with com,ponents as described in Fig. 3(b) and (c), almost 300formal and informal experiments were performed. Ineach experiment, additive (truncated) Gaussian randomnoise with zero-mean and standard deviation of either 0,10, or 15 units was added to each resolution element(relative to a pseudogray scale of 64 units for the noise-free pictures). In some of the informal experiments,coherent noise consisting of randomly placed lines wasalso inserted [see Fig. 4(a)].With no more than two or three exceptions, when the

    reference was restricted to hair, eyes, and sides of face,correct embedding was achieved. These results aregratifying in view of the simple component descriptionsemployed, and the equivocation displayed by the re-sulting L(EV)A's. (See examples shown in Fig. 4.)Two series of formal experiments were run on the

    I These data were obtained from the Staiiford Artificial Intelli-gence Laboratory and are a subset of the data employed in the ex-periments described by Kelly [11].

    76

  • FISCHLER AND ELSCHLAGER: PICTORIAL STRUCTURES

    LEFTEDGE

    A1P1I

    77

    V~7$J~O~I RIGHT

    NOSE EDGE

    MOUTH

    (a)

    VALUE(X)=(E+F+G+H)-(A+B+C+D)

    Note: VALUE(X) is the value assigned to theL(EV)A corresponding to the location Xas a function of the intensities of locationsA through H in the sensed scene.

    (b)

    K K2=CONSTANTSa=(C+D+E+F)/4p=(A+B+G+H+I+J)/6

    p-(X+F)IF [X

  • IEEE TRANSACTIONS ON COMPUTERS, JANUARY 1973

    1l34567800123456b890li34567890123456789.,

    234 4XMeeAl5 Zvoy"9U34X_6 a-119-z8flfaOsx-

    7 1%5i2[mMCW*UIZ3A8 Z7iZ- = IZXXXWIO

    c so@x + --Zos FLc A33Z7 +A333)11 134 ZI7=-- -+AOUUS=T2 X' --- - *Ass -13 '36AZ 1+.= -+XEUKRX14 MUA7ZI- *744 _8_A15 XISXZ XX1= -.++ielaus+f16 )BSXM"MIlX -A44XXZMUUU*17 -X14tXAM-+4Mf4AZ 1M3US18 XIE)XAI AT) Z1T?AI19 XfX=-I77ZZ +* 1+++=la 120 134) +1 )- - +7X+)Zi +YAL+ Zi ++ M IT22 )AXl= IAZX) -Z7)+-23 ZXZ).Xl ZI- -1Z=24 + AY771TT--+Z)25 MXZ XMA XXZI += 1 +26 AAXZXI)++Il))X-

    28 A@IAXI )1IZZZ1Z=2ZS; =Amfif4XX1 IIXXL1 X_

    3 1 - Z A;A X X X Z =+ 132 1M -I IXXZ 1I) ) ++ 1+=33 4.XWM 1iI71)+ + + = +X IX)34 +1XAM8=XX===++--= -= -AAX.+=ZX1+++35 I

    1234567890123456 78Q3 12345678CC12 34567803

    Original picture.

    12345676 90 12 345o79l012345678 901234567890

    2 ZZZI7 Z2ZZ177X )==4++.. lZl71 l)})ll)111111113 ZZ)1)X17 1+.+1- - )1 1 714Z21+ +)1111Z4 2711) 1Z2)Z1X1 )XEf3e'A' -==+I11111)llllllllZ5 771 )1 1)IZI)14*O93R6Z )=11ZZZZ11zz71111Zf: Z2XZZZ1) 1 I++XWiEll8!iYRO42 +)1 1 Z ZZZZ I ZZIZ7 7ZlZA1AZ )1.+?A334343tILZI 7 I1)1111ZZP Z)I Z1114Xf1)I+Z-Tf'.MM3VY =1127IXAZ1II1; Zll 11 l}ZMlX) 1++)11 }+++X@!X1+)11XZIZZ112

    10 Z71)l3110Zf))l--)111++)1§YX-))1Z2ZZZZl1 ZI7Z11 Z4.3ZII)1+1 1 )ld M=Z*lfAXl ZZZ))Z1177

    12 Z7Z7ZZZX3MXZ4M++1+ I 1 I7Z9MX)I I IZ L+=+ )Z13 ZZl 17@ xwXIBX+-l)?II )APMXXXX1)++) I ZZ1 4 Z 1.1 I ) Z-P LPZ1 1EMA==1) A19P.HMXX1I4.+++MIZZ1 5 ZZI 1Z AAAMAXXX IMM +)- A&JugMeA)1+=+M4.1 1Z16 ZXZ7XXX9878XM^9iX'1+MEX4I.EX+++A3SFli17 Z1 177Z t7 3A 3IRSG X3PG3R9AA14Z-=AlER) 7Z ZZ18 Z))1)ZIX9147XMMXX 114BAA7XZMA1++M.P7)+ZI7lZ19 ZIZ1)11X?+1XXMXl43:4-lXX3*MM33m)4.4.7lZXZ2C 714XzX1X3Z+=+11++4)el.zt1 IMS@l1t)ZXlIXXZ21 ZA@1MM1PAA1X1IXA1)1)M14++l.M3UA+4)111ZlZX22 ZlXAZ1X)4.7x6+ZAX1='m43IXUXMA+-)XZII)1723 Z)77)1l1))ZXM1AA111=tIi)A+PAEm4++1Z)1)724 ZlZl) EIieXAXXt4MW17AOIB11=+i))ZlllZZIZ25 17ZZZ7))11XXiM1XM)tXMEt1!14I)+)S3i)+XX47ZAZ

    27 ZZZ7X11ZM7MZXAAX) Z7ZZAZZMX)+)1M1+ IXXIZ28 Z7XZ)11 Z7ZZ7AfAZZX7Z 1XMXXAXZ7 )l 4+M41 )1111)Z29 ZZXXZI)) 11 XlZAAX414XXXAZXA1 +-9lX111Z30 ZAZI,1)+117XX4tMA6AAAM1 1.ZZXXXlZZI 111)))11231 ZZ11) )+ 1l1)XX4MI'A11IXA)111)AX++41Z111117Z3Z LL1l1 1 1=A4llt3AZ17)Z) l)14.Ag4I=+X+i rT)733 Z=++Z7AZ17) lZZZ+=IlZZI+)ZZZm1)+)))1)+ZZ34 l1X1fmA)A6A1l1XX1X)YXl1A@Al1Z9gXXM48X1ZZZ35 Z 3E33333339zU535333*1337 7

    123456789012 3456 789012 3456789012 3456789V)4. 4..= - I=

    2 e =-- 1+ )-Z + - += --+J =1 - + - 43 + -.7 Z _- 1+ -Z+_- -_) 4+ + +4 1 ) 1+ +ABP3A 1 - ==5 1 1 ) ASM8R1E571-=Z 1 - +6 -4.) - =+ =P&BSPVSEeIA+ - =4++*I4. =7 ) 11) = -3'"IZA8*3333Z= 1 1 -)8 = 334)1 4. M4314.3 -AX - XI-9 = += + 1-143s. s =+

    1C == A3 z- 1 ==++An -+- =11 4. I 3F41.+1 I -I 8==6fA 14. 4. 412 +-+ I)X3341=8 - - = *316,8SmZ- =+13 I Z4ASE"=U*- I -31361=X - +14 = ZMNA+ 1-se Z 338ij38X +*515 1 = ZZ71*AZAA 33 I - 363A3- 3 =16 =X 1+Z+XBAZAl46Ml+A6AX)IIA8F - 33 =17 = 4+ Z =*fHM361 ZA0§U2 1MYA8f oil +4118 = + Z M"A+XAX+X 33)iZ)ZA+U)A *(V=--=19 ++7 =)?M 1)61A1-Il A+AF3Z343 zZ AAA Z4+ XEX4 +++ 1-N§A+8Z.89jU3- + A -z-21 =AAZAA+.A7X +M) ) 3)-ZZ7O+ --= )-2Z -=ZZ AAAAA+ ='1XI 3)1 Of 3 - 4.)-2 3 Z 4 1t46A81 1)AA13M 31 z24 == Z 1WAIA1AAAAA+333) + C) 1 - 1 425 + Z 1 +I11733)X MZ)364AA -3) Z + 11Z6 zz Y1 AZ+XIg+ fg+=ZAAA -33 -27 Z 7 IfSv+%ZA1 1Z)AA+l1M +128 7Z + 1-F4)7ZZl+ A.I)XXZ 33=29 Z)) =#SA ) )M AX14-Al+X +- 3+1-34 LZ 1 X+AGARZZXMZX7XI 1 I - -31 1Z =+ 7X+XAAX)++IXA - --32 Z I XA64)+= XIFWM7Z=+4 +++4 311 +433 XXM+7)l+XlZ) AO=1Z= +H10111 + = +34 = 7MAIX=MX) +111)1 Zt - 1MIX=1+M'X35 314XKVVE33fI1ffI3U§3XX3389AXkA*1m3I7MXM)7I

    12345678901 ?345678Q012345678901234567890

    Noisy picture (sensed scene) as used in experiment.

    HAIR WAS LOCATED AT (6, 18)L/EDGE WAS LOCATED AT (18, 10)R/EDGE WAS LOCATED AT (18, 25)L/EYE WAS LOCATED AT (17,13)R/EYE WAS LOCATED AT (17, 21)NOSE WAS LOCATtD AT (22,18)MOUTH WAS LOCATED AT (24,17)

    123456 7°9'l23456 7890 12 34567 89 C12 3456 78q0

    L(EV)A for eye. (Density at a point is proportional toprobability that an eye is present at that location.)

    (a)

    Fig. 4. Examples of image-matching experiments using faces. (a) Successful embedding under coherent nioise.

    78

  • FISCHLER AND ELSCHLAGER: PICTORIAL STRUCTURES

    12345678901234567890 123456789012345678901

    3-4) I lZAMAI+65__ _ _m_ if8mZ-6 s+A6ffISlSOO@SSG 17 366 I{f3E66ISII§@7 -"e""e*[email protected] AOI@8866eUSS640531X

    10 _ v3U6G9SMAA8G6I66Z11 _ __ XKSfiIIItl}+-__+11166.__ .-__12 XflftfE+- +met68+13 AIIIE X)- 1M6410-14 AGOM +XI+15 .11Z8 + 1116 8IMAMPMX+ =))++Si6X17 -XG8MXI XAI 3 ZXLxIS+Is 8 } SSXAMMZXZ =IAXI=+Zii+iS 3M6ZlZl=Xl =I)1A20 A81=+==Z) IX21 4-61- 4Z4 Zt22 -0Z) +4- Al23 6X81 )Z- =M+24 MA1= -ZZ== 6e25 *XZ++..+r- ZA26 )32111)+= 1t27 4+X11X2++=4+ -+28 +XZ8113= ++29 lAXZ3l)'=1+) +30 lAMAZI)=) 331 +XMPAZ1+-+)1=32 =43XAMA211134 4+33 += =IZXXZI+-- 134 + +XZ12Z4 lXZ35 1Z362)33+- +AXW-36 11MMAI- 1AM637 511AMX+ XX88-3e =1 zzz) ZZZZ

    1234567890123456789012345678901234567890

    Original picture.

    * e* ee w *

    ssss ww.si{ 06@ 1.*S~@ i*I66M6e@I3 Z346060( 'U Z 5t)IIIIUL(5I4OO0SW U Z10 1CDO I

    4 0* MiNI*@ ii 0@l00

    0167 116.* $6 ii B @ * 661* 11600I@@

    9~ 0600 1M166M#ei#@ B@ R s86 *!k I3*I0i10C10 0 1ti{ i i 61 1 CEf81M111666M01184I0S6001S666

    1~2 - f 56BM 0g 6 10 6B 0-3 0O101i9*ElE6Ol0014 _00010631(46* 116116000*11160069100i

    16 1{{0 0v 16{i! 91017 00!* il@X 4l@

    19 * E*IEH 8l*80M 6088@ESMI 2E(8601 00O2C *6*66M66

  • IEEE TRANSACTIONS ON COMPUTERS, JANUARY 1973

    1234567890123456789012345678901234561 IlltIllI IIZtzIItZZZzIZZZZLZZlZZ2ZLLZLLZZLZZZLZZZAAXLZZZLZZZZLLZL..3 ZZZZZZZZZZZZZZZA§SSMXZLXZZZZXXXXX4 Z ZZZL1ZZL ZZXM36SSSS6AXZXXX1LXXXX5 ZZtZZtZZZZMtSSASSSSSq9XZZZZZZZZZe 1111ZZZ I I1A5I0I?61555365111@2Z11Z2Z7 i1lZZ1itIIeIe8SSSSSUIsSSmLUZZzzzz

    .. Z LIZ IZLZZI2"4{{||@@X Z Z Z7LZU_._9 11)11)11) 1SSISSIS6ISUMAXA%VA

    10t. *ISS6SSE3GMAX29q4+11 *1AV"691MAXXXXZZK81M12 .-1gUmxxxxxX~x1lzmgS9v=13 IIAlXAAAXXAAAAZ XSSS1J . ___.tSx XMM§eAX8(Xt M A I *.__IS +8SXMSSSZZFMSEMZAfM)16 I AMZlIXAX I XMMAI IZOMX.17 .. ... MMII ZIIIXLL1Iz MIIS -A2111Z)IXZZlZX*A+19 xOzzzzz xxxxxzztze20 -ZA4GAX XXL.AMAXXX XAX.21 SBXXXZAMAXAXXBI+22 XIAXMEMM§OMXMSS

    -.-AXXAMMXAA%9X24 AIMAAMMAAMGOSSI25 )W XXAAMSSSL2L _ SSq88etSS,_

    27 M.s sseSSSMM-28 *XASSSI0SSMXJM2S AXXAMESeM.AAXGI30 IMXXXAMMMAXXXK6Z31 -XAMXXAAMAAAAMISXt4A- ____ )XXAOMA AAMM AMMBe§eXXXj pa..33 1XXXXAHSMMMM8M6eMeMXZXXAA1I34 SZXAAXXXAIBMAMMMPMAAMMXZXZXXMM

    12345678901234567890123456?890123456

    Original picture.

    123456789012345678901234567890123456I MPMMMMMMMMMMMMMMMMMM MMMMMMMMMMMMMM

    -2"'MM MMM MMMMMMMMMNMMMMMMMMMMMMMMMMMM3 M4MMMMMMMMMMMMMMMMMMMMMMMMMMMMMMMM4 PPMMMMM4MMMMMMMMMMMMMMMMMMMMMMMMMMM

    -MWMIMMKMMMMMMMMMMMMMMMMMMMMMMP4IPMMM6 MNMMMMMMMMMMMMM4MMMMMMMMMMMMMMMMM7 MMMMMMAX 2 MSISMA SIEMeMZZ XMMMM4MS MMMMMMAMOeMMMMMMMAMISSO@SSOBQMMMM9 MMMMMMMSS$SMA XMIIA ZA8O9MMSSMMMMMM

    10 MMMMMMSS8MAAAMeXXAXAAZ1IZAt1tMMMMMMTT1MgMMWAX

  • FISCHLER AND ELSCHLAGER: PICTORIAL STRUCTURES

    1.

    15

    16~

    213 C- 4n-F tvA A VJ%?08 AXYU3ZS XX X1 2 AXJIAWI5 A

    23 I I IX Z-11+ - LASI+ ZXXXXAA*AX

    22 8~4ALm-AX AAAVXI IIXA1 1)112XAXI It I X4A4ZL"A&

    2,3 AAeAXXXX "XXAAAlXXZZ1 -43261.3ZXX'C1&11i251 X{4'AAAVX IX. 31R(V-X*)AA43319111121Z2

    28AAXXX6AAXVt.XX2I3'14A+A**AXXXXXX4

    32 XXXZ2ZXim+8II"4A1 3124*.AE4 AXXXXA

    34 XXXXASS4 335 XXAB8*Ut )4?MA.AMAXXXXAS)ASUE-OX4?$M~4

    12345618901234567890123456789C123456789

    Original picture.

    1234567h90 l234567e401234567PQ'1I2345t7R90I 1 XXIIXflX72ZlKZZZ111+++1333XA"AZZXXlII2 III XAAX3KXIZXL,17VAXZZIIIZX;4AAX)XXXXM1113 11 SAM. .AAAAXXMXA?A1AXXZZZA'44imAXAXAlll4 111 4.i.1AAAMXXXX 6XLXXXXXX7 XA6AI-AAXXIII

    II !(3t4A14AXZXA6XXAXAXXA%XXAAXAAAMMM6m 11 1l1pomXAXZ%1ZY4Z1f( Ix0 X3AKZ 4zX!A*AIOJ1

    7 111 XAe wXXXAAXAAXXAXXAXX?ZZZAAMl482Z1 18 1 IA M4A MAXZ XA!A#vAA KP0-AX1IZXlK4AA6Z111I

    19) 11IIXXXXAMAP43UN-OIMHROOAXXAA1IXXX1III

    12 111AMA XZA At fSU3anWSeno*#yjf1113 1 I" 1fffffRSLaG901i 4cmfMMA4M, Ak4 I 114 111w.t. A ft.*" 3 AAXXA I NA M U+f3.4MA I I

    16 I11AXA4XAXZ2Z21131*3331IhA4qAAZZ2ZZII17 11 1IXAXXKA X . I+3 I))I + -ZA4P I 0AXXAAII1I16 II1XX)KA1AX2113,?)) 34 =1ZAWfR'4XAAAXA111IC 111AM-M'*3Wi6X11*I ))lZZlXMA'MAXAAM'4,11120 1 gdlYU84 4aAyMAFM P IA21 11164 A3.~~~A43 8A4M8~~~O@~AX6A

    X Xvl1122 111XlAA6XXY''-AAX!lXMOIMAAIXAMAAft41123 1 1ZZIZAZXXZXX IL II izIxI3ZzX1ZAAXA$OMAvI I24 111X2ZX211Z1+14*1 3IlIl3lXXZX'A4AXX4II125 1 1AAVuAM A 1I)3I ==+++.4111IZ31Z XI 1 1 126 1 1l'4vi*am4YtXi II IZ I I... I)ZII Z X IZ I I27 1110SA"A4XlP)7 A AXX6Z I 3+313)1117 IZ X 11t23 1ll2X1Z213 )lAXLs %Z73+)4331112q 1112 1'? 3,)zZ,,XQ,4iiX.7Z 7Zl3I34443.13=1113r m1i13+1i7,4j)jZIzizKXXXX1l3+3+ +111l31 1lIZ3311X63vA461))11X6&P14171134= + il132 1171133 1114-) IXAAAAAZ+41ZIXXZZZxlxAZIZZZI111134 111 3)IZZXZXKZZZX3)1131)ZIIZZZZIXZXAIII35 11131171.433Ij11I111 3*4)ZXAAAAAIIII

    I i3457 C~Q 1 2345il 7;'~1 I234''- -i~r 343= 7 i11ljj IZX1 -1 V !it Y t- # f :#. .3

    2 &PE- zYVERYAE1, r- and1l

    6

    13~

    14~

    160 8§K--3Z4 0 " SI 1 as1 SX=X +

    08 AN8x XE + i3 Ar11)FjV g+X-i8 fSl

    4C ** X31

    i1 Z XAZAAA3SA-"llFA = Z1Xa1= Ai:t- xz inA fXX M Af.Y-1A3Xl31 V*=l3AX1 XX XlA A'+

    21 A7 (A -1 4= )- 4A

    24 ) z PM 2=+ 1,.+ 7

    25 3Il14E*f4X VX'+2l3 =+g6'31 +A XXK26 IBP IB A=f IA Z X = L 11IXX1 X1I1+

    27

    28 1 X 1AII?3Z= MAUVAV*X 1 ) A Z2)1 33I A XX-'2~~M I Z X3-ilUSyUSZ IXk4AA,6 14 X AX+

    31 )Z9PXZZA ML*&S kMIAt,! A+,-'S8K+,Xi=X I1O) )+32 Afg AAPRII8AU )I).S i 8 A YX Cf~~N'-43?3 E2X-St. VAII 1A0~34 1ZIAI

    35 f 118218e Li- x 1 Klj1V+-,X y38lUXwy-* 83O

    1234567A'~012345A7SQ0123456789012345678Q

    Noisy picture (sensed scene) as used in experiment.

    HAIR WAS LOCATED AT (11, 21)

    L/EDGE WAS LOCATED AT (25, 11)

    R/EDGE WAS LOCATED AT (25, 24)L/EYE WAS LOCATED AT (21, 15)

    R/EYE WAS LOCATED AT (21, 21)

    NOSE WAS LOCATED AT (26, 18)

    MOUTH WAS LOCATED AT (29, 17)

    1234567593 123456789012345678901~234567P?0

    L(EV)A for hair. (Density at a point is proportionalto probability that hair is present at that loca-

    tion.)

    (d)

    Fig. 4 (conitinued). (d) Successful embedding under random noise.

    81

  • IEEE TRANSACTIONS ON COMPUTERS, JANUARY 1973

    3 tttt't.'*a.r6o @*'++# 4ez,i4 iJJ 4ij*-IiXtetV.fCC'kC- 4--1 £-=J'4I 1 ,,4140 +s

    S i 6tet:*++oF 6:f'pt- -.0 ±.4Te4'4e1 }V

    7~~~~~~~~~~~~~~~~~~~@**a;+'F4#---HIIIFi ;lp8 4*F-v:t;+ 0f i t -It 1 V

    t**+ffff~~~~~ Of~-;"_rE Jei.a.' 44&EjX;'qi -tit

    r *^^ A1U1 0 t _ , .,3 A i!

    1~ xX-- +=124 f f- Fxl L-_ ,11

    7.7 1 a x*31 II )t 1 1

    i n-V VV4- X 'Y A y| q {- * ~ W 4 S 4 ~ g il X x t. + I Z 1 =V Z 2X.* :X

    23 P;'* g7 =+ + I~E Z I Xy, "A. .

    c>2'': t^^ ': ;+@6y )=+ I y; -vXw} t Y e2 %4'AX )X= 1Y i I I -I fi)2 4 X+25 >9 y X> fil. Z +7k2 f .XX X X X X X Z 1+ ) 7 1 )--+ + X)Lx 111.7 X -TITITI I )II) I + +)7 A X 6XXYxx X XIYI Z Z ) + - li L + + II

    29 AXY YYXyXXXZ IlZ, l. 1Z i lblZiIII 1+ )III-1-`-- XY7-77 7ZI Il I I Z711 1 ) ) )I}) + ~ ++ i+31 YX7?Z1ii l11)=)11111 +++++l)- + iII)I)32 YZZ111111+- I11 1++++) 1 + II1 i i

    +!tt7i11* +lF-4+ l l(134 IZI I X yI35 Z = =I 4;'isK

    IZ 4 67 ^t. 1 2 14^ 7¢C- 2 5". 7 C;t1 2 4^

    Original picttire.

    21 7., tI 4'Sf 7 C Ia45678P 1 2 4 5 6 t3'4

    i~'t, -K-.0';X 'K A- A*. o4-27 P_O"YK v;''"" ':FA ts/Q .; 3.Ati YMiA4 AP1e k4 X4P=i Ft isa

    io9w O''r"f),:i,07? t-*= 4*~ .0-1i\*s&P r8F AiP '4fASE7"Vf ' ' *?4 te ii '-i A5.A 4 X 'AC

    A Ye tIte fZ ..Z A,Kt I0f

    IC "-",Ac,, *i6,@ tvI1 -E,,^f441~~~~~~~~~N k18*1* 4 Yt- Y I Zi*;X.7o . I A td V-1f.-\ 4^

    V1 P tff9wF' -@'F f SIiAtv'A + X aif ;e P ., P-̂~ :

    1 5 4(-X i4r- V43 if 5`,eliB A.J-" AA#9 i R 3AAW* -I36 HPAAX SfOXX X 'd A z+Xtoa* Xi17 F#Y4@ f A'YAfR4*Tv XVIUX8vkeiSWAI A A G-9-?1 8S FfA Xf Eft-i Z!'I§t R$8A{2f i0 Z=C ZA bFfU341I,R 11-Yt; f-3C9! A#k!wl X= AU '-' Z A-928-

    2A F P-;4csJ*Asief E+t*6 C.aF*P 0YA- AA@ 8xsA-221 i"f:*^;E0F{1'~ afW W .i4& if '#'*A6,1- XXj

    24 E-fI-f4'1C1 si:t8=*f it jjX4V-FfiVA I ml*f* a-Ait2 c fZ f;* A A R3iXA k V fFr-CXA A ZIf N X I'"Al K26 f fTFr f-,4 !PA, i#44 P-P vt;V,.

    2 F >:wA'WP- litJP9 tX% f t ve At-7 i P+i4Xi4 ' \H_~ ~~~~

    l

    3C @l*A2-fAsi*Z76-00ei1JO'tA+WkKAAA43

    3n3 6w.i?{';.Z..8' X (AlzktfVAXRjj Aui;;34 E99WINPITAX196 I46'wN-e YPe{ 44Ae6AMXAIP) AR&35 F ioi1Pi1A3i iAAP'llox4il!V.^si^MAz<

    * i 7 7 S4 r,s 4. -.'i1 4 3 4 rF ,7 ° 1 1 2 3 4,r 73 Q;;1 zAS'4w% 'ia8 Fot+Lt e1 v 1i1819 1 a- :9 i 6) i*2 A0=,ER08t ;,, 11":*k i+ 4d*aF- ItW RX A\BC 9#' 6eW

    r *is,_,=s ; z_*6i t~~~~-.--4 A)0-fti -e*

    aas*>^W xsaI ai 6C IF- C8 4Hw gi- i4'1 04+i=1 6,i0 0

    ~~~~~~~~~S : XZ ItiB IfE'A e',9A E4 -XAG8 v1)( ) #M&7 f'X ZE ^ 6A ~ " e0 umeXs*0Zi X t $h*WFI #s

    _ 2 .1 .A . P. .4 a ..L.

    c ;zA oA?.."" E>1 '"tKtZ3I' AtL'A 3*i"

    1 X Ea 2E i' f'f'4 A - ) -A BiKZ}XV 6Z1L 'I1 1 iELtIs21S~s _ 19 - 1 *A H41lX SVi# A' Z

    1.3 f )6Yf* f J Z I A 1 *V6 ) tHAW) X14 VrC-t"fv")AZZ)1 )+ X XL +-+RA-17YAI1 5 +X fU 'V AV+ P ) X Z75b f@§frYK3ZAWAB916 *Af1 XX.* .- 114.Z==+ = A 5b1l*^Z ti17 U 4#G49+*81X8ZA1S"+ IANS VA'81x

    x -4 I if I . X Z............... t A A ...A'-'.. .Z. '~I

    19 A+"*-U 111114A8 ZlZ+ lZb= 'WAL4SX42' lZ?2 '9 >+'4-'l C ,7R 5 (;A (O Z +XIZi=*Glfl A *:*Z,2r *' t ) t )~ 1\ _ ) 1ZI

    2 1 115 AAX XL- IC 1= i= 1 + ) ++ 1. )

    22 S t4Z11AA1,+f ==;g A=++ z=61 =z23 X-FtAXSSlISx.4-1 X,+ ) i1 1 --* +A( !SZl6139'M24 IEM1 X Std XX' I9) XX(.eZ)S+H:VXEII 4X#*? +Ai25 IZ1Z e" AI1-itIP+ A)ZI I.I AZX)++3'-4)X2?6 P'14A)A APX 8AXt *\ Z) (*4+14 +_A1M)27 1i )1*=ve/+ )=f )A+4*) ?.)A Z+1 =0Z A28 X 1 )IgpWO+'#iAfX- Y91 XRI+ XI-A+ IZM29 %L)P'1I+r1XZZlZ1A)1 A+l+= + +a1 13C 6 Z-Z)1K 4+XtL ml CZ eZX= + I 1X=31 1 W f;Y1)= i+ + )ti Z ====3X3 2 Z+L'. Z== J 1 1XZ'l +Z+= + 7ZZ=)33 MXA + I 1'f1'A XX l4Z1A1Xm'M34 A) +4A = 1 IX + M 4Z X=1-61A35 1 + - I _ A )M I m1@@1

    12345o78Q')1234567031234567itOl23456789(

    Noisy picture (sensed scene) as used in experiment.

    HAIR WAS LOCATED AT (8, 20)L/EDGE WAS LOCATED AT (20, 11)R/EDGE WAS LOCATED AT (20, 27)L/EYE WAS LOCATED AT (18,15)R/EYE WAS LOCATED AT (18, 23)NOSE WAS LOCATED AT (23,18)MOUTH WAS LOCATED AT (25,18)

    123456759C1234557P8n12345675C012345A 75)^.

    L(EV)A for L/EDGE. (Density at a point is propor-tional -to probability that L/EDGE is present atthat location.)

    (e)

    Fig. 4 (continued). (e) Successful embedding under random noise.

    82

  • FISCHLER AND ELSCHLAGER: PICTORIAL STRUCTURES

    12 i4-rr 7 3C' 1? i4'r,,;7>lt4.1234546751'12 1456 710i'6hfVVYi;flE4.44f * *.eeswtienortngVI*

    _2 _*IEVWWYfl6Me9*f6Owe*tfIt(t(itj.#,1eiguv.1i H9bUi?iwt* ent*ieifiri?ts e*e.4wx3e4 3flM ti etinX ZIttt ZvtrtftF9 X4t.t et44w5 '8 '1eX)k14t#4'fI 94v17 IVE)FIRSSSI,SSRSm17.fefscXXIf;4*ueXes

    _T )(teu0g8ES sSg s0 w,Ao6e ;- IWtSISSSIhhhAHS1SESIR P3@JiX4i- --"

    IL *&ff*MiHS"'RHuh0AX X =/*UInsmvXAtXX4A'sA13 *fIfEm 1 14) I=,+7vsErXs I-X1Ax An14 t**4:*4"lAA V W XZ Z4 7X * +) I z "ELjv5gjg 44 4.-3

    I2f- 9" 'AA X L XA A YI 4) le I Ii X ILXAXI Ai A XN< S

    16 *i"+IShIelGt:wX.X l)t)+ ++X..1xAIh Zhix7AAHXe;q

    l 7 F'-M K Afo Ai= r . L I1 Xk61 +I 1Z7lllll X 7 Z Z7e

    1EfI'1A5X3IRs813'IX)AZNSSS#t.t #S11mHm'YIX YI1i I15 **4XASISSffiIx V*(4'XVSSXKKKX IK IAM1PI-S14AZfEfE&Sfl.Y XXXfit'SSsZ?6X&-XX)4

    3 2 tffAAXiv. :X il ) Z9EN IF X XZ AAA4

    4: !4 Hai*>ti e1 4 VI i i:" A '1, et.L Ia=1 Y_tM/ZXX24 X X4X FMSOW4A X_f;x. "'A.x 4E1 f 131 0Jt XXX-AX\ Z

    12?4Fl'7i.670

    25

    4

    32X eXXKXXk'AESfl9>+'XAtK17;weez3)1111

    ?4xA X-X80f4XX.:.i .VB8X* XA34 A~~~~X/X~~~~*HhEgr ~ ~ WA

    35 XX(X tf!I9SSUUVit'PAKKKxYKLAW*fESW9*3VSRS.X

    12345eJ739212345o7tc 123450nyk'c 123455rv:

    Original picture.

    A A ,AKWA 7 *'77411'7I t'vx/4At;?2 AX)ALA"Vr4?K t'A i X4"%3 AZX;0AKAi@ZZ* , ?\\wK*--AiA9l)'. Xii

    '. erxu4#SSSetSwtflue**Sv-R ~tH 47e*;s ___9 tl8"'XtLlHSSStBS 'efllflfiS1dSf)ilSS4 el-+Sriz0

    10 S*S6faSfl6kS+.B6&'1eiSASISvle1Sa*ntEtflL,lC SN(MhIHSSteHUSCS4GSWS@-1+=f'.Z±i%S12 4l"St41h5OS~LS: 11 7?XKHfft1l SiAI-C*1 3 USeXweISS*X#sY))XiI1-ZLVAWSESSNSxKzX8N0|SAV14 RISt*f11&1kkvlMZ 1 IY*I+'fSw51+Z1HtU1'15 XS'iE808SI\L1XAZ-+ AtS XSVRASX*N')16 *ISIAssixm 9Si e=)1271*?HSSX+ )1'tHU Z17 U,tR11*SZ: 1=- t+x vLsssrsIrazn,w7ev18 8leM ISE0SS? ZA.+ tEA"C4ZASSR#( 1=+t--&Y X19 4fSAttiS=SPHAahX~AXM?19#1# IE{AII*EF*F4I9) = f.S ^PYIS ^ a 1§11 X*'A,:A= EJ?C AJHN'vX l1I eNSRX' 6t -!'Z-IA4d 16 P 15

  • IEEE TRANSACTIONS ON COMPUTERS, JANUARY 1973

    1234507b'd,1345678se 123456759012 345b73tf

    2345 +11+

    --++= )A649e1 -7 +A9**5q*i,"Y*Z =

    I010 Ii wSf)ii oeZi** -a

    17I1 )A6180OBO8{s3UHR6i8S666.+1 4 ) [email protected]+ __15 6Q16 ZI8,2 Z Z I1 ZXL66s86s417 -VKR4i8 4A+-- +Z ' IARSA

    T1GEx -iAi t1+ - -)U568B0U410Q +i 69 1'X7)1- 8 I71 em2C +QC66*A"MX)I Z )l AB866O_e_

    22 =166'iZK"48s1 *X)[email protected] 'v,6R'*-IZZLZX2Z22-1)6PSA

    25 zSSAl + 1- Z 88#+25 1iSt I 1z= l __"+2 =- - )W+2R AOXI+-1).- += ++29 -AX)+ XA )=--)l3~~~~ ~~~~1YT2T- _ -

    31 +X+)}+ -32 Y5) ++

    34 Ad>X AZ+ I=35 -YtiZ ZZ14= 1)+

    37 =+11)1*=)= = + ) Z Z xZ 1 1)38 ==1)+ - -+4+) IZXZZZZ1)-

    1234567 S01234567 c012345678qOl234567 1)n

    Original picture.

    133457 ri,7 '1-, 71 11.7ZZ7AX 7I. 1..Z 11 1 I I1 .)++ 1)IIi I2 1177 A )1)))) + 1 j)))4 11) +

    4 l)flh/1eXS77 1)1)1_1- += -+711l7^Y)1lt'7 Ii51YZ>7frlT72 Y7)7^;. r-- .-+ 4+# 11_Tf'Ft 7.UTT1T) 1-6 1 51 A?)4+,+-+*t YXJ'-+*- 171111J1Al7 1)JZ?iAxl+!7 =). 1JH *^.?1IX4, !F 1 1 }>+9s4j g s++)-W)AZZLT

    I Xz JZ) I?I? ) I B ti ^' t G 4 !,10 XZ '£ J IIZ x 4' .-'Ai- F--*''s )1

    l' I -lz =n,a1-''-*;4Ll : -'t- .; n-}-.TT 1I Zt^ / *X.Z hi:̂ i i i-i'fi if- Xt,Ytl'; 24 ,1 I Z I ) 4 1 t4- n ?V .J 4 4l1k.4 J.te'.:2)F -I

    Ibt,1 ) + KAL^ ? *lE! f it I *I ) +) ,1w16 1 )'I I $8f +*1 7 )*.ut;HwI +' I1: 1

    311 VP.s7 Ix /1==74 1 T? 177''-' ¼ H331o.)1111- **. j > 11 ++- )l ''"7--s} +

    34 171 7 7 --

    3751TT-i- -TV.'k7.i'4x 7i'.^.'>71 .1'

    ?? I ) ) z I t:q= !', "zha %'Z,R."1 IF ".]= ,,I7 11T) !1.1s8-IR JF= F +_ ii - r."~-3 XX1+ I J I I )y j Z1 )>- iZ v) I I7 f 7"> y fTi.

    12345676og&034'LI -'34 12345678?0123456 '??O1 ()-X, )2= A=-= A- 1 + = 12 +=l I - + _=_= _ =3 I +-+= 2 X=1 = 1= XZ4 = : A 1 1-' I ++1 += Z +Z45 Z7 -- = I 1=A+ 1 ) X-iA= )A I

    7Z +.x = =+=-Z I AN + 7 1.+ A7 + I I) 1 4 Wjatt46S66X Z -X1)-=1 A8 t 7Z +Z+XYYw88i.z*v vw*69"1= I0 +1 -1 + =-zi8I-5Z6hStai0Z = =

    10 Z 1. .+7Xw06+UA6tU':6,64Ul 6*8ii^E -=1 1 * 1 - x*H6OYSZ0H66'6s36RwwI1V%= 112 -XZ1+1 1+ tRsUu6eeezdv+ 113 + 4 D AlX elE6VHM.m860S# I- +14 X) - o'A5a I*68EX?'5I4O6'a*.Zpi,Z +15 4 1 A.iesEeBm+vs0 mg".Zs*1-16 + 11)WE3sssaki*A\x + +Sl*3SSSli =Z17 Z1) - fY*10"'A1 11 )lM M70S,.IAA )1s 1= f£I66Rt'f - lx'11 16l9=19 )t 66RUAi,'X+ + - *ilggitifrl ) X=2C 4 1 +.f-4*lBI1 - 2 =Z X0-5-E 8Z+ All21 +1 ti=oSi-* = 6) 1541A687V*iX+=Z I22 1 *IlleiAS"m>.j ff= X.e9;*+z 123 + - - *10+.7 49"A9Si+) Zfl *SUI!=L-24 -x *=8s0*' + 1 M 3 -X= IhaS5 L X25 L=L86"= 1) +i= I#4661) 1=26 * 8* 1 E'31 111 I A Z1 Z Z -> )Xi=) X-M 1)

    12345,795C12345t7h0C1?2345O7Q9c1234567890

    Noisy picture (sensed scene) as used in experiment.

    HAIR WAS LOCATED AT (13,23)L/EDGE WAS LOCATED AT (25,13)R/EDGE WAS LOCATED AT (25,28)L/EYE WAS LOCATED AT (22,16)R/EYE WAS LOCATED AT (22,23)NOSE WAS LOCATED AT (27,20)MOUTH WAS LOCATED AT (29,19)

    L(EV)A for eye. (Density at a point is proportional toprobability that eye is present at that location.)

    (g)Fig. 4 (continued). (g) Successful embedding under random noise.

    84

  • FISCHLER AND ELSCHLAGER: PICTORIAL STRUCTURES

    l2347CPiq-1234,A7OI, 1234-57Qc'A,' Oi' 7 QP .On1 = +~+=_= , ==_+44= 44+4+ *44 4

    2 z+=+++4 -+e+++,*++*++++#.....+...+...)P)2 ++++++ + +++,++ -++++-+++.-)*. . . . . . . . .4 .+.+....4.) P1111. 1 . P P P P P . . . . . .5 4+4*4pP) IP)) .S'V hA~5X P P P)P) P)IPX) P114

    7 P P P Ati '4i,*@ 1R1 P P P P P P111 1A P P)))Z}1 /* *4*+6*~r') t))))P) IP1P1P9 PPP I l 1 ee*eti fi 7 P ) 1 P )1 Pil

    IC PPP PPAt*ii*ti-iXt PXPIiliLl11I Iz Aiv)IIW+i9=@4++z*> I1!1 1 1 1_12 Z T)l*libi)+-+ZrfP21 1) 1 111 11113 =+-++lif*9XL-AZ17AAZ+14CI11 ).)1MIII14 2*e*)AePxiPX [email protected]===15 -+.M4LXN:-IAA2 Z%'AP) +9 .16 +X9I +1Z1P} =ZXZ 1 +>'*L---.---17 -XF1-44+4+ -++=--KA_+1 8 z VI PI1IMA XZ4-44XX4I Iq 19XXX X AAtAKAL 1 +WZ-20 +4MX.'NA5 AZMMXPP __ A_X t

    222* .I r* _,4 x2 3 = v*aElt*X-2325 43FP0VZ26 +i: i3

    2';28 23SB8Hfl{}XI.t2S ~~Auea5EBae*"__

    '31 +- £PPSPIUPPSBP32 4 Y#8B86R8I'

    34 -yMA A-.iAS35 + 1 1 + +vxz

    12345678901234567?90 12'345676C123456 J?-;3

    Original picture.

    1?34-67; ';.'123~~'123447XZ 5018-A1245X 73A )I AALAXAX1'.A; X] .K'W-MVM 5t. 1A ZY A -1.!,

    3 AAA.^itlS".?.X'"4A;1AAA^l_ ZY7 7A&Ai4 A.A %=X77.' t eA1eAIX A^ )'AX L YLYZ A AtA415 A A f, _4A rIL IV X UMI it i K I I X I I Z X '-O A -M3YX P!.A A,6 A AA A X 7++LXZ 1 7 7 2^) -XAZlt2 JA AXI A A A7 A5' ,'AY IZ ^. I '.:XFll8iFiWV* W _0 Z I I I Y P. I2 X A AA A;P ,^,.AtAZ Y 11 \VOGX,s yi 7;-@s.. 1+ I Z'f1F 32 AAI- ^5 A-AAXIX'Af-* 1l1XV'AwvA4:'1Y~X AX I AiA

    I A 4) I I 42+eL.L 77 ZAP- IS Zl IIIX11AAAA

    1!A4 -% Y. 8e-Yf )ir =2 Z x Y ?' ++ a, A --, Al13 At.t '! I =+ X -XX I A -L ,7Z ,t 'At I I X) X r-X ) -.tAA

    I S A A A A t ' 1.X I X1 t.?!VaP3AX) X2sZ OAA7 tte1+A X7 I + It+t I)+1 1 Y A AV Yft A A

    Ic2 .114A .1 Z X Z A. '-t:i- 1J 1 'V' I V X )_ A X V,'O Xt.A .% Pt?^ Att 14Szx1sZ Y I y J* '6I V16i:*" Z I ? A ,YXA f A...................EA21 4-A I I ! I I 7 Z Y*r:'.,A A ) 1ta I Z 4A XV^A" A4.22 t.:. xz7 Z 7 I A [IfA E, PfAAec -IthA . x7 .1A*^

    7 I . .7 .k tI K7- 7-.......%-' -1 A .A N

    2~~~~~~~~~~~~~~~~~ Y ,X A y> v:4:-Z' I7YYjz A A -I7; A*AA A I Y Zqt 'AJ-.^zYx X I.t>AaA

    28$ A\Â,;KKI1t' i;4'¢-e.8,>8 t X 7 I ZX Y X~ *:X P >A27: izA -I". A Z X,L X ''4ik- ~,;4 zj 7 X X A -A A e ,PA* XtX) i( I A^.7

    .4AAt,^ .-'2 I) I1f 1 1Y.i~* X I ZXh A MA L A AX AA't A^at ' 'K .'I - A. 4sw;4,? I ~+IV X X I t)_' X A A!A .E.

    3Z 4'I:5\)7,.'"e Y 7 X A> a^^3, X4 K*^.,, #K^* *vl ^ tt** 5\

    4 ~ St ^E -Y Y .: s X X . 8 1 )+x x z 1. X-Zs7^i*5

    itz'.7: i4 %; 74 1,c 144 7 ^92 54 t ,7 a9

    L(EV)A for mouth. (Density at a. point is proportionalto probability that.mouth is presetit at that loca-tion.)

    1 s4-. 7 -I , 1?'4 a 7 A Q-'v 2 34 7 u1 2 3 4 5j J7 j1 5t P _-K 4VZ4 -+.P = ) - 1+ Ii 1)2 +1XZ)+Z P=1 = Z I + -PZ A= * *t'1

    A , + K I PP ' +A I PI.A4 P ) =v 1 *Z 1= = P +5 X 2z ) *t- 4-fE1X" 2 X 1=ZI +A 1 146 Z=-l.l- l P16R, A 17-0-- -ZAXX +1- PI7 Z-=-) A0XOU S*W-@wWSE@'P'U* P AlX 4-X 9A X -x P'4-EewefflemelzP-ZS) OP S9 1 Z A Ij 4U=9S=ROSaj19 ) Z13 - 1 PfHUIti.AAveB=.v x i=A * A11 P -eA-I:o1g6XQ1SIB ) X + +I*5P* 1 4 81 P0=..12 i1.AEi FtBXlsz= +-+. V'4110P) 11v) Z13 + X+Af XA9B4= OP) SOAIPI Li-E 'I14 1 P. -*'7 hh'. * 5- YX(1C 44 PtIA#+*e- LZLvAI-*vS A q16 All)XINA X)# .'1?U *VS-X Z =)I P17 P + z=1 )P==+-1+ 7 z = A 4 _Pd - = XZ-+ X )P44) I + A= A + K19 X + )XI = A7 Z - PS =+20 I PS PZ9I1 XI I9 I_______21 i6e+ t* '8P 641A1t -. 1-22 1P -tfigs'^.44 1 + 4+23 * - AIfU-'-e-i-o * PZ 1 p))24 + - P4 P'0SH'I-Pv'xv*lO Z =?5 14P-4OPK=E,I9I1.S 'M.1 ) -

    -.?F z -11 4IQtIRE4ij.*ySitliH 1 PA =2=L .._.._*7 -. /IgFt8i'tlSl+ei+j X)=XK i- IPj| P /Jei8R5fleYsaI-p x =

    2q 1 -t 1I .sRE*8pEi=.-pp-) - -3C -- +*. e£H-Eef X * z A31 = I P 'O6t6R =x1= 1 * Z3 1 1 -X 4s; +4-i-34 P= AP.¶.AP&'A x P*P ==Z A35 2 ] *4)-z 7 =X-+ = 2

    I Z k4 56 7- c l 2 5 70 -;) I 214'67 4OI2 34557j%(

    Noisy picture (sensed scene) as used in experiment.

    HAIR WAS LOCATED AT (8, 19)L/EDGE WAS LOCATED AT (16,9)R/EDGE WAS LOCATED AT (16, 23)L/EYE WAS LOCATED AT (14,12)R/EYE WAS LOCATED AT (14,18)NOSE WAS LOCATED AT (18, 16)MOUTH WAS LOCATED AT (21,16)

    (h)Fig. 4 (continued). (h) Successful embedding under random noise.

    85

  • IEEE TRANSACTIONS ON COMPUTERS, JANUARY 1973

    1234567,V1' 1?34';67$i9. 12 34517$30CIt -44 44';+; }4-eba in Q eEr a t e a c

    2 *4D jf *44*If~

    9*'}I I IV 0f 4f f E^eeb4 -' 44441§i4 i*41S (+i+-+.- te -FFlf- {,9f 4 t 0_,zFEv i,,u # If

    *%- 8H-.HS86 QAt8-88f9 f- -;Vlff+E-.fFfkjffElSE 888jReSHEfigfSea8 f

    ZAF" s E R8X ' ti^ s

    S HA1Y Ii'l A'±4n12*94'^ t'@ l@Bfia|9 i0 ZiVAzII + Ig a$BRfs R a: " A41 3 41 v. Y Z I +tIRE8a .A A AA14 (WAXAVURSR YXl )s--+ =411|0fl RaX X .AA

    l

    I1 54XX 448ftXX71 += I. (RISRA8 X XXX AA' M16 M XXAA8101.Y 1 7 )- +A SxIAXAAA17 PA;4XXAUfJg4X'hAL+=-4)/Zi14HHH77XXYXXXXAA

    --1--AXXXXXAR4Z888, XI XY K A119 R AK X,4.Elff ie~k;2* It & IV'tF iMA74I I ZXXXXX X2C AYXXAw'A f.AAXtA 1X IXX 1 11* AAZIXXZXXX7 T r C 1z7 F & x -,x1i k Y 'Al iXXXfX-X22 MXXXX*XI 1))X= 1I +X5PX?Z772Z7XX23 3 A XX X'i* X 1 4IAL 171- X '7 I1111I IZZZZXX

    - 74A-' YWZX'/-XX t' A 18 1 j716--; S2 9I AY )I I I I'I 1ZX X25 A X XZ I X lA X7X0-MAX1XAXXX 4 ++ 11Z X

    26 AX2 X LZXIiL P vX XA VMA X AL.X Z II?v1--=I + +) I Z X X7 AXYXA xL Z L r XFDI--= =iWT)TX-x

    2E XXZiZ2IZIAXXX(XXZZ/ZZZY)===+++))1IZZ29 77ZZ ZZZ Z 9V. AAXLZ ZKX X 'A +)1 I Izz3-37l7ZZZZ'7Z4- 127.AAJA-A+.-1)) II ZZZZZZ31 ZZt 71Z7Z PAE,q4AZ Xm4U'AX?- I 1 ZXKXXXXX A32 ZZZZZX AX+AA7'`5X4AAAAXX 7 I -I ?MM¶4#-3 i 1LLZLXA41 X X XYWXTZZ - -1+Z1Xq34 XXXZIXXZ MXXAAIZZZI1IZ35 %IMXXI+ 4AZ1I 1I) +2- -I

    123456789q123456786O 1234567BQO123456-7Vc

    Original picture.

    7

    A-

    2 P'NPJ AIIlV.j 74G A4+ Y'3 "94 .911: Yz X ,* 'o R

    4 MA8t 8 t> 7 d. 7 , { 8t < 1 ASX tY

    73 -fWJ-5*t\ , .>+fi|j ++

    i':P 'I3X i: I A X

    12 M vX YV+ _+ !A!

    7I v'@'A mm- m) I !A- .-ft,K t `- :3-y{FZ"' .'- AY .'1

    14 MuBMv IX7*f-2 sm ZlS:A6 t. X z Z I Z t_41 7 if N4 t,;( -'lA .

    ~ ~ ~ ~ ~ ~

    1 7 I+1A-,I I I7Y. 7 4Ij 1 '

    ~~~7~~' AaAk ,

    W'" A A I MIRIP- }IJt" Z K >-f^"igdt ^ *I)W ,5A <

    1 Av

    _ X!,Q Z Y "AAX A| -i7 it Vf'Z t #0 + NI^X- 'K

    L I W' -+ x

    I M MA *

    1" 3 17Y 7 Y ~ '. ||"-1' | IL * f 1 I .1YA . AIZL( V) fo nose.)((I Densty a a point is proportin

    o pr' oa' bA t y Ytt no seis p esn1 at tha loti on2 ) 5 x + + 1I Z I I I

    ?21 MV9wh-M-X'7 i 7. w 1X.;l .'"YI I IX^271 '-tAl I 30'

    24 9'fl'JE"*;3t9,* >̂, =+ ZZ -C tvP+X f-71i; 0

    -2?, 1t'J 74}SiN ? A x"-#VYlk?7X.S1t4'3 I Po v V it so- t 91~ 'P it 1 71X A I |

    3 -J(Pt§-^9 ie"i YYY Y :, Z wiVA Vt.'7 xA'

    b SYv p.L FCX A + X ct:¢-xL. s I ,,!"!, '. ,Iv..! E Y' 8-3r1J;Y + + _ - ) l .!771 Y -5 '4 '- + I I I I Vy t'Z v-J'

    i

    1 2 4 5iTpC 2 4 12 -! 7j) 1 2 :4F '7a,1t'5 h,7 7 }

    L(EV)A for nose. (Density at a point is proportionalto probability that nose is present at that loca-tion.)

    19734,j7- "',, 2m7 -!13- a r12 j *i rI f:1 o aY-sEl I Y YAl II -1 'l1 aR IEl- A-4At,#2 UX4 7/1V t716Zfr F -e UIz s( =4 F +R '^AUE Tj@ 1 ERiAi0 ABAlUiYS#lii AS4 0 Atift*41)1X /fE19 1 4f )'AV5~~~~~~~~~18884' X.8 P- -kW.*' 1-1a a^6 v5Ht BASE '*9 8"E HRi Z A*IIX MJ6A x98H"b67 XA P, x Y HZ8ê '

    13 'fAXHUZ8-'8Hs 1k;-'5-

    i 13Rsa1 I 1Xiiy 441 6 4I1f%lRHR8B/1 + X h0'J71X* X

    17t 1 I I tf A+ 8EE f A fJ 'A

    3 f,XR" ig) +s'7AwSAO-A.AE 1t A I XI21c 94- X)s o+ 17~ x 116")? /I1 2 lk A-itfXZ'I 11Z WXSVIVUIX 8H1 37:&' I I 9 f 1%+8XI I=F i xZ +* XI I7-

    24Fl Z d ZA AA4 xZ a- + X Z25 fM *i^ " 1t AA + ' X I -X+112 + 7) A k a I, I+wR Ab 1 7 )4-.2 7i4: 1X+ A Z 4 : 12 ='Zj/ 4Zi4fj Ajllj 4 Z + ) Y

    A 1_ K Z7 A I 'Ai, I...A x--14) ) t77 A=I Z.

    3 1 Z x 1 iAv AX ZYZ 1X 1AA )L is I I6.71 41*-'-3?2 L X X71 , I I y.t

  • FISCHLER AND ELSCHLAGER: PICTORIAL STRUCTURES

    1~~~~~~~~'2344?i?(i?35h7°r!2 45 7Hq 1__234'_5h '7,iq

    2

    7 -HI~~~4beHllPe 0I4e414 + A4 i NeILRJ v I

    5 -5P4RRflI29S99iI.

    10 _A9ffsEReSs0*+1 gA7 B3

    12 ;e980e8HR 71111 1sw8Sma13 f5SC9m*Hb5z1+++IlzE§RvK-14 4~?I117UU

    19 +038649 11 1) IZAf4Ax%R8+17 __ 416LF-t-~l49*

    1~ .X ASf 7 1 lX AX 7I A

    201 39 X X X Z A Z 77 1+ I5II A'A

    l20 1S FS0XX-X 7171) I +1X iAS =-

    22 1 il,XI Z l+1++* + I#1-23 4,fX1 IS Z1 +--) 11X -1-24 (fZLi95Z ti625 +tEVAXYXAXZlI)1+1I A+26 XWIMHAPX1ZXZI1XI__

    21t 1Ridb£tXZ7Xe'}f0+

    28 +%S;!'13mL17I17X5x129

  • 88 IEEE TRANSACTIONS ON COMPUTERS, JANUARY 1973

    THIS PICTUJDE HAD lAD HlDS ADDO300 COCOIDNI 'HID PICTUREE DAD 160 DtVwS AND £30 CO)LU4NS

    ItIt.

    la

    3D

    SN

    6'\'( }~'vregetation 145 S % D % ._

    Vegetation 3\s

    'Al

    DIO

    10~ ~ ~ ~ 7/

    I5 I *

    'U2

    1.K.114

    ID

    I i14 Se~Seshre___ .. -

    (a)Fig. S. Example of. image-matching experiment using a terraini scenie. (a) Reference for terrain.

  • FISCHLER AND ELSCHLAGER: PiCTORIAL STRUCTURES 89

    IH000 PICYOIP HAS 100 0901 AND 000 CON.100NS lots PUctuOPF HSo I&O 0000S 0N0 200 CUL0.0400

    02

    30

    41..

    ......:)veegettation1

    40Si.

    Is~~~~~~~~~~~~~~~~~~~~~

    102

    904

    I0I00

    li.tl/ e~tt~f

    121~1.

    140

    144

    Fig. 5 (continued). (b) Embedding of reference in sensed image for terrain.

  • IEEE TRANSACTIONS ON COMPUTERS, JANUARY 1973

    3834

    AI,3049'St

    at-

    To33

    304

    3,

    3r,

    i4.

    143

    14114.

    St.

    392

    319

    -------------------913i334'?,*p33F4~T03'I?1'~730T3T W 33l~'S3~'3t3i3q,99940907333------ ,aOus0Asu I t33e 4133333 I33W4333 IIn..Ij-3.I.004014461114193.33333333343.03333333 4I333It333333333NA334If333934t933*4It3373343.I*

    ....uS03330 3440f33SS4340430~~~34.,3'33430334340333333333A33401333M3333433333333313333')13?~5jij~ji34034M4033b4AUiUU334430S'4A403UO' 9339433U303333S3334934403331 333330334333033 333333.3333.3------------333A3A4 --------31.-------------3393333334 aa343330333133133333333333333433 3333333~MR040999900339033 13.".304 333333 3304336633 09 13333in033333333TT1lIIU

    * -. 333~~~~~~~~------itoww"4" 404 7334333xie -~473333333413333333434'.~~~~~~~~~~~~~~~~~~~~~~~~~~~mItpg4500117-1aorA II?a46333333333333I---3333333333 VA33 3anoless

    333 333333333

    3.4.3..930..33433.,3.3..333m393333330933a343333333333333133311

    3- -. 333333333313333~~~~~~~22#0I3.333 0g483p3S3 i333 40.3~036340333 3403 3U33 13 13333 61 109900334 10 9699333430171i 03033343a 3 4 3 3 3 3 3O393403333.3-339433339034033334 33.3I333t%t"04093 3333I3333I3I33-393I3I3339633 33333330*3333933333333333333-* 333333333033 339043433333333

    3333I34

    3.g33433933333333343O33----------333 333333334ct3g333939393g33390139696096933493Al3333333333333333334I 93341 y- I333333.3334999" s90900199333333333339,00004,000406 39333 333333330334343333333

    333333333

    ,..3333393333333403060 33 ~3334@33 41113g4. 33333'I 33333 3333314111 3314011333933333AI333 A 4"asoose" o~09333339333333333*33316 33443 333333133033303V9333333093 3333I33* 3333333 333 333633333333333 3333343333933S333309933393333~ 3333339333333333333 3333333303939433333333943Al*~9~3

    33340 343 393333304033333VEGETATION3333333*133333333333333333344094333333333333334333334333343403333333.3333 3333333993333333333.3133333334933933333333339333 333930333333333333333333433933AA333333334.34

    I e -ie foppae 33II3333IIast403333I"I333ANP4491VAIIIAT100419,1I op ..ol i s410 "33333433433343363333366406403340q33333333'43343333333333333333333333334333333333~3 333333339334393403333 403344033333334334003 334.3333333

    A 333333044333333333333333333433333333333333933 39333333333333 333 33333333333 3339 3333333A 3339333333443933333333133333333343333333333333333-*33433333333333333333333033333933399* 33333333*333333343393433333434VEGETATION333133333333334333333333A333333333334333433333333333 3333433 333333:V G ''''O 3333333

    333333333333333333333 it ....... APO ...43AA34A333

    3-I0161 "*Of*8333033033339 3 394333333 4333?333333333I3333333433333333333343-3 . ? . IIva 4q4A "-o-0 IIIII.wvqow 33343333333333333393333I333I333-3333-33333333333933933333333 .33 p"337133333333 It- 11"11 333333 3A-M-- 101493333I 0333v3 99996 aaI 3W,ItaA333333333I I333I I33A3 3o3Re33 44333 3 III o oof13rlp3333v333II.....I3.......96 33 3.3333*s* 33004sop4of-

    Ia333433303 -933333343304440333334333333 333333I31433333I33A 3 M337I33333343AI337 I41033V3St33I3333 93933333V3A394033A33A?I3aI3II99I99I33A39333 9 0I333333 I3.I333I33I..................333333433343

    A '403334333333333333033333333333 43 34333A3A333393314333343A33333333A33313A33v333333 Po ""39333 033333433393333I33I 3I333II33344I433I333I34I336I...3.34393..3

    3 793334033333333363403333333343 3834333333349333343 33334433g33933334333339393333393933334033 33333333333334333333333333.............3.3333333.39303OS40313fAg3314334333333333A3333333A3333333333933f9333333333t33333333334333333k333433MAP

    333933333 33333333339333333403333333333334 33333333333333333.333333333333333333333333933333 .333333333333333333333333333433333 333343333

    3 333333333333333343333333333333344333.333333333333333 .336333.4033333~~~~~~~~~~~041009S.A.-Asee W03333333333433330333A 43A 3333933 0033A333A3334333400333 A39It3I 33.3I3I.3I I33-33I3I3I33I3I3I39A3I3I33I3I3I33I3I......33

    3333333333333

    33333333839333333333334333333344003333333 3433 ..34 33333433333333333333333333333 30333433

    £ 3333333333333333333334333333334 383333333433333333343333333333033393333333 3333333333333333333033333333833333333333333 3391333333333

    3 3933334333333333340g333343333333333393436343333933339333333333333333333484333333333333393633333393.,33333333333333333333333333 33333333333333333.33333333333

    3fIf%OVV"A333333393333q34-333333333.i3339330 "O 3 Al4033944 33A3343333333A3393k33I3A33All33Sit3v33A33333 .13333333P33333"333......3 )A&314*3q*43333033403

    m- I GO--mmaAm"M A x 4- -4 I It 441 A.-Vffft I t Iso".33 -&t33333333333o333333433333333343343a,*33333333I.333334333333 333 3333333333394404033933333340333334333433333333333333404I33 333433333333333333v333f3333333333333333 333333333333334 333I3I3I433333-33A3It33la331 .33333 ...333333333..333343333433* 333333433~AAA A 133333333403r43333A33 AA-RV3333433333.0-4-03A333333433333333330Ps'- 1.333333343333333I3veto 344 3333333333440333333334333333333.99.,000*90.4..33 9"433933..33X333333333333of

    33333333 43333333433 0- I .40i0V W A-33A333431333333433333333334333 . 333P3333vel A33 .* #v I 3qqs$33 te33of 4O1 -.4.3wo" A-ON333333334R4 6333333333333. .3333A X 0.33..3? -A.43J,IfI V9* II e-e 98.8 9996'j3 334A03333434AA3A93 3~T3 Ii3A933V3t3 ANA33..O3333433333333390433333434V33-3A43A3333430-coax33v3 I3333I343333333333333 3443333 3333333333 333433333.33.3)34.334333333e333333* 1111"OKA33333333AN34433333S33 VA33333343&O" 93I3S4Iv To& a It3343343333333 333333333333333333333333333333330333 3333 333 w3 4343933 4..... .w33.3333 4333 83333

    33333343333333413f OWNA3333339433333Aw 3333AA9333331113303333033334033t333333333833333333333433A33333333333333a3 33330433333333333 .33.3333333393333333I333I34383T 914433333333 0493334033333 333433404111 333,73394 la33333333-V333V33AI3IVIt001 tT*-13333"t33.443434...0433333330.00 010 "-%-14'4 *-A909'A44OA

    I3333333334343343I3333A3A" 933493333I333V33I3I333A4333333)A333333303333333 333333333333338 33333333333334333339333343334333333333333333a3.3333..63 3.O3.3333333343333333333

    * 33333333333933343333333343393333333343333333333933333333333333333333334333333 333333333333333333433336333 333333333333333333333333333380333333333333331 ..333334333334333333333333403A3I3I3 343333333333333333333333333333333433333333343333333333343333393333-33 33it 33333333333334I3334333I3V33V33 34333333333I3I33a3334333334343I3333333 34333 3333113333333333 I3a33my3I3I3A

    * .33343333333333333333333334333333333334333 333334333333334333333 333338334343433~~~~~~Otl)'211111I3333333393333333333333333333333333334903 3333333043333343333333333333333333333333833 3334333333333333333338333333.3633333333343433 3*33-. 333393 3333333333

    3.3333333.33333333333333333940334433333343333333344333334343333333333 3333333403343343333333333333333333333333333334334 3 .333.334 3433~ 33-331333-33333L39133.6333433393333333333433333.33334333343333 43 333333433433433333333333333333333333333333333433334333333333 3 333~33433..~3~33~ 33.3 333' .33439333.3433333333

    ,.3333333333393333344043333333333933403339333333 3343 33 333403339333833343333343343339333 3I3333393333343333333333333333333334333333333333333343333 ...33....3333 344344333434033333333 4033933333333

    9333333333333343333333339333333333S3333330333333433ASHORE3~33 333333333 333334333343433333333333333333*.43343334334334336343333S3 333333333333333333333VE E35g±.~JLJ3 33333*3339:343333336- izzizz tr Ertittizirzftttzyrfrfr fit tj ~ ~ ~ ~~~~~~~~~ ~ ~ ~ ~ ~ ~~~~~~~~334303333

    33333333333333434343333 Ill335 333333334343343333333333333333433 33334333343.333.3333.333333333333I3330333943333433333333333333333333433339333333~33~ 333133 133 3333333333033333333733333303333333333333 3433 333..333*333333 3333.43334333333433333333333 3433333333St 4333333433433g333433333336333II I43333333433344363333303339333333333333 3333I333333333.3Z33.33.33333333Z333333?33 03334333OA3334333343333333633433333393 3343. 3 33333333333333333334433333333333433 44.33..33.3434334333333333 33S3393

    " 114$$ I t n II ft i? TrtrTIT I zrzztwwrtitI tIrttzrwr z trz I "ZI?trzl I r?" itmmAv3.333 33..mvm33 II I 33 333 3333xxo333040W3RA0833333333339AA&3A333334333333111.41-48TIltit~ ~ ~ ~ ~ ~ ~ ~~~~~~~~~~~~~~~~3733333.33333333333333333339333 3393333"W119333340331. 303333033443303333143433333333i43333333333333393333333333333333333333333333333I34333333333333333l33333 33333333

    *333334033439333303330334~3333~3333333333.3333433333349333433333333333333333 33333333333433393333333433033333333333c33333334333333333333333333333343333333333 Fig.335 (continued).33334333 333333333333334333S3e3n3333ed333image3333333for34994terrain33.333333

    90

  • FISCHLER AND ELSCHLAGER: PICTORIAL STRUCTURES

    posed a rigid template which is correlated at the dif-ferent positions of the sensed scene in order to get thelocal evaluation array. The correlation is performed onlyat the indicated points of the template. Reproductiondifficulties make it hard to see the intensities of theseenclosed points. For "Vegetation 1," all the points areof intensity 40, for "Vegetation 2," 20. For "Urban," thepoints were randomly assigned intensities of 25 and 35,and so on for the other pieces.The terrain-matching experiments represent a con-

    tinuing area of investigation. Our current efforts areprimarily directed toward obtaining a satisfactory setof primitives which can be used as the basis for referencecomponent description. In low-noise experiments, withlimited geometric distortion, the simple descriptions(i.e., ad hoc shape and texture components linked bysprings) produced correct embeddings. Experimentsunder more severe conditions remain to be performed.

    Implementation DetailsIn all of the face and terrain experiments presented

    in this paper, the springs were assigned the values

    0g

    gij(Xj -xi) = qt oo,

    if A < row (xj-xi) < B andC < column (xj- xi) < Dotherwise.

    The values A, B, C, and D were typically set by takinga number of sample pictures and determining the small-est box encompassing the variation in the relative posi-tion between the components i and j. A subroutine waswritten to perform this task and automatically set thederived parameters into the LEA.

    In our current implementation, for a typical 35 X39(face) picture, we require 13 s to compute all theL(EV)A's, and 35 s to execute the LEA (these timesinclude picture input from a disk library, and storageof results on the disk). Most of the programming is inFortran, and the computer is-an IBM /360/40 (the addi-tion instruction on the 360/40 executes in 11.88 ,s).Total core and disk storage in bytes for all arrays isM(4f+2h) and NM(f+2h), respectively, where M, N,f, h are the number of resolution cells in the sensedimage, the number of L(EV)A's, the amount of coreneeded for a floating point number, and the amount ofcore required for a (small) integer. For a typical picturein the face experiments (M=1600, N=8, J=4, h=2),these requirements woi-k out to appr-oximately 32 Kbytes of core, and IOOK bytes of disk storage.

    DISCUSSION

    Many, though by no means all, visual objects can bedescribed by breaking down the object into a number ofmore "primitive parts," and by specifying an allowablerange of spatial relations which these "primitive parts"must satisfy for the object to be present.As an example, suppose we want to describe a frontal

    view of a standing person. This visual object could be

    decomposed into six primitive pieces: a head, two arms,a torso, and two legs. For this visual object to be presentin an actual picture, it is required that these six primi-tives occur (or at least that some significant subset ofthem occurs), and also that they occur within a certainspatial relationship one to the other-that is, the legsshould be next to each other, and below the torso; thetorso should be between and below the tops of the twoarms; and the head should be on top of the torso.

    It may be noticed that in the previous two para-graphs, we implicitly separated the local aspects fromthe global aspects of the description; the local aspectsare the primitive parts of the picture, and the globalaspects are the spatial relations between these parts.While at first glance it does not seem unnatural to makethis separation, in practice there is a frequently encoun-tered difficulty, that is, the feedback between thWe localand the global.To illustrate this difficulty, let us go back to the ex-

    ample of the view of a standing person. Any methodwhich detects torsos on a local level (that is, a methodwhich detects torsos without using any knowledge ofthe positions of nearby arms and legs) might very welldetect several torsos in a picture. In fact, the actual or"true" torso may be one of the weaker of the torsosdetected by the method; it may even happen that thetrue torso is not detected at all. What does determinethe position of the true torso is the position of the truearms, legs, and head. But, unfortunately, the reverse istrue. The positions of, say, the true arms depends onthe position of the true torso. Thus, the possible positionof each piece affects the possible position of each otherpiece, making for a circular type of dependency.

    It seems that whatever the visual object is, wheneverwe try to separate the global and the local, the samecircular dependency occurs. Many times attempts torecognize visual objects described in such a way involvealternating between local and global analysis, heuristics,backup procedures, etc.One conceptual way of avoiding this circularity is to

    evaluate simultaneously a complete interpretation of thepicture; e.g., in the example given above, we would lookat, and evaluate, complete configurations of head, arms,legs, torso, etc. The best complete interpretation couldthen be chosen. This approach, however, requires thatwe make an infeasibly large number of evaluations. Itwas just this computational problem that in the firstplace led to the decomposition approach.The implication of the above discussion is that, in

    general, we cannot hope to decompose the global evalua-tion problem into a number of smaller independent prob-lems, but rather must use something akin to the simul-taneous evaluation, taking advantage of any reductionin total variable interdependency to reduce the requirednumber of such evaluations.

    In this paper, we accomplish this through the follow-ing machinery. First, an embedding metric is presentedwhich sets the framework for evaluating how well any

    91

  • IEEE TRANSACTIONS ON COMPUTERS, JANUARY 1973

    composition of primitive picture pieces (parts of thedecomposed picture) matches the desired compositepicture. Second, a sequential optimization (dynamicprograming-type) algorithm is developed which takesadvantage of the decomposition to reduce drasticallythe computational requirements (our computationalrequirements grow linearly with the size of the picture,rather than exponentially). The contribution of thispaper is the simultaneous offering of the above two com-ponents and their suitability for application to a wideclass of pictorial objects.

    In addition to the image-matching application, whichwas the center of most of the development in this paper,we have also attempted to establish the utility of therepresentational aspects of the embedding metric forgeneral picture description applications.The work presented here is a continuation of the

    investigation described in [1] and [2], where Fischleruses sequential optimization for matching two-dimen-sional scenes, introduces the generic form of the em-bedding metric elaborated on here, and presents the con-cepts of coherent segmentation, arbitrary serialization,and sequential constraints. The relation of the heuristicembedding problem to formal decision theory is alsodiscussed. The only other paper in which sequentialoptimization is applied to a broad class of problems in-volving two-dimensional scenes is where Martelli andMontanari [4] present a metric and matched algorithmfor smoothing pictures. Kovalewsky [5 ] and Montanari[3] have applied dynamic programing to the detectionof (one-dimensional) line-like pictures. Reference [3] isan outstanding paper, in which Montanari providesmuch insight into the characteristics of sequential opti-mization. Both Kovalewsky [5 ] and Montanari [3]comment on the representational aspects associatedwith their optimization procedures. An embedding met-ric conceptually very similar to the one given in [1 J, [2 ],and this paper is discussed in a broad and interestingwork by Bremermann'0 [121 with respect to its poten-tial use in character recognition, speech recognition, andcontrol of effectors (e.g., manipulators).

    ACKNOWLEDGMENT

    The authors wish to thank 0. Firschein and J. Tenen-baum for many constructive suggestions relative to theorganization of this paper.

    REFERENCES[1] M. A. Fischler, "The detection of scene congruence," Lockheed

    Missiles & Space Company, Inc., Palo Alto, Calif., Rep. 6-83-71-2, Jan. 1971.

    [2] M. A. Fischler, "Aspects of the detection of scene congruence," inProc. 2nd Int. Joint Conf. Artificial Intelligence (Advance Paper),Sept. 1971, pp. 88-100.

    [3] U. Montanari, "On the optimal detection of curves in noisypictures," Commun. Ass. Comput. Mach., vol. 14, pp. 335-345,May 1971.

    [4] A. Martelli and U. Montanari, "Optimal smoothing in pictureprocessing," in Proc. IFIP Congr. Amsterdam, The Nether-lands: North-Holland, 1971.

    10 Other relevant publications by Bremerman) incltude [131 and[14].

    [5] V. A. Kovalewsky, "Sequential optimization in pattern recog-nition and pattern description," in Proc. IFIP Congr. Amster-dam, The Netherlands: North-Holland, 1968, pp. 146-151.

    [6] R. Bellman and S. Dreyfus, Applied Dynamic Programming.Princeton, N. J.: Princeton Univ. Press, 1962.

    [7] U. Bertele and F. Brioschi, "A new algorithm for the solutionof the secondary optimization problem in nonserial dynamicprogramming," J. Math. Anal. Appl., vol. 27, no. 3, pp. 565-574,1969.

    [8] 0. Firschein and M. A. Fischler, "Describing and abstractingpictorial structures," Pattern Recognition, vol. 3, pp. 421-444,Nov. 1971.

    [9] A. Rosenfeld, Picture Processing by Computer. New York:Academic, 1969.

    [10] W. F. Miller and A. S. Shaw, "Linguistic methods in pictureprocessing-A survey," in 1968 Fall Joint Comput. Conf.,AFIPS Conf. Proc., vol. 33. Washington, D. C.: Thompson,1968, pp. 279-290.

    [11] M. D. Kelly, "Edge detection in pictures by computer usingplanning," in Machine Intelligence 6, B. Meltzer and D. Michie,Ed. New York: Elsevier, 1971, pp. 397-410.

    [12] H. J. Bremermann, "Cybernetic functionals and fuzzy sets," inAnn. Symp. Rec. 1971 IEEE Syst., Man Cybern. Group,Oct. 1971, pp. 248-253.

    [13] "Pattern recognition, functionals, and entropy," IEEETrans. Bio-Med. Eng., vol. BME-15, pp. 201-207, July 1968.

    [14] -, "What mathematics can and cannot do for patternrecognition," in Pattern Recognition in Biological and TechnicalSystems, 0. J. Grusser, Ed. Heidelberg, Germany: Springer,1971, pp.31-45.

    [15] W. W. Bledsoe, "The model method in facial recognition,"Panoramic Research, Inc., Palo Alto, Calif., Rep. PRI:15,Aug. 1966.

    116] , "Man-machine facial recognition," Panoramic Research,Inc., Palo Alto, Calif., Rep. PRI:22, Aug. 1966.

    [17] A. J. Goldstein, L. D. Harmon, and A. B. Lesk, "Identificationof human faces, "Proc. IEEE, vol. 59, pp. 748-760, May 1971.

    Martin A. Fischler (S'57-M'58) was born inNew York,I- Y., on February 15, 1932. Hereceived the B.E.E. degree from the City Col-lege of New York, New York, in 1954 and theM.S. and Ph.D. degrees in electrical engineer-ing from Stanford University, Stanford, Calif.,in 1958 and 1962, respectively.

    He served in the U. S. Army for two yearsand held positions at the National Bureau ofStandards and at Hughes Aircraft Corpora-tion during the period 1954 to 1958. In 1958

    he joined the technical staff of the Lockheed Missiles & Space Com-pany, Inc., at the Lockheed Palo Alto Research Laboratory, PaloAlto, Calif., and currently holds the title of Staff Scientist. He hasconducted research and published in the areas of artificial intelligence,picture processing, switching theory, computer organization, andinformation theory.

    Dr. Fischler is a member of the Association for Computing Ma-chinery, the Pattern Recognition Society, the Mathematical Associa-tion of America, Tau Beta Pi, and Eta Kappa Nu. He is currentlyan Associate Editor of the journal Pattern Recognition and is a pastChairman of the San Francisco Chapter of the IEEE Society on Sys-tems, Man, and Cybernetics.

    Robert A. Elschlager was born in Chicago,Ill., on May 25, 1943. He received the B.S.degree in mathematics from the University ofIllinois, Urbana, in 1964, and the M.S. degreein mathematics from the University of Cali-fornia, Berkeley, in 1969.

    Since then he has been an AssociateScientist with the Lockheed Missiles & SpaceCompany, Inc., at the Lockheed Palo Alto Re-

    Q< search Center, Palo Alto, Calif. His currentinterests are picture processing, operating

    systems, computer languages, and computer understanding.Mr. Elschlager is a member of the American Mathematical

    Society, the Mathematical Association of America, and the Associa-tion for Symbolic Logic.