
Robotics and Autonomous Systems 11 (1993) 231-242, Elsevier

High resolution smart image sensor with integrated parallel analog processing for multiresolution edge extraction

Marc Tremblay *, Denis Laurendeau, Denis Poussart
Laval University, Computer Vision & Digital Systems Laboratory, Dept. of Electrical Engineering, Ste-Foy, Quebec, Canada G1K 7P4

Abstract

Tremblay, M., Laurendeau, D. and Poussart, D., High resolution smart image sensor with integrated parallel analog processing for multiresolution edge extraction, Robotics and Autonomous Systems, 11 (1993) 231-242.

This paper presents a vision sensor which generates a multiresolution edge description using parallel analog processing support. Its multimodule architecture is based on a Multi-port Access of photo-Receptor (MAR) hexagonal sensor coupled to an external but powerful analog processing unit and a microcoded digital interface. The system supports image scanning and edge tracking. Satellite analog processing allows extensive computation using VLSI technology, leaving all the sensor area available for photo-transduction and communication pathways. It is thus possible to design a sensor with up to 500×500 pixels on a single CMOS chip using 1.2 µm technology. The goal of the approach described here is to exploit an embedded edge tracing algorithm in order to generate a scene description as a list of connected edge segments. Experimental results are presented for the current prototype, which implements 256×256 pixels with corresponding multiresolution edge maps.

Keywords: Smart sensor; Image processing; Edge detection; Scale-space integration; Hexagonal image sensor

Marc Tremblay received the B.Sc. and M.Sc. degrees in Electrical Engineering from l'École Polytechnique de Montréal in 1986 and 1988. He received a Ph.D. degree in Electrical Engineering from Laval University in 1992. He is currently professor in Micro-electronics at the Electrical Engineering department of Laval University. He has worked on computational sensing since 1987, on focal plane processing sensors for 2D vision and motion estimation as well as digital architecture for computer vision. Tremblay is a member of Laboratoire de vision et systèmes numériques at Laval University, a constituent of the IRIS (Institute for Robotics and Intelligent Systems) Network of Centres of Excellence. He is a member of IEEE and l'Ordre des ingénieurs du Québec.

* Corresponding author. Tel.: +1 418 656-3238, Fax: +1 418 656-3594, E-mail: [email protected]

Denis Laurendeau received a B.Sc. degree in Engineering Physics from Laval University in 1981 and the M.Sc. and Ph.D. degrees in Electrical Engineering in 1983 and 1986 from the same University. In 1986, he worked as a visiting scientist at Institut de Recherche d'Hydro-Québec (IREQ). He has been a professor in Electrical Engineering at Laval University since 1987. He is currently the director of graduate studies in Electrical Engineering at Laval. His research interests include 2D and 3D vision and the application of computer vision to robotics and biomedical engineering. Laurendeau is a member of Laboratoire de vision et systèmes numériques at Laval University, a constituent of the IRIS Network of Centres of Excellence. He is a member of IEEE, the Canadian Society of Electrical and Computer Engineering, the International Association for Dental Research, and l'Ordre des ingénieurs du Québec. Starting in January 1993, he will act as co-Editor of the Canadian Journal of Electrical and Computer Engineering.

0921-8890/93/$06.00 © 1993 - Elsevier Science Publishers B.V. All rights reserved

232 M. Tremblay et al.

1. Introduction

Visual perception must be developed in relation with the needs of the recognition processes in order to define efficient and adaptive automation tasks or mobile robot applications. A large part of the computational effort in computer vision is related to low level repetitive processing which can best be implemented at the sensor level [12]. It is well known that biological visual systems include aspects of image preprocessing in the first layers of neurons within the retina. It has been suggested that this low level image processing is related to multiresolution edge extraction [9]. This natural organization reduces the amount of data to be routed to the visual cortex for further recognition [3].

Computational sensing is a novel research area which targets the integration of such image processing by merging transduction devices and analog signal processing modules. Interesting solutions exploit natural properties of semiconductor devices in order to implement simple but powerful computational functions on a very small silicon area [4]. The bidimensional implementation of computational sensors is confronted with a trade-off between the spatial resolution of the sensor and the complexity of the embedded electronics. This trade-off will remain until microelectronic technology can offer a higher integration level or 3D structures, which would remove the limitations of a fully parallel implementation of analog processing at the sensor level. In many cases, it is not advantageous to simultaneously compute a large number of edge data if sequential (slow) scanning is required for the extraction of these primary results. Previous work on smart sensing has considered the design of complex photo-sensitive elements [14] with emphasis on the communication between neighbors [7], [8]. Other approaches use the implicit access of parallel row data at one end of the photo-sensitive array for SIMD computation [5], or a complete sequential processor which is implemented on the same substrate [1]. A common goal of these approaches and of the one discussed in this paper is to integrate photo-sensitive elements on CMOS or CCD technologies [14,16].

The sensor architecture described here uses a serial-parallel approach in order to yield a balance of resolution and computational capabilities. Emphasis is put on an efficient communication strategy in order to extract a local description of focal plane illuminance and exploit it by an external analog processing module. The main interest of this architecture is related to the parallel analog filtering made possible by using several different operators which are driven in common by the set of primary outputs of the sensor. It is thus possible to generate, in a single scan period, a multiresolution edge description of a scene using band-pass filters with different scales. This type of satellite analog processing is discussed in Section 2 with the proposed open architecture for dedicated post-processing on primary edge data. The Multi-port Access of photo-Receptor (MAR) architecture is presented in Section 3 with its pixel electronics and the basic operating mode. Section 4 describes the parallel analog module which implements multiresolution Laplacian-of-Gaussian operators followed by a zero-crossing evaluation module. The microcoded edge tracking algorithm is described in Section 5. The paper concludes with experimental results which have been obtained from a current prototype of 256 × 256 pixels.

2. The system architecture

Denis Poussart received the Ph.D. in Electrical Engineering from MIT and joined the department of Electrical Engineering of Laval University in 1968. He leads the Vision and Computer Systems Laboratory, which is a node of a network of Canadian universities and industries which have recently initiated the Institute for Robotics and Intelligent Systems. He is a member of IEEE and l'Ordre des ingénieurs du Québec. He has done research in biophysics and instrumentation and is involved in 3D vision and its applications in biomedicine and industry.

2.1. Satellite analog processing

The implementation of focal plane processing implies a delicate balance between pixel complexity, spatial resolution, and data flow. It is clear that the small cell size of a 2D photo-sensing array leaves only a limited area available for computation. In the case of large arrays, until technology allows much denser circuits (or 3D structures), most of the non-photosensitive area


Fig. 1. Satellite processing approach (blocks: image sensor with its region of interest, parallel illuminance buffers and conditioning, illuminance images, filtered images). The set of primary analog outputs which represents the illuminance of the region of interest addressed within the sensor is used by several analog filters in order to implement the extraction of multiresolution primitives. The pixel of interest itself is also routed to an A/D converter in order to output illuminance data.

will have to be dedicated to communication, with little built-in processing. Furthermore, there exist serious I/O limitations. For instance, even if edge information could be computed in parallel within each pixel, a simultaneous read-out of such data over large image regions would remain challenging. Current technologies favor simple operators with great spatial homogeneity. The MAR architecture recognizes such a trade-off between the simplicity of the basic pixel design and the complexity and penalty of rapid communication by making use of external, but tightly coupled, processing support.

This so-called satellite processing is illustrated in Fig. 1, which shows the conceptual representation of a device with relatively high resolution and its associated off-chip parallel analog processing. The dark circle on the sensor delineates a region of interest (ROI) centered on the pixel of interest (POI) from which illuminance data is retrieved and routed to a conditioning module. These channels are shared by a set of N different analog filters which may implement multiresolution and/or directional edge extractors, Gaussian filters, etc.
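The fan-out described above can be sketched in software, with the caveat that the real system does this with analog circuits: one vector of ROI illuminances is shared by N weight sets, and each filter response is a weighted sum of the same inputs. The function name, the 1D five-sample ROI and the two toy filters are illustrative assumptions, not part of the MAR design.

```python
from typing import Dict, Sequence


def filter_bank_response(roi: Sequence[float],
                         filters: Dict[str, Sequence[float]]) -> Dict[str, float]:
    """Apply every filter to the same ROI samples (one weighted sum per filter)."""
    responses = {}
    for name, weights in filters.items():
        if len(weights) != len(roi):
            raise ValueError(f"filter {name!r} does not match the ROI size")
        responses[name] = sum(w * v for w, v in zip(weights, roi))
    return responses


# Toy 1D ROI of five samples centred on the POI (index 2); values are made up.
roi = [10.0, 10.0, 10.0, 50.0, 50.0]
filters = {
    "smooth": [0.2, 0.2, 0.2, 0.2, 0.2],       # local average
    "laplacian": [0.0, 1.0, -2.0, 1.0, 0.0],   # second difference
}
print(filter_bank_response(roi, filters))
```

Because all filters read the same inputs, adding a filter in this sketch costs one more weight vector and no additional read of the sensor, which mirrors the motivation given for satellite processing.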

2.2. Post-processing module integration

While computational sensor development remains our main research topic, several other projects are in progress as VLSI co-processing modules which will operate in the immediate periphery of the sensor. These modules share a single memory block with the sensor and other co-processing modules. Such processing could include scale-space integration, shape from shading, stereo disparity evaluation using two MAR sensors, image calibration for further photometric use of the sensor, and others. A main digital controller is designed for driving specific protocol signals, data flow and memory addressing on a hexagonal tessellation. It also implements a microcoded instruction set which defines complex displacement sequences for the POI within the sensing array. Finally, the controller is responsible for interfacing the MAR system with the host computer and provides bidirectional interrupt capabilities which offer interactive feature extraction, especially during the edge tracking process. This open and modular architecture facilitates the development of other co-processing elements because none has direct functional consequences on the others.

3. Basic organization of the hexagonal MAR sensor

In conjunction with the satellite processing approach, the present sensor development has been oriented towards the following objectives: (1) highest possible spatial resolution using current VLSI technology, (2) possibility of using multiresolution edge analysis in order to extract relevant characteristics from a scene, (3) access to custom video rate and format for automatic light adaptation without constraints emerging from historical video standards, and (4) emphasis on data-base description of early primitives rather than real-time display of illuminance images: it is not an imaging sensor. This section explains how these objectives have been met by using a hexagonal multi-port addressing strategy and presents its associated design and operating constraints.

3.1. Multi-port access on a hexagonal tessellation

The pixel architecture is based on a multi-port addressing strategy. The selection circuitry is similar to a multi-port memory except that selection busses are routed geometrically in order to define the shape of the kernel and that retrieved data are analog and represent pixel illuminance. The implementation of computational sensors is not restricted to a conventional rectangular grid. Even if low level algorithms in computer vision are often designed for a Cartesian tessellation, the hexagonal pattern was chosen for three main reasons: (1) immediate neighbors of any pixel of interest are located at the same radial distance, which facilitates the implementation of circularly symmetric operators, (2) multi-port addressing of hexagonal pixels is naturally implemented using a set of collinear data busses, and (3) a hexagonal tessellation is highly regular and facilitates the representation of curved lines or surfaces [6].
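Reason (1) can be checked numerically. The sketch below compares the centre-to-neighbour distances of a hexagonal grid with those of the 8-connected square grid. The axial coordinate convention (q, r) and the unit pixel pitch are assumptions made for the illustration; the paper does not specify a coordinate system.

```python
import math

# Axial coordinates (q, r) on a hexagonal grid with unit pixel pitch.
HEX_NEIGHBORS = [(1, 0), (0, 1), (-1, 1), (-1, 0), (0, -1), (1, -1)]
SQUARE_NEIGHBORS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                    if (dx, dy) != (0, 0)]


def hex_to_xy(q: int, r: int) -> tuple:
    """Convert axial hexagonal coordinates to Cartesian coordinates."""
    return (q + 0.5 * r, (math.sqrt(3) / 2) * r)


def distances(offsets, to_xy):
    """Distinct centre-to-neighbour distances, rounded for comparison."""
    return sorted({round(math.hypot(*to_xy(*o)), 6) for o in offsets})


hex_d = distances(HEX_NEIGHBORS, hex_to_xy)
square_d = distances(SQUARE_NEIGHBORS, lambda x, y: (x, y))
print(hex_d)     # a single distance for all six hexagonal neighbours
print(square_d)  # two distances for the eight square-grid neighbours
```

The single distance on the hexagonal grid is what makes a ring of neighbours a natural sample of a circularly symmetric operator.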

Fig. 2 shows an overall block diagram of the MAR sensor. The core area is composed of a 2D matrix of multi-port access pixels where the POI is addressed by three concurrent selection lines. The white star represents the addressed pixels which are simultaneously extracted from the sensor. The topology allows access to the illuminance data of the POI together with the illuminances of all neighbors located on the three axes of symmetry of the array (corners of the concentric hexagons). Each illuminance signal is routed from the sensor on an individual channel and is fed to the external analog module for spatial filtering operations. Each set of selection busses (named Y, X1 and X2) is activated by a bidirectional shift register. A fourth register (named T) is used to control the parallel analog multiplexor on the upper region of the sensor. Unlike in conventional video sensors, the POI may be moved along any of the axes of the underlying hexagonal structure, thus allowing very flexible scanning strategies. A direction code which uses 3 bits is internally decoded in order to control the shift registers for both motion and direction. Reg_Set is used to initialize the four shift registers, Reset initializes the integration process which is discussed in Section 3.2, while the Grab signal is used to stop the integration process during scanning.

Fig. 2. Block diagram of the MAR sensor. A set of four bidirectional shift registers drives selection lines which converge upon the POI. A decoder module uses a direction code of three bits in order to drive each shift register. The analog data path is shown in the shadowed area until it reaches the analog multiplexor.

Fig. 3. Multi-port addressing architecture of each pixel and non-destructive read-out of illuminance data (blocks: photo-transduction, analog buffer, multi-port addressing). The photo-diode drives the gate voltage V_g of transistor M1, which is then translated to a proportional output current I_s.
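The 3-bit direction code can be pictured as a small lookup table that turns each code into one step of the POI along a hexagonal axis. The sketch below is only an illustration of that idea: the actual encoding is not given in the text, so the code values, the mapping of the Y, X1 and X2 axes to axial coordinates and the "hold" codes are all hypothetical.

```python
# Hypothetical 3-bit direction codes; the actual encoding is not given in the text.
DIRECTION_STEPS = {
    0b000: (1, 0),    # +X1 axis
    0b001: (-1, 0),   # -X1 axis
    0b010: (0, 1),    # +Y axis
    0b011: (0, -1),   # -Y axis
    0b100: (1, -1),   # +X2 axis
    0b101: (-1, 1),   # -X2 axis
    0b110: (0, 0),    # hold (no shift)
    0b111: (0, 0),    # unused in this sketch, treated as hold
}


def move_poi(poi, code, size=256):
    """Return the new POI after one step, staying inside a size x size array."""
    if code not in DIRECTION_STEPS:
        raise ValueError("direction code must fit in 3 bits")
    dq, dr = DIRECTION_STEPS[code]
    q, r = poi[0] + dq, poi[1] + dr
    if 0 <= q < size and 0 <= r < size:
        return (q, r)
    return poi  # refuse a move that would leave the array


def run_sequence(poi, codes, size=256):
    """Apply a sequence of direction codes, as a microcoded scan might."""
    for code in codes:
        poi = move_poi(poi, code, size)
    return poi


print(run_sequence((10, 10), [0b000, 0b000, 0b010, 0b101]))  # (11, 12)
```

A raster scan, a spiral or an edge-following path are then just different code sequences, which is the flexibility the text attributes to the bidirectional shift registers.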

The circuit organization of each pixel is presented in Fig. 3. It can be divided into three main parts: (1) photo-transduction, (2) signal buffering, and (3) multi-port addressing. The single current I_s, which is generated by the integration of photo-current I_E on the gate capacitance of transistor M1, is retrieved through a set of three N transistors (M_Y, M_X1 and M_X2) according to the status of the selection lines. It may be shown [15] that the output current I_s is given by the following expression:

I s = ~ VRese t -- VSN + V, 2Cg 1Eint i

- V , - V t , (1)

where K is a CMOS process constant, β is the current gain of the transistor M1, V_SN is the voltage drop of an N transistor due to threshold effect, V_F is the forward voltage drop of a PIN diode (approx. 0.6 V), V_t is the threshold voltage of the transistor M1, t_i is the photo-current integration time and E_in is the input illuminance of the pixel.

3.2. Non-destructive read-out of pixel information

Transistor M1 operates as an analog transconductance buffer and is required to ensure a non-destructive read-out of the illuminance data during pixel access. This property is critical because each pixel is addressed several times due to the parallel analog nature of the MAR architecture. The MAR sensor has a global Reset signal for the entire array which applies voltage V_Reset on the gate of transistor M1. The integration process is thus uniform for each pixel of the sensor and the reading of the output current I_s is non-destructive, with voltage V_g remaining unchanged irrespective of the frequency or duration of the access to a pixel element. This property is essential since, after a complete scan of the sensor, each pixel is addressed as often as the number of pixels on the extraction kernel.

3.3. The MAR sensor operation

The typical operating mode of the MAR sensor is presented in Fig. 4 for the two extreme values of scene illuminance, in a bright and in a dark region, when only one pixel is selected during the scanning window t_s (for proper biasing of output transistor M1). For the dark case, photo-current I_E is limited to the reverse leakage current of the photo-diode, which causes a small deviation ΔV_g on the gate of M1. In this condition the output current is maximum. The bright case is illustrated for an unsaturated pixel which sinks a small current (near zero) until a minimum threshold voltage V_t is applied at the gate of the output transistor. If an isolated region has a very high level of illuminance (caused by a light source in the scene or specular reflections for instance), pixels in this region will saturate. In this case (see dotted line in Fig. 4), the gate voltage decreases rapidly, resulting in a zero current output signal in the entire saturated region without any effect on the unsaturated neighbor pixels.

Fig. 4. The timing diagram shows a typical illuminance integration cycle and its associated scan (read-out) cycle for the two extreme cases of scene illuminance. The maximum output current is associated with a dark region while a null output corresponds to a highlighted pixel. Traces shown: digital control signals (Reset, Grab, Scan) and analog responses (V_g and I_s for a black and for a white pixel, including the saturation case), with the integration time t_i and the scanning window t_s.

The global Grab signal is used to interrupt the integration process during the scan period, especially under strong illumination, so that the integration time is not longer for the last visited pixels than for the first ones. The integration process of the MAR sensor is therefore not tied to a standard video rate (1/30 sec), which is a very useful characteristic. The integration duration t_i may be adjusted by the operating system depending on the illumination condition of the scene. This parameter then acts as an equivalent aperture control (or as a control of the light source intensity). A simple procedure which uses the histogram of a previous image could dynamically adjust the level of the white saturation of pixels by changing the integration time t_i. The range of adjustment for this parameter is only limited by the voltage deviation due to the dark current of the back-biased PN junction.
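A minimal sketch of such a histogram-driven adjustment is given below. The paper only states that a histogram of a previous image could drive t_i; the target saturation fraction, the multiplicative step, the time limits and the convention that the last histogram bin means "saturated white" are all assumptions introduced for the illustration.

```python
def adjust_integration_time(histogram, t_i, target_saturated=0.01,
                            step=0.8, t_min=0.1e-3, t_max=100e-3):
    """Return a new integration time (seconds) from the previous image histogram.

    histogram[k] is the number of pixels with digitized illuminance k; the last
    bin is taken as "saturated white". All parameters are illustrative.
    """
    total = sum(histogram)
    if total == 0:
        return t_i
    saturated_fraction = histogram[-1] / total
    if saturated_fraction > target_saturated:
        t_new = t_i * step          # too many saturated pixels: integrate less
    elif saturated_fraction == 0.0:
        t_new = t_i / step          # no saturation at all: integrate longer
    else:
        t_new = t_i                 # within the target band: keep the setting
    return min(max(t_new, t_min), t_max)


# Example: 10% of the pixels are saturated, so the integration time shrinks.
hist = [0] * 256
hist[100], hist[255] = 900, 100
print(adjust_integration_time(hist, 20e-3))
```

Called once per frame, this behaves like the equivalent aperture control mentioned above: the integration time shrinks while highlights saturate and grows again when none do.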

4. Analog image processing for multiresolution edge extraction

Marr's theory suggests analyzing edges using a multiresolution strategy [9,10,11] in order to discriminate, from noisy (but accurate) edges, those which represent relevant features in the scene. In this approach, edge extraction refers to the zero-crossing location in the filtered image which results from the convolution with the Laplacian-of-Gaussian (LOG) operator. A multiresolution analysis is simply a variation of the standard deviation σ of the Gaussian. A relevant property of the MAR sensor is its parallel analog filtering capability which allows a very fine sampling, in the scale domain, of the zero-crossing maps from the highest frequency filters to the lower ones. The kernel has a sufficiently large diameter to implement low frequency filters for significant feature extraction and high frequency rejection. The current version of the analog multiresolution edge extraction implements 16 different LOG filters as a resistor network from σ = 0.5 (or Laplacian operator) to σ = 6.9. This section presents some details on the VLSI analog implementation of the Marr operator and a threshold-free zero-crossing detector. An edge encoding strategy is also presented. Implementation details of the analog computing module are available in [2].
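For reference, the radial profile of the LOG operator can be written, up to a normalization constant, as ((r² − 2σ²)/σ⁴)·exp(−r²/(2σ²)), which changes sign at r = σ√2. The sketch below evaluates this standard textbook form for the scales quoted in the paper. It says nothing about the resistor values actually used in the analog network, and the function names are illustrative.

```python
import math


def log_response(r: float, sigma: float) -> float:
    """Unnormalized Laplacian-of-Gaussian value at radial distance r."""
    s2 = sigma * sigma
    return ((r * r - 2.0 * s2) / (s2 * s2)) * math.exp(-(r * r) / (2.0 * s2))


def zero_crossing_radius(sigma: float) -> float:
    """Radius at which the LOG profile changes sign (r = sigma * sqrt(2))."""
    return math.sqrt(2.0) * sigma


for sigma in (0.5, 0.8, 1.1, 2.4, 6.9):   # scales quoted in the paper
    r0 = zero_crossing_radius(sigma)
    print(f"sigma={sigma:4.1f}  centre={log_response(0.0, sigma):9.4f}  "
          f"sign change at r={r0:5.2f}")
```

With σ = 6.9 the sign change sits close to 10 pixels from the POI, which gives a feeling for why the kernel needs a large diameter to support the coarse filters.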

4.1. Effect of the spatial sub-sampling of the convolution kernel

The star shape of the convolution kernel which is drawn in Fig. 2 represents a sub-sampling of the complete mask for the LOG operator. Ninety-one pixels are read simultaneously out of a possible count of 721 pixels (for a maximum radius of 16 pixels). The spatial Fourier transform is presented in Fig. 5 for the complete LOG operator in (a) and the approximated one in the MAR architecture in (b). Some oriented high frequency regions are related to the unsampled pixels from the MAR sensor. This means that some high frequency patterns (texture-like) are not rejected by coarse filters as correctly as they would be by a complete LOG kernel. In the scale-space domain, this means that some zero-crossings (or edges) may appear for coarse resolutions even if they are not detected by finer filters. This problem is solved in part by applying a large number of filters with very fine increments of σ.

Fig. 5. Fourier transform of a complete LOG kernel (a) and of its approximation in the MAR implementation (b). High frequencies are not rejected in the region where a large number of pixels remains unread. The MAR kernel is shown in (c).
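The two pixel counts quoted above can be reproduced with simple counting on a hexagonal grid. A full hexagonal neighbourhood with n rings around the centre holds 1 + 3n(n + 1) pixels, while six arms of n pixels along the three axes plus the centre hold 1 + 6n. Both quoted figures are obtained with n = 15, which presumably corresponds to a radius of 16 pixels when the POI itself is counted; that reading is an assumption.

```python
def hex_disk_count(rings: int) -> int:
    """Pixels in a full hexagonal neighbourhood with the given number of rings."""
    return 1 + 3 * rings * (rings + 1)


def star_count(rings: int) -> int:
    """Pixels on the six arms along the three hexagonal axes, plus the centre."""
    return 1 + 6 * rings


rings = 15  # assumption: 15 rings around the POI reproduce the quoted figures
full, star = hex_disk_count(rings), star_count(rings)
print(full, star, round(100 * star / full, 1))  # 721 91 12.6
```

Under this reading, the star kernel samples about one pixel in eight of the complete mask, which is consistent with the incomplete high frequency rejection visible in Fig. 5b.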

4.2. Zero-crossing detection and digital edge encoding

Fig. 7. Side view of the example of Fig. 6, showing the illuminance I(x, y), the filter response R(x, y) = I(x, y) * ∇²G_σ(x, y) and the derived digital signals. The zero-crossing detection consists of extracting the sign S(x, y) of the image convolved with the LOG operator and the thresholded value of the magnitude of the response H(x, y). After a morphological operation applied on the H signal (giving H*), a zero-crossing is defined as the change of sign within the active modified hysteresis value H*(x, y).

The zero-crossing extraction must be executed dynamically within the camera by the analog processing module in order to avoid a large number of A/D converters, data storage and digital processing. The procedure is shown in Fig. 6 for a simple example while a 1D cut view of the image (a) is shown in Fig. 7. The analog response from each LOG filter is routed to three inverters with programmable input thresholds (VS+, VS− and 0 V) in order to derive two digital signals: (1) the sign of the response S(x, y) and (2) the thresholded value of the magnitude of the response H(x, y) [2]. The zero-crossing is detected and located at a change of sign within the active window of H. Since some pixels near a zero-crossing may have a small magnitude, a morphological dilatation must be applied on the signal H(x, y). This is done by growing its active region (H = 1) until a change on the signal S is reached, which defines the derived signal H*(x, y) (see Fig. 6c). This ensures that only zero-crossings with active H on both sides are extracted as edges. This situation is shown over the shadowed area in Fig. 7, where a noisy pixel of H does not generate a false edge detection.

Fig. 6. Visualization of the zero-crossing extraction procedure: (a) illuminance image, (b) binary image for H and S, (c) H* resulting from dilatation, (d) edge detected from (c). The synthetic illuminance image (a) consists of a dark circle on a white background with small noisy patterns. The sign (S) and hysteresis (H) bits are shown in (b) while (c) represents the regions where the zero-crossing is relevant (H* = 1) in order to derive the final edge map in (d). A cross-section of the edge in (a) is shown in Fig. 7 for analog and digital values of a filter response.
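The S, H and H* logic described above can be sketched in one dimension as follows. This is a software illustration of the rule, not the analog circuit: the function name, the list-based signals and the example values are assumptions, and the dilatation is implemented by growing each active H pixel over its run of constant sign.

```python
def zero_crossings_1d(response, vs_plus, vs_minus):
    """Return indices i such that an edge lies between pixels i and i+1."""
    n = len(response)
    sign = [v > 0.0 for v in response]                        # S
    strong = [v > vs_plus or v < vs_minus for v in response]  # H

    # H*: grow each active H pixel over its run of constant sign.
    grown = list(strong)
    for i in range(n):
        if not strong[i]:
            continue
        j = i - 1
        while j >= 0 and sign[j] == sign[i]:
            grown[j] = True
            j -= 1
        j = i + 1
        while j < n and sign[j] == sign[i]:
            grown[j] = True
            j += 1

    # An edge needs a sign change with H* active on both sides.
    return [i for i in range(n - 1)
            if sign[i] != sign[i + 1] and grown[i] and grown[i + 1]]


# Made-up filter response: weak noise, then one strong positive/negative swing.
response = [0.1, -0.1, 0.05, 2.0, 0.4, -0.3, -2.5, -0.2, 0.1, -0.05]
print(zero_crossings_1d(response, vs_plus=1.0, vs_minus=-1.0))  # [4]
```

The weak sign changes at the start and end of the example are ignored because H* is not active on both sides, while the crossing between pixels 4 and 5 is kept even though both pixels are themselves below the magnitude thresholds.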

The final format for edge representation includes two quantities for each pixel location: S(x, y) and H*(x, y). The resulting raw data consists of blocks of 32 bits per pixel for a 16-filter system. An additional 8 bits from a single A/D converter are added in order to memorize the illuminance of the POI (illuminance image). Another particularity of the MAR system is that the edge location is defined between two pixels. This edge representation is very useful since the zero-crossing approach tends to yield closed loop edges. Using such an inter-pixel edge representation, a single black pixel on a white background (or a thin line like those of fingerprints) is extracted as a small circle with a diameter of one pixel (see results in Section 6).
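The raw data block described above (16 sign bits, 16 H* bits and 8 illuminance bits per pixel) can be illustrated with a small packing routine. The bit layout chosen here is purely an assumption, since the paper does not state how the bits are ordered in memory.

```python
N_FILTERS = 16


def pack_pixel(sign_bits, hstar_bits, illuminance):
    """Pack 16 S bits, 16 H* bits and an 8-bit illuminance into one integer.

    Assumed layout: bits 0-15 hold S, bits 16-31 hold H*, bits 32-39 hold the
    illuminance. Index 0 is the finest filter.
    """
    if len(sign_bits) != N_FILTERS or len(hstar_bits) != N_FILTERS:
        raise ValueError("expected one S and one H* bit per filter")
    if not 0 <= illuminance <= 255:
        raise ValueError("illuminance must fit in 8 bits")
    word = 0
    for k in range(N_FILTERS):
        word |= (1 if sign_bits[k] else 0) << k
        word |= (1 if hstar_bits[k] else 0) << (N_FILTERS + k)
    return word | (illuminance << (2 * N_FILTERS))


def unpack_pixel(word):
    """Inverse of pack_pixel."""
    sign_bits = [bool((word >> k) & 1) for k in range(N_FILTERS)]
    hstar_bits = [bool((word >> (N_FILTERS + k)) & 1) for k in range(N_FILTERS)]
    return sign_bits, hstar_bits, (word >> (2 * N_FILTERS)) & 0xFF


s = [k % 2 == 0 for k in range(N_FILTERS)]
h = [k < 4 for k in range(N_FILTERS)]
word = pack_pixel(s, h, 200)
print(hex(word), unpack_pixel(word) == (s, h, 200))
```

Each pixel therefore occupies 40 bits in this sketch, 32 for the edge description and 8 for the illuminance.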

5. Scene representation and features extraction

A main goal of computational sensing is to design sensors which process images at the focal plane level. It is nevertheless imperative to define a proper data format in order to accommodate the subsequent segmentation and recognition processes. This section presents the two main post-processing digital modules which allow such a data-base description. The proposed scale-space integration approach is first summarized, followed by the microcoded edge tracking which is dedicated to a hexagonal tessellation. An overview of the primary scene description which is the effective output of the MAR system is also introduced.

Fig. 8. 2D representation (a) of the multiresolution analysis using zero-crossing maps in the scale-space domain, from the original (illuminance) image through the edges at small σ up to the relevant but blurred contours at large σ. The Depth Of Scale (DOS) approach (b) shows the transposition from a continuous representation to an oversampled one in the scale domain. Some DOS values are traced for this example with the level of visibility of an edge in the space domain.

Fig. 9. Typical execution of the edge tracking algorithm on a hexagonal tessellation. The algorithm starts by finding the first zero-crossing [move #4] (which is located between two pixels) and starts the tracking to the left for this example [#5 and #6]. The regular edge tracking is made by moving the POI from side to side [#7, #8 and #9] until the zero-crossing detection is lost [#10]. At this point, the triangle is closed [#11] and tracking continues in the new direction [#12 and #13], etc. Legend: real edge location, undetected edge, detected edge, interpreted edge location.

5.1. Real-time scale-space integration

The scale-space integration procedure must proceed at the sensor level because it is not appropriate to obtain 16 individual edge maps as sensor outputs without any hierarchical interpretation or pyramidal analysis [13]. An example of a typical scale-space representation is shown in Fig. 8a as a 3D set of data: a low value of the parameter σ (narrow filter) gives an accurate but very dense edge map while coarse filters extract only the relevant structures of the scene without a good localization. The proposed approach visits the scale-space domain line by line and counts the number of levels which have been visited until it is no longer possible to detect the edge. The oversampling of the scale domain ensures that the displacement of the edge is never more than one pixel for consecutive filters. A typical example is shown in Fig. 8b where a sub-set of the accurate edges are labelled with the Depth Of Scale value, which ranges from 1 (noisy, high frequency or very low contrast edges) to 16 (low frequency and relevant contrast in the scene).
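The counting idea behind the Depth Of Scale can be sketched on a single image line. The version below is a simplified 1D illustration and not the authors' pipelined implementation: each edge pixel of the finest map is followed through the coarser maps, allowing a displacement of at most one pixel between consecutive filters, and the number of levels reached is its DOS. The function name and the tie-breaking order among candidate positions are assumptions.

```python
def depth_of_scale(edge_maps):
    """Return {x: DOS} for every edge pixel x of the finest map.

    edge_maps[k][x] is True when filter k (0 = finest) reports an edge at x.
    """
    width = len(edge_maps[0])
    dos = {}
    for x0 in range(width):
        if not edge_maps[0][x0]:
            continue
        pos, depth = x0, 1
        for level in edge_maps[1:]:
            # Oversampling in scale: the edge moves by at most one pixel.
            candidates = [p for p in (pos, pos - 1, pos + 1)
                          if 0 <= p < width and level[p]]
            if not candidates:
                break
            pos = candidates[0]
            depth += 1
        dos[x0] = depth
    return dos


def row(width, edges):
    """Helper: boolean edge map of the given width with edges at the given x."""
    return [x in edges for x in range(width)]


# Four scales on one line of 8 pixels (made-up data).
maps = [row(8, {1, 5}), row(8, {5}), row(8, {6}), row(8, set())]
print(depth_of_scale(maps))  # {1: 1, 5: 3}
```

In the example, the edge at x = 1 disappears after the finest filter (DOS = 1), while the edge at x = 5 survives three scales while drifting by one pixel (DOS = 3).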

The scale-space integration algorithm is intended for a co-processing module within the MAR architecture. It is implemented in a pipelined processor which reads the 16 edge maps (32 bits per pixel) and generates the corresponding DOS values for every edge pixel which is detected by the finest filter. The DOS value is accumulated over three consecutive image scans (from memory to memory) using the three main directions of the hexagonal structure. The sixteen edge images are used as the input of the scale-space integration algorithm in order to track edges in the scale space. The resulting DOS image of the example of Fig. 10a is presented in Fig. 10f. A single threshold was used for the printed representation but, in fact, every edge pixel has its own weight which represents the local DOS value as shown in Fig. 8b. This final image is used for the edge tracking procedure where DOS values are integrated locally along the edge segment.

Fig. 10. Results for a photographic enlarger from the 256 x 268 MAR sensor: (a) illuminance image (hexagonal grid), (b) edge map (σ = 0.8), (c) edge map (σ = 1.1), (d) edge map (σ = 2.4). The hexagonal illuminance image (a) is shown along with four of its sixteen edge maps at different spatial resolutions (b, c, d and e) which are generated simultaneously in a single frame period. The resulting representation of the scene after scale-space integration is shown in (f) (see Section 5.1).

5.2. Microcoded edge tracking algorithm on hexagonal tessellation

As mentioned at the beginning of the paper, a hexagonal edge tracking algorithm has been implemented as a microcoded component of the direction controller. This interactive procedure extracts linear edge segments from the scene and transfers them as a line drawing to the host computer. An example of the algorithm is illustrated in Fig. 9, where the continuous curve represents the real edge (or zero-crossing) location and the pixel path is traced using arrows. The basis of the algorithm is to cross the edge from side to side, in a zigzag mode, while the zero-crossing is validated. When the edge is no longer detected (or changes in orientation), the algorithm closes the triangle and restarts in the new direction. The interest of this approach is that the connectivity of the edge segments is naturally recorded during the edge tracking. A simple visit flag is set during the edge tracking while a basic scan of the sensor data, with the break condition set to unvisited pixels, ensures that every edge is extracted from the focal plane.

This step of the data acquisition process depends on the nature of the scene. A large number of short and unstructured contrasts (edges) generated by textured surfaces will imply a longer procedure than a scene with smooth surfaces and polyhedral shapes. Some interesting values may be computed during the edge tracking as line segment properties. The DOS value may be integrated along the last visited pixels, as well as the edge length.

Fig. 10. (Continued.) (e) Edge map (σ = 6.9); (f) DOS image (scale-space integration).


5.3. Towards a data-base representation for robot vision application

The information extracted from a computational sensor must be formatted in order to accommodate the requirements of the application for which it is designed. Our strategy is oriented towards robot vision and machine intelligence. This is why a token description of the scene is preferred to a conventional raster data format. A data-base representation of the line segments may significantly compress the amount of data to be extracted from the sensing element. It may also increase the efficiency of the recognition process if this data-base includes some pre-computed properties. The primary scene representation consists of a sequential list of extracted basic linear edge segments which are oriented along one of the three main diagonals of the hexagonal tessellation (see Fig. 10). The basic description of a simple edge segment includes global DOS information from multiresolution analysis, line length and orientation, 3D coordinates (using stereo vision) and the natural connectivity of line segments. The software which runs on the host computer will modify this primary data-base into a more compact and significant one. This is done by merging consecutive line segments and replacing them by single features such as longer line segments or arcs. This advanced scene description is improved by creating cross-references such as junction pointers (or proximity, vertices, T-junctions), symmetries (or parallelism) and direct 2D accesses to primitives from a multi-scale localization map. Each vertex may include a small illuminance sub-image on which a sophisticated algorithm may proceed in order to compute an accurate junction localization.
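One possible shape for such a primary data-base is sketched below. The record fields follow the properties listed in the text (DOS, length, orientation, connectivity), but every name is hypothetical, the 3D coordinates are left out, and the merging rule is a deliberately simple stand-in for the host software: it only fuses list-adjacent segments that share the same direction, assuming they are connected end to end.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EdgeSegment:
    """One basic linear edge segment; field names are illustrative."""
    start: tuple              # (q, r) hexagonal coordinates of the first pixel
    direction: int            # 0, 1 or 2: one of the three main axes
    length: int               # number of pixels along the segment
    dos_sum: int              # DOS values accumulated along the segment
    next_ids: List[int] = field(default_factory=list)  # connected segments

    @property
    def mean_dos(self) -> float:
        return self.dos_sum / self.length


def merge_collinear(segments: List[EdgeSegment]) -> List[EdgeSegment]:
    """Merge consecutive segments that share the same direction."""
    merged: List[EdgeSegment] = []
    for seg in segments:
        if merged and merged[-1].direction == seg.direction:
            last = merged[-1]
            merged[-1] = EdgeSegment(last.start, last.direction,
                                     last.length + seg.length,
                                     last.dos_sum + seg.dos_sum,
                                     list(seg.next_ids))
        else:
            merged.append(seg)
    return merged


segments = [
    EdgeSegment((0, 0), 0, 3, 30),
    EdgeSegment((3, 0), 0, 2, 10),
    EdgeSegment((5, 0), 1, 4, 8),
]
for seg in merge_collinear(segments):
    print(seg.start, seg.direction, seg.length, seg.mean_dos)
```

Even this toy version shows the compression argument: three primary segments become two tokens, each carrying a pre-computed mean DOS that a recognition stage could use to rank features.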

6. Experimental results

A 256 × 256 pixel version of the MAR sensor is currently installed in a custom camera case with an opto-electric shutter. This version implements sixteen different filters for multiresolution spatial edge detection. Images of different scenes have been acquired as a proof of concept. A typical result is shown in Fig. 10 for a photographic enlarger scene, along with the resulting edge data. Four of the sixteen edge maps are presented (b, c, d and e), as well as the digitized illuminance data (a) of the enlarger (on a hexagonal grid). The high frequency rejection is particularly visible on the small vertical slots, which are correctly detected by the high resolution filters (σ = 0.8 and σ = 1.1) but are almost completely eliminated and grouped into a single feature by the low resolution filters (σ = 2.4 to σ = 6.9). It may also be observed that every extracted edge segment is oriented along one of the three main diagonals of the hexagonal structure. Another relevant property, which is related to Marr's theory [10], is the poor localization of edges for low resolution filters. This is particularly visible on the control knobs of the enlarger. These primary edge maps, which represent the scene at several spatial resolutions, are generated in a single frame period and are updated every 30 ms under typical lighting conditions.
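The scale dependence observed here can be related to the standard Laplacian of Gaussian operator of Marr and Hildreth [10]. The expression below assumes an ideal, continuous, circularly symmetric operator with the filter width σ expressed in pixel units; the analog filters on the hexagonal grid can only approximate it.

```latex
\nabla^2 G_\sigma(r) = \frac{1}{\pi\sigma^4}\left(\frac{r^2}{2\sigma^2} - 1\right)
\exp\!\left(-\frac{r^2}{2\sigma^2}\right),
\qquad
w = 2\sqrt{2}\,\sigma
```

Here w is the diameter of the central lobe, whose zero lies at r = √2 σ. Under these assumptions, w is about 2.3 pixels for a filter width of 0.8 and about 19.5 pixels for a filter width of 6.9, which is consistent with narrow slots being resolved by the high resolution filters and merged into a single feature by the low resolution ones.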

7. Conclusion

A new concept for a focal plane processing sensor has been presented, designed to link acquisition with tightly integrated satellite processing. The prototype implements Marr's theory, based on the convolution of an image with circularly symmetric operators followed by zero-crossing edge detection. It provides a sensor with reasonable resolution and multiresolution edge extraction capabilities, within a compact package and with flexible, high performance operating modes. The feasibility of a sensor with a resolution of up to 500 × 500 pixels is an important consequence of this concept. The integration of the microcoded controller, the analog processing units and the operating software which will support intelligent and automatic feature extraction is being carried out. Ongoing work includes the development of post-processing VLSI modules for scale-space integration, stereo matching with two MAR sensors and shape-from-shading. It is believed that the early and medium processing capabilities of the architecture discussed here could contribute to the development of economical and efficient implementations of intelligent sensing devices capable of generating a compact scene description for subsequent segmentation processes.


Acknowledgment

This work was performed, in part, through support made available through the Institute for Robotic and Intelligent Systems of Canada (projects A5 and NP-1) and through grants FCAR 92-ER-0380 of Québec and NSERC A5274 and OGP-0138377 of Canada. The Canadian Microelectronics Corporation (CMC) provided software, hardware and fabrication support through the Northern Telecom and Gennum Corporation foundries.

References

[1] S. Anderson, W.H. Bruce, P.B. Denyer, D. Renshaw and G. Wang, A single chip sensor & image processor for fingerprint verification, Proc. IEEE 1991 Custom Integrated Circuit Conference, San Diego, CA (1991) 12.1.1-12.1.4.

[2] M. d'Anjou, M. Tremblay and D. Laurendeau, A VLSI implementation of a parallel analog Laplacian of Gaussian filter with multiple resolution levels, Proc. Canadian Conference on Very Large Scale Integration, Halifax (1992) 204-211.

[3] A. Baylor, Photoreceptor signal and vision, Invest. Ophthalmology Vision Sci. 28 (1987) 34-49.

[4] P.J. Burt, Smart sensing in machine vision, Machine Vision (H. Freeman, Editor, Academic Press, New York, 1988).

[5] K. Chen, A. Åström and P.-E. Danielson, PASIC. A smart sensor for computer vision, Proc. 10th International Conference on Pattern Recognition, V.2, Atlantic City (1990) 286-291.

[6] B. Kamgar-Parsi and B. Kamgar-Parsi, Quantization error in hexagonal sensory configurations, IEEE Trans. on PAMI 14 (6) (1992) 665-671.

[7] T.M. Knight, Design of an integrated optical sensor with on-chip preprocessing, Ph.D. Thesis, Massachusetts Institute of Technology, 1983.

[8] M.A. Mahowald and C. Mead, The silicon retina, Scientific American 264 (5) (1991) 76-82.

[9] D. Marr, Early processing of visual information, Philosophical Transactions of the Royal Society of London, B, 245 (1976) 483-519.

[10] D. Marr and E. Hildreth, Theory of edge detection, Proc. of the Royal Society of London, Series B, 207 (1980) 187-217.

[11] D. Marr, Vision - A Computational Investigation into the Human Representation and Processing of Visual Information (Freeman, San Francisco, 1982).

[12] C.A. Mead and M.A. Mahowald, A silicon model of early visual processing, Neural Networks 1 (1988) 91-97.

[13] T.A. Poggio and A.L. Yuille, Scaling theorem for zero crossing, IEEE Trans. on Pattern Analysis and Machine Intelligence PAMI-8 (1) (1986).

[14] J. Spiegel, F. Kreider, C. Claiys, I. Debusschere, G. Sandini, P. Dario, F. Fantini, P. Belluti and G. Soncini, A foveated retina-like sensor using CCD technology, in: C. Mead and M. Ismail, eds., Analog VLSI Implementation of Neural Networks (Kluwer, Boston, 1989).

[15] M. Tremblay and D. Poussart, CMOS photo-sensitive device with multiresolution edge detection capability for computer vision, Proc. Canadian Conference on Very Large Scale Integration, Halifax (1992) 155-162.

[16] G. Wang, D. Renshaw, P.B. Denyer and M. Lu, CMOS video camera, EURO ASIC'91 (1991).
