
SVG Rendering by Watershed Decomposition

S. Battiato, A. Costanzo, G. Di Blasi, G. Gallo, S. Nicotra {battiato,gdiblasi,gallo,snicotra}@dmi.unict.it, [email protected]

Dipartimento di Matematica e Informatica, Università di Catania - Italy

ABSTRACT

This paper presents a novel raster-to-vector technique for digital images, based on an advanced watershed decomposition [5][13] coupled with ad-hoc heuristics aimed at obtaining a high-quality rendering of digital photographs. The system consists of two main steps: first, the image is partitioned into homogeneous, contiguous regions using watershed decomposition; then, a Scalable Vector Graphics (SVG) representation of these regions is obtained by building ad-hoc chain codes. The final result is an SVG file of the image that can be used to transmit pictures over the Internet to different display systems (PCs, PDAs, cellular phones). Experimental results and comparisons demonstrate the effectiveness of the proposed method.

Keywords: Scalable Vector Graphics, Watershed Decomposition, Raster to Vector, Chain Code representation.

1. INTRODUCTION

“SVG is a language for describing two-dimensional graphics and graphical applications in XML” [4][9][11]. This definition from the W3 Consortium summarizes well the potential and expressiveness of the language. Two important advantages of SVG are its open recommendations and the possibility of sharing and viewing SVG documents over the Internet. The main purpose of this research is to represent raster images using SVG primitives instead of a grid of pixels. Many raster-to-vector applications have been implemented in different areas of computer vision and computer graphics, including cartography, font representation and Geographical Information Systems. In recent years SVG has been used for this purpose in software such as Vector Eye [12], Autotrace [1], Ras2Vec [10], Potrace [8], SVGenie and SVGWave [2][3].

Our approach consists of two main steps. In the first, the original image is partitioned into homogeneous polygonal areas using a modified version of watershed decomposition [5][7][13]. Then a boundary representation is computed and translated into SVG expressions. The final result is an SVG file that looks like the original raster while retaining the advantages of a vector description, such as scalability and abstract representation. Figure 1 shows a schematic description of the proposed method; the technical details of each step are described later.

Figure 1: General overview of the proposed conversion algorithm. [Block diagram; labels: Input Image, Pre-processing, Partitioning, Gradient quantization, Pixel sorting, Assign region, Refining map, Contouring, Output SVG.]

The paper is structured as follows. Section 2 describes the segmentation step, and Section 3 describes the contouring algorithm based on chain codes. Section 4 reports a large set of experiments and comparisons with existing methods. A final section concludes the paper and gives directions for future research.

2. PARTITIONING THE IMAGE

Watershed decomposition has been widely used in Computer Vision for both image segmentation and classification [5][7][13]. The idea is to estimate the gradient magnitude over the image and to use this information as a virtual altitude over a terrain. Imagine gradually immersing this virtual landscape in water. The water first fills the lowest areas, forming a set of “basins”. As the water level rises, these basins progressively meet along borders that are local maxima of the gradient magnitude. At the end, the algorithm returns a partition of the input image into a (typically large) number of sub-regions, called “basins”, that are almost homogeneous. The algorithm proceeds as follows: first, the minimal values that represent the deepest points of the catchment basins are computed by evaluating the gradient of the image.

For each RGB component $I_k$ of the image, let $G_k = \sqrt{G_{k,x}^2 + G_{k,y}^2}$ be the gradient of the layer, where $G_{k,x} = S_x * I_k$ and $G_{k,y} = S_y * I_k$ ($*$ is the convolution operator and $S_x$ and $S_y$ are Sobel filters [5]).

To merge the gradients computed for the individual layers, we experimented with several metrics and chose LMAX as the most suitable in terms of efficiency and final results. The gradient values are then normalized into the range [0,255]:

$$G = \mathrm{NORMALIZE}\left(\max_{k = 1, \dots, \#layer} G_k(x,y)\right) \qquad (1)$$
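The gradient step can be sketched in a few lines of pure Python. This is only an illustration under stated assumptions: the per-layer gradient is taken as the Euclidean magnitude of the two Sobel responses, border pixels are replicated, and NORMALIZE is assumed to be a linear rescaling by the maximum value. The function names (`filter3x3`, `layer_gradient`, `merged_gradient`) are hypothetical and not taken from the SWaterG implementation.

```python
import math

# 3x3 Sobel kernels for the horizontal and vertical derivatives.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))

def filter3x3(layer, kernel):
    """Apply a 3x3 kernel to a 2D list, replicating border pixels.

    The kernel is not flipped (correlation); for Sobel kernels this only
    changes the sign of the response, not the gradient magnitude.
    """
    rows, cols = len(layer), len(layer[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ii = min(max(i + di, 0), rows - 1)
                    jj = min(max(j + dj, 0), cols - 1)
                    acc += kernel[di + 1][dj + 1] * layer[ii][jj]
            out[i][j] = acc
    return out

def layer_gradient(layer):
    """Gradient magnitude of one colour layer (assumed Euclidean norm)."""
    gx = filter3x3(layer, SOBEL_X)
    gy = filter3x3(layer, SOBEL_Y)
    return [[math.hypot(a, b) for a, b in zip(rx, ry)]
            for rx, ry in zip(gx, gy)]

def merged_gradient(layers):
    """LMAX merge of the per-layer gradients, rescaled to integers in [0, 255]."""
    grads = [layer_gradient(layer) for layer in layers]
    rows, cols = len(layers[0]), len(layers[0][0])
    merged = [[max(g[i][j] for g in grads) for j in range(cols)]
              for i in range(rows)]
    peak = max(max(row) for row in merged)
    if peak == 0:
        return [[0] * cols for _ in range(rows)]
    # Assumption: NORMALIZE is a linear rescaling by the maximum value.
    return [[round(255 * v / peak) for v in row] for row in merged]
```

Calling `merged_gradient([red, green, blue])` on three equally sized 2D lists returns a single integer gradient map that can be used as the "altitude" for the flooding step.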

Quantizing the gradient at various levels (i.e. quantizing $kG$ with $k = 0.1, \dots, 1$) can significantly reduce the well-known over-segmentation problem, yielding a lower total number of basins (see Figure 2 for examples).
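To illustrate why a smaller k produces fewer basins, the sketch below scales the gradient by k and truncates the result to an integer level. The exact rounding operator used by the authors is not recoverable from the text, so `math.floor` is an assumption, and both function names are hypothetical.

```python
import math

def quantize_gradient(gradient, k):
    """Scale every gradient value by k and truncate it to an integer level.

    Assumption: truncation (floor) is used as the quantization operator.
    """
    return [[math.floor(k * value) for value in row] for row in gradient]

def count_levels(gradient):
    """Number of distinct gradient levels present in the map."""
    return len({value for row in gradient for value in row})
```

With fewer distinct levels, more neighbouring pixels share the same "altitude" and are flooded together, which is consistent with the reduced number of basins reported for small k.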

Figure 2: The image Lena (a), a toy example “circles” (b) and a gray-level representation of the gradient with the L2 (c)(d) and LMAX (e)(f) metrics. Note how the LMAX metric delineates edges better. Visual representation of the basins of the “circles” image with different gradient quantization: k=1 (g) and k=0.1 (h).

Let H be the histogram of the gradient G, computed as:

    for i = 1 to m
        for j = 1 to n
            insert pixel p = (i,j) into list H[G(i,j)]

where m and n are the dimensions of the original image. The lists of pixels are processed starting from the minimum gradient values (i.e. from list H[0]), assigning a basin to each point in the current list in the following manner:

    for z = 0 to #number_of_gradient_levels
        for each pixel p in H[z]
            let m be the number of basins in the 8-neighborhood of the pixel p

then p can be marked as:
- a new basin (m = 0)
- an edge (m > 1)
- aggregated to an existing basin (m = 1)
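The bucket-based flooding above can be sketched as follows. This is a minimal illustration of method m1 (no sorting inside a bucket), assuming an integer gradient map with values in `range(levels)`; the function name `flood`, the use of `None` for unvisited pixels and the numbering of basins from 0 are choices made for this example.

```python
def flood(gradient, levels=256):
    """Assign each pixel to a basin by processing gradient levels in order.

    Returns (labels, number_of_basins); edge pixels are marked with -1.
    """
    rows, cols = len(gradient), len(gradient[0])
    # Bucket sort: buckets[g] lists the pixels whose gradient equals g.
    buckets = [[] for _ in range(levels)]
    for i in range(rows):
        for j in range(cols):
            buckets[gradient[i][j]].append((i, j))

    EDGE = -1
    labels = [[None] * cols for _ in range(rows)]  # None = not yet visited
    next_label = 0
    for bucket in buckets:              # lowest gradient level first
        for i, j in bucket:             # method m1: raster order, no sorting
            near = set()                # basins found in the 8-neighborhood
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    ni, nj = i + di, j + dj
                    if (di or dj) and 0 <= ni < rows and 0 <= nj < cols:
                        lab = labels[ni][nj]
                        if lab is not None and lab != EDGE:
                            near.add(lab)
            if not near:                # no basin nearby: start a new one
                labels[i][j] = next_label
                next_label += 1
            elif len(near) == 1:        # exactly one basin: join it
                labels[i][j] = near.pop()
            else:                       # several basins meet: edge pixel
                labels[i][j] = EDGE
    return labels, next_label
```

For a gradient map with two flat areas separated by a one-pixel ridge, the two areas become basins 0 and 1 and the ridge pixels are marked as edges.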

The order in which the pixels in each histogram list are processed may affect the final segmentation of the image. Three pixel-sorting methods have been implemented. The first method (called m1) applies no sorting. The other two methods (m2, m3) first extract and process the first pixel in the list; the next pixel to be processed is then the one nearest to the previous one with respect to the L1 metric. Pixels at the same minimal distance are treated differently by the two methods: m2 favors horizontal left-to-right scanning, while m3 chooses a top-to-bottom approach. Figure 3 shows the different behaviors of the three methods for the “circle” image. Edge pixels, marked as “-1” in the basin map, are then assigned during the refining step to the most similar adjacent region, according to a given distance (L2 or LMAX); see Figure 4.
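A possible reading of the three ordering methods is sketched below. The tie-breaking rules are an interpretation, not the authors' definition: for m2, ties are resolved in raster order (row first, then column), and for m3 in column-major order (column first, then row). Pixels are `(row, column)` tuples and `order_pixels` is a hypothetical name.

```python
def order_pixels(pixels, method="m2"):
    """Reorder one histogram list according to method m1, m2 or m3.

    m1 keeps the list as it is; m2 and m3 repeatedly pick the pixel
    nearest (L1 distance) to the previously processed one.
    """
    if method == "m1" or not pixels:
        return list(pixels)
    remaining = list(pixels[1:])
    ordered = [pixels[0]]               # the first pixel is always kept first
    while remaining:
        pi, pj = ordered[-1]

        def key(p):
            dist = abs(p[0] - pi) + abs(p[1] - pj)   # L1 distance
            # Assumed tie-breaks: raster order for m2, column-major for m3.
            tie = (p[0], p[1]) if method == "m2" else (p[1], p[0])
            return (dist, tie)

        nxt = min(remaining, key=key)
        remaining.remove(nxt)
        ordered.append(nxt)
    return ordered
```

This greedy selection is quadratic in the length of the list, which is acceptable for a sketch; a production implementation would likely use a spatial index.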

Figure 3: Visual appearances of the partition of the image “circle” using different pixel sorting techniques. [Panel label: Method m1, k=0.1, number of basins: 16.]

Figure 4: Refining step of the watershed decomposition algorithm (LMAX metric).

Basin map before refining (edge pixels marked as -1):

     1  1  1 -1 -1 -1 -1  5
     1  1 -1 -1  0  0 -1  5
    -1 -1 -1  0  0  0 -1  5
     2  2 -1  0  0  0 -1  5
     2  2 -1 -1 -1 -1 -1  5
     2  2  2  2  2  2 -1  5
    -1 -1  2  2 -1 -1 -1 -1
     3  3 -1 -1 -1  4  4  4

Basin map after refining:

     1  1  1  1  0  0  0  5
     1  1  1  1  0  0  5  5
     2  2  2  0  0  0  5  5
     2  2  2  0  0  0  5  5
     2  2  0  0  0  0  5  5
     2  2  2  2  2  2  4  5
     2  3  2  2  2  4  4  4
     3  3  3  3  4  4  4  4
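The refining step can be sketched as follows, under several assumptions that the text does not spell out: the similarity of an edge pixel to a region is measured between the pixel colour and the mean colour of the region, the means are computed once from the non-edge pixels, and edge pixels with no labelled neighbour wait for a later pass. The names `lmax` and `refine` are hypothetical.

```python
def lmax(c1, c2):
    """LMAX (Chebyshev) distance between two colour tuples."""
    return max(abs(a - b) for a, b in zip(c1, c2))

def refine(labels, image, dist=lmax):
    """Assign every edge pixel (-1) to the most similar adjacent basin."""
    rows, cols = len(labels), len(labels[0])
    labels = [row[:] for row in labels]          # work on a copy
    # Mean colour of each basin, computed from its non-edge pixels.
    sums, counts = {}, {}
    for i in range(rows):
        for j in range(cols):
            lab = labels[i][j]
            if lab != -1:
                s = sums.setdefault(lab, [0.0] * len(image[i][j]))
                for c, v in enumerate(image[i][j]):
                    s[c] += v
                counts[lab] = counts.get(lab, 0) + 1
    means = {lab: [v / counts[lab] for v in s] for lab, s in sums.items()}

    pending = [(i, j) for i in range(rows) for j in range(cols)
               if labels[i][j] == -1]
    while pending:
        updates, still = {}, []
        for i, j in pending:
            # Basins present in the 8-neighborhood of the edge pixel.
            near = {labels[ni][nj]
                    for ni in range(max(i - 1, 0), min(i + 2, rows))
                    for nj in range(max(j - 1, 0), min(j + 2, cols))
                    if labels[ni][nj] != -1}
            if near:
                updates[(i, j)] = min(
                    sorted(near),
                    key=lambda lab: dist(image[i][j], means[lab]))
            else:
                still.append((i, j))             # retry in the next pass
        if not updates:
            break                                # no basin exists at all
        for (i, j), lab in updates.items():
            labels[i][j] = lab
        pending = still
    return labels
```

Passing a different `dist` function (for example a Euclidean distance) would give the L2 variant mentioned in the text.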



3. CONTOURING

The partitioned image P is made of connected polygons that can be efficiently represented by SVG primitives. We used chain codes [5] to find a boundary representation of each basin of the image. Chain codes are typically used to represent regions by a sequence of segments with a given direction and size. Standard representations of segments use schemas with four or eight directions. Figure 5 gives an example of the chain code description of a basin: in the top part of the picture a four-direction schema (4CC) has been adopted, while in the bottom part the same basin has an eight-direction (8CC) representation.

A deterministic automaton has been implemented to build the chain code of each basin. Starting from the top-left point, the procedure looks for the next pixel of the perimeter in the neighbourhood, following a clockwise direction. Both the state of the automaton and the vector description, with the relative position code, are updated as the search continues, until the path returns to the first pixel.

To obtain the SVG representation of the chain code of each basin, the PATH primitive has been used. The formal syntax of PATH is the tag <path d="…" style="…">. The attribute d contains both instructions and points. A brief summary of the basic commands of the path primitive is reported in Table 1. A first optimization of the code uses the h or v command in the case of multiple horizontal or vertical movements in the chain code. Examples of SVG representations of the chain codes are given in Figure 5.
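The boundary-following procedure can be approximated with a standard Moore-neighbour tracing, sketched below. This is not the authors' automaton: it works on pixel centres with an 8-direction code, it assumes the direction numbering 0 = up followed by the other directions clockwise, it only follows the outer boundary, and it stops when the walk is about to repeat its first move from the start pixel (a slightly stricter test than simply returning to the first pixel). The name `trace_chain_code` is hypothetical.

```python
# (d_row, d_col) for codes 0..7: 0 = up, then clockwise (assumed numbering).
OFFSETS = [(-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1), (-1, -1)]

def trace_chain_code(labels, basin):
    """Return the 8-direction chain code of the outer boundary of a basin."""
    rows, cols = len(labels), len(labels[0])

    def inside(i, j):
        return 0 <= i < rows and 0 <= j < cols and labels[i][j] == basin

    # Top-left point: the first pixel of the basin in raster order.
    start = next(((i, j) for i in range(rows) for j in range(cols)
                  if labels[i][j] == basin), None)
    if start is None:
        raise ValueError("basin not found")

    codes, cur = [], start
    search_from = 6            # the pixel to the left of the start is outside
    while True:
        # Scan the neighbourhood clockwise, starting from a known outside pixel.
        for turn in range(8):
            d = (search_from + turn) % 8
            if inside(cur[0] + OFFSETS[d][0], cur[1] + OFFSETS[d][1]):
                break
        else:
            return codes       # isolated pixel: empty chain code
        if cur == start and codes and d == codes[0]:
            return codes       # the walk would now repeat itself
        codes.append(d)
        cur = (cur[0] + OFFSETS[d][0], cur[1] + OFFSETS[d][1])
        # Restart from the last outside pixel seen before moving.
        search_from = (d + 6) % 8 if d % 2 == 0 else (d + 5) % 8
```

For a 2x2 square the function returns `[2, 4, 6, 0]`: right, down, left, up.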

Figure 5: Chain code representations of basins. [Diagram: the same basin traced from its start position with both schemas.]

CC4: 2 2 4 2 4 4 4 2 4 6 0 6 6 0 0 0 0 6
<path d="M x y h 2 v 1 h 1 v 3 h 1 v 1 h -1 v -1 h -2 v -4 z" style="…" />

CC8: 2 2 3 4 4 3 4 6 7 6 0 0 0 7
<path d="M x y h 2 l 1 1 v 2 l 1 1 v 1 h -1 l -1 -1 h -1 v -4 z" style="…" />

| Command | Name | Arguments | Description |
|---|---|---|---|
| M, m | moveto | x y | Starts a new subpath. M (uppercase) indicates that absolute coordinates will follow; m (lowercase) indicates that relative coordinates will follow. |
| L, l | lineto | x y | Draws a line from the current point to (x,y). L (uppercase) indicates that absolute coordinates will follow; l (lowercase) indicates that relative coordinates will follow. |
| H, h | horizontal lineto | p | Draws a horizontal line from the current point (x,y) to point (x+p,y). H (uppercase) indicates that absolute coordinates will follow; h (lowercase) indicates that relative coordinates will follow. |
| V, v | vertical lineto | p | Draws a vertical line from the current point (x,y) to point (x,y+p). V (uppercase) indicates that absolute coordinates will follow; v (lowercase) indicates that relative coordinates will follow. |
| Z, z | closepath | (none) | Closes the current subpath by drawing a straight line from the current point to the initial point of the current subpath. |

Table 1: Brief description of the main commands of primitive PATH [11].
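The translation from a chain code to the d attribute can be sketched as follows. The direction numbering (0 = up, then clockwise, with the SVG y axis pointing down) is inferred from the CC4 example of Figure 5 and should be read as an assumption. The sketch merges every run of equal codes into one command (which also merges repeated diagonal steps, a small extension of the h/v optimization) and lets z draw the last run. The name `chain_to_path` is hypothetical.

```python
from itertools import groupby

# (dx, dy) for codes 0..7: 0 = up, then clockwise; SVG y grows downwards.
STEPS = {0: (0, -1), 1: (1, -1), 2: (1, 0), 3: (1, 1),
         4: (0, 1), 5: (-1, 1), 6: (-1, 0), 7: (-1, -1)}

def chain_to_path(codes, x0=0, y0=0):
    """Build the d attribute of an SVG path from a closed chain code."""
    # Collapse consecutive equal codes into (code, run_length) pairs.
    runs = [(code, len(list(group))) for code, group in groupby(codes)]
    dx_total = sum(STEPS[c][0] * n for c, n in runs)
    dy_total = sum(STEPS[c][1] * n for c, n in runs)
    if (dx_total, dy_total) != (0, 0):
        raise ValueError("chain code does not return to its start")
    parts = [f"M {x0} {y0}"]
    # The last run is a straight segment ending at the start point,
    # so the closing command z draws it.
    for code, n in runs[:-1]:
        dx, dy = STEPS[code]
        if dy == 0:
            parts.append(f"h {dx * n}")
        elif dx == 0:
            parts.append(f"v {dy * n}")
        else:
            parts.append(f"l {dx * n} {dy * n}")
    parts.append("z")
    return " ".join(parts)
```

Applied to the CC4 code of Figure 5 with the start point at the origin, the function yields `M 0 0 h 2 v 1 h 1 v 3 h 1 v 1 h -1 v -1 h -2 v -4 z`, which matches the path shown in the figure.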

The style properties relative to the path element are:
- stroke, stroke-width: respectively the color and the width of the external line.
- fill: the color of the basin (the average color of its pixels).

In the CC4 representation, the initial (and final) coordinates and the curvature points are increased by 0.5 pixels in order to better cover the raster points of the image. Because of its diagonal lines, the CC8 representation does not cover the whole area of each pixel; for such points the optimal width of the line can be computed as:

$$\text{Stroke-Width} = c = \sqrt{a^2 + b^2} = \sqrt{1^2 + 1^2} = \sqrt{2} \approx 1.41 \qquad (2)$$

See Figure 6 for further explanations and Figure 7 for examples.
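Putting the style properties together, a path element for one basin could be produced as in the sketch below. It assumes that the stroke uses the same colour as the fill, that the width is the diagonal of a unit pixel (about 1.41) for CC8 and 1 for CC4, and that colours are written in hexadecimal notation; `average_colour` and `path_element` are hypothetical names.

```python
import math

def average_colour(pixels):
    """Average RGB colour of a list of (r, g, b) tuples, rounded to integers."""
    n = len(pixels)
    return tuple(round(sum(p[c] for p in pixels) / n) for c in range(3))

def path_element(d, pixels, diagonal=True):
    """Build an SVG path element for one basin.

    d        -- the path data (for example produced from a chain code)
    pixels   -- the (r, g, b) colours of the pixels of the basin
    diagonal -- True for CC8 (stroke width sqrt(2)), False for CC4 (width 1)
    """
    r, g, b = average_colour(pixels)
    colour = f"#{r:02x}{g:02x}{b:02x}"
    width = math.sqrt(2) if diagonal else 1.0
    # Assumption: the stroke has the same colour as the fill.
    style = f"fill:{colour};stroke:{colour};stroke-width:{width:.2f}"
    return f'<path d="{d}" style="{style}" />'
```

The resulting strings can be concatenated inside an `<svg>` root element to obtain the final document.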

Figure 6: Smart techniques to adjust SVG basin representation. [Diagram labels: CC4, CC8, Border Line, Stroke-width=1, Stroke-width=1.41.]

Figure 7: SVG rendering of the images “Red Hat” and “Circle2”, with CC4 (a)(c) and CC8 (b)(d) representations respectively.

4. EXPERIMENTAL RESULTS

The proposed technique, SWaterG, has been implemented in the C programming language, and an exhaustive set of experiments has been conducted to show the effectiveness of the method. Figure 8 reports a representative part of the dataset used, composed of nine images: pictures from the Kodak collection [6], classical test images and text.

The first set of experiments was mainly aimed at finding the most suitable parameters for SWaterG. To determine the best contouring method, PSNR values have been computed for all the images at various levels of gradient quantization. The results show that the CC8 method achieves higher PSNR values than the corresponding CC4 one; Figure 9 shows this behavior for the “peppers” image.

Another set of experiments was devoted to finding the “optimal” quantization factor k for the gradient. Since the number of basins (and consequently the SVG file size) grows with k, the best choice is the minimum k value that still maintains an acceptable PSNR. The “optimal” (minimum) k value, obtained empirically, is 0.3 (see Figure 10).
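For reference, the PSNR between the original raster and the rasterized SVG output can be computed as in the sketch below. It uses the standard definition with a peak value of 255 and averages the squared error over all pixels and colour channels; how the authors combine the channels is not stated, so this is an assumption, and `psnr` is a hypothetical name.

```python
import math

def psnr(original, rendered, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are 2D lists of colour tuples; the mean squared error is
    averaged over every pixel and every channel.
    """
    total, count = 0, 0
    for row_a, row_b in zip(original, rendered):
        for pa, pb in zip(row_a, row_b):
            for a, b in zip(pa, pb):
                total += (a - b) ** 2
                count += 1
    mse = total / count
    # Identical images have zero error and therefore infinite PSNR.
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)
```

Higher values indicate a rendering closer to the original raster.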

Figure 8: Dataset of images used in experiments. [Panels: Kodim04, Kodim07, Kodim15, Kodim21, Lena, Mandrill, Peppers, Graphology, Image7.]

Figure 9: PSNR comparisons for the “peppers” image with CC4 and CC8. [Plot; x axis: Gradient Quantization (*0.1), 1 to 10; y axis: PSNR, 26.5 to 29.5; series: cc4, cc8.]

Figure 10: PSNR comparisons for the dataset images with increasing gradient quantization factor. [Plot; x axis: Gradient quantization factor k, 0.1 to 1; y axis: PSNR, 20 to 34; series: KODIM04, KODIM07, KODIM21, mandrill, peppers, graphology, image7.]

SWaterG has also been compared with analogous methods: Vector Eye [12], the best-performing available software according to our preliminary experiments, and the recent techniques SVGenie and SVGWave [2][3]. For all the dataset images we computed both the PSNR and the bits per pixel of the SVG outputs obtained by these methods (see Figure 11 and Figure 12). The results show the effectiveness of the proposed method: the PSNR values of SWaterG are higher than those of the other methods in most cases, while its bits-per-pixel values remain acceptable. A comparison between gzip-compressed SVG outputs (SVGZ) and the JPG format, reported in Figure 13, shows similar performance. Finally, a visual comparison of the results obtained by the four approaches for the “red hat” image is shown in Figure 14, revealing the perceptual quality of the SWaterG output with respect to the others.
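The bits-per-pixel figures, for both plain and gzip-compressed SVG, can be obtained as in the following sketch. It assumes that the size of the SVG text is measured in bytes after UTF-8 encoding and that SVGZ is produced with standard gzip compression; both function names are hypothetical.

```python
import gzip

def bits_per_pixel(data: bytes, width: int, height: int) -> float:
    """Size of an encoded image in bits, divided by the number of pixels."""
    return 8 * len(data) / (width * height)

def svgz_bits_per_pixel(svg_text: str, width: int, height: int) -> float:
    """Bits per pixel of the gzip-compressed (SVGZ) form of an SVG document."""
    compressed = gzip.compress(svg_text.encode("utf-8"))
    return bits_per_pixel(compressed, width, height)
```

Because SVG is verbose, repetitive XML, the compressed figure is usually much lower than the plain one, which is why the comparison with JPG is made on SVGZ files.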

5. CONCLUSIONS

A raster-to-vector system, named SWaterG, has been presented. The method consists of an image segmentation step, based on a modified version of watershed decomposition, and a contouring step in which the resulting regions are described by ad-hoc chain codes. The vector representation is expressed in the Scalable Vector Graphics (SVG) language. The approach has been implemented and tested on a large dataset of images; experimental results and comparisons support the proposal. Other experiments and an on-line demo are available at the web site http://svg.dmi.unict.it. Future research will be devoted to studying advanced region merging heuristics, together with the use of Bezier curves, region gradient filling, filter enhancement and ad-hoc applications on embedded systems (PDAs, cellular phones).

Figure 11: PSNR comparisons of dataset images outputs. [Chart; x axis: Images (KODIM04, KODIM07, KODIM15, KODIM21, Lena, mandrill, peppers, graphology, image7); y axis: PSNR, 15 to 35; series: VectorEye, SWaterG, SVGenie, SVGWave.]

Figure 12: Bit per pixels comparisons of dataset images outputs. [Chart; x axis: Images; y axis: Bpp, 0 to 160; series: Original, VectorEye, SWaterG, SVGenie, SVGWave.]

Figure 13: Bit per pixels comparisons of dataset images compressed outputs. [Chart; x axis: Images; y axis: Bpp, 0 to 40; series: Jpg, VectorEye, SWaterG, SVGenie, SVGWave.]

Figure 14: Visual comparison of results obtained by the different approaches for the “red hat” image. [Panels: ORIGINAL, SWaterG, Vector Eye, SVGenie, SVGWave.]

REFERENCES

1. Autotrace, Convert bitmaps to vector graphics, http://autotrace.sourceforge.net/
2. S. Battiato, G. Gallo, G. Messina, SVG Rendering of Real Images using Data Dependent Triangulation, in Proc. of ACM/SCCG2004, Spring Conference on Computer Graphics, 2004, Slovakia.
3. S. Battiato, G. Barbera, G. Di Blasi, G. Gallo, G. Messina, Advanced SVG Triangulation Polygonalization of Digital Images, to appear in Proceedings of SPIE Electronic Imaging 2005, Internet Imaging VI, Vol. 5670.1, San Jose, USA, January 2005.
4. D. Duce, I. Herman, B. Hopgood, Web 2D Graphics File Format, Computer Graphics Forum vol. 21(1) (2002), pages 43-64.
5. R.C. Gonzalez, R.E. Woods, Digital Image Processing, Second Edition, Prentice Hall, 2002.
6. Kodak's PhotoCD system: ftp://www.cipr.rpi.edu/pub/image/still/KodakImages/
7. S. Nicotra, Algorithms for Image Segmentation and Classification: Integrating Automated Processing and Human Perception, Ph.D. Thesis, University of Catania, 2002.
8. Potrace, Transforming bitmaps into vector graphics: http://potrace.sourceforge.net/
9. A. Quint, Scalable Vector Graphics, IEEE Multimedia vol. 3 (2003), pages 99-101.
10. Ras2Vec, Raster to vector conversion program, http://xmailserver.org/davide.html
11. Scalable Vector Graphics (SVG), XML Graphics for the Web, http://www.w3c.org/Graphics/SVG.
12. Vector Eye, Raster to Vector Converter, http://www.siame.com/index.html (2003).
13. L. Vincent, P. Soille, Watersheds in Digital Spaces: an Efficient Algorithm based on Immersion Simulations, IEEE Trans. on PAMI, 13(6), pages 583-598.
