

Plane Segmentation in Organized Point Clouds using Flood Fill

Arindam Roychoudhury Marcell Missura Maren Bennewitz

Abstract— The segmentation of a point cloud into planar primitives is a popular approach to first-line scene interpretation and is particularly useful in mobile robotics for the extraction of drivable or walkable surfaces and for tabletop segmentation for manipulation purposes. Unfortunately, the planar segmentation task becomes particularly challenging when the point clouds are obtained from an inherently noisy, robot-mounted sensor that is often in motion, therefore requiring real-time processing capabilities. We present a real-time capable plane segmentation technique based on a region growing algorithm that exploits the organized structure of point clouds obtained from RGB-D sensors. To counteract the sensor noise, we invest in careful selection of the seeds that start the region growing and avoid the computation of surface normals whenever possible. We implemented our algorithm in C++ and thoroughly tested it in both simulated and real-world environments, where we compare our approach against existing state-of-the-art methods implemented in the Point Cloud Library. The experiments presented here suggest that our approach is accurate and fast, even in the presence of considerable sensor noise.

I. INTRODUCTION

Planes are a prevalent and often dominating geometrical feature in both indoor and outdoor environments. Indoor structures such as floors, walls, cupboards, tables, chair seats and backs etc. as well as outdoor ones like streets, buildings, cars, and pavements can be efficiently represented as a composition of planar surfaces. Such environment representations are an important aspect of mobile robotics as they can drastically reduce the costs of storing and processing a map in comparison to raw point clouds. Plane segmentation also finds application in SLAM [1] and visual odometry where planes are considered to be good geometric features to match between two frames. Furthermore, as supporting surfaces such as a table top or the floor are typically planes, subtracting the points that lie on planes is often a useful operation when it comes to isolating and detecting objects of interest in point clouds.

In this paper, we tackle the problem of plane segmentation of organized point clouds, specifically those that can be obtained from light and cheap RGB-D sensors that can be fitted to mobile robots. Organized point clouds can answer point neighbor queries efficiently because of their image-like, two-dimensional grid structure. Our approach exploits this property and uses a region growing algorithm to first coarsely segment planes whose inliers form a connected component within the grid. In a second step, the found plane segments are merged into larger ones, keeping both the plane and neighborhood constraints in mind. The problem of noise is endemic to all robotic sensors and RGB-D cameras are no exception. To account for this, our method does not use per-point precomputed normals during the region growing, as computing reasonably noise-free normals is not trivial [2] [3]. We also pay special attention to choosing the seed points of the region growing algorithm, being careful to avoid object edges or especially noisy areas of the RGB-D image. To keep our approach time-efficient, we use an optimized version of the traditional four-neighbor flood fill algorithm that makes fewer computations per pixel/point to determine pixel membership.

Fig. 1: Planar segmentation of a tabletop scene using our approach. The orientation of the plane normals is indicated by arrows.

All authors are with the Humanoid Robots Lab, University of Bonn, Germany.

II. RELATED WORK

Plane segmentation is a widely studied problem and several different approaches have been tried, such as Random Sample Consensus (RANSAC) [4], Hough Transformation [5], Region Growing, and Expectation Maximization [6]. Although the original RANSAC [4] algorithm was meant for single-model, noise-tolerant inlier estimation, several variants have been developed that are capable of multi-model segmentation, such as MultiRANSAC [7], RANSAC in conjunction with Normal Distribution Transforms (NDT) [8], or RANSAC in conjunction with Surfels [9]. Sophisticated algorithms based on RANSAC have been deployed not only for plane, but also for multi-primitive segmentation [10] [11]. However, these methods, although robust against noise and outliers, are not real-time capable. The appearance of spurious non-connected planes is yet another problem. Although Hough transforms provide an alternative, Borrmann et al. [12] evaluated different Hough Transform-based methods and found that the algorithm needed a considerable amount of memory and processing time, making the approach unsuitable for real-time robotic applications.

In contrast, region growing is an efficient method for plane segmentation and is especially suited to organized or structured point clouds where neighborhood information is readily available. Holz et al. [13] used pre-computed per-point normals to collect connected regions. Hahnel et al. [14] and Poppinga et al. [15] used random seeds and collected connected neighbors in an organized point cloud. Trevor et al. [16] used fast connected-component labeling and subsequent plane refinement for multi-plane segmentation but did not use seeds like [14] or [15]. The approach of [16] is also available in the open source Point Cloud Library (PCL) [17]. Xiao et al. [18] used a sub-window based iterative region growing approach that worked in both structured and unstructured point clouds. Proenca et al. [19] proposed a region growing method that used a histogram of image patch normals to guide patch membership in a plane segment.

Other methods that do not strictly belong to the above categories include Holz et al. [3], who used clustering in both normal space and spherical coordinates to find plane segments, and Feng et al. [20], who generalized two-dimensional line extraction to three-dimensional plane extraction in an organized point cloud and used Agglomerative Hierarchical Clustering (AHC) to merge extracted plane segments. Erdogan et al. [21] used an advanced Markov Chain Monte Carlo (MCMC) method to efficiently search the space of planar segmentations and merged them using linear least-squares fits.

In this work, we too use a seeded region growing approach, but unlike previous approaches, we use an algorithm that accesses each pixel only a little more than once on average to determine its membership in a planar connected component. This is more efficient than traditional four-neighbor region growing, which spends four such accesses per pixel. We also abstain from using per-pixel normals, which makes our method more tolerant to noise. Instead, we determine the surface normal only for carefully chosen seeds that do not lie on object edges or near holes in the depth image. This ensures that a simple cross product in the immediate neighborhood of the seed already provides us with a reliable plane normal. This way, we also avoid performing expensive operations such as least-squares fits or PCA to compute plane parameters.

III. OUR APPROACH

A. The Region Growing Algorithm

RGB-D data obtained from sensors like the Microsoft Kinect or the Asus Xtion series of depth cameras have an organized two-dimensional grid structure akin to images. Each pixel $p = (i, j) \in \mathbb{Z}^2$ in a frame of the depth image obtained from these sensors corresponds to a point $P = (x, y, z) \in \mathbb{R}^3$ in three-dimensional space. The two-dimensional pixel grid allows efficient access to the neighborhood of a three-dimensional point by accessing the points that correspond to pixel neighbors. We exploit this property and define the neighborhood $\mathcal{N}(P_{i,j}) = \{P_{i-1,j}, P_{i+1,j}, P_{i,j-1}, P_{i,j+1}\}$ of a point $P_{i,j}$ that corresponds to a pixel $p_{i,j}$ in the $i$-th row and $j$-th column of the depth image. The set contains the four points that correspond to the up, down, left, and right neighbor pixels of $p_{i,j}$. We define the neighborhood of a pixel $\mathcal{N}(p_{i,j})$ in the same fashion. Note, however, that a three-dimensional neighbor is guaranteed to be a pixel neighbor but not vice versa. Cases where pixel neighbors are not neighbors in three-dimensional space, for example across a depth discontinuity, can be detected by checking the Euclidean distance between neighbors against a threshold.

Region growing, or flooding, begins with a seed pixel and spreads to its four-connected neighbors $\mathcal{N}(p_{\text{seed}})$, recursively collecting neighbors of neighbors as long as the neighbor has not already been collected and the distance function $f(p, p')$ is satisfied. Eventually, no more valid neighbors are found and the recursion terminates with a set $S$ of collected pixels. The naive recursive algorithm accesses each pixel four times. For a more efficient way of region growing that accesses each pixel a little more than once on average, we adapt the algorithm described in [22] and [23]. Given a seed point, the algorithm proceeds by identifying a scanline of pixels called a span, all of which satisfy the distance function $f$ with respect to the seed. A span can be denoted as $(i, j_{\text{left}}, j_{\text{right}}, \kappa)$. The pixels constituting the span all belong to the same row $i$ of the depth image, and the membership of each pixel in the span is decided by only one comparison made with either its left neighbor $p_{i,j-1}$ or its right neighbor $p_{i,j+1}$. $\kappa \in \{-1, 1\}$ indicates the direction of the parent span, which can only be in the row above or below. Once the seed span has been found, the algorithm proceeds by pushing unexplored spans above and below onto a stack and examining them in a similar fashion to a depth-first graph traversal. The authors of [23] call these unexplored spans shadows, which can be one of the three types illustrated in Fig. 2. Each span of connected pixels may spawn one or more shadows. The algorithm scans each popped shadow and finds all spans within it, while the child shadows keep track of their parent span. Apart from the reduced number of accesses, growing the region one span of pixels at a time favors spatial locality in memory and supports faster access through caching.
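The span-wise growth described above can be illustrated with a minimal Python sketch. It is a simplified scanline fill and not the shadow-based algorithm of [22] and [23]: it grows one row span at a time from a stack, but it does not keep the parent-span bookkeeping, so it performs more comparisons per pixel than the algorithm used in the paper. The function name `span_flood_fill` and the callback `connected`, which stands in for the distance function $f(p, p')$, are hypothetical names chosen for this illustration.

```python
from typing import Callable, List, Set, Tuple

Pixel = Tuple[int, int]  # (row i, column j)


def span_flood_fill(
    rows: int,
    cols: int,
    seed: Pixel,
    connected: Callable[[Pixel, Pixel], bool],
) -> Set[Pixel]:
    """Collect the four-connected region around `seed`, one row span at a time.

    `connected(p, q)` decides whether neighbor q may join the region through
    the already accepted pixel p (the role of f(p, p') in the text).
    """
    region: Set[Pixel] = set()
    stack: List[Pixel] = [seed]  # entry pixels of spans still to explore
    while stack:
        i, j = stack.pop()
        if (i, j) in region:
            continue
        region.add((i, j))
        # Grow the span to the left and right; each new pixel is compared
        # only with its already accepted horizontal neighbor.
        left = j
        while (left > 0 and (i, left - 1) not in region
               and connected((i, left), (i, left - 1))):
            left -= 1
            region.add((i, left))
        right = j
        while (right < cols - 1 and (i, right + 1) not in region
               and connected((i, right), (i, right + 1))):
            right += 1
            region.add((i, right))
        # Queue accepted pixels in the rows above and below the span.
        for k in (-1, 1):
            ni = i + k
            if 0 <= ni < rows:
                for nj in range(left, right + 1):
                    if (ni, nj) not in region and connected((i, nj), (ni, nj)):
                        stack.append((ni, nj))
    return region


# Toy "depth image": pixels are connected when they carry the same value.
grid = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 0],
]


def same_value(p: Pixel, q: Pixel) -> bool:
    return grid[p[0]][p[1]] == grid[q[0]][q[1]]


region = span_flood_fill(3, 4, (0, 0), same_value)
```

In the toy grid, the two pixels with value 1 in the last column are not reached from the seed because they are not four-connected to it, which mirrors the connected-component constraint on plane inliers.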

Fig. 2: The three kinds of turns and the shadows generated by the flood fill algorithm. The shadows of a shadow indicate how the algorithm proceeds. The crosses indicate pixels that pass the membership criteria.


B. The Distance Function

Our distance function

$$f(p, p') = f_{\text{plane}}(p_{\text{seed}}, p') < \sigma_{\text{plane}} \;\wedge\; f_{\text{euclidean}}(p, p') < \sigma_{\text{euclidean}} \tag{1}$$

determines the membership of a pixel $p'$ in a plane segment during a flood where $p \in \mathcal{N}(p')$ is already a member of the segment. The distance function consists of two components. The first component

$$f_{\text{plane}}(p_{\text{seed}}, p') = |(P_{\text{seed}} - P') \cdot n_{\text{seed}}| \tag{2}$$

measures the perpendicular distance of a point $P'$ to the plane defined by the seed point $P_{\text{seed}}$ and its normal $n_{\text{seed}}$. The $(\cdot)$ operation represents the vector dot product. Note that the seed remains constant during a flood and only the normal of the seed is used to determine the plane distance. The second component involves a Euclidean distance function

$$f_{\text{euclidean}}(p, p') = \|P - P'\| \tag{3}$$

that is used to ensure that pixel neighbors are also neighbors in three-dimensional space. The two thresholds $\sigma_{\text{plane}}$ and $\sigma_{\text{euclidean}}$ determine whether $p'$ is to be connected to $p$. While we provide $\sigma_{\text{plane}}$ as a parameter to our algorithm, we infer $\sigma_{\text{euclidean}}$ from the seed selection process described in Sec. III-C.
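To make Eqs. (1) to (3) concrete, the following Python sketch evaluates the two distance components for individual points. The function names and the example coordinates are assumptions made for illustration; only the formulas are taken from the text, and $n_{\text{seed}}$ is assumed to have unit length.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def f_plane(P_seed: Vec3, n_seed: Vec3, P_new: Vec3) -> float:
    """Eq. (2): perpendicular distance of P_new to the seed plane."""
    return abs(dot(sub(P_seed, P_new), n_seed))


def f_euclidean(P: Vec3, P_new: Vec3) -> float:
    """Eq. (3): Euclidean distance between two pixel neighbors in 3D."""
    d = sub(P, P_new)
    return math.sqrt(dot(d, d))


def is_member(P_seed: Vec3, n_seed: Vec3, P: Vec3, P_new: Vec3,
              sigma_plane: float, sigma_euclidean: float) -> bool:
    """Eq. (1): both thresholds must hold for P_new to join through P."""
    return (f_plane(P_seed, n_seed, P_new) < sigma_plane
            and f_euclidean(P, P_new) < sigma_euclidean)


# Illustrative values (metres): a seed plane facing the camera at z = 1.
P_seed = (0.0, 0.0, 1.0)
n_seed = (0.0, 0.0, 1.0)
P = (0.0, 0.0, 1.0)               # already a member of the segment
P_near = (0.005, 0.0, 1.004)      # slightly noisy point on the same surface
P_off_plane = (0.005, 0.0, 1.05)  # close in the image, but 5 cm off the plane
P_far = (0.5, 0.0, 1.0)           # on the plane, but not a 3D neighbor

near_ok = is_member(P_seed, n_seed, P, P_near, 0.01, 0.02)
off_plane_ok = is_member(P_seed, n_seed, P, P_off_plane, 0.01, 0.02)
far_ok = is_member(P_seed, n_seed, P, P_far, 0.01, 0.02)
```

The third example shows why the Euclidean component is needed: a point can lie exactly on the seed plane and still be rejected because it is not a three-dimensional neighbor of the pixel it would be connected through.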

As an extension to our flooding algorithm, we maintain a distance buffer $D(p)$ that keeps track of the minimum plane distance encountered for each pixel $p$. We initialize $D(p)$ to $\infty$ for all pixels and update it with the least encountered plane distance value during the flood. We assign a pixel to a plane segment only if

$$f_{\text{plane}}(p_{\text{seed}}, p') < D(p'), \tag{4}$$

but we also reassign a pixel if a lower plane distance is encountered during a later flood. Although this operation disconnects earlier labeled pixels from their segments, we find that it yields a better segmentation of the scene, because spurious planes that arise from noise are overwritten during later floods that start from better seeds. We therefore need a subsequent operation that relabels the segments that have been disconnected by the best-plane assignment. Similar to [16], we use an efficient connected component labeling algorithm based on the union-find data structure. At the end of this operation, we obtain a four-connected over-segmentation of the depth image.
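The distance buffer update of Eq. (4) and the subsequent relabeling can be sketched in Python as follows. This is a minimal illustration under assumptions: the names `assign_if_closer`, `relabel_components`, and `UNASSIGNED` are hypothetical, and the two-pass union-find labeling shown here is a generic textbook variant, not necessarily the implementation used by the authors or by [16].

```python
import math
from typing import Dict, List

UNASSIGNED = -1  # hypothetical marker for pixels without a plane label


def assign_if_closer(D: List[List[float]], labels: List[List[int]],
                     i: int, j: int, plane_dist: float, label: int) -> bool:
    """Eq. (4): (re)assign pixel (i, j) only if the plane distance improves."""
    if plane_dist < D[i][j]:
        D[i][j] = plane_dist
        labels[i][j] = label
        return True
    return False


def find(parent: List[int], x: int) -> int:
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving
        x = parent[x]
    return x


def relabel_components(labels: List[List[int]]) -> List[List[int]]:
    """Give every four-connected piece of equal plane label its own id."""
    rows, cols = len(labels), len(labels[0])
    parent = list(range(rows * cols))

    def union(a: int, b: int) -> None:
        ra, rb = find(parent, a), find(parent, b)
        if ra != rb:
            parent[rb] = ra

    # First pass: join each pixel with its left and upper neighbor
    # whenever both carry the same plane label.
    for i in range(rows):
        for j in range(cols):
            if labels[i][j] == UNASSIGNED:
                continue
            idx = i * cols + j
            if j > 0 and labels[i][j - 1] == labels[i][j]:
                union(idx - 1, idx)
            if i > 0 and labels[i - 1][j] == labels[i][j]:
                union(idx - cols, idx)

    # Second pass: map union-find roots to consecutive component ids.
    new_ids: Dict[int, int] = {}
    out = [[UNASSIGNED] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if labels[i][j] == UNASSIGNED:
                continue
            root = find(parent, i * cols + j)
            out[i][j] = new_ids.setdefault(root, len(new_ids))
    return out


# A first flood labels all pixels; a later flood with a better seed
# overwrites three of them and cuts the first segment into two pieces.
D = [[math.inf] * 4 for _ in range(2)]
labels = [[UNASSIGNED] * 4 for _ in range(2)]
for i in range(2):
    for j in range(4):
        assign_if_closer(D, labels, i, j, 0.008, 0)
for (i, j) in [(0, 2), (1, 1), (1, 2)]:
    assign_if_closer(D, labels, i, j, 0.002, 1)
rejected = assign_if_closer(D, labels, 0, 2, 0.005, 2)  # worse than 0.002
components = relabel_components(labels)
```

In this toy example, the second flood splits the pixels of label 0 into two disconnected pieces, and the relabeling pass gives each piece its own component id.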

C. Seed Selection

Seed selection significantly affects the quality of the plane segment gathered by a region growing operation. A seed that lies on the edge of an object or in a bad region of the depth image will represent a spurious plane that is absent in the original scene. We scan the depth image sequentially in row-major order and consider every unmarked pixel to be a seed candidate. However, without careful seed selection, using a sequential scanning technique would always lead to the next seed lying on the boundary of a plane segment that has already been flooded. We use a seed size parameter $\sigma_{\text{seed}}$ to define a cube of side length $\sigma_{\text{seed}}$ around the point $P$ under test and project the diagonal end points onto the camera plane to find the minimum pixel distance $\rho$ required to ensure the seed size. This is illustrated in Fig. 3. The projection operation yields four end points $p_{\text{up}}$, $p_{\text{down}}$, $p_{\text{left}}$, and $p_{\text{right}}$ in pixel space, separated by $\rho$ pixels from the center pixel $p$ corresponding to point $P$. Similar to [24], to check for distance discontinuity, we ensure that the centroid of the points corresponding to the four pixel endpoints does not deviate from $P$ by more than $\sigma_{\text{seed}}$, i.e., the seed area does not have a distance discontinuity if

$$\left\| \frac{P_{\text{up}} + P_{\text{down}} + P_{\text{left}} + P_{\text{right}}}{4} - P \right\| \le \sigma_{\text{seed}}. \tag{5}$$

Fig. 3: Using camera projection to find the seed radius in pixels.
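The following Python sketch illustrates the seed radius and the distance discontinuity test of Eq. (5). The radius computation is an approximation under a pinhole camera assumption and is not the authors' exact construction: it projects half the cube side length at the depth of the candidate point, and `focal_px` (the focal length in pixels) is a hypothetical parameter whose value of 525 below is only an illustrative choice. The centroid test follows Eq. (5) directly.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def seed_radius_px(z: float, sigma_seed: float, focal_px: float) -> int:
    """Approximate pixel radius rho covering half the seed cube at depth z.

    Assumption: pinhole projection, where a length s at depth z spans
    roughly focal_px * s / z pixels.
    """
    return max(1, math.ceil(focal_px * (sigma_seed / 2.0) / z))


def no_distance_discontinuity(P: Vec3, endpoints: Sequence[Vec3],
                              sigma_seed: float) -> bool:
    """Eq. (5): the centroid of the four endpoints must stay close to P."""
    centroid = tuple(sum(c) / len(endpoints) for c in zip(*endpoints))
    return math.dist(centroid, P) <= sigma_seed


rho = seed_radius_px(z=1.0, sigma_seed=0.06, focal_px=525.0)

P = (0.0, 0.0, 1.0)
# Endpoints (up, down, left, right) on a flat surface facing the camera.
flat = [(0.0, -0.03, 1.0), (0.0, 0.03, 1.0), (-0.03, 0.0, 1.0), (0.03, 0.0, 1.0)]
# Same layout, but the right endpoint falls onto a background 1 m further away.
step = [(0.0, -0.03, 1.0), (0.0, 0.03, 1.0), (-0.03, 0.0, 1.0), (0.03, 0.0, 2.0)]

flat_ok = no_distance_discontinuity(P, flat, sigma_seed=0.06)
step_ok = no_distance_discontinuity(P, step, sigma_seed=0.06)
```

In the second example the centroid is pulled 0.25 m behind the candidate point, so the candidate is rejected as lying next to a depth jump.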

For checking curvature discontinuities, we employ a two-grained approach. In the coarse-grained check, we skip all seed candidates for which the normals obtained as cross products with the four neighbors do not agree with each other. Let $n_{\text{up}}$, $n_{\text{down}}$, $n_{\text{left}}$, and $n_{\text{right}}$ be the unit normals calculated with the four neighbors of a pixel $p$ using the vector cross product. We then select the pixel $p$ as a seed pixel only if

$$\forall n_i, n_j \in \{n_{\text{up}}, n_{\text{down}}, n_{\text{left}}, n_{\text{right}}\},\ i \neq j:\quad 1 - |n_i \cdot n_j| \le \theta_{\text{angle}}, \tag{6}$$

where $\theta_{\text{angle}}$ is the normal orientation difference threshold. The coarse-grained check takes care of cases where the seed point lies on or near an edge but fails in cases where there is more than one edge within the seed area. To handle these cases, we use a fine-grained check that looks for curvature discontinuities along the four pixel segments defined by the end points $p_{\text{up}}$, $p_{\text{down}}$, $p_{\text{left}}$, $p_{\text{right}}$ with the candidate pixel $p$ at the center. We accept the seed only if

$$\forall P_i \in \{P_1, \ldots, P_\rho\}:\quad 1 - |(P_{\text{endpoint}} - P) \cdot (P_i - P)| \le \theta_{\text{angle}}. \tag{7}$$

$P_i$ is the point corresponding to pixel $p_i$ along the segment starting at the candidate pixel $p$ and ending at $p_{\text{endpoint}} \in \{p_{\text{up}}, p_{\text{down}}, p_{\text{left}}, p_{\text{right}}\}$. See Fig. 4 for examples of distance and curvature discontinuities encountered during seed finding. We infer $\sigma_{\text{euclidean}}$ from the maximum average distance between the endpoints and the seed, i.e.,

$$\sigma_{\text{euclidean}} = \max\left\{ \frac{\|P_i - P_{\text{seed}}\|}{\rho} \right\}, \quad \text{where } P_i \in \{P_{\text{up}}, P_{\text{down}}, P_{\text{left}}, P_{\text{right}}\}. \tag{8}$$

Fig. 4: Distance and curvature discontinuities. Left: Distance discontinuity with $\delta > \sigma_{\text{seed}}$. Middle: Coarse-grained curvature discontinuity. Right: Fine-grained curvature discontinuity. The dotted yellow line indicates one of the four segments along which we check for a discontinuity.
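The coarse-grained check of Eq. (6) and the threshold inference of Eq. (8) can be sketched in Python as follows. The text does not spell out which neighbor pairs enter each cross product, so this sketch makes an assumption: each of the four normals is the cross product of the difference vectors to two adjacent neighbors, taken in the cyclic order up, right, down, left. The function names are hypothetical.

```python
import itertools
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def sub(a: Vec3, b: Vec3) -> Vec3:
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a: Vec3, b: Vec3) -> Vec3:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def coarse_normals_agree(P: Vec3, P_up: Vec3, P_right: Vec3, P_down: Vec3,
                         P_left: Vec3, theta_angle: float) -> bool:
    """Eq. (6): all pairs of the four local unit normals must agree."""
    vecs = [sub(Q, P) for Q in (P_up, P_right, P_down, P_left)]
    normals = []
    for k in range(4):
        # Assumed pairing: neighbor k with the next neighbor in cyclic order.
        n = cross(vecs[k], vecs[(k + 1) % 4])
        length = math.sqrt(dot(n, n))
        if length == 0.0:
            return False  # degenerate geometry, reject the candidate
        normals.append((n[0] / length, n[1] / length, n[2] / length))
    return all(1.0 - abs(dot(a, b)) <= theta_angle
               for a, b in itertools.combinations(normals, 2))


def infer_sigma_euclidean(P_seed: Vec3, endpoints: Sequence[Vec3],
                          rho: int) -> float:
    """Eq. (8): largest endpoint distance, averaged over rho pixel steps."""
    return max(math.dist(Pi, P_seed) / rho for Pi in endpoints)


P = (0.0, 0.0, 1.0)
up, down, left = (0.0, -0.01, 1.0), (0.0, 0.01, 1.0), (-0.01, 0.0, 1.0)
right_flat = (0.01, 0.0, 1.0)   # same plane as the other neighbors
right_edge = (0.01, 0.0, 1.01)  # surface bends away by 45 degrees

flat_ok = coarse_normals_agree(P, up, right_flat, down, left, theta_angle=0.1)
edge_ok = coarse_normals_agree(P, up, right_edge, down, left, theta_angle=0.1)

endpoints = [(0.0, -0.03, 1.0), (0.0, 0.03, 1.0),
             (-0.03, 0.0, 1.0), (0.032, 0.0, 1.0)]
sigma_euclidean = infer_sigma_euclidean(P, endpoints, rho=16)
```

With the 45 degree bend on the right, two of the four normals tilt away from the other two, the pairwise value $1 - |n_i \cdot n_j|$ rises to about 0.29, and the candidate is skipped.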

D. Plane Merging

Due to the presence of noise, at the end of the region growing process, we end up with a set of $N$ over-segmented planes labeled $\{l_1, l_2, \ldots, l_N\}$. In the merging step, we merge plane segments that have similar normal orientation and are also neighbors not only in pixel space but also in three-dimensional space. We scan through the four-connected segments obtained at the end of region growing and relabeling and keep track of the neighboring plane segments encountered. The stopping conditions for a flood are either of the following:

1) The next pixel is bad data.
2) The next pixel does not satisfy the distance threshold (both plane and Euclidean).

If we stop due to the second condition, we check if the next pixel is a Euclidean neighbor of the current pixel and, if so, add its label to the list of neighbors of the current plane segment. This ensures that two segments that are pixel neighbors are three-dimensional neighbors as well. Denoting the label of the current pixel as $l_{\text{current}}$ and that of the neighbor pixel as $l_{\text{neighbor}}$, we also add $l_{\text{current}}$ to the list of neighbors of $l_{\text{neighbor}}$. In this way, we build up a graph structure that represents the neighborhood relationships of the flooded plane segments. This gives us the list of potential candidates to be merged. Considering only the physical neighbors of each plane segment makes the merge operation much more efficient. We use the Agglomerative Hierarchical Clustering algorithm for the merge operation. For two plane segments $l$ and $l'$ to be mergeable, the condition

$$1 - |n_{\text{seed}} \cdot n'_{\text{seed}}| < \theta_{\text{angle}} \;\wedge\; \big( |(P_{\text{centroid}} - P'_{\text{centroid}}) \cdot n_{\text{seed}}| < \sigma_{\text{plane-merge}} \;\vee\; |(P_{\text{centroid}} - P'_{\text{centroid}}) \cdot n'_{\text{seed}}| < \sigma_{\text{plane-merge}} \big) \tag{9}$$

must be satisfied. Here, $P_{\text{centroid}}$ and $P'_{\text{centroid}}$ denote the three-dimensional centroids of the planes labeled $l$ and $l'$, respectively. Similarly, $n_{\text{seed}}$ and $n'_{\text{seed}}$ denote the seed normals.

TABLE I: IoUs and average runtimes in the simulated scenes.

(a) IoU

Scene   RANSAC   CC     Ours
1       0.60     0.59   0.74
2       0.99     0.93   0.99
3       1.00     0.99   0.99
4       0.66     0.66   0.70
5       0.17     0.71   0.94

(b) Runtime (ms)

Scene   RANSAC   CC      Ours
1       96.34    35.15   31.58
2       13.58    34.82   27.22
3       35.91    34.15   28.88
4       21.12    34.86   27.66
5       18.56    34.76   27.18

For each plane $l$ found during the region growing step, we look through its neighbor set and also maintain a list of planes that have already been merged into it, which is empty for all planes at the start of the algorithm. Once a pair of planes $l$ and $l'$ passes the merge conditions, we obtain a new centroid and normal representing the combined plane by adding the centroids and normals of the individual planes weighted by their inlier counts. The inlier set and the neighbor set of the combined plane contain the union of the inlier sets and neighbor sets of the individual constituent planes, respectively. After iterating through all planes found during the region growing step, we obtain the final set of merged planes representing the full planar segmentation of the scene.
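The merge test of Eq. (9) and the weighted combination of two planes can be sketched in Python as below. Two details are assumptions of this sketch rather than statements from the text: the second normal is flipped when it points away from the first (Eq. (9) only constrains $|n_{\text{seed}} \cdot n'_{\text{seed}}|$), and the weighted normal sum is renormalized to unit length. The function names are hypothetical.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def mergeable(n: Vec3, c: Vec3, n2: Vec3, c2: Vec3,
              theta_angle: float, sigma_plane_merge: float) -> bool:
    """Eq. (9): similar orientation and small offset along either normal."""
    d = (c[0] - c2[0], c[1] - c2[1], c[2] - c2[2])
    similar = 1.0 - abs(dot(n, n2)) < theta_angle
    close = (abs(dot(d, n)) < sigma_plane_merge
             or abs(dot(d, n2)) < sigma_plane_merge)
    return similar and close


def merge_planes(n: Vec3, c: Vec3, count: int,
                 n2: Vec3, c2: Vec3, count2: int):
    """Combine centroids and normals weighted by the inlier counts."""
    if dot(n, n2) < 0.0:
        n2 = (-n2[0], -n2[1], -n2[2])  # assumption: align normal directions
    total = count + count2
    centroid = tuple((count * a + count2 * b) / total for a, b in zip(c, c2))
    summed = tuple(count * a + count2 * b for a, b in zip(n, n2))
    length = math.sqrt(dot(summed, summed))
    normal = tuple(v / length for v in summed)  # assumption: renormalize
    return normal, centroid, total


# Two pieces of the same tabletop, a wall, and a parallel shelf 0.5 m away.
table_a = ((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), 300)
table_b = ((0.0, 0.0, 1.0), (0.5, 0.0, 1.01), 100)
wall = ((1.0, 0.0, 0.0), (1.0, 0.0, 1.0), 200)
shelf = ((0.0, 0.0, 1.0), (0.0, 0.0, 1.5), 150)

same_table = mergeable(table_a[0], table_a[1], table_b[0], table_b[1], 0.1, 0.03)
table_wall = mergeable(table_a[0], table_a[1], wall[0], wall[1], 0.1, 0.03)
table_shelf = mergeable(table_a[0], table_a[1], shelf[0], shelf[1], 0.1, 0.03)
normal, centroid, inliers = merge_planes(*table_a, *table_b)
```

The shelf example shows the role of the second clause in Eq. (9): parallel planes have matching normals but are kept apart because their centroids differ along the normal direction.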

IV. EXPERIMENTAL EVALUATION

The main focus of this paper is to obtain a fast and accurate plane segmentation in an organized point cloud. To this end, we first show the results of applying our method in simulated environments where the ground-truth plane segmentation is known. For comparison, we also apply two other popular plane segmentation approaches available in the open source Point Cloud Library [17], namely, Sequential RANSAC [4] and Organized Multi-Plane Segmentation using Connected Components (CC) [16]. The simulated environments consist of five scenes constructed with basic geometric primitives like boxes, spheres, and polygons. These scenes are shown in Fig. 6. The segmentation accuracy and the computation times of all three methods in each scene are tabulated in Tab. I. We measure accuracy in terms of the Intersection over Union (IoU) of found planes with the ground-truth planes. Computation times were measured on an Intel i7 8700 3.2 GHz CPU by averaging over a thousand frames. Overall, the results in Tab. I support the accuracy and efficiency of our approach relative to the PCL methods: it attains the highest IoU in Scenes 1, 4, and 5, is within 0.01 of the best IoU in Scenes 2 and 3, and runs faster than CC in all five scenes, although Sequential RANSAC is faster in Scenes 2, 4, and 5.
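As a reminder of how the accuracy measure works, the following Python sketch computes the IoU of a single found plane with a single ground-truth plane, both given as sets of pixel coordinates. How found planes are matched to ground-truth planes and how per-plane scores are aggregated into the per-scene values of Tab. I is not described in the text, so this sketch deliberately covers only the per-pair computation; the function name `iou` is hypothetical.

```python
from typing import Set, Tuple

Pixel = Tuple[int, int]


def iou(found: Set[Pixel], truth: Set[Pixel]) -> float:
    """Intersection over Union of two pixel sets (0.0 if both are empty)."""
    union = found | truth
    if not union:
        return 0.0
    return len(found & truth) / len(union)


found_plane = {(0, 0), (0, 1), (1, 0)}
truth_plane = {(0, 1), (1, 0), (1, 1)}
score = iou(found_plane, truth_plane)      # 2 shared pixels, 4 in the union
perfect = iou(truth_plane, truth_plane)
```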

The real-world scenes we chose for our experiments and their segmentation results are shown in Fig. 7. We used an ASUS Xtion Pro Live RGB-D camera to record our test scenes. The runtimes of all three tested methods are shown in Tab. II. All runtimes are averaged over a thousand frames. Additionally, Fig. 5 shows a pie chart of the percentage of computation time spent in the different phases of our algorithm.

Fig. 5: Times spent in different phases of the algorithm.

Fig. 6: The top row (a)-(e) shows the five noise-free simulated scenes (Scene 1 to Scene 5) we used for testing. The camera position is indicated by the axes. Individual planes are marked in different colors. Black areas are non-planar. The second row (f)-(j) shows segmentations performed using Sequential RANSAC. The third row (k)-(o) shows segmentations using the PCL Connected Component method. The last row (p)-(t) shows the scenes segmented using our method.

For Sequential RANSAC, we kept the point-plane distance parameter fixed at 0.01 m and set the minimum inlier count to 100. Since the PCL connected component algorithm requires pre-computed normals, we set the parameters of the normal computation algorithm to the following values. Normal smoothing size: 6, normal depth change factor: 0.02. For the plane computation algorithm, we set the angular threshold to 0.09, the plane threshold to 0.02 m, and the minimum inlier count to 50. We chose these values by visual inspection as they gave us the best segmentation for all scenes. For our plane segmentation algorithm, the following parameter values were used: $\sigma_{\text{plane}}$ was set to 0.01 m, $\sigma_{\text{seed}}$ was set to 0.06 m, $\theta_{\text{angle}}$ was set to 0.1 rad (this corresponds to about 6° of angular tolerance), and $\sigma_{\text{plane-merge}}$ to 0.03 m.

The Sequential RANSAC algorithm is able to find large planes with many inliers within the image. However, due to its random point selection, it often finds planes that do not correspond to any planes in the original scene. This is especially true when there are many smaller planes of roughly equal size, for example the stairs shown in Fig. 6(j) and Fig. 7(g). In these cases, RANSAC tends to collect inliers that happen to be coplanar by coincidence, but do not belong to the same planar object. Region growing with neighborhood constraints appears to produce better results in such cases.

TABLE II: Average runtimes of the real-world scenes.

Scene Id     RANSAC      CC         Ours
Table-top    127.28 ms   39.76 ms   44.36 ms
Stairs       174.15 ms   35.48 ms   33.85 ms
Skateboard   203.38 ms   38.59 ms   55.67 ms
Nao          387.43 ms   40.54 ms   42.13 ms
Chairs       606.71 ms   40.84 ms   36.72 ms

The CC algorithm essentially works by clustering pre-computed normals into plane segments. Although the algorithm uses a sophisticated method to calculate the normals, it is still highly susceptible to noise, as even a slight depth difference between pixels in a neighborhood can lead to extremely divergent normal directions. For this reason, even with a generous angular threshold, the algorithm has a tendency towards over-segmentation, as can be seen in Fig. 6(n), Fig. 6(o), Fig. 7(k), and Fig. 7(o). Due to noisy normals, it can also miss planes, for example the die and the cupboard in Fig. 7(n). It also misses the first step of the stairs in Fig. 7(l) and the second step of the stairs in Fig. 6(o).

Fig. 7: The top row (a)-(e) shows the five noisy real-world scenes we used for testing: (a) Tabletop, (b) Stairs, (c) Skateboard, (d) Clutter, (e) Chairs. The second row (f)-(j) shows segmentations performed using Sequential RANSAC. The third row (k)-(o) shows segmentations using the PCL Connected Component method. The last row (p)-(t) shows the scenes segmented using our method.

TABLE III: Runtimes at lower resolutions.

Resolution   Runtime
640 × 480    44.36 ms
320 × 240    11.63 ms
160 × 120    3.07 ms
80 × 60      0.86 ms

We also experimented with segmentation at lower resolutions. We found that sub-sampling the original image at fixed intervals often led to better, more consistent results. This is because sub-sampling tends to reduce the noise present in the original resolution by essentially performing a smoothing operation. To illustrate this, we show the tabletop scene in Fig. 8 segmented at successively lower resolutions. Of course, we have to strike a balance between noise reduction and losing details from the original scene. In our experiments, we found that sub-sampling beyond a factor of eight tends to lose objects present in the original scene. Sub-sampling has the added advantage that the segmentation times are drastically reduced, as can be seen in Tab. III. At the lowest resolution, we are able to perform an adequate segmentation of the scene in less than a millisecond.
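Sub-sampling at fixed intervals can be sketched in a few lines of Python. The helper `subsample` is a hypothetical name, and plain nested lists stand in for the depth image; keeping every `step`-th pixel in both directions is one straightforward reading of "sub-sampling at fixed intervals" and is assumed here. The last line relates two runtimes reported in Tab. III.

```python
from typing import List


def subsample(depth: List[List[float]], step: int) -> List[List[float]]:
    """Keep every `step`-th row and every `step`-th column of the image."""
    return [row[::step] for row in depth[::step]]


# Placeholder 640 x 480 depth image (480 rows, 640 columns) filled with zeros.
full = [[0.0] * 640 for _ in range(480)]

shapes = {}
for step in (2, 4, 8):
    small = subsample(full, step)
    shapes[step] = (len(small[0]), len(small))  # (width, height)

# Runtimes from Tab. III: halving the resolution in both directions leaves
# a quarter of the pixels and cuts the runtime by a factor of roughly 3.8.
speedup_half_resolution = 44.36 / 11.63
```

The three sub-sampling steps reproduce the resolutions listed in Tab. III, with a step of eight giving the 80 × 60 image that the text identifies as the practical lower limit.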

V. CONCLUSION

In this paper, we presented an improved approach towards plane segmentation in depth images using an efficient seeded region growing algorithm coupled with careful seed selection and avoidance of problematic surface normals. To obtain a final smooth segmentation, we augmented the region growing with a hierarchical plane merging algorithm based on neighborhood relationships between planes obtained during the region growing phase. We avoided per-point normal computation, which allowed us to handle noisy scenes better. Through experiments on both simulated and real-world environments, we showed that our approach is real-time capable and can provide better segmentation results than the existing state-of-the-art for robotic applications.

Fig. 8: Planar segmentation of a tabletop scene at different depth image resolutions: (a) 320 × 240, (b) 160 × 120, (c) 80 × 60.


REFERENCES

[1] R. F. Salas-Moreno, B. Glocker, P. H. Kelly, and A. J. Davison, “Dense planar slam,” in Proc. of the Intl. Symposium on Mixed and Augmented Reality (ISMAR). IEEE, 2014, pp. 157–164.

[2] S. Holzer, R. B. Rusu, M. Dixon, S. Gedikli, and N. Navab, “Adaptive neighborhood selection for real-time surface normal estimation from organized point cloud data using integral images,” in Proc. of the IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS). IEEE, 2012, pp. 2684–2689.

[3] D. Holz, S. Holzer, R. B. Rusu, and S. Behnke, “Real-time plane segmentation using rgb-d cameras,” in Robot Soccer World Cup. Springer, 2011, pp. 306–317.

[4] M. A. Fischler and R. C. Bolles, “Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography,” Commun. ACM, vol. 24, no. 6, pp. 381–395, June 1981.

[5] D. H. Ballard, “Generalizing the hough transform to detect arbitrary shapes,” Pattern Recognition, vol. 13, no. 2, pp. 111–122, 1981.

[6] R. Lakaemper and L. J. Latecki, “Using extended em to segment planar structures in 3d,” in International Conference on Pattern Recognition, vol. 3. IEEE, 2006, pp. 1077–1082.

[7] M. Zuliani, C. S. Kenney, and B. Manjunath, “The multiransac algorithm and its application to detect planar homographies,” in Proc. of the IEEE Intl. Conf. on Image Processing (ICIP), vol. 3. IEEE, 2005, pp. III–153.

[8] L. Li, F. Yang, H. Zhu, D. Li, Y. Li, and L. Tang, “An improved ransac for 3d point cloud plane segmentation based on normal distribution transformation cells,” Remote Sensing, vol. 9, no. 5, p. 433, 2017.

[9] B. Oehler, J. Stueckler, J. Welle, D. Schulz, and S. Behnke, “Efficient multi-resolution plane segmentation of 3d point clouds,” in International Conference on Intelligent Robotics and Applications. Springer, 2011, pp. 145–156.

[10] R. Schnabel, R. Wahl, and R. Klein, “Efficient ransac for point-cloud shape detection,” in Computer graphics forum, vol. 26, no. 2. Wiley Online Library, 2007, pp. 214–226.

[11] R. Toldo and A. Fusiello, “Robust multiple structures estimation with j-linkage,” in European conference on computer vision. Springer, 2008, pp. 537–547.

[12] D. Borrmann, J. Elseberg, K. Lingemann, and A. Nuchter, “The 3d hough transform for plane detection in point clouds: A review and a new accumulator design,” 3D Research, vol. 2, no. 2, p. 3, 2011.

[13] D. Holz and S. Behnke, “Fast range image segmentation and smoothing using approximate surface reconstruction and region growing,” in Intelligent Autonomous Systems. Springer, 2013, pp. 61–73.

[14] D. Hahnel, W. Burgard, and S. Thrun, “Learning compact 3d models of indoor and outdoor environments with a mobile robot,” Journal on Robotics and Autonomous Systems (RAS), vol. 44, no. 1, pp. 15–27, 2003.

[15] J. Poppinga, N. Vaskevicius, A. Birk, and K. Pathak, “Fast plane detection and polygonalization in noisy 3d range images,” in Proc. of the IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS). IEEE, 2008, pp. 3378–3383.

[16] A. J. Trevor, S. Gedikli, R. B. Rusu, and H. I. Christensen, “Efficient organized point cloud segmentation with connected components,” Semantic Perception Mapping and Exploration (SPME), 2013.

[17] R. B. Rusu and S. Cousins, “3D is here: Point Cloud Library (PCL),” in Proc. of the IEEE Intl. Conf. on Robotics & Automation (ICRA), Shanghai, China, May 9-13 2011.

[18] J. Xiao, J. Zhang, B. Adler, H. Zhang, and J. Zhang, “Three-dimensional point cloud plane segmentation in both structured and unstructured environments,” Journal on Robotics and Autonomous Systems (RAS), vol. 61, no. 12, pp. 1641–1652, 2013.

[19] P. F. Proenca and Y. Gao, “Fast cylinder and plane extraction from depth cameras for visual odometry,” in Proc. of the IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS). IEEE, 2018, pp. 6813–6820.

[20] C. Feng, Y. Taguchi, and V. R. Kamat, “Fast plane extraction in organized point clouds using agglomerative hierarchical clustering,” in Proc. of the IEEE Intl. Conf. on Robotics & Automation (ICRA). IEEE, 2014, pp. 6218–6225.

[21] C. Erdogan, M. Paluri, and F. Dellaert, “Planar segmentation of rgbd images using fast linear fitting and markov chain monte carlo,” in Conference on Computer and Robot Vision. IEEE, 2012, pp. 32–39.

[22] P. S. Heckbert, “A seed fill algorithm,” in Graphics gems. Academic Press Professional, Inc., 1990, pp. 275–277.

[23] K. P. Fishkin and B. A. Barsky, “An analysis and algorithm for filling propagation,” in Computer-generated images. Springer, 1985, pp. 56–76.

[24] S. M. Ahmed, Y. Z. Tan, C. M. Chew, A. Al Mamun, and F. S. Wong, “Edge and corner detection for unorganized 3d point clouds with application to robotic welding,” in Proc. of the IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS). IEEE, 2018, pp. 7350–7355.