
Robust line simplification on the plane

J.L.G. Pallero

ETSI en Topografía, Geodesia y Cartografía, Universidad Politécnica de Madrid, Autovía de Valencia, km 7.5, 28031 Madrid, Spain

Article history: Received 9 April 2013; received in revised form 30 July 2013; accepted 29 August 2013; available online 7 September 2013.

Keywords: Douglas–Peucker algorithm; polyline simplification; map generalization; cartography; OpenMP; computing.

Abstract

Line simplification is an important task in map generalization, in traditional paper series as well as in geographic information systems and web map server services. Using an adequate method, an accurate representation of the original elements should be obtained by suppressing redundant information while maintaining the shape of the original elements according to the new scale. One of the most widely used algorithms for this purpose is the so-called Douglas–Peucker algorithm. It can lead to inconsistent results, such as self-intersections or intersections with other elements, so the operator's supervision is necessary after the automatic treatment. In this work a robust and easy-to-implement variation of the Douglas–Peucker algorithm for individual line simplification in two dimensions is presented. The robustness of the new algorithm is based on the concept of segment intersection, and it can easily be implemented in parallel. The new algorithm produces correct results regardless of the tolerance and of the morphology of the original line or polygon.

The algorithm is coded in standard C99 and can be compiled for serial or parallel execution via OpenMP. Both the algorithm itself and a program that implements it are distributed as free software. The validity of the solution was tested using the GSHHG geography database, which can be obtained freely through the Web. Results on the accuracy of the output, the execution speed and the scalability of the parallel implementation are presented.

© 2013 Elsevier Ltd. All rights reserved.

1. Introduction

Line simplification is an important task in map generalization. Using an adequate method, an accurate representation of the original elements should be obtained while, at the same time, all redundant information in the original elements is eliminated in keeping with the new scale. The main features in a map subject to generalization by these techniques are shorelines, lake perimeters, rivers, roads, contour lines, etc. Line simplification is used not only in classical paper map series, but also in web services related to geographic information systems that, in some cases, can offer the product at a user-selected scale (Cecconi et al., 2002; Harrower and Bloch, 2006).

Several methods have been proposed to solve the problem, each one having singular characteristics. Based on the classification presented in McMaster (1987) and AGENT (1999), the algorithms can be classified according to how they use the points of the original line. A first category comprises algorithms based on random point selection, which use no information about neighboring points. These algorithms need no information about the lines, but they neither follow any cartographic criteria nor produce the same results in different executions. Another category, known as local algorithms, uses a reduced set of points for the selection of the final vertices (Jenks, 1989). Finally, global algorithms use the whole original polyline for the selection of each vertex of the resulting line (Douglas and Peucker, 1973).

Techniques specific to cartography have also been developed. The work of Wang and Muller (1993) focuses on shoreline simplification, taking into account the influence of topography and rivers. Other algorithms focus on networks, such as those modeling roads or rivers, where preserving the coordinates of the intersection points between entities is important (Mackaness and Mackechnie, 1999).

However, one of the most widely used methods in practice (Ebisch, 2002; Gorman et al., 2007) is the above-mentioned Douglas–Peucker algorithm (Douglas and Peucker, 1973), mainly because of its ease of implementation, its high execution speed, and the quality of its results in most cases. The method builds the generalized line by retaining a subset of the points of the original polyline and discarding the rest according to a distance tolerance.

The main limitation of the Douglas–Peucker technique is the possibility that the resulting polyline contains self-intersections (Saalfeld, 1999); hence the technique must be considered non-robust. Moreover, even when each individually generalized polyline is free of self-intersections, intersections can appear between the lines when a set of them (e.g. contour lines) is generalized.


The Douglas–Peucker algorithm works with individual lines, so this effect cannot be avoided. These facts can introduce visualization problems in classical paper map series, as well as wrong results in subsequent computer processing of the entities, such as area computations, Boolean operations between polygons, point-in-polygon checks, etc.

In order to prevent self-intersections and intersections with other elements, several algorithms have been developed. All of them are based on the computation of the convex hull of the vertices that form the line subject to simplification. Saalfeld (1999) laid the foundations of this kind of technique, which was later optimized by Bertolotto and Zhou (2007). Also based on convex hull computations are the studies of Wu and Gonzales Márquez (2003) and Wu et al. (2004). Although all these techniques generate robust results, they slow down the computations and make the algorithm inapplicable in certain cases, such as real-time web services (Bertolotto and Zhou, 2007).

In this paper a description of the original Douglas–Peucker algorithm is presented, and a variation of it is proposed with the aim of obtaining robust results when working with individual polylines, regardless of the tolerance and of the morphology of the original line or polygon. The feasibility of parallel execution is also analyzed for both methods, and some conclusions are drawn. Finally, a set of tests is performed using the different algorithm variations in order to check the results. This task is carried out using the GSHHG geography database (Wessel and Smith, 1996), known as GSHHS before its version 2.2.0.

2. Original Douglas–Peucker algorithm

After a length tolerance for point rejection is established, the original Douglas–Peucker algorithm (Douglas and Peucker, 1973) comprises the following steps:

1. The first and the last point of the original polyline are considered fixed (i.e. belonging to the output line). Two fixed points, linked, form a straight line that will be called a base segment.

2. The distances between each remaining point of the original line and the base segment defined in the previous step are computed. The points considered in this step are those lying between the two points that form the base segment.

3. The farthest point from the base segment is considered; if its distance is greater than a predefined tolerance, the point is added as a new vertex of the output polyline.

4. The algorithm goes back to the first step, now considering as fixed all the vertices previously added to the output line (including the first and the last points of the original line). Linked in pairs (in order from beginning to end), they make up the new base segments.

5. When no point of the original polyline is farther away than the tolerance from the corresponding base segment, the algorithm concludes.

The procedure is shown graphically in Fig. 1, where an example polyline formed by 12 vertices is reduced to 7 by the algorithm.

Regarding the distance computations from the points to the base segments, the details explained in Ebisch (2002) must be taken into account. The author warns about a common error in some implementations of the algorithm: considering only the perpendicular distance between the candidate point and the corresponding base segment. The perpendicular distance is valid only when the candidate point lies between the points that form the base segment, i.e. when its projection falls on the segment. In any other situation, the distance between the candidate point and the closest endpoint of the base segment must be computed.
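To make the distance rule and the recursive scheme of this section concrete, the following C99 sketch combines them under stated assumptions: the names dist2_point_seg() and dp_recursive() are hypothetical (the declarations actually distributed with the paper's code, DPOrig() and DistMax2(), are described in Section 4.2), and squared distances are compared against the squared tolerance, as discussed in Section 4.1.

#include <stdio.h>

/* Squared distance from point P to segment AB. The perpendicular
   (projected) distance is valid only when the projection of P falls
   between A and B; otherwise the distance to the nearest endpoint is
   returned, as Ebisch (2002) warns. */
static double dist2_point_seg(double px, double py, double ax, double ay,
                              double bx, double by)
{
    double dx = bx - ax, dy = by - ay;
    double len2 = dx * dx + dy * dy;
    double t = len2 > 0.0 ? ((px - ax) * dx + (py - ay) * dy) / len2 : 0.0;
    if (t < 0.0) t = 0.0;                          /* nearest point is A */
    if (t > 1.0) t = 1.0;                          /* nearest point is B */
    dx = ax + t * dx - px;
    dy = ay + t * dy - py;
    return dx * dx + dy * dy;
}

/* Original Douglas-Peucker over vertices i..j: flag the farthest
   out-of-tolerance vertex in keep[] and recurse on both halves. */
static void dp_recursive(const double *x, const double *y, size_t i,
                         size_t j, double tol2, int *keep)
{
    size_t k, kmax = 0;
    double d2, d2max = 0.0;
    if (j <= i + 1)
        return;                     /* base segment of contiguous points */
    for (k = i + 1; k < j; k++) {
        d2 = dist2_point_seg(x[k], y[k], x[i], y[i], x[j], y[j]);
        if (d2 > d2max) { d2max = d2; kmax = k; }
    }
    if (d2max > tol2) {
        keep[kmax] = 1;             /* new vertex of the output polyline */
        dp_recursive(x, y, i, kmax, tol2, keep);
        dp_recursive(x, y, kmax, j, tol2, keep);
    }
}

int main(void)
{
    double x[] = {0.0, 1.0, 2.0, 3.0, 4.0, 5.0};
    double y[] = {0.0, 0.1, 1.5, 0.2, 0.1, 0.0};
    int keep[6] = {1, 0, 0, 0, 0, 1};        /* first and last are fixed */
    size_t k;
    dp_recursive(x, y, 0, 5, 0.5 * 0.5, keep); /* tolerance 0.5, squared */
    for (k = 0; k < 6; k++)
        if (keep[k]) printf("%g %g\n", x[k], y[k]);
    return 0;
}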

The weakest point of the original Douglas–Peucker algorithm is the non-robustness of the result, which in some cases can present self-intersections due to special configurations of the original polyline or too high tolerance values. As explained, only a distance criterion is used in order to accept or reject points, and no other characteristics of the original line are taken into account (e.g. relations between neighboring points), so it is impossible to detect any anomaly in the result. An example of self-intersections is shown in Fig. 2.

The described algorithm is essentially recursive. Once a vertex is added to the output polyline, the algorithm is invoked again until no more points are to be added or a base segment is formed by two contiguous points of the original line. Due to the nature of the algorithm, it is also (at least in theory) easy to parallelize. Take the first step of the algorithm shown in Fig. 1 as an example. Considering the first base segment between vertices 1 and 12, and according to the distance criterion, the next vertex to be added is number 8. At the next step, the same algorithm must be applied to two base segments: one formed by vertices 1–8 and the other formed by vertices 8–12. At this point the work can be shared by two processors, each performing the computations related to one base segment.

In short, it may be stated that the original Douglas–Peucker algorithm consists in the addition of significant points to the base segment formed by the first and last points of the original polyline. A point is significant when, at a particular step of the algorithm, it is the farthest from a base segment among the points exceeding the tolerance. Once a point is added to the output line, it cannot be deleted.

Fig. 1. Original Douglas–Peucker algorithm step by step. The final result is the line formed by the triangles; between them, no points from the original line are farther away than tol.



3. Robust polyline simplification

The methodology presented in this paper uses the criterion of segment intersection in order to make the algorithm robust when working with individual polylines, so the results are devoid of self-intersections. In addition, no auxiliary element beyond the original and generalized polylines needs to be computed at execution time.

3.1. Non-recursive and non-robust variation of the Douglas–Peucker algorithm

The variation of the original algorithm presented in this paper is not recursive. It consists of the following steps (as in the original algorithm, the first and the last point of the original line are kept in the result):

1. The last point added to the output line (call its position in the original line N; at the first step of the algorithm it is the starting point) is joined to vertex N+2 to generate the base segment. Then the distance between vertex N+1 and the base segment is computed and compared with the established tolerance.

2. If the computed distance does not exceed the tolerance, a new base segment is created between vertices N and N+3, and the distances between the base segment and all the intermediate points between N and N+3 are computed again. As long as the greatest of these distances does not exceed the predefined tolerance, new base segments are created between point N and points N+4, ..., N+n of the original line. In the extreme case the last point of the original line can be reached.

3. When a base segment N/N+k is found for which the distance to the farthest intermediate point is greater than the tolerance, the vertex N+k-1 is added to the output line. The tolerance is thereby ensured for all vertices between N and N+k-1, since the base segment N/N+k-1 was checked at the previous step.

4. The algorithm goes back to the first step, now considering the point N+k-1 as the initial vertex of the new base segment.

5. The algorithm ends when the working base segment has the last point of the original polyline as its final vertex and all the intermediate vertices included in it are within tolerance.

Fig. 3 shows a step-by-step example of the modified algorithm. As in the case of the original algorithm, the resulting polyline is made up of 7 vertices, although they are not the same as in Fig. 1.
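A minimal serial sketch of this forward scan is shown below, reusing the dist2_point_seg() helper from the sketch in Section 2; the names are hypothetical, and this is not the DPRob() interface described in Section 4.3. It writes the indices of the kept vertices of an n-point polyline (n >= 2) into idx[] and returns their count.

static size_t dp_forward(const double *x, const double *y, size_t n,
                         double tol2, size_t *idx)
{
    size_t nout = 0, i = 0, j, k;
    idx[nout++] = 0;                          /* the first point is kept */
    while (i < n - 1) {
        /* Grow the base segment i..j until an intermediate point lies
           farther than the tolerance, or the end of the line is reached. */
        for (j = i + 2; j < n; j++) {
            int out_of_tol = 0;
            for (k = i + 1; k < j && !out_of_tol; k++)
                if (dist2_point_seg(x[k], y[k],
                                    x[i], y[i], x[j], y[j]) > tol2)
                    out_of_tol = 1;
            if (out_of_tol)
                break;
        }
        /* Segment i..j failed (or j == n): vertex j-1 is kept, since the
           base segment i..j-1 was verified in the previous iteration. */
        i = j - 1;
        idx[nout++] = i;
    }
    return nout;
}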

This variation transforms the original Douglas–Peucker algorithm from global into local, as defined in McMaster (1987): in order to add a new vertex to the output line, only the points neighboring the last added point are taken into account. Only in extreme cases might the algorithm be considered global, namely those in which all points up to the end of the original line have to be tested because no intermediate point out of tolerance can be found. The main characteristic of this algorithm is that, at each step, the vertices of the original line already discarded (those behind the last vertex added to the output line) are not used anymore at subsequent steps. It should be noted that the result may differ depending on the processing direction of the original line; at any rate the tolerance is complied with, so both possible results are valid.

Fig. 2. Self-intersections in a polyline simplified with the original Douglas–Peucker algorithm (not all steps are shown).



The above-mentioned variation and the original algorithm are both non-robust. In certain cases, due to the morphology of the original line or to high tolerance values, results with self-intersections similar to Fig. 2 can be obtained.

The modified version, like the original one, can be implemented in parallel. In this case the implementation could consist of using the available processors to try several base segments from the same initial point concurrently.

3.2. Robust variation of the Douglas–Peucker algorithm

The technique developed to convert the modified algorithm into a robust one involves the detection of possible intersections between:

1. The new segments of the output line and the non-processed segments of the original line.

2. The new segments of the output line and the previous segments of the same output line.

Any intersection of a segment formed by two consecutive points of the output line with the original line should be considered a clue for a possible self-intersection in the result. In a similar way, a new segment of the output line can intersect a previously added segment.

The conversion from the non-robust to the robust modified version consists of two steps:

1. When the modified algorithm selects a new point to add to the output polyline, the segment formed by this point and the previous one in the output line is tested for intersection against the segments formed by the still non-processed points of the original line. If no intersection is detected, the selected vertex is kept and will be processed by the second step of the robust algorithm. If an intersection is detected, the new point is discarded and the previous point of the original line is taken as a candidate (by the nature of the modified algorithm, this previous point is valid, because all the intermediate points in its segment are below the tolerance). The intersection check is repeated until no intersections are detected or, in the most unfavorable case, until the candidate is the point next to the starting point of the segment (the last vertex added to the output line). At the end of this part, non-intersection between the working output segment and the original line (from the last added point to the end) is guaranteed.

2. In a similar manner, this part checks the potential intersections between the segment formed by the candidate point and the last point added to the output line, and the segments formed by the points previously added to the output line. If any intersection is detected, the candidate point is discarded and the previous vertex of the original line is considered. When no intersections are detected, or when the candidate point is the one next to the last point added to the output line, the working vertex is definitively added to the output polyline.

After accomplishing these two steps the output polyline will not contain any self-intersections: the algorithm is robust. Fig. 4 shows the solutions provided by the original and robust algorithms. The line simplified by the robust method will in general (except in special cases) contain more points than the one obtained by the classic method.
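In outline, the backtracking described by these two steps could look as follows. This is a sketch under stated assumptions, not the distributed DPRob() code: last is the last vertex added to the output line, cand the candidate selected by the forward scan of Section 3.1, idx[]/nout the output built so far, and seg_intersect() a proper-intersection test such as the one sketched in Section 4.3. A proper test does not flag segments that merely share an endpoint, which is why the loops below may safely include the neighboring segments.

/* Back the candidate up until the new output segment last-cand crosses
   neither the unprocessed original segments (step 1) nor the segments
   already in the output line (step 2). cand = last + 1 is always valid. */
int hit = 1;
while (hit && cand > last + 1) {
    size_t s;
    hit = 0;
    for (s = cand; s + 1 < n && !hit; s++)                 /* step 1 */
        hit = seg_intersect(x[last], y[last], x[cand], y[cand],
                            x[s], y[s], x[s + 1], y[s + 1]);
    for (s = 0; s + 1 < nout && !hit; s++)                 /* step 2 */
        hit = seg_intersect(x[last], y[last], x[cand], y[cand],
                            x[idx[s]], y[idx[s]],
                            x[idx[s + 1]], y[idx[s + 1]]);
    if (hit)
        cand--;                  /* previous original vertex is valid */
}
idx[nout++] = cand;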

The two described parts can each be executed in parallel. The intersection checks against the segments of the original line (part 1) and against the segments of the simplified line (part 2) may be shared among the available processors in order to accelerate the computations.

4. C implementation and OpenMP parallelization

The main reason for selecting the C programming language for the algorithm implementation is its running speed. C also has the advantage of having many implementations (e.g. the GCC compiler can be used on over 60 combinations of processor and operating system; see http://gcc.gnu.org/install/specific.html). Furthermore, the implementation in C allows the algorithms to be used from other high-level languages such as C++, Python or Java.

Nowadays multicore processors are the most common chips in the market; they are present in any kind of device, from mobile phones to supercomputing centres. In order to take advantage of the multicore architecture, the OpenMP programming model (http://openmp.org) was selected. OpenMP allows an easy use of the different execution threads in a shared memory environment (such as a multicore chip) and, at the same time, it allows the serial and parallel versions of the algorithm to be maintained in the same piece of code (Chapman et al., 2008).
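This dual serial/parallel property comes from OpenMP directives being pragmas: built without OpenMP support they are ignored and the code runs serially. A trivial, hypothetical example:

/* With GCC: gcc -O2 -fopenmp file.c -> parallel; gcc -O2 file.c -> serial.
   The pragma is simply ignored when OpenMP support is disabled. */
void squared_norms(const double *x, const double *y, double *d2, int n)
{
    int i;
    #pragma omp parallel for
    for (i = 0; i < n; i++)
        d2[i] = x[i] * x[i] + y[i] * y[i];
}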

Fig. 3. Variation of the Douglas–Peucker algorithm step by step (only for the first segment of the output line). Starting at the first vertex, base segments are sequentially created using the original polyline points until the distance from an intermediate point to a base segment reaches the tolerance.


It should also be noted that the main free software compilers, such as GCC (http://gcc.gnu.org), PathScale EKOPath (http://www.pathscale.com; only version 4.0.12.1 is free software) and Open64 (http://www.open64.net), have their own OpenMP implementations.

4.1. Computation of distance between points and segments

An essential part of the algorithm is the computation of the distance between points and segments. In this study the efficient algorithm presented in Ebisch (2002) is used. The algorithm internally computes the square of the distance in order to avoid a call to the square root function and save computational time, so the computed squared distances must be compared with the square of the tolerance.

4.2. Original Douglas–Peucker algorithm

To have a reference for comparison purposes, the original Douglas–Peucker algorithm was also implemented. The inputs of the implemented recursive function are: a scalar storing the tolerance, two vectors with the x and y coordinates of the original polyline, a scalar with the length of x and y, and two more scalars with the indices of the start and end points of the working base segment (in the first call, the indices of the first and last points of the original line). As an output argument it also receives a vector of the same length as x and y, initialized to 0 in all positions. After execution this vector stores the value 1 in the positions corresponding to the vertices of the original line that form the simplified line. The declaration of the DPOrig() function is presented in Fig. 5.

Internally, a function called DistMax2() computes the maximum squared distance between the base segment and all the points contained between its extremes. This function also returns the position in the coordinate vectors of the farthest point. Such a position may not exist, either because all points are below the tolerance or because the base segment is formed by two contiguous points. Once the position is determined, the DPOrig() function invokes itself in order to process the two newly generated base segments: one from the start point of the base segment to the farthest point detected by DistMax2(), and the other from this last point to the end point of the original base segment. At this point the work can be shared by two available processors; to do so, the OpenMP sections construct is used.
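Roughly, and in terms of the hypothetical dp_recursive() sketch of Section 2 rather than the distributed DPOrig() code, the two recursive calls could be dispatched as follows; a production version would enable nested parallelism and fall back to serial recursion below some segment size.

/* Tail of dp_recursive(), parallelized with OpenMP sections. Each
   section is executed by one thread of the team; nested invocations
   require nested parallelism to be enabled (e.g. omp_set_nested(1)). */
if (d2max > tol2) {
    keep[kmax] = 1;
    #pragma omp parallel sections
    {
        #pragma omp section
        dp_recursive(x, y, i, kmax, tol2, keep);   /* left base segment  */
        #pragma omp section
        dp_recursive(x, y, kmax, j, tol2, keep);   /* right base segment */
    }
}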

4.3. Robust Douglas–Peucker algorithm

In this case the implemented function is not recursive. Its inputs are a scalar storing the tolerance, two vectors with the x and y coordinates of the original polyline, a scalar with the length of x and y, and a pointer to a scalar that stores the length of the output vector. The function returns a vector containing the indices of the original polyline that form the simplified line. The declaration of the DPRob() function is presented in Fig. 6.

The function loops until the end of the original line is reached. Inside this loop the base segments are created and the distances from the intermediate points to them are verified. Although in Section 3.1 it was stated that this part of the algorithm is in theory parallelizable, tests have shown that its OpenMP implementation slows down the execution. The main problem is the lack in OpenMP of a construct to break a for loop when a valid point is obtained; a workaround implementation is not efficient in this case.

When a point out of tolerance is detected, the next step consists of verifying the possible intersections of the base segment with the rest of the original polyline (function DPRobOInt()) and then with the prior segments of the simplified line (function DPRobSInt()). After these two checks the result is robust. For the intersection computations in the 2D Euclidean space, the method explained in O'Rourke (1998) is used.
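A standard formulation of such a test, in the spirit of O'Rourke (1998) and with hypothetical names (this is a sketch, not the code distributed with the paper), decides a proper crossing from the signs of four signed areas:

/* Twice the signed area of triangle ABC: positive if C lies to the
   left of the directed line A->B. */
static double area2(double ax, double ay, double bx, double by,
                    double cx, double cy)
{
    return (bx - ax) * (cy - ay) - (cx - ax) * (by - ay);
}

/* 1 if segments AB and CD cross at a point interior to both ("proper"
   intersection). Collinear and endpoint-touching configurations return
   0 here; a full implementation handles them separately, as O'Rourke
   (1998) describes. */
static int seg_intersect(double ax, double ay, double bx, double by,
                         double cx, double cy, double dx, double dy)
{
    double d1 = area2(ax, ay, bx, by, cx, cy);
    double d2 = area2(ax, ay, bx, by, dx, dy);
    double d3 = area2(cx, cy, dx, dy, ax, ay);
    double d4 = area2(cx, cy, dx, dy, bx, by);
    if (d1 == 0.0 || d2 == 0.0 || d3 == 0.0 || d4 == 0.0)
        return 0;                 /* degenerate: not a proper crossing */
    return ((d1 > 0.0) != (d2 > 0.0)) && ((d3 > 0.0) != (d4 > 0.0));
}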

The function DPRobOInt() has as inputs the two vectors with the x and y coordinates of the original polyline, a scalar with the length of x and y, and two scalars with the indices of the start and end points of the working base segment. The scalar with the end index is a pointer, so at the end of the function it stores the appropriate index ensuring that the base segment does not intersect any segment of the remainder of the original line. The declaration of the DPRobOInt() function is presented in Fig. 7.

Fig. 4. Polylines simplified by the original Douglas–Peucker algorithm (left) and by the robust algorithm (right). As can be observed, the robust solution contains more points than the non-robust one.

Fig. 5. Declaration of the DPOrig() function, which implements the original Douglas–Peucker algorithm.


The intersection check is carried out inside a loop running from the segment next to the base segment in the original polyline to the end of the line. The behavior of the function differs between a serial and an OpenMP execution. In the serial case, when an intersection is detected the loop is broken and the function returns. In an OpenMP execution the intersection checks are shared among the different threads; in this case, since it is not possible to use the C break statement inside a worksharing loop, all segments must be verified.
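The difference can be sketched with a hypothetical helper built on the seg_intersect() test above (not the distributed DPRobOInt() code): a serial loop may return at the first hit, while the OpenMP worksharing loop below must visit every segment and OR-reduce the partial results.

/* Does the base segment (x0,y0)-(x1,y1) cross any original segment
   from index 'first' on? OpenMP version: no early exit is possible,
   so all iterations run and 'hit' is OR-reduced across threads. */
static int intersects_from(const double *x, const double *y, long n,
                           long first, double x0, double y0,
                           double x1, double y1)
{
    int hit = 0;
    long s;
    #pragma omp parallel for reduction(||:hit)
    for (s = first; s < n - 1; s++)
        hit = hit || seg_intersect(x0, y0, x1, y1,
                                   x[s], y[s], x[s + 1], y[s + 1]);
    return hit;  /* a serial build would instead break on the first hit */
}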

The function DPRobSInt() has as inputs the two vectors with the x and y coordinates of the original polyline, a scalar with the length of x and y, a vector s with the positions of x and y used in the simplified polyline, and two scalars with the indices of the start and end points of the working base segment. The scalar with the end index is a pointer, so at the end of the function it stores the appropriate index ensuring that the base segment does not intersect any previous segment of the simplified line. The working scheme is the same as described for the DPRobOInt() function. The declaration of the DPRobSInt() function is presented in Fig. 8.

The DPRobOInt() and DPRobSInt() functions check the intersections between the working base segment and, respectively, all the segments of the original line (from the base segment onward) and all the segments of the simplified line (from the start to the base segment). However, the possible intersections may be expected to lie in close proximity to the base segment, so DPRobOInt() and DPRobSInt() can be slightly modified to check only a small number of segments near the base segment, thereby speeding up the process. This limit can be added as a new input argument.
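The modification amounts to clamping the upper bound of the scan (maxseg is a hypothetical new argument added to a serial variant of the helper above):

    long stop = first + maxseg;       /* check at most maxseg segments */
    if (stop > n - 1)
        stop = n - 1;                 /* never run past the line's end */
    for (s = first; s < stop && !hit; s++)
        hit = seg_intersect(x0, y0, x1, y1,
                            x[s], y[s], x[s + 1], y[s + 1]);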

5. Results

The geography database GSHHG (Wessel and Smith, 1996), available at http://www.ngdc.noaa.gov/mgg/shorelines/gshhs.html, has been used for the tests in this work. GSHHG is a set of public domain data with positional accuracy between 50 m and 5000 m (depending on the origin of the data). It is divided into 5 levels of detail, which were generated using the classical Douglas–Peucker algorithm with subsequent processing in order to detect and remove possible self-intersections in the results. These levels of detail are: (0) full, containing the original data; (1) high, generated using a tolerance of 0.2 km; (2) intermediate, generated with 1 km of tolerance; (3) low, with 5 km of tolerance; and (4) crude, with 25 km of tolerance.

The serial tests were performed on a workstation equipped with a quad-core (4 threads) Intel Core i5-2500 at 3.30 GHz, running the Debian GNU/Linux operating system and the GCC 4.7.0 compiler. The OpenMP parallel tests were performed on a workstation equipped with two six-core (a total of 12 threads) Intel Xeon X5650 at 2.67 GHz, running the Red Hat Enterprise Linux 4 operating system and the GCC 4.6.0 compiler.

The first test consists of the measurement of the execution time and the comparison of the number of output points between the original Douglas–Peucker algorithm and the modified non-robust and robust versions. The serial versions of all the algorithms were used. The polygon representing the shoreline of Eurasia from GSHHG was used at the full resolution level (1,181,127 vertices). The working tolerances correspond to the high to crude resolution levels of the original GSHHG, and the results are shown in Table 1. Regarding running speed, it can be observed that the execution time of the original Douglas–Peucker algorithm decreases as the tolerance value increases. For the non-robust variation the behavior is the opposite, i.e. execution time increases as tolerance increases. For the robust version the execution time decreases as the tolerance increases, as for the original Douglas–Peucker algorithm. However, when all possible intersections are checked in the robust variation, the execution time for small tolerances proves to be too high, making the method unsuitable for some kinds of work, for example real-time applications. This can be attenuated by limiting the number of intersection checks (a limit of 2000 segments was used in the testing), as can be seen in the table. For very small tolerance values (0.2 km in the test) the modified non-robust variation was faster than the original Douglas–Peucker algorithm. This fact is of interest in a twofold manner: (1) self-intersections are not so common for very small tolerance values, and (2) the algorithm can thus be an option for embedded and low-power devices. In all cases the modified algorithms produce results with a 5% to 15% lower number of vertices than the original Douglas–Peucker algorithm. In turn, the robust version produces results with a slightly higher number of vertices than the non-robust version.

In Fig. 9 an example of the behavior of the robust variation of the algorithm is shown. The outcome corresponds to the processing of Eurasia with a tolerance of 25 km. The display represents the Curonian Lagoon (between Lithuania and Russia). Due to the complicated morphology of the shoreline (a long, thin, curved sand-dune spit separates the Curonian Lagoon from the Baltic Sea coast), the non-robust algorithms produce a self-intersection in the simplified polyline for high tolerances.

Fig. 6. Declaration of the DPRob() function, which implements the robust Douglas–Peucker algorithm.

Fig. 7. Declaration of the DPRobOInt() function, which implements the intersection checks between the working base segment and the original polyline.

Fig. 8. Declaration of the DPRobSInt() function, which implements the intersection checks between the working base segment and the simplified polyline.

Table 1. Execution times and numbers of points for Eurasia (1,181,127 vertices) for the original Douglas–Peucker algorithm and for the non-robust and robust variations. Serial versions of the algorithms; Intel Core i5-2500 3.30 GHz. "s" = seconds, "p" = points; "Rob. (2000)" = robust version limited to 2000 intersection checks.

Tol. (km)  Original            Non-rob.            Robust                Rob. (2000)
0.2        0.171 s, 170009 p   0.056 s, 161406 p   276.837 s, 161553 p   1.510 s, 161553 p
1          0.142 s, 42516 p    0.187 s, 36706 p    56.652 s, 37218 p     0.598 s, 37218 p
5          0.117 s, 8711 p     0.803 s, 7386 p     10.263 s, 7578 p      0.918 s, 7578 p
25         0.090 s, 1329 p     5.486 s, 1187 p     7.251 s, 1220 p       5.530 s, 1218 p


The robust version avoids the self-intersection and produces a correct result. The outcome is the same for the robust version of the algorithm limited to 2000 intersection checks.

The last test consists of the execution of the robust variation of the algorithm (computing all possible intersections) in parallel, using OpenMP, on a workstation equipped with two Intel Xeon X5650 at 2.67 GHz, for a total of 12 execution threads. The polygon representing Eurasia at the full level of GSHHG was used again, and the tolerance values were again 0.2 km, 1 km, 5 km and 25 km. Taking as reference the times of the serial execution (recalculated for the new environment), the speedup of the parallel execution was computed. The algorithm was executed for all tolerance levels using from 1 to 12 threads, and the results can be seen in Fig. 10. As a global overview it can be said that the scalability of the algorithm is nearly linear, although this should be qualified. For a tolerance of 0.2 km the scalability is linear and the parallel algorithm outperforms the serial version from 2 threads onward, with an approximate speedup of 0.57x per core. For the tolerance of 1 km the scalability is also nearly linear, with an approximate speedup of 0.45x per core, which guarantees better execution times than the serial version from 3 threads onward. For medium and high tolerance values the results are worse than for lower tolerance levels. For a tolerance of 5 km the parallel algorithm needs 7 threads or more to outperform the serial version (for 12 threads the speedup is about 1.8x), and for a tolerance of 25 km only with 9 or more threads does the execution time match that of the serial algorithm.

6. Conclusions

In this study an original, robust and easy-to-implement method for individual line simplification has been presented. The algorithm guarantees results without self-intersections, whatever the morphology of the input line and the tolerance value.

For a fixed tolerance value the algorithm produces, in both its robust and non-robust versions, results with a slightly lower number of vertices than the classical Douglas–Peucker method. It should also be highlighted that the non-robust version shows a lower execution time than the classical Douglas–Peucker algorithm for small tolerance values. In these cases of small tolerances, when the robust version is used and high running speed is desired, it is necessary to limit the number of segments checked in the intersection verification steps.

Fig. 9. Detail of the Curonian Lagoon (Lithuania–Russia) from the Eurasia processing with a tolerance of 25 km, showing the original and simplified lines (longitude 20.4–21.4, latitude 54.8–55.8). Top left: original Douglas–Peucker algorithm. Top right: non-robust variation of Douglas–Peucker. Bottom: robust algorithm. Self-intersections are marked in the two non-robust results.

Fig. 10. Speedup of the parallel OpenMP robust algorithm with respect to the serial version for Eurasia, as a function of the number of threads (1–12). Two Intel Xeon X5650 2.67 GHz (12 threads), GCC 4.6.0.


Regarding the parallel OpenMP execution, the algorithm shows a nearly linear scalability with respect to the number of threads. Good results with respect to single-thread execution are obtained from 2–3 threads onward for low tolerance values (0.2 km and 1 km in the tests). For medium tolerance levels (5 km in the tests) the improvement starts at 7 threads, and no speedup higher than 2x is obtained. For high tolerance levels (25 km in the tests) no improvement over single-thread execution is obtained. In any case, before undertaking a large piece of work in parallel, a suitability study should be carried out, taking into account variables such as the tolerance level, the number of entities to be simplified and their numbers of vertices.

As future work, and in order to reduce the running time, the possibility of using POSIX Threads instead of OpenMP could be considered. This could cut down the time spent in the creation and destruction of execution threads (in POSIX Threads this is done manually). It is also to be expected that forthcoming versions of the OpenMP standard will support breaking out of a for loop, which could slightly shorten the running time.

7. Software

The software implemented in this study can be downloaded from the repository https://bitbucket.org/jgpallero/rls. The source code of the original and robust algorithms can be obtained from this website, as well as a program that uses them and reads the input data from an ASCII file. In addition to the code, precompiled binaries of the program (ready to use on MS Windows, GNU/Linux and FreeBSD systems) can also be downloaded, for both 32 and 64 bit systems. The code is distributed as free software under the terms of the Apache 2.0 license (http://www.apache.org/licenses/LICENSE-2.0).

Acknowledgements

I am truly grateful to all the people out there who make free software. My thanks to María Charco Romero (Institute of Geosciences, IGEO, Spain) for letting me access her workstation to perform some algorithm testing. I also thank the anonymous referees for their reviews of this paper.

References

AGENT, 1999. Selection of Basic Algorithms. Technical Report, University of Zurich.

Bertolotto, M., Zhou, M., 2007. Efficient and consistent line simplification for web mapping. International Journal of Web Engineering and Technology 3 (2), 139–156.

Cecconi, A., Weibel, R., Barrault, M., 2002. Improving automated generalisation for on-demand web mapping by multiscale databases. In: Symposium on Geospatial Theory, Processing and Applications, Ottawa.

Chapman, B., Jost, G., van der Pas, R., 2008. Using OpenMP: Portable Shared Memory Parallel Programming, 1st edition. The MIT Press, Cambridge, Massachusetts.

Douglas, D.H., Peucker, T.K., 1973. Algorithms for the reduction of the number of points required to represent a digitized line or its caricature. The Canadian Cartographer 10 (2), 112–122.

Ebisch, K., 2002. A correction to the Douglas–Peucker line generalization algorithm. Computers and Geosciences 28 (8), 995–997.

Gorman, G.J., Piggott, M.D., Pain, C.C., 2007. Shoreline approximation for unstructured mesh generation. Computers and Geosciences 33 (5), 666–677.

Harrower, M., Bloch, M., 2006. MapShaper.org: a map generalization web service. IEEE Computer Graphics and Applications 26 (4), 22–27.

Jenks, G.F., 1989. Geographic logic in line generalization. Cartographica 26 (1), 27–43.

Mackaness, W.A., Mackechnie, G.A., 1999. Automating the detection and simplification of junctions in road networks. GeoInformatica 3 (2), 185–200.

McMaster, R.B., 1987. Automated line generalization. Cartographica 24 (2), 74–111.

O'Rourke, J., 1998. Computational Geometry in C, 2nd edition. Cambridge University Press, Cambridge, UK.

Saalfeld, A., 1999. Topologically consistent line simplification with the Douglas–Peucker algorithm. Cartography and Geographic Information Science 26 (1), 7–18.

Wang, Z., Muller, J.C., 1993. Complex coastline generalization. Cartography and GIS 20 (2), 96–106.

Wessel, P., Smith, W.H.F., 1996. A global, self-consistent, hierarchical, high-resolution shoreline database. Journal of Geophysical Research 101 (B4), 8741–8743.

Wu, S.-T., da Silva, A.C.G., Márquez, M.R.G., 2004. The Douglas–Peucker algorithm: sufficiency conditions for non-self-intersections. Journal of the Brazilian Computer Society 9 (3), 67–84.

Wu, S.-T., Gonzales Márquez, M.R., 2003. A non-self-intersection Douglas–Peucker algorithm. In: XVI Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI'03), São Carlos, Brazil, p. 60.
