

A Procedure for Semi-automatic Segmentation in OBIA Based on the Maximization of a Comparison Index

Andres Auquilla 1,3, Stien Heremans 2, Pablo Vanegas 1, and Jos Van Orshoven 2

1 Computer Science Department, Universidad de Cuenca, Cuenca, Ecuador
{andres.auquilla,pablo.vanegas}@ucuenca.edu.ec

2 Department of Earth and Environmental Sciences, KU Leuven, Celestijnenlaan 200E, B-3001 Leuven, Belgium
[email protected], [email protected]

3 Centre for Industrial Management, Department of Mechanical Engineering, KU Leuven, Celestijnenlaan 300A, B-3000 Leuven, Belgium

Abstract. In an Object Based Image Analysis (OBIA) classification process, the quality of the classification results is highly dependent on the segmentation. However, many of the studies that make use of an OBIA process determine the segmentation parameters by trial and error. The lack of a structured procedure to determine the segmentation parameters produces unquantified errors in the classification. This paper aims to quantify the effects of using a semi-automatic approach to determine optimal segmentation parameters. To this end, an OBIA process is performed to classify land cover types based on the objects produced by both a manual and an automatic segmentation. Even though the classification using the manual segmentation outperforms the one based on the automatic segmentation, the difference is only 2%. Since the automatic segmentation is performed with optimal parameters, a procedure to accurately determine those parameters should be applied to minimize the error produced by a misjudgment in the segmentation step.

Keywords: OBIA, segmentation, classification, support vector machines, segmentation parameters, comparison index.

1 Introduction

Object Based Image Analysis (OBIA) is a two-step process that involves the segmentation of spatial objects and their subsequent classification. In this process, the segmentation step is very important since the classification accuracy is highly dependent on the quality of the segmented objects [1]. The literature provides successful studies that used an OBIA approach to classify land cover types based on multispectral imagery [2,3,4]. Traditionally in remote sensing, a pixel-based paradigm is used for the classification of land cover types. However, OBIA is now a feasible alternative that even outperforms pixel-based approaches in terms of classification accuracy [2,3,4,5,6].



Whereas the salt-and-pepper effect arises when the pixel-based approach is applied to Very High Resolution (VHR) images [7], OBIA partitions the image into meaningful and homogeneous objects with similar spectral characteristics [1]. The pixel-based paradigm has limitations because pixels are not geographical objects as such, and it is unable to use texture, context and shape information [1], [3]. The OBIA paradigm allows objects to encapsulate spectral values and attributes such as shape, texture, and morphology [8]. Moreover, it avoids the salt-and-pepper effect, since the segmented objects represent land cover types contained in homogeneous regions which may be spectrally variable at the pixel level [4]. To obtain meaningful objects, the segmentation parameters must be set according to the resolution and scale of the real-world objects [9].

There are numerous algorithms to perform the segmentation in an OBIA approach. One of the most widely used is the Fractal Net Evolution Approach proposed by [10] and implemented in the software Definiens eCognition [11], [12]. This algorithm is based on a bottom-up region-growing approach, where smaller objects are merged to create bigger ones [13]. Additionally, it creates segments based on homogeneity criteria and a scale parameter [10]. The literature reports that this algorithm has been widely used, e.g. [4], [7], [13], [14]. As for the subsequent classification step, Machine Learning approaches offer a good alternative to classical statistical classifiers as they impose minimal prior assumptions on the data [15], [16]. Support Vector Machines (SVM) usually outperform other Machine Learning classifiers in terms of classification accuracy when applied to land cover [2], [6], [13], [17]. SVM has emerged as a promising technique for the classification of multispectral images with a low number of training samples in a high-dimensional space [6]. In a classification of land cover types, several classes must usually be assigned to the input data, and since SVM is intrinsically a binary classifier [18], the multi-class problem must be decomposed into binary ones. Several approaches, such as one-versus-all and one-versus-one, can be applied to perform multi-class classification using binary SVM classifiers [17]; the sketch below illustrates both decompositions.
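As a minimal illustration with scikit-learn (the library used later in this study for classification), the snippet below shows the one-versus-one and one-versus-rest decompositions of a binary SVM. The feature matrix and the five-class labels are synthetic placeholders, not data from the study.

```python
# Sketch: multi-class classification with binary SVMs in scikit-learn.
# X and y are synthetic placeholders standing in for object features and labels.
import numpy as np
from sklearn.svm import SVC, LinearSVC
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 4))        # e.g. four spectral features per object
y = rng.integers(0, 5, size=300)     # five hypothetical land cover classes

# SVC already handles multi-class problems internally with a one-vs-one scheme.
ovo_native = SVC(kernel="rbf").fit(X, y)

# The same decompositions can also be made explicit with meta-estimators.
ovo = OneVsOneClassifier(SVC(kernel="linear")).fit(X, y)
ovr = OneVsRestClassifier(LinearSVC()).fit(X, y)

print(ovo_native.predict(X[:5]), ovo.predict(X[:5]), ovr.predict(X[:5]))
```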

Even though the OBIA paradigm has been extensively studied in the literature, the effects of the segmentation parameters on the classification results have not been well addressed. It is clear that a process to assess the reliability of the segmentation step is needed: a segmented object must represent a real-world entity and not merely artifacts in the image [14]. This issue has been tackled by several methodologies developed to obtain optimal segmentation parameters [19]. Since there are no generally accepted guidelines for selecting the optimal segmentation parameters, trial-and-error approaches are often implemented [2]. In other cases, the scale parameter is set equal to the average size of the land cover patches in the image [4]. It is worth mentioning that several studies did not use any structured way to establish the optimal segmentation parameters, e.g. [3], [6].

In general, the evaluation methods for segmentation can be divided into three groups: (1) visual, (2) supervised and (3) unsupervised. [14] proposed a supervised methodology that compares multiple segmentations, generated with different sets of parameters, with ground truth. In this case, the ground truth consists of manually defined reference segments for a certain subset of the study area. The comparison between the automatic segments and the ground truth is based on a dissimilarity measure. A disadvantage of this method is that the definition of the ground truth (manual creation of segments) is a subjective task prone to errors [19].

This study performs a comparison between a semi-automatic and a manual segmentation with the aim of classifying five land cover classes. The classification results of the manual segmentation were taken as a reference for assessing the classification results of the semi-automatic segmentation. The latter was carried out using the Fractal Net Evolution Approach (FNEA) algorithm and a methodology to optimize the segmentation parameters similar to the one proposed in [14].

2 Study Area, Data and Software

The study area is part of the Millingerwaard, a nature area located along the river Waal, near Nijmegen in the Netherlands (Figure 1.a). The Millingerwaard area extends over approximately 800 hectares; 10% is a forest reserve (white willow trees) and the remaining part consists of swamps, meadows and surface water. Due to its high degree of diversity, the Millingerwaard is part of the Natura 2000 site Gelderse Poort. The study area extends approximately 500 meters from west to east and 1200 meters from north to south and contains vegetation complexes that range from meadows to forest. This area is prone to shrub encroachment.

Fig. 1. Overview of the study area located in the Millingerwaard: a) study area, b) canopy model, c) location in the Netherlands

Within the study area, five land cover classes of interest are identified. Their representations are depicted in Figure 2. Roughly, their division according to the height of the vegetation is: water, bare soil, rough grass (< 1 m), shrubs (1-5 m) and trees (> 5 m).


Fig. 2. Indication of the five classes in the study area

2.1 Datasets

Multispectral Image and LIDAR Information. The analysis was performed using multispectral images acquired in 2012 by an unmanned aerial system equipped with a TetraCam MiniMCA camera. The images were taken from about 200 meters above the surface. The sensor includes four bands, each 10 nm wide, in the green, red, red-edge and NIR areas of the spectrum (band centers of 550 nm, 670 nm, 700 nm and 780 nm, respectively). These sensor settings are ideal for vegetation mapping, and the spatial resolution of 40 cm is detailed enough to perform an OBIA to classify the five classes of interest. As for the LIDAR, data from the AHN2 database (Algemeen Hoogtebestand Nederland) was used to create a canopy model (Figure 1.b). The LIDAR dataset has a point density of 16 points per m². The canopy model was created by subtracting the AHN2 height from the original LIDAR height. This dataset was used as an additional input in the segmentation process to improve the classification results.
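The canopy model computation is essentially a per-pixel raster subtraction. The sketch below illustrates the idea in Python with rasterio; it is not the authors' QGIS workflow, and the file names, single-band layout and the clipping of negative heights are illustrative assumptions.

```python
# Illustrative sketch (not the authors' QGIS workflow): deriving a canopy
# height model by subtracting a terrain height raster from a surface height
# raster. File names are hypothetical placeholders.
import numpy as np
import rasterio

with rasterio.open("lidar_surface_height.tif") as dsm, \
     rasterio.open("ahn2_terrain_height.tif") as dtm:
    surface = dsm.read(1).astype("float32")
    terrain = dtm.read(1).astype("float32")
    profile = dsm.profile

canopy = np.clip(surface - terrain, 0, None)   # assumption: negative heights set to zero

profile.update(dtype="float32", count=1)
with rasterio.open("canopy_model.tif", "w", **profile) as out:
    out.write(canopy, 1)
```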

Test and Training Points. A geodataset containing points with the exact GPS-measured locations of sample elements of the five classes of interest was used to determine the test objects for the classification. This geodataset contains, in a shapefile, 25 points for every land cover class. The training points were created by visual interpretation of the multispectral image, since no ground information was available for this purpose. A total of 100 points per class were created.

Manual Reference Segments. Reference segments were created manually by visual interpretation of the study area. These segments were used to compare the classification performance using a manual segmentation (manual reference segments) with the automatic one, produced by the FNEA algorithm using the optimal segmentation parameters. Six circular patches were selected in the study area, within which this manual segmentation was performed.


2.2 Software

The input imagery was preprocessed to validate and normalize it. The preprocessing step, before the segmentation, involves operations on the multispectral and LIDAR data such as merging, clipping and interpolation to fill gaps in the LIDAR data. The software used for this task was Quantum GIS [20]. For the classification process, the scikit-learn package [21] implemented in Python [22] was used, whereas the segmentation process was performed using Definiens eCognition Developer v8.8 [11], [12]. The information about the polygons was stored in PostgreSQL [23] with PostGIS [24].

3 Methods

Figure 3 depicts the general procedure used in this work and shows the association between the inputs and outputs for each of the processes involved. The optimal segmentation parameter selection was the first process to be carried out. Then, the automatic and manual segmentations were created; LIDAR height data was added in this process. Finally, the classification was performed for the two kinds of segmentations.

Fig. 3. General overview of the methodology used in this work

3.1 Optimal Segmentation Parameter Selection

Since the first step in an OBIA process is segmentation, a quality assessment of the segmented objects must be carried out. When quality assessment is not performed as part of the process, the probability of classification errors increases since there is no certainty that the segmented objects correspond to real-world entities. To tackle this problem, an approach similar to that of [14] was applied; the only difference is that [14] uses a k-means algorithm to cluster similar objects in terms of the positions of the gravity centers and the areas of the intersected polygons, whereas the method used in this work performs spatial queries through the PostGIS module (see the sketch below). Two calculations were performed to compute values related to under- and over-segmentation. The latter consists of the following operations: (1) determination of the polygons that intersect the manual and the automatic segmentation, (2) determination of the areas and distances (between centroids) of the polygons intersected with the manual polygons, (3) computation of the maximal distance among the intersected polygons, (4) determination of two ratios: the area of the intersected polygon over the area of the manual polygon, and the centroid distance of the intersected polygon over the maximal distance, and (5) the ratio between the area of the manual polygons and the circular patch that contains them (see Figure 4). As for the under-segmentation, the process is similar; the difference is that the automatic segmentation was taken as the reference. Both methods, k-means clustering and spatial queries, aim to determine the relations between the super objects and the intersected objects contained in a region of interest. We refer the reader to [14] for detailed information about the methodology to assess the quality of the segmentation.
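As an illustration of such spatial queries, the snippet below retrieves, for every manual (reference) polygon, the intersecting automatic polygons together with their intersection areas and centroid distances. The table names, column names and connection settings are hypothetical; the actual database schema used in the study is not published.

```python
# Sketch of a PostGIS query relating manual (reference) and automatic segments:
# intersection areas and centroid distances per reference polygon.
# Table/column names (manual_segments, auto_segments, geom) are placeholders.
import psycopg2

SQL = """
SELECT m.id                                       AS manual_id,
       a.id                                       AS auto_id,
       ST_Area(ST_Intersection(m.geom, a.geom))   AS inter_area,
       ST_Area(m.geom)                            AS manual_area,
       ST_Distance(ST_Centroid(ST_Intersection(m.geom, a.geom)),
                   ST_Centroid(m.geom))           AS centroid_dist
FROM   manual_segments m
JOIN   auto_segments   a ON ST_Intersects(m.geom, a.geom)
ORDER  BY manual_id;
"""

with psycopg2.connect(dbname="obia", user="postgres") as conn:
    with conn.cursor() as cur:
        cur.execute(SQL)
        rows = cur.fetchall()   # feeds the RA_so, RP_so and CI computations below
```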

A segmentation scenario is a combination of two parameters: shape and compactness. For every scenario, a total of eight scale parameter values, ranging from 10 to 80 with increments of 10, were used. The optimal combination of parameters for the segmentation was found by comparing the Comparison Index (CI) values of the different segmentation scenarios: the scenario that provided the highest CI value determined the optimal set of segmentation parameters. A total of five segmentation scenarios (Table 1) were created to select the optimal segmentation parameters; they are the same as the ones used in [14]. In total, 40 segmentations were performed (5 segmentation scenarios × 8 scale values).

Table 1. Five segmentation scenarios used to assess the segmentation quality

Parameter      S1    S2    S3    S4    S5
shape          0.2   0.4   0.4   0.4   0.5
compactness    0.5   0.5   0.3   0.7   0.3
scale (H)      10, 20, ..., 80 for every scenario

The FNEA algorithm allows different weights to be assigned to the layers in the segmentation process [25]. In this work, all the layers were configured with a weight equal to 1, meaning that all the layers had equal importance for the algorithm.

In addition to the automatic segments produced by the FNEA algorithm, a set of manual reference segments was created to compare their similarities in terms of common area and gravity center distances. As described in [14], a set of regions in the area of interest had to be defined. Then, for every selected area, a manual delimitation of the objects was done. Six circular patches (Figure 4) were defined, aiming to select the most representative areas for all the land cover classes considered, since the land cover classes are not equally represented in the study area. Afterwards, a manual segmentation process was performed by visual interpretation. Although a circular shape was used in this study, other geometric shapes are also feasible.

Fig. 4. Circular regions with manual segments

For the sake of simplicity, the polygons produced with the optimal segmentation parameters are referred to as the automatic segmentation, and the polygons produced manually are referred to as the manual segmentation.

Segmentation validation is a problem of matching objects [26], and the relations between objects can be explained in terms of topology and geometry. In the former, the relations of overlap and containment are analyzed [14], [26]; in the latter, geometric differences are expressed in terms of distances between objects, usually computed between their gravity centers [14]. In the methodology described by [14], the intersections of the segments (automatic-manual, manual-automatic) are compared with the object that contains the intersected areas. In this process, a super object is an object that contains objects produced by the intersection operation. The process is performed twice: first, the manual segments are used as super objects (over-segmentation); then, the automatic segments are used as super objects (under-segmentation). Equation 1 shows the formula used to determine the geometric grade of alikeness between a super object and the intersected areas.

RA_{so} = \sum_{i=1}^{m} \frac{A_i}{A_{so}}    (1)

where m is the number of polygons that intersect the analyzed super object, A_i is the area of the intersected polygon, A_{so} is the area of the super object and 0 ≤ RA_{so} ≤ 1. A value of 1 represents a perfect match between the intersected polygons and the super object. Equation 2 shows the formula to determine the topological grade of alikeness between a super object and the intersected areas.

RP_{so} = \sum_{i=1}^{m} \frac{d_i}{d_{max_i}}    (2)

where m has the same meaning as in Equation 1, d_i is the distance between the gravity centers of the intersected polygon and the super object, d_{max_i} is the highest distance value among the intersected areas contained in the super object, and 0 ≤ RP_{so} ≤ 1. A value of 0 means that the gravity centers of the intersected polygon and the super object are located at the same place. RA_{so} and RP_{so} were combined into a comparison index that quantifies the agreement between the automatic and manual segments in terms of over- and under-segmentation. Equation 3 shows the formula for the Comparison Index (CI).

CI = \sum_{i=1}^{n} \frac{100 \cdot C_i \cdot (RA_{so_i} - (1 - RP_{so_i}))}{2 \cdot N_{circles}}    (3)

where n is the number of super objects, C_i is the size relation between the analyzed super object and the total area used in the analysis, and N_{circles} is the number of circular areas used in the study. In this case, N_{circles} is six.

Two CI values were computed for every segmentation scenario and scale parameter: one represented the CI of the over-segmentation, the other the CI of the under-segmentation. Thus, for every scenario, two curves of the CI value as a function of the scale parameter (H) were created. The scenario that provided the highest CI value at the intersection of the over- and under-segmentation curves was the one that provided the optimal set of segmentation parameters. The sketch below illustrates Equations 1-3.
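The following minimal sketch implements Equations 1-3 as stated above, assuming that the intersection areas, centroid distances and size fractions per super object have already been extracted (for instance with spatial queries such as the one sketched earlier). The data structure and the guard against a zero maximal distance are illustrative assumptions.

```python
# Sketch of Equations 1-3 for one set of super objects (one segmentation run).
from dataclasses import dataclass
from typing import List

@dataclass
class SuperObject:
    area: float                  # A_so: area of the super object
    inter_areas: List[float]     # A_i: areas of the intersected polygons
    centroid_dists: List[float]  # d_i: centroid distances to the super object
    size_fraction: float         # C_i: super-object area / total analysed area

def ra_so(so: SuperObject) -> float:
    """Equation 1: geometric agreement; 1.0 is a perfect areal match."""
    return sum(a / so.area for a in so.inter_areas)

def rp_so(so: SuperObject) -> float:
    """Equation 2: topological agreement; 0.0 means coincident gravity centres.
    d_max is read here as the largest centroid distance within the super object;
    the 'or 1.0' guard avoids division by zero and is an assumption."""
    d_max = max(so.centroid_dists) or 1.0
    return sum(d / d_max for d in so.centroid_dists)

def comparison_index(super_objects: List[SuperObject], n_circles: int = 6) -> float:
    """Equation 3: CI over all super objects in the circular patches."""
    return sum(
        100.0 * so.size_fraction * (ra_so(so) - (1.0 - rp_so(so)))
        for so in super_objects
    ) / (2.0 * n_circles)
```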

3.2 Segmentation

After selecting the optimal segmentation parameters, the segmentation process was carried out. To analyze the effects of using the CI-based optimization to determine the optimal segmentation parameters, two segmentations were performed (manual and automatic). Furthermore, to improve the classification accuracy, LIDAR height information was attached to the segmented objects, so that more input features were available for the classification process [5], [27].

The workflow for the automatic segmentation is the following: (1) multiresolution segmentation using the optimal segmentation parameters; (2) deletion of very small objects (noise), which are absorbed by the biggest object that encompasses them; (3) spectral difference operation with a value of three; (4) export of the results to a shapefile. The software eCognition provides a vast number of features that can be retrieved [11], [12]. To select the list of features, three groups of information were envisaged, i.e. layer values, geometry and texture.


The workflow for the manual segmentation is the following: (1) multiresolution segmentation using the manually delimited dataset as the boundary; (2) export of the results to a shapefile. At the end of this step, two shapefiles containing the objects and related information for the automatic and manual segmentations were generated.

3.3 Classification

The second step in an OBIA is the classification of the segmented objects. Classification was performed using one Machine Learning algorithm: SVM. A selection of training objects was made by visual interpretation of the objects contained in the multispectral image. In total, 100 objects per class were selected for the training phase.

Some classes, such as trees, water and bare soil, could be located very easily in the study area. On the other hand, classes such as rough grass and shrubs were not easy to distinguish.

Visual interpretation was done by randomly selecting objects of the different classes, and special attention was paid to selecting a wide range of examples. Additionally, the training objects were selected only inside the previously defined circular regions.

To select the test objects, the geodataset containing the test points was used to intersect the polygons created by the manual and automatic segmentations, as sketched below.
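The snippet below shows one way to perform this intersection with geopandas; it is an illustrative alternative to the PostGIS-based implementation, and the file names are placeholders.

```python
# Illustrative selection of test objects: spatially joining the GPS-measured
# test points with the segment polygons. File names are hypothetical.
import geopandas as gpd

points = gpd.read_file("test_points.shp")          # 25 points per class
segments = gpd.read_file("automatic_segments.shp") # or the manual segmentation

# Each test point labels the polygon that contains it.
test_objects = gpd.sjoin(segments, points, how="inner", predicate="intersects")
print(len(test_objects), "test objects selected")
```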

Table 2. Grid search parameters for the RBF kernel

Parameter   Generated Values
C           3.0 ** np.arange(-2, 9)
Gamma       3.0 ** np.arange(-5, 4)

To compute the average accuracy for every kind of segmentation, ten classifications were performed per segmentation, from which the average and the standard deviation were computed. The SVM classification was performed using the scikit-learn module [21] in Python. Two kernels were tested: RBF and Linear. The only option for a multi-class problem in this module is one-vs-one. The parameter tuning was performed using a simple grid search of values, as suggested in [21]. Table 2 shows the parameter values used in the grid search for the RBF kernel; the Linear kernel was used with the default parameters implemented in the scikit-learn module.
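The sketch below reproduces this tuning setup with scikit-learn, using the grid of Table 2 for the RBF kernel and default parameters for the linear kernel. The feature matrix and labels are synthetic placeholders, and only a single train/test split is shown instead of the ten repetitions used to compute the averages.

```python
# Sketch of the SVM parameter tuning: grid search over the Table 2 values for
# the RBF kernel, plus a linear kernel with default settings.
import numpy as np
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))    # placeholder object features (layer values, geometry, texture)
y = rng.integers(0, 5, size=500)  # placeholder labels for the five land cover classes
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

param_grid = {
    "C": 3.0 ** np.arange(-2, 9),      # Table 2
    "gamma": 3.0 ** np.arange(-5, 4),  # Table 2
}
rbf_search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5)
rbf_search.fit(X_train, y_train)

linear_svm = SVC(kernel="linear").fit(X_train, y_train)  # default parameters

# The study repeats such runs ten times to report mean and standard deviation.
print("RBF accuracy:   ", rbf_search.score(X_test, y_test))
print("Linear accuracy:", linear_svm.score(X_test, y_test))
```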

4 Results

4.1 Optimal Parameters Selection

Five scenarios were analyzed. For every scenario, eight segmentations were performed with scale values in the range of 10 to 80 with increments of 10. The resulting CI diagrams are shown in Figure 5. The intersection point between the two curves was selected as the optimal scale parameter, since that scale parameter provides the equilibrium point between over- and under-segmentation; a minimal sketch of how this intersection can be located is given below.
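The following sketch locates the crossing point of the over- and under-segmentation CI curves by linear interpolation between the sampled scale values. The CI values below are illustrative placeholders, not the published curves, and the interpolation itself is an assumption.

```python
# Sketch: find the intersection of two sampled CI curves (over vs. under).
import numpy as np

scales = np.arange(10, 90, 10)                                  # H = 10, 20, ..., 80
ci_over = np.array([40.0, 35.0, 30.0, 26.0, 22.0, 19.0, 17.0, 15.0])   # placeholder
ci_under = np.array([10.0, 18.0, 24.0, 28.0, 31.0, 33.0, 35.0, 36.0])  # placeholder

diff = ci_over - ci_under
k = np.where(np.diff(np.sign(diff)) != 0)[0][0]   # first sign change of the difference
t = diff[k] / (diff[k] - diff[k + 1])             # fractional position between samples
h_opt = scales[k] + t * (scales[k + 1] - scales[k])
ci_opt = ci_over[k] + t * (ci_over[k + 1] - ci_over[k])
print(f"intersection at H ~ {h_opt:.1f}, CI ~ {ci_opt:.1f}")
```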

As can be seen in Figure 5 and Table 3, the scenario that provided the optimal segmentation parameter setting was number one, with a scale parameter equal to 19, shape equal to 0.2 and compactness equal to 0.5. All the analyzed scenarios provided very similar CI values: the highest value is 26 and the lowest 25.3. As for the optimal scale parameter (H), the range goes from 14 to 19. The CI curves of scenario 5 did not intersect within the analyzed range.

Table 3. CI and H values, at the intersections of the over- and under-segmentation curves, per scenario

Scenario   Intersection CI Value   Scale Value (H)
1          26                      19
2          25.3                    14
3          25.5                    17
4          25.5                    14
5          -                       -

4.2 Segmentation

Two segmentations were performed to generate the manual and automatic segments. Figure 6 depicts one representative area of the image produced by the automatic segmentation after applying the spectral difference operation with a value of three. This operation merges neighboring image objects if the difference of their mean layer intensities is below a certain value (the maximal spectral difference); the sketch below illustrates this merge criterion. For this study, the maximal spectral difference value of three was found by a trial-and-error process and visual interpretation of the results.
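The spectral difference operation itself runs inside eCognition; the sketch below only illustrates one plausible reading of its merge criterion, namely that two neighboring objects are fused when their mean layer intensities differ by less than the maximal spectral difference (three in this study). The per-layer check and the toy values are assumptions, not the proprietary implementation.

```python
# Illustration only: one plausible reading of the spectral difference merge rule.
from typing import List

MAX_SPECTRAL_DIFF = 3.0   # value used in this study

def should_merge(mean_a: List[float], mean_b: List[float],
                 threshold: float = MAX_SPECTRAL_DIFF) -> bool:
    """Merge two adjacent objects if every layer mean differs by less than the threshold
    (assumption: the criterion is applied per layer)."""
    return all(abs(a - b) < threshold for a, b in zip(mean_a, mean_b))

# Toy mean values for two hypothetical neighboring objects
# (green, red, red-edge, NIR layers).
obj_a = [112.4, 98.1, 105.7, 140.2]
obj_b = [110.9, 96.8, 104.3, 139.0]
print(should_merge(obj_a, obj_b))   # True: the two objects would be fused
```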

4.3 Classification

The final step of the OBIA paradigm is classification. Thus, four classifications were performed. Table 4 shows the overall accuracy results for each of the input segmentations that were the subject of classification. The values shown in Table 4 are the average accuracy values after ten trials.

The linear kernel has the highest overall accuracy value. Furthermore, the manual segmentation outperforms the automatic one. However, the overall accuracy obtained with the automatic segmentation is less than 2% lower than that of the manual segmentation.


Fig. 5. CI-H diagrams for the five scenarios analyzed: a) scenario 1, b) scenario 2, c) scenario 3, d) scenario 4, e) scenario 5

Fig. 6. Sample segmented area generated by the automatic segmentation


Table 4. Overall classification accuracies on the two segmentations using an SVM classifier with Linear and RBF kernels

Segmentation   SVM Linear   SVM RBF
Automatic      89.38%       80.53%
Manual         91.15%       82.83%

As can be seen in Table 5, the bare soil, trees and water classes were classified accurately. The classifier had problems with the rough grass class: it incorrectly classified shrubs and trees as rough grass.

Table 5. Confusion matrix of the SVM with linear kernel for the manual segmentation. Overall accuracy = 91.2%

Predicted/Actual      Bare Soil   Rough Grass   Shrubs   Trees   Water   User's Accuracy
Bare Soil             24          1             0        0       0       96%
Rough Grass           0           22            4        2       0       78.6%
Shrubs                0           0             19       2       0       90.5%
Trees                 0           0             1        21      0       95.5%
Water                 0           0             0        0       19      100%
Producer's Accuracy   100%        95.7%         79.2%    84%     100%    91.2%

When the SVM classifier is trained using the automatic segmentation (Table 6), the classifier had problems classifying shrubs, misclassifying them as rough grass. As for the other classes in the problem, the classification accuracies were over 90%.

Table 6. Confusion matrix of the SVM with linear kernel for the automatic segmentation. Overall accuracy = 89.38%

Predicted/Actual      Bare Soil   Rough Grass   Shrubs   Trees   Water   User's Accuracy
Bare Soil             24          0             0        1       0       96%
Rough Grass           1           23            7        1       0       71.9%
Shrubs                0           0             16       0       0       100%
Trees                 0           0             2        21      0       91.3%
Water                 0           0             0        0       17      100%
Producer's Accuracy   96%         100%          64%      91.3%   100%    89.38%


5 Discussion

The results of this work provide evidence that the process of finding the optimal scale parameter for the segmentation algorithm is complex and must be performed with care, since human interpretation is needed to define the circular areas where the manual segmentation is performed. Furthermore, the manual segmentation is only correct when the segments enclose the real objects in the region of interest as accurately as possible. Hence, this process is highly subjective and requires experience to differentiate, by visual interpretation, the different classes involved in the land cover classification problem.

Although LIDAR information was used as part of the input features, the classification results showed that the SVM classifier had problems differentiating between shrubs, rough grass and trees. Other studies, such as [5], [27], showed that adding LIDAR information to the classification process improves the accuracy considerably.

LIDAR data was not used in the selection of the optimal segmentation parameters since its addition did not change the segmentation output noticeably, while it increased the computation time excessively.

The homogeneity of the physical objects contained in the study area had a big impact on the computation of the Comparison Index (CI). As stated in [14], the CI is a measure of how well a set of automatic and manual segmentations coincide; when the overlap is perfect, the CI value is equal to 100. As can be seen in Figure 5, the obtained CI values were rather small for all the segmentation scenarios, whereas [14] obtained a CI value slightly higher than 90. Our low CI values can be explained by the fact that the objects in the study area are not homogeneous in terms of area. For instance, the water and bare soil objects are large in comparison to the shrubs. Additionally, as can be seen in Figure 1, the patches of trees vary strongly in size: in the northern part the patches are big, while in the southern part they are small. Furthermore, the shrub areas, mostly located in the central part of the study area, are small. In summary, the scales of the objects in the study area are very diverse, and there is no single scale parameter that is able to characterize their areas correctly. The imagery used in [13], [14], [19] also contains objects of different sizes, but the differences in size between the objects are not as big as in the multispectral image used in this study.

[13], [14], [19] all state that smaller objects are to be preferred over larger objects. In other words, when under-segmentation arises, it is not possible, no matter which classifier is used, to perform a correct classification [13]. This analysis suggests that the scale parameter obtained for this study is correct, since the overall classification accuracy using the automatic segmentation was high. However, a spectral difference operation had to be performed to merge similar neighboring objects into bigger ones. This operation became crucial, especially to cope with the diversity in object sizes between the different classes, since a scale of 19 produced several small objects that together represented a bigger one, e.g., water bodies and rough grass patches.


The classification accuracies were highest with the manual segmentation, although the classification accuracy values resulting from the automatic segmentation were less than 2% lower. These results confirm that the scale parameter of 19 was correct.

The creation of manual segments is a challenging task since there are no clear guidelines available. Besides, the information of the image bands is not fully used; only an expert could correctly interpret the combination of colors in the multispectral image. Since the manual segmentation is a subjective task, errors are very likely to arise during this process. The only way to avoid interpretation errors is to define a segmentation based on ground truth, or to have an expert with previous knowledge about the study area perform this task. In this study, the manual segmentation was done by visual interpretation. It is not possible to quantify the amount of error resulting from the manual segmentation due to the lack of a ground-validated manual segmentation for the study area. A similar problem arises in the process of selecting objects to train the classifier: even though some objects, such as trees, water and bare soil, were very easy to determine, others, such as shrub and rough grass patches, were not. This may be one of the reasons why the SVM classifier had problems classifying certain classes. The confusion matrices shown in Table 5 and Table 6 confirm that the classifiers had problems classifying shrubs, rough grass and trees. Since there are no validated training points for the study area, it is not possible to quantify the amount of error produced by an incorrect selection of training objects.

Although the process of finding the optimal segmentation parameters used in our study seemed to be correct, another feasible methodology to assure segmentation quality is described in [19]. In that study, the idea of improving the segmentation, after finding the optimal scale parameter, by refining the under-segmentation with further segmentations at finer scales and refining the over-segmentation by merging similar neighboring objects is interesting, since sets of heterogeneous objects could benefit from this process. The workflows of the methods to assure segmentation quality are similar in the sense that all of them perform several segmentations at different scales and define comparability measures that must be maximized [13], [14], [19].

6 Conclusions

In this paper, an OBIA was performed on a multispectral image using optimal segmentation parameters derived from CI optimization. To analyze the effects of the optimal segmentation parameters, a manual segmentation was produced as a reference. The manual and automatic segmentations were then used to train an SVM classifier to extract five land cover classes. An SVM classifier with a Linear and an RBF kernel was used in the classification step. The best overall accuracy was produced by the combination of the manual segmentation and the linear kernel (91.15%). However, the overall accuracy produced by the automatic segmentation, using an SVM with a linear kernel, was only 2% lower than the manual one. This implies that a procedure to determine the optimal segmentation parameters can help to assure the quality of the OBIA process. The scale parameter found in this study characterizes the small objects correctly; however, it produces over-segmentation in bigger objects such as water bodies and rough grass patches. To overcome this problem, a spectral difference operation had to be performed. Even though an additional operation had to be performed, the scale parameter found in this study is the most feasible since it minimized the under-segmentation. This is important because, as stated in [13], over-segmentation is preferred over under-segmentation to guarantee classification accuracy. Since the spectral difference operation had an important role in this study, a future study might address its effects and how to determine the optimal parameter for this operation.

Acknowledgments. We would like to acknowledge Prof. Lammert Kooistra from Wageningen University and Research Centre in the Netherlands, who provided the datasets used in this work.

The multispectral dataset was provided through the project URD-Delta Oost, funded by the programme Urban Regions in the Delta of the Netherlands Organisation for Scientific Research (NWO).

References

1. Hay, G., Castilla, G.: Object-based image analysis: Strengths, weaknesses, opportunities and threats (SWOT). In: Proc. 1st Int. Conf. OBIA (2006)

2. Dronova, I., Gong, P., Clinton, N.E., Wang, L., Fu, W., Qi, S., Liu, Y.: Landscape analysis of wetland plant functional types: The effects of image segmentation scale, vegetation classes and classification methods. Remote Sensing of Environment 127, 357–369 (2012)

3. Gao, Y., Mas, J.: A Comparison of the Performance of Pixel Based and Object Based Classifications over Images with Various Spatial Resolutions. Online Journal of Earth Sciences (8701) (2008)

4. Whiteside, T.G., Boggs, G.S., Maier, S.W.: Comparing object-based and pixel-based classifications for mapping savannas. International Journal of Applied Earth Observation and Geoinformation 13(6), 884–893 (2011)

5. Hantson, W., Kooistra, L., Slim, P.A.: Mapping invasive woody species in coastal dunes in the Netherlands: a remote sensing approach using LIDAR and high-resolution aerial photographs. Applied Vegetation Science 15(4), 536–547 (2012)

6. Tzotsos, A., Argialas, D.: A support vector machine approach for object based image analysis. In: ... Conference on Object-Based Image Analysis ... (2006)

7. Yu, Q., Gong, P., Clinton, N.: Object-based detailed vegetation classification with airborne high spatial resolution remote sensing imagery. ... Remote Sensing 72(7), 799–811 (2006)

8. Navulur, K.: Multispectral Image Analysis Using the Object-Oriented Paradigm. CRC Press (2007)

9. Blaschke, T.: Object-based contextual image classification built on image segmentation. Advances in Techniques for Analysis of Remotely ... 00(C) (2003)


10. Baatz, M., Schape, A.: Multiresolution segmentation: an optimization approach for high quality multi-scale image segmentation. Angewandte Geographische ... (2000)

11. Trimble: eCognition Developer 8.8 Reference Book (2012)

12. Trimble: eCognition Developer 8.8 User Guide (2012)

13. Liu, D., Xia, F.: Assessing object-based classification: advantages and limitations. Remote Sensing Letters 1(4), 187–194 (2010)

14. Moller, M., Lymburner, L., Volk, M.: The comparison index: A tool for assessing the accuracy of image segmentation. International Journal of Applied Earth Observation and Geoinformation 9(3), 311–321 (2007)

15. Gahegan, M.: Is inductive machine learning just another wild goose (or might it lay the golden egg)? International Journal of Geographical Information Science 17, 69–92 (2003)

16. Atkinson, P.M., Tatnall, A.R.L.: Introduction: Neural networks in remote sensing. International Journal of Remote Sensing, 37–41 (2010)

17. Melgani, F., Bruzzone, L.: Classification of hyperspectral remote sensing images with support vector machines. IEEE Transactions on Geoscience and Remote Sensing 42(8), 1778–1790 (2004)

18. Cortes, C., Vapnik, V.: Support-vector networks. Machine Learning 20(3), 273–297 (1995)

19. Johnson, B., Xie, Z.: Unsupervised image segmentation evaluation and refinement using a multi-scale approach. ISPRS Journal of Photogrammetry and Remote Sensing 66(4), 473–483 (2011)

20. QuantumGIS: QuantumGIS (August 2013), http://www.qgis.org/

21. scikit-learn: scikit-learn (August 2013), http://scikit-learn.org/

22. Python: Python (August 2013), http://www.python.org/about/

23. PostgreSQL: PostgreSQL (August 2013), http://www.postgresql.org/docs/9.2/interactive/index.html

24. PostGIS: PostGIS (August 2013), http://postgis.net/

25. Benz, U.C., Hofmann, P., Willhauck, G., Lingenfelder, I., Heynen, M.: Multi-resolution, object-oriented fuzzy analysis of remote sensing data for GIS-ready information. ISPRS Journal of Photogrammetry and Remote Sensing 58(3-4), 239–258 (2004)

26. Zhan, Q., Molenaar, M.: Quality assessment for geo-spatial objects derived from remotely sensed data. ... Journal of Remote Sensing 26(14), 2953–2974 (2005)

27. Mountrakis, G., Im, J., Ogole, C.: Support vector machines in remote sensing: A review. ISPRS Journal of Photogrammetry and Remote Sensing 66(3), 247–259 (2011)