thematic mapping of urban areas from worldview-2 satellite ... · support vector machines...

10
UCTEA Chamber of Surveying and Cadastre Engineers Journal of Geodesy and Geoinformation TMMOB Harita ve Kadastro Mühendisleri Odası Jeodezi ve Jeoinformasyon Dergisi © 2013 HKMO Vol.2 Iss.1 pp.29-38 May 2013 Journal No.107 Doi: 10.9733/jgg.070114.2 www.hkmodergi.org Thematic mapping of urban areas from WorldView-2 satellite imagery using machine learning algorithms Dilek Koc-San* Department of Space Sciences and Technologies, Akdeniz University, Antalya 07058, Turkey Volume: 2 Issue: 1 Page: 29 - 38 May 2013 Abstract Monitoring and mapping urban areas from remote sensing imagery using image classification techniques are important for urban and regional planners and municipalities. High resolution satellite images are essential data sources for the generation of urban land-use/land-cover maps. In this paper, urban thematic maps are generated from WorldView-2 satellite images using machine learning algorithms, namely random forest and support vector machines classifiers. The performances of these classifiers are evaluated and compared using four test areas that have different urban characteristics. The obtained visual and quantitative results for four test areas indicate the effectiveness of the machine learning algorithms in urban thematic mapping. The overall accuracies were computed in the range of 89.92 and 96.38 and overall kappa values were computed in the range of 0.8790 and 0.9566 for random forest classifier, which are considerably high. Similarly, for support vector machines classifier, the overall accuracies were computed in the range of 91.98 and 96.07 and overall kappa values were computed in the range of 0.9038 and 0.9528. The results also show that different classification accuracies for different test areas are related to the properties of the selected urban patterns. Keywords Urban thematic mapping, Random forest classifier, Support vector machines classifier, WorldView-2, Antalya Cilt: 2 Sayı: 1 Sayfa: 29 - 38 Mayıs 2013 Özet Kentsel Alanların WorldView-2 uydu görüntülerinden makine öğrenme algoritmaları kullanılarak tematik haritalanması Kentsel alanların uzaktan algılama görüntülerinden görüntü sınıflandırma teknikleri kullanılarak izlenmesi ve haritalanması şehir ve bölge plancıları ve belediyeler için önemlidir. Yüksek çözünürlüklü uydu görüntü- leri arazi-kullanımı/arazi-örtüsü haritalarının elde edilmesinde önemli veri kaynaklarıdır. Bu makalede, ma- kine öğrenme algoritmalarından rastgele orman ve destek vektör makineleri sınıflandırmaları kullanılarak WorldView-2 uydu görüntülerinden kentsel tematik haritalar elde edilmiştir. Bu sınıflandırmaların perfor- mansları seçilen farklı kentsel karakteristiklere sahip dört test alanında değerlendirilmiş ve karşılaştırılmış- tır. Dört test alanı için elde edilen görsel ve nicel sonuçlar, makine öğrenme algoritmalarının kentsel tematik haritalamada verimliliğini göstermektedir. Rastgele orman sınıflandırması kullanıldığında sınıflandırma doğrulukları 89.92 ile 96.38 değerleri arasında ve kappa değerleri 0.8790 ile 0.9566 değerleri arasında hesaplanmıştır ki bu değerler oldukça yüksektir. Benzer şekilde, destek vektör makineleri sınıflandırması kullanıldığında sınıflandırma doğrulukları 91.98 ile 96.07 değerleri arasında ve kappa değerleri 0.9038 ile 0.9528 değerleri arasında hesaplanmıştır. Sonuçlar, ayrıca farklı test alanları için elde edilen farklı sınıflan- dırma doğruluklarının seçilen kentsel dokuların özellikleriyle ilişkili olduğunu göstermektedir. Anahtar Sözcükler Kentsel tematik haritalama, Rastgele orman sınıflandırması, Destek vektör makineleri sınıflandırması, WorldView-2, Antalya *Corresponding Author: Tel: +90 (242) 310 2340 Fax: +90 (242) 227 8911 E-mail: [email protected]

Upload: others

Post on 28-May-2020

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Thematic mapping of urban areas from WorldView-2 satellite ... · support vector machines classifiers. The performances of these classifiers are evaluated and compared using ... Benzer

UCTEAChamber of Surveying and Cadastre Engineers

Journal of Geodesy and Geoinformation

TMMOBHarita ve Kadastro Mühendisleri Odası

Jeodezi ve Jeoinformasyon Dergisi

© 2013 HKMO

Vol.2 Iss.1 pp.29-38 May 2013 Journal No.107 Doi: 10.9733/jgg.070114.2 www.hkmodergi.org

Thematic mapping of urban areas from WorldView-2 satellite imagery using machine learning algorithmsDilek Koc-San* Department of Space Sciences and Technologies, Akdeniz University, Antalya 07058, Turkey

Volume: 2Issue: 1Page: 29 - 38May 2013

AbstractMonitoring and mapping urban areas from remote sensing imagery using image classification techniques are important for urban and regional planners and municipalities. High resolution satellite images are essential data sources for the generation of urban land-use/land-cover maps. In this paper, urban thematic maps are generated from WorldView-2 satellite images using machine learning algorithms, namely random forest and support vector machines classifiers. The performances of these classifiers are evaluated and compared using four test areas that have different urban characteristics. The obtained visual and quantitative results for four test areas indicate the effectiveness of the machine learning algorithms in urban thematic mapping. The overall accuracies were computed in the range of 89.92 and 96.38 and overall kappa values were computed in the range of 0.8790 and 0.9566 for random forest classifier, which are considerably high. Similarly, for support vector machines classifier, the overall accuracies were computed in the range of 91.98 and 96.07 and overall kappa values were computed in the range of 0.9038 and 0.9528. The results also show that different classification accuracies for different test areas are related to the properties of the selected urban patterns.

KeywordsUrban thematic mapping, Random forest classifier, Support vector machines classifier, WorldView-2, Antalya

Cilt: 2Sayı: 1Sayfa: 29 - 38Mayıs 2013

ÖzetKentsel Alanların WorldView-2 uydu görüntülerinden makine öğrenme algoritmaları kullanılarak tematik haritalanmasıKentsel alanların uzaktan algılama görüntülerinden görüntü sınıflandırma teknikleri kullanılarak izlenmesi ve haritalanması şehir ve bölge plancıları ve belediyeler için önemlidir. Yüksek çözünürlüklü uydu görüntü-leri arazi-kullanımı/arazi-örtüsü haritalarının elde edilmesinde önemli veri kaynaklarıdır. Bu makalede, ma-kine öğrenme algoritmalarından rastgele orman ve destek vektör makineleri sınıflandırmaları kullanılarak WorldView-2 uydu görüntülerinden kentsel tematik haritalar elde edilmiştir. Bu sınıflandırmaların perfor-mansları seçilen farklı kentsel karakteristiklere sahip dört test alanında değerlendirilmiş ve karşılaştırılmış-tır. Dört test alanı için elde edilen görsel ve nicel sonuçlar, makine öğrenme algoritmalarının kentsel tematik haritalamada verimliliğini göstermektedir. Rastgele orman sınıflandırması kullanıldığında sınıflandırma doğrulukları 89.92 ile 96.38 değerleri arasında ve kappa değerleri 0.8790 ile 0.9566 değerleri arasında hesaplanmıştır ki bu değerler oldukça yüksektir. Benzer şekilde, destek vektör makineleri sınıflandırması kullanıldığında sınıflandırma doğrulukları 91.98 ile 96.07 değerleri arasında ve kappa değerleri 0.9038 ile 0.9528 değerleri arasında hesaplanmıştır. Sonuçlar, ayrıca farklı test alanları için elde edilen farklı sınıflan-dırma doğruluklarının seçilen kentsel dokuların özellikleriyle ilişkili olduğunu göstermektedir.

Anahtar SözcüklerKentsel tematik haritalama, Rastgele orman sınıflandırması, Destek vektör makineleri sınıflandırması, WorldView-2, Antalya

*Corresponding Author: Tel: +90 (242) 310 2340 Fax: +90 (242) 227 8911

E-mail: [email protected]

Page 2: Thematic mapping of urban areas from WorldView-2 satellite ... · support vector machines classifiers. The performances of these classifiers are evaluated and compared using ... Benzer

30 Thematic mapping of urban areas from WorldView-2 satellite imagery using machine learning algorithms

1. Introduction

The land-use/land-cover detection from remotely sensed data is an important research area and has been attracted great attention in remote sensing community. In urban planning, there is a great need for current urban land-use/land-cover maps. These maps are important sources for the determina-tion of urban areas, urban facility distributions, monitoring urban changes and updating urban maps. The detected urban objects in land-cover maps are quite important for a variety of applications related to urban such as urban planning and management, urban information systems, GIS applications, and 3D city modeling.

Using remote sensing imagery for urban thematic map-ping purposes is a rapid, accurate and cost-effective method. For accurate urban area mapping high spatial resolution im-ages (≤ 5m) are needed (Welch 1982; Ridd 1995). With the recent technological developments, high spatial resolution satellite images have been used increasingly and efficiently for accurate and reliable land-cover mapping of urban areas. The remote sensing images obtained by high spatial resolu-tion satellite sensor are important data sources for land-cover classification. Among them, WorldView-2 satellite imagery became prominent with its eight multispectral bands and increased fidelity. There are studies that use high resolu-tion satellite/aerial imagery for urban land-cover mapping. Yonezawa (2007) used maximum likelihood classification combined with spectral angle mapper algorithm for urban land cover detection from QuickBird imagery and concluded that the proposed algorithm increase the classification accu-racy. Huang et al. (2007) compared maximum likelihood, back-propagation neural network, probability neural network and support vector machines classifiers using QuickBird satellite imagery for feature extraction in urban areas. Pacifici et al. (2009) performed neural network classifier for urban area classifi-cation from Quickbird and WorldView-1 satellite imagery. Chen et al. (2012) proposed a modified object-oriented clas-sification algorithm for high resolution image classification and applied this algorithm on IKONOS satellite images. On the other hand, there are studies that use Lidar data in addi-tion to high resolution imagery for urban thematic mapping (Guan et al. 2013; Berger et al. 2013).

There are lots of classification techniques that can be used in land use/land cover classification, such as unsupervised, parametric supervised, non-parametric supervised classifica-tion techniques. Machine learning algorithms are non-para-metric supervised classification techniques and relatively new research area in remote sensing. Among them, ensemble learning techniques and Support Vector Machines (SVM) Classifier attract great attention in the remote sensing ap-plications. Ensemble learning techniques, such as bagging and boosting, use a set of decision trees and therefore gen-erally provide better performances than a single classifier (Breiman 1996; Gislason et al. 2006, Rodriguez-Galiano et al. 2012). Compared with bagging, boosting provides more accurate results. However, bagging is computationally more effec-tive than boosting. The random forest (RF) classifier utilizes improved bagging algorithm. The RF classifier has compu-tational efficiency and often provides higher performance

than other classifiers (Waske et al. 2012). On the other hand, SVM classifier is based on statistical learning theory. It was initially proposed for binary classification and afterward ex-panded for multiclass classification problem. The SVM clas-sification has recently been used in remote sensing research area providing promising and challenging results.

In the literature, there are studies that compare RF classifier with other classification techniques. Waske and Braun (2009) compared RF classification algorithm with Decision Tree, Maximum Likelihood, Boosted Decision Tree algorithms for land-cover mapping of agricultural areas from multispec-tral SAR imagery and found that RF outperform the other classification algorithms. In the study conducted by Rodri-guez-Galiano and Chica-Rivas (2012), the performances of classification trees, artificial neural networks, support vector machines, and random forests were evaluated. They con-cluded that RF provides better classification accuracy than the other classification techniques. Palsson et al. (2012) used RF and Support Vector Machines (SVM) classification tech-niques to classify urban satellite images and they showed that RF classifier produced better classification accuracies in almost all cases. In the study carried out by Akar and Güngör (2012), the RF classification results were compared with Gentle Ad-aBoost, SVM and Maximum Likelihood Classification. They revealed that the RF classification provides higher accura-cies.

The superiority of SVM over other classification techniques has been indicated in the previous studies. Huang et al. (2002) evaluated the comparative performances of SVM, maxi-mum likelihood, neural network and decision tree classifiers and found that SVM gave higher accuracies in most cases. Kavzoglu and Colkesen (2009) compared SVM and maxi-mum likelihood classifiers and reported that SVM outper-form maximum likelihood classifier. In the study performed by Watanachaturaporn et al. (2008), the SVM classifier was compared with other popular classifiers (maximum likeli-hood, decision tree, back-propagation neural network and ra-dial basis function network). In this study, they indicated that SVM has potential for accurate multisource classification. Besides, in the studies performed by Waske et al. (2009), Sesnie et al. (2010), Dalponte et al. (2013) and Koc-San (2013), the performances of SVM and RF classifiers were compared. In these studies, the SVM classification technique offered slightly better results than RF classification. On the other hand, there are studies that found classification perfor-mances of RF and SVM machine learning algorithms nearly equal (Pal 2005; Waske et al. 2009).

The aim of this study is to generate thematic maps of urban areas that have different characteristics using WorldView-2 satellite imagery. The performance and usability of two machine learning algorithms, namely Random Forests (RF) and Support Vector Machines (SVM) for urban thematic mapping are explored and compared. This article is organized as follows: in Section 2 the selected four test sites and the data used in this study are described. The methodology used for urban thematic mapping generation is explained in Section

Page 3: Thematic mapping of urban areas from WorldView-2 satellite ... · support vector machines classifiers. The performances of these classifiers are evaluated and compared using ... Benzer

3931Dilek Koc-San / Vol.2 Iss.1 2013

3. Then, Section 4 presents and discusses the experimental results. Finally, the conclusions are provided in Section 5.

2. Test sites and used data

For preparing urban thematic maps four test areas showing different characteristics were selected from Muratpasa, Konyaalti and Kepez districts of Antalya city of Turkey (Figure 1). The test areas represent different density urban areas and include buildings with various shapes, sizes and rooftops. The selected first test area is located in Kaleici (the old quarter) in the Muratpasa district and includes low-rise (one or two storey) buildings that are used as residential, touristic and commercial purposes. The second test area is located in the city center of Antalya in Muratpasa district and contains medium and high rise buildings. The third test area

is a newly developed residential area including medium and high rise buildings and located in Konyaalti district. On the other hand, the fourth test area is located in Kepez district. In this test area, there are low rise residential buildings that are very close to each other.

The first test area has traditional residential patterns. The second and third test areas are planned and regularly devel-oped settlements, while the fourth test area is an unplanned residential area. In addition, when the spectral characteristics of the test areas were analyzed it can be stated that, the first and fourth test areas mostly include brick roof buildings, as well as concrete roof buildings. On the other hand, there are mostly concrete roof buildings in the second and third test areas. The second and third test areas have advantage over the first and fourth test areas, because the buildings in these test areas have larger sizes.

Figure 1: The true color composites of WorldView-2 satellite images covering the test areas those are located in Antalya province of Turkey: Test Area I (a), Test Area II (b), Test Area III (c), and Test Area IV (d).

Page 4: Thematic mapping of urban areas from WorldView-2 satellite ... · support vector machines classifiers. The performances of these classifiers are evaluated and compared using ... Benzer

4032 Thematic mapping of urban areas from WorldView-2 satellite imagery using machine learning algorithms

The data used in this study are WorldView-2 panchro-matic and multispectral satellite imagery. The WorldView-2 satellite imagery is one of the highest spatial resolution satel-lite imagery available on board, with 0.5m panchromatic and 2m multispectral spatial resolutions. In addition, the World-View-2 satellite imagery is the first commercial high reso-lution satellite image that has eight spectral bands (Coastal, Blue, Green, Yellow, Red, Red Edge, Near-IR1 and Near-IR2). The images used in this study were acquired on April 6, 2012. The WorldView-2 satellite imagery were purchased as geometrically corrected by DigitalGlobe (Updike and Comp 2010; Oumar and Mutanga 2013). As a pre-process, the mul-tispectral image fused with panchromatic image using PAN-SHARP2 (High performance image fusion) algorithm of PCI Geomatica 2013 and 0.5 m color image (8 bands) was ob-tained. This pan sharpening algorithm produces excellent pan sharpening results and preserves the spectral characteristics of the original images (PCI Software Users Manual 2013) and therefore preferred in this study.

3. Methodology3.1. Training and testing samples collection

To perform image classification initially the classes were de-termined for each test area by analyzing the land-covers in-cluded in test areas visually and then the training and testing samples were collected independently. The training samples were selected using stratified random sampling technique from the initially collected rough areas. For each class 500 training samples were selected. In addition, the testing sam-ples were collected from different areas than training sam-ples for generating impartial reference data. In this study, 1000 testing samples were collected for each class.

3.2. Thematic mapping of urban areas using machine learning algorithms

In this study, two powerful machine learning algorithms, RF and SVM, were utilized for urban land-cover mapping. The RF classifier, which is an ensemble learning algorithm, is proposed by Breiman (2001) as a new and promising classi-fication technique. The usage of RF classifier in remote sens-ing applications is relatively new (Rodriguez-Galiano and Chica-Rivas 2012). It utilizes bagging for producing ensem-ble of classification and regression tree (CART) classifier. Using bootstrapping, one third of the training samples are used for out-of-bag error estimate and the remaining training samples are used for constructing the tree. There are two im-portant parameters in RF classification; number of trees (k) that is used for growing the RF and number of variables (m) that is used to split each node. These parameters should be defined by user. Further details on RF classification can be found in Breiman (2001).

In this study, the RF classification was carried out using ImageRF, which is an IDL based random forests classifica-tion tool for remotely sensed imagery (Jakimow et al. 2012 and Waske et al. 2012). The k and m values were decided by testing these values between 100 - 1000 in increments of 100 and 1 - 8 in increments of 1, respectively. For each test area, the best parameters were determined by evaluating the out-

of-bag accuracy, overall accuracy and computational time. For Test Area I, the optimum k and m values were selected as 100 and 2, respectively. For Test Area II, the same values were determined as 700 and 2. For Test Area III, the optimum k and m values were selected as 900 and 1, respectively. Fi-nally for Test Area IV, the optimum k and m parameters were determined as 300 and 3. During the RF classification, for all test areas Gini coefficient was utilized as impurity function.

The fundamentals of SVM classifier were introduced by Vapnik (1995). In SVM classification, the aim is to find op-timal separating hyperplane that separates the classes. For this purpose, initially the training samples are represented as feature vectors. For the linearly inseparable case, the feature vectors are transformed into a feature space using a kernel function. The support vectors that are located at the edge of class distributions are used for building the hyperplane. Further details on SVM classification can be found in Vapnik (1995) and Wang (2005). During SVM clas-sification, determination of the penalty parameter, selection of the kernel method and related parameters are important.

In the present study, pairwise SVM classification was car-ried out using ImageSVM tool that is developed by Geomatics Lab of Humboldt-Universitat zu Berlin (Van der Linden et al. 2010;Rabe et al. 2010). During the SVM classification radial basis function was used as the kernel method. The optimum values for gamma term and penalty parameter were deter-mined using 3 fold cross-validation technique. The both pa-rameters were tested in 0.0001-1000 ranges for all test areas. The optimum gamma term value was determined as 0.0001 for all test areas. Additionally, the optimum penalty param-eter value was found as 10 for Test Areas I and II, while the same parameter was selected as 100 for Test Areas III and IV.

3.3. Accuracy Assessment

The quantitative evaluation of the obtained urban thematic maps was performed using error matrices, which is a widely used technique for the accuracy assessment in remote sens-ing. Based on error matrices the overall accuracy and overall kappa coefficient values were computed to measure the clas-sification accuracy. Additionally, to measure the individual class accuracies producer’s accuracy, user’s accuracy and per-category kappa values were calculated. The overall ac-curacy, which measures the classification accuracy, is calcu-lated by dividing the number of accurately classified pixels by the number of whole pixels. The overall kappa measures the overall agreement. On the other hand, the producer’s ac-curacy indicates how well a class is mapped and the user’s accuracy shows the reliability of classification for a certain class. The per-category kappa measures the agreement of each class.

The accuracies obtained by using both machine learning algorithms were further compared using McNemar test that is used to evaluate the statistical significance differences of pair of classifiers. The McNemar test can be stated with fol-lowing formula (Foody 2004):

2112

221122 )1(ff

ffX

(1)

Page 5: Thematic mapping of urban areas from WorldView-2 satellite ... · support vector machines classifiers. The performances of these classifiers are evaluated and compared using ... Benzer

4133Dilek Koc-San / Vol.2 Iss.1 2013

where f12 indicates the number of pixels that are correctly classified using first classifier, while incorrectly classified using second classifier and f21 shows the number of pixels that are correctly classified using second classifier but incor-rectly classified using first classifier.

4. Results and discussion

The urban thematic maps results generated by using RF and SVM classifiers are given in Figure 2. When the classifica-tion results are evaluated visually, it can be stated that the urban thematic maps were generated quite successfully. The obtained error matrices of RF classification results are pro-vided in Tables 1, 2, 3, and 4 for Test Areas I, II, III, and IV, respectively. Besides, the error matrices of SVM classifica-tion are given in Tables 5, 6, 7 and 8 for Test Areas I, II, III,

and IV, respectively. The obtained results indicate that the land-cover maps were produced quite accurately with over-all accuracy values higher than 89% for each test area using both classification techniques.

Using RF classifier, the overall accuracies were com-puted as 93.99% for Test Area I, 90.22% for Test Area II, 96.38% for Test Area III, and 89.92% for Test Area IV. The kappa coefficient values were computed as 0.9298, 0.8826, 0.9566, and 0.8790 for Test Areas I, II, III and IV, respec-tively. On the other hand, using SVM classifier, the over-all accuracies were computed as 94.00%, 92.68%, 96.07%, and 91.98% for Test Areas I, II, III, and IV, respectively. The kappa coefficient values were computed as 0.9300 for Test Area I, 0.9122 for Test Area II, 0.9528 for Test Area III, and 0.9038 for Test Area IV.

Figure 2: The RF and SVM classification results of Test Area I, Test Area II, Test Area III, and Test Area IV.

Page 6: Thematic mapping of urban areas from WorldView-2 satellite ... · support vector machines classifiers. The performances of these classifiers are evaluated and compared using ... Benzer

4234 Thematic mapping of urban areas from WorldView-2 satellite imagery using machine learning algorithms

When urban thematic maps of the individual test areas are compared, the most accurate maps were generated for Test Area III using both classification techniques. This test area is a planned settlement and therefore regularly developed. The buildings in this area are regular and widely spaced. In ad-dition, the determined classes are spectrally distinct and the confusions between the classes are lower, when compared with the other test areas. The obtained thematic maps of Test Area I were also quite accurate. However, the overall accu-racy of this test area is lower than the Test Area III, because in this test area the size of the most buildings are quite small and the buildings are closely spaced. On the other hand, rela-

tively lower accuracies were obtained for Test Areas II. This test area includes dense urban pattern and most of the build-ings in this area have concrete roofs that leads to confusion with road and bareland classes. In addition, the drawback of this test area during the classification is that there are solar energy water heating systems on the roofs of the buildings that may lead to problems in classification process. Using both classifiers, the lowest accuracies were obtained for Test Area IV, which is an unplanned settlement. The buildings in this test area are very closely spaced and their sizes are too small leading to the decrease in the classification accuracy.

Table 1: The error matrix of the RF classification result for Test Area I.

Ground Truth

Class Building Building2 Vegetation Road Bareland Water Shadow Total

Building 999 31 4 16 21 1 5 1077

Building2 0 835 0 109 27 0 0 971

Vegetation 0 0 993 0 0 0 0 993

Road 0 114 0 868 51 1 0 1034

Bareland 1 6 0 7 901 0 0 915

Water 0 2 0 0 0 993 5 1000

Shadow 0 12 3 0 0 5 990 1010

Total 1000 1000 1000 1000 1000 1000 1000 7000

Producer's Accuracy (%) 99.90 83.50 99.30 86.80 90.10 99.30 99.00

User's Accuracy (%) 92.76 85.99 100.00 83.95 98.47 99.30 98.02

Overall Accuracy (%): 93.99; Overall Kappa: 0.9298

Table 2: The error matrix of the RF classification result for Test Area II.

Ground Truth

Class Building Building2 Vegetation Road Bareland Shadow Total

Building 872 9 0 226 157 7 1271

Building2 2 965 1 4 29 0 1001

Vegetation 0 0 999 0 0 0 999

Road 54 1 0 770 0 0 825

Bareland 68 25 0 0 814 0 907

Shadow 4 0 0 0 0 993 997

Total 1000 1000 1000 1000 1000 1000 6000

Producer's Accuracy (%) 87.20 96.50 99.90 77.00 81.40 99.30

User's Accuracy (%) 68.61 96.40 100.00 93.33 89.75 99.60

Overall Accuracy (%): 90.22; Overall Kappa: 0.8826

Table 3: The error matrix of the RF classification result for Test Area III.

Ground Truth

Class Building Vegetation Road Bareland Agriculture Shadow Total

Building 904 0 88 14 0 14 1020

Vegetation 0 1000 0 0 1 0 1001

Road 80 0 912 0 0 0 992

Bareland 10 0 0 985 3 0 998

Agriculture 0 0 0 1 996 0 997

Shadow 6 0 0 0 0 986 992

Total 1000 1000 1000 1000 1000 1000 6000

Producer's Accuracy (%) 90.40 100.00 91.20 98.50 99.60 98.60

User's Accuracy (%) 88.63 99.90 91.94 98.70 99.90 99.40

Overall Accuracy (%): 96.38; Overall Kappa: 0.9566

Page 7: Thematic mapping of urban areas from WorldView-2 satellite ... · support vector machines classifiers. The performances of these classifiers are evaluated and compared using ... Benzer

4335Dilek Koc-San / Vol.2 Iss.1 2013

Table 4: The error matrix of the RF classification result for Test Area IV.

Ground Truth

Class Building Building2 Vegetation Road Bareland Shadow Total

Building 940 17 1 0 216 0 1174

Building2 0 818 0 56 89 0 963

Vegetation 0 0 999 0 0 0 999

Road 0 148 0 944 1 0 1093

Bareland 60 7 0 0 694 0 761

Shadow 0 10 0 0 0 1000 1010

Total 1000 1000 1000 1000 1000 1000 6000

Producer's Accuracy (%) 94.00 81.80 99.90 94.40 69.40 100.00

User's Accuracy (%) 80.07 84.94 100,00 86.37 91.20 99.01

Overall Accuracy (%):89.92; Overall Kappa: 0.8790

Table 5: The error matrix of the SVM classification result for Test Area I.

Ground Truth

Class Building Building2 Vegetation Road Bareland Water Shadow Total

Building 999 6 2 3 6 0 7 1023

Building2 0 836 5 107 18 0 12 978

Vegetation 0 0 992 0 0 0 0 992

Road 0 144 0 887 85 1 1 1118

Bareland 1 1 0 3 891 0 0 896

Water 0 0 0 0 0 998 3 1001

Shadow 0 13 1 0 0 1 977 992

Total 1000 1000 1000 1000 1000 1000 1000 7000

Producer's Accuracy (%) 99.90 83.60 99.20 88.70 89.10 99.80 97.70

User's Accuracy (%) 97.65 85.48 100.00 79.34 99.44 99.70 98.49

Overall Accuracy (%): 94.00; Overall Kappa: 0.9300

Table 6: The error matrix of the SVM classification result for Test Area II.

Ground Truth

Class Building Building2 Vegetation Road Bareland Shadow Total

Building 881 8 0 154 91 6 1140

Building2 0 965 0 0 27 0 992

Vegetation 0 0 1000 0 0 0 1000

Road 73 1 0 839 0 0 913

Bareland 41 26 0 0 882 0 949

Shadow 5 0 0 7 0 994 1006

Total 1000 1000 1000 1000 1000 1000 6000

Producer's Accuracy (%) 88.10 96.50 100.00 83.90 88.20 99.40

User's Accuracy (%) 77.28 97.28 100.00 91.89 92.94 98.81

Overall Accuracy (%): 92.68; Overall Kappa: 0.9122

Table 7: The error matrix of the SVM classification result for Test Area III.

Ground Truth

Class Building Vegetation Road Bareland Agriculture Shadow Total

Building 890 6 73 38 1 5 1013

Vegetation 0 994 0 0 0 0 994

Road 80 0 927 0 0 0 1007

Bareland 13 0 0 962 3 0 978

Agriculture 1 0 0 0 996 0 997

Shadow 16 0 0 0 0 995 1011

Total 1000 1000 1000 1000 1000 1000 6000

Producer's Accuracy (%) 89.00 99.40 92.70 96.20 99.60 99.50

User's Accuracy (%) 87.86 100.00 92.06 98.36 99.90 98.42

Overall Accuracy (%): 96.07; Overall Kappa: 0.9528

Page 8: Thematic mapping of urban areas from WorldView-2 satellite ... · support vector machines classifiers. The performances of these classifiers are evaluated and compared using ... Benzer

36 Thematic mapping of urban areas from WorldView-2 satellite imagery using machine learning algorithms

Table 8: The error matrix of the SVM classification result for Test Area IV.

Ground TruthClass Building Building2 Vegetation Road Bareland Shadow Total

Building 960 3 0 0 142 0 1105Building2 3 865 5 39 95 6 1013

Vegetation 0 0 995 0 0 0 995Road 0 123 0 961 19 0 1103Bareland 37 6 0 0 744 0 787Shadow 0 3 0 0 0 994 997Total 1000 1000 1000 1000 1000 1000 6000Producer's Accuracy (%) 96.00 86.50 99.50 96.10 74.40 99.40User's Accuracy (%) 86.88 85.39 100.00 87.13 94.54 99.70 Overall Accuracy (%): 91.98; Overall Kappa: 0.9038

The producer’s accuracy, user’s accuracy and per-category kappa values of individual land-cover classes for four test ar-eas using RF and SVM classifiers are provided in Table 9 and 10, respectively. The producer’s and user’s accuracy values were calculated over 75% for most of the land-cover classes demonstrating the success of the RF and SVM classifiers for urban thematic mapping from WorldView-2 satellite imag-ery. Based on the obtained producer's accuracy, user's accu-racy and per-category kappa values of individual classes, the most accurately mapped land-cover classes were vegetation and shadow, because of their distinct spectral characteristics. On the other hand, the classes building/building2 and road were less accurately mapped, due to the spectral variability

of pixels belong to these land-cover types and similar spec-tral characteristics of these land-cover types with others. To be more specific for the first test area, vegetation and wa-ter classes were classified most accurately, while road and building2 classes were classified least accurately. In the sec-ond test area, the classes vegetation and shadow were clas-sified most accurately, however the building, bareland and road classes were classified least accurately. Similarly, for the third test area the most accurately classified classes were vegetation, agriculture and shadow and the least accurately classified classes were building and road. Finally, the vegeta-tion and shadow were the most accurately classified classes, while the building and building2 classes were the least accu-rately classified classes for the fourth test area.

Table 9: Comparison of the overall and per-category accuracy measures of classification results obtained using RF classifier for Test Areas I, II, III, and IV.

Test Area I Test Area II Test Area III Test Area IV PA UA K PA UA K PA UA K PA UA K

(%) (%) (%) (%) (%) (%) (%) (%) (%) (%) (%) (%)Building 99.90 92.76 91.55 87.20 68.61 63.38 90.40 88.63 86.73 94.00 80.07 76.75Building2 83.50 85.99 83.66 96.50 96.40 95.80 - - - 81.80 84.94 82.43Vegetation 99.30 100.00 100.00 99.90 100.00 100.00 100.00 99.90 99.88 99.90 100.00 100.00Road 86.80 83.95 81.27 77.00 93.33 92.22 91.20 91.94 90.59 94.40 86.37 84.10Bareland 90.10 98.47 98.21 81.40 89.75 88.04 98.50 98.70 98.48 69.40 91.20 89.73Agriculture - - - - - - 99.60 99.90 99.88 - - -Water 99.30 99.30 99.18 - - - - - - - - -Shadow 99.00 98.02 97.69 99.30 99.60 99.53 98.60 99.40 99.29 100.00 99.01 98.84OA (%)OK

93.990.9298

90.220.8826

96.380.9566

89.920.8790

OA: Overall Accuracy, OK: Overall Kappa, PA: Producer's Accuracy, UA: User's Accuracy, K: Per-CategoryKappa

Table 10: Comparison of the overall and per-category accuracy measures of classification results obtained using SVM classifier for Test Areas I, II, III, and IV.

Test Area I Test Area II Test Area III Test Area IV

PA UA K PA UA K PA UA K PA UA K

(%) (%) (%) (%) (%) (%) (%) (%) (%) (%) (%) (%)

Building 99.90 97.65 97.26 88.10 77.28 73.49 89.00 87.86 85.83 96.00 86.88 84.69

Building2 83.60 85.48 83.06 96.50 97.28 96.82 - - - 86.50 85.39 82.95

Vegetation 99.20 100.00 100.00 100.00 100.00 100.00 99.40 100.00 100.00 99.50 100.00 100.00

Road 88.70 79.34 75.89 83.90 91.89 90.54 92.70 92.06 90.73 96.10 87.13 84.98

Bareland 89.10 99.44 99.35 88.20 92.94 91.76 96.20 98.36 98.09 74.40 94.54 93.63

Agriculture - - - - - - 99.60 99.90 99.88 - - -

Water 99.80 99.70 99.65 - - - - - - - - -

Shadow 97.70 98.49 98.24 99.40 98.81 98.61 99.50 98.42 98.15 99.40 99.70 99.65

OA (%) 94.00 92.68 96.07 91.98

OK 0.9300 0.9122 0.9528 0.9038

OA: Overall Accuracy, OK: Overall Kappa, PA: Producer's Accuracy, UA: User's Accuracy, K: Per-CategoryKappa

Page 9: Thematic mapping of urban areas from WorldView-2 satellite ... · support vector machines classifiers. The performances of these classifiers are evaluated and compared using ... Benzer

37Dilek Koc-San / Vol.2 Iss.1 2013

The McNemar test results between pair of classifiers are pro-vided in Table 11. According to this test, the X2 value was computed as 0.00, 69.71, 1.51, and 35.51 for Test Areas I, II, III and IV, respectively. For Test Areas I and III the computed X2 values are lower than 3.84. Hence, the performances of RF and SVM classifiers were found to be nearly equal for these test areas. On the other hand, for Test Areas II and IV, the X2 values are greater than 3.84 indicating the significant accuracy increase at the 95% confidence level when SVM classifier is preferred instead of RF classifier.

5. Conclusions

In this study, the performances of the RF and SVM classifi-ers for urban thematic mapping from WorldView-2 satellite imagery were observed and compared. The classification algorithms were applied on the test areas that show differ-ent urban characteristics using pan-sharpened WorldView-2 satellite imagery. The obtained results indicate that, the the-matic urban maps can be generated quite accurately using both machine learning algorithms. The most accurate clas-sification results were obtained for regularly developed new residential pattern (Test Area III), because in this test area the buildings are sparsely located and have relatively larg-er sizes. The second best results were achieved for the test area that includes traditional urban pattern (Test Area I). In this test area, the buildings have small sizes and do not have regular pattern. However, the advantage of this test area is the spectral separabilities of the classes. Although the sec-ond test area includes regularly developed urban pattern, the classification accuracy of this test area is not as high as the third test area. In this case, there may be various reasons that lead to decrease in classification accuracy, which are (i) the spectral similarities of the classes, (ii) dense urban pattern, (iii) existence of solar energy water heating systems on the roofs of the buildings. Finally, the least classification accu-racies were obtained for the fourth test area, which includes unplanned urban pattern. In this test area, the buildings are too small in size and very close to each other.

The SVM and RF classifiers provided nearly equal over-all accuracies for the Test Areas I and III. However, the SVM classifier produced slightly better overall accuracies than RF for Test Areas II and IV. Although the SVM classifier pro-vides relatively higher accuracies than RF classification for these test areas, the RF classification results are also chal-lenging. As a conclusion, both machine learning algorithms can be used effectively for urban land cover mapping from WorldView-2 satellite imagery.

Table 11: The McNemar test results to compare SVM and RF classifiers for Test Areas I, II, III, and IV.

Test Area Pair of Classifier f11 f12 f21 f22 Total X2

I SVM vs RF 6458 122 121 299 7000 0.00

II SVM vs RF 5332 229 81 358 6000 69.71

III SVM vs RF 5666 98 117 119 6000 1.51

IV SVM vs RF 5244 275 151 330 6000 35.51

f11: the number of pixels that are correctly classified using both classifiers; f12: the number of pixels that are correctly classified using first classifier, while incorrectly classified using second classifier; f21: the number of pixels that are correctly classified using second classifier but incorrectly classified using first classifier; f22: the number of pixels that are incorrectly classified using both classifiers.

The significant differences at the 95% confidence level (X2 ≥ 3.84) are shown in boldface type.

ReferencesAkar Ö., Güngör O., (2012), Classification of multispectral images

using Random Forest Algorithm, Journal of Geodesy and Geoinformation, 1(2), 105-112, doi: 10.9733/jgg.241212.1.

Berger C., Voltersen M., Hese S., Walde I., Schmullius C., (2013). Robust extraction of urban land cover information from HSR multi-spectral and Lidar data, IEEE Journal of Selected Topics in Applied Earth Observation and Remote Sensing, 6(5), 2196-2211, doi: 10.1109/JSTARS.2013.2252329.

Breiman L. (2001), Random Forests, Machine Learning, 45, 5–32, doi: 10.1023/A:1010933404324,

Breiman L., (1996), Bagging Predictors, Machine Learning, 24,123–140, doi: 10.1007/BF00058655.

Chen Z., Wang G., Liu J., (2012), A modified object-oriented classification algorithm and its application in high-resolution remote-sensing imagery, International Journal of Remote Sensing, 33(10), 3048-3062., doi: 10.1080/01431161.2011.625055.

Dalponte M., Orka H. O., Gobakken T., Gianelle D., Naesset E., (2013), Tree species classification in Boreal forests with hyperspectral data, IEEE Transactions on Geoscience and Remote Sensing, 51(5), 2632-2645, doi: 10.1109/TGRS.2012.2216272.

Foody G. M., (2004), Thematic Map Comparison: Evaluating the Statistical Significance of Differences in Classification Accuracy, Photogrammetric Engineering and Remote Sensing 70(5), 627–633, doi: 10.14358/PERS.70.5.627.

Gislason P. O., Benediktsson J. A., Sveinson J. R., (2006), Random forests for land cover classification, Pattern Recognition Letters, 27(4), 294–300, doi: 10.1016/j.patrec.2005.08.011.

Guan H., Li J., Chapman M., Deng F., Ji Z., Yang X., (2013), Integration of orthoimagery and Lidar data for object-based urban thematic mapping using random forests, International Journal of Remote Sensing, 34(14), 5166–5186, doi: 10.1080/01431161.2013.788261.

Huang C., Davis L. S., Townshend J. R. G., (2002), An assessment of support vector machines for land cover classification, International Journal of Remote Sensing, 23(4), 725–749, doi: 10.1080/01431160110040323.

Huang X., Zhang L., Li P., (2007), Classification and extraction of spatial features in urban areas using high-resolution multispectral imagery, IEEE Geoscience and Remote Sensing Letters, 4(2), 260-264, doi: 10.1109/LGRS.2006.890540.

Jakimow B., Oldenburg C., Rabe A., Waske B., van der Linden S., Hostert P., (2012), Manual for Application: ImageRF (1.1), Universtat Bonn and Humboldt Universitat zu Berlin, Germany.

Kavzoglu T., Colkesen I., (2009), A kernel functions analysis for support vector machines for land cover classification, International Journal of Applied Earth Observation and Geoinformation, 11, 352-359, doi: 10.1016/j.jag.2009.06.002.

Page 10: Thematic mapping of urban areas from WorldView-2 satellite ... · support vector machines classifiers. The performances of these classifiers are evaluated and compared using ... Benzer

38 Thematic mapping of urban areas from WorldView-2 satellite imagery using machine learning algorithms

Koc-San D., (2013), Evaluation of different classification techniques for the detection of glass and plastic greenhouses from WorldView-2 satellite imagery, Journal of Applied Remote Sensing, 7(1), 073553-1-20, doi: 10.1117/1.JRS.7.073553.

Oumar Z., Mutanga O., (2013), Using World View-2 bands and indices to predict bronze bug (Thaumastocoris peregrinus) damage in plantation forests, International Journal of Remote Sensing, 34(6), 2236–2249, doi: 10.1080/01431161.2012.743694.

Pacifici F., Chini M., Emery W. J., (2009), A neural network approach using multi-scale textural metrics from very high-resolution panchromatic imagery for urban land-use classification, Remote Sensing of Environment, 113(2009), 1276-1292, doi: 10.1016/j.rse.2009.02.014.

Pal M., (2005), Random Forest Classifier for Remote Sensing Classification, International Journal of Remote Sensing, 26(1), 217-222, doi: 10.1080/01431160412331269698.

Palsson F., Sveinsson J.R., Benediktsson J. A., Anaes H., (2012), Classification of pansharpened urban satellite images, IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 5(1), 281–297, doi: 10.1109/JSTARS.2011.2176467.

PCI Software Users Manual, (2013), PCI Geomatics Enterprises Inc., Richmond Hill, Ontario, Canada.

Rabe A., Van der Linden S., Hostert P., (2010), ImageSVM, Version 2.1, http://www.hu-geomatics.de [Accessed 1 May 2013].

Ridd M. K., (1995), Exploring a VIS (vegetation-impervious-surface-soil) model for urban ecosystem analysis through remote sensing: Comparative anatomy for cities, International Journal of Remote Sensing, 16(12), 2165–2185, doi: 10.1080/01431169508954549.

Rodriguez-Galiano V. F., Chica-Rivas M., (2012), Evaluation of different machine learning methods for land cover mapping of a Mediterranean area using multi-seasonal Landsat images and Digital Terrain Models, International Journal of Digital Earth, doi: 10.1080/17538947.2012.748848.

Rodriguez-Galiano V. F., Ghimire B., Rogan J., Chica-Olmo M. Rigol-Sanchez J. P., (2012), An assessment of the effectiveness of a random forest classifier for land-cover classification, ISPRS Journal of Photogrammetry and Remote Sensing, 67, 93-104, doi: 10.1016/j.isprsjprs.2011.11.002.

Sesnie S. E., Finegan B., Gessler P. E., Thessler S., Bendana Z. R., Smith A. M. S., (2010), The multispectral separability of Costa Rican rainforest types with support vector machines and Random Forest decision trees, International Journal of Remote Sensing, 31(11), 2885-2909, doi: 10.1080/01431160903140803.

Updike T., Comp C., (2010), Radiometric use of WorldView-2 imagery, http://www. digitalglobe.com/downloads/Radiometric_Use_of_WorldView-2_Imagery.pdf [Accessed 1 May 2013].

Van der Linden S., Rabe A., Wirth F., Suess S., Okujeni A., Hostert P., (2010), ImageSVM Classification, Manual for Application: ImageSVM, Version 2.1, Humboldt-Universitat zu Berlin, Germany.

Vapnik, V. N., (1995), The Nature of Statistical Learning Theory, Springer-Verlag, New York, doi: 10.1007/978-1-4757-2440-0

Wang L., (2005), Support Vector Machines: Theory and Applications, Springer-Verlag, Berlin, Heidelberg.

Waske, B., Benediktsson J. A., Arnason K., Sveinsson J. R., (2009), Mapping of hyperspectral Aviris data using machine-learning algorithms, Canadian Journal of Remote Sensing, 35(S1), S106-S116, doi: 10.5589/m09-018.

Waske B., Braun M., (2009), Classifier ensembles for land cover mapping using multitemporal SAR imagery, ISPRS Journal of Photogrammetry and Remote Sensing, 64, 450-457, doi: 10.1016/j.isprsjprs.2009.01.003.

Waske B., van der Linden S., Oldenburg C., Jakimow, B., Rabe A., Hostert P., (2012), ImageRF – A user-oriented implementation for remote sensing image analysis with Random Forests, Environmental Modelling and Software, 35, 192-193, doi: 10.1016/j.envsoft.2012.01.014.

Watanachaturaporn P., Arora M. K., Varshney P. K., (2008), Multisource classification using support vector machines: an empirical comparison with decision tree and neural network classifiers, Photogrammetric Engineering and Remote Sensing, 74(2), 239-246, doi: 10.14358/PERS.74.2.239.

Welch R., (1982). Spatial resolution requirements for urban studies, International Journal of Remote Sensing, 3(2), 139–146, doi: 10.1080/01431168208948387.

Yonezawa C., (2007). Maximum likelihood classification combined with spectral angle mapper algorithm for high resolution satellite imagery, International Journal of Remote Sensing, 28(16), 3729–3737, doi: 10.1080/01431160701373713.