Data mining for pothole detection
Pro gradu seminar, 10.2.2011
Hannu Hautakangas, Jukka Nieminen
![Page 1: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/1.jpg)
Data mining for pothole detection
Pro gradu seminar
10.2.2011
Hannu Hautakangas
Jukka Nieminen
![Page 2: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/2.jpg)
Contents
- Introduction
- Related work
- Data
- Data preprocessing
- Feature extraction
- Feature selection
- Support vector machine
- Results
- Problems
- References
![Page 3: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/3.jpg)
Introduction
- The purpose of the research is to detect anomalies on the road surface
  - Expansion joints
  - Potholes
  - Speed bumps
  - Etc.
- Supervisors
  - Tapani Ristaniemi
  - Fengyu Cong
![Page 4: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/4.jpg)
Related work
- Accelerometer-based techniques
  - Pothole Patrol
  - Nericell
  - Terrain classification
- Other techniques
  - Image detection
  - Laser profilometer
  - Ground penetrating radar
![Page 5: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/5.jpg)
Data
- Acceleration data
  - Contains lateral, longitudinal and vertical axes
  - GPS position and timestamp for each measurement
  - Class label for each measurement
- Sampling rate is 38 Hz
- Data was collected using several different vehicles
![Page 6: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/6.jpg)
Data preprocessing
- Several filters were produced and tested using different passbands in the frequency range 0.5-6 Hz
- Data was windowed using a sliding window
- Different window functions were tested
  - Chebyshev
  - Hamming
  - Taylor
  - Etc.
- Normalization to the range [0, 1]
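The windowing and normalization steps above can be sketched as follows. This is a minimal illustration in plain Python, not the authors' Matlab code: the band-pass filter is left out, the order of the steps (normalize first, then window) is an assumption, and the function names and the `size` and `step` parameters are hypothetical.

```python
import math


def hamming(n):
    """Hamming window of length n (n >= 2)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant signal maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def sliding_windows(signal, size, step):
    """Yield consecutive windows of `size` samples, advancing by `step` samples."""
    for start in range(0, len(signal) - size + 1, step):
        yield signal[start:start + size]


def preprocess(signal, size, step):
    """Normalize the signal, cut it into windows and taper each window.

    Assumption: normalization happens before windowing; the slides do not
    state the order.
    """
    normalized = min_max_normalize(signal)
    taper = hamming(size)
    return [
        [sample * weight for sample, weight in zip(window, taper)]
        for window in sliding_windows(normalized, size, step)
    ]
```

At the 38 Hz sampling rate, `preprocess(signal, size=38, step=19)` would give one-second windows with 50% overlap; both values are illustrative choices, not values taken from the slides.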
![Page 7: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/7.jpg)
Original and filtered Y-axis data
![Page 8: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/8.jpg)
Feature extraction
- Several different features were extracted
  - Mean
  - Peak-to-peak ratio
  - Root mean square
  - Standard deviation
  - Variance
  - Power spectral density
    - 21 frequency bins in the frequency range 1-5 Hz
  - Partial sums of the frequency bins in the frequency range 1-5 Hz
    - The first sum is over 1-2 Hz, the second over 2-3 Hz, etc.
  - Wavelet packet decomposition
    - This was done by Fengyu Cong
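A rough sketch of the time-domain and spectral features for one window is given below. It is written in plain Python for illustration and makes several assumptions: peak-to-peak is computed as the maximum minus the minimum (the slides say "ratio" without defining it), the variance is the population variance, and the spectrum comes from a naive DFT rather than whatever estimator the authors used. Wavelet packet decomposition is not covered. All function names are hypothetical.

```python
import math


def time_domain_features(window):
    """Basic statistics of one window of acceleration samples."""
    n = len(window)
    mean = sum(window) / n
    variance = sum((x - mean) ** 2 for x in window) / n  # population variance
    return {
        "mean": mean,
        "peak_to_peak": max(window) - min(window),  # assumed definition
        "rms": math.sqrt(sum(x * x for x in window) / n),
        "std": math.sqrt(variance),
        "variance": variance,
    }


def power_spectrum(window, fs):
    """Return (frequency_hz, power) pairs from a naive one-sided DFT."""
    n = len(window)
    spectrum = []
    for k in range(n // 2 + 1):
        re = sum(x * math.cos(2 * math.pi * k * i / n) for i, x in enumerate(window))
        im = -sum(x * math.sin(2 * math.pi * k * i / n) for i, x in enumerate(window))
        spectrum.append((k * fs / n, (re * re + im * im) / n))
    return spectrum


def band_sums(spectrum, edges=(1, 2, 3, 4, 5)):
    """Sum the power in half-open bands [1, 2), [2, 3), ... Hz.

    The bin at exactly the last edge (5 Hz) is not included in any band.
    """
    return [
        sum(power for freq, power in spectrum if lo <= freq < hi)
        for lo, hi in zip(edges, edges[1:])
    ]
```

With a 38 Hz sampling rate, a window of 190 samples (5 seconds) gives a bin spacing of 0.2 Hz, which yields 21 bins from 1 Hz to 5 Hz inclusive. That window length is an inference from the bin count, not a value stated in the slides.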
![Page 9: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/9.jpg)
Feature extraction
![Page 10: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/10.jpg)
Feature selection
- Feature selection is used to reduce the number of features and thus reduce the computational effort and make the classification operation faster and more accurate
- Different techniques were tested
  - Backward and forward selection
  - Genetic algorithm
  - Principal component analysis
![Page 11: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/11.jpg)
Backward and forward selection
- Originally introduced by M. A. Efroymson in 1960
- Tries to find the best feature subset
  - The model includes only significant features
- Features are usually evaluated using an F-test
  - Based on linear regression
  - A feature is significant if its F-value exceeds a predetermined significance level
![Page 12: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/12.jpg)
Backward and forward selection
- Backward selection
  - Starts with all features in the model
  - Removes features one by one, starting from the least significant
  - Continues until the model includes only significant features
- Forward selection
  - Opposite to backward selection
  - Starts with zero features in the model
  - Adds features one by one, starting from the most significant
  - Continues until all significant features are selected
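The two greedy procedures can be sketched as below. This is an illustrative outline only: the `score` callback stands in for the F-test on a linear regression model, which is not implemented here, and the function names and the `threshold` parameter are hypothetical.

```python
def forward_selection(features, score, threshold):
    """Add the most significant remaining feature until none passes the threshold.

    score(selected, candidate) must return the significance of `candidate`
    given the already selected features (a stand-in for the F-value).
    """
    selected = []
    remaining = list(features)
    while remaining:
        best = max(remaining, key=lambda f: score(selected, f))
        if score(selected, best) <= threshold:
            break  # no remaining feature is significant
        selected.append(best)
        remaining.remove(best)
    return selected


def backward_selection(features, score, threshold):
    """Drop the least significant feature until all remaining ones are significant."""
    selected = list(features)
    while selected:
        def significance(f):
            others = [g for g in selected if g != f]
            return score(others, f)

        worst = min(selected, key=significance)
        if significance(worst) > threshold:
            break  # every remaining feature is significant
        selected.remove(worst)
    return selected
```

For example, with a toy score that ignores the current subset, `forward_selection(["a", "b", "c"], lambda sel, f: {"a": 5.0, "b": 0.5, "c": 3.0}[f], 1.0)` keeps `a` and `c` and leaves out `b`.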
![Page 13: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/13.jpg)
Genetic algorithm
- A genetic algorithm is a computational model, inspired by evolution, that searches for a solution to a specific problem using a chromosome-like data structure
- It was introduced by John Holland in 1975
- The algorithm can be considered as a two-stage process
  - It begins with the current population, where the best chromosomes are selected, based on their fitness values, to create an intermediate population
  - Then crossover, mutation and reproduction are applied to create the next population
- This two-stage process constitutes one generation
![Page 14: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/14.jpg)
Genetic algorithm
- The crossover operation randomly selects two individuals and generates two new ones by combining the selected ones
- The mutation operation randomly selects an individual, removes its subtree from a randomly selected node and then generates a new subtree
![Page 15: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/15.jpg)
Genetic algorithm
- The reproduction operation moves selected individuals to the next population without any change
- Each individual (a candidate feature subset) is represented as a binary vector of dimension m, where m is the number of features
- Bit 1 means that the corresponding feature is part of the subset and bit 0 means that the corresponding feature is not part of the subset
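A compact sketch of a genetic algorithm over such binary feature masks is shown below. It is only one possible reading of the slides: selection keeps the fitter half of the population, crossover is one-point, mutation flips individual bits (rather than replacing subtrees), and reproduction copies the best individual unchanged. The `fitness` callback, the function name and all parameter values are hypothetical.

```python
import random


def evolve(fitness, m, pop_size=20, generations=30, p_mut=0.05, seed=0):
    """Search for a good binary feature mask of length m (m >= 2)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(m)] for _ in range(pop_size)]
    for _ in range(generations):
        # Stage 1: select the fitter half as the intermediate population.
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[: pop_size // 2]
        # Stage 2: reproduction, crossover and mutation build the next population.
        next_population = [ranked[0][:]]  # reproduction: best mask copied unchanged
        while len(next_population) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randint(1, m - 1)  # one-point crossover position
            for child in (a[:cut] + b[cut:], b[:cut] + a[cut:]):
                # Mutation: flip each bit with probability p_mut.
                child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
                if len(next_population) < pop_size:
                    next_population.append(child)
        population = next_population
    return max(population, key=fitness)
```

In a feature selection setting, `fitness` would typically train and evaluate a classifier on the features whose bit is 1; here it is left to the caller.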
![Page 16: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/16.jpg)
Principal component analysis
- PCA was introduced by Karl Pearson in 1901
  - The original method could not handle more than two or three variables
- In 1933 Harold Hotelling described the methods for computing multivariate PCA
![Page 17: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/17.jpg)
Principal component analysis
- The object of PCA is to find uncorrelated principal components Z1, Z2, ..., Zp that describe the dependencies between the variables X1, X2, ..., Xp
- Principal components are ordered so that the first component Z1 displays the largest amount of variation in the data, the second component Z2 displays the second largest amount of variation, and so on
- Principal components are selected based on their eigenvalues
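One common way to select components by eigenvalue is to keep the smallest number of leading components whose eigenvalues account for a chosen share of the total variance. The sketch below shows that rule; the slides do not say which rule was actually used, so the function name and the 95% default are assumptions.

```python
def components_to_keep(eigenvalues, min_explained=0.95):
    """Return how many leading components reach the requested variance share.

    Each eigenvalue of the covariance matrix is the variance captured by
    its principal component.
    """
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        if cumulative / total >= min_explained:
            return k
    return len(ordered)
```

For eigenvalues 4, 3, 2 and 1, the first two components carry 70% of the variance, so a 60% target keeps two components while a 95% target keeps all four.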
![Page 18: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/18.jpg)
Support vector machine
- Vladimir Vapnik introduced SVM in 1995
- A binary classification tool
- Tries to find the optimal separating hyperplane to separate classes from each other
- Basic SVM can classify only two classes, but it can be extended to a multiclass classifier
![Page 19: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/19.jpg)
Support vector machine
- Creates a model based on training data
  - Each data sample has a class label
  - The model predicts to which class a specific data sample belongs
- The model is tested using testing data
  - Predicted labels are compared to known labels
- The LIBSVM library was used from Matlab as the SVM tool
  - LIBSVM implements most of the common SVM methods
  - Supports multiclass classification
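The comparison of predicted labels against known labels can be sketched as a small counting function. This is a plain Python illustration, independent of LIBSVM; the function name, the dictionary keys and the convention that label 1 marks an anomaly are assumptions.

```python
def misclassified_counts(known, predicted, anomaly_label=1):
    """Count misclassified anomaly and normal samples separately."""
    if len(known) != len(predicted):
        raise ValueError("known and predicted must have the same length")
    anomalies_wrong = 0
    normals_wrong = 0
    for truth, guess in zip(known, predicted):
        if truth == guess:
            continue  # correctly classified sample
        if truth == anomaly_label:
            anomalies_wrong += 1
        else:
            normals_wrong += 1
    return {"anomalies_misclassified": anomalies_wrong,
            "normals_misclassified": normals_wrong}
```

Counting the two error types separately matters here because the classes are heavily imbalanced, so a single overall accuracy figure would hide missed anomalies.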
![Page 20: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/20.jpg)
Results
- Data was classified with SVM using PCA
- Two data sets
  - Set 1 consists of 1779 normal and 12 anomaly samples
  - Set 2 consists of 1779 normal and 21 anomaly samples
  - Both sets have 30 features, which were generated with wavelet packet decomposition
- 70% of the normal samples were used to create the SVM model
- The rest of the normal samples (534) were used to test the model
- PCA was used to select the features that represent most of the variation in the data
- Tests were run 1000 times to get reliable results
  - The results are mean values of 1000 test runs
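The split and the averaging over repeated runs can be sketched as follows. The sketch assumes a random split in each run and that 70% of 1779 is rounded down to 1245, which is consistent with the 534 test samples reported; the function names and the `run_once` callback (which would train and test the classifier once) are hypothetical.

```python
import random


def split_normal(normal_samples, train_fraction=0.7, rng=random):
    """Shuffle the normal samples and split them into training and test parts."""
    shuffled = list(normal_samples)
    rng.shuffle(shuffled)
    n_train = int(train_fraction * len(shuffled))  # rounds down
    return shuffled[:n_train], shuffled[n_train:]


def mean_over_runs(run_once, runs=1000, seed=0):
    """Average the value returned by run_once(rng) over repeated runs."""
    rng = random.Random(seed)
    return sum(run_once(rng) for _ in range(runs)) / runs
```

Averaging over many random splits explains why the reported counts of misclassified samples are fractional.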
![Page 21: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/21.jpg)
Results
- Set 1
  - With three or more principal components
    - All 12 anomalies were classified correctly
    - 6.93 normal samples out of 534 were classified incorrectly
- Set 2
  - With three principal components
    - 1.63 anomalies out of 21 were classified incorrectly
    - 7.26 normal samples out of 534 were classified incorrectly
  - With 10 or more principal components
    - 0.02 anomalies out of 21 were classified incorrectly
    - 6.53 normal samples out of 534 were classified incorrectly
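The mean counts above can be turned into error rates with a one-line helper, which makes the two data sets easier to compare. The counts in the dictionary are the ones reported on this slide; the function and variable names are hypothetical.

```python
def error_rate(mean_misclassified, total):
    """Fraction of samples misclassified, given a mean count and a class size."""
    return mean_misclassified / total


# (mean misclassified, class size) as reported on the slide
REPORTED = {
    "set 1, >= 3 PCs, normal": (6.93, 534),
    "set 2, 3 PCs, anomaly": (1.63, 21),
    "set 2, 3 PCs, normal": (7.26, 534),
    "set 2, >= 10 PCs, anomaly": (0.02, 21),
    "set 2, >= 10 PCs, normal": (6.53, 534),
}

RATES = {name: error_rate(count, total) for name, (count, total) in REPORTED.items()}
```

For instance, 6.93 out of 534 is about 1.3% of the normal test samples, while 1.63 out of 21 is about 7.8% of the anomalies.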
![Page 22: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/22.jpg)
Results, set 1
![Page 23: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/23.jpg)
Results, set 2
![Page 24: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/24.jpg)
Problems
- Timestamps are not accurate, which affects the labeling of the classes
  - Some normal data samples are labeled as anomalies and vice versa
- Small number of anomaly data samples compared to normal data samples
  - For example, data set 1 has 12 anomaly and 1779 normal samples
- Multiclass classification is difficult because there are not enough anomaly samples to create a multiclass SVM model
![Page 25: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/25.jpg)
References
Backward and forward selection
- N. R. Draper and H. Smith, Applied regression analysis, 2nd edition, John Wiley & Sons Inc, 1981.
- M. A. Efroymson, Multiple regression analysis, in Mathematical Methods for Digital Computers, editors A. Ralston and H. S. Wilf, John Wiley & Sons Inc, 1960.
Genetic algorithm
- John Holland, Adaptation in natural and artificial systems: an introductory analysis with applications to biology, control, and artificial intelligence, University of Michigan Press, 1975.
- L. B. Jack and Asoke K. Nandi, Genetic algorithms for feature selection in machine condition monitoring with vibration signals, in IEEE Signal Processing Vol 147, No 3, June 2000.
- Darrell Whitley, A genetic algorithm tutorial, in Statistics and computing 4, pages 65-85, 1994.
PCA
- Harold Hotelling, Analysis of a complex of statistical variables into principal components, in Journal of Educational Psychology, volume 24, issue 7, pages 498-520, October 1933.
- Ian T. Jolliffe, Principal Component Analysis, Springer-Verlag, New York, 2002.
- Karl Pearson, On lines and planes of closest fit to a system of points in space, Philosophical Magazine, Vol. 2, pages 559-572, 1901.
![Page 26: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/26.jpg)
References
Related work
- W. Dargie, Analysis of time and frequency domain features of accelerometer measurements, Proceedings of 18th International Conference on Computer Communications and Networks, ICCCN 2009, pages 1-6.
- Edmond DuPont, Carl Moore, Emmanuel Collins and Eric Coyle, Frequency response method for terrain classification in autonomous ground vehicles, in Autonomous Robots, vol. 24, pages 337-347, 2008.
- J. Eriksson, L. Girod, B. Hull, R. Newton, S. Madden and H. Balakrishnan, The pothole patrol: using a mobile sensor network for road surface monitoring, in MobiSys 2008: Proceeding of the 6th international conference on Mobile systems, applications and services, ACM, New York, 2008, pages 29-39.
- D. H. Kil and F. B. Shin, Automatic road-distress classification and identification using a combination of hierarchical classifiers and expert systems-subimage and object processing, Proceedings of International Conference on Image Processing, vol. 2, pages 414-417, Santa Barbara, CA, USA, 1997.
- J. Lin and Y. Liu, Potholes detection based on SVM in the pavement distress image, in Ninth International Symposium on Distributed Computing and Applications to Business, Engineering and Science, pages 544-547, Hong Kong, China, 2010.
- P. Mohan, V. N. Padmanabhan and R. Ramjee, Nericell: Rich monitoring of road and traffic conditions using mobile smartphones, in SenSys 2008: Proceedings of the 6th ACM conference on Embedded network sensor systems, ACM, New York, 2008, pages 323-336.
SVM
- Corinna Cortes and Vladimir Vapnik, Support-Vector Networks, Machine Learning, Volume 20, pages 273-297, Kluwer Academic Publishers, Boston, 1995.
- LIBSVM, A library for Support Vector Machines, http://www.csie.ntu.edu.tw/~cjlin/libsvm/ , referred 4.2.2011.
![Page 27: Data mining for pothole detection Pro gradu seminar 10.2.2011](https://reader034.vdocuments.net/reader034/viewer/2022042608/5681403d550346895dabad0a/html5/thumbnails/27.jpg)
Thank you very much!