feature subset selection for automatically classifying anuran calls using sensor networks

Feature Subset Selection for Automatically Classifying Anuran Calls Using Sensor Networks. Juan G. Colonna, Afonso D. Ribas, Eduardo F. Nakamura, Eulanda M. dos Santos. Institute of Computing (IComp), Federal University of Amazon (UFAM)

Upload: ufam-universidade-federal-do-amazonas

Post on 21-Jun-2015


DESCRIPTION

Anurans (frogs or toads) are commonly used by biologists as early indicators of ecological stress, because they are closely tied to their ecosystem. Although several sources of data may be used for monitoring these animals, anuran calls allow a non-intrusive data acquisition strategy. Moreover, wireless sensor networks (WSNs) may be used for such a task, resulting in a more accurate and autonomous system. However, it is essential to save resources to extend the network lifetime. In this paper, we evaluate the impact of reducing data dimension for automatic classification of bioacoustic signals when a WSN is involved. Such a reduction is achieved through a wrapper-based feature subset selection strategy that uses a genetic algorithm (GA). We use the GA to find the subset of features that maximizes the cost-benefit ratio. In addition, we evaluate the impact of reducing the original feature space when sampling frequencies are also reduced. Experimental results indicate that we can reduce the number of features while increasing classification rates (even when lower sampling frequencies are used for transmission).

TRANSCRIPT

Page 1: Feature Subset Selection for Automatically Classifying Anuran Calls Using Sensor Networks

Juan G. Colonna, Afonso D. Ribas, Eduardo F. Nakamura, Eulanda M. dos Santos

Institute of Computing (IComp), Federal University of Amazon (UFAM)


Introduction - Environmental Motivation

The study of environmental conditions allows us to:

maintain the quality of life, and preserve species.

The loss of species is an irreversible process!

The variation of species populations makes it possible to:

identify environmental problems in the early stages, and

establish strategies for the conservation of biological diversity.


Introduction - Environmental Motivation

Variations in amphibian populations are related to pollution, deforestation, urbanization, etc.

Frogs can be used as indicators for detecting environmental stress.

Figure: Percentage of threatened species in the red list. Figure adapted from [Stuart et al., 2004].


Introduction – Objectives

Classify frog species of tropical forests based on their vocalizations, using wireless sensor networks and machine learning techniques.*

* Consideration: restrictions on the hardware.


Introduction - Challenges

Develop a method that does not need human intervention.

Characterize the frequency spectrum of frog calls.

Extract and select the optimal set of features.

Define the classification technique.

Get the minimum set of features using a genetic algorithm.

Obtain the processing cost of the features.

Correlate the processing cost and success rate.

Maximize the benefit-cost ratio.


WSN and Machine Learning


Related Work

| Author | Animal | Features | Classifier | Results | WSN |
|---|---|---|---|---|---|
| Taylor et al. [1996] | Bufo marinus | Spectrogram | C4.5 | 60% | No |
| Hu et al. [2005] | Bufo marinus | Spectrogram | C4.5 | 60% | Yes |
| Yen & Fu [2002]* | 4 frog | Wavelet, Fisher's | MLP | 71% | No |
| Clemins [2005] | elephant | MFCCs, PLP | HMM, DTW | 69%, 73% | No |
| Cai et al. [2007] | 14 bird | MFCCs | ANN | 81% - 86% | Yes |
| Huang et al. [2009]* | 5 frog | S - B - ZC | k-NN, SVM | 83% - 100%, 82% - 100% | No |
| Vaca-Castaño & Rodriguez [2010]* | 10 bird, 20 frog | MFCCs, PCA | k-NN | 86%, 91% | Yes |
| Han et al. [2011]* | 9 frog | S - Hs - Hr | k-NN | 83% - 100% | No |

* Work implemented and used in the comparisons.


Our approach

Figure: Parametrization of vocalizations.

Figure: Anuran classification stages.

Figure: Pre-processing steps.


Features

Figure: Mel-Frequency Cepstral Coefficients (MFCCs).

Figure: Wavelet Transform with Lifting Scheme.


Obtain the features


Figure: Feature extraction.


Spectrogram

Figure: Audio sample (waveform and spectrogram) for Adenomera andreae.


Features

| Feature | Complexity order | Computational cost |
|---|---|---|
| Pitch | O(L) | 3L − 1 |
| B | O(N log(N)) | 2M + 2M + N log(N) |
| 12 MFCCs | O(N log(N)) | N log(N) + N + mR |
| S | O(N log(N)) | 2M + N log(N) |
| H1 | O(L) | L + i |
| H2 | O(L) | L + i |
| ZC | O(L) | L |
| E | O(L) | L |
| Pw | O(L) | L |
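To make the O(L) rows of the table concrete, here is a minimal sketch of three single-pass time-domain features. It assumes that ZC, E, and Pw stand for zero-crossing count, energy, and average power, and that L is the number of samples in a frame; the slides do not spell out these definitions, so the function names and formulas below are illustrative only.

```python
def zero_crossings(frame):
    """Count sign changes between consecutive samples (one pass, O(L))."""
    return sum(
        1
        for prev, cur in zip(frame, frame[1:])
        if (prev >= 0) != (cur >= 0)
    )


def energy(frame):
    """Sum of squared samples (assumed definition of E, O(L))."""
    return sum(x * x for x in frame)


def power(frame):
    """Energy divided by frame length (assumed definition of Pw, O(L))."""
    return energy(frame) / len(frame)


# Hypothetical 8-sample frame, used only to exercise the functions.
frame = [0.5, -0.5, 0.25, 0.25, -1.0, 0.0, 1.0, -0.5]
print(zero_crossings(frame), energy(frame), power(frame))
```

Each function touches every sample once, which is why these features are the cheapest candidates for computation on a sensor node.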


Comparison between MFCCs and Wavelet

| Features | k-NN, α = 0.4 | k-NN, α = 0.5 | k-NN, α = 0.6 |
|---|---|---|---|
| Wavelet features, Daubechies transform | 96.35%(3) | 97.86%(1) | 98.22%(1) |
| Wavelet features, Haar transform | 96.70%(1) | 97.90%(1) | 98.38%(1) |
| MFCCs | 99.19%(9) | 99.36%(2) | 99.19%(1) |

Table: Success rate in relation to alpha, using 10-fold cross-validation.

Applying the Wilcoxon test at the 95% confidence level (α = 0.5), we conclude that the MFCCs perform better.
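The evaluation protocol named above (a k-NN classifier scored with 10-fold cross-validation) can be sketched as follows. This is a simplified illustration, not the authors' code: the Euclidean distance, the fold assignment, and the two-cluster toy data with labels `species_a` and `species_b` are all assumptions made for the example.

```python
import random
from collections import Counter


def knn_predict(train, query, k):
    """Majority vote among the k training points closest to `query`."""
    nearest = sorted(
        train,
        key=lambda item: sum((a - b) ** 2 for a, b in zip(item[0], query)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def cross_val_accuracy(data, k, folds=10, seed=0):
    """Fraction of samples classified correctly across `folds` test folds."""
    data = data[:]
    random.Random(seed).shuffle(data)
    correct = 0
    for f in range(folds):
        test = data[f::folds]  # every folds-th sample forms one test fold
        train = [item for i, item in enumerate(data) if i % folds != f]
        correct += sum(knn_predict(train, x, k) == y for x, y in test)
    return correct / len(data)


def make_toy_data(n_per_class=50, seed=1):
    """Hypothetical 2-D feature vectors drawn around two class centres."""
    rng = random.Random(seed)
    data = []
    for label, centre in (("species_a", 0.0), ("species_b", 3.0)):
        for _ in range(n_per_class):
            point = [rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)]
            data.append((point, label))
    return data


data = make_toy_data()
for k in (1, 3, 9):
    print(k, cross_val_accuracy(data, k))
```

Every sample is used for testing exactly once, so the returned value is an accuracy estimate over the whole data set rather than over a single split.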


Comparison between MFCCs and Wavelet


Objective: To determine the optimal subset of features by applying GA.
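A wrapper-based GA search of the kind described in this objective can be sketched as below. Several details are assumptions, since the slides do not give them: the chromosome is a binary mask over features, fitness is leave-one-out 1-NN accuracy minus a flat per-feature penalty standing in for processing cost, selection is a binary tournament with elitism, and the 60% crossover and 20% mutation rates are applied per offspring. The toy data set (two informative features plus noise) is also hypothetical.

```python
import random


def loo_accuracy(data, mask):
    """Leave-one-out 1-NN accuracy using only features where mask[i] == 1."""
    idx = [i for i, bit in enumerate(mask) if bit]
    if not idx:
        return 0.0
    correct = 0
    for i, (x, y) in enumerate(data):
        nearest = min(
            (item for j, item in enumerate(data) if j != i),
            key=lambda item: sum((item[0][f] - x[f]) ** 2 for f in idx),
        )
        correct += nearest[1] == y
    return correct / len(data)


def fitness(data, mask, penalty=0.01):
    # Assumed cost model: a flat penalty for every selected feature.
    return loo_accuracy(data, mask) - penalty * sum(mask)


def ga_select(data, n_features, pop_size=12, generations=15,
              p_cross=0.6, p_mut=0.2, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(pop_size)]
    cache = {}

    def score(mask):
        key = tuple(mask)
        if key not in cache:
            cache[key] = fitness(data, mask)
        return cache[key]

    best = max(pop, key=score)
    for _ in range(generations):
        nxt = [best[:]]  # elitism: carry the best mask forward unchanged
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 2), key=score)  # binary tournament
            p2 = max(rng.sample(pop, 2), key=score)
            child = p1[:]
            if rng.random() < p_cross:  # single-point crossover
                cut = rng.randint(1, n_features - 1)
                child = p1[:cut] + p2[cut:]
            if rng.random() < p_mut:  # flip one randomly chosen bit
                pos = rng.randrange(n_features)
                child[pos] = 1 - child[pos]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=score)
    return best, score(best)


def make_toy_data(n_per_class=20, n_features=6, seed=1):
    """Hypothetical data: features 0 and 1 separate the classes, the rest are noise."""
    rng = random.Random(seed)
    data = []
    for label, centre in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            informative = [rng.gauss(centre, 0.5) for _ in range(2)]
            noise = [rng.gauss(0.0, 3.0) for _ in range(n_features - 2)]
            data.append((informative + noise, label))
    return data


data = make_toy_data()
mask, fit = ga_select(data, n_features=6)
print(mask, round(fit, 3))
```

Because the classifier itself scores each candidate subset, the search is a wrapper method: the selected features are tuned to the classifier, at the price of one full evaluation per distinct mask.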


Comparison between MFCCs and Wavelet

| Features | Classification before GA | Crossover 50%, Mutation 40%: selected features | Success rate | Crossover 60%, Mutation 20%: selected features | Success rate |
|---|---|---|---|---|---|
| 9 features with Db | 97.86%(1) | 1,2,3,5 | 93.73% | 1,2,3,4,5,6,8,9 | 96.83% |
| 9 features with Haar | 97.90%(1)* | 2,3,4,5,6,8,9 | 96.47% | 1,2,3,4,5,6,7,8,9 | 97.90%* |
| 12 MFCCs | 99.36%(2)* | 1,2,3,4,5,6,7,11 | 99.08% | 1,2,3,4,5,6,7,8,9,11,12 | 99.33%* |


Case Study

Figure: panels for fs = 44.1 kHz, fs = 11 kHz, and fs = 5.5 kHz.
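The lower sampling frequencies in this case study are close to 44.1 kHz divided by 4 and by 8. As a rough sketch of how such a reduction could be obtained, the function below averages blocks of consecutive samples, which acts as a crude low-pass filter before the rate is reduced. The slides do not say how the authors resampled the audio, so this block-averaging scheme and the helper names are assumptions.

```python
def decimate(samples, factor):
    """Average each block of `factor` samples into one output sample."""
    usable = len(samples) - len(samples) % factor  # drop an incomplete tail
    return [
        sum(samples[i:i + factor]) / factor
        for i in range(0, usable, factor)
    ]


def new_rate(fs, factor):
    """Sampling frequency after keeping one sample per `factor` inputs."""
    return fs / factor


# Hypothetical input, used only to show the shape of the result.
signal = [1, 3, 5, 7, 9, 11, 13, 15]
print(decimate(signal, 4), new_rate(44100, 4), new_rate(44100, 8))
```

A production system would normally use a properly designed anti-aliasing filter; block averaging is shown here only because it is cheap enough for constrained hardware.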


Conclusions

We identified the 12 MFCCs as the best set of features.

Costs can be reduced by using 8 MFCCs, although the method loses generality.

The MFCCs have:

✔ Better success rate;

✔ Constant cost, regardless of hardware, and

✔ Immunity to environmental and quantization noise.


Questions?


Thanks