
Research Article
Feature Selection Using Maximum Feature Tree Embedded with Mutual Information and Coefficient of Variation for Bird Sound Classification

Haifeng Xu, Yan Zhang, Jiang Liu, and Danjv Lv

College of Big Data and Intelligent Engineering, Southwest Forestry University, Kunming 650224, China

Correspondence should be addressed to Yan Zhang: zydyr@163.com

Received 27 September 2020; Revised 30 December 2020; Accepted 1 February 2021; Published 13 February 2021

Academic Editor: Paolo Spagnolo

Copyright © 2021 Haifeng Xu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The classification of bird sounds is important in ecological monitoring. Although extracting features from multiple perspectives helps to fully describe the target information, it is urgent to deal with the enormous dimension of features and the curse of dimensionality. Thus, feature selection is necessary. This paper proposes a feature scoring method named MICV (Mutual Information and Coefficient of Variation), which uses the coefficient of variation and mutual information to evaluate each feature's contribution to classification. Then, a method named ERMFT (Eliminating Redundancy Based on Maximum Feature Tree), which uses two neighborhoods to eliminate redundancy and optimize features, is explored. These two methods are combined as the MICV-ERMFT method to select the optimal features. Experiments are conducted to compare eight different feature selection methods on two sound datasets of birds and cranes. Results show that the MICV-ERMFT method outperforms the other feature selection methods in classification accuracy and is less time-consuming.

1. Introduction

Birds are sensitive to changes in habitats and surroundings, and they are a good indicator of biodiversity and the ecosystem [1]. Because birds generally have a wide range of movement and cannot be observed promptly, bird sounds are one of the important ways to identify them [2].

Bird sounds are a class of environmental sounds. Some famous feature extraction methods used in audio signal processing include Mel-Frequency Cepstral Coefficients (MFCC) [3] in the frequency domain and Short-Time Fourier Transform (STFT) [4] and Wavelet Transform (WT) in the time domain [5]. Furthermore, Tsau et al. [6] suggested a method that extracts features from Code Excited Linear Prediction (CELP) bit streams. Researchers have been extracting features from multiple aspects to retrieve enough information to describe the target. However, the curse of dimensionality occurs as the numbers of features and samples grow. It also increases the time cost of analyzing data, affects the models' generalization, and reduces the effectiveness of solving problems [7]. To avoid the curse of dimensionality, selecting a subset of features from the feature pool is necessary.

The feature selection process in pattern recognition is composed of feature scoring and feature optimization. Feature scoring, the key to feature selection, finds the most distinguishable features in the classification space. Generally, feature scoring methods can be grouped into four classes: similarity-based, information-theory-based, statistics-based, and sparse-learning-based [8]. So far, researchers have proposed many different feature scoring methods [9]. For example, in unsupervised feature selection, Nonnegative Laplacian embedding is used to estimate the feature contribution [10]. Constraint Score is applied to feature scoring in environmental sound classification [11]. The ReliefF-based feature selection algorithm is employed to select features in automatic bird species identification [12]. PCA is used as a feature reduction technique to realize automatic recognition of bird sounds [13].

Hindawi, Mathematical Problems in Engineering, Volume 2021, Article ID 8872248, 14 pages. https://doi.org/10.1155/2021/8872248

Meanwhile, feature optimization, the second phase of feature selection, selects a subset of features characterized by low redundancy and high contribution to the classification from the feature sequence ordered by scores. Filter, Wrapper, and Embedded are three types of methods used to select a subset of features, and many studies have proposed various feature optimization algorithms based on these methods. The Binary Dragonfly Optimization Algorithm, PSO (Particle Swarm Optimization), and Artificial Bee Colony are some examples. Specifically, S-shaped and V-shaped transfer functions can be used to map a continuous search space to a discrete search space [14]. Mutual information can be combined with PSO to eliminate redundant features [15]. In some research, the gradient boosting Decision Tree [16] is used to evaluate feature contribution, and Artificial Bee Colony is applied to optimize the features [17]. The Pearson correlation coefficient is a common evaluation metric used in the literature, which evaluates the correlation between features and is followed by Artificial Ant Colony to select high-quality features [18].

Most feature scoring methods, such as Constraint Score and Laplacian Score, are based on the correlation and differences among spatial distances between features. Although these algorithms have low time complexity, the diversity of the features is neglected; specifically, the units of the features are usually different. Some algorithms calculate the mutual information between the feature sample and the label from a probabilistic and statistical perspective [15]. However, the label is generally a discrete variable, while features are continuous variables. In recent years, many studies regard feature selection as an optimization process and combine feature selection with intelligent searching methods [9, 19-22]. The multiobjective optimization problem of a large dataset has a high time and space complexity. A reduction in the features' dimensions usually leads to a decrease in the classification model's sensitivity and generalization.

Regarding the issues mentioned above, from an information theory perspective, this paper proposes a feature scoring method, MICV (Mutual Information and Coefficient of Variation). MICV utilizes the characteristics of mutual information and the coefficient of variation and aims to minimize intraclass distance and maximize interclass distance. A feature optimization method, ERMFT (Eliminating Redundancy Based on Maximum Feature Tree), is suggested based on the minimum spanning tree concept. Experimental results show that the MICV-ERMFT method can effectively reduce the data dimension and improve the classification model's performance. Compared with eight feature evaluation methods, the MICV-ERMFT method shows significant improvement in performance on the same datasets in this paper.

2. Materials and Methods

In bird sound recognition, there exists a variety of methods to extract features and classify the sounds. For example, Human Factor Cepstral Coefficients are used to extract bird sound features, and classification and recognition are performed by the maximum likelihood method [23]. Zottesso et al. [24] suggest a method that extracts bird song features based on the spectrogram and texture descriptors and uses the dissimilarity framework for classification and recognition. In this paper, the classification process of bird sounds is divided into three stages: feature extraction, feature selection, and classification recognition. Feature selection is selected as the research focus. The proposed classification process of bird sounds based on MICV-ERMFT is shown in Figure 1.

Stage 1: Preprocess the bird sound audio data (remove noise and convert the channel), use MFCC and CELP to extract features from the preprocessed data, and construct dataset DM&C (the dataset formed by merging the MFCC and CELP features).
Stage 2: Apply the MICV method on DM&C, evaluate the contribution of and score each feature. Sort the feature sequence in ascending order, denoted as F, calculate the Pearson correlation coefficient for the features, and build a maximum feature tree T. Then apply the ERMFT method to eliminate redundant features and construct a new dataset D'M&C.
Stage 3: Build a classification model on D'M&C and analyze the classification results.

2.1. Feature Extraction. Birds make sounds in the same way as humans do [25, 26]. The frequency of human language used for daily communication ranges from 180 Hz to 6 kHz, and the most used frequency range for bird calls is from 0.5 to 6 kHz [25, 27]. Under this assumption, we process the features of bird sounds in a way similar to processing human language. MFCC (Mel-Frequency Cepstral Coefficients) and CELP (Code Excited Linear Prediction) are applied to the raw bird sound data to extract features in this paper.

2.1.1. MFCC. MFCC [3] is a human-hearing-based nonlinear feature extraction method. The process is shown in Figure 2.

Step 1. A single-frame short-term signal xw(i, n) is obtained by separating frames and adding a window function to the original audio signal x(n). Adding a window function reduces frequency spectrum leakage. This paper selects 20 ms as the frame length and uses the Hamming window.

Step 2. To observe the distribution of xw(i, n) in the frequency domain, FFT (fast Fourier transform) is used to transform the signal from the time domain to the frequency domain, denoted as X(i, k):


X(i, k) = \mathrm{FFT}[x_i(m)]   (1)

Step 3. Calculate the energy of the spectral line per frame:

E(i, k) = [X(i, k)]^2   (2)

Step 4. Calculate the energy of E(i, k) through the Mel filter bank:

S(i, m) = \sum_{k=0}^{n-1} E(i, k)\, H_m(k), \quad 0 \le m < M   (3)

where i is the i-th frame, k is the k-th spectral line in the spectrum, and Hm(k) is the analysis window with a sample length of k.

Step 5. Take the logarithm of the Mel filter energy and calculate the DCT (Discrete Cosine Transform):

\mathrm{mfcc}(i, n) = \sum_{m=0}^{M-1} \log[S(i, m)] \cos\left(\frac{\pi n (2m - 1)}{2M}\right)   (4)

where m is the m-th Mel filter, i is the i-th frame, and n is the spectral line index after the DCT.

In this paper, MFCC uses 13-dimensional static coefficients (a 1-dimensional log energy coefficient and 12-dimensional DCT coefficients) as extraction parameters [3, 28]. The resulting sample has 13 features.
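The steps above can be sketched directly in NumPy/SciPy. The sketch below is a minimal illustration, not the authors' implementation; the frame length (20 ms at 16 kHz), the hop size, and the number of Mel filters are assumed values, and librosa is used only to build the Mel filterbank.

```python
# Minimal sketch of MFCC Steps 1-5; parameter values are illustrative assumptions.
import numpy as np
import librosa
from scipy.fft import dct

def mfcc_features(x, sr=16000, frame_len=320, hop=160, n_mels=26, n_ceps=12):
    # Step 1: framing + Hamming window
    window = np.hamming(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack([x[i*hop:i*hop+frame_len] * window for i in range(n_frames)])
    # Step 2: FFT of each windowed frame
    X = np.fft.rfft(frames, n=frame_len, axis=1)
    # Step 3: spectral line energy E(i,k) = |X(i,k)|^2
    E = np.abs(X) ** 2
    # Step 4: Mel filterbank energies S(i,m)
    H = librosa.filters.mel(sr=sr, n_fft=frame_len, n_mels=n_mels)  # (n_mels, n_fft//2 + 1)
    S = E @ H.T
    # Step 5: log + DCT; keep 12 cepstral coefficients and prepend the log energy
    ceps = dct(np.log(S + 1e-10), type=2, axis=1, norm='ortho')[:, 1:n_ceps + 1]
    log_energy = np.log(E.sum(axis=1) + 1e-10)[:, None]
    return np.hstack([log_energy, ceps])  # 13 features per frame
```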

2.1.2. CELP. The CELP feature extraction method is derived from LPC (Linear Predictive Coding) and is based on the compression coding standard G.723.1. The LPC is extracted from the 0th to 23rd bits of the bit coding in each frame, forming the 10-dimensional LPC feature. Another 2-dimensional feature, the lag of pitch, is extracted from the 24th to the 42nd bits of the bit stream in each frame. The extraction of CELP is shown in Figure 3.

Endpoint detection is performed after the original audio file is preprocessed. Then each audio file is divided into several sound segments. Each sound segment is considered as a sample in the experiment. For each frame, features are extracted using MFCC (13 dimensions) and CELP (12 dimensions). The sampling rate is 16 kHz, and the audio is single channel. Each sample contains several frames.

Figure 1: Bird sound classification model based on MICV-ERMFT feature selection.

Figure 2: MFCC schematic.


For each detection segment (including many frames), the mean, median, and variance of each feature are calculated to obtain 75-dimensional data. The feature extraction process is shown in Figure 4.
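A minimal sketch of this segment-level aggregation is shown below; it assumes frame_feats holds the per-frame MFCC (13) and CELP (12) features of one detected segment, and the helper name is hypothetical.

```python
# Sketch: collapse per-frame features of one segment into a 75-dimensional sample vector.
import numpy as np

def segment_vector(frame_feats):
    # frame_feats: (n_frames, 25) array of MFCC (13) + CELP (12) features per frame
    mean = frame_feats.mean(axis=0)
    var = frame_feats.var(axis=0)
    median = np.median(frame_feats, axis=0)
    return np.concatenate([mean, var, median])  # 3 x 25 = 75-dimensional sample
```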

2.2. Feature Scoring Method MICV. Based on the principle of small distances within classes and large distances between classes, features that are easy to distinguish are selected. To calculate the degree of feature differentiation, the mutual information MIEC (Mutual Information for Interclass) is used to measure the interclass distance, and the coefficient of variation CVAC (Coefficient of Variation for Intraclass) is used to measure the intraclass distance.

The MIEC and CVAC methods are combined to calculate the classification contribution degree of features. The calculation equation is

\mathrm{micv}_f = \lambda\, \mathrm{miec}_f + (1 - \lambda)\, \mathrm{cvac}_f   (5)

Because the intraclass distance and the interclass distance have different weights, the coefficient λ (0 < λ < 1) is introduced to adjust the weights.

2.2.1. MIEC. Mutual information measures the correlation or the dependency between two variables. For two discrete random variables X and Y, the mutual information I(X, Y) is calculated as

I(X, Y) = \sum_{y \in Y} \sum_{x \in X} p(x, y) \log\left(\frac{p(x, y)}{p(x)\, p(y)}\right)   (6)

In equation (6), p(x, y) is the joint probability density function of x and y; p(x) and p(y) are the marginal probability density functions of x and y.
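For illustration, mutual information can be estimated from an empirical joint histogram; the sketch below is one such estimate of equation (6), with the number of bins chosen arbitrarily.

```python
# Sketch of equation (6): estimate I(X;Y) from the empirical joint distribution of two
# equal-length sample vectors, discretized into a fixed number of bins.
import numpy as np

def mutual_information(x, y, bins=16):
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p_xy = joint / joint.sum()
    p_x = p_xy.sum(axis=1, keepdims=True)
    p_y = p_xy.sum(axis=0, keepdims=True)
    nz = p_xy > 0                       # skip empty cells to avoid log(0)
    return float((p_xy[nz] * np.log(p_xy[nz] / (p_x @ p_y)[nz])).sum())
```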

Generally, when mutual information is used to select features, the variables X and Y represent the feature vector and the label vector. In this paper, X and Y represent two vectors of different classes under the same feature. Given feature space F and classification space C, the interclass mutual information of the f-th feature, miec_f, is calculated as

\mathrm{miec}_f = \sum_{i \in C} \sum_{j \in C \setminus \{i\}} I(i, j)   (7)

In equation (7), i and j (i ≠ j) index the samples of the f-th feature in the i-th class and the j-th class, and miec_f is the interclass mutual information of the f-th feature in F. The interclass difference of feature f is greater when miec_f is smaller, and vice versa.

2.2.2. CVAC. In statistics, the coefficient of variation (CV) measures the variation between two or more samples, or the dispersion among them. The expression is

C_v = \frac{\sigma}{\mu}   (8)

where μ and σ are the mean and standard deviation of the samples. Given feature space F and classification space C, the intraclass coefficient of variation of feature f, cvac_f, is calculated as

\mathrm{cvac}_f = \sum_{i=1}^{|C|} C_{v_i}   (9)

In equation (9), C_{v_i} represents the CV of the samples in class i. The feature f has a higher cohesion when cvac_f is smaller.
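The sketch below combines equations (5) and (7)-(9) into a per-feature MICV score. Because the two class-conditional sample vectors of a feature can differ in length, they are truncated to a common length before the joint histogram is built; this pairing, the bin count, and the small constant guarding against division by zero are assumptions, not details taken from the paper.

```python
# Sketch of the MICV score (equations (5), (7)-(9)); the interclass MI estimate is one
# plausible reading of the method, not the authors' exact implementation.
import numpy as np

def _mi(a, b, bins=16):
    n = min(len(a), len(b))
    joint, _, _ = np.histogram2d(a[:n], b[:n], bins=bins)
    p_xy = joint / joint.sum()
    p_x, p_y = p_xy.sum(axis=1, keepdims=True), p_xy.sum(axis=0, keepdims=True)
    nz = p_xy > 0
    return float((p_xy[nz] * np.log(p_xy[nz] / (p_x @ p_y)[nz])).sum())

def micv_scores(X, y, lam=0.1):
    """X: (n_samples, n_features) array, y: class labels. Returns one MICV score per feature."""
    y = np.asarray(y)
    classes = np.unique(y)
    scores = np.empty(X.shape[1])
    for f in range(X.shape[1]):
        col = X[:, f]
        # miec_f: sum of pairwise interclass mutual information (equation (7))
        miec = sum(_mi(col[y == ci], col[y == cj])
                   for ci in classes for cj in classes if ci != cj)
        # cvac_f: sum of per-class coefficients of variation (equations (8)-(9))
        cvac = sum(col[y == c].std() / (abs(col[y == c].mean()) + 1e-12) for c in classes)
        scores[f] = lam * miec + (1 - lam) * cvac        # equation (5)
    return scores
```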

2.3. Feature Selection Method MICV-ERMFT. After scoring the features using the MICV method, high-quality features are selected. MICV-ERMFT is used to eliminate redundant features in the feature array sorted by scores. The process is shown in Algorithm 1.

2.3.1. Build Maximum Feature Tree. The maximum feature tree is derived from the minimum spanning tree. For an undirected graph G(V, E) in which each edge has a weight w, a minimum spanning tree is a subset of edges E' that connects all the vertices V with no cycle, such that the total weight of the edges in E' is minimum. In a maximum feature tree, features are represented as vertices, and the weights of the edges are decided by the Pearson correlation coefficient. P(Fr, Fc) represents the correlation coefficient between features Fr and Fc, which is calculated as

P(F_r, F_c) = \frac{\sum_i (F_{ri} - \bar{F}_r)(F_{ci} - \bar{F}_c)}{\sqrt{\sum_i (F_{ri} - \bar{F}_r)^2}\ \sqrt{\sum_i (F_{ci} - \bar{F}_c)^2}}   (10)

I(F_r, F_c) = -\frac{\log_2\left(1 - P^2(F_r, F_c)\right)}{2}   (11)

In equation (10), F_{ri} represents the i-th sample of feature r, and \bar{F}_r is feature r's mean value over all samples. In equation (11), I(F_r, F_c) is the correlation coefficient between features r and c. The algorithm BMFT (Building the Max Feature Tree) uses equations (10) and (11) to calculate the correlation coefficient matrix and construct the maximum feature tree. Details are described in Algorithm 2.
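Equations (10) and (11) amount to the feature-by-feature Pearson correlation matrix followed by an elementwise transform, as in the hedged sketch below (the clipping that keeps the logarithm finite is an added safeguard, not part of the paper).

```python
# Sketch of equations (10)-(11): Pearson correlations over feature columns and the
# derived edge weights used for the maximum feature tree.
import numpy as np

def edge_weights(X):
    P = np.corrcoef(X, rowvar=False)            # (n_features, n_features), equation (10)
    P = np.clip(P, -0.999999, 0.999999)         # keep the log finite when |P| -> 1
    return -np.log2(1.0 - P ** 2) / 2.0         # equation (11)
```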

Figure 3: CELP feature extraction process (the audio signal is encoded into a bit stream, from which the LPC features and the lag of pitch are extracted, giving 12-dimensional CELP features per frame).


2.3.2. Remove Redundant Features Based on Two Neighborhoods. ERFTN (Eliminate Redundant Features based on Two Neighborhoods) eliminates redundancy using the concept of two neighborhoods. One example, with a maximum feature tree T and a feature sequence F sorted using the MICV method, is demonstrated in Figure 5.

As shown in Figure 5, given the maximum feature tree T and F = {f2, f1, f3, f4, f5, f7, f9, f8, f6, f10} sorted with the MICV method in ascending order, the steps of the ERFTN algorithm are listed in Algorithm 3. The final feature subset of F is {f2, f3, f7, f10}.

3. Experiments and Results Analysis

3.1. Experimental Dataset. Currently, there are many websites dedicated to sharing bird sounds from around the world, such as Avibase [29] and Xeno-Canto [30]. Recordings of bird sounds are collected and annotated on these websites. The tapes include various types of vocal expressions (multiple calls and songs) of various individuals recorded in their natural environment. The dataset used for this paper comes from Avibase and is a collection of MP3 or WAV audio files. These audio files are unified to a 16 kHz sampling rate and mono channel. Since the audio files are not all bird sounds, the bird sounds in the audio are separated through voice activity detection (VAD) [25, 31], and then the MFCC and CELP features are extracted according to the process shown in Figure 4.

The experiments use two datasets: bird sounds and crane sounds. We selected six bird species from different genera for the bird sound dataset, which contains 433 samples. The crane sound dataset includes 343 samples from seven species of Grus. The dataset information is shown in Tables 1 and 2.

3.2. The Experiment of the MICV Scoring Method. To verify the proposed method's effectiveness, two separate experiments are conducted to test the MICV scoring method and the MICV-ERMFT feature selection method. The classifiers used in the experiments include Decision Tree (J48), SVM, BayesNet (NB), and Random Forests (RFs). The feature scoring method is compared with ConstraintScore (CS) [11] and six other feature scoring methods provided by Weka [32], including Correlation (Cor), GainRatio (GR), InfoGain (IG), One-R (OR), ReliefF (RF), and SymmetricalUncert (SU).

3.2.1. Classifier Performance Evaluation. Kappa, F1 score, and accuracy rate are used as evaluation indicators.

(1) Kappa. Cohen's Kappa coefficient is a statistical measure that indicates the interrater reliability (and also intrarater reliability) for qualitative (categorical) items:

\mathrm{Kappa} = \frac{p_o - p_e}{1 - p_e}   (12)

where p_o is the overall classification accuracy, calculated as the number of correctly classified samples divided by the total number of samples. Based on the confusion matrix, assume the numbers of real samples in each class are {a_1, a_2, ..., a_n} and the numbers of predicted samples are {b_1, b_2, ..., b_n}; then p_e is calculated as

p_e = \frac{a_1 \times b_1 + a_2 \times b_2 + \cdots + a_n \times b_n}{N \times N}   (13)

where N is the total number of samples.
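A small sketch of equations (12) and (13), computing Kappa from a confusion matrix; following the standard definition, the denominator of p_e uses the squared total sample count.

```python
# Sketch of equations (12)-(13): Cohen's Kappa from a confusion matrix C, where C[i, j]
# counts samples of true class i predicted as class j.
import numpy as np

def cohen_kappa(C):
    n = C.sum()
    p_o = np.trace(C) / n                                     # overall accuracy
    p_e = (C.sum(axis=1) * C.sum(axis=0)).sum() / (n * n)     # chance agreement
    return (p_o - p_e) / (1 - p_e)
```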

(2) F1 Score. The F1 score is an index used in statistics to measure the accuracy of classification models, taking into account both the precision and the recall of the classification model. As shown in equation (14), precision represents the precision rate and recall represents the recall rate.

Figure 4: Extraction process of bird sound features (each detection fragment contains n frames; the mean, variance, and median of the n frame values are calculated for each feature, giving 75 dimensions).

Figure 5: Schematic of the ERFTN (the maximum feature tree T and the feature sequence F scored by MICV, ordered from Min to Max).


F1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}   (14)

(3) Accuracy. The accuracy is calculated based on the equation

\mathrm{accuracy} = \frac{n}{M}   (15)

In equation (15), n represents the number of correct classifications and M represents the number of all samples.

Each dataset is divided into a 70% training set and a 30% test set. Each experiment is repeated 10 times to average out biased results.
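A sketch of this evaluation protocol with scikit-learn is given below; the decision tree standing in for J48, the stratified split, the macro-averaged F1, and the seeds are illustrative assumptions.

```python
# Sketch of the evaluation protocol: a 70/30 split repeated 10 times, scoring Kappa,
# accuracy, and macro F1 with scikit-learn.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import cohen_kappa_score, accuracy_score, f1_score

def evaluate(X, y, n_repeats=10):
    kappas, accs, f1s = [], [], []
    for seed in range(n_repeats):
        Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, stratify=y,
                                              random_state=seed)
        pred = DecisionTreeClassifier(random_state=seed).fit(Xtr, ytr).predict(Xte)
        kappas.append(cohen_kappa_score(yte, pred))
        accs.append(accuracy_score(yte, pred))
        f1s.append(f1_score(yte, pred, average='macro'))
    return np.mean(kappas), np.mean(accs), np.mean(f1s)
```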

3.2.2. MICV λ Parameter Setting. In equation (5), λ adjusts the weight coefficients of MIEC and CVAC. The experiments set λ ∈ {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9} and calculate the MICV with the J48 classifier. When the highest Kappa is reached, the ratio of the number of selected features to the total number of features is listed in Table 3. A lower ratio indicates better performance. Table 3 shows that better results are obtained when λ is set to 0.1, 0.2, or 0.3. In the following experiments in this paper, λ is set to 0.1.

3.2.3. Compare MIEC, CVAC, and MICV. The selected feature set has a decisive effect on the classification model. Features with higher scores normally lead to better classification performance. The experiments sort the feature sequence in ascending order according to the feature scores obtained from MIEC, CVAC, and MICV, respectively. In Figure 6, in most cases, the red curves ascend more stably, which shows that as features are gradually added, the classification model's performance improves, especially in Figure 6(a). The CVAC and MIEC methods show obvious fluctuations in Figures 6(a) and 6(b). To sum up, combining MIEC and CVAC works better than using either of them alone.

Name: MICV-ERMFT feature selection
Input: Dataset D (m: number of samples, n: number of features)
Steps:
(1) Calculate the MICV score using equation (5) for each feature in D.
(2) Sort the features by MICV score in ascending order to obtain F. Following F, add the corresponding columns of D one by one and score each addition with the base classifier; delete any feature whose addition causes the index to decline, obtaining the feature sequence F*. Map F* to D to get dataset D*.
(3) Calculate the Pearson correlation coefficient matrix P for the feature vectors of D*.
(4) Apply algorithm BMFT (Algorithm 2) to construct a maximum feature tree T from P.
(5) Apply the two-neighborhood-based redundancy elimination algorithm ERFTN (Algorithm 3) on F*; denote the resulting array as F**.
(6) Map F** to D* to get dataset D**.
Output: New dataset D**

Algorithm 1: MICV-ERMFT feature selection.

Name: BMFT (building the max feature tree)
Input: Correlation coefficient matrix P (n × n), where n is the number of features
Steps:
(1) Initialize the root: T = {1}.
(2) Set the elements on the main diagonal of P to −1, to eliminate the influence of each feature on itself.
(3) while |T| ≤ 2n + 1 do /* Since P is a strongly connected graph and T records the adjacency relationships of the elements in P, there are 2n + 1 elements in T, including the initial node. */
(4)   D = P(1..n, T) /* D records the correlation coefficients of the neighboring nodes of the nodes in T (D is a column vector mapped from T to P). */
(5)   D(T, 1..n) = −1 /* The nodes already visited are recorded in T; this operation is equivalent to deleting the accessed data from D. */
(6)   [row, id] = FindIndex[max(D)] /* Find the maximum value among all nodes adjacent to the visited nodes and record its row index. */
(7)   T = T ∪ {T(end), row, id} /* T(end) and (row, id) record the adjacency relationship; for example, T(end) and (row, id) are adjacent nodes. */
(8) end while
(9) return T
Output: Maximum feature tree T

Algorithm 2: BMFT.


3.2.4. Experiment of MICV: Results and Analysis. In this section, the proposed MICV is tested on the Birds dataset and the Crane dataset. The results of the experiments in Figures 7 and 8 show that, with the same number of selected features, the Kappa value of the MICV method is generally higher than that of the other methods. As the number of features increases, the Kappa value of the MICV method converges earlier and remains relatively stable compared with the other methods. MICV is more effective compared with the results of the other feature evaluation methods.

Tables 4 and 5 record the best classification results (Kappa, accuracy, and F1 scores) for each feature scoring sequence, as well as the number of features used to obtain this value. In each row of the tables, a bold value on the left side of "|" indicates that the method uses the fewest features among all methods, and a bold value on the right side indicates that the method has the highest evaluation indicator score. Table 4 shows that, on the Birds dataset, the MICV method had the highest Kappa value under the four different classifiers. With the J48, NB, and RFs classifiers, the MICV method had the lowest number of features and the highest evaluation indicator scores in most cases. As shown in Table 5, the performance of MICV with the J48, NB, and RFs classifiers is significant.

In summary, the MICV method is more effective in selecting optimal features than the other seven methods. The method can also achieve a good modeling effect using a lower dimension.

3.3. Experiment of MICV-ERMFT Feature Selection. In the second part of the experiment, features are evaluated using CS and six other Weka methods, including Cor, GR, IG, OR, RF, and SU.

Name: ERFTN (Eliminate Redundant Features based on Two Neighborhoods)
Input: T, the max feature tree built by algorithm BMFT; F, the features sorted with the MICV method
Steps:
(1) Get the first element x in F.
(2) V = {y | y ∈ T, y is an adjacent vertex of x}.
(3) Update F by deleting all vertices in V, that is, F = F \ V.
(4) Choose the next unvisited element as x.
(5) Repeat (2) to (4) until all the elements in F have been visited.
(6) Output F as the final feature subset.
Output: F

Algorithm 3: ERFTN.
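A compact sketch of ERFTN is given below; order is the MICV-sorted feature list and edges is the (parent, child) edge list produced by the BMFT sketch above.

```python
# Sketch of ERFTN (Algorithm 3): walk the MICV-ordered feature list and drop every feature
# that is adjacent in the maximum feature tree to one that has already been kept.
def erftn(order, edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    kept, removed = [], set()
    for f in order:                       # order: features sorted ascending by MICV score
        if f in removed:
            continue
        kept.append(f)
        removed |= adj.get(f, set())      # discard the tree neighbours of the kept feature
    return kept
```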

Figure 6: Experimental results of the MIEC, CVAC, and MICV feature selection methods (accuracy (%) vs. number of features): (a) Birds dataset; (b) Crane dataset.


3.3.1. Procedure of Experiment. The procedure is demonstrated in Figure 9. Eight different methods (MICV and the seven other methods mentioned above) are used to evaluate each feature's classification contribution and score the features. After sorting the features in ascending order based on the scores, the ERMFT method is used to eliminate redundant features, resulting in a feature subset F'. F' is then mapped to the dataset, resulting in Dataset'. J48, SVM, BayesNet (NB), and Random Forests (RFs) are the experiment's classifiers. Each independent dataset is divided into a 70% training set and a 30% test set. Each experiment is repeated ten times, and the average Kappa is calculated. In addition, the DRR (Dimensionality Reduction Rate) is introduced as an evaluation indicator.

Figure 7: Experimental results of MICV and the other 7 feature evaluation methods with different classifiers (Birds dataset; Kappa vs. number of features): (a) J48; (b) SVM; (c) RFs; (d) NB.


Figure 8: Experimental results of MICV and the other 7 feature evaluation methods with different classifiers (Crane dataset; Kappa vs. number of features): (a) J48; (b) SVM; (c) RFs; (d) NB.

Table 1: Bird sound dataset information.
Latin name | Eng name | Genus | Number of samples | Rate (%)
Phalacrocorax carbo | Great cormorant | Phalacrocorax | 36 | 8.31
Numenius phaeopus | Whimbrel | Numenius | 90 | 20.79
Aegithina nigrolutea | White-tailed iora | Aegithina | 120 | 27.72
Chrysolophus amherstiae | Lady Amherst's pheasant | Chrysolophus | 68 | 15.70
Falco tinnunculus | Common kestrel | Falco | 61 | 14.09
Tadorna ferruginea | Ruddy shelduck | Tadorna | 58 | 13.39


Table 2: Crane sound dataset information.
Latin name | Eng name | Genus | Number of samples | Rate (%)
Grus vipio | White-naped crane | Grus | 24 | 7.00
Grus canadensis | Sandhill crane | Grus | 39 | 11.37
Grus virgo | Demoiselle crane | Grus | 60 | 17.49
Grus grus | Common crane | Grus | 62 | 18.08
Grus monacha | Hooded crane | Grus | 62 | 18.08
Grus japonensis | Red-crowned crane | Grus | 29 | 8.45
Grus nigricollis | Tibetan crane | Grus | 67 | 19.53

Figure 9: Flowchart of the experiment of MICV-ERMFT feature selection (feature scores from MICV are sorted in ascending order and those from the other methods in descending order before BMFT and ERFTN are applied).


\mathrm{DRR} = \left(1 - \frac{F_n'}{F_n}\right) \times 100\%   (16)

In equation (16), F_n' is the number of selected features and F_n is the total number of features in each dataset. The larger the DRR value, the stronger the dimensionality reduction ability.
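For completeness, equation (16) in code form (a trivial sketch; the call shown is only an example):

```python
# Sketch of equation (16): dimensionality reduction rate for a selected feature subset.
def drr(n_selected, n_total):
    return (1.0 - n_selected / n_total) * 100.0   # in percent

# example: drr(38, 75) -> about 49.3
```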

3.3.2. Experiment of MICV-ERMFT: Results and Analysis. Figure 10 shows the experimental results obtained from four different classifiers using eight different feature evaluation methods combined with ERMFT. Figures 10(a) and 10(b) are the results on the Birds dataset; Figures 10(c) and 10(d) are the results on the Crane dataset. In Figure 10(a), four histogram groups represent the results under the four classifiers, and the 9 elements in each group are the Kappa values calculated from the eight methods with ERMFT and from the original data (ORI). The heat map in Figure 10(b) shows the number of selected features when Kappa reaches a certain value for each method, and similarly for Figures 10(c) and 10(d). In Figure 10(a), it can be clearly observed that the MICV-ERMFT method has a slightly higher Kappa than the other methods, and the effect with the J48 classifier in Figure 10(c) is even more evident. Besides, the Kappa of the MICV-ERMFT method is higher than that of the original data. Looking at Figures 10(a) and 10(b) at the same time, it is evident that the MICV-ERMFT method achieves a good modeling effect using a small number of features compared with the other methods. Figures 10(c) and 10(d) show a similar result.

In conclusion, compared with the other seven methods, the MICV-ERMFT method demonstrates good abilities in dimensionality reduction and feature interpretation.

Combining Figures 8(b) and 8(d) with Table 6, it is obvious that the MICV-ERMFT method has a significant dimensionality reduction effect and model performance effect for the Birds dataset and the Crane dataset. In Table 6, the Kappa value and DRR performance are very good for the J48, NB, and SVM classifiers on the Birds dataset. Particularly for the NB classifier, the Kappa values of the other seven comparison methods do not exceed ORI, while the MICV-ERMFT method exceeds it by 0.04. On the Crane dataset, MICV-ERMFT outperforms the other methods. Table 7 shows the running time cost by the MICV-ERMFT method and the other seven feature selection methods; it is not more time-consuming than the other methods.

Figure 10: Experimental results of the MICV-ERMFT method: (a) Kappa obtained with different classifiers using different feature evaluation methods on the Birds dataset; (b) heat map of the number of selected features in (a); (c) Kappa obtained with different classifiers using different feature evaluation methods on the Crane dataset; (d) heat map of the number of selected features in (c).

Table 3: The ratio of the number of selected features to the total features for different values of λ.
Dataset | λ = 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9
Birds dataset | 0.293 | 0.293 | 0.466 | 0.453 | 0.493 | 0.440 | 0.480 | 0.666 | 0.560
Crane dataset | 0.240 | 0.360 | 0.160 | 0.333 | 0.706 | 0.506 | 0.826 | 0.933 | 0.866


Table 4: Comparison of Kappa, accuracy, and F1 scores with feature selection methods on the Birds dataset (each cell: number of features | highest value).
Kappa
  J48: Cor 72|0.85, GR 60|0.85, IG 59|0.86, OR 57|0.86, RF 73|0.83, SU 68|0.84, CS 51|0.84, MICV 38|0.88
  NB: Cor 68|0.79, GR 37|0.86, IG 36|0.87, OR 39|0.86, RF 70|0.77, SU 36|0.87, CS 58|0.79, MICV 46|0.88
  SVM: Cor 63|0.93, GR 44|0.93, IG 52|0.93, OR 61|0.93, RF 64|0.95, SU 46|0.93, CS 65|0.95, MICV 51|0.95
  RFs: Cor 71|0.97, GR 50|0.97, IG 36|0.97, OR 73|0.97, RF 72|0.95, SU 70|0.97, CS 53|0.96, MICV 30|0.97
Accuracy (%)
  J48: Cor 72|88.18, GR 60|88.18, IG 59|89.09, OR 57|89.09, RF 73|86.36, SU 65|87.27, CS 52|87.27, MICV 36|90.96
  NB: Cor 64|83.63, GR 37|89.09, IG 36|88.18, OR 39|89.09, RF 70|81.81, SU 36|90.00, CS 58|83.63, MICV 34|90.90
  SVM: Cor 63|94.54, GR 44|94.45, IG 41|94.00, OR 61|94.45, RF 64|96.36, SU 46|94.45, CS 65|96.36, MICV 51|93.63
  RFs: Cor 71|98.12, GR 50|98.12, IG 36|98.12, OR 73|98.18, RF 72|96.36, SU 70|98.12, CS 53|97.27, MICV 30|98.12
F1 score
  J48: Cor 72|0.88, GR 60|0.88, IG 59|0.89, OR 57|0.89, RF 73|0.86, SU 65|0.87, CS 51|0.87, MICV 38|0.87
  NB: Cor 64|0.83, GR 37|0.89, IG 36|0.89, OR 39|0.89, RF 70|0.81, SU 37|0.90, CS 58|0.83, MICV 34|0.90
  SVM: Cor 63|0.94, GR 49|0.94, IG 41|0.94, OR 61|0.94, RF 64|0.96, SU 46|0.94, CS 65|0.96, MICV 51|0.93
  RFs: Cor 71|0.96, GR 50|0.98, IG 55|0.98, OR 73|0.98, RF 73|0.96, SU 70|0.98, CS 54|0.97, MICV 30|0.98

Table 6: MICV-ERMFT compared with the other methods in terms of Kappa and DRR (each cell: Kappa | DRR (%)).
Birds dataset
  J48: Cor 0.83|52, GR 0.88|44, IG 0.83|53, OR 0.85|53, RF 0.81|49, SU 0.86|53, CS 0.79|50, MICV 0.87|58, ORI 0.83|0
  NB: Cor 0.76|40, GR 0.74|49, IG 0.77|53, OR 0.75|53, RF 0.74|44, SU 0.76|53, CS 0.76|53, MICV 0.81|53, ORI 0.77|0
  SVM: Cor 0.92|52, GR 0.92|44, IG 0.89|53, OR 0.87|44, RF 0.92|53, SU 0.88|53, CS 0.92|53, MICV 0.93|57, ORI 0.92|0
  RFs: Cor 0.95|44, GR 0.95|52, IG 0.95|40, OR 0.93|49, RF 0.96|53, SU 0.94|49, CS 0.94|53, MICV 0.93|50, ORI 0.93|0
Crane dataset
  J48: Cor 0.64|44, GR 0.74|50, IG 0.63|50, OR 0.70|53, RF 0.63|40, SU 0.62|50, CS 0.63|53, MICV 0.76|53, ORI 0.70|0
  NB: Cor 0.73|50, GR 0.75|52, IG 0.71|44, OR 0.73|53, RF 0.72|50, SU 0.71|50, CS 0.69|46, MICV 0.76|53, ORI 0.75|0
  SVM: Cor 0.83|42, GR 0.78|52, IG 0.73|50, OR 0.82|53, RF 0.77|50, SU 0.73|50, CS 0.84|53, MICV 0.84|52, ORI 0.84|0
  RFs: Cor 0.85|50, GR 0.83|40, IG 0.83|50, OR 0.84|53, RF 0.84|50, SU 0.83|42, CS 0.84|50, MICV 0.88|53, ORI 0.88|0

Table 5: Comparison of Kappa, accuracy, and F1 scores with feature selection methods on the Crane dataset (each cell: number of features | highest value).
Kappa
  J48: Cor 73|0.72, GR 69|0.74, IG 71|0.68, OR 70|0.72, RF 68|0.71, SU 71|0.69, CS 22|0.68, MICV 25|0.75
  NB: Cor 73|0.75, GR 69|0.75, IG 73|0.75, OR 73|0.75, RF 71|0.75, SU 73|0.75, CS 53|0.79, MICV 43|0.79
  SVM: Cor 73|0.84, GR 69|0.84, IG 73|0.84, OR 73|0.84, RF 72|0.86, SU 73|0.84, CS 73|0.84, MICV 69|0.84
  RFs: Cor 66|0.89, GR 51|0.88, IG 69|0.89, OR 73|0.89, RF 68|0.89, SU 63|0.88, CS 36|0.90, MICV 41|0.90
Accuracy (%)
  J48: Cor 73|77.00, GR 69|78.00, IG 71|73.00, OR 70|77.00, RF 68|76.00, SU 71|73.00, CS 22|73.00, MICV 18|79.00
  NB: Cor 73|79.00, GR 69|79.00, IG 73|79.00, OR 73|79.00, RF 71|79.00, SU 73|79.00, CS 53|81.00, MICV 43|83.00
  SVM: Cor 72|87.00, GR 68|87.00, IG 73|87.00, OR 73|87.00, RF 72|89.00, SU 73|87.00, CS 73|87.00, MICV 69|87.00
  RFs: Cor 66|91.00, GR 51|90.00, IG 69|91.00, OR 73|91.00, RF 58|91.00, SU 63|90.00, CS 36|90.00, MICV 41|91.00
F1 score
  J48: Cor 73|0.77, GR 69|0.78, IG 71|0.73, OR 70|0.77, RF 67|0.77, SU 71|0.73, CS 25|0.73, MICV 25|0.79
  NB: Cor 73|0.79, GR 69|0.79, IG 73|0.79, OR 73|0.79, RF 73|0.79, SU 73|0.79, CS 53|0.81, MICV 43|0.82
  SVM: Cor 72|0.86, GR 68|0.87, IG 73|0.86, OR 73|0.86, RF 72|0.88, SU 73|0.86, CS 73|0.87, MICV 69|0.86
  RFs: Cor 69|0.91, GR 51|0.89, IG 69|0.91, OR 73|0.90, RF 68|0.90, SU 66|0.90, CS 72|0.91, MICV 41|0.91

Table 7: The time used by different feature evaluation methods.
Birds dataset
  J48: Cor 21035, GR 18026, IG 27100, OR 45227, RF 23204, SU 27069, CS 23268, MICV 21074, ORI 31728
  NB: Cor 13208, GR 22001, IG 28666, OR 62036, RF 12039, SU 16001, CS 18569, MICV 18257, ORI 35810
  SVM: Cor 21580, GR 31500, IG 42689, OR 36028, RF 56244, SU 36028, CS 20789, MICV 25104, ORI 51568
  RFs: Cor 31829, GR 42626, IG 51698, OR 27853, RF 36952, SU 41524, CS 61236, MICV 21568, ORI 81732
Crane dataset
  J48: Cor 19326, GR 23825, IG 28624, OR 22596, RF 32632, SU 41069, CS 26547, MICV 11638, ORI 51629
  NB: Cor 18624, GR 16527, IG 16549, OR 39326, RF 43829, SU 52806, CS 41026, MICV 19628, ORI 46258
  SVM: Cor 63426, GR 71869, IG 65826, OR 63429, RF 53440, SU 33651, CS 46458, MICV 30824, ORI 73496
  RFs: Cor 49637, GR 53746, IG 60689, OR 40547, RF 41968, SU 31906, CS 66504, MICV 31869, ORI 83048


In the experiments on the Birds dataset and the Crane dataset, the Kappa metrics obtained with different classifiers using the MICV-ERMFT method are generally superior to those of the other methods. The MICV-ERMFT method remains excellent for the most part and is more stable than the other methods, although other methods surpass the MICV-ERMFT method with some classifiers. Besides, the MICV-ERMFT method improves the Kappa value compared with the original data. Although the improvement is minimal in some cases, the MICV-ERMFT method only uses about half of the features compared with the original data.

In conclusion, MICV-ERMFT has better performance in dimensionality reduction and model performance improvement.

4. Conclusion

Feature selection is an important preprocessing step in data mining and classification. In recent years, researchers have focused on feature contribution evaluation and redundancy reduction, and different optimization algorithms have been proposed to address this problem. In this paper, we measure the contribution of features to the classification from the perspective of probability. Combined with the maximum feature tree to remove redundancy, the MICV-ERMFT method is proposed to select the optimal features and is applied to the automatic recognition of bird sounds.

To verify the MICV-ERMFT method's effectiveness in automatic bird sound recognition, two datasets are used in the experiments: data from different genera (Birds dataset) and data from the same genus (Crane dataset). The results of the experiments show that the Kappa indicator on the Birds dataset reaches 0.93 with a dimension reduction rate of 57%, and the Kappa value on the Crane dataset is 0.88 with a dimension reduction rate of 53%; good results were obtained.

This study shows that the proposed MICV-ERMFT feature selection method is effective. The bird audio selected in this paper was noise filtered, and further research should test this method's performance together with a denoising method. We will continue to explore the performance of MICV-ERMFT on datasets with larger numbers of features and instances.

Data Availability

All the data included in this study are available upon request by contacting the corresponding author.

Disclosure

The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This research was funded by the National Natural Science Foundation of China under Grants nos. 61462078, 31960142, and 31860332.

References

[1] C. A. Ruiz-Martinez, M. T. Akhtar, Y. Washizawa, and E. Escamilla-Hernandez, "On investigating efficient methodology for environmental sound recognition," in Proceedings of the ISPACS 2013—2013 International Symposium on Intelligent Signal Processing and Communication Systems, pp. 210–214, Naha, Japan, November 2013.

[2] P. Jancovic and M. Kokuer, "Bird species recognition using unsupervised modeling of individual vocalization elements," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 5, pp. 932–947, 2019.

[3] A. D. P. Ramirez, J. I. De La Rosa Vargas, R. R. Valdez, and A. Becerra, "A comparative between mel frequency cepstral coefficients (MFCC) and inverse mel frequency cepstral coefficients (IMFCC) features for an automatic bird species recognition system," in Proceedings of the 2018 IEEE Latin American Conference on Computational Intelligence (LA-CCI), pp. 1–4, Guadalajara, Mexico, November 2018.

[4] D. Griffin and J. Jae Lim, "Signal estimation from modified short-time Fourier transform," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 32, no. 2, pp. 236–243, 1984.

[5] S. Kadambe and G. F. Boudreaux-Bartels, "Application of the wavelet transform for pitch detection of speech signals," IEEE Transactions on Information Theory, vol. 38, no. 2, pp. 917–924, 1992.

[6] E. Tsau, S.-H. Kim, and C.-C. J. Kuo, "Environmental sound recognition with CELP-based features," in Proceedings of the ISSCS 2011—International Symposium on Signals, Circuits and Systems, pp. 1–4, Iasi, Romania, July 2011.

[7] C. Collberg, C. Thomborson, and D. Low, "A taxonomy of obfuscating transformations," Technical Report 148, The University of Auckland, Auckland, New Zealand, 1997.

[8] S. García, J. Luengo, and F. Herrera, "Feature selection," Intelligent Systems Reference Library, vol. 72, pp. 163–193, 2015.

[9] V. Kumar and S. Minz, "Feature selection: a literature review," Smart Computing Review, vol. 4, 2014.

[10] Y. Zhang, Q. Wang, D.-W. Gong, and X.-F. Song, "Nonnegative Laplacian embedding guided subspace learning for unsupervised feature selection," Pattern Recognition, vol. 93, pp. 337–352, 2019.

[11] S. Zhao, Y. Zhang, H. Xu, and T. Han, "Ensemble classification based on feature selection for environmental sound recognition," Mathematical Problems in Engineering, vol. 2019, Article ID 4318463, 7 pages, 2019.

[12] S. H. Zhang, Z. Zhao, Z. Y. Xu, K. Bellisario, and B. C. Pijanowski, "Automatic bird vocalization identification based on fusion of spectral pattern and texture features," in Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275, Calgary, Canada, April 2018.

[13] A. V. Bang and P. P. Rege, "Recognition of bird species from their sounds using data reduction techniques," in Proceedings of the 7th International Conference on Computer and Communication Technology, pp. 111–116, Allahabad, India, November 2017.

[14] M. Mafarja, I. Aljarah, A. A. Heidari et al., "Binary dragonfly optimization for feature selection using time-varying transfer functions," Knowledge-Based Systems, vol. 161, pp. 185–204, 2018.

[15] Q. Wu, Z. Ma, J. Fan, G. Xu, and Y. Shen, "A feature selection method based on hybrid improved binary quantum particle swarm optimization," IEEE Access, vol. 7, pp. 80588–80601, 2019.

[16] H. W. Wang, Y. Meng, P. Yin, and J. Hua, "A model-driven method for quality reviews detection: an ensemble model of feature selection," in Proceedings of the Fifteenth Wuhan International Conference on E-Business, pp. 573–581, Wuhan, China, 2016.

[17] H. Rao, X. Shi, A. K. Rodrigue et al., "Feature selection based on artificial bee colony and gradient boosting decision tree," Applied Soft Computing, vol. 74, pp. 634–642, 2019.

[18] D. A. A. Gnana, "Literature review on feature selection methods for high-dimensional data," International Journal of Computer Applications, vol. 136, no. 1, pp. 9–17, 2016.

[19] G. I. Sayed, A. Darwish, and A. E. Hassanien, "A new chaotic whale optimization algorithm for features selection," Journal of Classification, vol. 35, no. 2, pp. 300–344, 2018.

[20] A. E. Hegazy, M. A. Makhlouf, and G. S. El-Tawel, "Improved salp swarm algorithm for feature selection," Journal of King Saud University—Computer and Information Sciences, vol. 32, no. 3, pp. 335–344, 2020.

[21] M. Khamees, A. Albakry, and K. Shaker, "Multi-objective feature selection: hybrid of salp swarm and simulated annealing approach," in Proceedings of the International Conference on New Trends in Information and Communications Technology Applications, pp. 129–142, Baghdad, Iraq, January 2018.

[22] M. Sadeghi and H. Marvi, "Optimal MFCC features extraction by differential evolution algorithm for speaker recognition," in Proceedings of the 2017 3rd Iranian Conference on Intelligent Systems and Signal Processing (ICSPIS), pp. 169–173, Shahrood, Iran, December 2017.

[23] A. V. Bang and P. P. Rege, "Automatic recognition of bird species using human factor cepstral coefficients," Smart Computing and Informatics, vol. 77, pp. 363–373, 2018.

[24] R. H. D. Zottesso, Y. M. G. Costa, D. Bertolini, and L. E. S. Oliveira, "Bird species identification using spectrogram and dissimilarity approach," Ecological Informatics, vol. 48, pp. 187–197, 2018.

[25] J. Stastny, M. Munk, and L. Juranek, "Automatic bird species recognition based on birds vocalization," EURASIP Journal on Audio, Speech, and Music Processing, vol. 2018, no. 1, pp. 1–7, 2018.

[26] S. Fagerlund, Automatic Recognition of Bird Species by Their Sounds, Helsinki University of Technology, Espoo, Finland, 2004.

[27] L. Ptacek, Birds individual automatic recognition, PhD thesis, University of West Bohemia, Pilsen, Czechia, 2012.

[28] A. B. Labao, M. A. Clutario, and P. C. Naval, "Classification of bird sounds using codebook features," in Proceedings of the Asian Conference on Intelligent Information and Database Systems, pp. 223–233, Dong Hoi City, Vietnam, March 2018.

[29] D. Lepage, "Avibase—the world bird database," 2020, https://avibase.bsc-eoc.org/avibase.jsp.

[30] G. A. Pereira, "Xeno-canto—sharing bird sounds from around the world," 2003, https://www.xeno-canto.org.

[31] J. Stastny, V. Skorpil, and J. Fejfar, "Audio data classification by means of new algorithms," in Proceedings of the 2013 36th International Conference on Telecommunications and Signal Processing (TSP), pp. 507–511, Rome, Italy, July 2013.

[32] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, "The WEKA data mining software," ACM SIGKDD Explorations Newsletter, vol. 11, no. 1, pp. 10–18, 2009.


Page 2: Feature Selection Using Maximum Feature Tree Embedded with

Meanwhile feature optimization the second phase offeature selection selects a subset of features characterized bylow redundancy and high contribution to the classificationfrom the feature sequence ordered by scores FilterWrapper and Embedded are three types of methods used toselect a subset of features and many studies have proposedvarious feature optimization algorithms based on thesemethods Binary Dragonfly Optimization Algorithm PSO(Particle Swarm Optimization) and Artificial Bee Colonyare some examples Specifically S-shaped and V-shapedtransfer function can be used to map continuous searchspace to discrete search space [14] Mutual information canbe combined with PSO to eliminate redundant features [15]In some research the gradient enhanced Decision Tree [16]is used to evaluate feature contribution and Artificial BeeColony is applied to optimize the features [17] Pearsoncorrelation coefficient is a common evaluation metric usedin literature which evaluates the correlation between fea-tures and is followed by Artificial Ant Colony to select high-quality features [18]

Most feature scoring methods such as Constraint Scoreand Laplacian are based on the correlation and differencesamong spatial distances between features Although thesealgorithms have low time complexity the diversity of thefeatures is neglected Specifically units of the features areusually different Some algorithms calculate the mutualinformation between the feature sample and the label from aprobabilistic and statistical perspective [15] However thelabel is generally a discrete variable while features arecontinuous variables In recent years many studies regardfeature selection as an optimization process and combinefeature selection with intelligent searching methods[9 19ndash22] -e multiobjective optimization problem of alarge dataset has a high time and space complexity A re-duction in the featuresrsquo dimensions usually decreases in theclassification modelrsquos sensitivity and generalization

Regarding the issues mentioned above from an infor-mation theory perspective this paper proposes a featurescoring method MICV (Mutual Information and Coefficientof Variation) MICV utilizes the characteristics of mutualinformation and coefficient of variation and aims tominimizeintraclass distance andmaximize interclass distance A featureoptimization method ERMFT (Eliminating RedundancyBased on Maximum Feature Tree) is suggested based on aminimum spanning tree concept Experiment results showthat the MICV-ERMFT method can effectively reduce thedata dimension and improve the classification modelrsquos per-formance Compared with eight feature evaluation methodsthe MICV-ERMFT method has significant improvement inthe performance on the same dataset in this paper

2 Materials and Methods

In bird soundsrsquo recognition there exists a variety of methodsto extract features and classify the sounds For exampleHuman Factor Cepstral Coefficients are used to extract bird

sound features and classification and recognition are per-formed by the maximum likelihood method [23] Zottessoet al [24] suggest a method that extracts bird song featuresbased on the spectrogram and texture descriptors and usesthe dissimilarity framework for classification and recogni-tion In this paper the classification process of bird sounds isdivided into three stages feature extraction feature selec-tion and classification recognition Feature selection is se-lected as the research focus -e proposed classificationprocess of bird sounds based on MICV-ERMFT is shown inFigure 1

Stage 1 Preprocess the bird soundsrsquo audio data (removenoises and converse the channel) and use MFCC andCELP to extract features from the preprocessed dataand construct dataset DMampC (dataset formed by themerger of MFCC and CELP features)Stage 2 Apply the MICV method on DMampC evaluatethe contribution and score each feature Sort thefeature sequence in ascending order denote as F andcalculate the Pearson correlation coefficient for thefeatures and build a maximum feature tree T -enapply the ERMFT method to eliminate redundantfeatures and construct a new dataset DMampCprimeStage 3 Build a classification model on DMampCprime andanalyze the classification results

21 Feature Extraction Birds make sounds in the same wayas humans do [25 26] -e frequency of human languageused for daily communication ranges from 180Hz to 6 kHzand the most used frequency range for bird calls is from 05to 6 kHz [25 27] Under this assumption we process thefeatures of bird sounds in a way similar to processing thehuman language MFCC (Mel-Frequency Cepstral Coeffi-cient) and CELP (Code Excited Linear Prediction) are ap-plied to the raw bird sounds data to extract features in thispaper

211 MFCC MFCC [3] is a human-hearing-based non-linear feature extraction method -e process is shown inFigure 2

Step 1 A single-frame short-term signal xw(i n) is ob-tained by separating frames and adding a window functionto the original audio signal x(n) Adding a windowfunction reduces the frequency spectrum leakage -ispaper selects the 20 s as a frame and uses the Hammingwindow

Step 2 To observe the distribution of xw(i n) in frequencydomain FFT (fast Fourier transform) is used to transformthe signal from the time domain to frequency domainnamed X(i k)

2 Mathematical Problems in Engineering

X(i k) FFT xi(m)1113858 1113859 (1)

Step 3 Calculate the energy of the spectral line per frame

E(i k) [X(i k)]2 (2)

Step 4 Calculate the energy of E(i k) through the Mel filter

S(i m) 1113944nminus1

k0E(i k)Hm(k) 0lemltM (3)

where i is the i-th frame k is the k-th spectral line in thespectrum and Hm(k) is the analysis window with a samplelength of k

Step 5 Take the logarithm of the energy of the Mel filter andcalculate the DCT (Discrete Cosine Transform)

mfcc(i m) 1113944Mminus1

m0log[S(i m)]cos

πn(2m minus 1)

2M1113888 1113889 (4)

where m is them-th Mel filter i is the i-th frame and n is thespectral line after the DCT

In this paperMFCC uses 13-dimensional static coefficients(1-dimensional log energy coefficient and 12-dimensional DCTcoefficients) as extraction parameters [3 28] -e resultingsample has 13 features

212 CELP -e CELP feature extraction method is derivedfrom LPC (Linear Predictive Coding) based on a com-pression coding tech G7231 -e LPC is extracted from the0th to 23rd bits from the bit coding in each frame formingthe 10-dimensional LPC Another 2-dimensional featurethe lag of pitch is extracted from the 24th to the 42nd bitstream in each frame -e extraction of CELP is shown inFigure 3

Endpoint detection is performed after the original audiofile is preprocessed -en each audio is divided into severalsound segments Each sound segment is considered as asample in the experiment For each frame features areextracted using MFCC (13 dimensions) and CELP (12 di-mensions) -e sampling rate is 16 kHz audio is a singlechannel Each sample contains several frames For each

Figure 1: Bird sound classification model based on MICV-ERMFT feature selection.

Figure 2: MFCC schematic.


For each detection segment (which includes many frames), the mean, median, and variance of each feature are calculated to obtain 75-dimensional data. The feature extraction process is shown in Figure 4.
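As a small sketch of this aggregation step, the function below turns the per-frame 25-dimensional MFCC+CELP vectors of one detection segment into the 75-dimensional sample vector (mean, median, and variance of each feature); the stacking order of the three statistics is an assumption.

import numpy as np

def segment_features(frame_feats):
    # frame_feats: (n_frames, 25) array of per-frame MFCC (13) + CELP (12) features.
    # Returns the 75-dimensional sample: mean, median, and variance of each feature.
    return np.concatenate([frame_feats.mean(axis=0),
                           np.median(frame_feats, axis=0),
                           frame_feats.var(axis=0)])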

2.2. Feature Scoring Method MICV. Based on the principle of small intraclass distance and large interclass distance, features that are easy to distinguish are selected. To calculate the degree of feature differentiation, mutual information MIEC (Mutual Information for Interclass) is used to measure the interclass distance, and the coefficient of variation CVAC (Coefficient of Variation for Intraclass) is used to measure the intraclass distance.

The MIEC and CVAC methods are combined to calculate the classification contribution degree of each feature. The calculation equation is

micv_f = \lambda \cdot miec_f + (1 - \lambda) \cdot cvac_f    (5)

Because the intraclass distance and the interclass distance have different weights, the coefficient λ (0 < λ < 1) is introduced to adjust the weights.

2.2.1. MIEC. Mutual information measures the correlation or the dependency between two variables. For two discrete random variables X and Y, the mutual information I(X, Y) is calculated as

I(X, Y) = \sum_{y \in Y} \sum_{x \in X} p(x, y) \log\left(\frac{p(x, y)}{p(x) p(y)}\right)    (6)

In equation (6), p(x, y) is the joint probability density function of x and y, and p(x) and p(y) are the marginal probability density functions of x and y.

Generally, when mutual information is used to select features, the variables X and Y represent the feature vector and the label vector. In this paper, X and Y represent two vectors of different classes under the same feature. Given feature space F and classification space C, the interclass mutual information of the f-th feature, miec_f, is calculated as

miec_f = \sum_{i \in C} \sum_{j \in C,\, j \ne i} I(i, j)    (7)

In equation (7), i and j (i ≠ j) index the samples of the f-th feature in the i-th class and the j-th class, and miec_f is the interclass mutual information of the f-th feature in F. The interclass difference of feature f is greater when miec_f is smaller, and vice versa.
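A minimal sketch of the MIEC score of equation (7) is given below. It assumes a histogram discretization of the continuous feature values and pairs up equal-length class vectors before calling a mutual-information estimator; the number of bins and the pairing strategy are illustrative assumptions, not details given in the paper.

import numpy as np
from sklearn.metrics import mutual_info_score

def miec(feature_values, labels, n_bins=20):
    # Interclass mutual information of one feature: sum of I(i, j) over all
    # ordered pairs of distinct classes (equation (7)), after discretization.
    classes = np.unique(labels)
    edges = np.histogram_bin_edges(feature_values, bins=n_bins)
    binned = {c: np.digitize(feature_values[labels == c], edges) for c in classes}
    score = 0.0
    for i in classes:
        for j in classes:
            if i == j:
                continue
            xi, xj = binned[i], binned[j]
            n = min(len(xi), len(xj))        # truncate to a common length before pairing
            score += mutual_info_score(xi[:n], xj[:n])
    return score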

2.2.2. CVAC. In statistics, the coefficient of variation (CV) measures the variation between two or more samples, or the dispersion among them. The expression is

C_v = \frac{\sigma}{\mu}    (8)

where μ and σ are the mean and standard deviation of the samples. Given feature space F and classification space C, the intraclass coefficient of variation of feature f, cvac_f, is calculated as

cvac_f = \sum_{i=1}^{C} C_{v_i}    (9)

In equation (9), C_{v_i} represents the CV of the samples in class i. Feature f has higher cohesion when cvac_f is smaller.
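The CVAC score of equations (8)-(9) and the combined MICV score of equation (5) can then be sketched as follows, building on the miec() sketch above. The small epsilon guarding a zero mean is an implementation assumption, and lam = 0.1 matches the setting chosen later in Section 3.2.2.

import numpy as np

def cvac(feature_values, labels):
    # Intraclass coefficient of variation: sum of sigma/mu over the classes.
    score = 0.0
    for c in np.unique(labels):
        v = feature_values[labels == c]
        score += v.std() / (abs(v.mean()) + 1e-12)
    return score

def micv(feature_values, labels, lam=0.1):
    # MICV score of equation (5); smaller is better, so features are later
    # sorted in ascending order of this value.
    return lam * miec(feature_values, labels) + (1 - lam) * cvac(feature_values, labels)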

2.3. Feature Selection Method MICV-ERMFT. After scoring the features with the MICV method, high-quality features are selected. MICV-ERMFT is then used to eliminate redundant features in the feature array sorted by score. The process is shown in Algorithm 1.

2.3.1. Build Maximum Feature Tree. The maximum feature tree is derived from the minimum spanning tree. For an undirected graph G(V, E) in which each edge has a weight w, a minimum spanning tree is a subset of edges E' that connects all the vertices V with no cycle such that the total weight of the edges in E' is minimum. In a maximum feature tree, features are represented as vertices, and the weights of the edges are determined by the Pearson correlation coefficient. P(F_r, F_c) represents the correlation coefficient between features F_r and F_c, which is calculated as

P(F_r, F_c) = \frac{\sum_i (F_{ri} - \bar{F}_r)(F_{ci} - \bar{F}_c)}{\sqrt{\sum_i (F_{ri} - \bar{F}_r)^2}\, \sqrt{\sum_i (F_{ci} - \bar{F}_c)^2}}    (10)

I(F_r, F_c) = -\frac{\log_2\left(1 - P^2(F_r, F_c)\right)}{2}    (11)

In equation (10), F_{ri} represents the i-th sample of feature r, and \bar{F}_r is feature r's mean value over all samples. In equation (11), I(F_r, F_c) is the correlation coefficient between features r and c. Algorithm BMFT (building the max feature tree) uses equations (10) and (11) to calculate the correlation coefficient matrix and construct the maximum feature tree. Details are described in Algorithm 2.
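A compact sketch of this construction is shown below: it computes the Pearson matrix of equation (10), converts it to edge weights with equation (11), and grows a Prim-style maximum spanning tree over the features. This is an illustrative reading of Algorithm 2, not the authors' exact indexing scheme.

import numpy as np

def build_max_feature_tree(X):
    # X: (n_samples, n_features). Returns a list of (u, v) feature-index edges.
    P = np.corrcoef(X, rowvar=False)                     # equation (10) for all pairs
    W = -np.log2(np.clip(1 - P ** 2, 1e-12, None)) / 2   # equation (11) edge weights
    n = W.shape[0]
    np.fill_diagonal(W, -np.inf)                         # ignore self-correlation
    in_tree, edges = {0}, []
    while len(in_tree) < n:
        best = (-np.inf, None, None)
        for u in in_tree:                                # heaviest edge leaving the tree
            for v in range(n):
                if v not in in_tree and W[u, v] > best[0]:
                    best = (W[u, v], u, v)
        _, u, v = best
        edges.append((u, v))
        in_tree.add(v)
    return edges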

Figure 3: CELP feature extraction process.


2.3.2. Remove Redundant Features Based on Two Neighborhoods. ERFTN (Eliminate Redundant Features based on Two Neighborhoods) eliminates redundancy using the concept of two neighborhoods. An example with a maximum feature tree T and a feature sequence F sorted using the MICV method is demonstrated in Figure 5.

As shown in Figure 5, given the maximum feature tree T and F = {f2, f1, f3, f4, f5, f7, f9, f8, f6, f10} sorted with the MICV method in ascending order, the steps of the ERFTN algorithm are listed as Algorithm 3. The final feature subset of F is {f2, f3, f7, f10}.
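The two-neighborhood elimination itself reduces to a short loop, sketched below under the assumption that the tree is given as the edge list produced by the build_max_feature_tree() sketch above: walk the MICV-ordered features, keep each unvisited feature, and drop its direct neighbors in the tree (its most correlated, hence redundant, companions).

def erftn(tree_edges, ordered_features):
    # tree_edges: (u, v) pairs from the maximum feature tree;
    # ordered_features: feature indices sorted by MICV score (best first).
    neighbours = {}
    for u, v in tree_edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)
    kept, removed = [], set()
    for f in ordered_features:
        if f in removed:
            continue
        kept.append(f)                          # keep this feature
        removed |= neighbours.get(f, set())     # delete its adjacent vertices
    return kept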

3. Experiments and Results Analysis

3.1. Experimental Dataset. Currently, there are many websites dedicated to sharing bird sounds from around the world, such as Avibase [29] and Xeno-Canto [30]. Recordings of bird sounds are collected and annotated on these websites. The recordings include various types of vocal expressions (multiple calls and songs) of various individuals recorded in their natural environment. The dataset used in this paper comes from Avibase and is a collection of MP3 and WAV audio files. These audio files are unified to a 16 kHz sampling rate and a single channel. Since the audio files are not all bird sounds, the bird sounds in the audio are separated through voice activity detection (VAD) [25, 31], and then the MFCC and CELP features are extracted according to the process shown in Figure 4.

The experiments use two datasets: bird sounds and crane sounds. We selected six bird species from different genera for the bird sound dataset, which contains 433 samples. The crane sound dataset includes 343 samples from seven species of Grus. The dataset information is shown in Tables 1 and 2.

3.2. The Experiment of the MICV Scoring Method. To verify the proposed method's effectiveness, two separate experiments are conducted to test the MICV scoring method and the MICV-ERMFT feature selection method. The classifiers used in the experiments include Decision Tree (J48), SVM, BayesNet (NB), and Random Forests (RFs). The feature scoring method is compared with ConstraintScore (CS) [11] and six other feature scoring methods provided by Weka [32], including Correlation (Cor), GainRatio (GR), InfoGain (IG), One-R (OR), ReliefF (RF), and SymmetricalUncert (SU).

3.2.1. Classifier Performance Evaluation. Kappa, F1 score, and accuracy rate were used as evaluation indicators.

(1) Kappa. Cohen's Kappa coefficient is a statistical measure that indicates the interrater reliability (and also intrarater reliability) for qualitative (categorical) items:

Kappa = \frac{p_o - p_e}{1 - p_e}    (12)

where p_o is the overall classification accuracy, calculated as the number of correctly classified samples divided by the total number of samples. Based on the confusion matrix, assume the numbers of true samples in each class are {a_1, a_2, ..., a_n} and the numbers of predicted samples are {b_1, b_2, ..., b_n}; then p_e is calculated as

p_e = \frac{a_1 \times b_1 + a_2 \times b_2 + \cdots + a_n \times b_n}{n \times n}    (13)

(2) F1 Score. The F1 score is an index used in statistics to measure the accuracy of classification models while taking into account both the precision and the recall of the model. As shown in equation (14), precision represents the precision rate and recall represents the recall rate.

Figure 4: Extraction process of bird sound features (each detection fragment contains n frames; the mean, variance, and median over the n frames are calculated for each feature to form the 75-dimensional dataset).

Figure 5: Schematic of the ERFTN.


F1 = \frac{2 \cdot precision \cdot recall}{precision + recall}    (14)

(3) Accuracy. The accuracy is calculated based on the equation

accuracy = \frac{n}{M}    (15)

In equation (15), n represents the number of correct classifications and M represents the total number of samples.

Each dataset is divided into a 70% training set and a 30% test set. Each experiment is repeated 10 times and the results are averaged to reduce bias.
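This evaluation protocol can be sketched as below, with scikit-learn estimators standing in for the Weka classifiers (DecisionTreeClassifier for J48 here); the macro-averaged F1 is an assumption, since the paper does not state the averaging mode.

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import cohen_kappa_score, f1_score, accuracy_score

def evaluate(X, y, n_repeats=10, test_size=0.3):
    # 70/30 split repeated 10 times; averages Kappa (eq. (12)),
    # F1 (eq. (14)), and accuracy (eq. (15)).
    kappas, f1s, accs = [], [], []
    for seed in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, stratify=y, random_state=seed)
        pred = DecisionTreeClassifier(random_state=seed).fit(X_tr, y_tr).predict(X_te)
        kappas.append(cohen_kappa_score(y_te, pred))
        f1s.append(f1_score(y_te, pred, average='macro'))
        accs.append(accuracy_score(y_te, pred))
    return np.mean(kappas), np.mean(f1s), np.mean(accs)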

3.2.2. MICV λ Parameter Setting. In equation (5), λ adjusts the weight coefficients of MIEC and CVAC. The experiments set λ ∈ {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9} and calculate the MICV with the J48 classifier. The ratio of the number of selected features to the total number of features when the highest Kappa is reached is listed in Table 3; a lower ratio indicates better performance. Table 3 shows that better results are obtained when λ is set to 0.1, 0.2, or 0.3. In the following experiments in this paper, λ is set to 0.1.

3.2.3. Compare MIEC, CVAC, and MICV. The selected feature set has a decisive effect on the classification model. Features with higher scores normally lead to better classification performance. The experiments sort the feature sequence in ascending order according to the feature scores obtained from MIEC, CVAC, and MICV, respectively. In Figure 6, in most cases the red (MICV) curves ascend more steadily, which shows that as features are added gradually, the classification model's performance improves, especially in Figure 6(a). The CVAC and MIEC methods show obvious fluctuations in Figures 6(a) and 6(b). To sum up, combining MIEC and CVAC works better than using either alone.

Name: MICV-ERMFT feature selection
Input: Dataset D (m: number of samples, n: number of features)
Steps:
(1) Calculate MICV using equation (5) for each feature in D.
(2) Sort the MICV feature sequence in ascending order to obtain F. Following F, select data in D, adding one feature at a time, and use the base classifier to score. Delete any feature that causes the index to decline, obtain the feature sequence F*, and map F* to D to get dataset D*.
(3) Calculate the Pearson correlation coefficient matrix P for the feature vectors of D*.
(4) Apply algorithm BMFT (Algorithm 2) to construct a maximum feature tree T from P.
(5) Apply the two-neighborhood redundancy elimination algorithm ERFTN (Algorithm 3) on F*; denote the result array as F**.
(6) Map F** to D* to get dataset D**.
Output: new dataset D**

ALGORITHM 1: MICV-ERMFT feature selection.
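Putting the pieces together, the whole of Algorithm 1 can be sketched with the micv(), build_max_feature_tree(), and erftn() sketches introduced earlier; the forward-screening criterion (training accuracy of the base classifier) is an assumption, since the paper only states that features causing the index to decline are deleted.

import numpy as np

def micv_ermft(X, y, base_classifier, lam=0.1):
    n_features = X.shape[1]
    # Steps (1)-(2): score every feature, sort ascending, forward-screen.
    order = np.argsort([micv(X[:, f], y, lam) for f in range(n_features)])
    selected, best = [], -np.inf
    for f in order:
        trial = selected + [int(f)]
        acc = base_classifier.fit(X[:, trial], y).score(X[:, trial], y)
        if acc >= best:                       # drop features that make the index decline
            best, selected = acc, trial
    X_star = X[:, selected]
    # Steps (3)-(4): correlation matrix and maximum feature tree on D*.
    edges = build_max_feature_tree(X_star)
    # Steps (5)-(6): two-neighborhood elimination, mapped back to original indices.
    kept_local = erftn(edges, list(range(len(selected))))
    return [selected[i] for i in kept_local]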

Name: BMFT (building max feature tree)
Input: Correlation coefficient matrix P (n x n, where n is the number of features)
Steps:
(1) Initialize root T = {1}.
(2) Set the elements on the main diagonal of P to -1, to eliminate the influence of each feature on itself.
(3) while |T| <= 2n + 1 do  /* Since P is a strongly connected graph and T records the adjacency relationships of the elements in P, there are 2n + 1 elements in T including the initial node. */
(4)   D = P(1:n, T)  /* D records the correlation coefficients of the neighbors of the nodes in T (D is a column vector mapped from T to P). */
(5)   D(T, 1:n) = -1  /* The nodes already visited are recorded in T; this operation is equivalent to deleting the accessed data in D. */
(6)   [row, id] = FindIndex[max(D)]  /* Find the maximum value among all nodes adjacent to the visited set and record its row index. */
(7)   T = T ∪ {T(end), row, id}  /* T(end), row, and id record the adjacency relationship; that is, T(end) and the node at (row, id) are adjacent. */
(8) end while
(9) return T
Output: Maximum feature tree T

ALGORITHM 2: BMFT.


3.2.4. Experiment of MICV: Results and Analysis. In this section, the proposed MICV is tested on the Birds dataset and the Crane dataset. The experimental results in Figures 7 and 8 show that, for the same number of selected features, the Kappa value of the MICV method is generally higher than that of the other methods. As the number of features increases, the Kappa value of the MICV method converges earlier and remains relatively stable compared with the other methods. MICV is thus more effective than the other feature evaluation methods.

Tables 4 and 5 record the best classification results (Kappa, accuracy, and F1 scores) for each feature scoring sequence, as well as the number of features used to obtain that value. In each row of the tables, a bold value on the left side of "|" indicates that the method used the fewest features among all methods, and a bold value on the right indicates that the method achieved the highest evaluation indicator score. Table 4 shows that, on the Birds dataset, the MICV method had the highest Kappa value under all four classifiers. With the J48, NB, and RFs classifiers, the MICV method had the lowest number of features and the highest evaluation indicator scores in most cases. As shown in Table 5, the performance of MICV with the J48, NB, and RFs classifiers is significant.

In summary, the MICV method is more effective in selecting optimal features than the other seven methods. The method can also achieve a good modeling effect using a lower dimension.

3.3. Experiment of MICV-ERMFT Feature Selection. In the second part of the experiment, features are evaluated using CS and six other Weka methods, including Cor, GR, IG, OR, RF, and SU.

Name: ERFTN (Eliminate Redundant Features based on Two Neighborhoods)
Input: T: max feature tree built by algorithm BMFT; F: features sorted with the MICV method
Steps:
(1) Get the first element x in F.
(2) V = {y | y ∈ T, y is an adjacent vertex of x}.
(3) Update F by deleting all vertices in V, that is, F = F \ V.
(4) Choose the next unvisited element as x.
(5) Repeat (2) to (4) until all the elements in F have been visited.
(6) Output F as the final feature subset.
Output: F

ALGORITHM 3: ERFTN.

Figure 6: Experimental results of the MIEC, CVAC, and MICV feature selection methods. (a) Birds dataset. (b) Crane dataset.


3.3.1. Procedure of the Experiment. The procedure is demonstrated in Figure 9. Eight different methods (MICV and the seven other methods mentioned above) are used to evaluate each feature's classification contribution and to score the features. After sorting the features in ascending order based on the scores, the ERMFT method is used to eliminate redundant features, resulting in a feature subset F'. F' is then mapped to the dataset, resulting in Dataset'. J48, SVM, BayesNet (NB), and Random Forests (RFs) are the experiment's classifiers. Each independent dataset is divided into a 70% training set and a 30% test set. Each experiment is repeated ten times and the average Kappa is calculated. The DRR (Dimensionality Reduction Rate) is also introduced as an evaluation indicator.

Figure 7: Experimental results of MICV and the other 7 feature evaluation methods with different classifiers (Birds dataset). (a) J48. (b) SVM. (c) RFs. (d) NB.


Figure 8: Experimental results of MICV and the other 7 feature evaluation methods with different classifiers (Crane dataset). (a) J48. (b) SVM. (c) RFs. (d) NB.

Table 1: Bird sound dataset information.

Latin name | Eng name | Genus | Number of samples | Rate (%)
Phalacrocorax carbo | Great cormorant | Phalacrocorax | 36 | 8.31
Numenius phaeopus | Whimbrel | Numenius | 90 | 20.79
Aegithina nigrolutea | White-tailed iora | Aegithina | 120 | 27.72
Chrysolophus amherstiae | Lady Amherst's pheasant | Chrysolophus | 68 | 15.70
Falco tinnunculus | Common kestrel | Falco | 61 | 14.09
Tadorna ferruginea | Ruddy shelduck | Tadorna | 58 | 13.39


Table 2: Crane sound dataset information.

Latin name | Eng name | Genus | Number of samples | Rate (%)
Grus vipio | White-naped crane | Grus | 24 | 7.00
Grus canadensis | Sandhill crane | Grus | 39 | 11.37
Grus virgo | Demoiselle crane | Grus | 60 | 17.49
Grus grus | Common crane | Grus | 62 | 18.08
Grus monacha | Hooded crane | Grus | 62 | 18.08
Grus japonensis | Red-crowned crane | Grus | 29 | 8.45
Grus nigricollis | Tibetan crane | Grus | 67 | 19.53

Figure 9: Flowchart of the experiment on MICV-ERMFT feature selection (feature scores are sorted in ascending order for MICV and in descending order for the other methods).



DRR = \left(1 - \frac{F_n'}{F_n}\right) \times 100\%    (16)

In equation (16), F_n' is the number of selected features and F_n is the total number of features of each dataset. The larger the DRR value, the stronger the dimensionality reduction.
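A one-line helper for equation (16) makes the scale of the reported values concrete:

def drr(n_selected, n_total):
    # Dimensionality Reduction Rate of equation (16), as a percentage.
    return (1 - n_selected / n_total) * 100

For example, keeping 35 of the 75 features gives drr(35, 75) of roughly 53, on the order of the DRR values reported in Table 6.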

3.3.2. Experiment of MICV-ERMFT: Results and Analysis. Figure 10 shows the experimental results obtained from four different classifiers using eight different feature evaluation methods combined with ERMFT. Figures 10(a) and 10(b) are the results on the Birds dataset, and Figures 10(c) and 10(d) are the results on the Crane dataset. In Figure 10(a), four groups of histograms represent the results under the four classifiers, and the 9 elements in each group are the Kappa values calculated from the eight methods with ERMFT and from the original data (ORI). The heat map of Figure 10(b) shows the number of selected features at which each method's Kappa is obtained, and similarly for Figures 10(c) and 10(d). In Figure 10(a), it can be clearly observed that the MICV-ERMFT method has a slightly higher Kappa than the other methods, and the effect is more pronounced for the J48 classifier in Figure 10(c). Moreover, the Kappa of the MICV-ERMFT method is higher than that of the original data. Looking at Figures 10(a) and 10(b) together, it is evident that the MICV-ERMFT method achieves a good modeling effect using a small number of features compared with the other methods. Figures 10(c) and 10(d) show a similar result.

In conclusion, compared with the other seven methods, the MICV-ERMFT method demonstrates good abilities in dimensionality reduction and feature interpretation.

Combining Figures 8(b) and 8(d) with Table 6, it is clear that the MICV-ERMFT method brings a significant dimensionality reduction effect and model performance improvement for both the Birds dataset and the Crane dataset. In Table 6, the Kappa value and DRR performance are very good for the J48, NB, and SVM classifiers on the Birds dataset. In particular, for the NB classifier, none of the other seven comparison methods' Kappa values exceed ORI, while the MICV-ERMFT method exceeds it by 0.04. On the Crane dataset, MICV-ERMFT outperforms the other methods.

Figure 10: Experimental results of the MICV-ERMFT method. (a) Kappa obtained with different classifiers using different feature evaluation methods on the Birds dataset. (b) Heat map of the selected features in (a). (c) Kappa obtained with different classifiers using different feature evaluation methods on the Crane dataset. (d) Heat map of the selected features in (c).

Table 3: The ratio of the number of selected features for different values of λ.

Dataset | λ = 0.1 | 0.2 | 0.3 | 0.4 | 0.5 | 0.6 | 0.7 | 0.8 | 0.9
Birds dataset | 0.293 | 0.293 | 0.466 | 0.453 | 0.493 | 0.440 | 0.480 | 0.666 | 0.560
Crane dataset | 0.240 | 0.360 | 0.160 | 0.333 | 0.706 | 0.506 | 0.826 | 0.933 | 0.866


Table 4: Comparison of Kappa, accuracy, and F1 scores of the feature selection methods on the Birds dataset. Each cell gives the number of features | the highest value.

Kappa
J48: Cor 72 | 0.85; GR 60 | 0.85; IG 59 | 0.86; OR 57 | 0.86; RF 73 | 0.83; SU 68 | 0.84; CS 51 | 0.84; MICV 38 | 0.88
NB: Cor 68 | 0.79; GR 37 | 0.86; IG 36 | 0.87; OR 39 | 0.86; RF 70 | 0.77; SU 36 | 0.87; CS 58 | 0.79; MICV 46 | 0.88
SVM: Cor 63 | 0.93; GR 44 | 0.93; IG 52 | 0.93; OR 61 | 0.93; RF 64 | 0.95; SU 46 | 0.93; CS 65 | 0.95; MICV 51 | 0.95
RFs: Cor 71 | 0.97; GR 50 | 0.97; IG 36 | 0.97; OR 73 | 0.97; RF 72 | 0.95; SU 70 | 0.97; CS 53 | 0.96; MICV 30 | 0.97

Accuracy (%)
J48: Cor 72 | 88.18; GR 60 | 88.18; IG 59 | 89.09; OR 57 | 89.09; RF 73 | 86.36; SU 65 | 87.27; CS 52 | 87.27; MICV 36 | 90.96
NB: Cor 64 | 83.63; GR 37 | 89.09; IG 36 | 88.18; OR 39 | 89.09; RF 70 | 81.81; SU 36 | 90.00; CS 58 | 83.63; MICV 34 | 90.90
SVM: Cor 63 | 94.54; GR 44 | 94.45; IG 41 | 94.00; OR 61 | 94.45; RF 64 | 96.36; SU 46 | 94.45; CS 65 | 96.36; MICV 51 | 93.63
RFs: Cor 71 | 98.12; GR 50 | 98.12; IG 36 | 98.12; OR 73 | 98.18; RF 72 | 96.36; SU 70 | 98.12; CS 53 | 97.27; MICV 30 | 98.12

F1 score
J48: Cor 72 | 0.88; GR 60 | 0.88; IG 59 | 0.89; OR 57 | 0.89; RF 73 | 0.86; SU 65 | 0.87; CS 51 | 0.87; MICV 38 | 0.87
NB: Cor 64 | 0.83; GR 37 | 0.89; IG 36 | 0.89; OR 39 | 0.89; RF 70 | 0.81; SU 37 | 0.90; CS 58 | 0.83; MICV 34 | 0.90
SVM: Cor 63 | 0.94; GR 49 | 0.94; IG 41 | 0.94; OR 61 | 0.94; RF 64 | 0.96; SU 46 | 0.94; CS 65 | 0.96; MICV 51 | 0.93
RFs: Cor 71 | 0.96; GR 50 | 0.98; IG 55 | 0.98; OR 73 | 0.98; RF 73 | 0.96; SU 70 | 0.98; CS 54 | 0.97; MICV 30 | 0.98

Table 6: MICV-ERMFT compared to the other methods in terms of Kappa and DRR. Each cell gives Kappa | DRR (%).

Birds dataset
J48: Cor 0.83 | 52; GR 0.88 | 44; IG 0.83 | 53; OR 0.85 | 53; RF 0.81 | 49; SU 0.86 | 53; CS 0.79 | 50; MICV 0.87 | 58; ORI 0.83 | 0
NB: Cor 0.76 | 40; GR 0.74 | 49; IG 0.77 | 53; OR 0.75 | 53; RF 0.74 | 44; SU 0.76 | 53; CS 0.76 | 53; MICV 0.81 | 53; ORI 0.77 | 0
SVM: Cor 0.92 | 52; GR 0.92 | 44; IG 0.89 | 53; OR 0.87 | 44; RF 0.92 | 53; SU 0.88 | 53; CS 0.92 | 53; MICV 0.93 | 57; ORI 0.92 | 0
RFs: Cor 0.95 | 44; GR 0.95 | 52; IG 0.95 | 40; OR 0.93 | 49; RF 0.96 | 53; SU 0.94 | 49; CS 0.94 | 53; MICV 0.93 | 50; ORI 0.93 | 0

Crane dataset
J48: Cor 0.64 | 44; GR 0.74 | 50; IG 0.63 | 50; OR 0.70 | 53; RF 0.63 | 40; SU 0.62 | 50; CS 0.63 | 53; MICV 0.76 | 53; ORI 0.70 | 0
NB: Cor 0.73 | 50; GR 0.75 | 52; IG 0.71 | 44; OR 0.73 | 53; RF 0.72 | 50; SU 0.71 | 50; CS 0.69 | 46; MICV 0.76 | 53; ORI 0.75 | 0
SVM: Cor 0.83 | 42; GR 0.78 | 52; IG 0.73 | 50; OR 0.82 | 53; RF 0.77 | 50; SU 0.73 | 50; CS 0.84 | 53; MICV 0.84 | 52; ORI 0.84 | 0
RFs: Cor 0.85 | 50; GR 0.83 | 40; IG 0.83 | 50; OR 0.84 | 53; RF 0.84 | 50; SU 0.83 | 42; CS 0.84 | 50; MICV 0.88 | 53; ORI 0.88 | 0

Table 5: Comparison of Kappa, accuracy, and F1 scores of the feature selection methods on the Crane dataset. Each cell gives the number of features | the highest value.

Kappa
J48: Cor 73 | 0.72; GR 69 | 0.74; IG 71 | 0.68; OR 70 | 0.72; RF 68 | 0.71; SU 71 | 0.69; CS 22 | 0.68; MICV 25 | 0.75
NB: Cor 73 | 0.75; GR 69 | 0.75; IG 73 | 0.75; OR 73 | 0.75; RF 71 | 0.75; SU 73 | 0.75; CS 53 | 0.79; MICV 43 | 0.79
SVM: Cor 73 | 0.84; GR 69 | 0.84; IG 73 | 0.84; OR 73 | 0.84; RF 72 | 0.86; SU 73 | 0.84; CS 73 | 0.84; MICV 69 | 0.84
RFs: Cor 66 | 0.89; GR 51 | 0.88; IG 69 | 0.89; OR 73 | 0.89; RF 68 | 0.89; SU 63 | 0.88; CS 36 | 0.90; MICV 41 | 0.90

Accuracy (%)
J48: Cor 73 | 77.00; GR 69 | 78.00; IG 71 | 73.00; OR 70 | 77.00; RF 68 | 76.00; SU 71 | 73.00; CS 22 | 73.00; MICV 18 | 79.00
NB: Cor 73 | 79.00; GR 69 | 79.00; IG 73 | 79.00; OR 73 | 79.00; RF 71 | 79.00; SU 73 | 79.00; CS 53 | 81.00; MICV 43 | 83.00
SVM: Cor 72 | 87.00; GR 68 | 87.00; IG 73 | 87.00; OR 73 | 87.00; RF 72 | 89.00; SU 73 | 87.00; CS 73 | 87.00; MICV 69 | 87.00
RFs: Cor 66 | 91.00; GR 51 | 90.00; IG 69 | 91.00; OR 73 | 91.00; RF 58 | 91.00; SU 63 | 90.00; CS 36 | 90.00; MICV 41 | 91.00

F1 score
J48: Cor 73 | 0.77; GR 69 | 0.78; IG 71 | 0.73; OR 70 | 0.77; RF 67 | 0.77; SU 71 | 0.73; CS 25 | 0.73; MICV 25 | 0.79
NB: Cor 73 | 0.79; GR 69 | 0.79; IG 73 | 0.79; OR 73 | 0.79; RF 73 | 0.79; SU 73 | 0.79; CS 53 | 0.81; MICV 43 | 0.82
SVM: Cor 72 | 0.86; GR 68 | 0.87; IG 73 | 0.86; OR 73 | 0.86; RF 72 | 0.88; SU 73 | 0.86; CS 73 | 0.87; MICV 69 | 0.86
RFs: Cor 69 | 0.91; GR 51 | 0.89; IG 69 | 0.91; OR 73 | 0.90; RF 68 | 0.90; SU 66 | 0.90; CS 72 | 0.91; MICV 41 | 0.91

Table 7: The time used by the different feature evaluation methods.

Birds dataset
J48: Cor 21035; GR 18026; IG 27100; OR 45227; RF 23204; SU 27069; CS 23268; MICV 21074; ORI 31728
NB: Cor 13208; GR 22001; IG 28666; OR 62036; RF 12039; SU 16001; CS 18569; MICV 18257; ORI 35810
SVM: Cor 21580; GR 31500; IG 42689; OR 36028; RF 56244; SU 36028; CS 20789; MICV 25104; ORI 51568
RFs: Cor 31829; GR 42626; IG 51698; OR 27853; RF 36952; SU 41524; CS 61236; MICV 21568; ORI 81732

Crane dataset
J48: Cor 19326; GR 23825; IG 28624; OR 22596; RF 32632; SU 41069; CS 26547; MICV 11638; ORI 51629
NB: Cor 18624; GR 16527; IG 16549; OR 39326; RF 43829; SU 52806; CS 41026; MICV 19628; ORI 46258
SVM: Cor 63426; GR 71869; IG 65826; OR 63429; RF 53440; SU 33651; CS 46458; MICV 30824; ORI 73496
RFs: Cor 49637; GR 53746; IG 60689; OR 40547; RF 41968; SU 31906; CS 66504; MICV 31869; ORI 83048


Table 7 shows the running time cost of the MICV-ERMFT method and of the other seven feature selection methods. MICV-ERMFT is not more time-consuming than the other methods.

In the experiments on the Birds dataset and the Crane dataset, the Kappa metrics obtained with different classifiers using the MICV-ERMFT method are generally superior to those of the other methods. The MICV-ERMFT method remains excellent for the most part and is more stable than the other methods, although other methods surpass it with some classifiers. Moreover, the MICV-ERMFT method improves the Kappa value compared to the original data. Although the improvement is minimal in some cases, the MICV-ERMFT method uses only about half of the features compared to the original data.

In conclusion, MICV-ERMFT performs better in both dimensionality reduction and model performance improvement.

4. Conclusion

Feature selection is an important preprocessing step in data mining and classification. In recent years, researchers have focused on feature contribution evaluation and redundancy reduction, and different optimization algorithms have been proposed to address this problem. In this paper, we measure the contribution of features to the classification from the perspective of probability. Combined with the maximum feature tree to remove redundancy, the MICV-ERMFT method is proposed to select the optimal features and is applied to the automatic recognition of bird sounds.

To verify the MICV-ERMFT method's effectiveness in automatic bird sound recognition, two datasets are used in the experiments: data from different genera (Birds dataset) and data from the same genus (Crane dataset). The experimental results show that the Kappa indicator on the Birds dataset reaches 0.93 and the dimensionality reduction rate reaches 57%; the Kappa value on the Crane dataset is 0.88 and the dimensionality reduction rate reaches 53%, which are good results.

This study shows that the proposed MICV-ERMFT feature selection method is effective. The bird audio selected in this paper is noise-filtered, and further research should test the method's performance together with a denoising method. We will continue to explore the performance of MICV-ERMFT on datasets with larger numbers of features and instances.

Data Availability

All the data included in this study are available upon request from the corresponding author.

Disclosure

The funders had no role in the design of the study; in the collection, analysis, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This research was funded by the National Natural Science Foundation of China under Grants nos. 61462078, 31960142, and 31860332.

References

[1] C. A. Ruiz-Martinez, M. T. Akhtar, Y. Washizawa, and E. Escamilla-Hernandez, "On investigating efficient methodology for environmental sound recognition," in Proceedings of the ISPACS 2013, 2013 International Symposium on Intelligent Signal Processing and Communication Systems, pp. 210-214, Naha, Japan, November 2013.

[2] P. Jancovic and M. Kokuer, "Bird species recognition using unsupervised modeling of individual vocalization elements," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 5, pp. 932-947, 2019.

[3] A. D. P. Ramirez, J. I. De La Rosa Vargas, R. R. Valdez, and A. Becerra, "A comparative between mel frequency cepstral coefficients (MFCC) and inverse mel frequency cepstral coefficients (IMFCC) features for an automatic bird species recognition system," in Proceedings of the 2018 IEEE Latin American Conference on Computational Intelligence (LA-CCI), pp. 1-4, Guadalajara, Mexico, November 2018.

[4] D. Griffin and J. Jae Lim, "Signal estimation from modified short-time Fourier transform," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 32, no. 2, pp. 236-243, 1984.

[5] S. Kadambe and G. F. Boudreaux-Bartels, "Application of the wavelet transform for pitch detection of speech signals," IEEE Transactions on Information Theory, vol. 38, no. 2, pp. 917-924, 1992.

[6] E. Tsau, S.-H. Kim, and C.-C. J. Kuo, "Environmental sound recognition with CELP-based features," in Proceedings of the ISSCS 2011, International Symposium on Signals, Circuits and Systems, pp. 1-4, Iasi, Romania, July 2011.

[7] C. Collberg, C. Thomborson, and D. Low, "A taxonomy of obfuscating transformations," Technical Report 148, The University of Auckland, Auckland, New Zealand, 1997.

[8] S. Garcia, J. Luengo, and F. Herrera, "Feature selection," Intelligent Systems Reference Library, vol. 72, pp. 163-193, 2015.

[9] V. Kumar and S. Minz, "Feature selection: a literature review," Smart Computing Review, vol. 4, 2014.

[10] Y. Zhang, Q. Wang, D.-W. Gong, and X.-F. Song, "Nonnegative Laplacian embedding guided subspace learning for unsupervised feature selection," Pattern Recognition, vol. 93, pp. 337-352, 2019.

[11] S. Zhao, Y. Zhang, H. Xu, and T. Han, "Ensemble classification based on feature selection for environmental sound recognition," Mathematical Problems in Engineering, vol. 2019, Article ID 4318463, 7 pages, 2019.

[12] S. H. Zhang, Z. Zhao, Z. Y. Xu, K. Bellisario, and B. C. Pijanowski, "Automatic bird vocalization identification based on fusion of spectral pattern and texture features," in Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271-275, Calgary, Canada, April 2018.


[13] A. V. Bang and P. P. Rege, "Recognition of bird species from their sounds using data reduction techniques," in Proceedings of the 7th International Conference on Computer and Communication Technology, pp. 111-116, Allahabad, India, November 2017.

[14] M. Mafarja, I. Aljarah, A. A. Heidari et al., "Binary dragonfly optimization for feature selection using time-varying transfer functions," Knowledge-Based Systems, vol. 161, pp. 185-204, 2018.

[15] Q. Wu, Z. Ma, J. Fan, G. Xu, and Y. Shen, "A feature selection method based on hybrid improved binary quantum particle swarm optimization," IEEE Access, vol. 7, pp. 80588-80601, 2019.

[16] H. W. Wang, Y. Meng, P. Yin, and J. Hua, "A model-driven method for quality reviews detection: an ensemble model of feature selection," in Proceedings of the Fifteenth Wuhan International Conference on E-Business, pp. 573-581, Wuhan, China, 2016.

[17] H. Rao, X. Shi, A. K. Rodrigue et al., "Feature selection based on artificial bee colony and gradient boosting decision tree," Applied Soft Computing, vol. 74, pp. 634-642, 2019.

[18] D. A. A. Gnana, "Literature review on feature selection methods for high-dimensional data," International Journal of Computer Applications, vol. 136, no. 1, pp. 9-17, 2016.

[19] G. I. Sayed, A. Darwish, and A. E. Hassanien, "A new chaotic whale optimization algorithm for features selection," Journal of Classification, vol. 35, no. 2, pp. 300-344, 2018.

[20] A. E. Hegazy, M. A. Makhlouf, and G. S. El-Tawel, "Improved salp swarm algorithm for feature selection," Journal of King Saud University - Computer and Information Sciences, vol. 32, no. 3, pp. 335-344, 2020.

[21] M. Khamees, A. Albakry, and K. Shaker, "Multi-objective feature selection: hybrid of salp swarm and simulated annealing approach," in Proceedings of the International Conference on New Trends in Information and Communications Technology Applications, pp. 129-142, Baghdad, Iraq, January 2018.

[22] M. Sadeghi and H. Marvi, "Optimal MFCC features extraction by differential evolution algorithm for speaker recognition," in Proceedings of the 2017 3rd Iranian Conference on Intelligent Systems and Signal Processing (ICSPIS), pp. 169-173, Shahrood, Iran, December 2017.

[23] A. V. Bang and P. P. Rege, "Automatic recognition of bird species using human factor cepstral coefficients," Smart Computing and Informatics, vol. 77, pp. 363-373, 2018.

[24] R. H. D. Zottesso, Y. M. G. Costa, D. Bertolini, and L. E. S. Oliveira, "Bird species identification using spectrogram and dissimilarity approach," Ecological Informatics, vol. 48, pp. 187-197, 2018.

[25] J. Stastny, M. Munk, and L. Juranek, "Automatic bird species recognition based on birds vocalization," EURASIP Journal on Audio, Speech, and Music Processing, vol. 2018, no. 1, pp. 1-7, 2018.

[26] S. Fagerlund, Automatic Recognition of Bird Species by Their Sounds, Helsinki University of Technology, Espoo, Finland, 2004.

[27] L. Ptacek, Birds Individual Automatic Recognition, Ph.D. thesis, University of West Bohemia, Pilsen, Czechia, 2012.

[28] A. B. Labao, M. A. Clutario, and P. C. Naval, "Classification of bird sounds using codebook features," in Proceedings of the Asian Conference on Intelligent Information and Database Systems, pp. 223-233, Dong Hoi City, Vietnam, March 2018.

[29] D. Lepage, "Avibase: the world bird database," 2020, https://avibase.bsc-eoc.org/avibase.jsp.

[30] G. A. Pereira, "Xeno-canto: sharing bird sounds from around the world," 2003, https://www.xeno-canto.org.

[31] J. Stastny, V. Skorpil, and J. Fejfar, "Audio data classification by means of new algorithms," in Proceedings of the 2013 36th International Conference on Telecommunications and Signal Processing (TSP), pp. 507-511, Rome, Italy, July 2013.

[32] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, "The WEKA data mining software," ACM SIGKDD Explorations Newsletter, vol. 11, no. 1, pp. 10-18, 2009.


Page 3: Feature Selection Using Maximum Feature Tree Embedded with

X(i k) FFT xi(m)1113858 1113859 (1)

Step 3 Calculate the energy of the spectral line per frame

E(i k) [X(i k)]2 (2)

Step 4 Calculate the energy of E(i k) through the Mel filter

S(i m) 1113944nminus1

k0E(i k)Hm(k) 0lemltM (3)

where i is the i-th frame k is the k-th spectral line in thespectrum and Hm(k) is the analysis window with a samplelength of k

Step 5 Take the logarithm of the energy of the Mel filter andcalculate the DCT (Discrete Cosine Transform)

mfcc(i m) 1113944Mminus1

m0log[S(i m)]cos

πn(2m minus 1)

2M1113888 1113889 (4)

where m is them-th Mel filter i is the i-th frame and n is thespectral line after the DCT

In this paperMFCC uses 13-dimensional static coefficients(1-dimensional log energy coefficient and 12-dimensional DCTcoefficients) as extraction parameters [3 28] -e resultingsample has 13 features

212 CELP -e CELP feature extraction method is derivedfrom LPC (Linear Predictive Coding) based on a com-pression coding tech G7231 -e LPC is extracted from the0th to 23rd bits from the bit coding in each frame formingthe 10-dimensional LPC Another 2-dimensional featurethe lag of pitch is extracted from the 24th to the 42nd bitstream in each frame -e extraction of CELP is shown inFigure 3

Endpoint detection is performed after the original audiofile is preprocessed -en each audio is divided into severalsound segments Each sound segment is considered as asample in the experiment For each frame features areextracted using MFCC (13 dimensions) and CELP (12 di-mensions) -e sampling rate is 16 kHz audio is a singlechannel Each sample contains several frames For each

Bird sounds audio Preprocessing

MFCC CELP

DM amp C

DM amp Cacute

According to MICV method

scoring features

Feature extraction

Calculate Pearson correlation coefficient

matrix

Build the maximum feature tree T

Ascending order of feature score sequence F

ERMFT(remove redundancy)

Building classification

model

F T

Classified evaluation

MICV-ERMFT

All audio

Stage 1

Stage 3 Stage 2

Figure 1 Bird sound classification model based on MICV-ERMFT feature selection

Framing and windowing FFT

Energy of spectral

line

Mel filterenergy

CalculateDCT

x(n) MFCC(in)S(im)E(ik)X(ik)xw(in)

Figure 2 MFCC schematic

Mathematical Problems in Engineering 3

detection segment (including many frames) the meanmedian and variance of each feature are calculated to obtain75-dimensional data -e feature extraction process isshown in Figure 4

22 Feature ScoringMethodMICV Based on the principle ofsmall distance within classes and large distance betweenclasses features that are easy to distinguish are selected Tocalculate the degree of feature differentiation mutual in-formation MIEC (Mutual Information for Interclass) is usedto measure the interclass distance and the coefficient ofvariation CVAC (Coefficient of Variation for Intraclass) isused to measure the intraclass distance

-e MIEC and CVAC methods are combined to cal-culate the classification contribution degree of features -ecalculation equation is

micvf λmiecf +(1 minus λ)cvacf (5)

Because intraclass distance and interclass distance havedifferent weights the coefficient λ(0lt λlt 1) is introduced toadjust the weights

221 MIEC Mutual information measures the correlationor the dependency between two variables For two discreterandom variables X and Y mutual information I(X Y) iscalculated as

I(X Y) 1113944y∊Y

1113944x∊X

p(x y)logp(x y)

p(x)p(y)1113888 1113889 (6)

In equation (6) p(x y) is the joint probability densityfunction of x and y p(x) and p(y) are the marginalprobability density functions of x and y

Generally when mutual information is used to selectfeatures variables X and Y represent the feature vector andlabel vector In this paper X and Y represent two vectors ofdifferent classes under the same feature Given feature spaceF and classification space C the interclass mutual infor-mation of f-th feature miecf is calculated as

miecf 1113944i∊C

1113944j∊ Ci

I(i j)(7)

In equation (7) i and j(ine j) are the samples of f-thfeature in i-th class and j-th class miecf is the interclassmutual information of f-th feature in F -e interclassdifference feature f is greater when the miecf is smaller andvice versa

222 CVAC In statistics the variation (CV) coefficientmeasures the variation between two or more samples or thedispersion between them -e expression is

Cv σμ

(8)

where μ and σ are the mean and standard deviation of thesamples Given feature space F and classification space C theintraclass coefficient of variation of feature f cvacf is cal-culated as

cvacf 1113944C

i1Cvi (9)

In equation (9) Cvi represents the CV of samples in classi -e feature f has a higher cohesion when cvacf is smaller

23 Feature Selection Method MICV-ERMFT After scoringthe features using the MICV method high-quality featuresare selected MICV-ERMFT is used to eliminate redundantfeatures in the feature array sorted by scores -e process isshown in Algorithm 1

231 Build Maximum Feature Tree -e maximum featuretree is derived from the minimum spanning tree For anundirected graph G(V E) each edge has a weight w aminimum spanning tree is a subset of edges Ersquo that connectall the vertices V with no cycle and the total weight of edgesin Ersquo is minimum In a maximum feature tree features arerepresented as vertices and weights of the edges are decidedby Pearson correlation coefficient P(FrFc) represents thecorrelation coefficient between features Fr and Fc which iscalculated as

P FrFc( ) 1113936i Fri minus Fr( 1113857 Fci minus Fc( 1113857

1113936i Fri minus Fc( 11138572

1113969

1113936i Fci minus Fc( 11138572

1113969 (10)

I FrFc( ) minus log2 1 minus P

2FrFc( )1113874 1113875

2

(11)

In equation (10) Fri represents the i-th sample of featurer Fr is the feature rrsquos mean value of all samples In equation(11) I(FrFc) is the correlation coefficient between features rand c Algorithm BMFT (building the max feature tree) usesequations (10) and (11) to calculate the correlation coeffi-cient matrix and construct the maximum feature tree De-tails are described in Algorithm 2

BitTorrent CodingFeature extraction from each frame

bit stream

LPC featuresAudio signal

LPC0~

LPC2

12-dimensional CELPACL0

ACL2

Lag of pitch

Figure 3 CELP feature extraction process

4 Mathematical Problems in Engineering

232 Remove Redundant Features Based on TwoNeighborhoods ERFTN (Eliminate Redundant Featuresbased on Two Neighborhoods) is based on eliminating re-dundancy using the concept of two neighborhoods Oneexample with a maximum feature tree T and feature se-quence F sorted using the MICV method is demonstrated inFigure 5

As shown in Figure 5 given max feature TF f2 f1 f3 f4 f5 f7 f9 f8 f6 f101113864 1113865 sorted withMICV method in ascending order the steps of the ERFTNalgorithm are listed as Algorithm 3 -e final feature subsetof F is f2 f3 f7 f101113864 1113865

3 Experiments and Results Analysis

31 Experimental Dataset Currently there are many web-sites dedicated to sharing bird sounds from around theworld such as Avibase [29] and Xeno-Canto [30] Re-cordings of bird sounds are collected and annotated on thesewebsites -e tapes include various types of voice expres-sions (multiple calls and songs) of various individualsrecorded in their natural environment -e dataset used forthis paper comes from the Avibase which is a collection ofMP3 or WAV audio files -ese audio files are unified intothe 16 kHz sampling rate and monochannel Since the audiofiles are not all bird sounds the bird sounds in the audio areseparated through the voice activity detection (VAD)[25 31] and then the MFCC and CELP features areextracted according to the process shown in Figure 4

-e experiments used two datasets including birdsounds and crane sounds We have selected six different birdspecies from different genera in bird sounds which contains433 samples -e crane sound dataset includes 343 samplesfrom seven species ofGrus-e dataset information is shownin Tables 1 and 2

32 9e Experiment of MICV Scoring Method To verify theproposed methodrsquos effectiveness two separate experimentsare conducted to test the MICV scoring method and MICV-ERMFT feature selection method -e classifiers used in theexperiments include Decision Tree (J48) SVM BayesNet(NB) and Random Forests (RFs) -e feature scoringmethod is compared with ConstraintScore (CS) [11] and sixother feature scoring methods provided by Weka [32] in-cluding Correlation (Cor) GainRatio (GR) InfoGain (IG)

One-R (OR) ReliefF (RF) and SymmetricalUncert (SU) inexperiments

321 Classifier Performance Evaluation Kappa F1 scoreand accuracy rate were used as evaluation indicators

(1) Kappa Cohenrsquos Kappa coefficient is a statistical measurethat indicates the interrater reliability (and also intraraterreliability) for qualitative (categorical) items

Kappa po minus pe

1 minus pe

(12)

where po is the overall classification accuracy which iscalculated by the number of correctly classified samplesdivided by the total number of samples Based on theconfusion matrix assume the numbers of real samples ineach class are a1 a2 an1113864 1113865 the numbers of predictedsamples are b1 b2 bn1113864 1113865 and pe is calculated as

pe a1 times b1 + a2 times b2 + middot middot middot + an times bn

nlowast n (13)

(2) F1 Score It is an index used to measure the accuracy ofclassification models in statistics while taking into accountthe accuracy and recall of classification models As shown inequation (14) precision represents the precision rate andrecall represents the recall rate

Birdsounds Preprocessing Dataset

(75dim)

MFCC1 MFCC13MFCC2n

n CELP1 CELP12CELP2

Each detection fragment contains the number of n frames and the mean variance and

median of n samples are calculated for each feature

Figure 4 Extraction process of bird sounds feature

1

2

9

3

4

5 6

7

8

10

2

1

3

4

5

7

9

8

6

10

Score based on MICV FMaximum feature tree T

Min

Max

Figure 5 Schematic of the ERFTN

Mathematical Problems in Engineering 5

F1 2 middotprecision middot recallprecision + recall

(14)

(3) Accuracy -e accuracy is calculated based on theequation

accurary n

M (15)

In equation (15) n represents the correct number ofclassifications and M represents the number of all samples

Each dataset is divided into 70 training set and 30 testset Each experiment is repeated 10 times to average somebiased results

322 MICV λ Parameter Setting In equation (5) use λ toadjust the weight coefficients of MIEC and CVAC -eexperiments set λ isin 01 02 03 04 5 06 07 08 09 andcalculate the MICV with J48 classifier When the highest

Kappa is reached the ratio of the number of selected featuresto the total features is listed in Table 3 A lower ratio in-dicates a better performance Table 3 shows that better re-sults can be obtained when λ is set at 01 or 03 or 02 In thefollowing experiments in this paper λ is set to 01

323 Compare MIEC CVAC and MICV -e selectedfeature set has a decisive effect on the classification modelFeatures with higher scores normally lead to more positiveclassification performance -e experiments sort the featuresequence in ascending order according to feature scoresobtained from MIEC CVAV and MICV respectively InFigure 6 in most cases the red curves are more stable toascend which shows that with the increase of featuresgradually the classification modelrsquos performance will beimproved especially in Figure 6(a) CVAC and MIECmethods have obvious fluctuations in Figures 6(a) and 6(b)To sum up combining MIEC and CVAC works better thanusing them alone

Name MICV-ERMFT feature selectionInput Dataset D (m number of samples n number of features)Step

(1) Calculate MICV using equation (5) for each feature in D

(2) Sort the MICV feature sequence in ascending order to obtain F According to F select data in D and gradually add one and usethe base classifier to score Delete the feature that led to the decline of the index and obtain the feature sequence Flowast and map Flowast

to D to get dataset Dlowast(3) Calculate Pearson correlation coefficient matrix P for the feature vector by Dlowast(4) Apply algorithm BMFT (Algorithm 2) to construct a maximum feature tree T for P(5) Apply two-neighborhood based redundancy eliminating algorithm ERFTN (Algorithm 3) on Flowast denote the result array as Flowastlowast(6) Map Flowastlowast to Dlowast to get dataset Dlowastlowast

Output new dataset Dlowastlowast

ALGORITHM 1 MICV-ERMFT feature selection

Name BMFT (building max feature tree)Input Correlation coefficient matrix Pntimesn n is the number of featuresStep

(1) Initialize root T 1 (2) Set the elements on the main diagonal as minus1 set the value as minus1 on the main diagonal to eliminate the influence from the feature

itself(3) while |T|le 2n + 1 do lowastSince P is a Strongly Connected Graph and T records the adjacency relationship of the elements in P

there are 2n+1 elements in T including the initial node lowast(4) D P(1 nT) lowastD records the correlation coefficients of the neighboring nodes of the nodes in T (D is a column vector mapped

from T to P) lowast(5) D(T1 n) minus1 lowast-enodes that have been visited are recorded in T-is operation is equivalent to deleting the accessed data inDlowast

(6) rowid FindIndex[max(D)]row Find the maximum value of all nodes adjacent to visit id and record the row index(7) T Tcup Tend rowid1113864 1113865lowast T end row id records the adjacency relationship for example T end and row id are adjacent nodes lowast(8) end while(9) return T

Output Maximum feature tree T

ALGORITHM 2 BMFT

6 Mathematical Problems in Engineering

324 Experiment of MICV Results and Analysis In thissection the proposed MICV is tested on the Birds datasetand the Crane dataset -e results of the experiment inFigures 7 and 8 show that at the same number of selectedfeatures the Kappa value of the MICV method is basicallyhigher than that of other methods As the number of featuresincreases the Kappa value of the MICV method can con-verge earlier and remains relatively stable compared withother methods MICV is more effective compared with theresults of other feature evaluation methods

Tables 4 and 5 record the best classification results(Kappa accuracy and F1 scores) for each feature scoringsequence as well as the number of features used to obtainthis value -e bold one on the left side of ldquo|rdquo in each row inthe table indicates that the method has the least number offeatures than other methods and the bold on the right

indicates that the method has the highest evaluation indi-cator score Table 4 shows that in bird dataset MICVmethods had the highest Kappa value under four differentclassifiers In J48 NB and RFs classifiers MICV methodshad the lowest number of features and the highest score ofevaluation indicators in most cases As shown in Table 5 theperformances of MICV in J48 NB and RFs classifiers aresignificant

In summary the MICV method is more effective inselecting optimal features than the other seven methods-emethod can also get a good modeling effect by using a lowerdimension

33 Experiment of MICV-ERMFT Feature Selection In thesecond part of the experiment features are evaluated using CS

Name ERFTN (Eliminate Redundant Features based on Two Neighborhoods)Input T Max feature Tree by Algorithm BMFT F Features sorted with MICV methodStep

(1) Get the first element x in F(2) V y|y isin T1113864 y is the adjacent vertices of x (3) UpdateF by deleting all vertices in V that is F FV(4) Choose the next unvisited element as x(5) Repeat (2) to (4) until all the elements in F are visited(6) Output F as the final feature subset

Output F

ALGORITHM 3 ERFTN

Number of features

45

50

55

60

65

70

75

80

85

90

Acc (

)

Birds dataset

CVACMIECMICV

0 10 20 30 40 50 60 70 80

(a)

Number of features

Acc (

)

CVACMIECMICV

45

50

55

60

65

70

75

80Crane dataset

0 10 20 30 40 50 60 70 80

(b)

Figure 6 Experimental results of MIEC CVAC and MICV feature selection methods (a) Birds dataset and (b) Crane dataset

Mathematical Problems in Engineering 7

and six other Weka methods including Cor GR IG OR RFand SU

331 Procedure of Experiment -e procedure is demon-strated in Figure 9 Eight different methods (MICV and theseven other methods mentioned above) are used to evaluateeach featurersquos classification contribution and score thefeatures After sorting the features in an ascending order

based on the scores the ERMFT method is then used toeliminate redundant features resulting in a feature subset FprimeFprime is thenmapped to Dataset resulting in Datasetrsquo J48 SVMBayesNet (NB) and Random Forests (RFs) are the experi-mentrsquos classifiers For each independent dataset it is dividedinto 70 training set and 30 test set Each experiment isrepeated ten times and the average Kappa is calculatedAlso the DRR (Dimensionality Reduction Rate) as anevaluation indicator is introduced

Number of features

Kapp

aBirds dataset

CorGRIGOR

RFSUCSMICV

1

09

08

07

06

05

04

03

02

01

00 10 20 30 40 50 60 70 80

(a)

Birds dataset

Kapp

a

1

09

08

07

06

05

04

03

02

01

0

Number of features

CorGRIGOR

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(b)

Birds dataset

Kapp

a

1

09

08

07

06

05

04

03

02

01

0

Number of features

CorGRIGOR

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(c)

Birds dataset

Kapp

a

1

09

08

07

06

05

04

03

02

01

0

Number of features

CorGRIGOR

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(d)

Figure 7 Experimental results of MICV and other 7 feature evaluation methods in different classifiers (Birds dataset) (a) J48 (b) SVM (c)RFs (d) NB

8 Mathematical Problems in Engineering

Crane dataset

Number of features

Kapp

a

CorGRIGOR

RFSUCSMICV

1

09

08

07

06

05

04

03

02

01

00 10 20 30 40 50 60 70 80

(a)

Crane dataset

CorGRIGOR

Kapp

a

09

08

07

06

05

04

03

02

01

ndash01

0

Number of features

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(b)

Crane dataset

Kapp

a

09

08

07

06

05

04

03

02

01

Number of features

CorGRIGOR

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(c)

Crane dataset

Kapp

a

08

07

06

05

04

03

02

01

ndash01

0

CorGRIGOR

Number of features

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(d)

Figure 8 Experimental results of MICV and other 7 feature evaluation methods in different classifiers (Crane dataset) (a) J48 (b) SVM(c) RFs (d) NB

Table 1 Bird sound dataset information

Latin name Eng name Genus Number of samples RatePhalacrocorax carbo Great cormorant Phalacrocorax 36 831Numenius phaeopus Whimbrel Numenius 90 2079Aegithina nigrolutea White-tailed iora Aegithina 120 2772Chrysolophus amherstiae Lady Amherstrsquos Chrysolophus 68 1570Falco tinnunculus Common kestrel Falco 61 1409Tadorna ferruginea Ruddy shelduck Tadorna 58 1339

Mathematical Problems in Engineering 9

Table 2 Crane sounds dataset information

Latin name Eng name Genus Number of samples Rate ()Grus vipio White-naped crane

Grus

24 700Grus canadensis Sandhill crane 39 1137Grus virgo Demoiselle crane 60 1749Grus grus Common crane 62 1808Grus monacha Hooded crane 62 1808Grus japonensis Red-crowned crane 29 845Grus nigricollis Tibetan crane 67 1953

MICV

Cor

GR

IG

RF

OR

SU

CS

Feature score

Pearson product-moment correlation coefficient matrix T

Scoring sequence F

In addition to MICV ascending other methods

descending

Dataset

BMFTTprime

ERFTN

Feature subset Fprime

Datasetprime

Map Fprime to Dataset

ERMFT

Figure 9 Flowchart of experiment of MICV-ERMFT feature selection

Birds dataset

J48 RFs SVM NBClassifier

Kapp

a

CorGRIG

ORRFSU

CSMICVORI

1

09

08

07

06

05

04

03

02

01

0

(a)

ClassifierJ48 RFs SVM NB

Cor

GR

IG

OR

RF

SU

CS

MICV

Met

hod

of fe

atur

e sele

ctio

n

Birds dataset

36

35

35

38

35

37

36

38

35

38

35

37

36

35

35

38

35

37

38

35

35

35

35

35

42

31

42

45

42

32

45

42

32

34

36

38

40

42

44

(b)

Figure 10 Continued

10 Mathematical Problems in Engineering

DRR 1 minusFnprime

Fn

1113888 1113889lowast 100 (16)

In equation (16) Fnprime is the number of selected features

and Fn is the number of all features of each dataset -elarger the DRR value the stronger the ability to reducedimensions

332 Experiment of MICV-ERMFT Results and AnalysisFigure 10 shows the experimental results obtained fromfour different classifiers using eight different featureevaluation methods combined with ERMFTFigures 10(a) and 10(b) are the results of the Birds datasetFigures 10(c) and 10(d) are the results of the Cranedataset In Figure 10(a) four histograms represent theresults under the four classifiers and 9 elements in thegroup of histograms are Kappa values calculated from theeight methods with ERMFT and the original data (ORI)-e heat map of Figure 10(b) shows the number of se-lected features when the Kappa reaches a certain value ineach method and similarly so do Figures 10(c) and 10(d)In Figure 10(a) it can be clearly observed that the MICV-

ERMFT method has a slightly higher Kappa than othermethods and the J48 classifier in Figure 10(c) is moreeffective Besides the Kappa of the MICV-ERMFTmethod is higher than the original data Looking atFigures 10(a) and 10(b) at the same it is evident that theMICV-ERMFT method achieves a good modeling effectusing a small number of featuresrsquo time comparing withthe other methods Figures 10(c) and 10(d) show a similarresult

In conclusion compared with the other seven methodsthe MICV-ERMFT method demonstrates good abilities indimensionality reduction and feature interpretation

Combining Figures 8(b) and 8(d) with Table 6 it isobvious that the MICV-ERMFT method has a significantdimensionality reduction effect and model performanceeffect for the Birds dataset and the Crane dataset In Table 6Kappa value and DRR performance are very good for J48NB and SVM classmates on Birds dataset Particularly forthe NB classifier the other seven comparison methodsrsquoKappa value does not exceed ORI while the MICV-ERMFTmethod exceeds 04 In the Crane dataset the MICV-ERMFT outperforms other methods Table 7 shows the

Figure 10(c): Kappa values obtained by the J48, RFs, SVM, and NB classifiers on the Crane dataset for the eight feature evaluation methods with ERMFT and for ORI. Figure 10(d): heat map of the number of features selected by each method for each classifier on the Crane dataset.

Figure 10: Experimental results of the MICV-ERMFT method. (a) Kappa obtained with different classifiers using different feature evaluation methods on the Birds dataset. (b) Heat map of selected features in (a). (c) Kappa obtained with different classifiers using different feature evaluation methods on the Crane dataset. (d) Heat map of selected features in (c).

Table 3: The ratio of the number of selected features with different values of λ.

Dataset          λ = 0.1   0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9
Birds dataset    0.293     0.293   0.466   0.453   0.493   0.440   0.480   0.666   0.560
Crane dataset    0.240     0.360   0.160   0.333   0.706   0.506   0.826   0.933   0.866


Table 4: Comparison of Kappa, accuracy, and F1 scores with feature selection methods in the Birds dataset. Each cell gives the number of features | the highest value.

Indicator      Classifier   Cor          GR           IG           OR           RF           SU           CS           MICV
Kappa          J48          72 | 0.85    60 | 0.85    59 | 0.86    57 | 0.86    73 | 0.83    68 | 0.84    51 | 0.84    38 | 0.88
Kappa          NB           68 | 0.79    37 | 0.86    36 | 0.87    39 | 0.86    70 | 0.77    36 | 0.87    58 | 0.79    46 | 0.88
Kappa          SVM          63 | 0.93    44 | 0.93    52 | 0.93    61 | 0.93    64 | 0.95    46 | 0.93    65 | 0.95    51 | 0.95
Kappa          RFs          71 | 0.97    50 | 0.97    36 | 0.97    73 | 0.97    72 | 0.95    70 | 0.97    53 | 0.96    30 | 0.97
Accuracy (%)   J48          72 | 88.18   60 | 88.18   59 | 89.09   57 | 89.09   73 | 86.36   65 | 87.27   52 | 87.27   36 | 90.96
Accuracy (%)   NB           64 | 83.63   37 | 89.09   36 | 88.18   39 | 89.09   70 | 81.81   36 | 90.00   58 | 83.63   34 | 90.90
Accuracy (%)   SVM          63 | 94.54   44 | 94.45   41 | 94.00   61 | 94.45   64 | 96.36   46 | 94.45   65 | 96.36   51 | 93.63
Accuracy (%)   RFs          71 | 98.12   50 | 98.12   36 | 98.12   73 | 98.18   72 | 96.36   70 | 98.12   53 | 97.27   30 | 98.12
F1 score       J48          72 | 0.88    60 | 0.88    59 | 0.89    57 | 0.89    73 | 0.86    65 | 0.87    51 | 0.87    38 | 0.87
F1 score       NB           64 | 0.83    37 | 0.89    36 | 0.89    39 | 0.89    70 | 0.81    37 | 0.90    58 | 0.83    34 | 0.90
F1 score       SVM          63 | 0.94    49 | 0.94    41 | 0.94    61 | 0.94    64 | 0.96    46 | 0.94    65 | 0.96    51 | 0.93
F1 score       RFs          71 | 0.96    50 | 0.98    55 | 0.98    73 | 0.98    73 | 0.96    70 | 0.98    54 | 0.97    30 | 0.98

Table 6: MICV-ERMFT compared to other methods in terms of Kappa and DRR. Each cell gives Kappa | DRR (%).

Dataset   Classifier   Cor         GR          IG          OR          RF          SU          CS          MICV        ORI
Birds     J48          0.83 | 52   0.88 | 44   0.83 | 53   0.85 | 53   0.81 | 49   0.86 | 53   0.79 | 50   0.87 | 58   0.83 | 0
Birds     NB           0.76 | 40   0.74 | 49   0.77 | 53   0.75 | 53   0.74 | 44   0.76 | 53   0.76 | 53   0.81 | 53   0.77 | 0
Birds     SVM          0.92 | 52   0.92 | 44   0.89 | 53   0.87 | 44   0.92 | 53   0.88 | 53   0.92 | 53   0.93 | 57   0.92 | 0
Birds     RFs          0.95 | 44   0.95 | 52   0.95 | 40   0.93 | 49   0.96 | 53   0.94 | 49   0.94 | 53   0.93 | 50   0.93 | 0
Crane     J48          0.64 | 44   0.74 | 50   0.63 | 50   0.70 | 53   0.63 | 40   0.62 | 50   0.63 | 53   0.76 | 53   0.70 | 0
Crane     NB           0.73 | 50   0.75 | 52   0.71 | 44   0.73 | 53   0.72 | 50   0.71 | 50   0.69 | 46   0.76 | 53   0.75 | 0
Crane     SVM          0.83 | 42   0.78 | 52   0.73 | 50   0.82 | 53   0.77 | 50   0.73 | 50   0.84 | 53   0.84 | 52   0.84 | 0
Crane     RFs          0.85 | 50   0.83 | 40   0.83 | 50   0.84 | 53   0.84 | 50   0.83 | 42   0.84 | 50   0.88 | 53   0.88 | 0

Table 5: Comparison of Kappa, accuracy, and F1 scores with feature selection methods in the Crane dataset. Each cell gives the number of features | the highest value.

Indicator      Classifier   Cor          GR           IG           OR           RF           SU           CS           MICV
Kappa          J48          73 | 0.72    69 | 0.74    71 | 0.68    70 | 0.72    68 | 0.71    71 | 0.69    22 | 0.68    25 | 0.75
Kappa          NB           73 | 0.75    69 | 0.75    73 | 0.75    73 | 0.75    71 | 0.75    73 | 0.75    53 | 0.79    43 | 0.79
Kappa          SVM          73 | 0.84    69 | 0.84    73 | 0.84    73 | 0.84    72 | 0.86    73 | 0.84    73 | 0.84    69 | 0.84
Kappa          RFs          66 | 0.89    51 | 0.88    69 | 0.89    73 | 0.89    68 | 0.89    63 | 0.88    36 | 0.90    41 | 0.90
Accuracy (%)   J48          73 | 77.00   69 | 78.00   71 | 73.00   70 | 77.00   68 | 76.00   71 | 73.00   22 | 73.00   18 | 79.00
Accuracy (%)   NB           73 | 79.00   69 | 79.00   73 | 79.00   73 | 79.00   71 | 79.00   73 | 79.00   53 | 81.00   43 | 83.00
Accuracy (%)   SVM          72 | 87.00   68 | 87.00   73 | 87.00   73 | 87.00   72 | 89.00   73 | 87.00   73 | 87.00   69 | 87.00
Accuracy (%)   RFs          66 | 91.00   51 | 90.00   69 | 91.00   73 | 91.00   58 | 91.00   63 | 90.00   36 | 90.00   41 | 91.00
F1 score       J48          73 | 0.77    69 | 0.78    71 | 0.73    70 | 0.77    67 | 0.77    71 | 0.73    25 | 0.73    25 | 0.79
F1 score       NB           73 | 0.79    69 | 0.79    73 | 0.79    73 | 0.79    73 | 0.79    73 | 0.79    53 | 0.81    43 | 0.82
F1 score       SVM          72 | 0.86    68 | 0.87    73 | 0.86    73 | 0.86    72 | 0.88    73 | 0.86    73 | 0.87    69 | 0.86
F1 score       RFs          69 | 0.91    51 | 0.89    69 | 0.91    73 | 0.90    68 | 0.90    66 | 0.90    72 | 0.91    41 | 0.91

Table 7: The time used by different feature evaluation methods.

Dataset   Classifier   Cor     GR      IG      OR      RF      SU      CS      MICV    ORI
Birds     J48          21035   18026   27100   45227   23204   27069   23268   21074   31728
Birds     NB           13208   22001   28666   62036   12039   16001   18569   18257   35810
Birds     SVM          21580   31500   42689   36028   56244   36028   20789   25104   51568
Birds     RFs          31829   42626   51698   27853   36952   41524   61236   21568   81732
Crane     J48          19326   23825   28624   22596   32632   41069   26547   11638   51629
Crane     NB           18624   16527   16549   39326   43829   52806   41026   19628   46258
Crane     SVM          63426   71869   65826   63429   53440   33651   46458   30824   73496
Crane     RFs          49637   53746   60689   40547   41968   31906   66504   31869   83048


running time cost of the MICV-ERMFT method and the other seven feature selection methods. It is not more time-consuming than the other methods.

In the experiments on the Birds dataset and the Crane dataset, the Kappa metrics obtained by different classifiers with the MICV-ERMFT method are generally superior to those of the other methods. The MICV-ERMFT method remains excellent for the most part and is more stable than the other methods, although other methods surpass the MICV-ERMFT method with some classifiers. Besides, the MICV-ERMFT method improves the Kappa value compared to the original data. Although the improvement is minimal in some cases, the MICV-ERMFT method uses only about half of the features compared to the original data.

In conclusion, MICV-ERMFT has better performance in dimensionality reduction and model performance improvement.

4. Conclusion

Feature selection is an important preprocessing step in data mining and classification. In recent years, researchers have focused on feature contribution evaluation and redundancy reduction, and different optimization algorithms have been proposed to address this problem. In this paper, we measure the contribution of features to the classification from the perspective of probability. Combined with the maximum feature tree to remove redundancy, the MICV-ERMFT method is proposed to select the optimal features and is applied to the automatic recognition of bird sounds.

To verify the MICV-ERMFT method's effectiveness in automatic bird sound recognition, two datasets are used in the experiments: data of different genera (Birds dataset) and data of the same genus (Crane dataset). The results of the experiments show that the Kappa indicator of the Birds dataset reaches 0.93 and the dimension reduction rate reaches 57%. The Kappa value of the Crane dataset is 0.88, the dimension reduction rate reaches 53%, and good results were obtained.

This study shows that the proposed MICV-ERMFT feature selection method is effective. The bird audio selected in this paper is noise-filtered, and further research should test this method's performance using a denoising method. We will continue to explore the performance of MICV-ERMFT on datasets with a larger number of features and instances.

Data Availability

All the data included in this study are available upon request by contacting the corresponding author.

Disclosure

The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This research was funded by the National Natural Science Foundation of China under Grants nos. 61462078, 31960142, and 31860332.

References

[1] C. A. Ruiz-Martinez, M. T. Akhtar, Y. Washizawa, and E. Escamilla-Hernandez, "On investigating efficient methodology for environmental sound recognition," in Proceedings of the ISPACS 2013 - 2013 International Symposium on Intelligent Signal Processing and Communication Systems, pp. 210-214, Naha, Japan, November 2013.
[2] P. Jancovic and M. Kokuer, "Bird species recognition using unsupervised modeling of individual vocalization elements," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 5, pp. 932-947, 2019.
[3] A. D. P. Ramirez, J. I. De La Rosa Vargas, R. R. Valdez, and A. Becerra, "A comparative between mel frequency cepstral coefficients (MFCC) and inverse mel frequency cepstral coefficients (IMFCC) features for an automatic bird species recognition system," in Proceedings of the 2018 IEEE Latin American Conference on Computational Intelligence (LA-CCI), pp. 1-4, Guadalajara, Mexico, November 2018.
[4] D. Griffin and J. Jae Lim, "Signal estimation from modified short-time Fourier transform," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 32, no. 2, pp. 236-243, 1984.
[5] S. Kadambe and G. F. Boudreaux-Bartels, "Application of the wavelet transform for pitch detection of speech signals," IEEE Transactions on Information Theory, vol. 38, no. 2, pp. 917-924, 1992.
[6] E. Tsau, S.-H. Kim, and C.-C. J. Kuo, "Environmental sound recognition with CELP-based features," in Proceedings of the ISSCS 2011 - International Symposium on Signals, Circuits and Systems, pp. 1-4, Iasi, Romania, July 2011.
[7] C. Collberg, C. Thomborson, and D. Low, "A taxonomy of obfuscating transformations," Technical Report 148, The University of Auckland, Auckland, New Zealand, 1997.
[8] S. Garcia, J. Luengo, and F. Herrera, "Feature selection," Intelligent Systems Reference Library, vol. 72, pp. 163-193, 2015.
[9] V. Kumar and S. Minz, "Feature selection: a literature review," Smart Computing Review, vol. 4, 2014.
[10] Y. Zhang, Q. Wang, D.-W. Gong, and X.-F. Song, "Nonnegative Laplacian embedding guided subspace learning for unsupervised feature selection," Pattern Recognition, vol. 93, pp. 337-352, 2019.
[11] S. Zhao, Y. Zhang, H. Xu, and T. Han, "Ensemble classification based on feature selection for environmental sound recognition," Mathematical Problems in Engineering, vol. 2019, Article ID 4318463, 7 pages, 2019.
[12] S. H. Zhang, Z. Zhao, Z. Y. Xu, K. Bellisario, and B. C. Pijanowski, "Automatic bird vocalization identification based on fusion of spectral pattern and texture features," in Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271-275, Calgary, Canada, April 2018.


[13] A. V. Bang and P. P. Rege, "Recognition of bird species from their sounds using data reduction techniques," in Proceedings of the 7th International Conference on Computer and Communication Technology, pp. 111-116, Allahabad, India, November 2017.
[14] M. Mafarja, I. Aljarah, A. A. Heidari et al., "Binary dragonfly optimization for feature selection using time-varying transfer functions," Knowledge-Based Systems, vol. 161, pp. 185-204, 2018.
[15] Q. Wu, Z. Ma, J. Fan, G. Xu, and Y. Shen, "A feature selection method based on hybrid improved binary quantum particle swarm optimization," IEEE Access, vol. 7, pp. 80588-80601, 2019.
[16] H. W. Wang, Y. Meng, P. Yin, and J. Hua, "A model-driven method for quality reviews detection: an ensemble model of feature selection," in Proceedings of the Fifteenth Wuhan International Conference on E-Business, pp. 573-581, Wuhan, China, 2016.
[17] H. Rao, X. Shi, A. K. Rodrigue et al., "Feature selection based on artificial bee colony and gradient boosting decision tree," Applied Soft Computing, vol. 74, pp. 634-642, 2019.
[18] D. A. A. Gnana, "Literature review on feature selection methods for high-dimensional data," International Journal of Computer Applications, vol. 136, no. 1, pp. 9-17, 2016.
[19] G. I. Sayed, A. Darwish, and A. E. Hassanien, "A new chaotic whale optimization algorithm for features selection," Journal of Classification, vol. 35, no. 2, pp. 300-344, 2018.
[20] A. E. Hegazy, M. A. Makhlouf, and G. S. El-Tawel, "Improved salp swarm algorithm for feature selection," Journal of King Saud University - Computer and Information Sciences, vol. 32, no. 3, pp. 335-344, 2020.
[21] M. Khamees, A. Albakry, and K. Shaker, "Multi-objective feature selection: hybrid of salp swarm and simulated annealing approach," in Proceedings of the International Conference on New Trends in Information and Communications Technology Applications, pp. 129-142, Baghdad, Iraq, January 2018.
[22] M. Sadeghi and H. Marvi, "Optimal MFCC features extraction by differential evolution algorithm for speaker recognition," in Proceedings of the 2017 3rd Iranian Conference on Intelligent Systems and Signal Processing (ICSPIS), pp. 169-173, Shahrood, Iran, December 2017.
[23] A. V. Bang and P. P. Rege, "Automatic recognition of bird species using human factor cepstral coefficients," Smart Computing and Informatics, vol. 77, pp. 363-373, 2018.
[24] R. H. D. Zottesso, Y. M. G. Costa, D. Bertolini, and L. E. S. Oliveira, "Bird species identification using spectrogram and dissimilarity approach," Ecological Informatics, vol. 48, pp. 187-197, 2018.
[25] J. Stastny, M. Munk, and L. Juranek, "Automatic bird species recognition based on birds vocalization," EURASIP Journal on Audio, Speech, and Music Processing, vol. 2018, no. 1, pp. 1-7, 2018.
[26] S. Fagerlund, Automatic Recognition of Bird Species by Their Sounds, Helsinki University of Technology, Espoo, Finland, 2004.
[27] L. Ptacek, Birds individual automatic recognition, Ph.D. thesis, University of West Bohemia, Pilsen, Czechia, 2012.
[28] A. B. Labao, M. A. Clutario, and P. C. Naval, "Classification of bird sounds using codebook features," in Proceedings of the Asian Conference on Intelligent Information and Database Systems, pp. 223-233, Dong Hoi City, Vietnam, March 2018.
[29] D. Lepage, "Avibase - the world bird database," 2020, https://avibase.bsc-eoc.org/avibase.jsp.
[30] G. A. Pereira, "Xeno-canto - sharing birds sounds from around the world," 2003, https://www.xeno-canto.org.
[31] J. Stastny, V. Skorpil, and J. Fejfar, "Audio data classification by means of new algorithms," in Proceedings of the 2013 36th International Conference on Telecommunications and Signal Processing (TSP), pp. 507-511, Rome, Italy, July 2013.
[32] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, "The WEKA data mining software," ACM SIGKDD Explorations Newsletter, vol. 11, no. 1, pp. 10-18, 2009.



Mathematical Problems in Engineering 9

Table 2 Crane sounds dataset information

Latin name Eng name Genus Number of samples Rate ()Grus vipio White-naped crane

Grus

24 700Grus canadensis Sandhill crane 39 1137Grus virgo Demoiselle crane 60 1749Grus grus Common crane 62 1808Grus monacha Hooded crane 62 1808Grus japonensis Red-crowned crane 29 845Grus nigricollis Tibetan crane 67 1953

MICV

Cor

GR

IG

RF

OR

SU

CS

Feature score

Pearson product-moment correlation coefficient matrix T

Scoring sequence F

In addition to MICV ascending other methods

descending

Dataset

BMFTTprime

ERFTN

Feature subset Fprime

Datasetprime

Map Fprime to Dataset

ERMFT

Figure 9 Flowchart of experiment of MICV-ERMFT feature selection

Birds dataset

J48 RFs SVM NBClassifier

Kapp

a

CorGRIG

ORRFSU

CSMICVORI

1

09

08

07

06

05

04

03

02

01

0

(a)

ClassifierJ48 RFs SVM NB

Cor

GR

IG

OR

RF

SU

CS

MICV

Met

hod

of fe

atur

e sele

ctio

n

Birds dataset

36

35

35

38

35

37

36

38

35

38

35

37

36

35

35

38

35

37

38

35

35

35

35

35

42

31

42

45

42

32

45

42

32

34

36

38

40

42

44

(b)

Figure 10 Continued

10 Mathematical Problems in Engineering

DRR 1 minusFnprime

Fn

1113888 1113889lowast 100 (16)

In equation (16) Fnprime is the number of selected features

and Fn is the number of all features of each dataset -elarger the DRR value the stronger the ability to reducedimensions

332 Experiment of MICV-ERMFT Results and AnalysisFigure 10 shows the experimental results obtained fromfour different classifiers using eight different featureevaluation methods combined with ERMFTFigures 10(a) and 10(b) are the results of the Birds datasetFigures 10(c) and 10(d) are the results of the Cranedataset In Figure 10(a) four histograms represent theresults under the four classifiers and 9 elements in thegroup of histograms are Kappa values calculated from theeight methods with ERMFT and the original data (ORI)-e heat map of Figure 10(b) shows the number of se-lected features when the Kappa reaches a certain value ineach method and similarly so do Figures 10(c) and 10(d)In Figure 10(a) it can be clearly observed that the MICV-

ERMFT method has a slightly higher Kappa than othermethods and the J48 classifier in Figure 10(c) is moreeffective Besides the Kappa of the MICV-ERMFTmethod is higher than the original data Looking atFigures 10(a) and 10(b) at the same it is evident that theMICV-ERMFT method achieves a good modeling effectusing a small number of featuresrsquo time comparing withthe other methods Figures 10(c) and 10(d) show a similarresult

In conclusion compared with the other seven methodsthe MICV-ERMFT method demonstrates good abilities indimensionality reduction and feature interpretation

Combining Figures 8(b) and 8(d) with Table 6 it isobvious that the MICV-ERMFT method has a significantdimensionality reduction effect and model performanceeffect for the Birds dataset and the Crane dataset In Table 6Kappa value and DRR performance are very good for J48NB and SVM classmates on Birds dataset Particularly forthe NB classifier the other seven comparison methodsrsquoKappa value does not exceed ORI while the MICV-ERMFTmethod exceeds 04 In the Crane dataset the MICV-ERMFT outperforms other methods Table 7 shows the

Classifier

CorGRIG

ORRFSU

CSMICVORI

Crane dataset

J48 RFs SVM NB

Kapp

a09

08

07

06

05

04

03

02

01

0

(c)

Classifier

Met

hod

of fe

atur

e sele

ctio

n

J48 RFs SVM NB

Cor

GR

IG

OR

RF

SU

CS

MICV

Crane dataset

42

37

37

37

37

37

37

37

36

37

37

37

37

36

42

37

37

40

35

45

35

35

45

35

43

35

43

35

35

32

35

35

34

32

36

38

40

42

44

(d)

Figure 10 Experimental results of MICV-ERMFTmethod (a) Kappa obtained with different classifiers by using different feature evaluationmethods in Birds dataset (b) Heat map of selected feature in (a) (c) Kappa obtained with different classifiers by using different featureevaluation methods in Crane dataset (d) Heat map of selected features in (c)

Table 3 -e ratio of the number of selected features with different values of λ

Datasetλ

01 02 03 04 05 06 07 08 09Birds dataset 0293 0293 0466 0453 0493 0440 0480 0666 0560Crane dataset 0240 0360 0160 0333 0706 0506 0826 0933 0866

Mathematical Problems in Engineering 11

Table 4 Comparison of Kappa accuracy and F1 scores with feature selection methods in Birds dataset

Evaluation indicator ClassifierFeature selection method

Cor GR IG OR RF SU CS MICVNumber of features|the highest value

Kappa

J48 72 | 085 60 | 085 59 | 086 57 | 086 73 | 083 68 | 084 51 | 084 38 | 088NB 68 | 079 37 | 086 36 | 087 39 | 086 70 | 077 36 | 087 58 | 079 46 | 088SVM 63 | 093 44 | 093 52 | 093 61 | 093 64 | 095 46 | 093 65 | 095 51 | 095RFs 71 | 097 50 | 097 36 | 097 73 | 097 72 | 095 70 | 097 53 | 096 30 | 097

Accuracy

J48 72 | 8818 60 | 8818 59 | 8909 57 | 8909 73 | 8636 65 | 8727 52 | 8727 36 | 9096NB 64 | 8363 37 | 8909 36 | 8818 39 | 8909 70 | 8181 36 | 9000 58 | 8363 34 | 9090SVM 63 | 9454 44 | 9445 41 | 9400 61 | 9445 64 | 9636 46 | 9445 65 | 9636 51 | 9363RFs 71 | 9812 50 | 9812 36 | 9812 73 | 9818 72 | 9636 70 | 9812 53 | 9727 30 | 9812

F1 score

J48 72 | 088 60 | 088 59 | 089 57 | 089 73 | 086 65 | 087 51 | 087 38 | 087NB 64 | 083 37 | 089 36 | 089 39 | 089 70 | 081 37 | 090 58 | 083 34 | 090SVM 63 | 094 49 | 094 41 | 094 61 | 094 64 | 096 46 | 094 65 | 096 51 | 093RFs 71 | 096 50 | 098 55 | 098 73 | 098 73 | 096 70 | 098 54 | 097 30 | 098

Table 6 MICV-ERMFT compared to other methods of Kappa and DRR

Dataset ClassifierMethod

Cor GR IG OR RF SU CS MICV ORIKappa|DRR ()

Birds

J48 083 | 52 088 | 44 083 | 53 085 | 53 081 | 49 086 | 53 079 | 50 087 | 58 083 | 0NB 076 | 40 074 | 49 077 | 53 075 | 53 074 | 44 076 | 53 076 | 53 081 | 53 077 | 0SVM 092 | 52 092 | 44 089 | 53 087 | 44 092 | 53 088 | 53 092 | 53 093 | 57 092 | 0RFs 095 | 44 095 | 52 095 | 40 093 | 49 096 | 53 094 | 49 094 | 53 093 | 50 093 | 0

Crane

J48 064 | 44 074 | 50 063 | 50 070 | 53 063 | 40 062 | 50 063 | 53 076 | 53 070 | 0NB 073 | 50 075 | 52 071 | 44 073 | 53 072 | 50 071 | 50 069 | 46 076 | 53 075 | 0SVM 083 | 42 078 | 52 073 | 50 082 | 53 077 | 50 073 | 50 084 | 53 084 | 52 084 | 0RFs 085 | 50 083 | 40 083 | 50 084 | 53 084 | 50 083 | 42 084 | 50 088 | 53 088 | 0

Table 5 Comparison of Kappa accuracy and F1 scores with feature selection methods in Crane dataset

Evaluation indicator ClassifierFeature selection method

Cor GR IG OR RF SU CS MICVNumber of features|the highest value

Kappa

J48 73 | 072 69 | 074 71 | 068 70 | 072 68 | 071 71 | 069 22 | 068 25 | 075NB 73 | 075 69 | 075 73 | 075 73 | 075 71 | 075 73 | 075 53 | 079 43 | 079SVM 73 | 084 69 | 084 73 | 084 73 | 084 72 | 086 73 | 084 73 | 084 69 | 084RFs 66 | 089 51 | 088 69 | 089 73 | 089 68 | 089 63 | 088 36 | 090 41 | 090

Accuracy

J48 73 | 7700 69 | 7800 71 | 7300 70 | 7700 68 | 7600 71 | 7300 22 | 7300 18 | 7900NB 73 | 7900 69 | 7900 73 | 7900 73 | 7900 71 | 7900 73 | 7900 53 | 8100 43 | 8300SVM 72 | 8700 68 | 8700 73 | 8700 73 | 8700 72 | 8900 73 | 8700 73 | 8700 69 | 8700RFs 66 | 9100 51 | 9000 69 | 9100 73 | 9100 58 | 9100 63 | 9000 36 | 9000 41 | 9100

F1 score

J48 73 | 077 69 | 078 71 | 073 70 | 077 67 | 077 71 | 073 25 | 073 25 | 079NB 73 | 079 69 | 079 73 | 079 73 | 079 73 | 079 73 | 079 53 | 081 43 | 082SVM 72 | 086 68 | 087 73 | 086 73 | 086 72 | 088 73 | 086 73 | 087 69 | 086RFs 69 | 091 51 | 089 69 | 091 73 | 090 68 | 090 66 | 090 72 | 091 41 | 091

Table 7 -e time used by different feature evaluation methods

Dataset ClassifierMethod

Cor GR IG OR RF SU CS MICV ORI

Birds

J48 21035 18026 27100 45227 23204 27069 23268 21074 31728NB 13208 22001 28666 62036 12039 16001 18569 18257 35810SVM 21580 31500 42689 36028 56244 36028 20789 25104 51568RFs 31829 42626 51698 27853 36952 41524 61236 21568 81732

Crane

J48 19326 23825 28624 22596 32632 41069 26547 11638 51629NB 18624 16527 16549 39326 43829 52806 41026 19628 46258SVM 63426 71869 65826 63429 53440 33651 46458 30824 73496RFs 49637 53746 60689 40547 41968 31906 66504 31869 83048

12 Mathematical Problems in Engineering

running time cost by the MICV-ERMFT method and theother seven feature selection methods It is not too time-consuming than other methods

In experiments of Birds dataset and Crane datasetKappa metrics using different classifiers with the MICV-ERMFTmethod are generally superior to the other methods-e MICV-ERMFT method remains excellent for the mostpart and is more stable than the other methods althoughother methods surpass the MICV-ERMFTmethod in someclassifiers Besides the MICV-ERMFTmethod improves theKappa value compared to the original data Although theimprovement is minimal in some cases the MICV-ERMFTmethod only uses about half of the characteristic featurescompared to the original data

In conclusion MICV-ERMFT has better performance indimensionality reduction and model performanceimprovement

4 Conclusion

Feature selection is an important preprocessing step in datamining and classification In recent years researchers havefocused on feature contribution evaluation and redundancyreduction and different optimization algorithms have beenproposed to address this problem In this paper wemeasure the contribution of features to the classificationfrom the perspective of probability Combined with themaximum feature tree to remove the redundancy theMICV-ERMFT method is proposed to select the optimalfeatures and applied in the automatic recognition of birdsounds

To verify the MICV-ERMFT methodrsquos effectiveness inautomatic bird sounds recognition two datasets are used inthe experiments data of different genera (Birds dataset) anddata of the same genera (Crane dataset) -e results ofexperiments show that the Kappa indicator of the Birdsdataset reaches 093 and the dimension reduction ratereaches 57 -e Kappa value of the Crane dataset is 088the dimension reduction rate reached 53 and good resultswere obtained

-is study shows that the proposed MICV-ERMFTfeature selection method is effective -e bird audio se-lected in this paper is noise filtered and further researchshould test this methodrsquos performance using a denoisingmethod We will continue to explore the performance ofMICV-ERMFT in the dataset with a larger number offeatures and instances

Data Availability

All the data included in this study are available upon requestby contact with the corresponding author

Disclosure

-e funders had no role in the design of the study in thecollection analyses or interpretation of data in thewriting of the manuscript or in the decision to publishthe results

Conflicts of Interest

-e authors declare no conflicts of interest

Acknowledgments

-is research was funded by the National Natural ScienceFoundation of China under Grants nos 61462078 31960142and 31860332

References

[1] C A Ruiz-Martinez M T Akhtar Y Washizawa andE Escamilla-Hernandez ldquoOn investigating efficient meth-odology for environmental sound recognitionrdquo in Proceedingsof the ISPACS 2013mdash2013 International Symposium on In-telligent Signal Processing and Communication Systemspp 210ndash214 Naha Japan November 2013

[2] P Jancovic and M Kokuer ldquoBird species recognition usingunsupervised modeling of individual vocalization elementsrdquoIEEEACM Transactions on Audio Speech and LanguageProcessing vol 27 no 5 pp 932ndash947 2019

[3] A D P Ramirez J I De La Rosa Vargas R R Valdez andA Becerra ldquoA comparative between mel frequency cepstralcoefficients (MFCC) and inverse mel frequency cepstral co-efficients (IMFCC) features for an automatic bird speciesrecognition systemrdquo in Proceedings of the 2018 IEEE LatinAmerican Conference on Computational Intelligence (LA-CCI) pp 1ndash4 Gudalajara Mexico November 2018

[4] D Griffin and J Jae Lim ldquoSignal estimation from modifiedshort-time Fourier transformrdquo IEEE Transactions onAcoustics Speech and Signal Processing vol 32 no 2pp 236ndash243 1984

[5] S Kadambe and G F Boudreaux-Bartels ldquoApplication of thewavelet transform for pitch detection of speech signalsrdquo IEEETransactions on Information 9eory vol 38 no 2 pp 917ndash924 1992

[6] E Tsau S-H Kim and C-C J Kuo ldquoEnvironmental soundrecognition with CELP-based featuresrdquo in Proceedings of theISSCS 2011mdashInternational Symposium on Signals Circuitsand Systems pp 1ndash4 Iasi Romania July 2011

[7] C Collberg C -omborson and D Low ldquoA taxonomy ofobfuscating transformationsrdquo Technical Reports 148 -eUniversity of Auckland Auckland New Zealand 1997

[8] S Garcıa J Luengo F Herrera S Garcıa J Luengo andF Herrera ldquoFeature selectionrdquo Intelligent Systems ReferenceLibrary vol 72 pp 163ndash193 2015

[9] V Kumar and S Minz ldquoFeature selection a literature reviewrdquoSmart Computing Review vol 4 2014

[10] Y Zhang Q Wang D-W Gong and X-F Song ldquoNon-negative Laplacian embedding guided subspace learning forunsupervised feature selectionrdquo Pattern Recognition vol 93pp 337ndash352 2019

[11] S Zhao Y Zhang H Xu and T Han ldquoEnsemble classifi-cation based on feature selection for environmental soundrecognitionrdquo Mathematical Problems in Engineeringvol 2019 Article ID 4318463 7 pages 2019

[12] S H Zhang Z Zhao Z Y Xu K Bellisario andB C Pijanowski ldquoAutomatic bird vocalization identificationbased on fusion of spectral pattern and texture featuresrdquo inProceedings of the 2018 IEEE International Conference onAcoustics Speech and Signal Processing (ICASSP) pp 271ndash275 Calgary Canada April 2018

Mathematical Problems in Engineering 13

[13] A V Bang and P P Rege ldquoRecognition of bird species fromtheir sounds using data reduction techniquesrdquo in Proceedingsof the 7th International Conference on Computer and Com-munication Technology pp 111ndash116 Allahabad India No-vember 2017

[14] M Mafarja I Aljarah A A Heidari et al ldquoBinary dragonflyoptimization for feature selection using time-varying transferfunctionsrdquo Knowledge-Based Systems vol 161 pp 185ndash2042018

[15] Q Wu Z Ma J Fan G Xu and Y Shen ldquoA feature selectionmethod based on hybrid improved binary quantum particleswarm optimizationrdquo IEEE Access vol 7 pp 80588ndash806012019

[16] H W Wang Y Meng P Yin and J Hua ldquoA model-drivenmethod for quality reviews detection an ensemble model offeature selectionrdquo in Proceedings of the Fifteenth WuhanInternational Conference Electric Buses pp 573ndash581 WuhanChina 2016

[17] H Rao X Shi A K Rodrigue et al ldquoFeature selection basedon artificial bee colony and gradient boosting decision treerdquoApplied Soft Computing vol 74 pp 634ndash642 2019

[18] D A A Gnana ldquoLiterature review on feature selectionmethods for high-dimensional datardquo International Journal ofComputer Applications vol 136 no 1 pp 9ndash17 2016

[19] G I Sayed A Darwish and A E Hassanien ldquoA new chaoticwhale optimization algorithm for features selectionrdquo Journalof Classification vol 35 no 2 pp 300ndash344 2018

[20] A E Hegazy M A Makhlouf and G S El-Tawel ldquoImprovedsalp swarm algorithm for feature selectionrdquo Journal of KingSaud UniversitymdashComputer and Information Sciences vol 32no 3 pp 335ndash344 2020

[21] M Khamees A Albakry and K Shaker ldquoMulti-objectivefeature selection hybrid of salp swarm and simulatedannealing approachrdquo in Proceedings of the InternationalConference on New Trends in Information and Communica-tions Technology Applications pp 129ndash142 Baghdad IraqJanuary 2018

[22] M Sadeghi and HMarvi ldquoOptimalMFCC features extractionby differential evolution algorithm for speaker recognitionrdquoin Proceedings of the 2017 3rd Iranian Conference on Intel-ligent Systems and Signal Processing (ICSPIS) pp 169ndash173Shahrood Iran December 2017

[23] A V Bang and P P Rege ldquoAutomatic recognition of birdspecies using human factor cepstral coefficientsrdquo SmartComputing and Informatics vol 77 pp 363ndash373 2018

[24] R H D Zottesso Y M G Costa D Bertolini andL E S Oliveira ldquoBird species identification using spectro-gram and dissimilarity approachrdquo Ecological Informaticsvol 48 pp 187ndash197 2018

[25] J Stastny M Munk and L Juranek ldquoAutomatic bird speciesrecognition based on birds vocalizationrdquo EURASIP Journal onAudio Speech and Music Processing vol 2018 no 1 pp 1ndash72018

[26] S Fagerlund Automatic Recognition of Bird Species by 9eirSounds Helsinki University of Technology Espoo Finland2004

[27] L Ptacek Birds individual automatic recognition PhD thesisUniversity of West Bohemia Pilsen Czechia 2012

[28] A B Labao M A Clutario and P C Naval ldquoClassification ofbird sounds using codebook featuresrdquo in Proceedings of theAsian Conference on Intelligent Information and DatabaseSystems pp 223ndash233 Dong Hoi City Vietnam March 2018

[29] D Lepage ldquoAvibasemdashthe world bird databaserdquo 2020 httpsavibasebsc-eocorgavibasejsp

[30] G A Pereira ldquoXeno-cantomdashsharing birds sounds fromaround the worldrdquo 2003 httpswwwxeno-cantoorg

[31] J Stastny V Skorpil and J Fejfar ldquoAudio data classificationby means of new algorithmsrdquo in Proceedings of the 2013 36thInternational Conference on Telecommunications and SignalProcessing (TSP) pp 507ndash511 Rome Italy July 2013

[32] M Hall E Frank G Holmes B Pfahringer P Reutemannand I H Witten ldquo-e WEKA data mining softwarerdquo ACMSIGKDD Explorations Newsletter vol 11 no 1 pp 10ndash182009

14 Mathematical Problems in Engineering

Page 6: Feature Selection Using Maximum Feature Tree Embedded with

\[ F_1 = \frac{2 \cdot \mathrm{precision} \cdot \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}} \tag{14} \]

(3) Accuracy. The accuracy is calculated based on the equation

\[ \mathrm{accuracy} = \frac{n}{M} \tag{15} \]

In equation (15), n represents the number of correctly classified samples and M represents the total number of samples.

Each dataset is divided into a 70% training set and a 30% test set. Each experiment is repeated 10 times, and the results are averaged to reduce bias.
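As a concrete illustration of these indicators (this is not the authors' code; the function names and the toy labels below are mine), the following Python sketch computes accuracy, a macro-averaged F1 score, and Cohen's Kappa for a small set of predictions:

```python
from collections import Counter

def accuracy(y_true, y_pred):
    # equation (15): correctly classified samples divided by all samples
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def macro_f1(y_true, y_pred):
    # equation (14) applied per class, then averaged over the classes
    labels = sorted(set(y_true) | set(y_pred))
    f1s = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * precision * recall / (precision + recall)
                   if precision + recall else 0.0)
    return sum(f1s) / len(f1s)

def cohen_kappa(y_true, y_pred):
    # observed agreement corrected for chance agreement
    n = len(y_true)
    po = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    pe = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (po - pe) / (1 - pe) if pe != 1 else 1.0

if __name__ == "__main__":
    y_true = ["crane", "crane", "kestrel", "shelduck", "kestrel"]
    y_pred = ["crane", "kestrel", "kestrel", "shelduck", "kestrel"]
    print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred),
          cohen_kappa(y_true, y_pred))
```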

3.2.2. MICV λ Parameter Setting. In equation (5), λ is used to adjust the weight coefficients of MIEC and CVAC. The experiments set λ ∈ {0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9} and calculate the MICV with the J48 classifier. For each λ, the ratio of the number of selected features to the total number of features at the point where the highest Kappa is reached is listed in Table 3; a lower ratio indicates better performance. Table 3 shows that better results are obtained when λ is set to 0.1, 0.3, or 0.2. In the following experiments in this paper, λ is set to 0.1.
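To make this selection criterion concrete, the short sketch below (illustrative only; the Kappa curves are invented numbers, not the paper's results) shows how the ratio reported in Table 3 can be computed: for each λ, find the earliest point at which the Kappa curve over the MICV-ordered features reaches its maximum, then divide the corresponding number of features by the total.

```python
# Illustrative sketch: the "selected features / total features" ratio at the point
# where the Kappa curve first reaches its maximum, as used for Table 3.
def ratio_at_best_kappa(kappa_curve, n_total):
    """kappa_curve[i] is the Kappa obtained with the first i+1 features in MICV order."""
    best = max(kappa_curve)
    n_selected = kappa_curve.index(best) + 1   # earliest point reaching the best Kappa
    return n_selected / n_total

# Hypothetical Kappa curves for two settings of lambda on a 75-feature dataset.
curves = {
    0.1: [0.40, 0.55, 0.70, 0.88, 0.88, 0.87],
    0.5: [0.35, 0.50, 0.62, 0.70, 0.80, 0.88],
}
for lam, curve in sorted(curves.items()):
    print(f"lambda = {lam}: ratio = {ratio_at_best_kappa(curve, n_total=75):.3f}")
```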

3.2.3. Compare MIEC, CVAC, and MICV. The selected feature set has a decisive effect on the classification model: features with higher scores normally lead to better classification performance. The experiments sort the feature sequence in ascending order according to the feature scores obtained from MIEC, CVAC, and MICV, respectively. In Figure 6, in most cases the red curves (MICV) ascend more stably, which shows that as features are added gradually the classification model's performance improves, especially in Figure 6(a); the CVAC and MIEC methods show obvious fluctuations in Figures 6(a) and 6(b). To sum up, combining MIEC and CVAC works better than using either of them alone.

Name: MICV-ERMFT feature selection
Input: dataset D (m: number of samples; n: number of features)
Steps:
(1) Calculate the MICV score of each feature in D using equation (5).
(2) Sort the features by MICV score in ascending order to obtain the sequence F. Following F, add the features of D one at a time and score each candidate subset with the base classifier; delete any feature whose addition causes the indicator to decline. This yields the feature sequence F*, and mapping F* onto D gives the dataset D*.
(3) Calculate the Pearson correlation coefficient matrix P over the feature vectors of D*.
(4) Apply algorithm BMFT (Algorithm 2) to construct a maximum feature tree T from P.
(5) Apply the two-neighborhood-based redundancy-eliminating algorithm ERFTN (Algorithm 3) to F*; denote the resulting array as F**.
(6) Map F** onto D* to obtain the dataset D**.
Output: new dataset D**

ALGORITHM 1: MICV-ERMFT feature selection.
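The following Python sketch illustrates step (2) of Algorithm 1, the greedy pass that keeps a feature only if adding it does not lower the base classifier's score. It is my reading of the pseudocode, not the authors' implementation; scikit-learn's DecisionTreeClassifier stands in for Weka's J48, and the MICV ordering is a placeholder.

```python
# Sketch of step (2) of Algorithm 1 (not the authors' code): walk the MICV-ordered
# features and keep a feature only if adding it does not lower the base classifier's
# score. DecisionTreeClassifier is a stand-in for Weka's J48; `order` is a placeholder
# for the MICV ordering produced in step (1).
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def forward_select(X, y, order):
    kept, best = [], -np.inf
    for j in order:
        trial = kept + [j]
        score = cross_val_score(DecisionTreeClassifier(random_state=0),
                                X[:, trial], y, cv=3).mean()
        if score >= best:          # the indicator did not decline: keep the feature
            kept, best = trial, score
    return kept                    # plays the role of F*

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(120, 10))
    y = (X[:, 0] + 0.5 * X[:, 3] > 0).astype(int)
    print(forward_select(X, y, order=list(range(10))))
```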

Name: BMFT (building the max feature tree)
Input: correlation coefficient matrix P (n × n), where n is the number of features
Steps:
(1) Initialize the root: T = {1}.
(2) Set the elements on the main diagonal of P to −1, to eliminate the influence of each feature on itself.
(3) while |T| ≤ 2n + 1 do   /* since P describes a strongly connected graph and T records the adjacency relationships of the elements in P, T holds 2n + 1 elements, including the initial node */
(4)   D = P(1 : n, T)   /* D records the correlation coefficients of the neighboring nodes of the nodes in T (D is a column vector mapped from T to P) */
(5)   D(T, 1 : n) = −1   /* the nodes already visited are recorded in T; this operation is equivalent to deleting the visited entries from D */
(6)   row_id = FindIndex[max(D)]   /* find the maximum value among all nodes adjacent to the visited set and record its row index */
(7)   T = T ∪ {T(end), row_id}   /* T(end) and row_id record an adjacency relationship, i.e., T(end) and row_id are adjacent nodes */
(8) end while
(9) return T
Output: maximum feature tree T

ALGORITHM 2: BMFT.
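A compact way to read Algorithm 2 is as a Prim-style construction of a maximum spanning tree over the feature correlation matrix: starting from one feature, repeatedly attach the unvisited feature with the largest correlation to any feature already in the tree. The sketch below follows that reading (it is not the authors' code; the returned edge list plays the role of the adjacency record T):

```python
# Prim-style sketch of BMFT (my reading of Algorithm 2, not the authors' code):
# grow a maximum spanning tree over the feature correlation matrix by always
# attaching the unvisited feature with the largest correlation to a visited one.
import numpy as np

def bmft(P):
    n = P.shape[0]
    P = P.copy()
    np.fill_diagonal(P, -1.0)          # step (2): remove each feature's self-correlation
    visited, edges = [0], []           # step (1): start the tree at the first feature
    while len(visited) < n:
        sub = P[visited, :].copy()     # correlations from tree nodes to all features
        sub[:, visited] = -1.0         # ignore features that are already in the tree
        i, j = np.unravel_index(np.argmax(sub), sub.shape)
        edges.append((visited[i], int(j)))   # record the adjacency relationship
        visited.append(int(j))
    return edges                       # edge list playing the role of T

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(200, 6))
    X[:, 5] = X[:, 0] + 0.05 * rng.normal(size=200)   # a strongly redundant feature
    print(bmft(np.corrcoef(X, rowvar=False)))
```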


3.2.4. Experiment of MICV Results and Analysis. In this section, the proposed MICV is tested on the Birds dataset and the Crane dataset. The results in Figures 7 and 8 show that, for the same number of selected features, the Kappa value of the MICV method is generally higher than that of the other methods. As the number of features increases, the Kappa value of the MICV method converges earlier and remains relatively stable compared with the other methods. MICV is therefore more effective than the other feature evaluation methods.

Tables 4 and 5 record the best classification results (Kappa, accuracy, and F1 scores) for each feature scoring sequence, as well as the number of features used to obtain each value. In each row, the bold value on the left side of "|" indicates that the method uses the fewest features among all methods, and the bold value on the right indicates that the method achieves the highest score on the evaluation indicator. Table 4 shows that, on the Birds dataset, the MICV method obtained the highest Kappa value under all four classifiers. With the J48, NB, and RFs classifiers, the MICV method used the fewest features and achieved the highest evaluation scores in most cases. As shown in Table 5, the performance of MICV with the J48, NB, and RFs classifiers is also significant.

In summary, the MICV method is more effective in selecting optimal features than the other seven methods. The method can also achieve a good modeling effect using a lower feature dimension.

3.3. Experiment of MICV-ERMFT Feature Selection. In the second part of the experiment, features are evaluated using CS and six other Weka methods, including Cor, GR, IG, OR, RF, and SU.

Name: ERFTN (Eliminate Redundant Features based on Two Neighborhoods)
Input: T, the max feature tree built by algorithm BMFT; F, the features sorted by the MICV method
Steps:
(1) Get the first element x in F.
(2) V = {y | y ∈ T, y is an adjacent vertex of x}.
(3) Update F by deleting all vertices in V, that is, F = F \ V.
(4) Choose the next unvisited element as x.
(5) Repeat (2)–(4) until all the elements in F have been visited.
(6) Output F as the final feature subset.
Output: F

ALGORITHM 3: ERFTN.
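Under the same caveat, Algorithm 3 can be sketched as follows: walk the MICV-sorted feature list and, whenever a feature is kept, mark its neighbours in the maximum feature tree as redundant and skip them later. The edge list and the ordering below are hypothetical examples, not data from the paper.

```python
# Sketch of ERFTN (my reading of Algorithm 3, not the authors' code): walk the
# MICV-sorted features and, for every feature that is kept, drop its neighbours
# in the maximum feature tree as redundant. `edges` and `order` are hypothetical.
from collections import defaultdict

def erftn(edges, order):
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    removed, kept = set(), []
    for x in order:                    # order = features sorted by the MICV method
        if x in removed:
            continue
        kept.append(x)
        removed |= adj[x]              # eliminate the kept feature's tree neighbours
    return kept

if __name__ == "__main__":
    edges = [(0, 5), (0, 2), (2, 4), (4, 1), (1, 3)]   # hypothetical feature tree
    print(erftn(edges, order=[3, 0, 1, 2, 4, 5]))       # -> [3, 0, 4]
```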

[Figure 6 (plots not reproduced): Experimental results of the MIEC, CVAC, and MICV feature selection methods, accuracy (%) versus number of features. (a) Birds dataset; (b) Crane dataset.]



3.3.1. Procedure of Experiment. The procedure is demonstrated in Figure 9. Eight different methods (MICV and the seven other methods mentioned above) are used to evaluate each feature's contribution to classification and to score the features. After sorting the features in ascending order based on the scores, the ERMFT method is then used to eliminate redundant features, resulting in a feature subset F′. F′ is then mapped to the dataset, resulting in Dataset′. J48, SVM, BayesNet (NB), and Random Forests (RFs) are the experiment's classifiers. Each dataset is divided into a 70% training set and a 30% test set, each experiment is repeated ten times, and the average Kappa is calculated. In addition, the DRR (Dimensionality Reduction Rate) is introduced as an evaluation indicator.
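A minimal sketch of this evaluation protocol, under the assumption that a feature subset has already been selected (RandomForestClassifier stands in for one of the four classifiers; none of this is the authors' code):

```python
# Minimal sketch of the evaluation protocol (not the authors' code): ten random
# 70/30 splits, average Cohen's Kappa on a given feature subset, plus its DRR.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import cohen_kappa_score
from sklearn.model_selection import train_test_split

def evaluate_subset(X, y, selected, n_repeats=10):
    kappas = []
    for seed in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X[:, selected], y, test_size=0.3, random_state=seed, stratify=y)
        clf = RandomForestClassifier(random_state=seed).fit(X_tr, y_tr)
        kappas.append(cohen_kappa_score(y_te, clf.predict(X_te)))
    drr = (1 - len(selected) / X.shape[1]) * 100   # DRR, see equation (16) below
    return float(np.mean(kappas)), drr

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    X = rng.normal(size=(300, 20))
    y = (X[:, 0] - X[:, 7] > 0).astype(int)
    print(evaluate_subset(X, y, selected=[0, 7, 11]))
```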

[Figure 7 (plots not reproduced): Experimental results of MICV and the other 7 feature evaluation methods with different classifiers on the Birds dataset, Kappa versus number of features. (a) J48; (b) SVM; (c) RFs; (d) NB.]


[Figure 8 (plots not reproduced): Experimental results of MICV and the other 7 feature evaluation methods with different classifiers on the Crane dataset, Kappa versus number of features. (a) J48; (b) SVM; (c) RFs; (d) NB.]

Table 1: Bird sound dataset information.

Latin name                English name        Genus           Number of samples   Rate (%)
Phalacrocorax carbo       Great cormorant     Phalacrocorax   36                  8.31
Numenius phaeopus         Whimbrel            Numenius        90                  20.79
Aegithina nigrolutea      White-tailed iora   Aegithina       120                 27.72
Chrysolophus amherstiae   Lady Amherst's      Chrysolophus    68                  15.70
Falco tinnunculus         Common kestrel      Falco           61                  14.09
Tadorna ferruginea        Ruddy shelduck      Tadorna         58                  13.39


Table 2: Crane sound dataset information.

Latin name         English name        Genus   Number of samples   Rate (%)
Grus vipio         White-naped crane   Grus    24                  7.00
Grus canadensis    Sandhill crane      Grus    39                  11.37
Grus virgo         Demoiselle crane    Grus    60                  17.49
Grus grus          Common crane        Grus    62                  18.08
Grus monacha       Hooded crane        Grus    62                  18.08
Grus japonensis    Red-crowned crane   Grus    29                  8.45
Grus nigricollis   Tibetan crane       Grus    67                  19.53

[Figure 9 (diagram not reproduced): Flowchart of the experiment of MICV-ERMFT feature selection. The dataset's features are scored by MICV, Cor, GR, IG, OR, RF, SU, or CS and sorted into a scoring sequence F (ascending for MICV, descending for the other methods); the Pearson product-moment correlation coefficient matrix is passed to BMFT to build the tree T′, ERFTN then produces the feature subset F′ (BMFT plus ERFTN form ERMFT), and F′ is mapped back to the dataset to obtain Dataset′.]

[Figure 10, parts (a) and (b) (plots not reproduced): Kappa per classifier and a heat map of the numbers of selected features on the Birds dataset; the full caption follows parts (c) and (d).]


\[ \mathrm{DRR} = \left( 1 - \frac{F_n'}{F_n} \right) \times 100\% \tag{16} \]

In equation (16), F_n' is the number of selected features and F_n is the total number of features of each dataset. The larger the DRR value, the stronger the ability to reduce dimensions.
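As a purely illustrative arithmetic example (the numbers are not taken from the paper), if a method keeps 35 out of 75 features, then DRR = (1 − 35/75) × 100% ≈ 53.3%, i.e., roughly half of the original dimensions are removed.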

3.3.2. Experiment of MICV-ERMFT Results and Analysis. Figure 10 shows the experimental results obtained from four different classifiers using the eight feature evaluation methods combined with ERMFT. Figures 10(a) and 10(b) show the results on the Birds dataset; Figures 10(c) and 10(d) show the results on the Crane dataset. In Figure 10(a), the four groups of bars represent the results under the four classifiers, and the nine bars in each group are the Kappa values obtained by the eight methods with ERMFT and by the original data (ORI). The heat map in Figure 10(b) shows the number of features selected by each method when its Kappa value is reached; Figures 10(c) and 10(d) are organized in the same way. Figure 10(a) clearly shows that the MICV-ERMFT method has a slightly higher Kappa than the other methods, and the effect with the J48 classifier in Figure 10(c) is even more pronounced. Moreover, the Kappa of the MICV-ERMFT method is higher than that of the original data. Looking at Figures 10(a) and 10(b) at the same time, it is evident that the MICV-ERMFT method achieves a good modeling effect using a small number of features compared with the other methods. Figures 10(c) and 10(d) show a similar result.

In conclusion, compared with the other seven methods, the MICV-ERMFT method demonstrates good abilities in dimensionality reduction and feature interpretation.

Combining Figures 8(b) and 8(d) with Table 6, it is obvious that the MICV-ERMFT method has a significant dimensionality reduction effect and model performance effect for both the Birds dataset and the Crane dataset. In Table 6, the Kappa value and DRR are very good for the J48, NB, and SVM classifiers on the Birds dataset. In particular, for the NB classifier, none of the other seven comparison methods' Kappa values exceeds that of ORI, while the MICV-ERMFT method exceeds it by 0.04. On the Crane dataset, MICV-ERMFT outperforms the other methods. Table 7 shows the running time cost by the MICV-ERMFT method and the other seven feature selection methods; it is no more time-consuming than the other methods.

[Figure 10, parts (c) and (d) (plots not reproduced): Kappa per classifier and a heat map of the numbers of selected features on the Crane dataset.]

Figure 10: Experimental results of the MICV-ERMFT method. (a) Kappa obtained with different classifiers using different feature evaluation methods on the Birds dataset; (b) heat map of the numbers of features selected in (a); (c) Kappa obtained with different classifiers using different feature evaluation methods on the Crane dataset; (d) heat map of the numbers of features selected in (c).

Table 3: The ratio of the number of selected features for different values of λ.

Dataset         λ = 0.1   0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9
Birds dataset   0.293     0.293   0.466   0.453   0.493   0.440   0.480   0.666   0.560
Crane dataset   0.240     0.360   0.160   0.333   0.706   0.506   0.826   0.933   0.866


Table 4: Comparison of Kappa, accuracy, and F1 scores with feature selection methods on the Birds dataset. Each cell gives the number of features | the highest value; accuracy in %.

Indicator   Classifier   Cor         GR          IG          OR          RF          SU          CS          MICV
Kappa       J48          72 | 0.85   60 | 0.85   59 | 0.86   57 | 0.86   73 | 0.83   68 | 0.84   51 | 0.84   38 | 0.88
Kappa       NB           68 | 0.79   37 | 0.86   36 | 0.87   39 | 0.86   70 | 0.77   36 | 0.87   58 | 0.79   46 | 0.88
Kappa       SVM          63 | 0.93   44 | 0.93   52 | 0.93   61 | 0.93   64 | 0.95   46 | 0.93   65 | 0.95   51 | 0.95
Kappa       RFs          71 | 0.97   50 | 0.97   36 | 0.97   73 | 0.97   72 | 0.95   70 | 0.97   53 | 0.96   30 | 0.97
Accuracy    J48          72 | 88.18  60 | 88.18  59 | 89.09  57 | 89.09  73 | 86.36  65 | 87.27  52 | 87.27  36 | 90.96
Accuracy    NB           64 | 83.63  37 | 89.09  36 | 88.18  39 | 89.09  70 | 81.81  36 | 90.00  58 | 83.63  34 | 90.90
Accuracy    SVM          63 | 94.54  44 | 94.45  41 | 94.00  61 | 94.45  64 | 96.36  46 | 94.45  65 | 96.36  51 | 93.63
Accuracy    RFs          71 | 98.12  50 | 98.12  36 | 98.12  73 | 98.18  72 | 96.36  70 | 98.12  53 | 97.27  30 | 98.12
F1 score    J48          72 | 0.88   60 | 0.88   59 | 0.89   57 | 0.89   73 | 0.86   65 | 0.87   51 | 0.87   38 | 0.87
F1 score    NB           64 | 0.83   37 | 0.89   36 | 0.89   39 | 0.89   70 | 0.81   37 | 0.90   58 | 0.83   34 | 0.90
F1 score    SVM          63 | 0.94   49 | 0.94   41 | 0.94   61 | 0.94   64 | 0.96   46 | 0.94   65 | 0.96   51 | 0.93
F1 score    RFs          71 | 0.96   50 | 0.98   55 | 0.98   73 | 0.98   73 | 0.96   70 | 0.98   54 | 0.97   30 | 0.98

Table 6: MICV-ERMFT compared with the other methods in terms of Kappa and DRR. Each cell gives Kappa | DRR (%).

Dataset   Classifier   Cor         GR          IG          OR          RF          SU          CS          MICV        ORI
Birds     J48          0.83 | 52   0.88 | 44   0.83 | 53   0.85 | 53   0.81 | 49   0.86 | 53   0.79 | 50   0.87 | 58   0.83 | 0
Birds     NB           0.76 | 40   0.74 | 49   0.77 | 53   0.75 | 53   0.74 | 44   0.76 | 53   0.76 | 53   0.81 | 53   0.77 | 0
Birds     SVM          0.92 | 52   0.92 | 44   0.89 | 53   0.87 | 44   0.92 | 53   0.88 | 53   0.92 | 53   0.93 | 57   0.92 | 0
Birds     RFs          0.95 | 44   0.95 | 52   0.95 | 40   0.93 | 49   0.96 | 53   0.94 | 49   0.94 | 53   0.93 | 50   0.93 | 0
Crane     J48          0.64 | 44   0.74 | 50   0.63 | 50   0.70 | 53   0.63 | 40   0.62 | 50   0.63 | 53   0.76 | 53   0.70 | 0
Crane     NB           0.73 | 50   0.75 | 52   0.71 | 44   0.73 | 53   0.72 | 50   0.71 | 50   0.69 | 46   0.76 | 53   0.75 | 0
Crane     SVM          0.83 | 42   0.78 | 52   0.73 | 50   0.82 | 53   0.77 | 50   0.73 | 50   0.84 | 53   0.84 | 52   0.84 | 0
Crane     RFs          0.85 | 50   0.83 | 40   0.83 | 50   0.84 | 53   0.84 | 50   0.83 | 42   0.84 | 50   0.88 | 53   0.88 | 0

Table 5: Comparison of Kappa, accuracy, and F1 scores with feature selection methods on the Crane dataset. Each cell gives the number of features | the highest value; accuracy in %.

Indicator   Classifier   Cor         GR          IG          OR          RF          SU          CS          MICV
Kappa       J48          73 | 0.72   69 | 0.74   71 | 0.68   70 | 0.72   68 | 0.71   71 | 0.69   22 | 0.68   25 | 0.75
Kappa       NB           73 | 0.75   69 | 0.75   73 | 0.75   73 | 0.75   71 | 0.75   73 | 0.75   53 | 0.79   43 | 0.79
Kappa       SVM          73 | 0.84   69 | 0.84   73 | 0.84   73 | 0.84   72 | 0.86   73 | 0.84   73 | 0.84   69 | 0.84
Kappa       RFs          66 | 0.89   51 | 0.88   69 | 0.89   73 | 0.89   68 | 0.89   63 | 0.88   36 | 0.90   41 | 0.90
Accuracy    J48          73 | 77.00  69 | 78.00  71 | 73.00  70 | 77.00  68 | 76.00  71 | 73.00  22 | 73.00  18 | 79.00
Accuracy    NB           73 | 79.00  69 | 79.00  73 | 79.00  73 | 79.00  71 | 79.00  73 | 79.00  53 | 81.00  43 | 83.00
Accuracy    SVM          72 | 87.00  68 | 87.00  73 | 87.00  73 | 87.00  72 | 89.00  73 | 87.00  73 | 87.00  69 | 87.00
Accuracy    RFs          66 | 91.00  51 | 90.00  69 | 91.00  73 | 91.00  58 | 91.00  63 | 90.00  36 | 90.00  41 | 91.00
F1 score    J48          73 | 0.77   69 | 0.78   71 | 0.73   70 | 0.77   67 | 0.77   71 | 0.73   25 | 0.73   25 | 0.79
F1 score    NB           73 | 0.79   69 | 0.79   73 | 0.79   73 | 0.79   73 | 0.79   73 | 0.79   53 | 0.81   43 | 0.82
F1 score    SVM          72 | 0.86   68 | 0.87   73 | 0.86   73 | 0.86   72 | 0.88   73 | 0.86   73 | 0.87   69 | 0.86
F1 score    RFs          69 | 0.91   51 | 0.89   69 | 0.91   73 | 0.90   68 | 0.90   66 | 0.90   72 | 0.91   41 | 0.91

Table 7: The time used by the different feature evaluation methods (values as given in the source).

Dataset   Classifier   Cor      GR       IG       OR       RF       SU       CS       MICV     ORI
Birds     J48          21035    18026    27100    45227    23204    27069    23268    21074    31728
Birds     NB           13208    22001    28666    62036    12039    16001    18569    18257    35810
Birds     SVM          21580    31500    42689    36028    56244    36028    20789    25104    51568
Birds     RFs          31829    42626    51698    27853    36952    41524    61236    21568    81732
Crane     J48          19326    23825    28624    22596    32632    41069    26547    11638    51629
Crane     NB           18624    16527    16549    39326    43829    52806    41026    19628    46258
Crane     SVM          63426    71869    65826    63429    53440    33651    46458    30824    73496
Crane     RFs          49637    53746    60689    40547    41968    31906    66504    31869    83048


In the experiments on the Birds dataset and the Crane dataset, the Kappa metrics obtained with different classifiers using the MICV-ERMFT method are generally superior to those of the other methods. The MICV-ERMFT method remains excellent for the most part and is more stable than the other methods, although other methods surpass it with some classifiers. Besides, the MICV-ERMFT method improves the Kappa value compared to the original data; although the improvement is minimal in some cases, the MICV-ERMFT method uses only about half of the features of the original data.

In conclusion, MICV-ERMFT performs better in terms of both dimensionality reduction and model performance improvement.

4. Conclusion

Feature selection is an important preprocessing step in data mining and classification. In recent years, researchers have focused on feature contribution evaluation and redundancy reduction, and different optimization algorithms have been proposed to address this problem. In this paper, we measure the contribution of features to the classification from the perspective of probability. Combined with the maximum feature tree to remove redundancy, the MICV-ERMFT method is proposed to select the optimal features and is applied to the automatic recognition of bird sounds.

To verify the MICV-ERMFT method's effectiveness in automatic bird sound recognition, two datasets are used in the experiments: data of different genera (Birds dataset) and data of the same genus (Crane dataset). The results of the experiments show that the Kappa indicator on the Birds dataset reaches 0.93 with a dimensionality reduction rate of 57%, and the Kappa value on the Crane dataset is 0.88 with a dimensionality reduction rate of 53%; good results were obtained in both cases.

This study shows that the proposed MICV-ERMFT feature selection method is effective. The bird audio selected in this paper was noise filtered, and further research should test the method's performance together with a denoising method. We will continue to explore the performance of MICV-ERMFT on datasets with larger numbers of features and instances.

Data Availability

All the data included in this study are available upon request by contacting the corresponding author.

Disclosure

The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This research was funded by the National Natural Science Foundation of China under Grants nos. 61462078, 31960142, and 31860332.

References

[1] C. A. Ruiz-Martinez, M. T. Akhtar, Y. Washizawa, and E. Escamilla-Hernandez, "On investigating efficient methodology for environmental sound recognition," in Proceedings of the ISPACS 2013 - 2013 International Symposium on Intelligent Signal Processing and Communication Systems, pp. 210-214, Naha, Japan, November 2013.

[2] P. Jancovic and M. Kokuer, "Bird species recognition using unsupervised modeling of individual vocalization elements," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 5, pp. 932-947, 2019.

[3] A. D. P. Ramirez, J. I. De La Rosa Vargas, R. R. Valdez, and A. Becerra, "A comparative between mel frequency cepstral coefficients (MFCC) and inverse mel frequency cepstral coefficients (IMFCC) features for an automatic bird species recognition system," in Proceedings of the 2018 IEEE Latin American Conference on Computational Intelligence (LA-CCI), pp. 1-4, Guadalajara, Mexico, November 2018.

[4] D. Griffin and J. Jae Lim, "Signal estimation from modified short-time Fourier transform," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 32, no. 2, pp. 236-243, 1984.

[5] S. Kadambe and G. F. Boudreaux-Bartels, "Application of the wavelet transform for pitch detection of speech signals," IEEE Transactions on Information Theory, vol. 38, no. 2, pp. 917-924, 1992.

[6] E. Tsau, S.-H. Kim, and C.-C. J. Kuo, "Environmental sound recognition with CELP-based features," in Proceedings of the ISSCS 2011 - International Symposium on Signals, Circuits and Systems, pp. 1-4, Iasi, Romania, July 2011.

[7] C. Collberg, C. Thomborson, and D. Low, "A taxonomy of obfuscating transformations," Technical Report 148, The University of Auckland, Auckland, New Zealand, 1997.

[8] S. Garcia, J. Luengo, and F. Herrera, "Feature selection," Intelligent Systems Reference Library, vol. 72, pp. 163-193, 2015.

[9] V. Kumar and S. Minz, "Feature selection: a literature review," Smart Computing Review, vol. 4, 2014.

[10] Y. Zhang, Q. Wang, D.-W. Gong, and X.-F. Song, "Nonnegative Laplacian embedding guided subspace learning for unsupervised feature selection," Pattern Recognition, vol. 93, pp. 337-352, 2019.

[11] S. Zhao, Y. Zhang, H. Xu, and T. Han, "Ensemble classification based on feature selection for environmental sound recognition," Mathematical Problems in Engineering, vol. 2019, Article ID 4318463, 7 pages, 2019.

[12] S. H. Zhang, Z. Zhao, Z. Y. Xu, K. Bellisario, and B. C. Pijanowski, "Automatic bird vocalization identification based on fusion of spectral pattern and texture features," in Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271-275, Calgary, Canada, April 2018.


[13] A. V. Bang and P. P. Rege, "Recognition of bird species from their sounds using data reduction techniques," in Proceedings of the 7th International Conference on Computer and Communication Technology, pp. 111-116, Allahabad, India, November 2017.

[14] M. Mafarja, I. Aljarah, A. A. Heidari et al., "Binary dragonfly optimization for feature selection using time-varying transfer functions," Knowledge-Based Systems, vol. 161, pp. 185-204, 2018.

[15] Q. Wu, Z. Ma, J. Fan, G. Xu, and Y. Shen, "A feature selection method based on hybrid improved binary quantum particle swarm optimization," IEEE Access, vol. 7, pp. 80588-80601, 2019.

[16] H. W. Wang, Y. Meng, P. Yin, and J. Hua, "A model-driven method for quality reviews detection: an ensemble model of feature selection," in Proceedings of the Fifteenth Wuhan International Conference on E-Business, pp. 573-581, Wuhan, China, 2016.

[17] H. Rao, X. Shi, A. K. Rodrigue et al., "Feature selection based on artificial bee colony and gradient boosting decision tree," Applied Soft Computing, vol. 74, pp. 634-642, 2019.

[18] D. A. A. Gnana, "Literature review on feature selection methods for high-dimensional data," International Journal of Computer Applications, vol. 136, no. 1, pp. 9-17, 2016.

[19] G. I. Sayed, A. Darwish, and A. E. Hassanien, "A new chaotic whale optimization algorithm for features selection," Journal of Classification, vol. 35, no. 2, pp. 300-344, 2018.

[20] A. E. Hegazy, M. A. Makhlouf, and G. S. El-Tawel, "Improved salp swarm algorithm for feature selection," Journal of King Saud University - Computer and Information Sciences, vol. 32, no. 3, pp. 335-344, 2020.

[21] M. Khamees, A. Albakry, and K. Shaker, "Multi-objective feature selection: hybrid of salp swarm and simulated annealing approach," in Proceedings of the International Conference on New Trends in Information and Communications Technology Applications, pp. 129-142, Baghdad, Iraq, January 2018.

[22] M. Sadeghi and H. Marvi, "Optimal MFCC features extraction by differential evolution algorithm for speaker recognition," in Proceedings of the 2017 3rd Iranian Conference on Intelligent Systems and Signal Processing (ICSPIS), pp. 169-173, Shahrood, Iran, December 2017.

[23] A. V. Bang and P. P. Rege, "Automatic recognition of bird species using human factor cepstral coefficients," Smart Computing and Informatics, vol. 77, pp. 363-373, 2018.

[24] R. H. D. Zottesso, Y. M. G. Costa, D. Bertolini, and L. E. S. Oliveira, "Bird species identification using spectrogram and dissimilarity approach," Ecological Informatics, vol. 48, pp. 187-197, 2018.

[25] J. Stastny, M. Munk, and L. Juranek, "Automatic bird species recognition based on birds vocalization," EURASIP Journal on Audio, Speech, and Music Processing, vol. 2018, no. 1, pp. 1-7, 2018.

[26] S. Fagerlund, Automatic Recognition of Bird Species by Their Sounds, Helsinki University of Technology, Espoo, Finland, 2004.

[27] L. Ptacek, Birds Individual Automatic Recognition, Ph.D. thesis, University of West Bohemia, Pilsen, Czechia, 2012.

[28] A. B. Labao, M. A. Clutario, and P. C. Naval, "Classification of bird sounds using codebook features," in Proceedings of the Asian Conference on Intelligent Information and Database Systems, pp. 223-233, Dong Hoi City, Vietnam, March 2018.

[29] D. Lepage, "Avibase - the world bird database," 2020, https://avibase.bsc-eoc.org/avibase.jsp.

[30] G. A. Pereira, "Xeno-canto - sharing birds sounds from around the world," 2003, https://www.xeno-canto.org.

[31] J. Stastny, V. Skorpil, and J. Fejfar, "Audio data classification by means of new algorithms," in Proceedings of the 2013 36th International Conference on Telecommunications and Signal Processing (TSP), pp. 507-511, Rome, Italy, July 2013.

[32] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, "The WEKA data mining software," ACM SIGKDD Explorations Newsletter, vol. 11, no. 1, pp. 10-18, 2009.


Page 7: Feature Selection Using Maximum Feature Tree Embedded with

324 Experiment of MICV Results and Analysis In thissection the proposed MICV is tested on the Birds datasetand the Crane dataset -e results of the experiment inFigures 7 and 8 show that at the same number of selectedfeatures the Kappa value of the MICV method is basicallyhigher than that of other methods As the number of featuresincreases the Kappa value of the MICV method can con-verge earlier and remains relatively stable compared withother methods MICV is more effective compared with theresults of other feature evaluation methods

Tables 4 and 5 record the best classification results(Kappa accuracy and F1 scores) for each feature scoringsequence as well as the number of features used to obtainthis value -e bold one on the left side of ldquo|rdquo in each row inthe table indicates that the method has the least number offeatures than other methods and the bold on the right

indicates that the method has the highest evaluation indi-cator score Table 4 shows that in bird dataset MICVmethods had the highest Kappa value under four differentclassifiers In J48 NB and RFs classifiers MICV methodshad the lowest number of features and the highest score ofevaluation indicators in most cases As shown in Table 5 theperformances of MICV in J48 NB and RFs classifiers aresignificant

In summary the MICV method is more effective inselecting optimal features than the other seven methods-emethod can also get a good modeling effect by using a lowerdimension

33 Experiment of MICV-ERMFT Feature Selection In thesecond part of the experiment features are evaluated using CS

Name ERFTN (Eliminate Redundant Features based on Two Neighborhoods)Input T Max feature Tree by Algorithm BMFT F Features sorted with MICV methodStep

(1) Get the first element x in F(2) V y|y isin T1113864 y is the adjacent vertices of x (3) UpdateF by deleting all vertices in V that is F FV(4) Choose the next unvisited element as x(5) Repeat (2) to (4) until all the elements in F are visited(6) Output F as the final feature subset

Output F

ALGORITHM 3 ERFTN

Number of features

45

50

55

60

65

70

75

80

85

90

Acc (

)

Birds dataset

CVACMIECMICV

0 10 20 30 40 50 60 70 80

(a)

Number of features

Acc (

)

CVACMIECMICV

45

50

55

60

65

70

75

80Crane dataset

0 10 20 30 40 50 60 70 80

(b)

Figure 6 Experimental results of MIEC CVAC and MICV feature selection methods (a) Birds dataset and (b) Crane dataset

Mathematical Problems in Engineering 7

and six other Weka methods including Cor GR IG OR RFand SU

331 Procedure of Experiment -e procedure is demon-strated in Figure 9 Eight different methods (MICV and theseven other methods mentioned above) are used to evaluateeach featurersquos classification contribution and score thefeatures After sorting the features in an ascending order

based on the scores the ERMFT method is then used toeliminate redundant features resulting in a feature subset FprimeFprime is thenmapped to Dataset resulting in Datasetrsquo J48 SVMBayesNet (NB) and Random Forests (RFs) are the experi-mentrsquos classifiers For each independent dataset it is dividedinto 70 training set and 30 test set Each experiment isrepeated ten times and the average Kappa is calculatedAlso the DRR (Dimensionality Reduction Rate) as anevaluation indicator is introduced

Number of features

Kapp

aBirds dataset

CorGRIGOR

RFSUCSMICV

1

09

08

07

06

05

04

03

02

01

00 10 20 30 40 50 60 70 80

(a)

Birds dataset

Kapp

a

1

09

08

07

06

05

04

03

02

01

0

Number of features

CorGRIGOR

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(b)

Birds dataset

Kapp

a

1

09

08

07

06

05

04

03

02

01

0

Number of features

CorGRIGOR

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(c)

Birds dataset

Kapp

a

1

09

08

07

06

05

04

03

02

01

0

Number of features

CorGRIGOR

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(d)

Figure 7 Experimental results of MICV and other 7 feature evaluation methods in different classifiers (Birds dataset) (a) J48 (b) SVM (c)RFs (d) NB

8 Mathematical Problems in Engineering

Crane dataset

Number of features

Kapp

a

CorGRIGOR

RFSUCSMICV

1

09

08

07

06

05

04

03

02

01

00 10 20 30 40 50 60 70 80

(a)

Crane dataset

CorGRIGOR

Kapp

a

09

08

07

06

05

04

03

02

01

ndash01

0

Number of features

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(b)

Crane dataset

Kapp

a

09

08

07

06

05

04

03

02

01

Number of features

CorGRIGOR

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(c)

Crane dataset

Kapp

a

08

07

06

05

04

03

02

01

ndash01

0

CorGRIGOR

Number of features

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(d)

Figure 8 Experimental results of MICV and other 7 feature evaluation methods in different classifiers (Crane dataset) (a) J48 (b) SVM(c) RFs (d) NB

Table 1 Bird sound dataset information

Latin name Eng name Genus Number of samples RatePhalacrocorax carbo Great cormorant Phalacrocorax 36 831Numenius phaeopus Whimbrel Numenius 90 2079Aegithina nigrolutea White-tailed iora Aegithina 120 2772Chrysolophus amherstiae Lady Amherstrsquos Chrysolophus 68 1570Falco tinnunculus Common kestrel Falco 61 1409Tadorna ferruginea Ruddy shelduck Tadorna 58 1339

Mathematical Problems in Engineering 9

Table 2 Crane sounds dataset information

Latin name Eng name Genus Number of samples Rate ()Grus vipio White-naped crane

Grus

24 700Grus canadensis Sandhill crane 39 1137Grus virgo Demoiselle crane 60 1749Grus grus Common crane 62 1808Grus monacha Hooded crane 62 1808Grus japonensis Red-crowned crane 29 845Grus nigricollis Tibetan crane 67 1953

MICV

Cor

GR

IG

RF

OR

SU

CS

Feature score

Pearson product-moment correlation coefficient matrix T

Scoring sequence F

In addition to MICV ascending other methods

descending

Dataset

BMFTTprime

ERFTN

Feature subset Fprime

Datasetprime

Map Fprime to Dataset

ERMFT

Figure 9 Flowchart of experiment of MICV-ERMFT feature selection

Birds dataset

J48 RFs SVM NBClassifier

Kapp

a

CorGRIG

ORRFSU

CSMICVORI

1

09

08

07

06

05

04

03

02

01

0

(a)

ClassifierJ48 RFs SVM NB

Cor

GR

IG

OR

RF

SU

CS

MICV

Met

hod

of fe

atur

e sele

ctio

n

Birds dataset

36

35

35

38

35

37

36

38

35

38

35

37

36

35

35

38

35

37

38

35

35

35

35

35

42

31

42

45

42

32

45

42

32

34

36

38

40

42

44

(b)

Figure 10 Continued

10 Mathematical Problems in Engineering

DRR 1 minusFnprime

Fn

1113888 1113889lowast 100 (16)

In equation (16) Fnprime is the number of selected features

and Fn is the number of all features of each dataset -elarger the DRR value the stronger the ability to reducedimensions

332 Experiment of MICV-ERMFT Results and AnalysisFigure 10 shows the experimental results obtained fromfour different classifiers using eight different featureevaluation methods combined with ERMFTFigures 10(a) and 10(b) are the results of the Birds datasetFigures 10(c) and 10(d) are the results of the Cranedataset In Figure 10(a) four histograms represent theresults under the four classifiers and 9 elements in thegroup of histograms are Kappa values calculated from theeight methods with ERMFT and the original data (ORI)-e heat map of Figure 10(b) shows the number of se-lected features when the Kappa reaches a certain value ineach method and similarly so do Figures 10(c) and 10(d)In Figure 10(a) it can be clearly observed that the MICV-

ERMFT method has a slightly higher Kappa than othermethods and the J48 classifier in Figure 10(c) is moreeffective Besides the Kappa of the MICV-ERMFTmethod is higher than the original data Looking atFigures 10(a) and 10(b) at the same it is evident that theMICV-ERMFT method achieves a good modeling effectusing a small number of featuresrsquo time comparing withthe other methods Figures 10(c) and 10(d) show a similarresult

In conclusion compared with the other seven methodsthe MICV-ERMFT method demonstrates good abilities indimensionality reduction and feature interpretation

Combining Figures 8(b) and 8(d) with Table 6 it isobvious that the MICV-ERMFT method has a significantdimensionality reduction effect and model performanceeffect for the Birds dataset and the Crane dataset In Table 6Kappa value and DRR performance are very good for J48NB and SVM classmates on Birds dataset Particularly forthe NB classifier the other seven comparison methodsrsquoKappa value does not exceed ORI while the MICV-ERMFTmethod exceeds 04 In the Crane dataset the MICV-ERMFT outperforms other methods Table 7 shows the

Classifier

CorGRIG

ORRFSU

CSMICVORI

Crane dataset

J48 RFs SVM NB

Kapp

a09

08

07

06

05

04

03

02

01

0

(c)

Classifier

Met

hod

of fe

atur

e sele

ctio

n

J48 RFs SVM NB

Cor

GR

IG

OR

RF

SU

CS

MICV

Crane dataset

42

37

37

37

37

37

37

37

36

37

37

37

37

36

42

37

37

40

35

45

35

35

45

35

43

35

43

35

35

32

35

35

34

32

36

38

40

42

44

(d)

Figure 10 Experimental results of MICV-ERMFTmethod (a) Kappa obtained with different classifiers by using different feature evaluationmethods in Birds dataset (b) Heat map of selected feature in (a) (c) Kappa obtained with different classifiers by using different featureevaluation methods in Crane dataset (d) Heat map of selected features in (c)

Table 3 -e ratio of the number of selected features with different values of λ

Datasetλ

01 02 03 04 05 06 07 08 09Birds dataset 0293 0293 0466 0453 0493 0440 0480 0666 0560Crane dataset 0240 0360 0160 0333 0706 0506 0826 0933 0866

Mathematical Problems in Engineering 11

Table 4: Comparison of Kappa, accuracy, and F1 scores with feature selection methods on the Birds dataset (number of features | highest value).

Indicator     Classifier   Cor          GR           IG           OR           RF           SU           CS           MICV
Kappa         J48          72 | 0.85    60 | 0.85    59 | 0.86    57 | 0.86    73 | 0.83    68 | 0.84    51 | 0.84    38 | 0.88
              NB           68 | 0.79    37 | 0.86    36 | 0.87    39 | 0.86    70 | 0.77    36 | 0.87    58 | 0.79    46 | 0.88
              SVM          63 | 0.93    44 | 0.93    52 | 0.93    61 | 0.93    64 | 0.95    46 | 0.93    65 | 0.95    51 | 0.95
              RFs          71 | 0.97    50 | 0.97    36 | 0.97    73 | 0.97    72 | 0.95    70 | 0.97    53 | 0.96    30 | 0.97
Accuracy (%)  J48          72 | 88.18   60 | 88.18   59 | 89.09   57 | 89.09   73 | 86.36   65 | 87.27   52 | 87.27   36 | 90.96
              NB           64 | 83.63   37 | 89.09   36 | 88.18   39 | 89.09   70 | 81.81   36 | 90.00   58 | 83.63   34 | 90.90
              SVM          63 | 94.54   44 | 94.45   41 | 94.00   61 | 94.45   64 | 96.36   46 | 94.45   65 | 96.36   51 | 93.63
              RFs          71 | 98.12   50 | 98.12   36 | 98.12   73 | 98.18   72 | 96.36   70 | 98.12   53 | 97.27   30 | 98.12
F1 score      J48          72 | 0.88    60 | 0.88    59 | 0.89    57 | 0.89    73 | 0.86    65 | 0.87    51 | 0.87    38 | 0.87
              NB           64 | 0.83    37 | 0.89    36 | 0.89    39 | 0.89    70 | 0.81    37 | 0.90    58 | 0.83    34 | 0.90
              SVM          63 | 0.94    49 | 0.94    41 | 0.94    61 | 0.94    64 | 0.96    46 | 0.94    65 | 0.96    51 | 0.93
              RFs          71 | 0.96    50 | 0.98    55 | 0.98    73 | 0.98    73 | 0.96    70 | 0.98    54 | 0.97    30 | 0.98

Table 6: MICV-ERMFT compared with other methods in Kappa and DRR (Kappa | DRR (%)).

Dataset  Classifier  Cor         GR          IG          OR          RF          SU          CS          MICV        ORI
Birds    J48         0.83 | 52   0.88 | 44   0.83 | 53   0.85 | 53   0.81 | 49   0.86 | 53   0.79 | 50   0.87 | 58   0.83 | 0
         NB          0.76 | 40   0.74 | 49   0.77 | 53   0.75 | 53   0.74 | 44   0.76 | 53   0.76 | 53   0.81 | 53   0.77 | 0
         SVM         0.92 | 52   0.92 | 44   0.89 | 53   0.87 | 44   0.92 | 53   0.88 | 53   0.92 | 53   0.93 | 57   0.92 | 0
         RFs         0.95 | 44   0.95 | 52   0.95 | 40   0.93 | 49   0.96 | 53   0.94 | 49   0.94 | 53   0.93 | 50   0.93 | 0
Crane    J48         0.64 | 44   0.74 | 50   0.63 | 50   0.70 | 53   0.63 | 40   0.62 | 50   0.63 | 53   0.76 | 53   0.70 | 0
         NB          0.73 | 50   0.75 | 52   0.71 | 44   0.73 | 53   0.72 | 50   0.71 | 50   0.69 | 46   0.76 | 53   0.75 | 0
         SVM         0.83 | 42   0.78 | 52   0.73 | 50   0.82 | 53   0.77 | 50   0.73 | 50   0.84 | 53   0.84 | 52   0.84 | 0
         RFs         0.85 | 50   0.83 | 40   0.83 | 50   0.84 | 53   0.84 | 50   0.83 | 42   0.84 | 50   0.88 | 53   0.88 | 0

Table 5: Comparison of Kappa, accuracy, and F1 scores with feature selection methods on the Crane dataset (number of features | highest value).

Indicator     Classifier   Cor          GR           IG           OR           RF           SU           CS           MICV
Kappa         J48          73 | 0.72    69 | 0.74    71 | 0.68    70 | 0.72    68 | 0.71    71 | 0.69    22 | 0.68    25 | 0.75
              NB           73 | 0.75    69 | 0.75    73 | 0.75    73 | 0.75    71 | 0.75    73 | 0.75    53 | 0.79    43 | 0.79
              SVM          73 | 0.84    69 | 0.84    73 | 0.84    73 | 0.84    72 | 0.86    73 | 0.84    73 | 0.84    69 | 0.84
              RFs          66 | 0.89    51 | 0.88    69 | 0.89    73 | 0.89    68 | 0.89    63 | 0.88    36 | 0.90    41 | 0.90
Accuracy (%)  J48          73 | 77.00   69 | 78.00   71 | 73.00   70 | 77.00   68 | 76.00   71 | 73.00   22 | 73.00   18 | 79.00
              NB           73 | 79.00   69 | 79.00   73 | 79.00   73 | 79.00   71 | 79.00   73 | 79.00   53 | 81.00   43 | 83.00
              SVM          72 | 87.00   68 | 87.00   73 | 87.00   73 | 87.00   72 | 89.00   73 | 87.00   73 | 87.00   69 | 87.00
              RFs          66 | 91.00   51 | 90.00   69 | 91.00   73 | 91.00   58 | 91.00   63 | 90.00   36 | 90.00   41 | 91.00
F1 score      J48          73 | 0.77    69 | 0.78    71 | 0.73    70 | 0.77    67 | 0.77    71 | 0.73    25 | 0.73    25 | 0.79
              NB           73 | 0.79    69 | 0.79    73 | 0.79    73 | 0.79    73 | 0.79    73 | 0.79    53 | 0.81    43 | 0.82
              SVM          72 | 0.86    68 | 0.87    73 | 0.86    73 | 0.86    72 | 0.88    73 | 0.86    73 | 0.87    69 | 0.86
              RFs          69 | 0.91    51 | 0.89    69 | 0.91    73 | 0.90    68 | 0.90    66 | 0.90    72 | 0.91    41 | 0.91

Table 7: The time used by different feature evaluation methods.

Dataset  Classifier  Cor     GR      IG      OR      RF      SU      CS      MICV    ORI
Birds    J48         21035   18026   27100   45227   23204   27069   23268   21074   31728
         NB          13208   22001   28666   62036   12039   16001   18569   18257   35810
         SVM         21580   31500   42689   36028   56244   36028   20789   25104   51568
         RFs         31829   42626   51698   27853   36952   41524   61236   21568   81732
Crane    J48         19326   23825   28624   22596   32632   41069   26547   11638   51629
         NB          18624   16527   16549   39326   43829   52806   41026   19628   46258
         SVM         63426   71869   65826   63429   53440   33651   46458   30824   73496
         RFs         49637   53746   60689   40547   41968   31906   66504   31869   83048


Table 7 shows the running time cost of the MICV-ERMFT method and the other seven feature selection methods. Its time cost is not higher than that of the other methods.
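For the timing comparison in Table 7, a minimal sketch of how per-method running times could be measured is given below; `average_runtime` and `select_and_classify` are placeholder names for any of the scoring-plus-ERMFT-plus-classifier pipelines, not the authors' implementation.

```python
import time

def average_runtime(select_and_classify, X, y, repeats=10):
    """Average wall-clock time of a feature selection + classification run."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        select_and_classify(X, y)  # e.g. score features, apply ERMFT, train and test a classifier
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations)
```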

In the experiments on the Birds and Crane datasets, the Kappa metrics obtained with different classifiers using the MICV-ERMFT method are generally superior to those of the other methods. The MICV-ERMFT method remains excellent for the most part and is more stable than the other methods, although other methods surpass it with some classifiers. Besides, the MICV-ERMFT method improves the Kappa value compared to the original data. Although the improvement is minimal in some cases, the MICV-ERMFT method uses only about half of the features of the original data.

In conclusion, MICV-ERMFT performs better in both dimensionality reduction and model improvement.
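To make the evaluation protocol concrete, the sketch below evaluates a given feature subset with four classifiers over repeated 70/30 splits and averages Kappa, accuracy, and a weighted F1. It is only an illustration using scikit-learn stand-ins (DecisionTreeClassifier in place of J48 and GaussianNB in place of BayesNet); `evaluate_subset` and `selected` are hypothetical names, and the comparison methods used in the paper come from WEKA.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.metrics import cohen_kappa_score, accuracy_score, f1_score

def evaluate_subset(X, y, selected, repeats=10, test_size=0.3):
    """Average Kappa/accuracy/F1 of four classifiers on a selected feature subset."""
    classifiers = {
        "J48-like": DecisionTreeClassifier(),   # C4.5-style tree as a stand-in for J48
        "RFs": RandomForestClassifier(),
        "SVM": SVC(),
        "NB": GaussianNB(),                     # rough stand-in for BayesNet
    }
    results = {name: [] for name in classifiers}
    Xs = X[:, selected]                         # map the selected feature subset onto the dataset
    for seed in range(repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            Xs, y, test_size=test_size, stratify=y, random_state=seed)
        for name, clf in classifiers.items():
            y_pred = clf.fit(X_tr, y_tr).predict(X_te)
            results[name].append((
                cohen_kappa_score(y_te, y_pred),
                accuracy_score(y_te, y_pred),
                f1_score(y_te, y_pred, average="weighted"),
            ))
    return {name: np.mean(scores, axis=0) for name, scores in results.items()}
```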

4. Conclusion

Feature selection is an important preprocessing step in data mining and classification. In recent years, researchers have focused on feature contribution evaluation and redundancy reduction, and different optimization algorithms have been proposed to address this problem. In this paper, we measure the contribution of features to the classification from the perspective of probability. Combined with the maximum feature tree to remove redundancy, the MICV-ERMFT method is proposed to select the optimal features and is applied to the automatic recognition of bird sounds.

To verify the MICV-ERMFT method's effectiveness in automatic bird sound recognition, two datasets are used in the experiments: data of different genera (Birds dataset) and data of the same genus (Crane dataset). The experimental results show that the Kappa indicator on the Birds dataset reaches 0.93 with a dimension reduction rate of 57%; the Kappa value on the Crane dataset is 0.88 with a dimension reduction rate of 53%, and good results were obtained.

This study shows that the proposed MICV-ERMFT feature selection method is effective. The bird audio selected in this paper is noise filtered, and further research should test this method's performance combined with a denoising method. We will continue to explore the performance of MICV-ERMFT on datasets with larger numbers of features and instances.

Data Availability

All the data included in this study are available upon request by contacting the corresponding author.

Disclosure

The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This research was funded by the National Natural Science Foundation of China under Grants nos. 61462078, 31960142, and 31860332.

References

[1] C. A. Ruiz-Martinez, M. T. Akhtar, Y. Washizawa, and E. Escamilla-Hernandez, "On investigating efficient methodology for environmental sound recognition," in Proceedings of the ISPACS 2013 International Symposium on Intelligent Signal Processing and Communication Systems, pp. 210-214, Naha, Japan, November 2013.

[2] P. Jancovic and M. Kokuer, "Bird species recognition using unsupervised modeling of individual vocalization elements," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 5, pp. 932-947, 2019.

[3] A. D. P. Ramirez, J. I. De La Rosa Vargas, R. R. Valdez, and A. Becerra, "A comparative between mel frequency cepstral coefficients (MFCC) and inverse mel frequency cepstral coefficients (IMFCC) features for an automatic bird species recognition system," in Proceedings of the 2018 IEEE Latin American Conference on Computational Intelligence (LA-CCI), pp. 1-4, Guadalajara, Mexico, November 2018.

[4] D. Griffin and J. Jae Lim, "Signal estimation from modified short-time Fourier transform," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 32, no. 2, pp. 236-243, 1984.

[5] S. Kadambe and G. F. Boudreaux-Bartels, "Application of the wavelet transform for pitch detection of speech signals," IEEE Transactions on Information Theory, vol. 38, no. 2, pp. 917-924, 1992.

[6] E. Tsau, S.-H. Kim, and C.-C. J. Kuo, "Environmental sound recognition with CELP-based features," in Proceedings of the ISSCS 2011 International Symposium on Signals, Circuits and Systems, pp. 1-4, Iasi, Romania, July 2011.

[7] C. Collberg, C. Thomborson, and D. Low, "A taxonomy of obfuscating transformations," Technical Report 148, The University of Auckland, Auckland, New Zealand, 1997.

[8] S. García, J. Luengo, and F. Herrera, "Feature selection," Intelligent Systems Reference Library, vol. 72, pp. 163-193, 2015.

[9] V. Kumar and S. Minz, "Feature selection: a literature review," Smart Computing Review, vol. 4, 2014.

[10] Y. Zhang, Q. Wang, D.-W. Gong, and X.-F. Song, "Nonnegative Laplacian embedding guided subspace learning for unsupervised feature selection," Pattern Recognition, vol. 93, pp. 337-352, 2019.

[11] S. Zhao, Y. Zhang, H. Xu, and T. Han, "Ensemble classification based on feature selection for environmental sound recognition," Mathematical Problems in Engineering, vol. 2019, Article ID 4318463, 7 pages, 2019.

[12] S. H. Zhang, Z. Zhao, Z. Y. Xu, K. Bellisario, and B. C. Pijanowski, "Automatic bird vocalization identification based on fusion of spectral pattern and texture features," in Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271-275, Calgary, Canada, April 2018.


[13] A. V. Bang and P. P. Rege, "Recognition of bird species from their sounds using data reduction techniques," in Proceedings of the 7th International Conference on Computer and Communication Technology, pp. 111-116, Allahabad, India, November 2017.

[14] M. Mafarja, I. Aljarah, A. A. Heidari et al., "Binary dragonfly optimization for feature selection using time-varying transfer functions," Knowledge-Based Systems, vol. 161, pp. 185-204, 2018.

[15] Q. Wu, Z. Ma, J. Fan, G. Xu, and Y. Shen, "A feature selection method based on hybrid improved binary quantum particle swarm optimization," IEEE Access, vol. 7, pp. 80588-80601, 2019.

[16] H. W. Wang, Y. Meng, P. Yin, and J. Hua, "A model-driven method for quality reviews detection: an ensemble model of feature selection," in Proceedings of the Fifteenth Wuhan International Conference on E-Business, pp. 573-581, Wuhan, China, 2016.

[17] H. Rao, X. Shi, A. K. Rodrigue et al., "Feature selection based on artificial bee colony and gradient boosting decision tree," Applied Soft Computing, vol. 74, pp. 634-642, 2019.

[18] D. A. A. Gnana, "Literature review on feature selection methods for high-dimensional data," International Journal of Computer Applications, vol. 136, no. 1, pp. 9-17, 2016.

[19] G. I. Sayed, A. Darwish, and A. E. Hassanien, "A new chaotic whale optimization algorithm for features selection," Journal of Classification, vol. 35, no. 2, pp. 300-344, 2018.

[20] A. E. Hegazy, M. A. Makhlouf, and G. S. El-Tawel, "Improved salp swarm algorithm for feature selection," Journal of King Saud University - Computer and Information Sciences, vol. 32, no. 3, pp. 335-344, 2020.

[21] M. Khamees, A. Albakry, and K. Shaker, "Multi-objective feature selection: hybrid of salp swarm and simulated annealing approach," in Proceedings of the International Conference on New Trends in Information and Communications Technology Applications, pp. 129-142, Baghdad, Iraq, January 2018.

[22] M. Sadeghi and H. Marvi, "Optimal MFCC features extraction by differential evolution algorithm for speaker recognition," in Proceedings of the 2017 3rd Iranian Conference on Intelligent Systems and Signal Processing (ICSPIS), pp. 169-173, Shahrood, Iran, December 2017.

[23] A. V. Bang and P. P. Rege, "Automatic recognition of bird species using human factor cepstral coefficients," Smart Computing and Informatics, vol. 77, pp. 363-373, 2018.

[24] R. H. D. Zottesso, Y. M. G. Costa, D. Bertolini, and L. E. S. Oliveira, "Bird species identification using spectrogram and dissimilarity approach," Ecological Informatics, vol. 48, pp. 187-197, 2018.

[25] J. Stastny, M. Munk, and L. Juranek, "Automatic bird species recognition based on birds vocalization," EURASIP Journal on Audio, Speech, and Music Processing, vol. 2018, no. 1, pp. 1-7, 2018.

[26] S. Fagerlund, Automatic Recognition of Bird Species by Their Sounds, Helsinki University of Technology, Espoo, Finland, 2004.

[27] L. Ptacek, Birds Individual Automatic Recognition, Ph.D. thesis, University of West Bohemia, Pilsen, Czechia, 2012.

[28] A. B. Labao, M. A. Clutario, and P. C. Naval, "Classification of bird sounds using codebook features," in Proceedings of the Asian Conference on Intelligent Information and Database Systems, pp. 223-233, Dong Hoi City, Vietnam, March 2018.

[29] D. Lepage, "Avibase: the world bird database," 2020, https://avibase.bsc-eoc.org/avibase.jsp.

[30] G. A. Pereira, "Xeno-canto: sharing birds sounds from around the world," 2003, https://www.xeno-canto.org.

[31] J. Stastny, V. Skorpil, and J. Fejfar, "Audio data classification by means of new algorithms," in Proceedings of the 2013 36th International Conference on Telecommunications and Signal Processing (TSP), pp. 507-511, Rome, Italy, July 2013.

[32] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, "The WEKA data mining software," ACM SIGKDD Explorations Newsletter, vol. 11, no. 1, pp. 10-18, 2009.

14 Mathematical Problems in Engineering

Page 8: Feature Selection Using Maximum Feature Tree Embedded with

and six other Weka methods including Cor GR IG OR RFand SU

331 Procedure of Experiment -e procedure is demon-strated in Figure 9 Eight different methods (MICV and theseven other methods mentioned above) are used to evaluateeach featurersquos classification contribution and score thefeatures After sorting the features in an ascending order

based on the scores the ERMFT method is then used toeliminate redundant features resulting in a feature subset FprimeFprime is thenmapped to Dataset resulting in Datasetrsquo J48 SVMBayesNet (NB) and Random Forests (RFs) are the experi-mentrsquos classifiers For each independent dataset it is dividedinto 70 training set and 30 test set Each experiment isrepeated ten times and the average Kappa is calculatedAlso the DRR (Dimensionality Reduction Rate) as anevaluation indicator is introduced

Number of features

Kapp

aBirds dataset

CorGRIGOR

RFSUCSMICV

1

09

08

07

06

05

04

03

02

01

00 10 20 30 40 50 60 70 80

(a)

Birds dataset

Kapp

a

1

09

08

07

06

05

04

03

02

01

0

Number of features

CorGRIGOR

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(b)

Birds dataset

Kapp

a

1

09

08

07

06

05

04

03

02

01

0

Number of features

CorGRIGOR

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(c)

Birds dataset

Kapp

a

1

09

08

07

06

05

04

03

02

01

0

Number of features

CorGRIGOR

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(d)

Figure 7 Experimental results of MICV and other 7 feature evaluation methods in different classifiers (Birds dataset) (a) J48 (b) SVM (c)RFs (d) NB

8 Mathematical Problems in Engineering

Crane dataset

Number of features

Kapp

a

CorGRIGOR

RFSUCSMICV

1

09

08

07

06

05

04

03

02

01

00 10 20 30 40 50 60 70 80

(a)

Crane dataset

CorGRIGOR

Kapp

a

09

08

07

06

05

04

03

02

01

ndash01

0

Number of features

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(b)

Crane dataset

Kapp

a

09

08

07

06

05

04

03

02

01

Number of features

CorGRIGOR

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(c)

Crane dataset

Kapp

a

08

07

06

05

04

03

02

01

ndash01

0

CorGRIGOR

Number of features

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(d)

Figure 8 Experimental results of MICV and other 7 feature evaluation methods in different classifiers (Crane dataset) (a) J48 (b) SVM(c) RFs (d) NB

Table 1 Bird sound dataset information

Latin name Eng name Genus Number of samples RatePhalacrocorax carbo Great cormorant Phalacrocorax 36 831Numenius phaeopus Whimbrel Numenius 90 2079Aegithina nigrolutea White-tailed iora Aegithina 120 2772Chrysolophus amherstiae Lady Amherstrsquos Chrysolophus 68 1570Falco tinnunculus Common kestrel Falco 61 1409Tadorna ferruginea Ruddy shelduck Tadorna 58 1339

Mathematical Problems in Engineering 9

Table 2 Crane sounds dataset information

Latin name Eng name Genus Number of samples Rate ()Grus vipio White-naped crane

Grus

24 700Grus canadensis Sandhill crane 39 1137Grus virgo Demoiselle crane 60 1749Grus grus Common crane 62 1808Grus monacha Hooded crane 62 1808Grus japonensis Red-crowned crane 29 845Grus nigricollis Tibetan crane 67 1953

MICV

Cor

GR

IG

RF

OR

SU

CS

Feature score

Pearson product-moment correlation coefficient matrix T

Scoring sequence F

In addition to MICV ascending other methods

descending

Dataset

BMFTTprime

ERFTN

Feature subset Fprime

Datasetprime

Map Fprime to Dataset

ERMFT

Figure 9 Flowchart of experiment of MICV-ERMFT feature selection

Birds dataset

J48 RFs SVM NBClassifier

Kapp

a

CorGRIG

ORRFSU

CSMICVORI

1

09

08

07

06

05

04

03

02

01

0

(a)

ClassifierJ48 RFs SVM NB

Cor

GR

IG

OR

RF

SU

CS

MICV

Met

hod

of fe

atur

e sele

ctio

n

Birds dataset

36

35

35

38

35

37

36

38

35

38

35

37

36

35

35

38

35

37

38

35

35

35

35

35

42

31

42

45

42

32

45

42

32

34

36

38

40

42

44

(b)

Figure 10 Continued

10 Mathematical Problems in Engineering

DRR 1 minusFnprime

Fn

1113888 1113889lowast 100 (16)

In equation (16) Fnprime is the number of selected features

and Fn is the number of all features of each dataset -elarger the DRR value the stronger the ability to reducedimensions

332 Experiment of MICV-ERMFT Results and AnalysisFigure 10 shows the experimental results obtained fromfour different classifiers using eight different featureevaluation methods combined with ERMFTFigures 10(a) and 10(b) are the results of the Birds datasetFigures 10(c) and 10(d) are the results of the Cranedataset In Figure 10(a) four histograms represent theresults under the four classifiers and 9 elements in thegroup of histograms are Kappa values calculated from theeight methods with ERMFT and the original data (ORI)-e heat map of Figure 10(b) shows the number of se-lected features when the Kappa reaches a certain value ineach method and similarly so do Figures 10(c) and 10(d)In Figure 10(a) it can be clearly observed that the MICV-

ERMFT method has a slightly higher Kappa than othermethods and the J48 classifier in Figure 10(c) is moreeffective Besides the Kappa of the MICV-ERMFTmethod is higher than the original data Looking atFigures 10(a) and 10(b) at the same it is evident that theMICV-ERMFT method achieves a good modeling effectusing a small number of featuresrsquo time comparing withthe other methods Figures 10(c) and 10(d) show a similarresult

In conclusion compared with the other seven methodsthe MICV-ERMFT method demonstrates good abilities indimensionality reduction and feature interpretation

Combining Figures 8(b) and 8(d) with Table 6 it isobvious that the MICV-ERMFT method has a significantdimensionality reduction effect and model performanceeffect for the Birds dataset and the Crane dataset In Table 6Kappa value and DRR performance are very good for J48NB and SVM classmates on Birds dataset Particularly forthe NB classifier the other seven comparison methodsrsquoKappa value does not exceed ORI while the MICV-ERMFTmethod exceeds 04 In the Crane dataset the MICV-ERMFT outperforms other methods Table 7 shows the

Classifier

CorGRIG

ORRFSU

CSMICVORI

Crane dataset

J48 RFs SVM NB

Kapp

a09

08

07

06

05

04

03

02

01

0

(c)

Classifier

Met

hod

of fe

atur

e sele

ctio

n

J48 RFs SVM NB

Cor

GR

IG

OR

RF

SU

CS

MICV

Crane dataset

42

37

37

37

37

37

37

37

36

37

37

37

37

36

42

37

37

40

35

45

35

35

45

35

43

35

43

35

35

32

35

35

34

32

36

38

40

42

44

(d)

Figure 10 Experimental results of MICV-ERMFTmethod (a) Kappa obtained with different classifiers by using different feature evaluationmethods in Birds dataset (b) Heat map of selected feature in (a) (c) Kappa obtained with different classifiers by using different featureevaluation methods in Crane dataset (d) Heat map of selected features in (c)

Table 3 -e ratio of the number of selected features with different values of λ

Datasetλ

01 02 03 04 05 06 07 08 09Birds dataset 0293 0293 0466 0453 0493 0440 0480 0666 0560Crane dataset 0240 0360 0160 0333 0706 0506 0826 0933 0866

Mathematical Problems in Engineering 11

Table 4 Comparison of Kappa accuracy and F1 scores with feature selection methods in Birds dataset

Evaluation indicator ClassifierFeature selection method

Cor GR IG OR RF SU CS MICVNumber of features|the highest value

Kappa

J48 72 | 085 60 | 085 59 | 086 57 | 086 73 | 083 68 | 084 51 | 084 38 | 088NB 68 | 079 37 | 086 36 | 087 39 | 086 70 | 077 36 | 087 58 | 079 46 | 088SVM 63 | 093 44 | 093 52 | 093 61 | 093 64 | 095 46 | 093 65 | 095 51 | 095RFs 71 | 097 50 | 097 36 | 097 73 | 097 72 | 095 70 | 097 53 | 096 30 | 097

Accuracy

J48 72 | 8818 60 | 8818 59 | 8909 57 | 8909 73 | 8636 65 | 8727 52 | 8727 36 | 9096NB 64 | 8363 37 | 8909 36 | 8818 39 | 8909 70 | 8181 36 | 9000 58 | 8363 34 | 9090SVM 63 | 9454 44 | 9445 41 | 9400 61 | 9445 64 | 9636 46 | 9445 65 | 9636 51 | 9363RFs 71 | 9812 50 | 9812 36 | 9812 73 | 9818 72 | 9636 70 | 9812 53 | 9727 30 | 9812

F1 score

J48 72 | 088 60 | 088 59 | 089 57 | 089 73 | 086 65 | 087 51 | 087 38 | 087NB 64 | 083 37 | 089 36 | 089 39 | 089 70 | 081 37 | 090 58 | 083 34 | 090SVM 63 | 094 49 | 094 41 | 094 61 | 094 64 | 096 46 | 094 65 | 096 51 | 093RFs 71 | 096 50 | 098 55 | 098 73 | 098 73 | 096 70 | 098 54 | 097 30 | 098

Table 6 MICV-ERMFT compared to other methods of Kappa and DRR

Dataset ClassifierMethod

Cor GR IG OR RF SU CS MICV ORIKappa|DRR ()

Birds

J48 083 | 52 088 | 44 083 | 53 085 | 53 081 | 49 086 | 53 079 | 50 087 | 58 083 | 0NB 076 | 40 074 | 49 077 | 53 075 | 53 074 | 44 076 | 53 076 | 53 081 | 53 077 | 0SVM 092 | 52 092 | 44 089 | 53 087 | 44 092 | 53 088 | 53 092 | 53 093 | 57 092 | 0RFs 095 | 44 095 | 52 095 | 40 093 | 49 096 | 53 094 | 49 094 | 53 093 | 50 093 | 0

Crane

J48 064 | 44 074 | 50 063 | 50 070 | 53 063 | 40 062 | 50 063 | 53 076 | 53 070 | 0NB 073 | 50 075 | 52 071 | 44 073 | 53 072 | 50 071 | 50 069 | 46 076 | 53 075 | 0SVM 083 | 42 078 | 52 073 | 50 082 | 53 077 | 50 073 | 50 084 | 53 084 | 52 084 | 0RFs 085 | 50 083 | 40 083 | 50 084 | 53 084 | 50 083 | 42 084 | 50 088 | 53 088 | 0

Table 5 Comparison of Kappa accuracy and F1 scores with feature selection methods in Crane dataset

Evaluation indicator ClassifierFeature selection method

Cor GR IG OR RF SU CS MICVNumber of features|the highest value

Kappa

J48 73 | 072 69 | 074 71 | 068 70 | 072 68 | 071 71 | 069 22 | 068 25 | 075NB 73 | 075 69 | 075 73 | 075 73 | 075 71 | 075 73 | 075 53 | 079 43 | 079SVM 73 | 084 69 | 084 73 | 084 73 | 084 72 | 086 73 | 084 73 | 084 69 | 084RFs 66 | 089 51 | 088 69 | 089 73 | 089 68 | 089 63 | 088 36 | 090 41 | 090

Accuracy

J48 73 | 7700 69 | 7800 71 | 7300 70 | 7700 68 | 7600 71 | 7300 22 | 7300 18 | 7900NB 73 | 7900 69 | 7900 73 | 7900 73 | 7900 71 | 7900 73 | 7900 53 | 8100 43 | 8300SVM 72 | 8700 68 | 8700 73 | 8700 73 | 8700 72 | 8900 73 | 8700 73 | 8700 69 | 8700RFs 66 | 9100 51 | 9000 69 | 9100 73 | 9100 58 | 9100 63 | 9000 36 | 9000 41 | 9100

F1 score

J48 73 | 077 69 | 078 71 | 073 70 | 077 67 | 077 71 | 073 25 | 073 25 | 079NB 73 | 079 69 | 079 73 | 079 73 | 079 73 | 079 73 | 079 53 | 081 43 | 082SVM 72 | 086 68 | 087 73 | 086 73 | 086 72 | 088 73 | 086 73 | 087 69 | 086RFs 69 | 091 51 | 089 69 | 091 73 | 090 68 | 090 66 | 090 72 | 091 41 | 091

Table 7 -e time used by different feature evaluation methods

Dataset ClassifierMethod

Cor GR IG OR RF SU CS MICV ORI

Birds

J48 21035 18026 27100 45227 23204 27069 23268 21074 31728NB 13208 22001 28666 62036 12039 16001 18569 18257 35810SVM 21580 31500 42689 36028 56244 36028 20789 25104 51568RFs 31829 42626 51698 27853 36952 41524 61236 21568 81732

Crane

J48 19326 23825 28624 22596 32632 41069 26547 11638 51629NB 18624 16527 16549 39326 43829 52806 41026 19628 46258SVM 63426 71869 65826 63429 53440 33651 46458 30824 73496RFs 49637 53746 60689 40547 41968 31906 66504 31869 83048

12 Mathematical Problems in Engineering

running time cost by the MICV-ERMFT method and theother seven feature selection methods It is not too time-consuming than other methods

In experiments of Birds dataset and Crane datasetKappa metrics using different classifiers with the MICV-ERMFTmethod are generally superior to the other methods-e MICV-ERMFT method remains excellent for the mostpart and is more stable than the other methods althoughother methods surpass the MICV-ERMFTmethod in someclassifiers Besides the MICV-ERMFTmethod improves theKappa value compared to the original data Although theimprovement is minimal in some cases the MICV-ERMFTmethod only uses about half of the characteristic featurescompared to the original data

In conclusion MICV-ERMFT has better performance indimensionality reduction and model performanceimprovement

4 Conclusion

Feature selection is an important preprocessing step in datamining and classification In recent years researchers havefocused on feature contribution evaluation and redundancyreduction and different optimization algorithms have beenproposed to address this problem In this paper wemeasure the contribution of features to the classificationfrom the perspective of probability Combined with themaximum feature tree to remove the redundancy theMICV-ERMFT method is proposed to select the optimalfeatures and applied in the automatic recognition of birdsounds

To verify the MICV-ERMFT methodrsquos effectiveness inautomatic bird sounds recognition two datasets are used inthe experiments data of different genera (Birds dataset) anddata of the same genera (Crane dataset) -e results ofexperiments show that the Kappa indicator of the Birdsdataset reaches 093 and the dimension reduction ratereaches 57 -e Kappa value of the Crane dataset is 088the dimension reduction rate reached 53 and good resultswere obtained

-is study shows that the proposed MICV-ERMFTfeature selection method is effective -e bird audio se-lected in this paper is noise filtered and further researchshould test this methodrsquos performance using a denoisingmethod We will continue to explore the performance ofMICV-ERMFT in the dataset with a larger number offeatures and instances

Data Availability

All the data included in this study are available upon requestby contact with the corresponding author

Disclosure

-e funders had no role in the design of the study in thecollection analyses or interpretation of data in thewriting of the manuscript or in the decision to publishthe results

Conflicts of Interest

-e authors declare no conflicts of interest

Acknowledgments

-is research was funded by the National Natural ScienceFoundation of China under Grants nos 61462078 31960142and 31860332

References

[1] C A Ruiz-Martinez M T Akhtar Y Washizawa andE Escamilla-Hernandez ldquoOn investigating efficient meth-odology for environmental sound recognitionrdquo in Proceedingsof the ISPACS 2013mdash2013 International Symposium on In-telligent Signal Processing and Communication Systemspp 210ndash214 Naha Japan November 2013

[2] P Jancovic and M Kokuer ldquoBird species recognition usingunsupervised modeling of individual vocalization elementsrdquoIEEEACM Transactions on Audio Speech and LanguageProcessing vol 27 no 5 pp 932ndash947 2019

[3] A D P Ramirez J I De La Rosa Vargas R R Valdez andA Becerra ldquoA comparative between mel frequency cepstralcoefficients (MFCC) and inverse mel frequency cepstral co-efficients (IMFCC) features for an automatic bird speciesrecognition systemrdquo in Proceedings of the 2018 IEEE LatinAmerican Conference on Computational Intelligence (LA-CCI) pp 1ndash4 Gudalajara Mexico November 2018

[4] D Griffin and J Jae Lim ldquoSignal estimation from modifiedshort-time Fourier transformrdquo IEEE Transactions onAcoustics Speech and Signal Processing vol 32 no 2pp 236ndash243 1984

[5] S Kadambe and G F Boudreaux-Bartels ldquoApplication of thewavelet transform for pitch detection of speech signalsrdquo IEEETransactions on Information 9eory vol 38 no 2 pp 917ndash924 1992

[6] E Tsau S-H Kim and C-C J Kuo ldquoEnvironmental soundrecognition with CELP-based featuresrdquo in Proceedings of theISSCS 2011mdashInternational Symposium on Signals Circuitsand Systems pp 1ndash4 Iasi Romania July 2011

[7] C Collberg C -omborson and D Low ldquoA taxonomy ofobfuscating transformationsrdquo Technical Reports 148 -eUniversity of Auckland Auckland New Zealand 1997

[8] S Garcıa J Luengo F Herrera S Garcıa J Luengo andF Herrera ldquoFeature selectionrdquo Intelligent Systems ReferenceLibrary vol 72 pp 163ndash193 2015

[9] V Kumar and S Minz ldquoFeature selection a literature reviewrdquoSmart Computing Review vol 4 2014

[10] Y Zhang Q Wang D-W Gong and X-F Song ldquoNon-negative Laplacian embedding guided subspace learning forunsupervised feature selectionrdquo Pattern Recognition vol 93pp 337ndash352 2019

[11] S Zhao Y Zhang H Xu and T Han ldquoEnsemble classifi-cation based on feature selection for environmental soundrecognitionrdquo Mathematical Problems in Engineeringvol 2019 Article ID 4318463 7 pages 2019

[12] S H Zhang Z Zhao Z Y Xu K Bellisario andB C Pijanowski ldquoAutomatic bird vocalization identificationbased on fusion of spectral pattern and texture featuresrdquo inProceedings of the 2018 IEEE International Conference onAcoustics Speech and Signal Processing (ICASSP) pp 271ndash275 Calgary Canada April 2018

Mathematical Problems in Engineering 13

[13] A V Bang and P P Rege ldquoRecognition of bird species fromtheir sounds using data reduction techniquesrdquo in Proceedingsof the 7th International Conference on Computer and Com-munication Technology pp 111ndash116 Allahabad India No-vember 2017

[14] M Mafarja I Aljarah A A Heidari et al ldquoBinary dragonflyoptimization for feature selection using time-varying transferfunctionsrdquo Knowledge-Based Systems vol 161 pp 185ndash2042018

[15] Q Wu Z Ma J Fan G Xu and Y Shen ldquoA feature selectionmethod based on hybrid improved binary quantum particleswarm optimizationrdquo IEEE Access vol 7 pp 80588ndash806012019

[16] H W Wang Y Meng P Yin and J Hua ldquoA model-drivenmethod for quality reviews detection an ensemble model offeature selectionrdquo in Proceedings of the Fifteenth WuhanInternational Conference Electric Buses pp 573ndash581 WuhanChina 2016

[17] H Rao X Shi A K Rodrigue et al ldquoFeature selection basedon artificial bee colony and gradient boosting decision treerdquoApplied Soft Computing vol 74 pp 634ndash642 2019

[18] D A A Gnana ldquoLiterature review on feature selectionmethods for high-dimensional datardquo International Journal ofComputer Applications vol 136 no 1 pp 9ndash17 2016

[19] G I Sayed A Darwish and A E Hassanien ldquoA new chaoticwhale optimization algorithm for features selectionrdquo Journalof Classification vol 35 no 2 pp 300ndash344 2018

[20] A E Hegazy M A Makhlouf and G S El-Tawel ldquoImprovedsalp swarm algorithm for feature selectionrdquo Journal of KingSaud UniversitymdashComputer and Information Sciences vol 32no 3 pp 335ndash344 2020

[21] M Khamees A Albakry and K Shaker ldquoMulti-objectivefeature selection hybrid of salp swarm and simulatedannealing approachrdquo in Proceedings of the InternationalConference on New Trends in Information and Communica-tions Technology Applications pp 129ndash142 Baghdad IraqJanuary 2018

[22] M Sadeghi and HMarvi ldquoOptimalMFCC features extractionby differential evolution algorithm for speaker recognitionrdquoin Proceedings of the 2017 3rd Iranian Conference on Intel-ligent Systems and Signal Processing (ICSPIS) pp 169ndash173Shahrood Iran December 2017

[23] A V Bang and P P Rege ldquoAutomatic recognition of birdspecies using human factor cepstral coefficientsrdquo SmartComputing and Informatics vol 77 pp 363ndash373 2018

[24] R H D Zottesso Y M G Costa D Bertolini andL E S Oliveira ldquoBird species identification using spectro-gram and dissimilarity approachrdquo Ecological Informaticsvol 48 pp 187ndash197 2018

[25] J Stastny M Munk and L Juranek ldquoAutomatic bird speciesrecognition based on birds vocalizationrdquo EURASIP Journal onAudio Speech and Music Processing vol 2018 no 1 pp 1ndash72018

[26] S Fagerlund Automatic Recognition of Bird Species by 9eirSounds Helsinki University of Technology Espoo Finland2004

[27] L Ptacek Birds individual automatic recognition PhD thesisUniversity of West Bohemia Pilsen Czechia 2012

[28] A B Labao M A Clutario and P C Naval ldquoClassification ofbird sounds using codebook featuresrdquo in Proceedings of theAsian Conference on Intelligent Information and DatabaseSystems pp 223ndash233 Dong Hoi City Vietnam March 2018

[29] D Lepage ldquoAvibasemdashthe world bird databaserdquo 2020 httpsavibasebsc-eocorgavibasejsp

[30] G A Pereira ldquoXeno-cantomdashsharing birds sounds fromaround the worldrdquo 2003 httpswwwxeno-cantoorg

[31] J Stastny V Skorpil and J Fejfar ldquoAudio data classificationby means of new algorithmsrdquo in Proceedings of the 2013 36thInternational Conference on Telecommunications and SignalProcessing (TSP) pp 507ndash511 Rome Italy July 2013

[32] M Hall E Frank G Holmes B Pfahringer P Reutemannand I H Witten ldquo-e WEKA data mining softwarerdquo ACMSIGKDD Explorations Newsletter vol 11 no 1 pp 10ndash182009

14 Mathematical Problems in Engineering

Page 9: Feature Selection Using Maximum Feature Tree Embedded with

Crane dataset

Number of features

Kapp

a

CorGRIGOR

RFSUCSMICV

1

09

08

07

06

05

04

03

02

01

00 10 20 30 40 50 60 70 80

(a)

Crane dataset

CorGRIGOR

Kapp

a

09

08

07

06

05

04

03

02

01

ndash01

0

Number of features

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(b)

Crane dataset

Kapp

a

09

08

07

06

05

04

03

02

01

Number of features

CorGRIGOR

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(c)

Crane dataset

Kapp

a

08

07

06

05

04

03

02

01

ndash01

0

CorGRIGOR

Number of features

RFSUCSMICV

0 10 20 30 40 50 60 70 80

(d)

Figure 8 Experimental results of MICV and other 7 feature evaluation methods in different classifiers (Crane dataset) (a) J48 (b) SVM(c) RFs (d) NB

Table 1 Bird sound dataset information

Latin name Eng name Genus Number of samples RatePhalacrocorax carbo Great cormorant Phalacrocorax 36 831Numenius phaeopus Whimbrel Numenius 90 2079Aegithina nigrolutea White-tailed iora Aegithina 120 2772Chrysolophus amherstiae Lady Amherstrsquos Chrysolophus 68 1570Falco tinnunculus Common kestrel Falco 61 1409Tadorna ferruginea Ruddy shelduck Tadorna 58 1339

Mathematical Problems in Engineering 9

Table 2 Crane sounds dataset information

Latin name Eng name Genus Number of samples Rate ()Grus vipio White-naped crane

Grus

24 700Grus canadensis Sandhill crane 39 1137Grus virgo Demoiselle crane 60 1749Grus grus Common crane 62 1808Grus monacha Hooded crane 62 1808Grus japonensis Red-crowned crane 29 845Grus nigricollis Tibetan crane 67 1953

MICV

Cor

GR

IG

RF

OR

SU

CS

Feature score

Pearson product-moment correlation coefficient matrix T

Scoring sequence F

In addition to MICV ascending other methods

descending

Dataset

BMFTTprime

ERFTN

Feature subset Fprime

Datasetprime

Map Fprime to Dataset

ERMFT

Figure 9 Flowchart of experiment of MICV-ERMFT feature selection

Birds dataset

J48 RFs SVM NBClassifier

Kapp

a

CorGRIG

ORRFSU

CSMICVORI

1

09

08

07

06

05

04

03

02

01

0

(a)

ClassifierJ48 RFs SVM NB

Cor

GR

IG

OR

RF

SU

CS

MICV

Met

hod

of fe

atur

e sele

ctio

n

Birds dataset

36

35

35

38

35

37

36

38

35

38

35

37

36

35

35

38

35

37

38

35

35

35

35

35

42

31

42

45

42

32

45

42

32

34

36

38

40

42

44

(b)

Figure 10 Continued

10 Mathematical Problems in Engineering

DRR 1 minusFnprime

Fn

1113888 1113889lowast 100 (16)

In equation (16) Fnprime is the number of selected features

and Fn is the number of all features of each dataset -elarger the DRR value the stronger the ability to reducedimensions

332 Experiment of MICV-ERMFT Results and AnalysisFigure 10 shows the experimental results obtained fromfour different classifiers using eight different featureevaluation methods combined with ERMFTFigures 10(a) and 10(b) are the results of the Birds datasetFigures 10(c) and 10(d) are the results of the Cranedataset In Figure 10(a) four histograms represent theresults under the four classifiers and 9 elements in thegroup of histograms are Kappa values calculated from theeight methods with ERMFT and the original data (ORI)-e heat map of Figure 10(b) shows the number of se-lected features when the Kappa reaches a certain value ineach method and similarly so do Figures 10(c) and 10(d)In Figure 10(a) it can be clearly observed that the MICV-

ERMFT method has a slightly higher Kappa than othermethods and the J48 classifier in Figure 10(c) is moreeffective Besides the Kappa of the MICV-ERMFTmethod is higher than the original data Looking atFigures 10(a) and 10(b) at the same it is evident that theMICV-ERMFT method achieves a good modeling effectusing a small number of featuresrsquo time comparing withthe other methods Figures 10(c) and 10(d) show a similarresult

In conclusion compared with the other seven methodsthe MICV-ERMFT method demonstrates good abilities indimensionality reduction and feature interpretation

Combining Figures 8(b) and 8(d) with Table 6 it isobvious that the MICV-ERMFT method has a significantdimensionality reduction effect and model performanceeffect for the Birds dataset and the Crane dataset In Table 6Kappa value and DRR performance are very good for J48NB and SVM classmates on Birds dataset Particularly forthe NB classifier the other seven comparison methodsrsquoKappa value does not exceed ORI while the MICV-ERMFTmethod exceeds 04 In the Crane dataset the MICV-ERMFT outperforms other methods Table 7 shows the

Classifier

CorGRIG

ORRFSU

CSMICVORI

Crane dataset

J48 RFs SVM NB

Kapp

a09

08

07

06

05

04

03

02

01

0

(c)

Classifier

Met

hod

of fe

atur

e sele

ctio

n

J48 RFs SVM NB

Cor

GR

IG

OR

RF

SU

CS

MICV

Crane dataset

42

37

37

37

37

37

37

37

36

37

37

37

37

36

42

37

37

40

35

45

35

35

45

35

43

35

43

35

35

32

35

35

34

32

36

38

40

42

44

(d)

Figure 10 Experimental results of MICV-ERMFTmethod (a) Kappa obtained with different classifiers by using different feature evaluationmethods in Birds dataset (b) Heat map of selected feature in (a) (c) Kappa obtained with different classifiers by using different featureevaluation methods in Crane dataset (d) Heat map of selected features in (c)

Table 3 -e ratio of the number of selected features with different values of λ

Datasetλ

01 02 03 04 05 06 07 08 09Birds dataset 0293 0293 0466 0453 0493 0440 0480 0666 0560Crane dataset 0240 0360 0160 0333 0706 0506 0826 0933 0866

Mathematical Problems in Engineering 11

Table 4 Comparison of Kappa accuracy and F1 scores with feature selection methods in Birds dataset

Evaluation indicator ClassifierFeature selection method

Cor GR IG OR RF SU CS MICVNumber of features|the highest value

Kappa

J48 72 | 085 60 | 085 59 | 086 57 | 086 73 | 083 68 | 084 51 | 084 38 | 088NB 68 | 079 37 | 086 36 | 087 39 | 086 70 | 077 36 | 087 58 | 079 46 | 088SVM 63 | 093 44 | 093 52 | 093 61 | 093 64 | 095 46 | 093 65 | 095 51 | 095RFs 71 | 097 50 | 097 36 | 097 73 | 097 72 | 095 70 | 097 53 | 096 30 | 097

Accuracy

J48 72 | 8818 60 | 8818 59 | 8909 57 | 8909 73 | 8636 65 | 8727 52 | 8727 36 | 9096NB 64 | 8363 37 | 8909 36 | 8818 39 | 8909 70 | 8181 36 | 9000 58 | 8363 34 | 9090SVM 63 | 9454 44 | 9445 41 | 9400 61 | 9445 64 | 9636 46 | 9445 65 | 9636 51 | 9363RFs 71 | 9812 50 | 9812 36 | 9812 73 | 9818 72 | 9636 70 | 9812 53 | 9727 30 | 9812

F1 score

J48 72 | 088 60 | 088 59 | 089 57 | 089 73 | 086 65 | 087 51 | 087 38 | 087NB 64 | 083 37 | 089 36 | 089 39 | 089 70 | 081 37 | 090 58 | 083 34 | 090SVM 63 | 094 49 | 094 41 | 094 61 | 094 64 | 096 46 | 094 65 | 096 51 | 093RFs 71 | 096 50 | 098 55 | 098 73 | 098 73 | 096 70 | 098 54 | 097 30 | 098

Table 6 MICV-ERMFT compared to other methods of Kappa and DRR

Dataset ClassifierMethod

Cor GR IG OR RF SU CS MICV ORIKappa|DRR ()

Birds

J48 083 | 52 088 | 44 083 | 53 085 | 53 081 | 49 086 | 53 079 | 50 087 | 58 083 | 0NB 076 | 40 074 | 49 077 | 53 075 | 53 074 | 44 076 | 53 076 | 53 081 | 53 077 | 0SVM 092 | 52 092 | 44 089 | 53 087 | 44 092 | 53 088 | 53 092 | 53 093 | 57 092 | 0RFs 095 | 44 095 | 52 095 | 40 093 | 49 096 | 53 094 | 49 094 | 53 093 | 50 093 | 0

Crane

J48 064 | 44 074 | 50 063 | 50 070 | 53 063 | 40 062 | 50 063 | 53 076 | 53 070 | 0NB 073 | 50 075 | 52 071 | 44 073 | 53 072 | 50 071 | 50 069 | 46 076 | 53 075 | 0SVM 083 | 42 078 | 52 073 | 50 082 | 53 077 | 50 073 | 50 084 | 53 084 | 52 084 | 0RFs 085 | 50 083 | 40 083 | 50 084 | 53 084 | 50 083 | 42 084 | 50 088 | 53 088 | 0

Table 5 Comparison of Kappa accuracy and F1 scores with feature selection methods in Crane dataset

Evaluation indicator ClassifierFeature selection method

Cor GR IG OR RF SU CS MICVNumber of features|the highest value

Kappa

J48 73 | 072 69 | 074 71 | 068 70 | 072 68 | 071 71 | 069 22 | 068 25 | 075NB 73 | 075 69 | 075 73 | 075 73 | 075 71 | 075 73 | 075 53 | 079 43 | 079SVM 73 | 084 69 | 084 73 | 084 73 | 084 72 | 086 73 | 084 73 | 084 69 | 084RFs 66 | 089 51 | 088 69 | 089 73 | 089 68 | 089 63 | 088 36 | 090 41 | 090

Accuracy

J48 73 | 7700 69 | 7800 71 | 7300 70 | 7700 68 | 7600 71 | 7300 22 | 7300 18 | 7900NB 73 | 7900 69 | 7900 73 | 7900 73 | 7900 71 | 7900 73 | 7900 53 | 8100 43 | 8300SVM 72 | 8700 68 | 8700 73 | 8700 73 | 8700 72 | 8900 73 | 8700 73 | 8700 69 | 8700RFs 66 | 9100 51 | 9000 69 | 9100 73 | 9100 58 | 9100 63 | 9000 36 | 9000 41 | 9100

F1 score

J48 73 | 077 69 | 078 71 | 073 70 | 077 67 | 077 71 | 073 25 | 073 25 | 079NB 73 | 079 69 | 079 73 | 079 73 | 079 73 | 079 73 | 079 53 | 081 43 | 082SVM 72 | 086 68 | 087 73 | 086 73 | 086 72 | 088 73 | 086 73 | 087 69 | 086RFs 69 | 091 51 | 089 69 | 091 73 | 090 68 | 090 66 | 090 72 | 091 41 | 091

Table 7 -e time used by different feature evaluation methods

Dataset ClassifierMethod

Cor GR IG OR RF SU CS MICV ORI

Birds

J48 21035 18026 27100 45227 23204 27069 23268 21074 31728NB 13208 22001 28666 62036 12039 16001 18569 18257 35810SVM 21580 31500 42689 36028 56244 36028 20789 25104 51568RFs 31829 42626 51698 27853 36952 41524 61236 21568 81732

Crane

J48 19326 23825 28624 22596 32632 41069 26547 11638 51629NB 18624 16527 16549 39326 43829 52806 41026 19628 46258SVM 63426 71869 65826 63429 53440 33651 46458 30824 73496RFs 49637 53746 60689 40547 41968 31906 66504 31869 83048

12 Mathematical Problems in Engineering

running time cost by the MICV-ERMFT method and theother seven feature selection methods It is not too time-consuming than other methods

In experiments of Birds dataset and Crane datasetKappa metrics using different classifiers with the MICV-ERMFTmethod are generally superior to the other methods-e MICV-ERMFT method remains excellent for the mostpart and is more stable than the other methods althoughother methods surpass the MICV-ERMFTmethod in someclassifiers Besides the MICV-ERMFTmethod improves theKappa value compared to the original data Although theimprovement is minimal in some cases the MICV-ERMFTmethod only uses about half of the characteristic featurescompared to the original data

In conclusion MICV-ERMFT has better performance indimensionality reduction and model performanceimprovement

4 Conclusion

Feature selection is an important preprocessing step in datamining and classification In recent years researchers havefocused on feature contribution evaluation and redundancyreduction and different optimization algorithms have beenproposed to address this problem In this paper wemeasure the contribution of features to the classificationfrom the perspective of probability Combined with themaximum feature tree to remove the redundancy theMICV-ERMFT method is proposed to select the optimalfeatures and applied in the automatic recognition of birdsounds

To verify the MICV-ERMFT methodrsquos effectiveness inautomatic bird sounds recognition two datasets are used inthe experiments data of different genera (Birds dataset) anddata of the same genera (Crane dataset) -e results ofexperiments show that the Kappa indicator of the Birdsdataset reaches 093 and the dimension reduction ratereaches 57 -e Kappa value of the Crane dataset is 088the dimension reduction rate reached 53 and good resultswere obtained

-is study shows that the proposed MICV-ERMFTfeature selection method is effective -e bird audio se-lected in this paper is noise filtered and further researchshould test this methodrsquos performance using a denoisingmethod We will continue to explore the performance ofMICV-ERMFT in the dataset with a larger number offeatures and instances

Data Availability

All the data included in this study are available upon requestby contact with the corresponding author

Disclosure

-e funders had no role in the design of the study in thecollection analyses or interpretation of data in thewriting of the manuscript or in the decision to publishthe results

Conflicts of Interest

-e authors declare no conflicts of interest

Acknowledgments

-is research was funded by the National Natural ScienceFoundation of China under Grants nos 61462078 31960142and 31860332

References

[1] C A Ruiz-Martinez M T Akhtar Y Washizawa andE Escamilla-Hernandez ldquoOn investigating efficient meth-odology for environmental sound recognitionrdquo in Proceedingsof the ISPACS 2013mdash2013 International Symposium on In-telligent Signal Processing and Communication Systemspp 210ndash214 Naha Japan November 2013

[2] P Jancovic and M Kokuer ldquoBird species recognition usingunsupervised modeling of individual vocalization elementsrdquoIEEEACM Transactions on Audio Speech and LanguageProcessing vol 27 no 5 pp 932ndash947 2019

[3] A D P Ramirez J I De La Rosa Vargas R R Valdez andA Becerra ldquoA comparative between mel frequency cepstralcoefficients (MFCC) and inverse mel frequency cepstral co-efficients (IMFCC) features for an automatic bird speciesrecognition systemrdquo in Proceedings of the 2018 IEEE LatinAmerican Conference on Computational Intelligence (LA-CCI) pp 1ndash4 Gudalajara Mexico November 2018

[4] D Griffin and J Jae Lim ldquoSignal estimation from modifiedshort-time Fourier transformrdquo IEEE Transactions onAcoustics Speech and Signal Processing vol 32 no 2pp 236ndash243 1984

[5] S Kadambe and G F Boudreaux-Bartels ldquoApplication of thewavelet transform for pitch detection of speech signalsrdquo IEEETransactions on Information 9eory vol 38 no 2 pp 917ndash924 1992

[6] E Tsau S-H Kim and C-C J Kuo ldquoEnvironmental soundrecognition with CELP-based featuresrdquo in Proceedings of theISSCS 2011mdashInternational Symposium on Signals Circuitsand Systems pp 1ndash4 Iasi Romania July 2011

[7] C Collberg C -omborson and D Low ldquoA taxonomy ofobfuscating transformationsrdquo Technical Reports 148 -eUniversity of Auckland Auckland New Zealand 1997

[8] S Garcıa J Luengo F Herrera S Garcıa J Luengo andF Herrera ldquoFeature selectionrdquo Intelligent Systems ReferenceLibrary vol 72 pp 163ndash193 2015

[9] V Kumar and S Minz ldquoFeature selection a literature reviewrdquoSmart Computing Review vol 4 2014

[10] Y Zhang Q Wang D-W Gong and X-F Song ldquoNon-negative Laplacian embedding guided subspace learning forunsupervised feature selectionrdquo Pattern Recognition vol 93pp 337ndash352 2019

[11] S Zhao Y Zhang H Xu and T Han ldquoEnsemble classifi-cation based on feature selection for environmental soundrecognitionrdquo Mathematical Problems in Engineeringvol 2019 Article ID 4318463 7 pages 2019

[12] S H Zhang Z Zhao Z Y Xu K Bellisario andB C Pijanowski ldquoAutomatic bird vocalization identificationbased on fusion of spectral pattern and texture featuresrdquo inProceedings of the 2018 IEEE International Conference onAcoustics Speech and Signal Processing (ICASSP) pp 271ndash275 Calgary Canada April 2018

Mathematical Problems in Engineering 13

[13] A V Bang and P P Rege ldquoRecognition of bird species fromtheir sounds using data reduction techniquesrdquo in Proceedingsof the 7th International Conference on Computer and Com-munication Technology pp 111ndash116 Allahabad India No-vember 2017

[14] M Mafarja I Aljarah A A Heidari et al ldquoBinary dragonflyoptimization for feature selection using time-varying transferfunctionsrdquo Knowledge-Based Systems vol 161 pp 185ndash2042018

[15] Q Wu Z Ma J Fan G Xu and Y Shen ldquoA feature selectionmethod based on hybrid improved binary quantum particleswarm optimizationrdquo IEEE Access vol 7 pp 80588ndash806012019

[16] H W Wang Y Meng P Yin and J Hua ldquoA model-drivenmethod for quality reviews detection an ensemble model offeature selectionrdquo in Proceedings of the Fifteenth WuhanInternational Conference Electric Buses pp 573ndash581 WuhanChina 2016

[17] H Rao X Shi A K Rodrigue et al ldquoFeature selection basedon artificial bee colony and gradient boosting decision treerdquoApplied Soft Computing vol 74 pp 634ndash642 2019

[18] D A A Gnana ldquoLiterature review on feature selectionmethods for high-dimensional datardquo International Journal ofComputer Applications vol 136 no 1 pp 9ndash17 2016

[19] G I Sayed A Darwish and A E Hassanien ldquoA new chaoticwhale optimization algorithm for features selectionrdquo Journalof Classification vol 35 no 2 pp 300ndash344 2018

[20] A E Hegazy M A Makhlouf and G S El-Tawel ldquoImprovedsalp swarm algorithm for feature selectionrdquo Journal of KingSaud UniversitymdashComputer and Information Sciences vol 32no 3 pp 335ndash344 2020

[21] M Khamees A Albakry and K Shaker ldquoMulti-objectivefeature selection hybrid of salp swarm and simulatedannealing approachrdquo in Proceedings of the InternationalConference on New Trends in Information and Communica-tions Technology Applications pp 129ndash142 Baghdad IraqJanuary 2018

[22] M Sadeghi and HMarvi ldquoOptimalMFCC features extractionby differential evolution algorithm for speaker recognitionrdquoin Proceedings of the 2017 3rd Iranian Conference on Intel-ligent Systems and Signal Processing (ICSPIS) pp 169ndash173Shahrood Iran December 2017

[23] A V Bang and P P Rege ldquoAutomatic recognition of birdspecies using human factor cepstral coefficientsrdquo SmartComputing and Informatics vol 77 pp 363ndash373 2018

[24] R H D Zottesso Y M G Costa D Bertolini andL E S Oliveira ldquoBird species identification using spectro-gram and dissimilarity approachrdquo Ecological Informaticsvol 48 pp 187ndash197 2018

[25] J Stastny M Munk and L Juranek ldquoAutomatic bird speciesrecognition based on birds vocalizationrdquo EURASIP Journal onAudio Speech and Music Processing vol 2018 no 1 pp 1ndash72018

[26] S Fagerlund Automatic Recognition of Bird Species by 9eirSounds Helsinki University of Technology Espoo Finland2004

[27] L Ptacek Birds individual automatic recognition PhD thesisUniversity of West Bohemia Pilsen Czechia 2012

[28] A B Labao M A Clutario and P C Naval ldquoClassification ofbird sounds using codebook featuresrdquo in Proceedings of theAsian Conference on Intelligent Information and DatabaseSystems pp 223ndash233 Dong Hoi City Vietnam March 2018

[29] D Lepage ldquoAvibasemdashthe world bird databaserdquo 2020 httpsavibasebsc-eocorgavibasejsp

[30] G A Pereira ldquoXeno-cantomdashsharing birds sounds fromaround the worldrdquo 2003 httpswwwxeno-cantoorg

[31] J Stastny V Skorpil and J Fejfar ldquoAudio data classificationby means of new algorithmsrdquo in Proceedings of the 2013 36thInternational Conference on Telecommunications and SignalProcessing (TSP) pp 507ndash511 Rome Italy July 2013

[32] M Hall E Frank G Holmes B Pfahringer P Reutemannand I H Witten ldquo-e WEKA data mining softwarerdquo ACMSIGKDD Explorations Newsletter vol 11 no 1 pp 10ndash182009

14 Mathematical Problems in Engineering

Page 10: Feature Selection Using Maximum Feature Tree Embedded with

Table 2 Crane sounds dataset information

Latin name Eng name Genus Number of samples Rate ()Grus vipio White-naped crane

Grus

24 700Grus canadensis Sandhill crane 39 1137Grus virgo Demoiselle crane 60 1749Grus grus Common crane 62 1808Grus monacha Hooded crane 62 1808Grus japonensis Red-crowned crane 29 845Grus nigricollis Tibetan crane 67 1953

MICV

Cor

GR

IG

RF

OR

SU

CS

Feature score

Pearson product-moment correlation coefficient matrix T

Scoring sequence F

In addition to MICV ascending other methods

descending

Dataset

BMFTTprime

ERFTN

Feature subset Fprime

Datasetprime

Map Fprime to Dataset

ERMFT

Figure 9 Flowchart of experiment of MICV-ERMFT feature selection

Birds dataset

J48 RFs SVM NBClassifier

Kapp

a

CorGRIG

ORRFSU

CSMICVORI

1

09

08

07

06

05

04

03

02

01

0

(a)

ClassifierJ48 RFs SVM NB

Cor

GR

IG

OR

RF

SU

CS

MICV

Met

hod

of fe

atur

e sele

ctio

n

Birds dataset

36

35

35

38

35

37

36

38

35

38

35

37

36

35

35

38

35

37

38

35

35

35

35

35

42

31

42

45

42

32

45

42

32

34

36

38

40

42

44

(b)

Figure 10 Continued

10 Mathematical Problems in Engineering

DRR 1 minusFnprime

Fn

1113888 1113889lowast 100 (16)

In equation (16) Fnprime is the number of selected features

and Fn is the number of all features of each dataset -elarger the DRR value the stronger the ability to reducedimensions

332 Experiment of MICV-ERMFT Results and AnalysisFigure 10 shows the experimental results obtained fromfour different classifiers using eight different featureevaluation methods combined with ERMFTFigures 10(a) and 10(b) are the results of the Birds datasetFigures 10(c) and 10(d) are the results of the Cranedataset In Figure 10(a) four histograms represent theresults under the four classifiers and 9 elements in thegroup of histograms are Kappa values calculated from theeight methods with ERMFT and the original data (ORI)-e heat map of Figure 10(b) shows the number of se-lected features when the Kappa reaches a certain value ineach method and similarly so do Figures 10(c) and 10(d)In Figure 10(a) it can be clearly observed that the MICV-

ERMFT method has a slightly higher Kappa than othermethods and the J48 classifier in Figure 10(c) is moreeffective Besides the Kappa of the MICV-ERMFTmethod is higher than the original data Looking atFigures 10(a) and 10(b) at the same it is evident that theMICV-ERMFT method achieves a good modeling effectusing a small number of featuresrsquo time comparing withthe other methods Figures 10(c) and 10(d) show a similarresult

In conclusion compared with the other seven methodsthe MICV-ERMFT method demonstrates good abilities indimensionality reduction and feature interpretation

Combining Figures 8(b) and 8(d) with Table 6 it isobvious that the MICV-ERMFT method has a significantdimensionality reduction effect and model performanceeffect for the Birds dataset and the Crane dataset In Table 6Kappa value and DRR performance are very good for J48NB and SVM classmates on Birds dataset Particularly forthe NB classifier the other seven comparison methodsrsquoKappa value does not exceed ORI while the MICV-ERMFTmethod exceeds 04 In the Crane dataset the MICV-ERMFT outperforms other methods Table 7 shows the

Classifier

CorGRIG

ORRFSU

CSMICVORI

Crane dataset

J48 RFs SVM NB

Kapp

a09

08

07

06

05

04

03

02

01

0

(c)

Classifier

Met

hod

of fe

atur

e sele

ctio

n

J48 RFs SVM NB

Cor

GR

IG

OR

RF

SU

CS

MICV

Crane dataset

42

37

37

37

37

37

37

37

36

37

37

37

37

36

42

37

37

40

35

45

35

35

45

35

43

35

43

35

35

32

35

35

34

32

36

38

40

42

44

(d)

Figure 10 Experimental results of MICV-ERMFTmethod (a) Kappa obtained with different classifiers by using different feature evaluationmethods in Birds dataset (b) Heat map of selected feature in (a) (c) Kappa obtained with different classifiers by using different featureevaluation methods in Crane dataset (d) Heat map of selected features in (c)

Table 3 -e ratio of the number of selected features with different values of λ

Datasetλ

01 02 03 04 05 06 07 08 09Birds dataset 0293 0293 0466 0453 0493 0440 0480 0666 0560Crane dataset 0240 0360 0160 0333 0706 0506 0826 0933 0866

Mathematical Problems in Engineering 11

Table 4: Comparison of Kappa, accuracy, and F1 scores with feature selection methods on the Birds dataset. Each entry is "number of features | highest value".

Kappa
  Classifier   Cor          GR           IG           OR           RF           SU           CS           MICV
  J48          72 | 0.85    60 | 0.85    59 | 0.86    57 | 0.86    73 | 0.83    68 | 0.84    51 | 0.84    38 | 0.88
  NB           68 | 0.79    37 | 0.86    36 | 0.87    39 | 0.86    70 | 0.77    36 | 0.87    58 | 0.79    46 | 0.88
  SVM          63 | 0.93    44 | 0.93    52 | 0.93    61 | 0.93    64 | 0.95    46 | 0.93    65 | 0.95    51 | 0.95
  RFs          71 | 0.97    50 | 0.97    36 | 0.97    73 | 0.97    72 | 0.95    70 | 0.97    53 | 0.96    30 | 0.97

Accuracy (%)
  J48          72 | 88.18   60 | 88.18   59 | 89.09   57 | 89.09   73 | 86.36   65 | 87.27   52 | 87.27   36 | 90.96
  NB           64 | 83.63   37 | 89.09   36 | 88.18   39 | 89.09   70 | 81.81   36 | 90.00   58 | 83.63   34 | 90.90
  SVM          63 | 94.54   44 | 94.45   41 | 94.00   61 | 94.45   64 | 96.36   46 | 94.45   65 | 96.36   51 | 93.63
  RFs          71 | 98.12   50 | 98.12   36 | 98.12   73 | 98.18   72 | 96.36   70 | 98.12   53 | 97.27   30 | 98.12

F1 score
  J48          72 | 0.88    60 | 0.88    59 | 0.89    57 | 0.89    73 | 0.86    65 | 0.87    51 | 0.87    38 | 0.87
  NB           64 | 0.83    37 | 0.89    36 | 0.89    39 | 0.89    70 | 0.81    37 | 0.90    58 | 0.83    34 | 0.90
  SVM          63 | 0.94    49 | 0.94    41 | 0.94    61 | 0.94    64 | 0.96    46 | 0.94    65 | 0.96    51 | 0.93
  RFs          71 | 0.96    50 | 0.98    55 | 0.98    73 | 0.98    73 | 0.96    70 | 0.98    54 | 0.97    30 | 0.98

Table 6: MICV-ERMFT compared with other methods in terms of Kappa and DRR. Each entry is "Kappa | DRR (%)".

Birds
  Classifier   Cor         GR          IG          OR          RF          SU          CS          MICV        ORI
  J48          0.83 | 52   0.88 | 44   0.83 | 53   0.85 | 53   0.81 | 49   0.86 | 53   0.79 | 50   0.87 | 58   0.83 | 0
  NB           0.76 | 40   0.74 | 49   0.77 | 53   0.75 | 53   0.74 | 44   0.76 | 53   0.76 | 53   0.81 | 53   0.77 | 0
  SVM          0.92 | 52   0.92 | 44   0.89 | 53   0.87 | 44   0.92 | 53   0.88 | 53   0.92 | 53   0.93 | 57   0.92 | 0
  RFs          0.95 | 44   0.95 | 52   0.95 | 40   0.93 | 49   0.96 | 53   0.94 | 49   0.94 | 53   0.93 | 50   0.93 | 0

Crane
  J48          0.64 | 44   0.74 | 50   0.63 | 50   0.70 | 53   0.63 | 40   0.62 | 50   0.63 | 53   0.76 | 53   0.70 | 0
  NB           0.73 | 50   0.75 | 52   0.71 | 44   0.73 | 53   0.72 | 50   0.71 | 50   0.69 | 46   0.76 | 53   0.75 | 0
  SVM          0.83 | 42   0.78 | 52   0.73 | 50   0.82 | 53   0.77 | 50   0.73 | 50   0.84 | 53   0.84 | 52   0.84 | 0
  RFs          0.85 | 50   0.83 | 40   0.83 | 50   0.84 | 53   0.84 | 50   0.83 | 42   0.84 | 50   0.88 | 53   0.88 | 0

Table 5: Comparison of Kappa, accuracy, and F1 scores with feature selection methods on the Crane dataset. Each entry is "number of features | highest value".

Kappa
  Classifier   Cor          GR           IG           OR           RF           SU           CS           MICV
  J48          73 | 0.72    69 | 0.74    71 | 0.68    70 | 0.72    68 | 0.71    71 | 0.69    22 | 0.68    25 | 0.75
  NB           73 | 0.75    69 | 0.75    73 | 0.75    73 | 0.75    71 | 0.75    73 | 0.75    53 | 0.79    43 | 0.79
  SVM          73 | 0.84    69 | 0.84    73 | 0.84    73 | 0.84    72 | 0.86    73 | 0.84    73 | 0.84    69 | 0.84
  RFs          66 | 0.89    51 | 0.88    69 | 0.89    73 | 0.89    68 | 0.89    63 | 0.88    36 | 0.90    41 | 0.90

Accuracy (%)
  J48          73 | 77.00   69 | 78.00   71 | 73.00   70 | 77.00   68 | 76.00   71 | 73.00   22 | 73.00   18 | 79.00
  NB           73 | 79.00   69 | 79.00   73 | 79.00   73 | 79.00   71 | 79.00   73 | 79.00   53 | 81.00   43 | 83.00
  SVM          72 | 87.00   68 | 87.00   73 | 87.00   73 | 87.00   72 | 89.00   73 | 87.00   73 | 87.00   69 | 87.00
  RFs          66 | 91.00   51 | 90.00   69 | 91.00   73 | 91.00   58 | 91.00   63 | 90.00   36 | 90.00   41 | 91.00

F1 score
  J48          73 | 0.77    69 | 0.78    71 | 0.73    70 | 0.77    67 | 0.77    71 | 0.73    25 | 0.73    25 | 0.79
  NB           73 | 0.79    69 | 0.79    73 | 0.79    73 | 0.79    73 | 0.79    73 | 0.79    53 | 0.81    43 | 0.82
  SVM          72 | 0.86    68 | 0.87    73 | 0.86    73 | 0.86    72 | 0.88    73 | 0.86    73 | 0.87    69 | 0.86
  RFs          69 | 0.91    51 | 0.89    69 | 0.91    73 | 0.90    68 | 0.90    66 | 0.90    72 | 0.91    41 | 0.91

Table 7: The time used by different feature evaluation methods.

Birds
  Classifier   Cor      GR       IG       OR       RF       SU       CS       MICV     ORI
  J48          21035    18026    27100    45227    23204    27069    23268    21074    31728
  NB           13208    22001    28666    62036    12039    16001    18569    18257    35810
  SVM          21580    31500    42689    36028    56244    36028    20789    25104    51568
  RFs          31829    42626    51698    27853    36952    41524    61236    21568    81732

Crane
  J48          19326    23825    28624    22596    32632    41069    26547    11638    51629
  NB           18624    16527    16549    39326    43829    52806    41026    19628    46258
  SVM          63426    71869    65826    63429    53440    33651    46458    30824    73496
  RFs          49637    53746    60689    40547    41968    31906    66504    31869    83048


Table 7 shows the running time cost of the MICV-ERMFT method and the other seven feature selection methods. The MICV-ERMFT method is no more time-consuming than the other methods.

In the experiments on the Birds dataset and the Crane dataset, the Kappa metrics obtained by different classifiers with the MICV-ERMFT method are generally superior to those of the other methods. The MICV-ERMFT method remains excellent for the most part and is more stable than the other methods, although other methods surpass it with some classifiers. Besides, the MICV-ERMFT method improves the Kappa value compared to the original data. Although the improvement is minimal in some cases, the MICV-ERMFT method uses only about half of the features of the original data.

In conclusion, MICV-ERMFT performs better in both dimensionality reduction and model performance improvement.

4. Conclusion

Feature selection is an important preprocessing step in data mining and classification. In recent years, researchers have focused on feature contribution evaluation and redundancy reduction, and different optimization algorithms have been proposed to address this problem. In this paper, we measure the contribution of features to the classification from the perspective of probability. Combined with the maximum feature tree to remove redundancy, the MICV-ERMFT method is proposed to select the optimal features and is applied to the automatic recognition of bird sounds.
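As a rough illustration of the "score features, then prune redundancy over a maximum feature tree" idea summarized above, the following Python sketch combines a mutual-information relevance score with a maximum spanning tree over absolute feature correlations and drops the weaker endpoint of each strongly connected tree edge. It is a generic sketch under stated assumptions (scikit-learn, SciPy, a hypothetical sim_threshold parameter and tree_pruned_selection function), not the paper's exact MICV score or ERMFT procedure.

import numpy as np
from scipy.sparse.csgraph import minimum_spanning_tree
from sklearn.feature_selection import mutual_info_classif

def tree_pruned_selection(X, y, sim_threshold=0.9):
    """Keep relevant features; drop the less relevant endpoint of redundant tree edges."""
    relevance = mutual_info_classif(X, y, random_state=0)   # per-feature relevance score
    similarity = np.abs(np.corrcoef(X, rowvar=False))       # feature-feature similarity
    np.fill_diagonal(similarity, 0.0)
    # A maximum spanning tree over similarities is a minimum spanning tree over their negation.
    tree = minimum_spanning_tree(-similarity).toarray()
    keep = np.ones(X.shape[1], dtype=bool)
    for i, j in zip(*np.nonzero(tree)):
        if -tree[i, j] >= sim_threshold:                     # edge links two redundant features
            keep[i if relevance[i] < relevance[j] else j] = False
    return np.flatnonzero(keep)

Raising sim_threshold marks fewer tree edges as redundant and so keeps more features; lowering it prunes more aggressively. This is only meant to convey the general shape of a score-plus-tree pipeline.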

To verify the MICV-ERMFT method's effectiveness in automatic bird sound recognition, two datasets are used in the experiments: data of different genera (Birds dataset) and data of the same genus (Crane dataset). The experimental results show that the Kappa indicator on the Birds dataset reaches 0.93 with a dimension reduction rate of 57%, and the Kappa value on the Crane dataset reaches 0.88 with a dimension reduction rate of 53%; good results were obtained in both cases.

This study shows that the proposed MICV-ERMFT feature selection method is effective. The bird audio selected in this paper is noise filtered, and further research should test the method's performance together with a denoising method. We will also continue to explore the performance of MICV-ERMFT on datasets with larger numbers of features and instances.

Data Availability

All the data included in this study are available upon request by contacting the corresponding author.

Disclosure

The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Conflicts of Interest

The authors declare no conflicts of interest.

Acknowledgments

This research was funded by the National Natural Science Foundation of China under Grant nos. 61462078, 31960142, and 31860332.

References

[1] C. A. Ruiz-Martinez, M. T. Akhtar, Y. Washizawa, and E. Escamilla-Hernandez, "On investigating efficient methodology for environmental sound recognition," in Proceedings of the 2013 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), pp. 210–214, Naha, Japan, November 2013.
[2] P. Jancovic and M. Kokuer, "Bird species recognition using unsupervised modeling of individual vocalization elements," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 27, no. 5, pp. 932–947, 2019.
[3] A. D. P. Ramirez, J. I. De La Rosa Vargas, R. R. Valdez, and A. Becerra, "A comparative between mel frequency cepstral coefficients (MFCC) and inverse mel frequency cepstral coefficients (IMFCC) features for an automatic bird species recognition system," in Proceedings of the 2018 IEEE Latin American Conference on Computational Intelligence (LA-CCI), pp. 1–4, Guadalajara, Mexico, November 2018.
[4] D. Griffin and J. Jae Lim, "Signal estimation from modified short-time Fourier transform," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 32, no. 2, pp. 236–243, 1984.
[5] S. Kadambe and G. F. Boudreaux-Bartels, "Application of the wavelet transform for pitch detection of speech signals," IEEE Transactions on Information Theory, vol. 38, no. 2, pp. 917–924, 1992.
[6] E. Tsau, S.-H. Kim, and C.-C. J. Kuo, "Environmental sound recognition with CELP-based features," in Proceedings of the 2011 International Symposium on Signals, Circuits and Systems (ISSCS), pp. 1–4, Iasi, Romania, July 2011.
[7] C. Collberg, C. Thomborson, and D. Low, "A taxonomy of obfuscating transformations," Technical Report 148, The University of Auckland, Auckland, New Zealand, 1997.
[8] S. García, J. Luengo, and F. Herrera, "Feature selection," Intelligent Systems Reference Library, vol. 72, pp. 163–193, 2015.
[9] V. Kumar and S. Minz, "Feature selection: a literature review," Smart Computing Review, vol. 4, 2014.
[10] Y. Zhang, Q. Wang, D.-W. Gong, and X.-F. Song, "Nonnegative Laplacian embedding guided subspace learning for unsupervised feature selection," Pattern Recognition, vol. 93, pp. 337–352, 2019.
[11] S. Zhao, Y. Zhang, H. Xu, and T. Han, "Ensemble classification based on feature selection for environmental sound recognition," Mathematical Problems in Engineering, vol. 2019, Article ID 4318463, 7 pages, 2019.
[12] S. H. Zhang, Z. Zhao, Z. Y. Xu, K. Bellisario, and B. C. Pijanowski, "Automatic bird vocalization identification based on fusion of spectral pattern and texture features," in Proceedings of the 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 271–275, Calgary, Canada, April 2018.
[13] A. V. Bang and P. P. Rege, "Recognition of bird species from their sounds using data reduction techniques," in Proceedings of the 7th International Conference on Computer and Communication Technology, pp. 111–116, Allahabad, India, November 2017.
[14] M. Mafarja, I. Aljarah, A. A. Heidari et al., "Binary dragonfly optimization for feature selection using time-varying transfer functions," Knowledge-Based Systems, vol. 161, pp. 185–204, 2018.
[15] Q. Wu, Z. Ma, J. Fan, G. Xu, and Y. Shen, "A feature selection method based on hybrid improved binary quantum particle swarm optimization," IEEE Access, vol. 7, pp. 80588–80601, 2019.
[16] H. W. Wang, Y. Meng, P. Yin, and J. Hua, "A model-driven method for quality reviews detection: an ensemble model of feature selection," in Proceedings of the Fifteenth Wuhan International Conference on E-Business, pp. 573–581, Wuhan, China, 2016.
[17] H. Rao, X. Shi, A. K. Rodrigue et al., "Feature selection based on artificial bee colony and gradient boosting decision tree," Applied Soft Computing, vol. 74, pp. 634–642, 2019.
[18] D. A. A. Gnana, "Literature review on feature selection methods for high-dimensional data," International Journal of Computer Applications, vol. 136, no. 1, pp. 9–17, 2016.
[19] G. I. Sayed, A. Darwish, and A. E. Hassanien, "A new chaotic whale optimization algorithm for features selection," Journal of Classification, vol. 35, no. 2, pp. 300–344, 2018.
[20] A. E. Hegazy, M. A. Makhlouf, and G. S. El-Tawel, "Improved salp swarm algorithm for feature selection," Journal of King Saud University - Computer and Information Sciences, vol. 32, no. 3, pp. 335–344, 2020.
[21] M. Khamees, A. Albakry, and K. Shaker, "Multi-objective feature selection: hybrid of salp swarm and simulated annealing approach," in Proceedings of the International Conference on New Trends in Information and Communications Technology Applications, pp. 129–142, Baghdad, Iraq, January 2018.
[22] M. Sadeghi and H. Marvi, "Optimal MFCC features extraction by differential evolution algorithm for speaker recognition," in Proceedings of the 2017 3rd Iranian Conference on Intelligent Systems and Signal Processing (ICSPIS), pp. 169–173, Shahrood, Iran, December 2017.
[23] A. V. Bang and P. P. Rege, "Automatic recognition of bird species using human factor cepstral coefficients," Smart Computing and Informatics, vol. 77, pp. 363–373, 2018.
[24] R. H. D. Zottesso, Y. M. G. Costa, D. Bertolini, and L. E. S. Oliveira, "Bird species identification using spectrogram and dissimilarity approach," Ecological Informatics, vol. 48, pp. 187–197, 2018.
[25] J. Stastny, M. Munk, and L. Juranek, "Automatic bird species recognition based on birds vocalization," EURASIP Journal on Audio, Speech, and Music Processing, vol. 2018, no. 1, pp. 1–7, 2018.
[26] S. Fagerlund, Automatic Recognition of Bird Species by Their Sounds, Helsinki University of Technology, Espoo, Finland, 2004.
[27] L. Ptacek, Birds individual automatic recognition, PhD thesis, University of West Bohemia, Pilsen, Czechia, 2012.
[28] A. B. Labao, M. A. Clutario, and P. C. Naval, "Classification of bird sounds using codebook features," in Proceedings of the Asian Conference on Intelligent Information and Database Systems, pp. 223–233, Dong Hoi City, Vietnam, March 2018.
[29] D. Lepage, "Avibase - the world bird database," 2020, https://avibase.bsc-eoc.org/avibase.jsp.
[30] G. A. Pereira, "Xeno-canto - sharing bird sounds from around the world," 2003, https://www.xeno-canto.org.
[31] J. Stastny, V. Skorpil, and J. Fejfar, "Audio data classification by means of new algorithms," in Proceedings of the 2013 36th International Conference on Telecommunications and Signal Processing (TSP), pp. 507–511, Rome, Italy, July 2013.
[32] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, "The WEKA data mining software," ACM SIGKDD Explorations Newsletter, vol. 11, no. 1, pp. 10–18, 2009.

