Statistical study of different models for classifying Indian Ragas based on moods

A Project Report Submitted in
Partial Fulfillment of the Requirements for the Degree of
Master of Science in
Computer Science

Supervised by
Dr. Carol Romanowski
Department of Computer Science
B. Thomas Golisano College of Computing and Information Sciences
Rochester Institute of Technology
Rochester, New York
September 2012
The project “Statistical study of different models for classifying Indian Ragas based
on moods” has been examined and approved by the following Examination Committee:
Dr. Carol Romanowski, Professor, Project Committee Chair
Dr. Rajendra K. Raj, Professor
Dr. Mark Indelicato, Professor
Dedication
To my parents and fiancée ...
Acknowledgments
I would like to express my sincere gratitude to my project chair, Professor Carol
Romanowski, for her guidance and support throughout the project. I would also like to
thank my project reader, Professor Rajendra K. Raj, and observer, Professor Mark
Indelicato, for helping me at critical junctures in the course of the project.
Abstract
Statistical study of different models for classifying Indian Ragas based on moods
Supervising Professor: Dr. Carol Romanowski
Indian Ragas are the most fundamental form of music produced in India. Ragas are
categorized by the time of day and season they are written for, and by the emotions or
moods they are intended to evoke. Existing research on ragas classifies songs by these
moods. This project takes a Raga Recommender playlist model, which is based on
important musical motifs in each song in addition to mood and time of day, as its
baseline and uses different feature selection techniques to obtain more fine-grained
results. The results are compared statistically to determine which feature selection
methods work best with this dataset. This study of feature selection techniques will
help generalize this work to various types of music.
Contents

Dedication
Acknowledgments
Abstract

1 Introduction
  1.1 Background
  1.2 Problem Description and Motivation
  1.3 Related Work
  1.4 Hypothesis
  1.5 Roadmap

2 Solution
  2.1 Design
  2.2 DYNAMAC Algorithm
  2.3 Architecture

3 Results
  3.1 Creation of Initial Dataset
  3.2 Data Mining Tool
    3.2.1 Weka
  3.3 Clustering Algorithms
    3.3.1 Expectation Maximization (EM) Algorithm
    3.3.2 DBScan (Density-Based Spatial Clustering of Applications with Noise) Clustering Algorithm
  3.4 Clustering Results
    3.4.1 EM Clustering
  3.5 Adding Potential Attributes to Initial Dataset
  3.6 Creation of Models Based on Clustering
  3.7 Applying Feature Selection Techniques
    3.7.1 Best First Feature Selection Technique
    3.7.2 Greedy Stepwise Feature Selection Technique
  3.8 Creating Datasets for Classification
  3.9 Classification
    3.9.1 Naive Bayes Classification Algorithm
  3.10 Final Dataset for Statistical Analysis
  3.11 Analysis of Results Through Minitab
    3.11.1 Minitab
    3.11.2 Factorial Design Analysis

4 Conclusions
  4.1 Current Status
  4.2 Results Generalization
  4.3 Future Work

Bibliography

A User Manual
  A.1 Minitab Factorial Design Generation
List of Tables

3.1 Ns and Nm settings
3.2 Results of EM clustering
3.3 Results of DBScan clustering
3.4 Model and feature selection technique combination
3.5 Feature selection results
3.7 Models for classification
List of Figures

2.1 Sample for Ns = 64 from a music piece
2.2 Closest match found on waveform type 32
2.3 Histogram with Nm = 4 and Ns = 64
2.4 Histogram with Nm = 3 and Ns = 192
2.5 Histogram with Nm = 3 and Ns = 128
2.6 Architecture of the system

3.1 Weka GUI
3.2 EM clustering algorithm run in Weka
3.3 Models created after clustering profiles with two different clustering algorithm results
3.4 Best First feature selection on Model 1
3.5 Best First feature selection on Model 2
3.6 Greedy Stepwise feature selection applied on Model 1
3.7 Greedy Stepwise feature selection applied on Model 2
3.8 Classifier results with 70% training data and 30% test data
3.9 Final dataset
3.10 Residual graphs
3.11 Main effects plot
3.12 Output

A.1 Create new Minitab project
A.2 Select Stat, DOE, Factorial, Create Factorial Design
A.3 Create Factorial Design
A.4 Generate full factorial design
A.5 Select factors
A.6 Select factors
A.7 Go to Analyze Factorial Design
A.8 Results
Chapter 1
Introduction
1.1 Background
Indian Classical Ragas (ICR) are the most ancient form of music in India and the basis
of any music produced there. Any music created in India is related to or based on Indian
Ragas in some way. These Ragas have been listened to and cherished by millions of
people around the globe for centuries. Ragas are associated with different times of the
day, month, and even year: there are Ragas for different seasons and for different times
of the day, so the Ragas listened to in the daytime differ from those listened to in the
evening or at night. Each Raga has certain moods attached to it, and the time and place
where a Raga is played are among the most important aspects of listening to it. Around
one hundred and fifty different Ragas exist in Indian classical music, and the majority
of them are famous and enjoyed by most listeners. The literal meaning of Raga in
Sanskrit (an Indian native language) is the process of coloring someone's mind or heart.
The emotions attached to a Raga are associated with its notes. Music generated in the
Indian subcontinent consists of two main components, the Raga and the Taal. Each Raga
is made up of at least six notes played in such a way that changing its construction can
completely change the emotion attached to it: longing, love, desire, affection, joy,
sorrow, happiness, and so on. Ragas are divided into different groups based on jati,
thaat, and time and season. Jatis are usually referred to in groups of two; the Jatis
used commonly in a number of Ragas are Audava-Shaudava, Shaudava-Sampurna, Sampurna,
etc. Bollywood, the Indian subcontinent's counterpart of Hollywood, creates all of its
music based on Ragas, and more than a quarter million songs are created every year.
1.2 Problem Description and Motivation
Ragas can be listened to according to mood, but there should be a system that recommends
music based on that mood. Research in this field is very active nowadays; this project
takes a raga recommender system as its baseline and moves forward to use feature
selection techniques to obtain more effective results. The Raga recommender system
classifies ragas based on moods. The recommender uses clustering algorithms such as
K-means and the EM algorithm to group Ragas into different categories, then adds sixteen
other features, such as tempo and frequency, to classify those Ragas based on mood. The
scope of this project is to apply feature selection methods on the attributes in order
to widen the scope of the recommendation system to other music and to improve the
accuracy with which Ragas fall into their respective classes. Feature selection refers
to selecting the most important features required for the most accurate classification
of the data; equivalently, it is the technique of discarding the attributes in the
dataset that are not required to make a classification decision.

This research area of music recommendation is very active these days, as the
commercialization of music requires recommending new pieces of music to users. The data
used in this research relate to components of music such as tempo, frequency, and tonic
frequency, along with other features specific to Ragas; when classifying, we need an
appropriate number of these features to classify the Ragas.

This research will analyze data and find relations among different Ragas based on moods.
The project proposes to find a relation between models created with different numbers of
features used to classify the Ragas in the Raga recommender system. Its outcome will
help demonstrate the importance of feature selection methods in producing more effective
results. For this purpose we will take a number of Ragas, extract excerpts ranging
between 12 and 25 seconds, and cluster them. We will then add different features to
classify them based on mood. Classification based on the different sets of selected
features will give us different models, and these models will be statistically compared
to find the effect of feature selection on the results.
1.3 Related work
A great deal of work has been done in this field of study, and recommending music based
on mood is an active area. Feature selection is also one of the key topics of this
research. Related work is summarized below.
Liu et al. (2002) show the significance of feature selection on classification results
[8]. The authors used six different feature selection heuristics and four different
classification approaches to compare classification results. The important observation
in their results is that the accuracy of all four classification algorithms improved by
a very high percentage.

The feature selection heuristics used were based on chi-square statistics, t-statistics,
and correlation-based feature selection methods. With these heuristics they tried to
filter out the attributes that were not useful for classification. The authors observed
that with fewer features they could achieve a lower error rate. The research also showed
that feature selection algorithms can lead to 100% accuracy in some cases with the SVM
and k-NN algorithms; the classifiers gave perfect results in cases where a very small
set of features was used for classification. The authors concluded that datasets contain
features that can easily help in classifying the subject, but using extra features can
lead to confusing results.
Piramuthu (2004) has studied different feature selection techniques to find out their
effects on the learning process and his research has clearly shown that poor selection of
features can lead to a poor learning process [11]. If the wrong set of features is used to
make a system learn, it will lead to greater inaccuracies in the testing phase and expected
outcomes would not be achieved. The research showed that the induced decision trees
need well-selected features for the learning process, as the wrong set of features could
lead to results that are not very definitive. The author observed that the wrong set of
features in some cases can slow down the learning process and also result in memory
issues for large-scale processes. The number of attributes can be very high and trees
generated using large numbers of attributes could result in memory issues as well. This
situation also increases the cost of obtaining the data.
Silla et al. (2008) researched the impact of the feature selection process on labeling a
piece of music by genre [13]. They used a genetic-algorithm-based feature selection
technique to select features from a large set of attributes. They found feature
selection essential in this research because the significance of the same features
differed depending on which part of the feature vector, built from the music signals,
they were extracted from. The authors noted that some features were part of all the
available feature vectors, but at some points their presence was the deciding factor for
classification; selecting those features for those kinds of segments was an important
task. In their research they found that the J48, k-NN, and Naive Bayes classifiers
benefited most from feature selection, while SVM did not show any improvement.
Vatolkin et al. (2011) discussed the impact of feature selection on music classification
[14]. Unlike regular feature selection algorithms, they used multi-objective feature
selection techniques. They found that when the objective of feature selection is not
only achieving high accuracy, an increase in the breadth of the music selection can also
affect the classification process. They also found that the selection of features
depends heavily on the model used for classification or clustering. This idea of
multi-objective feature selection is useful in cases involving user interaction: input
from users can help change the features used for classification, and results can become
more fine-grained as the number of feature selection objectives increases.
Doraisamy et al. discussed several feature selection techniques for automatic genre
classification of traditional Malay music [5]. In their research they compared five
different filter-based feature selection techniques to improve classification results:
correlation-based feature selection, Principal Component Analysis (PCA), gain ratio
attribute evaluation, chi-square feature evaluation, and Support Vector Machine feature
elimination. They mentioned that wrapper-based feature selection methods could generate
better results in their case than filter-based techniques, but they used filter-based
methods because wrapper-based feature selection can be prone to error for data with a
large number of attributes. The results showed that feature selection drastically
improved the classification output; other than PCA, all four remaining feature selection
techniques improved the classification results to a significant extent.
Pickens surveys multiple feature selection techniques mainly used in ad hoc music
retrieval [10]. The survey takes into account various music categories: monophonic,
homophonic, and polyphonic. He divides music into these three categories and identifies
feature selection methods for each of them, dividing the features into shallow structure
and deep structure. Shallow-structure methods comprise simple statistical feature
selection methods performed on the pitch and tone of the audio. Deep-structure
techniques are more complicated and use detailed music theory, artificial intelligence,
and complex statistical analysis methods. The paper explains the differences between the
three types of music mentioned above and addresses the challenges met during feature
selection for each type. For example, polyphonic music is more complex than monophonic
music in that it has more dimensions: different pitches can be played together in the
audio. One way this complication is overcome is monophonic reduction, used to analyze
the music better: the melody of the music is extracted and at most one note is used for
each time interval.
In another paper, Pickens concentrates on feature selection for polyphonic music [9]. As
explained earlier, polyphonic music is complex in terms of information retrieval
compared to monophonic and homophonic music: it is difficult to identify the next note,
as many notes overlap and play simultaneously in multiple dimensions. Of the numerous
methods available for dealing with overlapping notes, the author uses monophonic
slicing, selecting at most one note out of the multiple overlapping notes in each time
step. He also proposes an extension to the existing homophonic slicing technique that
provides improved feature retrieval, and proposes further testing of the features by
creating collections of audio files.
Fiebrink et al. discuss the accuracy of past feature selection techniques and study the
problem of overfitting [6]. They suggest that traditional music feature selection
methods exaggerate accuracy, and they provide various test methods to analyze the impact
of feature selection, particularly on music. They examine wrapper feature selection
methods and test their benefits. Wrapper feature selection creates a training set using
only the candidate feature subsets and tests it for classification; examples of wrapper
methods are forward selection and genetic algorithms. They also suggest that the
cross-validation accuracy used to measure the success of these methods is not a correct
measure, as classification accuracy can differ greatly when classifying new data. Using
a new dataset, they compared classification accuracy with and without feature selection
and found that most of the results were almost equal: for audio genre classification,
feature selection methods provided a small increase in accuracy but not a large
difference.
Hamad et al. focus on music processing techniques used to retrieve information from
music, and compare these techniques to identify statistical and Linear Predictive Coding
features and their quality [1]. They applied various music information retrieval
techniques to extract features such as Fast Fourier Transform coefficients, Linear
Predictive Coding features, and pitch. The results were then compared with those
obtained using wavelet coefficients, and it was found that wavelet coefficients provided
features of much greater quality. They created a dataset with 216 entries derived from
18 mono audio musical excerpts. They observed that most music information retrieval
methods work better on music with a low sampling rate, and they preferred Euclidean
distance over edit distance. After comparing the various feature extraction techniques,
the Haar wavelet approximation technique showed the highest accuracy.
Schmidt et al. address the problem of the constantly changing, dynamic nature of music
[12]. They compare the human perception of music with quantitative features of music and
identify a range of acoustic features. They perform various experiments on music
involving classification and regression of the emotions in the music, taking into
account a number of features such as timbre, loudness, and harmony and measuring how
they change over time. They use support vector machines for classification, and least
squares and support vector regression for regression. It was found that regression
provided better accuracy and more promising results than classification, while chroma
and spectral shape did not perform particularly well. Even the traditional regression
method fared well, and they plan to use separate, new datasets with new acoustic
features to further test their results.
Herrera et al. focus on feature selection techniques for identifying drum sounds [7].
They use a dataset comprising more than six hundred sounds, including kick, snare, toms,
and cymbals. Fifty descriptors were identified, and after applying multiple feature
selection techniques, twenty features were selected. They used three classification
techniques: k-nearest neighbors, canonical discriminant analysis, and C4.5. To select
relevant descriptors, correlation-based feature selection and the ReliefF method were
used. They found that all three classification techniques provided very similar results
and concluded that if the initial set of features was good, near-optimal subsets could
be identified by various wrapper or filtering techniques. Some of the best features were
spectral descriptors such as skewness, kurtosis, and centroid.
1.4 Hypothesis
The research done by Chakraborty (2011) focused on clustering Indian Ragas based on
moods. The aim was to cluster Ragas using a large number of features, and the results
were very clean. My hypothesis is that the use of feature selection algorithms can
improve and generalize the results to a significant extent. I worked through my research
in two phases. In the first phase I recreated the results Chakraborty (2011) produced,
looking for the parameters that had yielded such fine-grained results in that research.
In the second phase I applied feature selection techniques to create different models
from Chakraborty's research [3], using a set of feature selection algorithms on the
hooks used by Chakraborty (2011). The different models are generated with different sets
of selected features and with data mining techniques for classification and clustering.
These results are statistically studied and compared to verify the expected outcome.
1.5 Roadmap
The road map for this report is as follows. Chapter 2 discusses the design and
implementation of the project. Chapter 3 describes the feature selection and data mining
techniques used to cluster and classify Indian Ragas based on mood; the analysis of
results is also done there. Chapter 4 concludes the results and the project.
Chapter 2
Solution
2.1 Design
1. Selection of Ragas and extracting short excerpts from the files
• Mp3 files are converted into wav format using an mp3-to-wav converter, and a
short excerpt of 5-10 seconds containing the most repetitive and important part
of the song is extracted using the free Audacity software. This excerpt is
called a hook.
• The music that Chakraborty (2011) used to classify Indian Ragas into various
moods is used here [3]. Thirty-two different Ragas are used and classified into
various mood categories. These music files were downloaded for research purposes
from www.paragchordia.com and collected in mp3 format.
2. Creation of Profiles
• The same procedure is followed here as used by Chakraborty (2011) to create
profiles. Chakraborty (2011) used the D-Transform Anomaly Detection/Classification
algorithm (developed by Glenn (2009)) to create the profiles [4].
• Using the profiles generated by the above-mentioned algorithm, an array of
32 variables is obtained. Each of the 32 variables corresponds to one CCO type,
where a CCO type is a type of waveform in the CCO (Combined Chaotic Oscillation)
matrix, a library of 32 waveform types (numbered 1-32) used to find the closest
match for the profiles.
3. Selection of feature subsets
• This step is the focus of this research. Here, various subsets of the features,
along with the profiles generated in step 2, are used to create various models,
and data mining techniques are implemented to classify this data into different
moods. In this study, best first and greedy forward selection techniques are
used.
4. Recreating results found by Chakraborty (2011)
• In this step the same environment is created as used by Chakraborty (2011)
in her study, and we have tried to recreate the same results. The aim of this
step is to detect any leakage in the study, since the results generated by
Chakraborty (2011) were very clean.
5. Comparing models with different feature selection settings
• Different models are created with different settings using the best first and
greedy forward feature selection techniques, and they are compared to the model
created in step 4. These models are in turn compared against each other as well,
to see the effect of the feature selection methods on the results.
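The greedy forward (stepwise) selection mentioned in steps 3 and 5 can be sketched as follows. This project used Weka's implementations; the pure-Python version below is only an illustration of the idea, and the feature names and toy scoring function are invented for the example.

```python
def greedy_forward_select(features, score):
    """Greedy forward (stepwise) selection: start with an empty subset,
    repeatedly add the single feature that most improves the score, and
    stop when no remaining candidate improves it."""
    selected, best_score = [], float("-inf")
    while True:
        candidates = [f for f in features if f not in selected]
        if not candidates:
            break
        scored = [(score(selected + [f]), f) for f in candidates]
        top_score, top_f = max(scored)
        if top_score <= best_score:
            break                     # no candidate improves the subset
        selected.append(top_f)
        best_score = top_score
    return selected

# Toy scorer: only 'tempo' and 'tonic_freq' are informative here;
# every other feature slightly hurts the score.
useful = {"tempo": 2.0, "tonic_freq": 1.5}
score = lambda subset: sum(useful.get(f, -0.1) for f in subset)
print(greedy_forward_select(["tempo", "frequency", "tonic_freq", "jati"], score))
# ['tempo', 'tonic_freq']
```

In a real run the scorer would be a subset-evaluation measure (e.g. cross-validated accuracy), which is what makes the stopping criterion meaningful.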
2.2 DYNAMAC Algorithm
DYNAMAC is short for Dynamics-based Algorithmic Compression. The concept behind the
DYNAMAC algorithm is to identify data that represent a segment of a digital sequence
such that these data can be used to form a matching chaotic oscillation [2]. The idea is
to generate a profile that is smaller in size than the digital segment and represents
the sequence in the form of waveforms. These waveforms can then be compared and matched
against an initial set to discover patterns.
The DYNAMAC algorithm takes the hooks of ragas as input to generate a profile that holds
information about the raga music piece. A hook is a small segment of a music piece that
recurs in the raga sample and can be used to recognize the raga. To create a hook, a
repetitive pattern is identified in the music piece and extracted using Audacity. The
DYNAMAC algorithm then uses these hooks and generates a profile for each. These profiles
are then used for clustering the ragas.
The algorithm takes the values of Ns and Nm, together with the hook in wav format, as
input, and divides the hook into a number of samples. Nm is a constant multiplier and Ns
is the segment length in samples; each hook is broken into segments of length Ns. To
find the total number of samples in a file, the length of the file in seconds is
multiplied by the 44.1 kHz sample rate. For instance, a hook that is 5 seconds long in
wav format is broken into 5 * 44,100 = 220,500 samples.
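This arithmetic and segmentation can be sketched as follows (a minimal illustration; it assumes segments are taken back-to-back and that a trailing partial segment is discarded, which the report does not specify):

```python
SAMPLE_RATE = 44_100  # 44.1 kHz, as stated in the report

def total_samples(duration_sec):
    """Number of audio samples in a hook of the given duration."""
    return int(duration_sec * SAMPLE_RATE)

def segments(samples, ns):
    """Break a sample list into consecutive segments of length ns;
    a trailing partial segment is dropped in this sketch."""
    return [samples[i:i + ns] for i in range(0, len(samples) - ns + 1, ns)]

n = total_samples(5)                       # a 5-second hook
print(n)                                   # 220500
print(len(segments(list(range(n)), 64)))   # 3445 full segments of length 64
```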
Figure 2.1 shows the music piece broken into a number of samples. The sample length
here is 64; this small sample is what the algorithm uses and compares against the CCO
matrix.
The Combined Chaotic Oscillation (CCO) matrix has a predefined set of 32 waveforms; each
waveform type has 2562 points. These waveforms can be compared to the input waveform in
order to find a match within an accepted error tolerance. The closest match can be
regarded as the set of values that represents the segment in compressed form. Each
sample derived from the hook is used to search through the CCO matrix to identify the
match with the least error. This match gives the value of Nt, the type number in the CCO
matrix to which the match belongs. The length of the matched piece is represented as Nc,
equal to the current multiplier times Ns (at most Nm * Ns). The values of Ns and Nm
determine how each waveform type in the CCO matrix library is traversed. For instance,
if the values of Ns and Nm are 64 and 4 respectively, the algorithm considers sample
sizes of 64 (64*1), 128 (64*2), 192 (64*3), and 256 (64*4).

Figure 2.1: Sample for Ns = 64 from a music piece
Since the value of Nc depends on Ns and Nm, the length Nc changes in each iteration as
the multiplier changes. The iterations start with a multiplier of 1; in the next
iteration the multiplier is 2 and hence the length is longer. However, to compare it
with the sample extracted from the hook, the waveform is rescaled. This can be explained
using the values Ns = 64 and Nm = 4. The first iteration, with a multiplier of 1, finds
the starting position and identifies the length Nc as 64. In the second iteration, with
a multiplier of 2, the length is 64*2 = 128; to compare this with the 64-point sample
from the song, the waveform is rescaled to 64 points. Similarly, two more waveforms are
generated for multipliers 3 and 4, of lengths 192 and 256 respectively, which are
rescaled to 64 for comparison. In this manner, for every iteration the number of points
remains the same but different waveform shapes are generated.
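The search described above can be sketched in Python. This is an illustrative reconstruction, not the actual DYNAMAC code: the tiny two-type "CCO library", the sum-of-squared-differences error measure, and the linear rescaling are all assumptions made for the example.

```python
def rescale(wave, n):
    """Linearly resample `wave` to n points."""
    if len(wave) == n:
        return list(wave)
    step = (len(wave) - 1) / (n - 1)
    out = []
    for i in range(n):
        x = i * step
        lo = int(x)
        hi = min(lo + 1, len(wave) - 1)
        frac = x - lo
        out.append(wave[lo] * (1 - frac) + wave[hi] * frac)
    return out

def closest_match(sample, cco, ns, nm):
    """Search every CCO waveform type and every start position for the
    stretch (of length m*Ns, m = 1..Nm, rescaled back to Ns points) that
    best matches `sample`. Returns (error, Nt, Ni, Nc)."""
    best = None
    for nt, wave in enumerate(cco):
        for m in range(1, nm + 1):
            nc = ns * m                           # candidate length Nc
            for ni in range(len(wave) - nc + 1):  # start position Ni
                cand = rescale(wave[ni:ni + nc], ns)
                err = sum((a - b) ** 2 for a, b in zip(sample, cand))
                if best is None or err < best[0]:
                    best = (err, nt, ni, nc)
    return best

# Toy library: type 1 contains the sample exactly at offset 2.
cco = [[5] * 16, [0, 0] + list(range(8)) + [0] * 6]
print(closest_match(list(range(8)), cco, ns=8, nm=2))  # (0, 1, 2, 8)
```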
Figure 2.2: Closest match found on waveform Type 32
The figure above shows how the sample is compared to each waveform type; once the match
is found, its location is identified. In this case, the match is found for waveform type
32 at position Ni.

The process is repeated for each sample derived from the hook. The algorithm then
generates a compressed file containing sequences of bits that represent information
about the song, such as the type of waveform, the position in the CCO matrix, the number
of channels, etc. The waveform types can be used to create a histogram for further
analysis, by identifying the type of each match and counting how many fall into each
type.
The next step is to use the compressed file generated by DYNAMAC to derive a profile
that can be used to cluster the ragas. For this, the Anomaly Detection/Classification
algorithm is used: a MATLAB program reads each of the .dya files generated by DYNAMAC
and creates a new file with a single row of 32 columns. Each column holds the number of
matches for one waveform in the CCO matrix; since the CCO matrix has 32 waveforms, the
file has 32 columns. This 1x32 matrix can be used to generate a histogram showing how
the samples are spread across the CCO matrix, i.e., how the different samples fall into
each of the 32 waveform types. By changing the values of Ns and Nm, different samples
can be generated; the idea is to find a combination that results in an evenly
distributed histogram. The figures below show histograms generated for different
combinations of Ns and Nm for the same music piece.
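Building that 1x32 row from the per-segment match types is a simple counting step; a Python sketch (the MATLAB program in the report does the equivalent from the .dya files):

```python
from collections import Counter

def profile_row(match_types, n_types=32):
    """Build the 1x32 profile row: for each CCO waveform type (1..32),
    count how many hook segments matched that type."""
    counts = Counter(match_types)
    return [counts.get(t, 0) for t in range(1, n_types + 1)]

# e.g. four segments whose closest matches were types 32, 5, 5, 17:
row = profile_row([32, 5, 5, 17])
print(len(row), row[4], row[16], row[31])  # 32 2 1 1
```

Plotting this row as a bar chart gives exactly the histograms shown in Figures 2.3-2.5.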
Figure 2.3: Histogram with Nm = 4 and Ns = 64.
The comparison of the histograms in Figures 2.3, 2.4, and 2.5 shows that a smaller value
of Ns and a larger value of Nm give a more uniformly distributed histogram, which
implies that the matches found in the CCO matrix are evenly spread across the waveform
types.
To generate the profiles, multiple machines were used to shorten the overall duration. It was observed that the processor affects the time taken for profile generation: machines with i7 processors executed the algorithm faster than those with dual-core processors. The duration was also influenced by the length of the music piece.
Figure 2.4: Histogram with Nm = 3 and Ns = 192.
Shorter music pieces took much less time than longer ones. Different combinations of Ns and Nm were used: for some music pieces Ns = 256 and Nm = 3, for others Ns = 128 and Nm = 3, and for most files Ns = 64 and Nm = 4. The higher the value of Ns, the longer it took to generate the profile.
These profiles were then used in the next stage to create the dataset for data mining
of the ragas using different clustering algorithms.
Figure 2.5: Histogram with Nm = 3 and Ns = 128.
2.3 Architecture
Figure 2.6: Architecture of the system
Chapter 3
Results
This chapter walks through the environment setup and the generation of results. The profiles generated in Chapter 2 are clustered with two different clustering algorithms. The results of these two clustering algorithms, along with the values of Nm and Ns and 14 other features, are added to the profiles. Two feature selection algorithms are then applied to this dataset to create the final dataset, which is used to classify ragas based on moods. These settings are statistically compared to find the best setting for classifying ragas based on moods.
This chapter also discusses the algorithms used: Expectation Maximization, DBScan, and the Best First and Greedy Stepwise feature selection techniques.
3.1 Creation of Initial Dataset
In this phase the dataset is created for initial clustering setup. There are 32 attributes of
CCO matrix and to those 32 attributes generated three new attributes are added. These 3
new attributes will create the initial dataset. In this initial dataset there are 35 attributes.
• 32 features generated by CCO matrix
• Value of Ns
• Value of Nm
19
Ns Nm64 464 6
128 4128 6
Table 3.1: Ns and Nm settings
• Name of the Raga
Different values of Ns and Nm are used to create profiles, out of 7 different values of
Ns and Nm 2 discrete values of Ns and Nm are used here.
The profiles are created with these values of Ns and Nm. Clustering algorithms are
used on these values.
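Assembling the 35-attribute rows can be sketched as below. The dictionary input format and the `cco_N` attribute names are assumptions for illustration, not the actual file layout used in the project:

```python
import csv
import io

def initial_dataset_rows(profiles):
    """Build the 35-attribute initial dataset: 32 CCO match counts,
    Ns, Nm, and the raga name, one row per profile."""
    header = ["cco_%d" % i for i in range(1, 33)] + ["Ns", "Nm", "Raga"]
    rows = [header]
    for raga, (counts, ns, nm) in profiles.items():
        assert len(counts) == 32, "expected one count per CCO waveform"
        rows.append(list(counts) + [ns, nm, raga])
    return rows

# Toy profiles with flat count vectors, for illustration only.
profiles = {"Yaman": ([3] * 32, 64, 4), "Todi": ([2] * 32, 64, 6)}
rows = initial_dataset_rows(profiles)
buf = io.StringIO()
csv.writer(buf).writerows(rows)  # a CSV that Weka can import
```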
3.2 Data Mining Tool
3.2.1 Weka
Weka is a free, widely used tool for performing data mining operations on datasets, including pre-processing, clustering, classification, and attribute selection. It is a Java-based tool and works independently of the platform. It can be used from the command prompt and also has a simple GUI, so the learning curve is gentle. The detailed results produced by the tool help in understanding the data and the outcome of the operations performed on it. It ships with a large number of data mining algorithms out of the box, and results are visualized with self-explanatory plots and graphs.
Figure 3.1: Weka GUI
3.3 Clustering Algorithms
Clustering is a data mining technique used to group similar instances in a dataset in order to find patterns among them. Two different clustering algorithms are used here to group ragas: Expectation Maximization, which is a distance-based algorithm, and DBScan, which is a density-based algorithm.
3.3.1 Expectation Maximization (EM) Algorithm
Expectation Maximization, commonly known as EM, is a simple distance-based clustering algorithm that uses an iterative approach to find a maximum likelihood estimate. The algorithm works well even with incomplete data. The maximum likelihood is calculated from estimates of the model variables, so that similar data points have more affinity towards each other.
In this experiment a cross-validation process is used to find the number of clusters. The initial number of clusters is taken as 1. After selecting the starting number of clusters, the next step is to select the number of iterations for which the algorithm will run. The standard number of folds for cross-validation is 10, and this experiment also uses 10 folds; this decides the likelihood of an instance clustering itself with another instance of the dataset, and 10 iterations are used to find that likelihood. If the log likelihood increases in the following iterations, 2 is added to the number of clusters. This process is repeated until the log likelihood becomes stable.
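The cluster-count search can be sketched as follows. This is a simplified stand-in, not Weka's implementation: `cv_loglikelihood` scores held-out points by distance to crude quantile "centres" rather than fitting a real mixture model, and the search adds one cluster at a time instead of two:

```python
import random

def cv_loglikelihood(data, k, folds=10):
    """Stand-in for EM's 10-fold cross-validated log likelihood:
    each fold is scored by the (negative) distance of held-out
    points to the nearest of k centres fitted on the rest."""
    score = 0.0
    for f in range(folds):
        train = sorted(x for i, x in enumerate(data) if i % folds != f)
        test = [x for i, x in enumerate(data) if i % folds == f]
        # crude "fit": k evenly spaced quantiles as cluster centres
        centres = [train[int((j + 0.5) * len(train) / k)] for j in range(k)]
        score -= sum(min(abs(x - c) for c in centres) for x in test)
    return score

def choose_clusters(data, max_k=10):
    """Start with one cluster and keep adding clusters while the
    cross-validated score still improves, then stop ("stable")."""
    k, best = 1, cv_loglikelihood(data, 1)
    for k2 in range(2, max_k + 1):
        nxt = cv_loglikelihood(data, k2)
        if nxt <= best:
            break
        k, best = k2, nxt
    return k

# Two well-separated groups: the search should settle on k >= 2.
random.seed(42)
data = ([random.gauss(0, 0.3) for _ in range(50)] +
        [random.gauss(10, 0.3) for _ in range(50)])
k = choose_clusters(data)
```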
3.3.2 DBScan (Density-Based Spatial Clustering of Applications with Noise) Clustering Algorithm
DBScan is a density-based clustering algorithm. The density distribution of the points in the dataset is calculated, and clusters are created based on that distribution. The algorithm is widely used for quantitative data and is used in this study after examining the nature of the data. DBScan starts from a random point in the dataset, looks for points of similar density one at a time, and clusters together points with the same density distribution. If a point does not have a density similar to any existing group of like-density points, it is discarded as noise.
In this experiment DBScan is used to find clusters of ragas. The same dataset used for EM clustering is provided to the clustering tool to find the density-based clusters.
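A minimal one-dimensional DBScan sketch, to make the neighbour-growing and noise-labelling concrete. Real use would be on the 32-column profiles with a proper distance measure; the `eps` and `min_pts` values here are illustrative:

```python
def dbscan(points, eps, min_pts):
    """Minimal DBScan on 1-D points: a point with at least min_pts
    neighbours within eps seeds a cluster, the cluster grows through
    other core points, and points never absorbed stay as noise."""
    def neighbours(i):
        return [j for j in range(len(points))
                if abs(points[i] - points[j]) <= eps]

    labels, cluster = {}, 0
    for i in range(len(points)):
        if i in labels:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:
            labels[i] = "Noise"      # provisional: may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels.get(j) == "Noise":
                labels[j] = cluster  # border point rescued from noise
            if j in labels:
                continue
            labels[j] = cluster
            jn = neighbours(j)
            if len(jn) >= min_pts:   # j is also a core point: keep growing
                queue.extend(jn)
    return [labels[i] for i in range(len(points))]

# Two dense groups and one isolated point, which ends up as noise.
labels = dbscan([0.0, 0.1, 0.2, 5.0, 5.1, 5.2, 9.9], eps=0.5, min_pts=2)
```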
3.4 Clustering Results
3.4.1 EM Clustering
EM clustering in Weka is used to find the initial clusters of ragas. The settings shown in Table 3.1 are used to cluster the initial dataset of ragas. Four different sets of profiles are created, and the results for these profiles are recorded and compared to the results of Chakraborty (2011) [3].
Five clusters were recorded for these values, using a seed value of 42. The results obtained are averaged to find the overall results. The dataset contained a total of 32 ragas, clustered as shown below. Different values of Ns and Nm are used to find the results; the setting Ns = 64 and Nm = 4 gave the best results, and the other settings, while not perfect, are very close to the expected results when compared to Chakraborty (2011).
Table 3.2 shows the initial clustering results on the initial dataset for the different settings of Ns and Nm. These results are used as a new attribute alongside the other fourteen attributes used by Chakraborty (2011), described later in this report in the final dataset creation section.
Table 3.3 shows the results of the DBScan algorithm on the same initial dataset created from the profiles.
Ragas                  Ns=64, Nm=4   Ns=64, Nm=6   Ns=128, Nm=4   Ns=128, Nm=6
Bhageshri              1             4             2              5
Bhaairavi              1             4             2              5
Kaushi Bhairavi        1             4             2              5
Puriya Dhanshri        1             4             2              2
Todi                   1             3             2              5
Shree                  1             4             2              5
Darbari Kanada         2             2             2              3
Gaud Malhar            2             2             1              3
Kedar                  2             2             1              3
Khamaj                 2             2             1              3
Marwa                  2             2             1              3
Multani                2             3             1              3
Rageshri               2             2             1              3
Bhimpalasi             3             4             3              3
Bilaskani Todi         3             1             3              4
Kaushi Kanhra          3             1             3              4
Malkauns               3             1             3              4
Desh                   3             1             5              4
Tilakkamod             4             5             4              1
Ahir Bhairav           4             5             4              1
Bhatiyar               4             5             4              1
Bihag                  4             5             4              1
Gaud Sarang            4             5             5              1
Khamaj and Raagmala    4             5             4              1
Komal Rishabh Asavari  4             5             5              1
Bhairav                5             1             5              2
Darbari                5             3             1              2
Hameer                 5             3             5              2
Jaijaiwante            5             3             5              2
Jaunpuri               5             3             5              4
Maru Bihag             5             3             5              2
Yaman                  5             3             5              2
Table 3.2: Results of EM clustering
Figure 3.2: EM Clustering algorithm ran in Weka
Ragas                  Ns=64, Nm=4   Ns=64, Nm=6   Ns=128, Nm=4   Ns=128, Nm=6
Bhageshri              1             3             5              1
Bhaairavi              4             Noise         5              Noise
Kaushi Bhairavi        Noise         4             1              1
Puriya Dhanshri        1             Noise         5              1
Todi                   1             3             Noise          1
Shree                  4             4             3              3
Darbari Kanada         Noise         Noise         3              Noise
Gaud Malhar            2             Noise         3              Noise
Kedar                  Noise         2             3              Noise
Khamaj                 1             1             Noise          3
Marwa                  2             2             3              3
Multani                1             3             4              3
Rageshri               2             Noise         4              3
Bhimpalasi             Noise         3             4              Noise
Bilaskani Todi         3             1             Noise          2
Kaushi Kanhra          3             Noise         4              2
Malkauns               Noise         1             4              2
Desh                   4             1             5              2
Tilakkamod             4             5             5              2
Ahir Bhairav           4             2             5              5
Bhatiyar               3             5             Noise          5
Bihag                  4             2             4              5
Gaud Sarang            Noise         5             5              1
Khamaj and Raagmala    4             Noise         Noise          5
Komal Rishabh Asavari  4             5             5              Noise
Bhairav                5             1             5              1
Darbari                Noise         3             1              1
Hameer                 Noise         3             Noise          Noise
Jaijaiwante            5             3             Noise          1
Jaunpuri               5             3             1              1
Maru Bihag             5             Noise         1              Noise
Yaman                  5             Noise         1              1
Table 3.3: Results of DBScan clustering
3.5 Adding potential Attributes to Initial Dataset
The clustering results generated in section 3.4 are used with other 14 attributes of Ragas
to create a dataset. These 14 attributes were used by Chakraborty (2011) in her research
to attach moods to the Ragas. In this experiment feature selection algorithms are used
to select only the important features, which are significant to classify Ragas based on
various moods. Ragas name, Name of the file, Name of the Artist, Instrument used and
Tonic frequency features are taken from pragchordia.com [7]. Raga Guide is another
source for the remaining features, 9 other features are taken from this source. The total
of 16 features is described as follows.
• Name of the File: The name of the song file used in the dataset.
• Name of the Raga: The name of the Raga on which the song is based.
• Name of the Artist: The name of the artist who sang the song.
• Tonic frequency: Tonic frequency is described as the inverse of the number of times the vocal chords of an instrument displace to let the sound pass through them; basically, it is the inverse of the tonic cycle. Each Raga can have multiple tonic frequencies, and the experiments usually use their average.
• Average tonic frequency: The average of all possible tonic frequencies a particular Raga can consist of, which gives a better range for understanding the tonic frequency of a Raga. For example, raga Yaman has 3 different tonic frequencies, and their average is 282.142.
• Thaat: Thaat is described as the order in which notes are played to produce a particular type of music; changing this order changes the nature of the Raga.
• Time: This feature records the time of day at which a particular Raga is listened to. For this experiment the time of day is converted into 10 numerical representations: early morning is 1, midnight is 2, morning is 3, and so on.
• Group: Ragas in Indian classical music are grouped on the basis of the time of day into 2 groups, 12 am to 12 pm and 12 pm to 12 am. Ragas sung in the day are called Poorva ragas and their night counterparts are called Uttar ragas.
• Most Significant Note: A Raga consists of a number of notes; the note that can change the Raga is the most significant note, called the Vadi. Each Raga has one such note.
• Second Most Significant Note: The same as the feature above, except that here the second note is significant for Raga identification.
• Thaat Feature: This feature refers to the attributes present in a thaat. These attributes are converted to numerical values for data mining purposes, as they are used in numerical representation like Western notes.
• Jaati: Jaati usually refers to the group to which a particular Raga belongs, calculated mathematically from the frequency of ascending notes used in a continuous sequence of the vocal music. Jaati is a very important feature in the identification of Ragas; some Raga jaatis are Audava-Sampurna, Shaudava-Shaudava, etc. They are also converted to numerical form for the music mining experiments.
• Thaat Notes: The notes on which thaats are based are called thaat notes. It is basically an ordered sequence of Indian classical notes such as Sa, Re, Ga, Ma, Pa, Dha, Ni, and Sa; as the order changes, the Raga changes.
• Clustering Results of EM: The cluster values assigned to the Raga profiles by the Expectation Maximization algorithm; 5 clusters were created based on the expected maximization of likelihood of the clustered profiles.
• Clustering Results of DBScan: The cluster values assigned to the Raga profiles by the DBScan algorithm.
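The numeric conversion of the categorical features can be sketched as follows. The specific code tables are illustrative assumptions — only the first three time-of-day codes are stated in the text:

```python
# Illustrative code tables; the report states early morning = 1,
# midnight = 2, morning = 3, with ten time slots in total.
TIME_CODES = {"Early Morning": 1, "Midnight": 2, "Morning": 3}
GROUP_CODES = {"Poorva": 1, "Uttar": 2}

def encode(raga):
    """Turn one raga's categorical features into the numeric row the
    data mining step expects; unknown categories raise KeyError."""
    return [TIME_CODES[raga["time"]],
            GROUP_CODES[raga["group"]],
            float(raga["avg_tonic_frequency"])]

row = encode({"time": "Early Morning", "group": "Uttar",
              "avg_tonic_frequency": "282.142"})
```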
3.6 Creation of Models based on Clustering
After considering these 16 features as used by Chakraborty (2011), feature selection techniques are applied to the dataset to select the optimal features.
Model     Feature Selection Algorithm
Model 1   Best First
Model 2   Greedy Stepwise
Model 3   Best First
Model 4   Greedy Stepwise
Table 3.4: Model and feature selection technique combination
Two different models are generated with two different settings: in the first, the clustering results of the EM algorithm are included with the other 14 above-mentioned features; in the second, the clustering results of the DBScan algorithm are included with the other 14 features. Each of these two models is then run with two feature selection techniques: Best First and Greedy Stepwise.
Figure 3.3: Models created after clustering profiles with two different clustering algorithm results
3.7 Applying Feature Selection Techniques
Two different feature selection techniques are used in this experiment: Best First and Greedy Stepwise.
3.7.1 Best First Feature Selection Technique
The Best First feature selection algorithm uses the best-first search model. A heuristic evaluation function estimates the significance of an attribute or feature, based on the information obtained so far and the desired end result.
Figure 3.4: Best First Feature Selection on Model 1
After running Best First feature selection, 8 out of 15 features are selected for Model 1 and Model 2: FileName, AvgTonicFrequency, Thaat, Time of the day, Group, SecondMostImpNote, Jaati, and the cluster values. It has been noted that features like FileName, Name of the Raga, and Artist name can behave as leakers in the data, and the presence of leakers can tamper with the learning of the model. To avoid that problem these features are removed from the data. The final selected features are AvgTonicFrequency, Thaat, TimeOfDay, Group, SecondMostImpNote, Jaati, and the cluster values.
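The subset search behind this step can be sketched as a best-first wrapper. The `merit` function below is a toy stand-in for Weka's subset evaluator, and the stopping rule (a fixed number of non-improving expansions) is a simplification:

```python
import heapq

def best_first_select(features, evaluate, max_stale=5):
    """Best-first subset search in the Weka spirit: keep a priority
    queue of candidate feature subsets, always expand the most
    promising one, and stop after max_stale expansions bring no
    improvement.  evaluate() scores a subset (higher is better)."""
    start = frozenset()
    best, best_score = start, evaluate(start)
    heap = [(-best_score, sorted(start))]  # min-heap, so negate scores
    seen = {start}
    stale = 0
    while heap and stale < max_stale:
        _, subset = heapq.heappop(heap)
        subset = frozenset(subset)
        improved = False
        for f in features:
            if f in subset:
                continue
            child = subset | {f}
            if child in seen:
                continue
            seen.add(child)
            score = evaluate(child)
            heapq.heappush(heap, (-score, sorted(child)))
            if score > best_score:
                best, best_score = child, score
                improved = True
        stale = 0 if improved else stale + 1
    return sorted(best)

# Toy merit: reward two genuinely useful features, penalise size.
useful = {"Thaat", "Jaati"}
def merit(subset):
    return 2 * len(subset & useful) - 0.5 * len(subset)

chosen = best_first_select(["Thaat", "Jaati", "FileName", "Artist"], merit)
```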
Figure 3.5: Best First Feature Selection on Model 2
3.7.2 Greedy Stepwise Feature Selection Technique
Greedy stepwise algorithm implements the technique of steepest ascent technique. It
starts with a single random feature and increments to each feature and keeps on finding
solutions. If it finds a better solution with the new feature the change is made to the
solution. This goes on until no improvements can be found or if there is a decrease
in the evaluation. In this way it can start from any random attribute list and create a
ranking based on the order the attributes were selected. Not only does it add a best
feature during each iteration but it also removes the worst feature if required.
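A sketch of the forward-only variant of this search (the backward "remove the worst feature" step is omitted, and the `merit` function is again a toy stand-in):

```python
def greedy_stepwise(features, evaluate):
    """Greedy forward selection: start empty, at each step add the
    single feature that improves the score most, and stop when no
    addition improves the evaluation (steepest ascent stalls)."""
    selected, score = [], evaluate([])
    while True:
        gains = [(evaluate(selected + [f]), f)
                 for f in features if f not in selected]
        if not gains:
            break
        best_score, best_f = max(gains)
        if best_score <= score:
            break                    # no improvement: stop climbing
        selected.append(best_f)
        score = best_score
    return selected

# Toy merit: three useful features, one leaker-like distractor.
useful = {"Thaat", "Jaati", "TimeOfDay"}
def merit(subset):
    return 2 * len(set(subset) & useful) - 0.5 * len(subset)

chosen = greedy_stepwise(["FileName", "Thaat", "Jaati", "TimeOfDay"], merit)
```

The returned list doubles as a ranking: features appear in the order they were selected.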
Model 1 | Best First
  Features selected: FileName, AvgTonicFreq, Thaat, TimeOfDay, Group, SecondImpNote, Jaati, EM Clustering
  Features ignored / leakers: FileName
  Final features added to the model: AvgTonicFreq, Thaat, TimeOfDay, Group, SecondImpNote, Jaati, EM Clustering

Model 2 | Best First
  Features selected: FileName, AvgTonicFreq, Thaat, TimeOfDay, Group, SecondImpNote, Jaati, DBScan Clustering
  Features ignored / leakers: FileName
  Final features added to the model: AvgTonicFreq, Thaat, TimeOfDay, Group, SecondImpNote, Jaati, DBScan Clustering

Model 3 | Greedy Stepwise
  Features selected: FileName, Raag, TonicFrequency, Thaat, TimeOfDay, Feature, Group, MostImpNote, SecondMostImpNote, Jaati, Thaat Note, EM Clustering
  Features ignored / leakers: FileName, Raag
  Final features added to the model: TonicFrequency, Thaat, TimeOfDay, Feature, Group, MostImpNote, SecondMostImpNote, Jaati, Thaat Note, EM Clustering

Model 4 | Greedy Stepwise
  Features selected: FileName, Raag, TonicFrequency, Thaat, TimeOfDay, Feature, Group, MostImpNote, SecondMostImpNote, Jaati, Thaat Note, DBScan Clustering
  Features ignored / leakers: FileName, Raag
  Final features added to the model: TonicFrequency, Thaat, TimeOfDay, Feature, Group, MostImpNote, SecondMostImpNote, Jaati, Thaat Note, DBScan Clustering

Table 3.5: Feature selection results
Figure 3.6: Greedy Stepwise Feature Selection Techniques applied on Model 1
Figure 3.7: Greedy Stepwise Feature Selection Techniques applied on Model 2
3.8 Creating Datasets for Classification
Moods are added to the datasets as the class variable for classification. There are 16
different settings are used to run the results. These settings are shown in the following
table.
Ns    Nm   Model     Feature Selection Technique
64    4    Model 1   Best First
64    4    Model 1   Greedy
64    4    Model 2   Best First
64    4    Model 2   Greedy
64    6    Model 1   Best First
64    6    Model 1   Greedy
64    6    Model 2   Best First
64    6    Model 2   Greedy
128   4    Model 1   Best First
128   4    Model 1   Greedy
128   4    Model 2   Best First
128   4    Model 2   Greedy
128   6    Model 1   Best First
128   6    Model 1   Greedy
128   6    Model 2   Best First
128   6    Model 2   Greedy

Table 3.7: Models for classification
3.9 Classification
The dataset generated in Section 3.8 is classified with the Naive Bayes classification algorithm: 70% of the data is used to train the model and 30% to test it, and the classification error is recorded as a percentage.
3.9.1 Naive Bayes Classification Algorithm
The Naive Bayes classifier is based on Bayes' theorem: prior probabilities are created from experience (the training set) and combined with the likelihood to produce a posterior probability used for further testing. The classifier assumes that all attributes contribute independently to the classification. Naive Bayes can provide good accuracy, especially in supervised learning. Although it does not always perform better than newer approaches such as random forests, Naive Bayes is still observed to perform very well in many complex real-world situations.
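A compact sketch of the prior x likelihood computation on categorical features, with a 70/30 split as used here. The toy mood data and add-one smoothing are illustrative assumptions, not Weka's exact implementation:

```python
from collections import Counter, defaultdict

def train_nb(rows, labels):
    """Categorical Naive Bayes: store class priors and, per
    (feature index, class), the counts of each observed value."""
    prior = Counter(labels)
    counts = defaultdict(Counter)
    for row, y in zip(rows, labels):
        for i, v in enumerate(row):
            counts[(i, y)][v] += 1
    return prior, counts

def predict_nb(model, row):
    """Pick the class maximising prior x smoothed likelihoods
    (attributes are assumed conditionally independent)."""
    prior, counts = model
    total = sum(prior.values())
    best, best_p = None, -1.0
    for y, n in prior.items():
        p = n / total
        for i, v in enumerate(row):
            seen = counts[(i, y)]
            p *= (seen[v] + 1) / (n + len(seen) + 1)  # add-one smoothing
        if p > best_p:
            best, best_p = y, p
    return best

# Toy mood data: (Thaat code, TimeOfDay code) -> mood label.
data = [((1, 3), "calm"), ((1, 3), "calm"), ((1, 2), "calm"),
        ((2, 7), "joy"), ((2, 7), "joy"), ((2, 6), "joy")]
split = int(0.7 * len(data))               # 70% train / 30% test
train, test = data[:split], data[split:]
model = train_nb([r for r, _ in train], [y for _, y in train])
errors = sum(predict_nb(model, r) != y for r, y in test)
pct_error = 100.0 * errors / len(test)     # classification error in %
```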
The classification output shown in Figure 3.8 represents the results of one of the 16 combinations from Table 3.7; the percentage error in this output is 7.8947%. Results are calculated for all the settings in Table 3.7, with the percentage error and root mean squared error recorded for each setting. The next section tabulates the percentage errors.
Figure 3.8: Classifier Results of 70% Train data and 30% Test Data
3.10 Final dataset for statistical Analysis
In this section dataset is created with the results of nave Bayes classifier in terms of
percentage error for all the setting mentioned in Figure 3.9.
Figure 3.9: Final data set
These results are then given to Minitab, a statistical tool to analyze results, is dis-
cussed in section 3.11.
3.11 Analysis of Results Through Minitab
3.11.1 Minitab
Minitab is a statistical tool used to perform a wide range of statistical functions: basic statistics such as goodness-of-fit tests, one-sample and paired t-tests, and normality tests; regression analysis; data and file management; graphics such as line plots, contour plots, and other graphs; analysis of variance; and statistical process control methods such as run charts, Pareto charts, and tolerance intervals. It is also used for measurement systems analysis, multivariate analysis, and reliability/survival analysis.
3.11.2 Factorial Design Analysis
A 2^4 factorial design with two replicates is used to analyze the 4 independent variables, using percent error as the response variable. The factorial design analysis shows that the factors to include in the optimal model for this experiment are the value of Ns, Nm, the clustering algorithm, and the feature selection technique.
Figure 3.10: Residual plots
The p-values for these factors are less than 0.05, as shown in Figure 3.12. The p-values for Ns * Nm, Nm * Clustering Algo, Ns * Clustering Algo * Feature Selection Technique, and Nm * Clustering Algo * Feature Selection Technique are also less than 0.05, which shows that there are two-way and three-way interactions that are also important to the model.
Figure 3.10 shows the four residual plots used to check the ANOVA assumptions. The normal probability plot shows that the percentage error is normally distributed; the histogram shows a bell curve, meaning the residuals for percentage error are evenly distributed; and the fit does not create a funnel shape, so the assumption of constant variance is not violated.
Figure 3.11: Main effects plot
Figure 3.11 shows the factor values that give the least mean squared error. For this experiment the optimal values are Ns = 64, Nm = 6, Clustering Algo = Expectation Maximization, and Feature Selection Algorithm = Best First; for these values the mean squared error is minimal.
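The main effect behind a plot like Figure 3.11 is just the difference between the mean response at a factor's high and low levels. A sketch on a toy 2^2 slice of the design (the response coefficients are invented for illustration):

```python
from itertools import product

def main_effects(runs):
    """Two-level factorial main effects:
    effect(F) = mean(response | F = +1) - mean(response | F = -1)."""
    effects = {}
    for f in runs[0][0]:
        hi = [y for x, y in runs if x[f] == +1]
        lo = [y for x, y in runs if x[f] == -1]
        effects[f] = sum(hi) / len(hi) - sum(lo) / len(lo)
    return effects

# Toy 2^2 slice of the design: the error% responds strongly to Ns
# and only weakly to Nm (coefficients made up for the example).
runs = [({"Ns": ns, "Nm": nm}, 10 + 3 * ns + 0.1 * nm)
        for ns, nm in product([-1, +1], repeat=2)]
effects = main_effects(runs)
```

A factor whose main effect is near zero, like Nm here, would show as a nearly flat line in the main effects plot.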
Figure 3.12: Output
Chapter 4
Conclusions
4.1 Current Status
Music files are converted to profiles with 4 combinations of Ns (64, 128) and Nm (4, 6) through the DYNAMAC algorithm. Each profile, along with the values of Ns and Nm and the name of the Raga, is clustered with two different clustering algorithms, Expectation Maximization and DBScan. The resulting clusters are added to the 14 other features used to classify the ragas. Two different feature selection techniques, Best First and Greedy Stepwise, are applied to these 15 features to select the final attributes, and the already-known mood feature is attached as the class. Four different models are created from the combinations of the two cluster results and the two feature selection methods. The models are then run through Minitab's factorial design analysis to check the significance of the independent variables on the classification error percentage.
4.2 Results Generalization
This project took Chakraborty (2011) as its baseline and sought a model that generalizes the approach used in her project and improves it by introducing feature selection techniques. The results are compared statistically to find the optimal combination of Ns, Nm, clustering algorithm, and feature selection technique, which can help other studies in this field.
4.3 Future work
• Find more suitable values of Ns and Nm to extend the scope of the results to a wider range.
• Include more feature selection techniques.
• Use other forms of music to explore the scope of this music mining study.
• To use more values of Ns and Nm, the DYNAMAC code needs to be tuned for performance, as it takes a significant amount of time to create each profile; parallelization of the code will help overcome this issue.
• Use multiple clustering and classification algorithms.
• Increase the number of factors studied in the factorial experiment to see whether the classification error percentage can be reduced further.
Bibliography
[1] M.M. Al-Qutt, A.M. Hamad, M.A. Salem, and M.H.A. Aziz. Performance comparison for feature selection in musical information retrieval. In Computer Engineering Systems (ICCES), 2011 International Conference on, pages 231-236, Nov. 29 - Dec. 1, 2011.
[2] C. M. Glenn, M. Eastman, and G. Paliwal. A new digital image compression algorithm based on nonlinear dynamical systems. IADAT International Conference on Multimedia, Image Processing and Computer Vision Conference Proceedings, March 2005.
[3] D. Chakraborty. Data mining of Indian raga: Developing a recommender system. Masters Project, Rochester Institute of Technology, Oct 2011.
[4] Parag Chordia. http://paragchordia.com/data/gtraagdb.
[5] Shyamala Doraisamy, Shahram Golzari, Noris Mohd. Norowi, Md Nasir Sulaiman, and Nur Izura Udzir. A study on feature selection and classification techniques for automatic genre classification of traditional Malay music. In Juan Pablo Bello, Elaine Chew, and Douglas Turnbull, editors, ISMIR, pages 331-336, 2008.
[6] Rebecca Fiebrink and Ichiro Fujinaga. Feature selection pitfalls and music classification. pages 340-341, October 2006.
[7] P. Herrera, A. Yeterian, and F. Gouyon. Automatic classification of drum sounds: A comparison of feature selection methods and classification techniques. In C. Anagnostopoulou, M. Ferrand, and A. Smaill, editors, Music and Artificial Intelligence. Springer, 2002.
[8] Huiqing Liu, Jinyan Li, and Limsoon Wong. A comparative study on feature selection and classification methods using gene expression profiles and proteomic patterns. Genome Informatics, 13:51-60, 2002.
[9] Jeremy Pickens. Feature selection for polyphonic music retrieval.
[10] Jeremy Pickens. A survey of feature selection techniques for music information retrieval, 2001.
[11] Selwyn Piramuthu. Evaluating feature selection methods for learning in data mining applications. In Proceedings of the Thirty-First Annual Hawaii International Conference on System Sciences - Volume 5, HICSS '98, page 294, Washington, DC, USA, 1998. IEEE Computer Society.
[12] Erik M. Schmidt, Douglas Turnbull, and Youngmoo E. Kim. Feature selection for content-based, time-varying musical emotion regression. In Proceedings of the International Conference on Multimedia Information Retrieval, MIR '10, pages 267-274, New York, NY, USA, 2010. ACM.
[13] C.N. Silla, A.L. Koerich, and C. Kaestner. Feature selection in automatic music genre classification. In Multimedia, 2008. ISM 2008. Tenth IEEE International Symposium on, pages 39-44, Dec. 2008.
[14] Igor Vatolkin, Mike Preuss, and Gunter Rudolph. Multi-objective feature selection in music genre and style recognition tasks. In Proceedings of the 13th Annual Conference on Genetic and Evolutionary Computation, GECCO '11, pages 411-418, New York, NY, USA, 2011. ACM.
Appendix A
User Manual
A.1 Minitab Factorial design Generation
• Open Minitab
• Create new Minitab Project (Figure A.1)
• Go to Stat, DOE, Factorial, Create Factorial Design (Figure A.2 and A.3)
• Select “Generate full factorial design”, set the number of factors to 4, click on Design, and select “Number of Replicates for corner points” as 2 (Figure A.4)
• Select factors as shown in Figure A.5
• Click OK and the model is created; add one more column and enter the % error in it (Figure A.6)
• Go to Analyze Factorial Design and analyze the model (Figure A.7)
• Results are generated (Figure A.8)
Figure A.1: Create new Minitab Project
Figure A.2: Select Stat, DOE, Factorial, Create Factorial Design
Figure A.3: Create Factorial Design
Figure A.4: Generate full factorial design
Figure A.5: Select Factors
Figure A.6: Add the % error column
Figure A.7: Go to Analyze Factorial design
Figure A.8: Results