

DRAFT

Activity recognition from accelerometer data

Raghunandan Palakodety2∗

Informatik IV, Universität Bonn, Germany, [email protected]

Abstract

In the last decade, research on human activity recognition has grown steadily. Work on classifying tasks ranging from the simple to the most complex, such as eye tracking, has introduced many architectures that use inertial sensors along with the on-board motion sensors of a smartphone. Analyzing the data that emanate from these sensors helps to formulate mathematical models for classifying independent data. To classify selected activities such as standing, walking, running and idle, we chose the accelerometer sensor to formulate a model for those activities. We developed an application that classifies these activities in real time. The application uses accelerometer data to construct real-time instances and uses machine learning algorithms to train various classifiers, whose performances are then compared. With the Random Forest classifier we reach a correct classification rate of 83.4901%.

1 Introduction

This paper describes a prevalent approach for classifying selected activities such as standing, walking, running and idle using a gamut of algorithms, in particular an ensemble learning method. An application was developed on the Android OS platform to achieve this classification task. The architecture of the application includes collecting raw data from the accelerometer sensor, which subsequently undergoes three phases, namely pre-processing, feature selection and feature extraction. A mathematical model is then fitted to the training data obtained from these phases. Later, in the classification phase, the application classifies newly obtained data, also known as test data. The evaluation phase of the architecture analyzes the performance of the classifiers by their accuracies and, in particular, by their confusion matrices.

This paper is structured as follows. Section 1.1 reviews selected research activities in the human activity recognition domain, and section 2 describes the general idea and the proposed architecture for this lab task. Sections 3 and 4 explain the reasons behind pre-processing and smoothing. In sections 5 and 6, we describe the method of extracting features from raw data and constructing feature vectors, respectively. In section 7, we describe the necessity and use of the cross-validation technique, followed by confusion matrices. Section 8 contains the description and pseudo code of the particular classifier we use for classification, which is then applied in section 9. In section 10, we evaluate the efficiency of a single Classification and Regression Tree (CART) decision tree as compared to a procedure known as bagging. The results of this lab work are presented in section 11, in which detailed accuracies of the activities classified by the various classifiers are discussed using various estimators. The paper ends with section 12, in which we present approaches for improving the existing classification rates.

∗Dr. Welke and chair for reviewing the paper


1.1 Related Work

Many commercial applications for user activity recognition tasks have been developed and are available for various mobile platforms. Fitness applications such as PaceDJ analyze the pace of the jogger and prompt a song of a particular number of beats per minute to be played [7]. FaceLock enables the user to unlock the phone and applications using one of the concepts of machine learning. Apart from these commercial applications, which are available on various application delivery systems, research on potential participatory and opportunistic applications is underway. Visage, a face interpretation engine for smartphone applications, lets users drive applications based on their facial responses and belongs to a new class of face-aware applications for smartphones. The engine fuses data streams [21] from the handset's front camera and built-in motion sensors to deduce the user's 3D head pose (through the pitch, roll and yaw of the user's head with respect to the phone co-ordinate system) and facial expressions. Ongoing research at OPPORTUNITY at ETH Zurich IFE - Wearable Computing is focused on developing mobile systems that recognize human activity and user context with dynamically varying sensor setups, using goal-oriented, cooperative sensing [13]. Such systems are also known as opportunistic, since they take advantage of the sensing modalities that happen to be available instead of requiring the user to deploy specific, application-dependent sensor systems.

Many such activity recognition tasks are implemented independently with a gamut of architectures and designs. To date, there is no single exhaustive tutorial on a framework that presents the design, implementation and evaluation of Human Activity Recognition systems. The paper [5] introduces the concept of the activity recognition chain (ARC) as a general purpose framework for designing and evaluating activity recognition systems. The framework comprises components for data acquisition, pre-processing, data segmentation, decision fusion and performance evaluation. At the end of the paper by Bulling et al. [5], the problem of recognizing different hand gestures using inertial sensors attached to the lower and upper arm is presented, and the implementation of each component of the ARC framework for this recognition problem is discussed. It is demonstrated how different design decisions affect the overall recognition performance.

Another article [11] introduced a comprehensive approach for context-aware applications that utilizes the multi-modal sensors in smartphone handsets. The proposed system not only recognizes different kinds of contexts with high accuracy, but is also able to optimize the power consumption of the sensors, which can be activated and deactivated at appropriate times. Furthermore, a novel feature selection algorithm is proposed for the accelerometer classification module of their architecture. In essence, this feature selection algorithm measures the quality of a feature using two estimators, relevancy (or classification power) and redundancy (or similarity between two selected features) [11]. Recently, another sophisticated solution was proposed in [9] to find the location of the phone in order to support context-aware applications. The solution gives 97% accuracy in detecting the position of the phone and is also able to correlate it with the type of location the user is in. The implementation extracts Mel Frequency Cepstral Coefficients (MFCC), Delta Mel Frequency Cepstral Coefficients (DMFCC) and Band Energy features.

2 Concept

2.1 Background

The Android platform supports three broad categories of sensors: motion sensors, environment sensors and position sensors. Of these, motion sensors are well suited for monitoring device movement such as tilt, shake, rotation or swing. This category includes accelerometers, gravity sensors, gyroscopes and rotational vector sensors. Two of these sensors are always hardware-based (the accelerometer and the gyroscope), and the rest can be either hardware-based or software-based (the gravity, linear acceleration and rotation vector sensors). On some Android-enabled devices, the software-based sensors derive their data from the accelerometer and magnetometer, while others also use the gyroscope [2]. All motion sensors return multi-dimensional arrays of sensor values for each SensorEvent. During a single sensor event, the accelerometer returns acceleration force data for the three coordinate axes, and the gyroscope returns rate-of-rotation data for the three coordinate axes.

Among the motion sensors supported by Android, TYPE_LINEAR_ACCELERATION is a virtual sensor which, when registered, gives the linear acceleration of the mobile device excluding gravity [2]. The data values received from this sensor form a three-valued vector of floating point numbers that represents the accelerations of the smartphone along the x, y and z axes (as shown in figure 3 [8]) without the gravity vector. The acceleration values are recorded in meters per second squared. When the phone is laid flat on a surface, the readings along the x, y and z axes after smoothing are approximately [0.0023937826, 0.01905214, 0.020536471], as shown in figure 4.

The vectors received while the device is displaced are termed raw data. The raw data (prior to smoothing or any moving average) first go through the noise reduction stage, where the signal is smoothed. They are then preprocessed, where each sampled smoothed vector is combined into a single magnitude, and fed into the feature extraction stage, where essential attributes or features are extracted for a window of 64 such samples. At the end of this stage, each window yields a vector of 64 Fourier coefficients, to which the attributes described in section 5 are added.

For the task of classifying activities, a dataset has to be collected. The training data, or ground truth, is constructed by collecting data for the activities under consideration. The training phase consists of training a classifier (after choosing the best model based on predictive performance) with the training data and saving the model (serialization [20]) for further use in the classification stage. In the classification stage, the test data is collected in real time and the model is used to discern among the activities. The life cycle of the training phase of the application is shown as a flowchart in figure 1 and that of the classification stage in figure 2.

The phases Preprocessing, Smoothing, Feature Extraction, Cross Validation, Training and Classification are described in detail in sections 3, 4, 4.1, 5, 7, 8 and 9. In addition, the evaluation of the classifiers is discussed in the results section.

2.2 Design

The design of the application went through several changes until it was decided to perform all the processing and classification on the device instead of relying on a central server. The application uses the Linear Acceleration sensor provided by the vendor Google Inc., with a range of 19.6 and a resolution of 0.01197 in the sensor's unit of values. In order to choose the right model for classification, a competitive evaluation of models was adopted, in which 3252 samples were collected and analyzed with three machine learning solutions to determine the class label.

2.3 Architecture

All the phases of the processing pipeline shown in figure 5 (Preprocessing, Smoothing, Feature Extraction, Cross Validation, Training and Classification) are performed using the computational capabilities of the handset, unlike in many papers, where parts of the processing pipeline were handled by a server.

Figure 1: Life cycle of training the classifier

Figure 2: Life cycle of classification of activities

Figure 3: Axes of device [8]

Figure 4: Accelerations along the axes after smoothing

3 Preprocessing

Figure 5: Perceived Architecture

During the preprocessing stage of the pipeline, each sampled acceleration vector is combined into a single magnitude. Since ambulatory activity classification does not depend on the orientation of the phone, none of the activities requires distinguishing directional accelerations [7]. The Euclidean magnitude of the accelerations along the three individual axes is therefore calculated as follows:

$$a = \sqrt{x^2 + y^2 + z^2}$$

Accelerations along individual axes are useful where directional information is relevant, for example when determining the location of the phone, determining the movement of limbs, tracking head movements or differentiating between martial art movements [7].
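As a minimal sketch of this step (the function and variable names are illustrative and not taken from the application), the magnitude of one sample can be computed as follows. The example values are the smoothed at-rest readings quoted in section 2.1.

```python
import math

def magnitude(x, y, z):
    """Euclidean magnitude of one acceleration sample (m/s^2)."""
    return math.sqrt(x * x + y * y + z * z)

# Smoothed readings reported in section 2.1 for a phone lying flat.
at_rest = (0.0023937826, 0.01905214, 0.020536471)
rest_magnitude = magnitude(*at_rest)
```

For a phone at rest the merged value stays close to zero, because the linear acceleration sensor already excludes gravity.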

4 Smoothing

The device sends out values at a frequency of 16.023 Hz. There were two primary sources of noise in the received signal. As illustrated in [7], and as observed in this lab task, the first is the irregular sampling rate and the second is the noise inherent in the discrete physical sampling of a continuous function. Besides the irregular sampling rate, additional noise is periodically introduced into the signal by the nature of the activity patterns performed. For example, a slight change in the orientation of the handset is a frequent occurrence even during the Idle/Other activity, but it results in an anomalous peak in the resultant signal, as shown in figure 6. This type of noise was handled by running the acceleration values along the individual axes through a smoothing algorithm, namely a low-pass filter.

Figure 6: Acceleration magnitude chart without smoothing


4.1 Low-Pass Filter

The device's sensor readings contain noise due to the high sensitivity of its hardware. Environmental factors such as jerks or vibrations, as mentioned earlier, add a considerable amount of noise to these signals. These high-frequency signals (noise) cause the readings to hop between considerably high and low values. A low-pass filter helps to suppress those high frequencies in the input signal by applying a suitable threshold to the filter output. The following equation is a recurrence relation,

$$y_i = \alpha x_i + (1 - \alpha) y_{i-1}$$

where $x_i$ is the array of newly received values, $y_{i-1}$ is the array of smoothed values from the previous iteration and $y_i$ is the new array of smoothed values, which is subsequently used to compute the merged acceleration in the preprocessing stage described in section 3. This process occurs whenever onSensorChanged() is triggered. Here $\alpha$ is a smoothing factor such that $0 \le \alpha \le 1$. This constant affects the weight or momentum, and determines how strongly the new value affects the current smoothed value [1]. A smaller value of $\alpha$ results in more smoothing. $\alpha = 0.15$ is used for smoothing the incoming values. Figure 7 shows a plot of the merged accelerations after smoothing; the mean frequency calculated over 125 elements is displayed in the text view.
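The recurrence can be sketched as below. This is an illustration under assumptions rather than the application's code: the names are hypothetical, and seeding the filter with the first sample is our choice, since the text does not say how the filter is initialized.

```python
ALPHA = 0.15  # smoothing factor from the text

def low_pass(previous, current, alpha=ALPHA):
    """One step of y_i = alpha * x_i + (1 - alpha) * y_{i-1}, applied per axis."""
    return [alpha * x + (1.0 - alpha) * y for x, y in zip(current, previous)]

def smooth(samples, alpha=ALPHA):
    """Smooth a sequence of (x, y, z) samples.

    Assumption: the first sample seeds the filter state unchanged.
    """
    smoothed = []
    state = None
    for sample in samples:
        state = list(sample) if state is None else low_pass(state, sample, alpha)
        smoothed.append(tuple(state))
    return smoothed
```

With $\alpha = 0.15$, a sudden jump in the input moves the smoothed value by only 15% of the jump in the first step, which is what damps isolated peaks such as the one in figure 6.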

Figure 7: Acceleration magnitude chart after smoothing using LPF

5 Feature Extraction

Feature extraction is the conversion of raw sensor data into more computationally efficient and lower dimensional forms termed features. The raw sensor data is first segmented into several windows, and features such as frequencies are extracted from each window of samples. The features extracted from the sensor readings serve as inputs to the classification algorithms for recognizing the user's activity.

As mentioned in [14], the window size is a key parameter that influences both the computation and the power consumption of sensing algorithms. A window size of 64 was chosen for the feature extraction stage, which proved conducive to extracting frequency domain features. In contrast to heuristic features, time and frequency domain features can characterize the information within the time-varying signal and do not reveal specific information about the user's activity. Unlike time domain features, frequency domain features require an additional stage that transforms the time domain data received from the previous stages of the pipeline. This stage generates frequency domain features using very fast and efficient versions of the Fast Fourier Transform.

onSensorChanged() produces sensor samples (x, y, z) as a time series (one sample each time onSensorChanged() is called). These samples are smoothed and then preprocessed, which computes the magnitude $m$ from each sensor sample. The work flow of the training phase buffers 64 consecutive magnitudes $(m_0, \dots, m_{63})$ before computing the Fast Fourier Transform (FFT), resulting in a feature vector $(f_0, \dots, f_{63})$ of Fourier coefficients. $f_0$ is the lowest-frequency coefficient; because the input is real-valued, the magnitudes are symmetric about $f_{32}$, which corresponds to the highest frequency. The FFT transforms a time series of amplitude over time into magnitude across frequency. Since the x, y, z accelerometer readings and the magnitude are time domain variables, they have to be transformed into the frequency domain, which represents the distribution in a compact manner that the classifier uses to build a decision tree model in a later phase of the pipeline. While computing the Fourier coefficients, the maximum (MAX) of the magnitudes $(m_0, \dots, m_{63})$ is also included as an attribute of the feature vector for that particular window of 64 samples [6].

The class label, that is, the label supplied by the user (e.g., walking, standing, running or idle), is also recorded to label the feature vector. At the end of the feature extraction pipeline, an instance or feature vector therefore consists of the following features or attributes,

$(f_0, \dots, f_{63})$, MAX magnitude, label
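To make the layout of an instance concrete, the sketch below assembles feature vectors from a stream of magnitudes. It rests on several assumptions that the text does not confirm: windows do not overlap, the coefficients are stored as magnitudes, no windowing function is applied here, and a naive $O(N^2)$ discrete Fourier transform stands in for the FFT to keep the example short. All names are hypothetical.

```python
import cmath

WINDOW = 64  # window size from the text

def dft_magnitudes(window):
    """Magnitudes |X_k| of the discrete Fourier transform (naive O(N^2) form)."""
    n = len(window)
    result = []
    for k in range(n):
        total = sum(window[t] * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t in range(n))
        result.append(abs(total))
    return result

def feature_vectors(magnitudes, label):
    """Cut a stream of magnitudes into non-overlapping windows of 64 samples
    and return one [f0..f63, MAX, label] instance per full window."""
    instances = []
    for start in range(0, len(magnitudes) - WINDOW + 1, WINDOW):
        window = magnitudes[start:start + WINDOW]
        instances.append(dft_magnitudes(window) + [max(window), label])
    return instances
```

Each returned instance has 66 entries: 64 coefficients, the maximum magnitude of the window and the class label. Samples left over after the last full window are ignored in this sketch.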

5.1 Signal processing in feature extraction

The Fast Fourier Transform is an algorithm for computing the discrete Fourier transform which reduces the number of computations needed for $N$ points from $2N^2$ to $2N \lg N$ by means of the Danielson-Lanczos lemma [17], provided that the number of samples or points $N$ is a power of two. The Cooley-Tukey FFT algorithm first rearranges the input elements in bit-reversed order and then builds the output transform (decimation in time). The basic idea is to break up a transform of length $N$ into two transforms of length $N/2$ using the identity

$$\sum_{n=0}^{N-1} a_n e^{-2\pi i n k/N} = \sum_{n=0}^{N/2-1} a_{2n} e^{-2\pi i (2n) k/N} + \sum_{n=0}^{N/2-1} a_{2n+1} e^{-2\pi i (2n+1) k/N}$$

$$= \sum_{n=0}^{N/2-1} a_n^{\text{even}} e^{-2\pi i n k/(N/2)} + e^{-2\pi i k/N} \sum_{n=0}^{N/2-1} a_n^{\text{odd}} e^{-2\pi i n k/(N/2)}$$

which is known as the Danielson-Lanczos lemma.

Prior to applying the FFT to the time domain data, a window function is chosen for spectral analysis (frequency domain analysis), that is, for calculating the frequency components that make up the signal. A Blackman windowing function of window length 64 was implemented. It is an example of the higher-order generalized cosine windows,

$$w(n) = \sum_{k=0}^{K} (-1)^k a_k \cos\left(\frac{2\pi k n}{N-1}\right)$$

For the Blackman window, which has $K = 2$, $\alpha = 0.16$ was chosen. The constants are computed as follows

$$a_0 = \frac{1-\alpha}{2}, \quad a_1 = \frac{1}{2} \quad \text{and} \quad a_2 = \frac{\alpha}{2}$$

which gives the equation

$$w(n) = 0.42 - 0.5\cos\left(\frac{2\pi n}{N-1}\right) + 0.08\cos\left(\frac{4\pi n}{N-1}\right)$$

Standing   Walking   Running   Others
703        1057      824       668

Table 1: Class distribution
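The even/odd splitting of the Danielson-Lanczos lemma and the Blackman window of this section can be sketched together as follows. Note that this is a recursive radix-2 formulation chosen for brevity, not the iterative bit-reversal variant mentioned above, and it is not the FFT implementation used in the application. The function names are hypothetical.

```python
import cmath
import math

def blackman(length, alpha=0.16):
    """w(n) = a0 - a1*cos(2*pi*n/(N-1)) + a2*cos(4*pi*n/(N-1))."""
    a0, a1, a2 = (1 - alpha) / 2, 0.5, alpha / 2
    return [a0
            - a1 * math.cos(2 * math.pi * n / (length - 1))
            + a2 * math.cos(4 * math.pi * n / (length - 1))
            for n in range(length)]

def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return [complex(a[0])]
    even = fft(a[0::2])   # transform of the even-indexed samples
    odd = fft(a[1::2])    # transform of the odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def windowed_spectrum(magnitudes):
    """Apply the Blackman window, then return the FFT magnitudes."""
    window = blackman(len(magnitudes))
    return [abs(c) for c in fft([m * w for m, w in zip(magnitudes, window)])]
```

The second assignment in the loop uses the fact that the twiddle factor changes sign when $k$ is increased by $N/2$, so each pair of half-length results yields two output coefficients.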

5.2 Fourier Co-efficient representation in magnitude

It is important not to ignore the issue of time (phase) shifts when using Fourier analysis, since the calculated Fourier coefficients are very sensitive to such shifts. However, a time (phase) shift does not change the magnitude of the Fourier coefficients. Using this property, the Fourier coefficients are represented in terms of their magnitude as follows,

$$X_n = A_n + jB_n \quad \text{and} \quad |X_n| = \sqrt{A_n^2 + B_n^2}$$

where $X_n$ is the $n$-th coefficient of the Fourier transform of the discrete signal, and $A_n$ and $B_n$ are its real and imaginary parts.
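The shift property can be checked numerically. The sketch below uses a naive discrete Fourier transform and an arbitrary made-up signal, and it applies a circular shift, for which the property holds exactly; for a window sliding over a longer recording it holds only approximately.

```python
import cmath

def dft(signal):
    """Naive discrete Fourier transform, returning complex coefficients X_k."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def magnitudes(signal):
    """|X_k| = sqrt(A_k^2 + B_k^2) for every coefficient."""
    return [abs(c) for c in dft(signal)]

signal = [0.0, 1.0, 0.5, -0.5, -1.0, 0.25, 0.75, 0.1]
shifted = signal[3:] + signal[:3]   # circular shift by three samples

# Largest difference between the two magnitude spectra (zero up to rounding).
max_difference = max(abs(p - q)
                     for p, q in zip(magnitudes(signal), magnitudes(shifted)))
```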

6 Training data

The training data is constructed from the x, y, z sensor readings as explained in section 5. After smoothing, the magnitudes of these readings are collected in a blocking queue of size 2048. A background process is established using AsyncTask (provided by the Android framework), which takes 64 buffered magnitude values and computes their Fast Fourier Transform (FFT).

The FFT, using the Blackman windowing function, yields 64 FFT coefficients. These coefficients, along with the maximum of the 64 individual magnitudes and the label of the activity, serve as attributes or features of each feature vector or instance of the training data for that particular activity. The data is saved to a file upon quitting the data acquisition activity.

The file is of type Attribute-Relation File Format (ARFF). This file is then analyzed with the Waikato Environment for Knowledge Analysis (WEKA), a knowledge analysis and machine learning tool, to obtain further estimator values.

In the journal article [18], evaluations were performed on datasets with both balanced and imbalanced class distributions. The collection procedure of [18] was followed here, but the resulting set of samples per activity is skewed rather than balanced, and it was analyzed as such in the further stages of the pipeline. The distribution of the training data is shown as a bar plot in figure 8 and in table 1.
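For illustration, an ARFF file for such instances could be rendered as sketched below. The coefficient attribute naming follows the variable name that appears in figure 11; the relation name and the attribute names `max` and `label` are assumptions, as is the exact header the application writes.

```python
def to_arff(instances, labels, relation="activities"):
    """Render [f0..f63, MAX, label] instances as ARFF text.

    Assumption: attribute and relation names are illustrative only.
    """
    n_coef = len(instances[0]) - 2
    lines = ["@relation " + relation, ""]
    for i in range(n_coef):
        lines.append("@attribute fft_coef_%04d numeric" % i)
    lines.append("@attribute max numeric")
    lines.append("@attribute label {" + ",".join(labels) + "}")
    lines += ["", "@data"]
    for instance in instances:
        lines.append(",".join(str(v) for v in instance))
    return "\n".join(lines) + "\n"
```

A call such as `to_arff(instances, ["standing", "walking", "running", "idle"])` produces a header with 66 attribute declarations followed by one comma-separated line per instance.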

7 Cross-validation Results

Cross validation is a non-Bayesian technique for model selection that chooses the model with the smallest empirical error on the validation set [12]. The technique involves randomly permuting the data, splitting the data into k equally sized subsets and performing k rounds of validation. Figure 9 shows the k rounds of validation with their training sets and validation subsets. For each round, an empirical error of the model is calculated, and the errors are averaged after all k rounds. The model with the minimum error among the candidate models is then retrained on the entire data set.

Cross-validation is also defined as a model validation technique for assessing how the results of a statistical analysis will generalize to an independent data set. It is mainly used in settings where the goal is prediction, in order to estimate the accuracy of a predictive model in practice. In a prediction problem, a model or classifier is usually supplied with a dataset of known data (training data) on which the model is trained, and a dataset of unknown data (testing dataset) on which the model is tested. The goal of cross-validation is to define a subset of the data, the validation set, to test the model during the training phase in order to detect problems such as over-fitting. It also provides insight into how the model generalizes to an independent dataset or test dataset. Since the training data is a labeled data set, supervised learning algorithms are used to infer a function from the labeled data. The classifiers assessed in cross validation are Naive Bayes and KStar, which are of type Bayesian and lazy learning respectively. The confusion matrices for these models are shown in tables 2 and 3. The collected data for the selected activities have also been visualized using R.

A total of 3252 instances were used to obtain a model using Random Forest, Naive Bayes and KStar. A 10-fold cross validation was employed to estimate how the results of each model (except Random Forest) generalize to an independent data set. From the results and confusion matrices, the ensemble learning method Random Forest achieves a correct classification rate of 83.4901% on a test set of 957 instances, while the lazy learning method KStar and the Bayesian classifier Naive Bayes reached 82.7798% and 68.9114% respectively in 10-fold cross validation on the 3252 instances (tables 2 and 3).

A variant of cross-validation known as leave-one-out cross validation (LOOC) can also be used to compute the root mean squared error, which is shown below as an equation. The k-fold cross-validation procedure becomes LOOC when the parameter k is made equal to the number of instances in the data set. Under this setting, the bias becomes low, whereas the variance becomes high.

Figure 8: Barplot of training data

Figure 9: k-Fold Cross Validation

Confusion Matrix: Naive Bayes (rows: class label, columns: classified as)

            Standing   Walking   Running   Others
Standing    158        91        35        419
Walking     157        751       43        106
Running     1          151       671       1
Others      4          2         1         661

Table 2: Confusion Matrix for Naive Bayes

$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2},$$

where $y_i$ is the observed value, $\hat{y}_i$ is the predicted value and $n$ is the number of instances, which in the case of LOOC equals the number of folds ($k = n$).
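The splitting procedure and the error measure of this section can be sketched as follows. This is an illustration rather than the Weka implementation used for the reported numbers; the function names and the fixed seed are assumptions.

```python
import math
import random

def k_fold_splits(n_instances, k, seed=0):
    """Yield (train_indices, validation_indices) for k rounds of validation."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)          # random permutation of the data
    folds = [indices[i::k] for i in range(k)]     # k subsets of (almost) equal size
    for i in range(k):
        validation = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, validation

def rmse(observed, predicted):
    """Root mean squared error over n paired values."""
    n = len(observed)
    return math.sqrt(sum((y - y_hat) ** 2
                         for y, y_hat in zip(observed, predicted)) / n)
```

Calling `k_fold_splits(3252, 10)` gives ten rounds with validation sets of 325 or 326 instances each, and `k_fold_splits(n, n)` gives the leave-one-out case, where every validation set holds a single instance.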

7.1 Random Subspace

In the case of a Random Forest of 10 trees, each constructed while considering 7 random features, the out-of-bag error estimate is 0.1018. For the 66-attribute feature vector, the number of attributes considered for splitting at each node of a tree is calculated as follows,

$$K = \lfloor \log_2(\text{number of attributes}) + 1 \rfloor$$

From the above equation, the number of features or attributes considered is 7. This means that each random subset of 7 features defines a subspace of the 65-dimensional feature space.

Confusion Matrix: KStar (rows: class label, columns: classified as)

            Standing   Walking   Running   Others
Standing    544        67        5         87
Walking     212        832       4         9
Running     35         122       667       0
Others      16         2         1         649

Table 3: Confusion Matrix for KStar
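The subspace size can be reproduced with a short calculation. The sketch takes the logarithm to base 2, which is the choice that yields the value 7 quoted in the text (a natural logarithm would give 5). The helper names and the seed are hypothetical.

```python
import math
import random

def subspace_size(n_attributes):
    """K = floor(log2(number of attributes) + 1)."""
    return int(math.floor(math.log2(n_attributes) + 1))

def random_subspace(n_features, k, rng):
    """Indices of k distinct features drawn uniformly at random."""
    return sorted(rng.sample(range(n_features), k))

K = subspace_size(66)                                  # 66 attributes -> 7
candidates = random_subspace(65, K, random.Random(1))  # 7 of the 65 features
```

At each node only the features listed in `candidates` would be examined for the best split, and a fresh subset is drawn for every node.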

7.2 Out-of-bag Error Estimate in Random Forests

When using random forests, a cross validation to get an unbiased error estimate is not required [4]. The error is calculated internally, as each tree is constructed using a different bootstrap sample (section 8.1) from the original data. About one-third of the cases are left out of the bootstrap sample and not used in the construction of the k-th tree.

In a bootstrap sample of size 3252, 2068 unique statistical units are included, and the rest (approximately one-third) serve as out-of-bag individuals. This can be visualized using R's association rules and frequent item set mining package, arules. The package provides a generic function sample which produces a sample (indices) of the specified size from the elements provided, either with or without replacement.

The out-of-bag (oob) data is used to get a running unbiased estimate of the classification error as trees are added to the forest. In this way, a test set classification is obtained for each case in about one-third of the trees. At the end of the run, let j be the class that received most of the votes every time case n was out-of-bag. The proportion of times that j is not equal to the true class of n, averaged over all the cases, is the out-of-bag error estimate [4].
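The proportion of out-of-bag cases can be checked with a small simulation. The sketch below uses Python's random module instead of R and a fixed seed of our choosing, so the exact count differs from the 2068 unique units reported above, while the fraction stays close to one-third.

```python
import random

def bootstrap_indices(n, rng):
    """Draw n indices uniformly with replacement from range(n)."""
    return [rng.randrange(n) for _ in range(n)]

rng = random.Random(42)   # arbitrary seed, for repeatability only
n = 3252
in_bag = set(bootstrap_indices(n, rng))
out_of_bag = n - len(in_bag)
oob_fraction = out_of_bag / n   # expected value is (1 - 1/n)^n, roughly 0.368
```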

8 Training the Classifier

Random forests combine a CART-like procedure with bootstrap aggregation (bagging) and the random subspace method. In the supervised learning setting, $B = 500$ trees are fitted, each grown on a bootstrap sample $D_i$. Each sample $D_i$ is generated from the data set $D$ by sampling $N$ points uniformly with replacement; aggregating the models trained on such samples is termed bagging. A tree $T_i$ is grown using $D_i$ such that, at each node of the tree, a random subset of $m$ features or attributes is chosen and splitting is done only on those $m$ features. From section 5, each feature vector or instance comprises 65 features, so $m \ll 65$. After the $B$ trees have been grown, an unknown instance or feature vector is classified by a majority vote over the trees, since the task is the classification of activity labels (not regression).

8.1 Bootstrap sampling

Bootstrap or re-sampling is a technique for improving the quality of estimators.

Bootstrap Algorithm

• Generate $N$ bootstrap samples $D_1, D_2, D_3, \dots, D_N$. Each bootstrap sample is obtained by sampling $n$ times with replacement from the sample data set. (Instances or data points can appear multiple times in any $D_i$.)

• Evaluate the estimator on each bootstrap sample:

$$S_i = S(D_i)$$

that is, estimate $S$ pretending that $D_i$ is the data.

• Compute the bootstrap estimate of $S$ by averaging over all bootstrap samples:

$$\hat{S}_{BS} = \frac{1}{N} \sum_{i=1}^{N} S_i$$

Applied to estimating the variance of a sample $x_1, \dots, x_n$ with $B$ bootstrap samples, the algorithm reads:

• For each $b = 1, 2, 3, \dots, B$, generate a bootstrap sample $B_b$. In detail, for $i = 1, 2, 3, \dots, n$:

– Sample an index $j \in \{1, 2, 3, \dots, n\}$ uniformly at random

– Set $x_i^b = x_j$ and add it to $B_b$

• For each $b$, compute mean and variance estimates:

$$\mu_b = \frac{1}{n} \sum_{i=1}^{n} x_i^b \quad \text{and} \quad \sigma_b^2 = \frac{1}{n} \sum_{i=1}^{n} (x_i^b - \mu_b)^2$$

• Compute the bootstrap estimate:

$$\sigma_{BS}^2 = \frac{1}{B} \sum_{b=1}^{B} \sigma_b^2$$
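The variance form of the algorithm above can be sketched as follows. The function name and the seed are hypothetical; the loop mirrors the steps of the list.

```python
import random

def bootstrap_variance(data, n_samples, seed=0):
    """Bootstrap estimate of the variance, averaged over n_samples resamples."""
    rng = random.Random(seed)
    n = len(data)
    variances = []
    for _ in range(n_samples):                                 # b = 1, ..., B
        sample = [data[rng.randrange(n)] for _ in range(n)]    # x_i^b = x_j
        mu = sum(sample) / n                                   # mean estimate mu_b
        variances.append(sum((x - mu) ** 2 for x in sample) / n)
    return sum(variances) / n_samples                          # sigma^2_BS
```

For example, `bootstrap_variance([float(i) for i in range(100)], 200)` should land near the plain variance of that sample, which is 833.25.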

8.2 Bagging or Bootstrap Aggregation

Training phase

1. Initialize the parameters

• D = ∅, the ensemble.

• B, the number of classifiers to train.

2. For k = 1, ..., B

• Take a bootstrap sample Bk from the training set.

• Build a classifier Dk using Bk as the training set.

• Add the classifier to the current ensemble, D = D ∪Dk.

3. Return D.

Classification phase

4. Run D1, ..., DB on the input x.

5. The class with maximum number of votes is chosen as the label for x.
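The training and classification phases can be sketched as below. To keep the example short, a one-feature threshold classifier (a decision stump) stands in for the CART trees used in this work; this substitution, the function names and the seed are all assumptions made for illustration.

```python
import random
from collections import Counter

def train_stump(sample):
    """Fit a one-feature threshold classifier on (value, label) pairs."""
    best = None
    for threshold in sorted({v for v, _ in sample}):
        left = [lab for v, lab in sample if v <= threshold]
        right = [lab for v, lab in sample if v > threshold]
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0] if right else left_label
        errors = (sum(lab != left_label for lab in left)
                  + sum(lab != right_label for lab in right))
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    _, threshold, left_label, right_label = best
    return lambda v: left_label if v <= threshold else right_label

def train_bagging(training_set, n_classifiers, seed=0):
    """Training phase: build one classifier per bootstrap sample."""
    rng = random.Random(seed)
    ensemble = []                                    # the ensemble starts empty
    for _ in range(n_classifiers):                   # k = 1, ..., B
        sample = [rng.choice(training_set) for _ in training_set]
        ensemble.append(train_stump(sample))         # add D_k to the ensemble
    return ensemble

def classify(ensemble, x):
    """Classification phase: the class with the most votes wins."""
    votes = Counter(classifier(x) for classifier in ensemble)
    return votes.most_common(1)[0][0]
```

Using an odd number of classifiers avoids ties in a two-class vote; with four activity classes ties remain possible and are broken here by the order in which votes were counted.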


9 Classification

The random forest classifier is trained on the data set. When the application starts, the test data is obtained from a background service, and the model is applied to classify that data as it arrives. Using the Weka machine learning library and the R data analysis environment, estimators were analyzed and plotted. The plot in figure 10 shows the error against the number of trees constructed for the various activities. It can be observed that the error decreases as trees are added to the forest. Section 11 gives details on the results.

Figure 10: Overall error of the model

10 Comparison of non-linear predictive methods: Tree, Bagging, Random Forest

A tree is a non-linear method and is considered the basic building block for Bagging and Random Forest.

10.1 Tree

A single-tree prediction using CART partitions the feature space into a set of rectangles and assigns a prediction to each rectangle. Estimating the VC dimension [3], multivariate splitting criteria [10] and the comparison of pruning methods [16] are beyond the scope of this paper. R [22] offers the package rpart, which essentially implements CART. The tree shown in figure 11 uses the variable fft coef 0000 for its construction, with a root node error of 0.67497.
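Each rectangle boundary in such a tree is an axis-aligned cut on one feature. CART-style learners commonly choose the cut by minimising an impurity measure such as the Gini index. The sketch below shows that search for a single feature. It is an illustration only: the function names are assumptions, and the text does not state which splitting criterion or settings were used with rpart.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best cut 'value <= threshold'
    on one feature, or None when the feature takes a single value."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = None
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between two equal values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or score < best[1]:
            best = (threshold, score)
    return best


if __name__ == "__main__":
    # Hypothetical values of one feature with their activity labels.
    feature = [1.0, 2.0, 3.0, 4.0]
    activity = ["standing", "standing", "running", "running"]
    print(best_split(feature, activity))
```

A full tree repeats this search over every feature at every node and recurses into the two resulting subsets, which is what produces the rectangular partition.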

Figure 11: Tree : Variable used in tree construction fft coef 0000

Confusion Matrix: Random Forest (rows: actual class; columns: predicted class)

Class label   Standing   Walking   Running   Others
Standing           126         3         4        3
Walking             90       380         6        0
Running              1        26        35        0
Others              24         1         0      258

Table 4: Confusion Matrix for Random Forest
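The overall accuracy and the per-class rates follow directly from the counts in table 4. The sketch below recomputes them, treating each class in turn as the positive class against all others. The function names are assumptions, and the code assumes that every class occurs at least once both as an actual and as a predicted label, so that no denominator is zero.

```python
import math

LABELS = ["Standing", "Walking", "Running", "Others"]
# Counts from table 4; rows are actual classes, columns are predicted classes.
CONFUSION = [
    [126, 3, 4, 3],
    [90, 380, 6, 0],
    [1, 26, 35, 0],
    [24, 1, 0, 258],
]


def accuracy(matrix):
    """Fraction of instances on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def one_vs_rest(matrix, k):
    """Return (TP, FP, FN, TN) for class k against all other classes."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    fn = sum(matrix[k]) - tp
    fp = sum(row[k] for row in matrix) - tp
    tn = total - tp - fn - fp
    return tp, fp, fn, tn


def class_metrics(matrix, k):
    """TP rate, FP rate, precision and MCC for class k."""
    tp, fp, fn, tn = one_vs_rest(matrix, k)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "tp_rate": tp / (tp + fn),
        "fp_rate": fp / (fp + tn),
        "precision": tp / (tp + fp),
        "mcc": (tp * tn - fp * fn) / denom,
    }


if __name__ == "__main__":
    print(f"accuracy = {accuracy(CONFUSION):.4f}")
    for k, label in enumerate(LABELS):
        metrics = class_metrics(CONFUSION, k)
        print(label, {name: round(value, 3) for name, value in metrics.items()})
```

The accuracy evaluates to 799/957, or about 0.8349, which agrees with the figure quoted in the abstract.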

10.2 Bagging

Bagging, as described in section 8.2, grows multiple trees, each on a different bootstrap sample drawn following the method in section 8.1. Since the task is classification rather than regression, the prediction for a new sample is the majority vote across all the trees.

11 Results

For the random forest classifier, the confusion matrix (table 4) and the detailed accuracy for all four activities, including the Matthews correlation coefficient (MCC), are shown in table 5. A random forest of 10 trees, each constructed while considering 7 random features, results in an out-of-bag error estimate of 0.1018. Since cross-validation was not performed for the random forest, the figures in table 5 do not rule out over-fitting on the training data. Among the classifiers compared, random forest performed best in this task. Although the 'curse of dimensionality' is a concern, since each feature vector has 65 features, the results are satisfactory when compared to the rest of the classifiers. Out of 957 instances supplied as the test set, 799 were correctly classified and the remaining 158 were incorrectly classified. The lazy learning algorithm KStar classified 653 instances correctly and 304 incorrectly, while Naive Bayes performed poorly, classifying 564 instances correctly and 393 incorrectly. One of the key statistics in the results shown for random forest in table 5 is the MCC, which serves as a measure of the quality of binary classifications: a value of +1 indicates perfect prediction and -1 indicates total disagreement between observed and predicted values.

Detailed accuracy by class

Class label     TP Rate   FP Rate   Precision   Recall   F-Measure   MCC
Standing        0.926     0.140     0.523       0.926    0.668       0.633
Walking         0.798     0.062     0.927       0.798    0.858       0.744
Running         0.565     0.011     0.778       0.565    0.654       0.643
Others          0.912     0.004     0.989       0.912    0.949       0.930
Weighted Avg.   0.835     0.053     0.878       0.835    0.845       0.776

Table 5: Detailed accuracy by class

To evaluate the performance of the Naive Bayes, KStar, and Random Forest classifiers on this skewed dataset, it is necessary to examine a cost-sensitive measure, the ROC curve of each classifier. The area under the curve (AUC) in the plots of figure 12 illustrates the classification power of a classifier: the larger the area under the curve, the better the classifier. Perfect classification corresponds to the upper-left corner of the graph, where the true positive rate is 100% and the false positive rate is 0%, which is unattainable in practice. From the plot in figure 12 it is evident that, for the class label standing, the area under the ROC curve for Random Forest is greater than that of its counterparts and appears fairly close to optimal.

Figure 12: Multiple ROC curves for class label ’Standing’
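An ROC curve for one class label can be traced by sorting the test instances by the classifier's score for that class and lowering the decision threshold step by step. The sketch below does this and integrates the curve with the trapezoidal rule. It is an illustration only: the scores are hypothetical, and the curves in figure 12 were not produced with this code.

```python
def roc_points(scores, is_positive):
    """Return ROC points (FPR, TPR), sweeping the threshold from high to low."""
    positives = sum(is_positive)
    negatives = len(is_positive) - positives
    ranked = sorted(zip(scores, is_positive), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Instances that share a score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under a piecewise-linear ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


if __name__ == "__main__":
    # Hypothetical scores for the class 'Standing' and the true membership flags.
    scores = [0.9, 0.8, 0.7, 0.6]
    is_standing = [True, False, True, False]
    print(auc(roc_points(scores, is_standing)))
```

An area of 1.0 corresponds to the upper-left corner being reached, while 0.5 is what a classifier that scores all instances alike would obtain.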

12 Conclusion

Boosting could be used to improve the classification accuracy further. More sophisticated techniques, such as temporal probabilistic models, have been shown to perform well in activity recognition and generally outperform non-temporal models [19]. Such algorithms are used in advanced sensor fusion techniques where the classification tasks are more complex. Incremental classifiers such as the Hoeffding tree [15], an example of the Very Fast Decision Tree (VFDT), are better suited to learning from massive data streams. For feature selection, measures such as InfoGain combined with the Ranker search method can rank the features of the vector, which reduces the search space significantly. Incorporating such measures and classification techniques could raise the classification rates and hence the overall accuracy.

References

[1] Low pass filter: smoothing sensor data with a low-pass filter. http://blog.thomnichols.org/2011/08/smoothing-sensor-data-with-a-low-pass-filter.

[2] Motion Sensors API guides. http://developer.android.com/guide/topics/sensors/sensors_motion.html.

[3] Ozlem Aslan, Olcay Taner Yildiz, and Ethem Alpaydin. Calculating the VC-dimension of decision trees. In Computer and Information Sciences, 2009. ISCIS 2009. 24th International Symposium on, pages 193–198. IEEE, 2009.

[4] Leo Breiman. Random forests. Machine learning, 45(1):5–32, 2001.

[5] Andreas Bulling, Ulf Blanke, and Bernt Schiele. A tutorial on human activity recognition using body-worn inertial sensors. ACM Computing Surveys, to appear 2014.

[6] Andrew T. Campbell. Smartphone Programming. http://www.cs.dartmouth.edu/~campbell/cs65/myruns/myruns_manual.html#chap:labs:3, 2013. [Online; accessed 21-June-2014].

[7] Sauvik Das, LaToya Green, Beatrice Perez, Michael Murphy, and A Perring. Detecting user activities using the accelerometer on android smartphones. The Team for Research in Ubiquitous Secure Technology, TRUST-REU Carnegie Mellon University, 2010.

[8] Android Developer. Sensor Coordinate System. http://developer.android.com/guide/topics/sensors/sensors_overview.html, 2014. [Online; accessed 10-June-2014].

[9] Irina Diaconita, Andreas Reinhardt, Frank Englert, Delphine Christin, and Ralf Steinmetz. Do you hear what I hear? Using acoustic probing to detect smartphone locations. In Proceedings of the 1st Symposium on Activity and Context Modeling and Recognition (ACOMORE), pages 1–9, Mar 2014.

[10] Richard O Duda, Peter E Hart, and David G Stork. Pattern classification. John Wiley & Sons, 2012.

[11] Manhyung Han, La The Vinh, Young-Koo Lee, and Sungyoung Lee. Comprehensive context recognizer based on multimodal sensors in a smartphone. Sensors, 12(9):12588–12605, 2012.

[12] Trevor Hastie, Robert Tibshirani, and Jerome Friedman. The elements of statistical learning, volume 2. Springer.

[13] Clemens Holzmann and Michael Haslgrubler. A self-organizing approach to activity recognition with wireless sensors. In Proceedings of the 4th International Workshop on Self-Organizing Systems (IWSOS 2009), ETH Zurich, Switzerland, December 2009. Springer LNCS.

[14] Seyed Amir Hoseini-tabatabaei, Alexander Gluhak, and Rahim Tafazolli. A survey on smartphone-based systems for opportunistic user context recognition. ACM Computing Surveys, 45(3), 2013.

[15] Geoff Hulten, Laurie Spencer, and Pedro Domingos. Mining time-changing data streams. In ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, pages 97–106. ACM Press, 2001.

[16] Oded Maimon and Lior Rokach. Data Mining and Knowledge Discovery Handbook. Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2005.

[17] W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling. Numerical recipes in Fortran: The art of scientific computing, 2nd ed. Cambridge Univ. Press, Cambridge, pages 407–411, 1989.

[18] Muhammad Shoaib, Hans Scholten, and P.J.M. Havinga. Towards physical activity recognition using smartphone sensors. In 10th IEEE International Conference on Ubiquitous Intelligence and Computing, UIC 2013, pages 80–87, Los Alamitos, CA, USA, December 2013. IEEE Computer Society.

[19] T.L.M. van Kasteren, G. Englebienne, and B.J.A. Kröse. Human activity recognition from wireless sensor network data: Benchmark and software. In Liming Chen, Chris D. Nugent, Jit Biswas, and Jesse Hoey, editors, Activity Recognition in Pervasive Intelligent Environments, volume 4 of Atlantis Ambient and Pervasive Intelligence, pages 165–186. Atlantis Press, 2011.

[20] Weka. Serialization. http://weka.wikispaces.com/Serialization, 2009. [Online; accessed 19-July-2014].

[21] Xiaochao Yang, Chuang-Wen You, Hong Lu, Mu Lin, Nicholas D Lane, and Andrew T Campbell. Visage: A face interpretation engine for smartphone applications. In Mobile Computing, Applications, and Services, pages 149–168. Springer, 2013.

[22] Yanchang Zhao. Decision Trees. http://www.rdatamining.com/examples/decision-tree, 2014. [Online; accessed 10-August-2014].