
A hybrid SOFM-SVR with a filter-based feature selection for stock market forecasting

Huang, C. L. & Tsai, C. Y.

Expert Systems with Applications 2008

Introduction

Stock market price index prediction is regarded as a challenging task in finance.

Support vector regression (SVR) has successfully solved prediction problems in many domains, including the stock market.

Introduction

filter-based feature selection to choose important input attributes

SOFM algorithm to cluster the training samples

SVR to predict the stock market price index

Using a real futures dataset, Taiwan index futures (FITX), to predict the next day's price index

Introduction

SOFM+SVR: to improve the prediction accuracy of the traditional SVR method and to reduce its long training time.

SOFM+SVR+filter-based feature selection: improvement in training time, prediction accuracy, and the ability to select a better feature subset is achieved.

SVR

Unlike pattern recognition problems, where the desired outputs are discrete values (e.g., Boolean), support vector regression (SVR) deals with real-valued functions.

Self-organizing Feature Maps (SOFM)

SOFM

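The clustering idea behind a SOFM can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a 1-D map, a Gaussian neighbourhood, and linearly decaying learning rate and neighbourhood width, and the names `train_sofm` and `best_matching_unit` are hypothetical.

```python
import math
import random

def best_matching_unit(weights, x):
    """Index of the unit whose weight vector is closest to x (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda j: sum((wk - xk) ** 2 for wk, xk in zip(weights[j], x)))

def train_sofm(samples, n_units, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a 1-D self-organizing feature map; returns one weight vector per unit."""
    rng = random.Random(seed)
    # Initialise each unit's weights from a randomly chosen training sample.
    weights = [list(rng.choice(samples)) for _ in range(n_units)]
    for epoch in range(epochs):
        # Learning rate and neighbourhood width both decay over the epochs (assumed schedule).
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        sigma = max(sigma0 * (1.0 - frac), 1e-3)
        for x in rng.sample(samples, len(samples)):  # visit samples in random order
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood measured on the 1-D unit index.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
    return weights
```

After training, `best_matching_unit(weights, x)` gives the cluster index of a sample, which is how each training sample would be routed to its own regression model.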

Training the SOFM-SVR model

1. Scaling the training set
2. Clustering the training dataset
3. Training the individual SVR models for each cluster
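Steps 2 and 3 can be sketched as a partition-then-fit loop. This is an illustrative sketch only: `assign_cluster` and `fit_regressor` are hypothetical callables standing in for the SOFM lookup and the SVR training routine, and `fit_mean` is a trivial stand-in regressor used so the example runs without any SVR library. The inputs are assumed to be scaled already.

```python
from collections import defaultdict

def train_per_cluster(X, y, assign_cluster, fit_regressor):
    """Partition (X, y) by cluster and fit one regressor per cluster.

    assign_cluster(x) -> cluster id          (hypothetical; e.g. a SOFM lookup)
    fit_regressor(Xc, yc) -> predict(x)      (hypothetical; e.g. wraps SVR training)
    """
    groups = defaultdict(lambda: ([], []))
    for x, target in zip(X, y):
        Xc, yc = groups[assign_cluster(x)]
        Xc.append(x)
        yc.append(target)
    # One independent model per cluster, trained only on that cluster's samples.
    return {cid: fit_regressor(Xc, yc) for cid, (Xc, yc) in groups.items()}

def fit_mean(Xc, yc):
    """Stand-in for SVR training: always predicts the cluster's mean target."""
    mean = sum(yc) / len(yc)
    return lambda x: mean
```

Because each model sees only its own cluster, every individual training problem is smaller than the full dataset, which is consistent with the training-time motivation stated in the introduction.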

Training the SOFM-SVR model

Parameters Optimization

Proper setting of the SVR parameters can improve the SVR prediction accuracy.

Using the RBF kernel and the ε-insensitive loss function, three parameters, C, r, and ε, must be determined in the SVR model.

The grid search approach is a common method to search for the C, r, and ε values.
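For reference, the roles of the three parameters can be read off the standard textbook forms of the RBF kernel, the ε-insensitive loss, and the SVR objective. These formulas are not taken from the slides; in particular, it is an assumption that r is the kernel width parameter multiplying the squared distance (often written γ elsewhere).

```latex
K(x_i, x_j) = \exp\left(-r \,\lVert x_i - x_j \rVert^2\right)

L_\varepsilon\bigl(y, f(x)\bigr) = \max\bigl(0,\; \lvert y - f(x) \rvert - \varepsilon\bigr)

\min_{w,\,b,\,\xi,\,\xi^*} \; \tfrac{1}{2}\lVert w \rVert^2 + C \sum_{i=1}^{n} \bigl(\xi_i + \xi_i^*\bigr)
```

Here ε sets the width of the tube inside which errors are ignored, C weights the penalty on errors outside the tube, and r controls how quickly the kernel similarity decays with distance.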

Grid Search Approach
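A grid search simply evaluates every combination of candidate values and keeps the best one. The sketch below assumes a caller-supplied `score(C, r, eps)` function (hypothetical) that would train an SVR with those parameters and return a validation error; the candidate grids are illustrative and not taken from the slides.

```python
import itertools

def grid_search(score, C_grid, r_grid, eps_grid):
    """Return (best_error, best_params) after trying every (C, r, eps) triple.

    score(C, r, eps) -> validation error, lower is better (hypothetical callable).
    """
    best = None
    for C, r, eps in itertools.product(C_grid, r_grid, eps_grid):
        err = score(C, r, eps)
        if best is None or err < best[0]:
            best = (err, {"C": C, "r": r, "epsilon": eps})
    return best

# Exponentially spaced candidates are an assumption, chosen only for illustration.
C_grid = [2.0 ** k for k in range(-1, 6, 2)]
r_grid = [2.0 ** k for k in range(-5, 0, 2)]
eps_grid = [0.01, 0.1]
```

The cost grows with the product of the grid sizes, since each triple requires a full training run.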

Evaluating the SOFM-SVR model with test set

Scale the test set based on the scaling equation according to the attribute range of the training set

Find the cluster to which each sample in the test set belongs

Calculate the predicted value for each sample in the test set

Calculate the prediction accuracy for the test set
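The evaluation steps can be sketched as follows. This is a minimal illustration under stated assumptions: min-max scaling to [0, 1] (the slides do not give the scaling equation), cluster prototypes compared by squared Euclidean distance, and `models` as a hypothetical list of per-cluster prediction functions.

```python
def fit_minmax(train_rows):
    """Record the per-attribute (min, max) from the training set only."""
    return [(min(col), max(col)) for col in zip(*train_rows)]

def scale(row, ranges):
    """Map each attribute to [0, 1] using the training ranges.

    Test values outside the training range fall outside [0, 1].
    """
    return [(v - lo) / (hi - lo) if hi > lo else 0.0
            for v, (lo, hi) in zip(row, ranges)]

def nearest_cluster(x, prototypes):
    """Index of the closest cluster prototype (squared Euclidean distance)."""
    return min(range(len(prototypes)),
               key=lambda j: sum((p - v) ** 2 for p, v in zip(prototypes[j], x)))

def predict(row, ranges, prototypes, models):
    """Scale a raw test sample, route it to its cluster, and apply that cluster's model."""
    x = scale(row, ranges)
    return models[nearest_cluster(x, prototypes)](x)
```

Reusing the training ranges for the test set keeps test samples in the same coordinate system that the clusters and models were fitted in.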

SOFM-SVR model

SOFM-SVR combined with filter-based feature selection

X is a certain input variable (i.e., feature); Y is the response variable (i.e., label); n is the number of training samples.
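The scoring formula itself did not survive in this transcript. As an illustration of how a filter-based method ranks features independently of the regressor, the sketch below assumes the score is the squared Pearson correlation between a feature X and the response Y over the n training samples; the function names are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between one feature x and the response y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no linear information
    return (sxy * sxy) / (sxx * syy)

def rank_features(columns, y):
    """Return feature names sorted from highest to lowest score."""
    return sorted(columns, key=lambda name: r_squared(columns[name], y), reverse=True)
```

A filter of this kind scores each feature once, before any model is trained, so it adds little to the overall training time.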

SOFM-SVR filter-based feature selection

Performance measures

A_i is the actual value of sample i; F_i is the predicted value of sample i; n is the number of samples.
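The formulas for the measures were lost in extraction. MAPE is named later in the results, so a minimal sketch of its usual definition in terms of A_i, F_i, and n is given here; any other measures on the original slide are not reconstructed.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent: (100/n) * sum(|A_i - F_i| / |A_i|)."""
    n = len(actual)
    return 100.0 / n * sum(abs((a - f) / a) for a, f in zip(actual, forecast))
```

For example, actual values 100 and 200 with forecasts 110 and 180 are each off by 10 percent, so the MAPE is 10.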

Experimental data set

SOFM-SVR with various numbers of clusters in dataset #1

Accuracy measures with various numbers of clusters

Wilcoxon signed-rank test

Wilcoxon signed-rank test on the prediction errors for the SOFM-SVR with various numbers of clusters
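The test statistic for paired prediction errors can be sketched as below. This computes only the positive and negative rank sums, using average ranks for ties and dropping zero differences; it does not compute a p-value, and the function name is hypothetical.

```python
def wilcoxon_signed_rank(errors_a, errors_b):
    """Return (W_plus, W_minus) for paired samples, dropping zero differences."""
    diffs = [a - b for a, b in zip(errors_a, errors_b) if a != b]
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(order):
        # Find the run of tied absolute differences and give each its average rank.
        j = i
        while j + 1 < len(order) and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return w_plus, w_minus
```

When the two models perform similarly, the two rank sums are close to each other; a large imbalance suggests that one model's errors are systematically smaller.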

Results of SOFM-SVR using three clusters

Results of SOFM-SVR with selected features

Original Feature vs. Original Feature

Original Feature

Original Feature

Wilcoxon signed-rank test

Important Feature

MA10: 10-day moving average.
MACD9: 9-day moving average convergence/divergence.
+DI10: directional indicator up.
-DI10: directional indicator down.
K10: 10-day stochastic index K.
PSY10: 10-day psychological line.
D9: 9-day stochastic index D.
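Two of these indicators are simple enough to sketch directly from daily closing prices. The slides do not give the formulas, so the definitions below are the common textbook ones and are an assumption: a simple (unweighted) moving average, and a psychological line defined as the percentage of rising days within the window.

```python
def moving_average(prices, window=10):
    """Simple moving average; one value per day once `window` days are available."""
    return [sum(prices[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(prices))]

def psychological_line(prices, window=10):
    """Percentage of the last `window` daily changes that were increases."""
    ups = [1 if b > a else 0 for a, b in zip(prices, prices[1:])]
    return [100.0 * sum(ups[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(ups))]
```

With the default `window=10` these correspond to MA10 and PSY10 under the stated assumptions.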

Relative importance of the selected features

Wilcoxon signed-rank test: SOFM-SVR vs. single SVR

MAPE comparison: SOFM-SVR vs. single SVRs.

Training time comparisons: SOFM-SVR vs. single SVRs.

Conclusion

The hybrid SOFM-SVR with filter-based feature selection improves prediction accuracy and reduces training time for daily stock index prediction.

Further research directions include using optimization algorithms (e.g., genetic algorithms) to optimize the SVR parameters, and performing feature selection with a wrapper-based approach that combines SVR with other optimization tools.

Thank You