Machine Learning for Stock Selection

Robert J. Yan, Charles X. Ling
University of Western Ontario, Canada
{jyan, cling}@csd.uwo.ca

Page 1: Machine Learning for Stock Selection

Robert J. Yan, Charles X. Ling
University of Western Ontario, Canada
{jyan, cling}@csd.uwo.ca

Page 2: Outline

Introduction
The stock selection task
The Prototype Ranking method
Experimental results
Conclusions

Page 3: Introduction

Objective:
– Use machine learning to select a small number of “good” stocks to form a portfolio

Research questions:
– Learning from a noisy dataset
– Learning from an imbalanced dataset

Our solution: Prototype Ranking
– A specially designed machine learning method

Page 5: Stock Selection Task

Given information prior to week t, predict the performance of stocks in week t
– Training set:

            Predictor 1          Predictor 2          Predictor 3               Goal
Stock ID    Return of week t-1   Return of week t-2   Volume ratio of t-2/t-1   Return of week t

Learn a ranking function to rank the testing data
– Select the n highest-ranked stocks to buy and the n lowest-ranked stocks to short-sell
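
The task setup above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `make_row` and `select_long_short`, the list-based inputs, and the reading of "volume ratio of t-2/t-1" as the week t-2 volume divided by the week t-1 volume are not from the slides.

```python
from typing import Dict, List, Sequence, Tuple


def make_row(returns: Sequence[float], volumes: Sequence[float],
             t: int) -> Tuple[Tuple[float, float, float], float]:
    """Build one training row for week t (hypothetical helper).

    Predictors: return of week t-1, return of week t-2, and the
    volume ratio (t-2)/(t-1). Goal: return of week t.
    """
    if t < 2 or t >= len(returns):
        raise ValueError("t must satisfy 2 <= t < len(returns)")
    predictors = (returns[t - 1], returns[t - 2],
                  volumes[t - 2] / volumes[t - 1])
    return predictors, returns[t]


def select_long_short(scores: Dict[str, float],
                      n: int) -> Tuple[List[str], List[str]]:
    """Return (buy, short): the n highest- and n lowest-scored stock IDs."""
    if n < 0 or 2 * n > len(scores):
        raise ValueError("need 0 <= 2n <= number of stocks")
    ranked = sorted(scores, key=lambda s: scores[s], reverse=True)
    buy = ranked[:n]
    short = ranked[len(ranked) - n:][::-1]  # lowest score first
    return buy, short
```

Here `scores` stands for whatever the learned ranking function outputs for each stock in the test week; any model that produces a number per stock can be plugged in.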

Page 7: Prototype Ranking

Prototype Ranking (PR): a machine learning method designed for noisy and imbalanced stock data

The PR system:
Step 1. Find good “prototypes” in the training data
Step 2. Use k-NN on the prototypes to rank the test data

Page 8: Step 1: Finding Prototypes

Prototypes: representative points
– Goal: discover the underlying density/clusters of the training samples by distributing prototypes in the sample space
– Reduce the data size

[Figure: samples, prototypes, and a prototype neighborhood]

Page 9: Finding Prototypes Using Competitive Learning

General competitive learning:
Step 1: Randomly initialize a set of prototypes
Step 2: Search for the nearest prototypes
Step 3: Adjust the prototypes
Step 4: Output the prototypes

The hidden density of the training data is reflected in the prototypes
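
The four general steps can be illustrated with a plain winner-take-all update. This is a minimal sketch of generic competitive learning, not the authors' implementation: initializing prototypes from random samples, the learning rate `lr`, the number of `epochs`, and the squared Euclidean distance are all assumptions.

```python
import random
from typing import List, Sequence


def sq_dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def competitive_learning(samples: Sequence[Sequence[float]],
                         num_prototypes: int, epochs: int = 20,
                         lr: float = 0.1, seed: int = 0) -> List[List[float]]:
    rng = random.Random(seed)
    # Step 1: randomly initialize the prototypes (here: copies of random samples).
    prototypes = [list(p) for p in rng.sample(list(samples), num_prototypes)]
    for _ in range(epochs):
        order = list(samples)
        rng.shuffle(order)
        for x in order:
            # Step 2: find the prototype nearest to the sample (the "winner").
            w = min(range(num_prototypes),
                    key=lambda i: sq_dist(prototypes[i], x))
            # Step 3: move the winner a fraction lr of the way toward the sample.
            prototypes[w] = [p + lr * (xi - p)
                             for p, xi in zip(prototypes[w], x)]
    # Step 4: output the prototypes.
    return prototypes
```

Because each update is a convex combination of a prototype and a sample, prototypes drift toward regions where samples are dense, which is the sense in which they reflect the density of the training data.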

Page 10: Modifications for Stock Data

In step 1: Initial prototypes are organized in a tree structure
– Fast nearest-prototype search

In step 2: Search for prototypes in the predictor space
– Better learning effect for prediction tasks

In step 3: Adjust prototypes in the goal attribute space
– Better learning effect on the imbalanced stock data

In step 4: Prune the prototype tree
– Prune child prototypes if they are similar to their parent
– Combine the leaf prototypes to form the final prototypes
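
One possible reading of the step 2 and step 3 modifications is sketched below: the winning prototype is chosen by distance over the predictors only, and the update is applied to its goal value. The slides do not give the exact update rule, so the dictionary layout, the function name `update_goal`, and the learning rate are assumptions; the tree structure and pruning of steps 1 and 4 are left out.

```python
from typing import Dict, List, Sequence


def update_goal(prototypes: List[Dict], x: Sequence[float], y: float,
                lr: float = 0.1) -> Dict:
    """Apply one training sample (x, y) to a flat list of prototypes.

    Each prototype is a dict {"x": predictor vector, "y": goal value}
    (a hypothetical layout chosen for this sketch).
    """
    # Step 2 (assumed reading): search for the winner in the predictor space.
    winner = min(prototypes,
                 key=lambda p: sum((a - b) ** 2 for a, b in zip(p["x"], x)))
    # Step 3 (assumed reading): adjust the winner in the goal attribute space.
    winner["y"] += lr * (y - winner["y"])
    return winner
```

For example, with prototypes at predictors `[0.0, 0.0]` and `[1.0, 1.0]`, a sample with predictors `[0.9, 1.2]` updates only the goal value of the second prototype.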

Page 11: Step 2: Predicting Test Data

Predict with the weighted average of the k nearest prototypes
Update the model online with new data
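
A weighted k-NN prediction over prototypes might look like the following sketch. The slides do not say how the weights are chosen; inverse-distance weighting, the `(predictors, goal)` tuple layout, and the function names `predict` and `rank` are assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Prototype = Tuple[Sequence[float], float]  # (predictor vector, goal value)


def predict(prototypes: Sequence[Prototype], x: Sequence[float],
            k: int = 3, eps: float = 1e-9) -> float:
    """Weighted average of the goal values of the k nearest prototypes."""
    nearest = sorted(prototypes, key=lambda p: math.dist(p[0], x))[:k]
    # Assumed weighting: inverse distance; eps avoids division by zero.
    weights = [1.0 / (math.dist(p[0], x) + eps) for p in nearest]
    return sum(w * p[1] for w, p in zip(weights, nearest)) / sum(weights)


def rank(prototypes: Sequence[Prototype],
         test_points: Dict[str, Sequence[float]], k: int = 3) -> List[str]:
    """Return stock IDs ordered from highest to lowest predicted goal."""
    scores = {sid: predict(prototypes, x, k) for sid, x in test_points.items()}
    return sorted(scores, key=lambda s: scores[s], reverse=True)
```

The top and bottom of the list returned by `rank` are then the candidates for the buy and short-sell sides of the portfolio.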

Page 13: Data

CRSP daily stock database
– 300 NYSE and AMEX stocks with the largest market capitalization
– From 1962 to 2004

Page 14: Testing PR

Experiment 1: Larger portfolio, lower average return, lower risk (diversification)

Experiment 2: Is PR better than Cooper’s method?

Page 15: Results of Experiment 1

[Figure: two plots against the number of stocks in the portfolio (0 to 110). Average Return (1978-2004): weekly average return (%), axis from 0 to 1.8. Risk (std) (1978-2004): weekly standard deviation (%), axis from 2 to 5.]

Page 16: Experiment 2: Comparison to Cooper’s Method

Cooper’s method (CP): A traditional non-ML method for stock selection…

Compare PR and CP in 10-stock portfolios

Page 17: Results of Experiment 2

Measures:
– Average Return (Ret.)
– Sharpe Ratio (SR): a risk-adjusted return, SR = Ret. / Std.

[Figure: bar chart of Ret. (%) and SR for the PR 10-stock portfolio and the CP 10-stock portfolio; vertical axis from 0 to 1.6]
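
The Sharpe ratio as defined on this slide (average return divided by standard deviation, with no risk-free rate subtracted) can be computed as in the sketch below. Using the sample standard deviation is an assumption; the slides do not say whether the sample or population form was used.

```python
from statistics import mean, stdev
from typing import Sequence


def sharpe_ratio(weekly_returns: Sequence[float]) -> float:
    """SR = Ret. / Std., following the definition on the slide."""
    sd = stdev(weekly_returns)  # sample standard deviation (assumption)
    if sd == 0:
        raise ValueError("standard deviation is zero; SR is undefined")
    return mean(weekly_returns) / sd
```

For two weekly returns of 1% and 3%, the mean is 2 and the sample standard deviation is the square root of 2, so the ratio is also the square root of 2.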

Page 19: Conclusions

PR: modified competitive learning and k-NN for noisy and imbalanced stock data

PR does well in stock selection
– Larger portfolio, lower return, lower risk
– PR outperforms the non-ML method CP

Future work: use it to invest and make money!