
Risk Prediction Using Quantified News

Vu Anh Huynh¹, Liang Zhang²

¹ Department of Aeronautics and Astronautics, MIT
² Department of Nuclear Science and Engineering, MIT

Acknowledgement: Dan diBartolomeo, Anish Shah, Louis Scott

Northfield's 18th Annual Summer Seminar

June 7, 2013

Huynh, Zhang (MIT) Risk Prediction Using Quantified News June 7, 2013 1 / 16


Outline

1 Problem Definition

2 Approach

3 Data

4 Regression Model

5 Results


Problem Definition

AAPL

Risk Prediction Using Quantified News

Given time-series of prices and intraday volatility of a security,

Given time-series of quantified news for the same security,

Build a model to predict the next intraday volatility.


Mathematical Formulation

Risk Prediction Using Quantified News

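$\{P_t\}_{0 \le t}$: price time series.

$\{\sigma_t\}_{0 \le t}$: intraday volatility time series, with $\sigma_t = \sqrt{252\pi/8}\,\log(P_{\mathrm{high}}/P_{\mathrm{low}})$.

$\{n_t\}_{0 \le t}$, where $n_t \in \mathbb{R}^K$: news time series, with $n_t = 0$ for dates without news.

Find a model $F$ such that:

$$g(\sigma_{t+1}) = F\left(\{P_\tau\}_{0 \le \tau \le t},\ \{\sigma_\tau\}_{0 \le \tau \le t},\ \{n_\tau\}_{0 \le \tau \le t}\right) + \epsilon_{t+1}, \tag{1}$$

where $g : \mathbb{R} \to \mathbb{R}$ is a function that describes a property of $\sigma_{t+1}$.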

The choice of the function g is important.
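The range-based volatility defined above can be computed directly from a day's high and low prices. The following is a minimal sketch of that formula only; the function name, the input check, and the sample prices are illustrative assumptions, not part of the original slides.

```python
import math

# Annualization constant from the formulation: sqrt(252 * pi / 8).
RANGE_SCALE = math.sqrt(252.0 * math.pi / 8.0)

def intraday_volatility(high: float, low: float) -> float:
    """Annualized range-based volatility for one trading day (as a fraction)."""
    if low <= 0 or high < low:
        raise ValueError("expected 0 < low <= high")
    return RANGE_SCALE * math.log(high / low)

# Hypothetical daily highs and lows, for illustration only.
highs = [101.0, 103.5, 102.0]
lows = [99.0, 100.5, 101.0]
sigmas = [intraday_volatility(h, l) for h, l in zip(highs, lows)]
```

The result is a fraction; multiply by 100 to express it in percent.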


Approach: Using regression and model selection

There are many approaches to construct such a model $F$.

Each approach is chosen based on the question that we pose about the prediction ability.

In this work, we use regression and model selection to answer the question: what is the numerical value of tomorrow's intraday volatility? (i.e., $g(x) = x$)

$$\sigma_{t+1} = F\left(\{P_\tau\}_{0 \le \tau \le t},\ \{\sigma_\tau\}_{0 \le \tau \le t},\ \{n_\tau\}_{0 \le \tau \le t}\right) + \epsilon_{t+1}.$$

We will compare performance against a base model, which is a random walk model.
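A random walk forecast takes today's value as the prediction for tomorrow. The sketch below shows such a baseline together with one way to score it. The slides do not state how $R^2$ is computed, so the standard coefficient of determination ($1 - \mathrm{SSE}/\mathrm{SST}$) is assumed here; the function names and the sample series are hypothetical.

```python
import numpy as np

def random_walk_forecast(series: np.ndarray) -> np.ndarray:
    """Predict each next value as the current one: x_hat[t+1] = x[t]."""
    return series[:-1]

def r_squared(actual: np.ndarray, predicted: np.ndarray) -> float:
    """Standard coefficient of determination, 1 - SSE / SST (assumed definition)."""
    sse = np.sum((actual - predicted) ** 2)
    sst = np.sum((actual - np.mean(actual)) ** 2)
    return float(1.0 - sse / sst)

# Hypothetical volatility series, for illustration only.
sigma = np.array([0.20, 0.22, 0.21, 0.30, 0.28, 0.25])
predicted = random_walk_forecast(sigma)  # forecasts for days 1..5
actual = sigma[1:]                       # realized values for days 1..5
baseline_r2 = r_squared(actual, predicted)
```

Under this definition $R^2$ is negative whenever the forecast does worse than predicting the sample mean, so a negative baseline score is possible.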


Data: RavenPack – analytics engine converting news text to quantitative data

News data $\{n_t\}_{0 \le t}$ provided by RavenPack.

More than half a million pieces of news per month (nearly 1000 news items per hour).

For each news item, detailed information is provided, such as type, relevance, event sentiment, novelty, etc. (see next slide)

| Time | Entity | Relevance | ESS | CSS | AEV | NIP | AEV |
|------|-----------|-----------|-----|-----|-----|-----|-----|
| t1 | company A | 100 | 85 | 55 | 77 | 12 | 68 |
| t2 | company B | ... | ... | ... | ... | ... | ... |
| t3 | company A | ... | ... | ... | ... | ... | ... |
| t4 | company C | ... | ... | ... | ... | ... | ... |
| t5 | company B | ... | ... | ... | ... | ... | ... |


Data

News Data Field Descriptions (highlights)

Entity: company name.

RELEVANCE (N): 0-100, indicates how closely related the entity is to the underlying news; can be converted to the number of news items per day.

ESS: 0-100, represents the news sentiment, i.e. higher values indicate positive sentiment, while values below 50 indicate negative sentiment.

AES: 0-100, the percentage of positive events measured over a rolling 90-day window.

AEV: the count of events measured over a rolling 90-day window.

ENS: 0-100, represents how "new" or novel a news item is.

CSS: 0-100, a sentiment score combining various analysis techniques.

NIP: 0-100, the degree of impact a news flash has on the market over the following 2-hour period.
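To feed such records into the formulation, the per-item scores have to be collapsed into one vector $n_t$ per day, with $n_t = 0$ on dates without news. The slides do not say how this aggregation is done, so the sketch below is purely an assumption: it counts items above a relevance threshold and averages the remaining scores. The record layout, field selection, threshold, and sample values are all hypothetical.

```python
from collections import defaultdict
from datetime import date, timedelta

FIELDS = ("ESS", "CSS", "ENS", "NIP")  # assumed subset of score fields

def daily_news_vectors(records, start, end, min_relevance=0):
    """Return {day: [N, mean ESS, mean CSS, mean ENS, mean NIP]} for start..end."""
    by_day = defaultdict(list)
    for rec in records:
        if rec["RELEVANCE"] >= min_relevance:
            by_day[rec["date"]].append(rec)
    vectors = {}
    day = start
    while day <= end:
        items = by_day.get(day, [])
        if items:
            means = [sum(r[f] for r in items) / len(items) for f in FIELDS]
            vectors[day] = [float(len(items))] + means
        else:
            # n_t = 0 for dates without news, as in the formulation.
            vectors[day] = [0.0] * (1 + len(FIELDS))
        day += timedelta(days=1)
    return vectors

# Hypothetical records for one entity, for illustration only.
records = [
    {"date": date(2012, 1, 3), "RELEVANCE": 100, "ESS": 80, "CSS": 60, "ENS": 100, "NIP": 50},
    {"date": date(2012, 1, 3), "RELEVANCE": 90, "ESS": 40, "CSS": 50, "ENS": 60, "NIP": 30},
    {"date": date(2012, 1, 5), "RELEVANCE": 20, "ESS": 50, "CSS": 50, "ENS": 10, "NIP": 50},
]
vectors = daily_news_vectors(records, date(2012, 1, 3), date(2012, 1, 5), min_relevance=50)
```

This sketch iterates over calendar days; a real pipeline would more likely align the vectors to trading days.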


Regression Model Selection: General Result and Fundamental Limit

Consider nested models $F_1 \subseteq F_2 \subseteq F_3 \subseteq \dots$, whose parameters are estimated on the training data.

Split the data into a training set and a testing set.

Report the training error and testing error for each model $F_i$.

Implication: balance between the training error and the complexity of a model.
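The procedure above can be sketched with ordinary least squares on synthetic data. This is an illustration under stated assumptions, not the authors' code: the error metric (RMSE), the 70/30 chronological split, and the synthetic features standing in for news factors are all choices made for the example.

```python
import numpy as np

def fit_and_score(X_train, y_train, X_test, y_test):
    """Ordinary least squares fit; returns (training RMSE, testing RMSE)."""
    coef, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)
    rmse = lambda X, y: float(np.sqrt(np.mean((X @ coef - y) ** 2)))
    return rmse(X_train, y_train), rmse(X_test, y_test)

rng = np.random.default_rng(0)
n = 200
features = rng.normal(size=(n, 5))  # synthetic stand-ins for news factors
y = 0.8 * features[:, 0] + 0.3 * features[:, 1] + rng.normal(scale=0.5, size=n)

split = int(0.7 * n)  # chronological split: earlier rows train, later rows test
errors = []
for k in range(1, features.shape[1] + 1):
    # Model F_k uses an intercept and the first k factors, so F_1 is nested in F_2, etc.
    X = np.column_stack([np.ones(n), features[:, :k]])
    errors.append(fit_and_score(X[:split], y[:split], X[split:], y[split:]))

# Select the model with the lowest testing error.
best_k = 1 + min(range(len(errors)), key=lambda i: errors[i][1])
```

Because the models are nested, the training error cannot increase as factors are added, while the testing error may rise once the extra factors only fit noise.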


An Example of Nested Models

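1. $\log(\sigma_{t+1}) = \log(\sigma_t) + \epsilon_{t+1}$

2. $\log(\sigma_{t+1}) = \log(\sigma_t) + N_t + \epsilon_{t+1}$

3. $\log(\sigma_{t+1}) = \log(\sigma_t) + N_t + \mathrm{CSS}_t + \epsilon_{t+1}$

4. $\log(\sigma_{t+1}) = \log(\sigma_t) \ast N_t + \mathrm{CSS}_t + \epsilon_{t+1}$

5. $\log(\sigma_{t+1}) = \left(\log(\sigma_t) + \mathrm{CSS}_t\right) \ast N_t + \epsilon_{t+1}$

6. $\log(\sigma_{t+1}) = \left(\log(\sigma_t) + \mathrm{CSS}_t\right) \ast N_t + \mathrm{ENS}_t + \epsilon_{t+1}$

7. $\log(\sigma_{t+1}) = \left(\log(\sigma_t) + \mathrm{CSS}_t\right) \ast N_t + \mathrm{ENS}_t + \mathrm{NIP}_t + \epsilon_{t+1}$

8. $\log(\sigma_{t+1}) = \left(\log(\sigma_t) + \mathrm{CSS}_t\right) \ast N_t + \mathrm{ENS}_t \ast \mathrm{NIP}_t + \mathrm{AES}_t + \mathrm{AEV}_t + \epsilon_{t+1}$

9. $\log(\sigma_{t+1}) = \left(\log(\sigma_t) + \mathrm{CSS}_t + \mathrm{ENS}_t\right) \ast N_t + \mathrm{ENS}_t \ast \mathrm{NIP}_t + \mathrm{AES}_t \ast \mathrm{AEV}_t + \epsilon_{t+1}$

10. $\log(\sigma_{t+1}) = \left(\log(\sigma_t) + \mathrm{CSS}_t + \mathrm{ENS}_t + \mathrm{ESS}_t\right) \ast N_t + \mathrm{ENS}_t \ast \mathrm{NIP}_t + \mathrm{AES}_t \ast \mathrm{AEV}_t + \epsilon_{t+1}$

Note: $A \ast B$ means $A + B + A \times B$.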
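The $\ast$ shorthand determines which columns enter the regression. A minimal sketch of that expansion follows, assuming that a parenthesized sum distributes over $\ast$ as in R-style model formulas (so that $(A + B) \ast N$ yields $A$, $B$, $N$, $A \times N$, $B \times N$). The helper name `star` and the sample values are hypothetical, and the intercept column is omitted.

```python
import numpy as np

def star(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Design-matrix columns for a * b: both main effects plus all pairwise products."""
    a2 = a.reshape(len(a), -1)  # treat a 1-D input as a single column
    b2 = b.reshape(len(b), -1)
    products = [a2[:, i] * b2[:, j]
                for i in range(a2.shape[1])
                for j in range(b2.shape[1])]
    return np.column_stack([a2, b2] + products)

# Hypothetical daily values, for illustration only.
log_sigma = np.array([-1.6, -1.5, -1.2, -1.4])
css = np.array([50.0, 55.0, 42.0, 50.0])
n_news = np.array([0.0, 3.0, 7.0, 1.0])

# Model 4: log(sigma_t) * N_t + CSS_t
X4 = np.column_stack([star(log_sigma, n_news), css])
# Model 5: (log(sigma_t) + CSS_t) * N_t
X5 = star(np.column_stack([log_sigma, css]), n_news)
```

Model 5 adds the single column $\mathrm{CSS}_t \times N_t$ to the columns of model 4, which is what makes the two models nested.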


Results

Google (NASDAQ: GOOG)

[Figure: two panels of intraday volatility (%) versus date (Jan to Nov), each showing realized, training, and testing series.]

Random walk model: $\log(\sigma_{t+1}) = \log(\sigma_t) + \epsilon_{t+1}$, with $R^2 = 1.3\%$.

News model: $\log(\sigma_{t+1}) = \left(\log(\sigma_t) + \mathrm{CSS}_t\right) \ast N_t + \mathrm{ENS}_t + \mathrm{NIP}_t + \epsilon_{t+1}$, with $R^2 = 31.4\%$.


Results

Google (NASDAQ: GOOG)

[Figure: training error and testing error versus model ID (in order of complexity).]


Results

Exxon Mobil (NYSE: XOM)

[Figure: two panels of intraday volatility (%) versus date (Jan to Nov), each showing realized, training, and testing series.]

Random walk model: $\log(\sigma_{t+1}) = \log(\sigma_t) + \epsilon_{t+1}$, with $R^2 = -1.8\%$.

News model: $\log(\sigma_{t+1}) = \left(\log(\sigma_t) + \mathrm{AES}_t + \mathrm{CSS}_t + \mathrm{ENS}_t\right) \ast \mathrm{AEV}_t + \left(\mathrm{CSS}_t + \mathrm{ESS}_t\right) \ast N_t + \left(\mathrm{NIP}_t + \mathrm{ESS}_t\right) \ast \mathrm{ENS}_t + \epsilon_{t+1}$, with $R^2 = 18.4\%$.


Results

Exxon Mobil (NYSE: XOM)

[Figure: training error and testing error versus model ID (in order of complexity).]


Results

Apple (NASDAQ: AAPL)

[Figure: intraday volatility (%) versus date (Jan to Nov), showing realized, training, and testing series.]

[Figure: training error and testing error versus model ID (in order of complexity).]

No model outperforms the simple random walk model ($R^2 = 24.4\%$).


Statistics of regression models

We investigated different regression models for over 150 stocks in 2012.

There are many stocks that are news neutral, in the sense that volatility is indifferent to any factors from news analytics.

For stocks that are responsive to news factors, volatility is consistently correlated with certain news factors (as shown in the figures below).


Conclusions and Future work

We have investigated the regression approach to model how news affects the intraday volatility of securities.

For certain stocks, we built nested regression models that significantly improve the prediction of intraday volatility.

There are news-neutral stocks for which news-incorporated models do not outperform the random walk model.

For many stocks, volatility is positively related to the number of news items as well as the news sentiment.

We are building statistics on a larger dataset to support our hypothesis.
