Tobias Preis

Data Science Lab, Behavioural Science Warwick Business School

[email protected] http://www.tobiaspreis.de

Measuring and predicting human behaviour using online data

The big data explosion

A map of the world built only from GPS locations of Flickr photos  

1. The advantage of looking forward

Future Orientation Index 2010 (Suzy Moat & Tobias Preis; based on Preis, Moat, Stanley and Bishop (2012))

Ratio of Google searches for “2011” to searches for “2009” during 2010 for 45 countries

[Figure: world map; colour scale from 0.2 (more Google searches for “2009”) to 1.8 (more Google searches for “2011”)]

Future Orientation Index 2012 (Suzy Moat & Tobias Preis; based on Preis, Moat, Stanley and Bishop (2012))

Ratio of Google searches for “2013” to searches for “2011” during 2012 for 45 countries
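The index described on these slides is a simple ratio, so a minimal sketch may help make it concrete. The function name, the dictionary layout and the search volumes below are assumptions made for illustration; they are not taken from the study.

```python
def future_orientation_index(volume_by_term, year):
    """Ratio of search volume for the next year to volume for the previous year.

    volume_by_term maps a search term such as "2011" to its total search
    volume during `year` (hypothetical layout, chosen for this sketch).
    """
    next_year = volume_by_term[str(year + 1)]
    previous_year = volume_by_term[str(year - 1)]
    if previous_year <= 0:
        raise ValueError("search volume for the previous year must be positive")
    return next_year / previous_year


# Invented volumes for one country during 2010, for illustration only.
example_2010 = {"2009": 80.0, "2011": 100.0}
index_2010 = future_orientation_index(example_2010, 2010)
print(index_2010)  # 1.25
```

A value above 1 means more searches for the coming year than for the past year; a value below 1 means the opposite.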

Richer countries look forward

[Figure, panels A and B: weekly search volume, 2008 to 2010, for the year terms “2007” to “2011”]

[Figure: GDP per capita [10^4 USD] against the Future-Orientation Index]

Preis, Moat, Stanley & Bishop (2012)

http://www.nature.com/srep/2012/120405/srep00350/pdf/srep00350.pdf

Photo: Perpetual Tourist

2. Predicting stock markets

Financial markets: big data

Photo: Perpetual Tourist

The Internet: big data

Hypothetical strategy (Moat et al. (2013); Preis et al. (2013))

[Figure: number of Google searches for keyword in weeks t-3, t-2, t-1 and t]

Search volume decreased: BUY stock in week t+1

Search volume increased: SELL stock in week t+1
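The buy and sell rule above can be sketched in a few lines. The slides do not spell out how "decreased" and "increased" are measured; this sketch assumes that week t is compared with the mean of the previous three weeks (t-1, t-2, t-3), and it adds a "HOLD" action for the no-change case. The function name and the volumes are hypothetical.

```python
def weekly_signals(search_volume, lookback=3):
    """Return (week, action) pairs, where `week` is the week the action applies to.

    Weeks are 0-based list indices. For each week t that has at least
    `lookback` earlier weeks, the volume in week t is compared with the mean
    volume of the previous `lookback` weeks (an assumption of this sketch).
    """
    signals = []
    for t in range(lookback, len(search_volume)):
        baseline = sum(search_volume[t - lookback:t]) / lookback
        change = search_volume[t] - baseline
        if change < 0:
            action = "BUY"    # search volume decreased
        elif change > 0:
            action = "SELL"   # search volume increased
        else:
            action = "HOLD"   # no change; not covered by the slides
        signals.append((t + 1, action))  # the action is taken in week t+1
    return signals


# Invented weekly search volumes, for illustration only.
volumes = [10, 12, 11, 9, 13, 12]
print(weekly_signals(volumes))  # [(4, 'BUY'), (5, 'SELL'), (6, 'SELL')]
```

The sketch only produces signals; turning them into profit figures would also need price data and a rule for closing positions, which are left out here.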

Example: “culture”

[Figure: % profit (axis −40 to 40), 2005 to 2011, for the “culture” trading strategy, the buy and hold strategy, and the mean ± 1 sd of random strategies. Plot annotation (digits run together in the source): 516]

Preis, Moat & Stanley (2013)

Example: “debt”

[Figure: % profit (axis 0 to 300), 2005 to 2011, for the “debt” trading strategy, the buy and hold strategy, and the mean ± 1 sd of random strategies. Plot annotations: 326 and 16]

Preis, Moat & Stanley (2013)

http://www.nature.com/srep/2013/130425/srep01684/pdf/srep01684.pdf

How different keywords perform (Preis, Moat & Stanley (2013))

[Figure: return of each keyword strategy in units of random strategy sds (axis −1 to 2), with reference lines at the random strategy mean + 1 sd and mean + 2 sds. Labelled keywords: “debt”, “culture”, “stocks”, “credit”, “garden”, “train”]

Financial relevance: # occurrences in FT / # hits on Google

Returns significantly correlated with indicator of financial relevance
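Expressing a return "in random strategy sds" can be illustrated as follows. This is a sketch under stated assumptions: a random strategy here goes long or short with equal probability each week, returns are summed over weeks, and the keyword strategy's return is standardised as (return minus random mean) divided by the random standard deviation. The function names and all numbers are hypothetical.

```python
import random
import statistics


def random_strategy_returns(weekly_returns, n_strategies=1000, seed=0):
    """Simulate cumulative returns of strategies that pick a side at random."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_strategies):
        # Each week, go long (+1) or short (-1) with equal probability.
        total = sum(rng.choice((1, -1)) * r for r in weekly_returns)
        totals.append(total)
    return totals


def standardised_return(strategy_return, random_returns):
    """Express a return in standard deviations of the random strategies."""
    mean = statistics.mean(random_returns)
    sd = statistics.stdev(random_returns)
    if sd == 0:
        raise ValueError("random strategy returns have zero spread")
    return (strategy_return - mean) / sd


# Invented weekly market returns and keyword strategy return, for illustration.
market = [0.01, -0.02, 0.03, -0.01, 0.02]
simulated = random_strategy_returns(market, n_strategies=500, seed=42)
print(standardised_return(0.05, simulated))
```

On this scale a value of 2 means the keyword strategy did two standard deviations better than the average random strategy.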

Wikipedia: Dow Jones companies

[Figure: density of returns, in standard deviations of random strategies (axis −2 to 2), for strategies based on Wikipedia views of DJIA companies, Wikipedia edits of DJIA companies, and the random strategy]

Views strategies profitable

Moat, Curme, Avakian, Kenett, Stanley & Preis (2013)

http://www.nature.com/srep/2013/130508/srep01801/pdf/srep01801.pdf

Wikipedia: Financial topics

[Figure: density of returns, in standard deviations of random strategies (axis −2 to 2), for strategies based on Wikipedia views of financial topics, Wikipedia edits of financial topics, and the random strategy]

Views strategies profitable

Moat, Curme, Avakian, Kenett, Stanley & Preis (2013)

Wikipedia: Actors and filmmakers?

[Figure: density of returns, in standard deviations of random strategies (axis −2 to 2), for strategies based on Wikipedia views of actors and filmmakers, and the random strategy]

Strategies NOT profitable

Moat, Curme, Avakian, Kenett, Stanley & Preis (2013)

What is searched for before falls? (Curme, Preis, Stanley & Moat (2014))

[Figure: example search terms: debt, housing, crisis, apple, orange, tree]

55 groups of search terms

Business and politics most related

[Figure: cumulative returns (%), axis −100 to 200, for the random strategy and the topics “Politics I” and “Business”]

http://www.pnas.org/content/111/32/11600.full.pdf

3. Sensing problems

Preis and Moat (under review)

Flickr and Hurricane Sandy

[Figure, panel A (Flickr: photos taken): normalised number of Flickr photos with hurricane-related tags, 20 Oct to 20 Nov, with the landfall of Hurricane Sandy marked. Panel B (air pressure): averaged atmospheric pressure [mbar] in the US state of New Jersey over the same period, with the landfall marked]

Preis, Moat, Bishop, Treleaven and Stanley (2013)

http://www.nature.com/srep/2013/131105/srep03141/pdf/srep03141.pdf

4. Google searches and the flu

Level of flu cases: number of influenza cases in the US

[Figure: observed influenza-like illness [%] by week, 2010 to 2013]

Butler, Nature 494, 155 (2013)

Level of flu cases: predicting the present number of influenza cases

[Figure: influenza-like illness [%] by week, 2010 to 2013, showing observed values, predicted values, and 80% and 95% prediction intervals, split into a training period and an out-of-sample nowcast]

Preis & Moat (2014)

Forecast errors significantly reduced by between 16% and 53%.

http://rsos.royalsocietypublishing.org/content/royopensci/1/2/140095.full.pdf
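A percentage reduction in forecast error of the kind quoted above can be computed as in the sketch below. The slides do not say which error measure was used; mean absolute error is assumed here, and the function names and the numbers are invented for illustration.

```python
def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def error_reduction_percent(observed, baseline, augmented):
    """Percentage by which the augmented nowcast lowers the baseline error."""
    baseline_error = mean_absolute_error(observed, baseline)
    augmented_error = mean_absolute_error(observed, augmented)
    if baseline_error == 0:
        raise ValueError("baseline error is zero, reduction is undefined")
    return 100.0 * (baseline_error - augmented_error) / baseline_error


# Invented influenza-like illness percentages, for illustration only.
observed = [2.0, 3.0, 4.0]
baseline_nowcast = [3.0, 4.0, 5.0]    # hypothetical model without search data
augmented_nowcast = [2.5, 3.5, 4.5]   # hypothetical model with search data
print(error_reduction_percent(observed, baseline_nowcast, augmented_nowcast))  # 50.0
```

The comparison is only meaningful when both nowcasts are evaluated out of sample on the same weeks.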

Data from the Internet may help us measure and even predict human behaviour

How can open data help you? [email protected]

Twitter: t_preis