"three dimensional time: working with alternative data" by kathryn glowinski, engineer at...

Post on 21-Jan-2018

Three-Dimensional Time: Working with Alternative Data

Now You’re Thinking with Perspectives!

Kathryn Glowinski, Engineer, Quantopian

Disclaimer

This presentation is for informational purposes only and does not constitute an offer to sell, a solicitation to buy, or a recommendation for any security; nor does it constitute an offer to provide investment advisory or other services by Quantopian, Inc. ("Quantopian"). Nothing contained herein constitutes investment advice or offers any opinion with respect to the suitability of any security, and any views expressed herein should not be taken as advice to buy, sell, or hold any security or as an endorsement of any security or company. In preparing the information contained herein, Quantopian has not taken into account the investment needs, objectives, and financial circumstances of any particular investor. Additionally, this presentation is being provided on the express basis that it and any related communications (whether written or oral) will not cause Quantopian to become an investment advice fiduciary under ERISA or the Internal Revenue Code with respect to any retirement plan or IRA investor, as the recipients are fully aware that the Quantopian (i) is not undertaking to provide impartial investment advice, make a recommendation regarding the acquisition, holding or disposal of an investment, act as an impartial adviser, or give advice in a fiduciary capacity, and (ii) has a financial interest in the offering and sale of one or more products and services, which may depend on a number of factors relating to Quantopian’s internal business objectives, and which has been disclosed to the recipient. Nothing set forth herein or any information conveyed (in writing or orally) in connection with this presentation is intended to constitute a recommendation that any person take or refrain from taking any course of action within the meaning of U.S. Department of Labor Regulation §2510.3-21(b)(1), including without limitation buying, selling or continuing to hold any security. 
No information contained herein should be regarded as a suggestion to engage in or refrain from any investment-related course of action as none of Quantopian nor any of its affiliates is undertaking to provide investment advice, act as an adviser to any plan or entity subject to the Employee Retirement Income Security Act of 1974, as amended, individual retirement account or individual retirement annuity, or give advice in a fiduciary capacity with respect to the materials presented herein. You are advised to contact your own financial advisor or other fiduciary unrelated to Quantopian about whether any given course of action may be appropriate for your circumstances. The information provided herein is intended to be used solely by the recipient in considering the products or services described herein and may not be used for any other reason, personal or otherwise. Any views expressed and data illustrated herein were prepared based upon information, believed to be reliable, available to Quantopian at the time of publication. Quantopian makes no guarantees as to their accuracy or completeness. All information is subject to change and may quickly become unreliable for various reasons, including changes in market conditions or economic circumstances.

What’ll It Be?

Let’s chat about how data is typically viewed through the lens of time.

Because that way is usually at least partly wrong. Let me tell you why.

Lessons learned at Q along the way.

What does “multidimensional” data mean for MY data?

More importantly, what does it mean for ANYONE’S data?

Quantopian.com

“Alternative” Data?

“Fundamental” Data?

My Very Special Dataset

Early Attempts

Consider the Following

So, We Can Just Change the Data, Right?

Enter, Fundamentals

What if we captured, every day, what we knew the latest value to be for every piece of known information?

Each day’s capture should then be the corrected state of the world as of that moment.

[Figure: timeline from 3/1/17 to 3/13/17, marked every two days]

First Known: 10, 12

Revisions: 11

Seen: 10 10 10 10 10 10 11 12 12 12 12 12 12 12
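The daily capture described on this slide can be sketched as a forward fill over an arrival log. This is only an illustration, not Quantopian's implementation: `arrivals` and `daily_snapshots` are hypothetical names, and the arrival dates are assumptions chosen so that the output matches the "Seen" row above.

```python
from datetime import date, timedelta

# Hypothetical arrival log: (day we learned the value, value).
# The dates are assumptions picked to reproduce the slide's "Seen" row.
arrivals = [
    (date(2017, 3, 1), 10),  # first known value
    (date(2017, 3, 7), 11),  # revision
    (date(2017, 3, 8), 12),  # next first known value
]

def daily_snapshots(arrivals, start, end):
    """Record, for each calendar day, the latest value known on that day."""
    snapshots = {}
    latest = None
    pending = sorted(arrivals)
    day = start
    while day <= end:
        # Absorb everything that arrived on or before this day.
        while pending and pending[0][0] <= day:
            latest = pending.pop(0)[1]
        snapshots[day] = latest
        day += timedelta(days=1)
    return snapshots

seen = daily_snapshots(arrivals, date(2017, 3, 1), date(2017, 3, 14))
print(" ".join(str(v) for v in seen.values()))
# 10 10 10 10 10 10 11 12 12 12 12 12 12 12
```

Each day keeps only one number, so the snapshot cannot say whether a value was a first report or a correction of an earlier one.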

Still Not Quite Right

But we still have the same problem for revisions to updated data.

And what happens if you have 250GB of sparse data alone, before you even forward fill those values?

Dueling Problems

Lookahead Bias

Using data for a backtest that we didn’t know at the time is called lookahead bias.

We try VERY hard to avoid this, because it corrupts evaluation of any strategy.

“I know that Apple did well in the past, so I’m going to backtest a strategy that just holds Apple after 2005.”

Stale Data

This can be equally disastrous!

If the data is never updated, you may be stuck with hilariously incorrect values.

“My vendor told me that company ABCD announced a split of 1:25, so on that day I traded 25 times what I normally would. But when it actually happened, it was only 1:5!”

So, What Do We Do?

This is referred to as point-in-time data.

It’s a BIG deal.

Personally, I think it’s the BIGGEST deal.

(I’m really biased because I do this for a living.)

Let’s Talk Timelines!

Timelines

http://www.csusmhistory.org/faulk006/thresholds/

Though Some People Think About This

Or...This.

Perspective Matters

It’s not just about when the data happened.

It’s also about when you’re observing the data.

If you have a dataset that ever has updates, revisions, or corrections, the data for a single date can change as you move through time.

5 8 7 9

5 8 7 9
>>> mean(four_days.values())
7.25

5 29 7 9
>>> mean(four_days.values())
12.5

Bi-Temporal Data

Separates the concepts of when the information HAPPENED from when we KNOW it.

Maintains accuracy with regard to data changes through history.

Allows questions to be answered from a chosen perspective.
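A minimal sketch of the two time axes, using the four-day series from the averaging slides. `Record`, `asof_day`, `known_at`, and `view` are hypothetical names, and the day indices are made up. The transcript does not say which version of day 2 came first, so the order here (29 reported first, corrected to 8 later) is an assumption.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass(frozen=True)
class Record:
    asof_day: int   # when the information HAPPENED (hypothetical day index)
    known_at: int   # when we came to KNOW it (hypothetical day index)
    value: float

# Assumed history: day 2 is first reported as 29, then corrected to 8 on day 6.
records = [
    Record(1, 1, 5),
    Record(2, 2, 29),
    Record(3, 3, 7),
    Record(4, 4, 9),
    Record(2, 6, 8),  # correction to day 2, learned on day 6
]

def view(records, perspective):
    """Latest known value per as-of day, using only what was known by `perspective`."""
    latest = {}
    for rec in sorted(records, key=lambda r: r.known_at):
        if rec.known_at <= perspective:
            latest[rec.asof_day] = rec.value  # later knowledge overwrites earlier
    return latest

print(mean(view(records, perspective=5).values()))  # 12.5
print(mean(view(records, perspective=6).values()))  # 7.25
```

Nothing is ever overwritten in `records`, so the same four days give a different answer depending only on the perspective asked for.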

Deltas

Base Table:

Deltas Table:

5 297 9

5 87 9

as-of date 1,timestamp 1

as-of date 1,timestamp 2

But...Do We Care?

PROS

Reproduces events EXACTLY as they occurred.

Allows for accurate modeling of simulations from the past.

Easily allows for vendor updates to the most accurate known data.

CONS

Can force modeling of atypical past events that wouldn’t happen in the modern day.

System errors can propagate even into past data.

Data shown can be “imperfect” because of vendor or ingestor error.

Data Analyses Should be Replicable

Realistic view of data delivery, instead of the optimized view.

The world isn’t perfect, and your data is DEFINITELY not perfect.

But, it should at least be consistently imperfect.

Different Users, Different Needs

Point-in-time data adds a layer of complexity.

In evaluation, the only question is: does it have alpha?

Users further on have the luxury of checking survivability.

99% of users don’t want to see a platform’s mistakes.

(I made that statistic up, but I’m pretty sure it’s accurate.)

What Else Can this Do for Me?

TWTR

                    Actual Time 2006    Actual Time 2017
Perspective 2006    Tweeter             -------
Perspective 2017    Tweeter             Twitter

Is that All?

Verify data model assumptions

“We never change our data after the fact. We wouldn’t do that” ~A Quantopian Data Vendor

Split Adjustments

Why Did You Call this 3D Time at All?

Look Into the Future

Fundamentals, Redux

We just deployed a new system.

It captures not only the first known values, but also the adjustments to those values.

[Figure: timeline from 3/1/17 to 3/13/17, marked every two days]

First Known: 10, 12

Revisions: 11

Perspective of 3/6/17: 10 10 10 10 10 10 - - - - - - - -

Perspective of 3/14/17: 11 11 11 11 11 11 11 11 12 12 12 12 12 12
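The two perspective rows above can be reproduced from a small list of facts that carry both dates. This is a sketch under stated assumptions, not the deployed system: `FACTS` and `perspective_row` are hypothetical names, and the period and learned-on dates are guesses chosen to match the rows on the slide.

```python
from datetime import date, timedelta

# Each fact: (period_start, learned_on, value). All dates are assumptions
# chosen so that the output matches the slide's two perspective rows.
FACTS = [
    (date(2017, 3, 1), date(2017, 3, 1), 10),  # first known
    (date(2017, 3, 1), date(2017, 3, 7), 11),  # revision of the 3/1 value
    (date(2017, 3, 9), date(2017, 3, 9), 12),  # first known
]

def perspective_row(facts, perspective, start, end):
    """Value in force on each day from start to end, as seen from `perspective`."""
    known = [f for f in facts if f[1] <= perspective]
    cells = []
    day = start
    while day <= end:
        # Facts whose period has begun by `day`; days after the perspective are unknown.
        live = [f for f in known if f[0] <= day <= perspective]
        if live:
            # Newest period wins; within a period, the most recently learned value wins.
            cells.append(str(max(live, key=lambda f: (f[0], f[1]))[2]))
        else:
            cells.append("-")
        day += timedelta(days=1)
    return " ".join(cells)

start, end = date(2017, 3, 1), date(2017, 3, 14)
print(perspective_row(FACTS, date(2017, 3, 6), start, end))
# 10 10 10 10 10 10 - - - - - - - -
print(perspective_row(FACTS, date(2017, 3, 14), start, end))
# 11 11 11 11 11 11 11 11 12 12 12 12 12 12
```

Because the original 10 is kept alongside its revision, a backtest run from the 3/6/17 perspective still sees 10, while the 3/14/17 perspective sees the corrected 11.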

Point in Timeness as a Service

Raw data history for all (ingested) time

But we’d like to update the way that users can give us data too!

But Wait, there’s More

Point-in-time data doesn’t have to be just stock-specific data.

This should be applicable to any field, any data.

Data is the Future

Thank You!

Kathryn Glowinski, Engineer, Quantopian
