apachecon big data 2015 - stock prediction.key


Upload: tranhuong

Post on 13-Feb-2017

TRANSCRIPT

Page 1: ApacheCon Big Data 2015 - Stock Prediction.key

1 Pivotal Confidential–Internal Use Only

Page 2: ApacheCon Big Data 2015 - Stock Prediction.key

Implementing a Highly Scalable Stock Prediction System with Apache Geode (incubating), Spring XD and Spark MLlib

William Markito (@william_markito)

Fred Melo (@fredmelo_br)

Page 3: ApacheCon Big Data 2015 - Stock Prediction.key

About us

Fred Melo

Technical Director for Data

@fredmelo_br

William Markito

Enterprise Architect for GemFire

@william_markito

Page 4: ApacheCon Big Data 2015 - Stock Prediction.key

A Simple Example

Data Sources

Look for patterns

Forecast

Page 5: ApacheCon Big Data 2015 - Stock Prediction.key
Page 6: ApacheCon Big Data 2015 - Stock Prediction.key

"Smart System"

Applicability

Page 7: ApacheCon Big Data 2015 - Stock Prediction.key

What do we want to build?

Smart System

Historical: Learns with HISTORICAL TRENDS

Real-Time: Evaluates LIVE DATA

Live data becomes historical over time

Trading Data

"According to historical trends, there's an 80% chance this stock's price will go down within the next few minutes."

"How were the technical indicator readings when the latest price drops happened?"

Page 8: ApacheCon Big Data 2015 - Stock Prediction.key

The Machine Learning Pipeline data flow

Data Temperature: Hot (Live Data, Apache Geode / GemFire) to Cold (Apache HAWQ)

1 - Live data is ingested into the grid

2 - Trained ML model compares new data to historical patterns

3 - Results are pushed immediately to deployed applications

4 - "Hot" data ages, becoming part of the historical dataset

5 - Re-training triggered, ML model updated

Components: Spring XD, Machine Learning model
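
The five numbered steps on this slide can be sketched as a small in-memory loop. This is only an illustration under stated assumptions: the `Pipeline` class, its fields, and the "mean of historical prices" model are hypothetical stand-ins, not the Geode, Spring XD, or MLlib components used in the talk.

```python
from collections import deque
from statistics import mean


class Pipeline:
    """Toy stand-in for the five-step flow; every name here is hypothetical."""

    def __init__(self, hot_capacity=3):
        self.hot = deque()        # "hot" live data (the in-memory grid on the slide)
        self.historical = []      # historical data set that hot data ages into
        self.hot_capacity = hot_capacity
        self.model_mean = None    # trivial "model": mean of the historical prices
        self.pushed = []          # results delivered to applications

    def retrain(self):
        # Step 5: rebuild the model from the historical data set.
        if self.historical:
            self.model_mean = mean(self.historical)

    def ingest(self, price):
        # Step 1: live data enters the grid.
        self.hot.append(price)
        # Step 2: the trained model compares new data to historical patterns.
        if self.model_mean is not None:
            signal = "above" if price > self.model_mean else "not above"
            # Step 3: the result is pushed to applications right away.
            self.pushed.append((price, signal))
        # Step 4: hot data ages into the historical data set,
        # which in turn triggers re-training (step 5).
        while len(self.hot) > self.hot_capacity:
            self.historical.append(self.hot.popleft())
            self.retrain()
```

Calling `ingest` once per incoming price is enough to drive all five steps; no prediction is pushed until some data has aged and a first model exists.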

Page 9: ApacheCon Big Data 2015 - Stock Prediction.key

The Machine Learning Pipeline data flow (Simplified Model)

Data Temperature: Hot (Live Data), Warm

Apache Geode / GemFire

1 - Live data is ingested into the grid

2 - Trained ML model compares new data to historical patterns

3 - Results are pushed immediately to deployed applications

5 - Re-training triggered, ML model updated

Components: Spring XD, Machine Learning model

Page 10: ApacheCon Big Data 2015 - Stock Prediction.key

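Spring XD: Extensible, Open-Source, Fault-Tolerant, Horizontally Scalable, Cloud-Native

[Architecture diagram]

Data sources: Real data, Simulator

Stream modules: Transform, Enrich, Filter, Split, Sink

Stages numbered 1 to 3 in the diagram, with the labels Indicators, Machine Learning, Predict and Dashboard

Regions: /Stocks, /TechIndicators, /Predictions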

Page 11: ApacheCon Big Data 2015 - Stock Prediction.key

Too complex?? Eating it in small bites…

Page 12: ApacheCon Big Data 2015 - Stock Prediction.key

SpringXD GemFire

Page 13: ApacheCon Big Data 2015 - Stock Prediction.key

Page 14: ApacheCon Big Data 2015 - Stock Prediction.key

Apache Geode Concepts

• Cache
  • Configurable through XML, Java
• Region
  • Distributed java.util.Map on steroids
  • Highly available, redundant
• Member
  • Locator, Server, Client
• Callbacks
  • Listener, Writer, AsyncEventListener, Parallel/Serial

Regions: /Stocks, /TechIndicators, /Predictions
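
The Region and Callbacks ideas can be pictured with a small single-process sketch. This is not the Apache Geode API: `Region`, `add_writer`, and `add_listener` are hypothetical names, and nothing here is distributed or redundant. The sketch assumes that a writer runs before a change and may reject it, while a listener runs after the change has been applied.

```python
class Region:
    """Toy key-value region with callbacks (hypothetical, single process)."""

    def __init__(self, name):
        self.name = name
        self._data = {}
        self._writers = []    # run before a change; raising an exception vetoes it
        self._listeners = []  # run after a change has been stored

    def add_writer(self, callback):
        self._writers.append(callback)

    def add_listener(self, callback):
        self._listeners.append(callback)

    def put(self, key, value):
        old = self._data.get(key)
        for writer in self._writers:
            writer(key, old, value)      # an exception here aborts the put
        self._data[key] = value
        for listener in self._listeners:
            listener(key, old, value)    # notified with old and new values

    def get(self, key):
        return self._data.get(key)
```

A region named `/Stocks` could, for example, carry a writer that rejects negative prices and a listener that forwards each accepted update to a dashboard.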

Page 15: ApacheCon Big Data 2015 - Stock Prediction.key

Apache Geode HA and Fault Tolerance

Page 16: ApacheCon Big Data 2015 - Stock Prediction.key

Page 17: ApacheCon Big Data 2015 - Stock Prediction.key

Spring XD

Diagram labels: Transform, Enrich, Filter, Split, Predict, Sink (stages numbered 1 to 3)

Concepts: Streams, Pipelines, Sources, Sinks, Filters, Taps
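
The stream vocabulary on this slide (source, filter, transform, tap, sink) can be illustrated with plain Python generators. This is a rough analogy rather than Spring XD itself: the function names, the `run_stream` helper, and the tick fields `symbol`, `price`, and `cents` are all hypothetical.

```python
def source(records):
    # Source: where records enter the stream.
    yield from records


def filter_stage(stream, predicate):
    # Filter: drop records that do not satisfy the predicate.
    return (record for record in stream if predicate(record))


def transform(stream, fn):
    # Transform: map each record to a new record.
    return (fn(record) for record in stream)


def tap(stream, observer):
    # Tap: let an observer see each record without changing the stream.
    for record in stream:
        observer(record)
        yield record


def sink(stream, target):
    # Sink: where records leave the stream (here, a plain list).
    for record in stream:
        target.append(record)


def run_stream(ticks):
    tapped, stored = [], []
    stream = source(ticks)
    stream = filter_stage(stream, lambda t: t["price"] > 0)
    stream = tap(stream, tapped.append)
    stream = transform(stream, lambda t: {**t, "cents": round(t["price"] * 100)})
    sink(stream, stored)
    return tapped, stored
```

Because generators are lazy, nothing flows until the sink pulls records through, which loosely mirrors how a stream only does work once it is connected end to end.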

Page 18: ApacheCon Big Data 2015 - Stock Prediction.key

Page 19: ApacheCon Big Data 2015 - Stock Prediction.key

Machine Learning Model (e.g. Linear Regression)

Features: price(x), medium avg (x), relative strength (x)

Label: medium avg (x+1)
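
To make the features/label split concrete, here is a minimal sketch that builds rows of (price(x), average(x), relative strength(x)) with the next step's average as the label, then fits a linear regression. It rests on several assumptions: the average is taken as a simple moving average, relative strength as a basic RSI, the window size is arbitrary, and the fit is plain-Python ordinary least squares rather than the Spark MLlib model named in the talk title. All function names are hypothetical.

```python
def moving_average(prices, window, i):
    """Mean of the `window` prices ending at index i."""
    return sum(prices[i - window + 1:i + 1]) / window


def relative_strength_index(prices, window, i):
    """Basic RSI over the `window` price changes ending at index i."""
    changes = [prices[k] - prices[k - 1] for k in range(i - window + 1, i + 1)]
    gains = sum(c for c in changes if c > 0)
    losses = sum(-c for c in changes if c < 0)
    if losses == 0:
        return 100.0
    return 100.0 - 100.0 / (1.0 + gains / losses)


def build_training_set(prices, window):
    """Rows of ([price(x), avg(x), rsi(x)], avg(x+1))."""
    rows = []
    for x in range(window, len(prices) - 1):
        features = [prices[x],
                    moving_average(prices, window, x),
                    relative_strength_index(prices, window, x)]
        rows.append((features, moving_average(prices, window, x + 1)))
    return rows


def fit_linear(rows):
    """Ordinary least squares via the normal equations.

    Returns [bias, w1, w2, ...]. Raises ZeroDivisionError if the features
    are linearly dependent (for example, a constant feature column).
    """
    xs = [[1.0] + list(features) for features, _ in rows]
    ys = [label for _, label in rows]
    n = len(xs[0])
    # Augmented matrix [X^T X | X^T y].
    a = [[sum(r[i] * r[j] for r in xs) for j in range(n)]
         + [sum(r[i] * y for r, y in zip(xs, ys))] for i in range(n)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(n):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [v - factor * p for v, p in zip(a[r], a[col])]
    return [a[i][n] / a[i][i] for i in range(n)]


def predict(weights, features):
    return weights[0] + sum(w * f for w, f in zip(weights[1:], features))
```

In use, `build_training_set` would be fed a price history, `fit_linear` would produce the weights, and `predict` would estimate the next average from the latest feature row. A price series that only rises makes the RSI column constant, so a realistic series with both gains and losses is needed for the fit to succeed.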

Page 20: ApacheCon Big Data 2015 - Stock Prediction.key

Page 21: ApacheCon Big Data 2015 - Stock Prediction.key

Demo Time

Error

Page 22: ApacheCon Big Data 2015 - Stock Prediction.key

Source code and detailed instructions available at: https://github.com/Pivotal-Open-Source-Hub/StockInference-Spark

Follow us on Twitter!

William Markito (@william_markito)

Fred Melo (@fredmelo_br)

Page 23: ApacheCon Big Data 2015 - Stock Prediction.key
