Post on 12-Feb-2018

Prediction DataBase: The Need of The Hour

Devavrat Shah

Professor, EECS

Director, Stats & Data Sc

Massachusetts Institute of Technology

Co-Founder, Chief Scientist

Celect, Inc.

© 2015 Celect, Inc. All Rights Reserved. Use, reproduction, or disclosure is subject to restrictions set forth in Contract Number 2014-14031000011 and Sub Contract No. Celect 01

An Ultimate Prediction Engine?

Prediction

Confidence

Provenance

Big Data

Heterogeneous Data

Sparse Data

Up-and-Running Instantly without a Team of Data Scientists

Add Data Incrementally

Stitch Different Data Sources for Better Predictions

Anybody (Excel user) can use it!

Existing Paradigm of Statistics / Machine Learning

Application

DataStore

Manual Data Processing

Predictive Queries

“Normalized Data”

Model Learning, Prediction

Getting Rid of This!

Prediction DataBase

Application

DataStore

Predictive Queries

A Software Layer

No Manual Processing

Predictive DataBase: A New Data Infrastructure
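One way to picture the software layer is a thin wrapper that sits between the application and the data store and answers every query with a prediction, a confidence, and provenance. The sketch below is only an illustration under stated assumptions: the `PredictionLayer` and `Result` names, the in-memory dictionary used as the data store, and the majority-vote rule are all hypothetical and are not taken from the talk or from Celect's system.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Result:
    value: object                 # stored or predicted value
    confidence: float             # 1.0 for stored values, vote share otherwise
    provenance: list = field(default_factory=list)  # keys that informed the answer


class PredictionLayer:
    """Hypothetical thin layer between an application and a key-value store."""

    def __init__(self, datastore=None):
        # Assumption: any dict-like object can play the role of the data store.
        self.datastore = {} if datastore is None else datastore

    def put(self, key, value):
        # Data is added incrementally; there is no separate training step.
        self.datastore[key] = value

    def query(self, key):
        # A stored key is an ordinary lookup.
        if key in self.datastore:
            return Result(self.datastore[key], 1.0, [key])
        # A missing key becomes a predictive query.
        if not self.datastore:
            return Result(None, 0.0, [])
        # Stand-in rule: majority vote over everything stored so far.
        value, count = Counter(self.datastore.values()).most_common(1)[0]
        support = [k for k, v in self.datastore.items() if v == value]
        return Result(value, count / len(self.datastore), support)


db = PredictionLayer()
db.put("a", "x")
db.put("b", "x")
db.put("c", "y")
answer = db.query("d")  # "d" was never stored, so this is a prediction
```

The point of the design is that the application issues the same kind of call for stored and for missing keys; only the confidence and provenance tell it which case it got.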

A Brief History of DataBase

1970s-80s: relational databases like MySQL and Postgres

1980s-90s: personal databases like Excel

1990s-00s: distributed databases like Cassandra

2000s-10s: search-engine databases like Elasticsearch

Now: prediction databases

Formal Description

Schema-less DB

key : value

Atomic Prediction

key : ?

Name Table

1: ‘Vasudha Shivamoggi’
2: ‘Devavrat Shah’
3: ‘Vishal Doshi’
4: ‘Ying-zong Huang’
5: ‘John Andrews’
6: ‘Balaji Rengarajan’
7: ‘Ritesh Madan’
8: ‘Daniel Xu’
9: ‘John Tsitsiklis’

Gender Table

1: ‘Female’
2: ‘Male’
3: ‘Male’
4: ‘Male’
5: ‘Male’
6: ‘Male’
7: ‘Male’
8: ‘Male’
9: ? (may be ‘Male’)
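To make the atomic prediction concrete, the sketch below holds the two tables as plain dictionaries and answers a lookup for the missing gender of key 9 with a set of candidate values. The `atomic_prediction` function and the empirical-frequency rule are assumptions chosen for illustration; the slides do not say how the prediction is actually computed.

```python
from collections import Counter

# Tables from the slide: key -> value.
name_table = {
    1: "Vasudha Shivamoggi",
    2: "Devavrat Shah",
    3: "Vishal Doshi",
    4: "Ying-zong Huang",
    5: "John Andrews",
    6: "Balaji Rengarajan",
    7: "Ritesh Madan",
    8: "Daniel Xu",
    9: "John Tsitsiklis",
}
gender_table = {
    1: "Female",
    2: "Male",
    3: "Male",
    4: "Male",
    5: "Male",
    6: "Male",
    7: "Male",
    8: "Male",
    # key 9 is deliberately absent: this is the "9: ?" cell
}


def atomic_prediction(table, key):
    """Return candidate values for `key`, each with a share between 0 and 1.

    A stored key returns its value with share 1.0. A missing key returns the
    empirical frequency of each value in the table (a stand-in rule only).
    """
    if key in table:
        return {table[key]: 1.0}
    counts = Counter(table.values())
    total = sum(counts.values())
    return {value: n / total for value, n in counts.most_common()}


guess = atomic_prediction(gender_table, 9)
```

With the eight stored rows, `guess` puts most of the weight on ‘Male’, which matches the slide's "may be ‘Male’". Note that this toy rule ignores the name table entirely.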

Formal Description

Schema-less Prediction DB

(key, table name) : value

Atomic Prediction

(key, table name) : ?

(1, ‘Name’): ‘Vasudha Shivamoggi’
(2, ‘Name’): ‘Devavrat Shah’
(3, ‘Name’): ‘Vishal Doshi’
(4, ‘Name’): ‘Ying-zong Huang’
(5, ‘Name’): ‘John Andrews’
(6, ‘Name’): ‘Balaji Rengarajan’
(7, ‘Name’): ‘Ritesh Madan’
(8, ‘Name’): ‘Daniel Xu’
(9, ‘Name’): ‘John Tsitsiklis’

(1, ‘Gender’): ‘Female’
(2, ‘Gender’): ‘Male’
(3, ‘Gender’): ‘Male’
(4, ‘Gender’): ‘Male’
(5, ‘Gender’): ‘Male’
(6, ‘Gender’): ‘Male’
(7, ‘Gender’): ‘Male’
(8, ‘Gender’): ‘Male’
(9, ‘Gender’): ?
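Folding the table name into the key puts every table in one store, so a prediction for one table can draw on evidence from another. The sketch below illustrates this with a deliberately naive rule: to guess (9, ‘Gender’), it votes among rows whose ‘Name’ value shares the same first word, and falls back to a vote over all rows. The `predict` function, its `evidence_table` parameter, and the first-word rule are hypothetical and only stand in for whatever model a real system would use.

```python
from collections import Counter

names = [
    "Vasudha Shivamoggi", "Devavrat Shah", "Vishal Doshi", "Ying-zong Huang",
    "John Andrews", "Balaji Rengarajan", "Ritesh Madan", "Daniel Xu",
    "John Tsitsiklis",
]
genders = ["Female"] + ["Male"] * 7  # keys 1..8; key 9 has no gender entry

# One flat store keyed by (key, table name), as on the slide.
store = {(i, "Name"): name for i, name in enumerate(names, start=1)}
store.update({(i, "Gender"): g for i, g in enumerate(genders, start=1)})


def predict(store, key, table, evidence_table="Name"):
    """Return store[(key, table)] if present, otherwise a guess.

    Toy rule (an assumption, not the talk's method): vote among rows whose
    first word in `evidence_table` matches that of `key`; if no row matches,
    vote among all rows of `table`.
    """
    if (key, table) in store:
        return store[(key, table)]
    observed = {k: v for (k, t), v in store.items() if t == table}
    if not observed:
        raise KeyError((key, table))
    first = store.get((key, evidence_table), "").split()[:1]
    similar = [
        v for k, v in observed.items()
        if first and store.get((k, evidence_table), "").split()[:1] == first
    ]
    votes = Counter(similar or observed.values())
    return votes.most_common(1)[0][0]


guess = predict(store, 9, "Gender")
```

Here the only row sharing the first name ‘John’ is key 5, so the guess for key 9 comes from that single neighbour rather than from the table-wide majority.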

Formal Description

Schema-less Prediction DB

(key, table name) : value
(key1, key2, table name) : value

Atomic Prediction

(key, table name) : ?
(key1, key2, table name) : ?

value can be: text, numeric, image, GeoJSON

Graph DB: A Special Case

Schema-less Prediction DB

(key, table name) : value
(key1, key2, table name) : value

Atomic Prediction

(key, table name) : ?
(key1, key2, table name) : ?

[Figure: graph on nodes 1, 2, 3, 4, annotated with the entries below]

(1, 2, retweet) : ‘GeoInt’
(1, 2, SMS) : ‘Meet @ Hyatt Dulles’
(1, name) : ‘Dev’
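The graph case needs no new machinery: node attributes are two-part keys, edge attributes are three-part keys, and both live in the same store. The sketch below shows this with the entries from the slide and lists the edge cells that are still unknown, that is, the (key1, key2, table name) : ? queries a prediction could target. The helper names `edges` and `open_queries`, and the choice to treat edges as directed, are assumptions made for illustration.

```python
from itertools import permutations

# Node and edge entries from the slide, in one schema-less store.
store = {
    (1, 2, "retweet"): "GeoInt",
    (1, 2, "SMS"): "Meet @ Hyatt Dulles",
    (1, "name"): "Dev",
}
nodes = [1, 2, 3, 4]


def edges(store, table):
    """Return {(key1, key2): value} for every stored edge of one table."""
    return {k[:2]: v for k, v in store.items() if len(k) == 3 and k[2] == table}


def open_queries(store, nodes, table):
    """List the (key1, key2, table) cells with no stored value.

    Edges are treated as directed (an assumption), so (1, 2) and (2, 1)
    are different cells.
    """
    known = edges(store, table)
    return [(a, b, table) for a, b in permutations(nodes, 2) if (a, b) not in known]


retweets = edges(store, "retweet")
unknown_retweets = open_queries(store, nodes, "retweet")
```

Each tuple in `unknown_retweets` has the same shape as a stored key, so asking for its value is the same atomic prediction as in the single-key case.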

This is Not A Pipe Dream: Celect Has Built It

Application

DataStore

Schema Definition

Predictive Queries

Celect in Retail: At Fortune 500 Scale

5% - 15% Increase in Revenue

System in Cloud Auto-Scales as Data Grows

100Gbs/Day

100M+ Customers

100M+ Products

< 100 milliseconds API Response Time

Celect Beyond Retail

Parting Remarks

Prediction Database

New Paradigm for Modern Statistics and Machine Learning

Make Prediction a “Special” Database Query

Celect, Inc. Has Built Such an Infrastructure

Can Support Most (If Not All) Problems of Interest

Successful in Retail Industry

Intriguing Case-Studies Beyond Retail

Handles Unstructured Data: Text, GeoSpatial, Image
