large scale decision support systems // satya ramachandran, neustar [firstmark's data driven]


Upload: firstmark-capital

Post on 12-Feb-2017


Page 1: Large Scale Decision Support Systems // Satya Ramachandran, Neustar [FirstMark's Data Driven]

LARGE SCALE DECISION SUPPORT SYSTEMS

Page 2:

ABOUT ME

VP Engineering, MarketShare DecisionCloud at Neustar

Areas: Decision Support Systems, Analytical Systems, Database Systems

Page 3:

MODEL BASED DECISION SUPPORT SYSTEMS

Diagram labels: New Client Onboarding, Data Management, Data Adapters, Modeling, Post Processing, ETL, Discovery, Scenario Analysis

I have my planned spend for next year. What will my planned spend yield in terms of my sales/revenue, and how does that compare with my sales/revenue forecasts?

Page 4:

BI IS WELL UNDERSTOOD

FACT
TIME   PRODUCT  REGION  # IMPRESSIONS  SPEND ($$)  GQV
TID_1  PID_1    ALL     176843600      19885.75    8092237
TID_1  PID_2    ALL     185465400      300730      3691062
TID_1  PID_3    ALL     56838300       286989.8    421021
TID_1  PID_1    GID_1   0              4679.25     8091825
TID_1  PID_2    GID_1   0              40392       3684004
TID_1  PID_3    GID_1   0              37986.5     414270
TID_1  PID_1    GID_2   0              15206.5     412
TID_1  PID_2    GID_2   0              260338      7058
TID_1  PID_3    GID_2   0              249003.25   6751

Product DIM
ID     CODE
PID_1  COMP
PID_2  DIGI
PID_3  GAME

Geo DIM
ID     CODE
GID_1  GL
GID_2  MW

Time DIM
ID     CODE
TID_1  1/1/2015
TID_2  4/1/2015
TID_3  7/1/2015

Data: spend in $, number of impressions bought, and the corresponding Google Query Volume (GQV) for the 1st quarter of 2015

Page 5:

BI ANSWERS IMPORTANT QUESTIONS ABOUT THE PAST

Chart axes: Complexity, Business Value

Business Intelligence
  Reporting: What happened?
  Analysis: Why did it happen?
  Monitoring: What’s happening now?

Tools and techniques
  Query, reporting & search tools
  Dashboards, scorecards, listening, real time reporting
  OLAP and visualization tools
  Complex event processing; NLP; Text mining
  Time series analysis, data mining, clustering

Page 6:

IMPACT OF INCREASING SPEND BY 10% IN 3RD QUARTER?

FACT
TIME   PRODUCT  REGION  # IMPRESSIONS  SPEND ($$)  GQV
TID_1  PID_1    ALL     176843600      19885.75    8092237
TID_1  PID_2    ALL     185465400      300730      3691062
TID_1  PID_3    ALL     56838300       286989.8    421021
TID_1  PID_1    GID_1   0              4679.25     8091825
TID_1  PID_2    GID_1   0              40392       3684004
TID_1  PID_3    GID_1   0              37986.5     414270
TID_1  PID_1    GID_2   0              15206.5     412
TID_1  PID_2    GID_2   0              260338      7058
TID_1  PID_3    GID_2   0              249003.25   6751

Product DIM
ID     CODE
PID_1  COMP
PID_2  DIGI
PID_3  GAME

Geo DIM
ID     CODE
GID_1  GL
GID_2  MW

Time DIM
ID     CODE
TID_1  1/1/2015
TID_2  4/1/2015
TID_3  7/1/2015

Page 7:

3RD QUARTER SPEND COULD BE REPLICATED FROM 1ST

FACT
TIME   PRODUCT  REGION  # IMPRESSIONS  SPEND ($$)  GQV
TID_1  PID_1    ALL     176843600      19885.75    8092237
TID_1  PID_2    ALL     185465400      300730      3691062
TID_1  PID_3    ALL     56838300       286989.8    421021
TID_1  PID_1    GID_1   0              4679.25     8091825
TID_1  PID_2    GID_1   0              40392       3684004
TID_1  PID_3    GID_1   0              37986.5     414270
TID_1  PID_1    GID_2   0              15206.5     412
TID_1  PID_2    GID_2   0              260338      7058
TID_1  PID_3    GID_2   0              249003.25   6751
TID_3  PID_1    GID_1   0              4679.25
TID_3  PID_2    GID_1   0              40392
TID_3  PID_3    GID_1   0              37986.5
TID_3  PID_1    GID_2   0              15206.5
TID_3  PID_2    GID_2   0              260338
TID_3  PID_3    GID_2   0              249003.25

Product DIM
ID     CODE
PID_1  COMP
PID_2  DIGI
PID_3  GAME

Geo DIM
ID     CODE
GID_1  GL
GID_2  MW

Time DIM
ID     CODE
TID_1  1/1/2015
TID_2  4/1/2015
TID_3  7/1/2015

Page 8:

TOTAL SPEND COULD BE CALCULATED

FACT
TIME   PRODUCT  REGION  # IMPRESSIONS  SPEND ($$)  GQV
TID_3  ALL      ALL     419147300      607605.5
TID_1  PID_1    ALL     176843600      19885.75    8092237
TID_1  PID_2    ALL     185465400      300730      3691062
TID_1  PID_3    ALL     56838300       286989.8    421021
TID_1  PID_1    GID_1   0              4679.25     8091825
TID_1  PID_2    GID_1   0              40392       3684004
TID_1  PID_3    GID_1   0              37986.5     414270
TID_1  PID_1    GID_2   0              15206.5     412
TID_1  PID_2    GID_2   0              260338      7058
TID_1  PID_3    GID_2   0              249003.25   6751
TID_3  PID_1    GID_1   0              4679.25
TID_3  PID_2    GID_1   0              40392
TID_3  PID_3    GID_1   0              37986.5
TID_3  PID_1    GID_2   0              15206.5
TID_3  PID_2    GID_2   0              260338
TID_3  PID_3    GID_2   0              249003.25

Product DIM
ID     CODE
PID_1  COMP
PID_2  DIGI
PID_3  GAME

Geo DIM
ID     CODE
GID_1  GL
GID_2  MW

Time DIM
ID     CODE
TID_1  1/1/2015
TID_2  4/1/2015
TID_3  7/1/2015
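
The two bookkeeping steps so far (copying the 1st-quarter plan into the 3rd quarter, then totaling it) need no model. A minimal Python sketch, using the leaf-level spend values from the fact table; the function name `replicate_plan` and the dictionary layout are illustrative assumptions, not something shown in the deck:

```python
# Q1 2015 (TID_1) spend at the leaf level, keyed by (product ID, geo ID).
q1_spend = {
    ("PID_1", "GID_1"): 4679.25,
    ("PID_2", "GID_1"): 40392.0,
    ("PID_3", "GID_1"): 37986.5,
    ("PID_1", "GID_2"): 15206.5,
    ("PID_2", "GID_2"): 260338.0,
    ("PID_3", "GID_2"): 249003.25,
}


def replicate_plan(spend_by_cell):
    """Copy one period's spend plan into another period unchanged."""
    return dict(spend_by_cell)


# Q3 2015 (TID_3) starts as a copy of Q1, then the leaves are summed
# to get the ALL/ALL total.
q3_spend = replicate_plan(q1_spend)
q3_total = sum(q3_spend.values())
print(q3_total)  # 607605.5
```

Both steps are plain copies and sums over stored data, which is why a conventional BI stack can answer them; only the GQV column needs something more.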

Page 9:

TO GET THE GQV VALUE ONE HAS TO BUILD PREDICTIVE MODELS

FACT
TIME   PRODUCT  REGION  # IMPRESSIONS  SPEND ($$)  GQV
TID_3  ALL      ALL     419147300      607605.5
TID_1  PID_1    ALL     176843600      19885.75    8092237
TID_1  PID_2    ALL     185465400      300730      3691062
TID_1  PID_3    ALL     56838300       286989.8    421021
TID_1  PID_1    GID_1   0              4679.25     8091825
TID_1  PID_2    GID_1   0              40392       3684004
TID_1  PID_3    GID_1   0              37986.5     414270
TID_1  PID_1    GID_2   0              15206.5     412
TID_1  PID_2    GID_2   0              260338      7058
TID_1  PID_3    GID_2   0              249003.25   6751
TID_3  PID_1    GID_1   0              4679.25
TID_3  PID_2    GID_1   0              40392
TID_3  PID_3    GID_1   0              37986.5
TID_3  PID_1    GID_2   0              15206.5
TID_3  PID_2    GID_2   0              260338
TID_3  PID_3    GID_2   0              249003.25

Product DIM
ID     CODE
PID_1  COMP
PID_2  DIGI
PID_3  GAME

Geo DIM
ID     CODE
GID_1  GL
GID_2  MW

Time DIM
ID     CODE
TID_1  1/1/2015
TID_2  4/1/2015
TID_3  7/1/2015

Page 10:

NEXT-GEN ANALYTICS DRIVES DECISION MAKING

Chart axes: Complexity, Business Value

Stages
  Reporting: What happened?
  Analysis: Why did it happen?
  Monitoring: What’s happening now?
  Prediction: What might happen?
  Decision: What should I do now?

Tools and techniques
  Query, reporting & search tools
  Dashboards, scorecards, listening, real time reporting
  OLAP and visualization tools
  Predictive analytics
  Decision support and management
  Business Intelligence
  Complex event processing; NLP; Text mining
  Time series analysis, predictive modeling, ensemble modeling, machine learning
  Constraint based optimization; choice modeling; decision trees
  Time series analysis, data mining, clustering

Page 11:

TO BUILD MODELS FROM DATA

FACT
TIME   PRODUCT  REGION  # IMPRESSIONS  SPEND ($$)  GQV
TID_1  PID_1    ALL     176843600      19885.75    8092237
TID_1  PID_2    ALL     185465400      300730      3691062
TID_1  PID_3    ALL     56838300       286989.8    421021
TID_1  PID_1    GID_1   0              4679.25     8091825
TID_1  PID_2    GID_1   0              40392       3684004
TID_1  PID_3    GID_1   0              37986.5     414270
TID_1  PID_1    GID_2   0              15206.5     412
TID_1  PID_2    GID_2   0              260338      7058
TID_1  PID_3    GID_2   0              249003.25   6751

Product DIM
ID     CODE
PID_1  COMP
PID_2  DIGI
PID_3  GAME

Geo DIM
ID     CODE
GID_1  GL
GID_2  MW

Time DIM
ID     CODE
TID_1  1/1/2015
TID_2  4/1/2015
TID_3  7/1/2015

Page 12:

THE DATA IS FLATTENED OUT

DENORMALIZED FLATTENED DATA
TIME      PRODUCT  REGION  # IMPRESSIONS  SPEND ($$)  GQV
1/1/2015  COMP     ALL     176843600      19885.75    8092237
1/1/2015  DIGI     ALL     185465400      300730      3691062
1/1/2015  GAME     ALL     56838300       286989.75   421021
1/1/2015  COMP     GL      0              4679.25     8091825
1/1/2015  DIGI     GL      0              40392       3684004
1/1/2015  GAME     GL      0              37986.5     414270
1/1/2015  COMP     MW      0              15206.5     412
1/1/2015  DIGI     MW      0              260338      7058
1/1/2015  GAME     MW      0              249003.25   6751
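
Flattening here amounts to resolving each surrogate ID in the fact table against its dimension table. A minimal Python sketch on three of the fact rows; the `flatten` helper is hypothetical, and passing the roll-up marker `ALL` through unchanged is an assumption, since `ALL` does not appear in the dimension tables:

```python
# Dimension tables from the star schema: surrogate ID -> code.
product_dim = {"PID_1": "COMP", "PID_2": "DIGI", "PID_3": "GAME"}
geo_dim = {"GID_1": "GL", "GID_2": "MW"}
time_dim = {"TID_1": "1/1/2015", "TID_2": "4/1/2015", "TID_3": "7/1/2015"}

# A few fact rows: (time, product, region, impressions, spend, gqv).
fact_rows = [
    ("TID_1", "PID_1", "ALL", 176843600, 19885.75, 8092237),
    ("TID_1", "PID_1", "GID_1", 0, 4679.25, 8091825),
    ("TID_1", "PID_1", "GID_2", 0, 15206.5, 412),
]


def flatten(rows):
    """Replace each surrogate ID with its code (a join against each dimension)."""
    flat = []
    for tid, pid, gid, impressions, spend, gqv in rows:
        flat.append((
            time_dim.get(tid, tid),
            product_dim.get(pid, pid),
            geo_dim.get(gid, gid),  # "ALL" is not a dimension ID, so it passes through
            impressions,
            spend,
            gqv,
        ))
    return flat


flat_rows = flatten(fact_rows)
for row in flat_rows:
    print(row)
```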

Page 13:

DATA IS GROUPED INTO SETS CALLED FEATURES

DENORMALIZED FLATTENED DATA
TIME      PRODUCT  REGION  VARIABLE         # IMPRESSIONS  SPEND ($$)  GQV
1/1/2015  COMP     ALL     TV_PRD_IM        176843600      19885.75    8092237
1/1/2015  DIGI     ALL     TV_PRD_IM        185465400      300730      3691062
1/1/2015  GAME     ALL     TV_PRD_IM        56838300       286989.75   421021
1/1/2015  COMP     GL      TV_LOCAL_PRD_SP  0              4679.25     8091825
1/1/2015  DIGI     GL      TV_LOCAL_PRD_SP  0              40392       3684004
1/1/2015  GAME     GL      TV_LOCAL_PRD_SP  0              37986.5     414270
1/1/2015  COMP     MW      TV_LOCAL_PRD_SP  0              15206.5     412
1/1/2015  DIGI     MW      TV_LOCAL_PRD_SP  0              260338      7058
1/1/2015  GAME     MW      TV_LOCAL_PRD_SP  0              249003.25   6751

Feature selection, also known as variable selection, attribute selection or variable subset selection, is the process of selecting a subset of relevant features for use in model construction.
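
As a rough illustration of the grouping, the rows can be bucketed by their VARIABLE label. This is a minimal Python sketch, not the platform's implementation: the `feature_values` helper is hypothetical, and reading the `_IM` suffix as impressions and the `_SP` suffix as spend is an inference from the values the deck later places in the model input matrix.

```python
from collections import defaultdict

# (product, region, variable, impressions, spend) for 1/1/2015.
rows = [
    ("COMP", "ALL", "TV_PRD_IM", 176843600, 19885.75),
    ("DIGI", "ALL", "TV_PRD_IM", 185465400, 300730.0),
    ("GAME", "ALL", "TV_PRD_IM", 56838300, 286989.75),
    ("COMP", "GL", "TV_LOCAL_PRD_SP", 0, 4679.25),
    ("DIGI", "GL", "TV_LOCAL_PRD_SP", 0, 40392.0),
    ("GAME", "GL", "TV_LOCAL_PRD_SP", 0, 37986.5),
    ("COMP", "MW", "TV_LOCAL_PRD_SP", 0, 15206.5),
    ("DIGI", "MW", "TV_LOCAL_PRD_SP", 0, 260338.0),
    ("GAME", "MW", "TV_LOCAL_PRD_SP", 0, 249003.25),
]


def feature_values(rows):
    """Group rows by variable name, keeping the measure each variable refers to."""
    features = defaultdict(dict)
    for product, region, variable, impressions, spend in rows:
        # Assumption: an _IM suffix means impressions, an _SP suffix means spend.
        value = impressions if variable.endswith("_IM") else spend
        features[variable][(product, region)] = value
    return dict(features)


features = feature_values(rows)
print({name: len(cells) for name, cells in features.items()})
# {'TV_PRD_IM': 3, 'TV_LOCAL_PRD_SP': 6}
```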

Page 14:

FEATURES ARE ASSEMBLED INTO AN EQUATION

FLATTENED DENORMALIZED DATA
TIME      PRODUCT  REGION  VARIABLE         # IMPRESSIONS  SPEND ($$)  GQV
1/1/2015  COMP     ALL     TV_PRD_IM        176843600      19885.75    8092237
1/1/2015  DIGI     ALL     TV_PRD_IM        185465400      300730      3691062
1/1/2015  GAME     ALL     TV_PRD_IM        56838300       286989.75   421021
1/1/2015  COMP     GL      TV_LOCAL_PRD_SP  0              4679.25     8091825
1/1/2015  DIGI     GL      TV_LOCAL_PRD_SP  0              40392       3684004
1/1/2015  GAME     GL      TV_LOCAL_PRD_SP  0              37986.5     414270
1/1/2015  COMP     MW      TV_LOCAL_PRD_SP  0              15206.5     412
1/1/2015  DIGI     MW      TV_LOCAL_PRD_SP  0              260338      7058
1/1/2015  GAME     MW      TV_LOCAL_PRD_SP  0              249003.25   6751

Model Equation
LOG(GQV_PD + 1) := TV_PRD_IM_LOGC*C(1) + TV_LOCAL_PRD_SP_LOGC*C(2)

Page 15:

CORRESPONDING COEFFICIENTS ARE ESTIMATED

FLATTENED DENORMALIZED DATA
TIME      PRODUCT  REGION  VARIABLE         # IMPRESSIONS  SPEND ($$)  GQV      COEFF_VALUE
1/1/2015  COMP     ALL     TV_PRD_IM        176843600      19885.75    8092237  0.045756241
1/1/2015  DIGI     ALL     TV_PRD_IM        185465400      300730      3691062  0.01985766
1/1/2015  GAME     ALL     TV_PRD_IM        56838300       286989.75   421021   0.007270448
1/1/2015  COMP     GL      TV_LOCAL_PRD_SP  0              4679.25     8091825  0.027113343
1/1/2015  DIGI     GL      TV_LOCAL_PRD_SP  0              40392       3684004  0.027113343
1/1/2015  GAME     GL      TV_LOCAL_PRD_SP  0              37986.5     414270   0.027113343
1/1/2015  COMP     MW      TV_LOCAL_PRD_SP  0              15206.5     412      0.027113343
1/1/2015  DIGI     MW      TV_LOCAL_PRD_SP  0              260338      7058     0.027113343
1/1/2015  GAME     MW      TV_LOCAL_PRD_SP  0              249003.25   6751     0.027113343

Model Equation
LOG(GQV_PD + 1) := TV_PRD_IM_LOGC*C(1) + TV_LOCAL_PRD_SP_LOGC*C(2)

Page 16:

CALCULATIONS ARE MATRIX OPERATIONS ON THE MODEL AND THE COEFFICIENTS

Model Input
                     TV_PRD_IM  TV_PRD_IM  TV_PRD_IM  TV_LOCAL_PRD_SP
PRD=COMP & GEO = GL  176843600  0          0          4679.25
PRD=DIGI & GEO = GL  0          185465400  0          40392
PRD=GAME & GEO = GL  0          0          56838300   37986.5
PRD=COMP & GEO = MW  176843600  0          0          15206.5
PRD=DIGI & GEO = MW  0          185465400  0          260338
PRD=GAME & GEO = MW  0          0          56838300   249003.3

Coeff. Stack
C1, PRD=COMP  0.045756241
C1, PRD=DIGI  0.01985766
C1, PRD=GAME  0.007270448
C2, PRD=ALL   0.027113343

Outcome
                     GQV_PD
PRD=COMP & GEO = GL  8091825
PRD=DIGI & GEO = GL  3684004
PRD=GAME & GEO = GL  414270
PRD=COMP & GEO = MW  412
PRD=DIGI & GEO = MW  7058
PRD=GAME & GEO = MW  6751

NUMBER OF COLUMNS IS NUMBER OF COEFFICIENTS
NUMBER OF ROWS IS NUMBER OF DISTINCT COMBINATIONS OF DIMENSIONS
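
The calculation can be sketched as a matrix-vector product: each model input row is dotted with the coefficient stack. The Python below is a minimal sketch limited to the three GL rows. It multiplies the raw values exactly as the slide displays them and does not apply the LOG transforms named in the model equation; the `score` helper and the dictionary layout are illustrative assumptions.

```python
# Coefficient stack in column order: C1 for COMP, DIGI, GAME, then the shared C2.
coeff_stack = [0.045756241, 0.01985766, 0.007270448, 0.027113343]

# Model input: one row per (product, region) combination, one column per coefficient.
model_input = {
    ("COMP", "GL"): [176843600, 0, 0, 4679.25],
    ("DIGI", "GL"): [0, 185465400, 0, 40392],
    ("GAME", "GL"): [0, 0, 56838300, 37986.5],
}


def score(matrix, coeffs):
    """Matrix-vector product: dot each input row with the coefficient stack."""
    return {
        key: sum(x * c for x, c in zip(row, coeffs))
        for key, row in matrix.items()
    }


outcome = score(model_input, coeff_stack)
for key, value in outcome.items():
    print(key, round(value))
```

Rounded to whole numbers, the three results are 8091825, 3684004 and 414270, the GQV_PD outcomes listed for the GL rows. Because the zero entries select which product coefficient applies, adding a product or region only adds rows or columns; the operation itself stays the same.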

Page 17:

IMPACT OF INCREASING SPEND BY 10% IN 3RD QUARTER?

FACT
TIME   PRODUCT  REGION  # IMPRESSIONS  SPEND ($$)  GQV
TID_3  ALL      ALL     419147300      607605.5
TID_1  PID_1    ALL     176843600      19885.75    8092237
TID_1  PID_2    ALL     185465400      300730      3691062
TID_1  PID_3    ALL     56838300       286989.8    421021
TID_1  PID_1    GID_1   0              4679.25     8091825
TID_1  PID_2    GID_1   0              40392       3684004
TID_1  PID_3    GID_1   0              37986.5     414270
TID_1  PID_1    GID_2   0              15206.5     412
TID_1  PID_2    GID_2   0              260338      7058
TID_1  PID_3    GID_2   0              249003.25   6751
TID_3  PID_1    GID_1   0              4679.25
TID_3  PID_2    GID_1   0              40392
TID_3  PID_3    GID_1   0              37986.5
TID_3  PID_1    GID_2   0              15206.5
TID_3  PID_2    GID_2   0              260338
TID_3  PID_3    GID_2   0              249003.25

Product DIM
ID     CODE
PID_1  COMP
PID_2  DIGI
PID_3  GAME

Geo DIM
ID     CODE
GID_1  GL
GID_2  MW

Time DIM
ID     CODE
TID_1  1/1/2015
TID_2  4/1/2015
TID_3  7/1/2015

Page 18:

1ST STEP: DISTRIBUTE

FACT
TIME   PRODUCT  REGION  # IMPRESSIONS  SPEND ($$)  GQV
TID_3  ALL      ALL     419147300      668366.05
TID_1  PID_1    ALL     176843600      19885.75    8092237
TID_1  PID_2    ALL     185465400      300730      3691062
TID_1  PID_3    ALL     56838300       286989.8    421021
TID_1  PID_1    GID_1   0              4679.25     8091825
TID_1  PID_2    GID_1   0              40392       3684004
TID_1  PID_3    GID_1   0              37986.5     414270
TID_1  PID_1    GID_2   0              15206.5     412
TID_1  PID_2    GID_2   0              260338      7058
TID_1  PID_3    GID_2   0              249003.25   6751
TID_3  PID_1    GID_1   0              5147.18
TID_3  PID_2    GID_1   0              44431.2
TID_3  PID_3    GID_1   0              41785.15
TID_3  PID_1    GID_2   0              16727.15
TID_3  PID_2    GID_2   0              286371.8
TID_3  PID_3    GID_2   0              273903.6

Product DIM
ID     CODE
PID_1  COMP
PID_2  DIGI
PID_3  GAME

Geo DIM
ID     CODE
GID_1  GL
GID_2  MW

Time DIM
ID     CODE
TID_1  1/1/2015
TID_2  4/1/2015
TID_3  7/1/2015

DISTRIBUTE THE SPEND AT THE LEVEL WHERE THE MODEL IS DEFINED
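
One way to read the distribute step: the user edits the top-level total, and the change is pushed down to the product-by-region leaves where the model is defined. The minimal Python sketch below assumes a proportional allocation rule; the deck only shows the resulting leaf values, so both the rule and the `distribute` helper are assumptions. `Decimal` is used so the scaled values are not disturbed by binary floating-point rounding.

```python
from decimal import Decimal

# Baseline leaf-level spend (replicated from Q1 2015), keyed by (product, region).
baseline = {
    ("COMP", "GL"): Decimal("4679.25"),
    ("DIGI", "GL"): Decimal("40392"),
    ("GAME", "GL"): Decimal("37986.5"),
    ("COMP", "MW"): Decimal("15206.5"),
    ("DIGI", "MW"): Decimal("260338"),
    ("GAME", "MW"): Decimal("249003.25"),
}


def distribute(leaves, new_total):
    """Spread a new top-level total over the leaves in proportion to the baseline."""
    old_total = sum(leaves.values())
    factor = new_total / old_total
    return {cell: spend * factor for cell, spend in leaves.items()}


# The slide's new ALL/ALL total: 607605.5 increased by 10%.
q3_spend = distribute(baseline, Decimal("668366.05"))
for cell, spend in q3_spend.items():
    print(cell, spend)
```

With these inputs every leaf is scaled by 1.1, which reproduces the TID_3 spend column up to the rounding used on the slide.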

Page 19:

2ND STEP: CALCULATE

TIME      PRODUCT  REGION  VARIABLE       # IMPRESSIONS  SPEND ($$)
7/1/2015  ALL      ALL     M_TV_N_BRD_SP  419147300      668366.05
7/1/2015  COMP     ALL     M_TV_N_PD_IM   176843600
7/1/2015  DIGI     ALL     M_TV_N_PD_IM   185465400
7/1/2015  GAME     ALL     M_TV_N_PD_IM   56838300
7/1/2015  COMP     GL      M_TV_L_P_SP    0              5147.18
7/1/2015  DIGI     GL      M_TV_L_P_SP    0              44431.2
7/1/2015  GAME     GL      M_TV_L_P_SP    0              41785.15
7/1/2015  COMP     MW      M_TV_L_P_SP    0              16727.15
7/1/2015  DIGI     MW      M_TV_L_P_SP    0              286371.8
7/1/2015  GAME     MW      M_TV_L_P_SP    0              273903.6

TV_PRD_IM  TV_PRD_IM  TV_PRD_IM  TV_LOCAL_PRD_SP
176843600  0          0          5147.18
0          185465400  0          44431.2
0          0          56838300   41785.15
176843600  0          0          16727.15
0          185465400  0          286371.8
0          0          56838300   273903.6

GQV_PD
88091838
3684114
414372.8
453.23
7763.86
7426.13

Page 20:

3RD STEP: AGGREGATE

FACT
TIME   PRODUCT  REGION  # IMPRESSIONS  SPEND ($$)  GQV
TID_3  ALL      ALL     419147300      668366.05   92205968
TID_1  PID_1    ALL     176843600      19885.75    8092237
TID_1  PID_2    ALL     185465400      300730      3691062
TID_1  PID_3    ALL     56838300       286989.8    421021
TID_1  PID_1    GID_1   0              4679.25     8091825
TID_1  PID_2    GID_1   0              40392       3684004
TID_1  PID_3    GID_1   0              37986.5     414270
TID_1  PID_1    GID_2   0              15206.5     412
TID_1  PID_2    GID_2   0              260338      7058
TID_1  PID_3    GID_2   0              249003.25   6751
TID_3  PID_1    GID_1   0              5147.18     88091838
TID_3  PID_2    GID_1   0              44431.2     3684114
TID_3  PID_3    GID_1   0              41785.15    414372.8
TID_3  PID_1    GID_2   0              16727.15    453.23
TID_3  PID_2    GID_2   0              286371.8    7763.86
TID_3  PID_3    GID_2   0              273903.6    7426.13

Product DIM
ID     CODE
PID_1  COMP
PID_2  DIGI
PID_3  GAME

Geo DIM
ID     CODE
GID_1  GL
GID_2  MW

Time DIM
ID     CODE
TID_1  1/1/2015
TID_2  4/1/2015
TID_3  7/1/2015

AGGREGATE THE OUTPUT AT THE DESIRED LEVEL
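
The aggregate step is a roll-up of the leaf-level outcomes to whatever level the user asked about. A minimal Python sketch, using the TID_3 GQV values exactly as listed in the table; the `aggregate` helper and its `level` argument are hypothetical names introduced for illustration:

```python
from collections import defaultdict

# Leaf-level Q3 GQV as listed in the table: (product, region) -> GQV.
leaf_gqv = {
    ("COMP", "GL"): 88091838,
    ("DIGI", "GL"): 3684114,
    ("GAME", "GL"): 414372.8,
    ("COMP", "MW"): 453.23,
    ("DIGI", "MW"): 7763.86,
    ("GAME", "MW"): 7426.13,
}


def aggregate(leaves, level):
    """Sum leaf values up to a level: 'product', 'region', or 'all'."""
    position = {"product": 0, "region": 1}
    totals = defaultdict(float)
    for cell, value in leaves.items():
        key = "ALL" if level == "all" else cell[position[level]]
        totals[key] += value
    return dict(totals)


print(round(aggregate(leaf_gqv, "all")["ALL"]))  # 92205968
print(sorted(aggregate(leaf_gqv, "region")))     # ['GL', 'MW']
```

The same function serves any intermediate level, which is the point of the step: the model is scored once at the leaves and the answer is summed to the level of the question.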

Page 21:

Cube: 131M granular rows

Dimensions: Publisher, Marketing Driver, Tactic, Creative Concept, Geo, Time, Campaign, Product, Non-Marketing Driver, Placement

Time: Year, Quarter, Month, Week

Geo: National
  Central: Dallas, Houston
  Great Lakes: Chicago, Cincinnati
  Northeast: Boston, New York
  Southeast: Atlanta, Charlotte
  West: Denver, Los Angeles, Seattle

Product: All Products
  Dept 21: D21 Core, D21 Fencing
  Dept 22: D22 Concrete
  Dept 23F: D23F Area Rugs, D23F Carpeting
  Dept 24: D24 Applicators, D24 Caulks/Tape/Oth
  Dept 59: D59 Decor/Furniture, D59 Organization, D59 Window Coverings

Non-Marketing Driver: Coupon redemption, Discounts, Macro economics, Pricing, Tabs, Weather, eCircular

Marketing channel labels (in slide order): Online Media Channel, Offline Media Channel, Social, Paid Social, Other, Affiliate, Display, Mobile, Video, Desktop, Other, Paid Search, Branded, Non Branded, Email, Audio, Magazine, Radio, TV, Leads, Tab, Product Listing, Services Directory

Measures: Assists, Orders, Revenue, Clicks, Events, Impressions, Spend, Last touches, Click converting rate, Converting click

Counts shown on the diagram: 866, 1590948, 23910, 336, 15790, 28, 82, 156, 33, 23

Page 22:

Cube: 131M granular rows (same dimension and measure diagram as page 21)

Digital manager gets an additional 10% marketing budget to spend in Q4 2015 on online media and wants to understand its effect.

Page 23:

SCENARIO ANALYSIS (AS SEEN BY THE USER)

Spend view seen by the user in the application
  Online ($), Q4 2015: 40,500,000

After the user increases the spend
  Online ($), Q4 2015: 44,550,000

Scoring generates the KPI view
  Revenue ($): 3%
  Profit ($): 8%

Page 24:

EXPONENTIAL IMPROVEMENT IN SCENARIO CALCULATION

Chart: Scenario Calculation Time in Seconds

Representative deployment: BANK CO., FINANCE CO., TELECOM CO., TRAVEL CO.
Years: 2010, 2012, 2014, 2015
Modeling scale labels (in slide order):
  4.5 B data points, 600MM variables
  170 MM data points, 90,000 variables
  20 MM data points, 7,000 variables
  2MM data points, 1,200 variables

Page 25:

LOT LESS EFFORT FOR MUCH LARGER DEPLOYMENTS

Chart: Hours (total hours per variable), by role: Strategist, Data Mgmt., Modeler, Ops Support

Representative deployment: BANK CO., FINANCE CO., TELECOM CO., TRAVEL CO.
Years: 2010, 2012, 2014, 2015
Modeling scale labels (in slide order):
  4.5 B data points, 600MM variables
  170 MM data points, 90,000 variables
  20 MM data points, 7,000 variables
  2MM data points, 1,200 variables

Software automation has driven a decline in service requirements.

The size of models has grown significantly, yet the deployment effort required has decreased significantly.

Page 26:

MS DECISIONCLOUD ANALYTICS WORKFLOW

Diagram labels (in slide order): Distributed Cache, Tool, Application Engines, Calculation Engines, Execution Systems, Elastic Load Balancer, Client Onboarding, Model Store, Config Store, Attribution, Funnel creation, Post Processing, Orchestrator, Metadata Store, Modeling Stack, Transformation Stack, ETL Configurations, ETL, Model UAT, Attribution Models, Evaluate, Automated Model Generation

Page 27:

STAND ON THE SHOULDERS OF GIANTS

Diagram labels (same workflow diagram as page 26): Distributed Cache, Tool, Application Engines, Calculation Engines, Execution Systems, Elastic Load Balancer, Client Onboarding, Model Store, Config Store, Attribution, Funnel creation, Post Processing, Orchestrator, Metadata Store, Modeling Stack, Transformation Stack, ETL Configurations, ETL, Model UAT, Attribution Models, Evaluate, Automated Model Generation

Page 28:

KEY FOCUS AREAS

o Configuration Driven Platform
  o All Modules run via Configurations
  o Will use Metadata to automatically fill in Configurations
o Real-time Simulation Engine
  o Real-time propagation of changes to modeling stack-frames
o Supporting infrastructure
  o Constraint Engine
  o Collaboration

Page 29:

Questions & Discussion