Big Data eXposed: From Big Data to Smart Data
DESCRIPTION
This is the deck I presented at the Big Data eXposed event on September 30 at the David Intercontinental, Israel. In this session I'll take the audience on a short trip through the eXelate cloud and present three big-data challenges and how we addressed them.
TRANSCRIPT
![Page 1: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/1.jpg)
1© 2013 eXelate Inc. Confidential and Proprietary. #bdx2013
From Big data to Smart data
A journey into the
eXelate cloud
Motty Cohen, Chief Architect, eXelate
![Page 2: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/2.jpg)
eXelate is the smart data company that powers smarter digital marketing decisions worldwide
Advertiser 1st Party Data
Data Providers
Offline Data
Online Data
Media Platforms
Modeling
Scoring
Segmentation
Analytics
Distribution
Marketing
Data Exchange Platform
![Page 3: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/3.jpg)
• Demographic
  • Age: 40-55
  • Urbanicity: Suburban
  • Income: High
  • Education: Graduate Plus
  • Employment: Management
• Interest
  • Sport
  • Travels
  • Wines
  • Gadgets
• Intent
  • Travel to Barcelona
  • 4-star resort
Smart Data: Accurate & actionable audience segmentation
![Page 4: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/4.jpg)
Our journey begins in the browser
The Internet
![Page 5: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/5.jpg)
Inside eXelate Cloud: Real-time Serving & Smart data delivery
Get Event Info
Add History Data
Apply Rules & Models
Sell to buyers
200ms
100+ platforms
~500K Rules
~20K Segments
5B Events/Day
~850M Unique Users
14TB Storage
27GB daily
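A quick back-of-the-envelope check puts the numbers on this slide in perspective. The sketch below converts 5B events/day into an average per-second rate and, using the 200ms budget as an upper bound on time in the system, estimates how many requests are in flight at once (Little's law). It assumes traffic is spread evenly over the day, which real traffic is not, so peak figures would be higher. The function names are illustrative only.

```python
# Figures taken from the slide; the uniform-traffic assumption is ours.
EVENTS_PER_DAY = 5_000_000_000
LATENCY_BUDGET_S = 0.200
SECONDS_PER_DAY = 24 * 60 * 60


def average_rate(events_per_day: int) -> float:
    """Average events per second, assuming an even spread over the day."""
    return events_per_day / SECONDS_PER_DAY


def in_flight(rate_per_s: float, latency_s: float) -> float:
    """Little's law: concurrent requests = arrival rate * time in system."""
    return rate_per_s * latency_s


if __name__ == "__main__":
    rate = average_rate(EVENTS_PER_DAY)
    print(f"average rate: {rate:,.0f} events/s")
    print(f"in flight at the 200ms bound: {in_flight(rate, LATENCY_BUDGET_S):,.0f}")
```

On average this is tens of thousands of events per second, each of which must pass through all four pipeline steps within the budget.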
![Page 6: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/6.jpg)
Challenges
Big Data
Relevancy
Access Time
On demand Analytics
![Page 7: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/7.jpg)
big data = noise
smart data = signal
![Page 8: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/8.jpg)
Challenge 1: Relevancy
Grabbing the relevant audience on site, on time
![Page 9: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/9.jpg)
Generating Models
Model
Model
Model
Data Mining
Analytics
Create Models
eXtream
Netezza tables
Running Analytics on Amazon
Java Packages
![Page 10: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/10.jpg)
Real time segmentation: Running rules and models
Basic Rules
Association Rules
Analytic Models
Model
Model
Model
Real-time scoring
Real-time learning
Can we run all these within the limited time frame?
~500K Rules
Complex Models
![Page 11: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/11.jpg)
Continuous Incremental Segmentation
Users Info
Serving Cluster
Segmentation Cluster
0MQ
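The idea on this slide, as far as the diagram shows it, is that the serving cluster hands user info to a separate segmentation cluster over 0MQ, so rules run continuously off the request path instead of all at once inside the 200ms window. The sketch below is a minimal single-process illustration of that split, not eXelate's implementation: a `deque` stands in for the 0MQ socket, and the rule names, profile fields, and the `SegmentationWorker` class are all hypothetical.

```python
from collections import deque

# Hypothetical rules: segment name -> predicate over a user profile dict.
RULES = {
    "sports_fan": lambda p: p.get("sport_views", 0) >= 3,
    "barcelona_traveler": lambda p: "barcelona" in p.get("searches", ()),
}


class SegmentationWorker:
    """Keeps user segments up to date from a stream of profile updates."""

    def __init__(self, rules):
        self.rules = rules
        self.profiles = {}    # user_id -> accumulated profile
        self.segments = {}    # user_id -> set of segment names
        self.inbox = deque()  # stand-in for a 0MQ socket

    def publish(self, user_id, update):
        """Called on the serving path: enqueue and return immediately."""
        self.inbox.append((user_id, update))

    def drain(self, max_messages):
        """Process up to max_messages queued updates; returns how many ran."""
        processed = 0
        while self.inbox and processed < max_messages:
            user_id, update = self.inbox.popleft()
            profile = self.profiles.setdefault(user_id, {})
            profile.update(update)
            self.segments[user_id] = {
                name for name, rule in self.rules.items() if rule(profile)
            }
            processed += 1
        return processed

    def lookup(self, user_id):
        """Serving-side read: returns the last computed segments."""
        return self.segments.get(user_id, set())
```

The serving path only calls `publish` and `lookup`, both cheap, while `drain` can run at whatever pace the segmentation side sustains. The trade-off is that a lookup may briefly return segments that lag the latest event.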
![Page 12: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/12.jpg)
Challenge 2: Fast access to distributed big storage
![Page 13: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/13.jpg)
User Object
• User Info
  • Segments, Delivery info, Intermediate results
  • Object size: tens to hundreds of KB
  • ~850M unique users
• Access time
  • Read / Write within a few ms
• Availability
  • For any machine in the cluster
  • For any cluster in every data center
![Page 14: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/14.jpg)
Aerospike: Frontend storage for fast access
Aerospike Cluster
Serving Cluster
XDR: Cross Data Center Replication
Optimized for SSD, Indexed in RAM
Smart Eviction Policy
Fast read/writes: 500K+ TPS
Key-value NoSQL distributed DB
![Page 15: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/15.jpg)
Replicated storage across data centers
US WEST, CA
US CENTRAL, TX
EUROPE, NL
US EAST, NY
Aerospike XDR: Cross Datacenter Replication
![Page 16: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/16.jpg)
Challenge 3: On demand analytics
Show me the data, Now!
![Page 17: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/17.jpg)
optiX: Interactive data analytics
On Demand Calculation
![Page 18: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/18.jpg)
optiX: Interactive data analytics
On Demand Calculation
![Page 19: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/19.jpg)
Data Center
Elasticsearch: Using a search engine for counting
Netezza DWH
Aggregator
ES Cluster(30 Nodes)
Reporter
S3
Loader
optiX
REST FTP
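To illustrate what "using a search engine for counting" can look like, the sketch below builds the path and JSON body for an Elasticsearch `_count` request that counts users belonging to every listed segment. It only constructs the request and does not contact a cluster. The index name `users` and the field name `segments` are assumptions, not taken from the deck, and the `bool`/`filter`/`term` query form follows current Elasticsearch documentation, which may differ from the version in use in 2013.

```python
import json


def build_count_request(index, segments):
    """Return (path, JSON body) for counting documents in all given segments.

    `index` and the `segments` field name are hypothetical placeholders.
    """
    body = {
        "query": {
            "bool": {
                # Filter clauses match without scoring, which suits counting.
                "filter": [{"term": {"segments": s}} for s in segments]
            }
        }
    }
    return f"/{index}/_count", json.dumps(body)


if __name__ == "__main__":
    path, payload = build_count_request("users", ["sport", "travel"])
    print("POST", path)
    print(payload)
```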
![Page 20: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/20.jpg)
What have we covered so far?
• Data relevancy
  • Real-time scoring
  • Parallel processing
  • Split processing over time
• Big data access time
  • Front end, Replicated, Aerospike cluster
• On-demand analytics
  • Change your schema to optimize query time
  • Move processing from querying to loading phase
  • Trade off: Space + Processing -> Performance
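The last trade-off can be shown with a toy example. The sketch below does the work at load time by indexing every user under each of their segments, so that a query reduces to a set intersection instead of a scan over all users. It is an in-memory illustration of the principle only; the `SegmentIndex` class and its method names are hypothetical and say nothing about how optiX stores its data.

```python
from collections import defaultdict


class SegmentIndex:
    """Inverted index built at load time: segment -> set of user ids."""

    def __init__(self):
        self._users_by_segment = defaultdict(set)

    def load(self, user_id, segments):
        """Loading phase: pay extra space and processing per record."""
        for segment in segments:
            self._users_by_segment[segment].add(user_id)

    def count(self, *segments):
        """Query phase: count users that belong to all given segments."""
        if not segments:
            return 0
        # Smallest sets first keeps the intersection cheap.
        sets = sorted(
            (self._users_by_segment.get(s, set()) for s in segments), key=len
        )
        return len(set.intersection(*sets))
```

For example, after loading `u1` with `sport` and `travel`, `u2` with `sport`, and `u3` with `travel` and `wine`, the query `count("sport", "travel")` touches only two small sets, however many users were loaded.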
![Page 21: Big data e xposed from big data to smart data](https://reader035.vdocuments.net/reader035/viewer/2022081507/554f9fb6b4c90586258b48c0/html5/thumbnails/21.jpg)
Thank You
Questions?