(ADV303) MediaMath’s Data Revolution with Amazon Kinesis and Amazon EMR | AWS re:Invent 2014
TRANSCRIPT
November 13, 2014 | Las Vegas, NV
Eddie Fagin, VP Engineering, MediaMath
Ian Hummel, Sr. Director Engineering, MediaMath
Adi Krishnan, Sr. PM Amazon Kinesis
Data Warehousing
• Query engine approach
• Pre-computations such as indices and dimensional views improve performance
• Historical, structured data
• Amazon Redshift

Hadoop-Style Processing
• Hive / SQL-on-Hadoop / MapReduce / Spark
• Batch programs, or other abstractions that break down into MapReduce-style computations
• Historical, semi-structured data
• Amazon EMR

Stream Processing
• Custom computations of relatively simple complexity
• Continuous processing (filters, sliding windows, aggregates) on infinite data streams
• Semi-structured or structured data, generated continuously in real time
• Amazon Kinesis
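The "continuous processing" idea in the stream-processing column (a filter plus a sliding-window aggregate over an unbounded stream) can be sketched in a few lines of Python. This is only an illustration, not MediaMath's or Amazon's implementation: the `(timestamp, event_type, value)` record shape, the function name, and the sample events are all assumptions, and events are assumed to arrive in timestamp order.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple

# Hypothetical record shape: (timestamp_seconds, event_type, value)
Event = Tuple[float, str, float]


def sliding_window_sums(
    events: Iterable[Event], window_seconds: float, wanted_type: str
) -> Iterator[Tuple[float, float]]:
    """For each matching event, yield (timestamp, sum over the trailing window).

    Assumes events arrive in timestamp order; works on unbounded iterables
    because it only keeps the events inside the current window.
    """
    window: Deque[Tuple[float, float]] = deque()
    running_total = 0.0
    for ts, event_type, value in events:
        if event_type != wanted_type:  # filter step
            continue
        window.append((ts, value))
        running_total += value
        # Evict events that have fallen out of the trailing window.
        while window and window[0][0] <= ts - window_seconds:
            _, old_value = window.popleft()
            running_total -= old_value
        yield ts, running_total


# Hypothetical sample stream: only "win" events are aggregated.
sample = [(0, "win", 1.0), (10, "click", 5.0), (30, "win", 2.0), (70, "win", 4.0)]
results = list(sliding_window_sums(sample, window_seconds=60, wanted_type="win"))
```

Because the generator holds only the current window in memory, the same loop applies whether the input is a short list or a never-ending feed.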
Amazon Kinesis
• Real-time processing
• High throughput; elastic
• Easy to use
• S3, Redshift, DynamoDB integrations
Amazon Kinesis
• Millions of sources producing 100s of terabytes per hour
• Front end: authentication, authorization
• Amazon Web Services (AZ, AZ, AZ): durable, highly consistent storage replicates data across three data centers (availability zones)
• Ordered stream of events supports multiple readers
• Aggregate and archive to S3
• Real-time dashboards and alarms
• Machine learning algorithms or sliding window analytics
• Aggregate analysis in Hadoop or a data warehouse
• Inexpensive: $0.028 per million puts
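To make "ordered stream of events supports multiple readers" concrete, here is a minimal in-memory sketch. It is a stand-in for the concept only and does not use or mirror the Kinesis API: the `OrderedStream` class, its methods, and the reader names are hypothetical. The point it illustrates is that each reader keeps its own position, so a dashboard and an archiver can consume the same records independently and in the same order.

```python
from typing import Dict, List


class OrderedStream:
    """In-memory stand-in for an ordered event stream (not the Kinesis API)."""

    def __init__(self) -> None:
        self._records: List[str] = []
        self._positions: Dict[str, int] = {}  # reader name -> next index to read

    def put(self, record: str) -> int:
        """Append a record and return its sequence number (its index)."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, reader: str, limit: int = 10) -> List[str]:
        """Return up to `limit` unread records for `reader`, in put order."""
        start = self._positions.get(reader, 0)
        batch = self._records[start:start + limit]
        self._positions[reader] = start + len(batch)
        return batch


stream = OrderedStream()
for event in ["e1", "e2", "e3"]:
    stream.put(event)

# Two hypothetical readers consume the same stream at their own pace.
dashboard_batch = stream.read("dashboard", limit=2)
archiver_batch = stream.read("archiver")
dashboard_rest = stream.read("dashboard")
```

Reading never removes records, which is what lets several downstream applications share one stream.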
Amazon EMR
• Hadoop/HDFS clusters
• Hive, Impala, MapReduce
• Easy to use; fully managed
• On-demand and spot pricing
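As a rough illustration of what a "MapReduce-style computation" looks like, the sketch below runs the three phases (map, shuffle, reduce) in plain Python on a single machine. It is not EMR or Hadoop code, and the `campaign_id,event` log layout and all function names are assumptions made up for the example; a real job would distribute these phases across a cluster.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit (campaign_id, 1) per win; the 'campaign_id,event' layout is hypothetical."""
    for line in lines:
        campaign_id, event = line.strip().split(",")
        if event == "win":
            yield campaign_id, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as the framework would between map and reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the grouped values for each key."""
    return {key: sum(values) for key, values in grouped.items()}


def wins_per_campaign(lines: Iterable[str]) -> Dict[str, int]:
    """Chain the three phases into one batch computation."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


# Hypothetical log lines.
log_lines = ["c1,win", "c2,bid", "c1,win", "c2,win"]
totals = wins_per_campaign(log_lines)
```

Higher-level tools such as Hive compile queries down to jobs with this same map/group/reduce shape.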
Warehouse (analytics, decisioning, optimization, archive)
Bidder Data (wins)
Site Events
3rd Party Segments
Firehose (Kinesis)
Decisioning & Optimization
Real-time Analytics
Archive
S3
Bidder Data (wins)
Site Events
3rd Party Segments
App (metadata)
Data mart (Oracle/Postgres)
Qubole
Redshift
Hadoop
Scripts
Attribution
Bidders
S3
EMR
Recurring partition jobs/process jobs
Partners/clients/tools/internal services
Pixels
Realtime
Firehose
Netezza