Turbocharging CDAP Applications with Ampool
![Page 1: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/1.jpg)
©2015 Slide 1
Prepared for: BDA Meetup
Turbocharging CDAP Applications With Ampool
Milind Bhandarkar (@techmilind)
Founder & CEO, @AmpoolIO
![Page 2: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/2.jpg)
Outline
1. Ampool Vision
2. Pipelines w/ CDAP
3. IMDG w/ Geode
4. Ampool w/ CDAP
5. Q & A
![Page 3: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/3.jpg)
Data Processing & Storage layers have evolved for scale-out
[Diagram: Persistence vs. Processing grid. Labels: Unstructured, Structured; Immutable, Mutable; Unmanaged, Managed; Log, Publish, QTx, ETL]
In the beginning…
As app users & data grew…
Big Data/ App Explosion!
![Page 4: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/4.jpg)
Build a Processing & Storage-agnostic Memory Architecture
[Diagram: Persistence vs. Processing grid. Labels: Unstructured, Structured; Immutable, Mutable; Unmanaged, Managed; Log, Publish, QTx, ETL; ampool; Data Frame; Data Set]
Unify data processing
Design for Scale-out
Best of breed data engines!
![Page 5: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/5.jpg)
Ampool’s Mission: To help build real-time customer experiences through high-performance analytics built for modern, commodity hardware platforms.
For the community: To speed up big, real-time analytics in a democratic way through a memory-centric architecture (complementing existing architectures), driving better interoperability between compute and storage layers.
![Page 6: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/6.jpg)
Big Data Processing Pipelines… use slow, persistent storage for data exchange today!
[Diagram: pipeline stages Ingest, ETL, Analytics, App Use]
![Page 7: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/7.jpg)
AMPOOL: Fast memory across distributed compute clusters... driving performance, simplicity and agility
[Diagram: pipeline stages Ingest, ETL, Analytics, App Use, with ampool]
![Page 8: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/8.jpg)
Energy Management
IoT Analytics
[Diagram: pipeline stages Ingest, ETL, Analytics, App Use, with ampool and HDFS]
Data ingestion flows:
• Smart meter data (Kafka)
Hive processing:
• De-norm, Sessionize
• Aggregations
Spark processing:
• Linear Regression
• Export to HBase
Downstream Apps:
• Web app integration
![Page 9: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/9.jpg)
Pipeline implemented in CDAP
![Page 10: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/10.jpg)
CDAP Application
![Page 11: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/11.jpg)
In-memory Technology
What is Apache Geode?
![Page 12: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/12.jpg)
How does it compare with the Big Data stack?
YCSB: Geode & HBase
![Page 13: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/13.jpg)
Ampool with CDAP
CDAP with HBase (as-is Application)
Configuration Changes:
• Extension modules/directory
• Distributed Mode table/stream
CDAP with Ampool (powered by Geode)
![Page 14: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/14.jpg)
CDAP Demo Pipeline (Video)
![Page 15: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/15.jpg)
Ampool with CDAP
Pipeline Baseline: Ampool & HBase
![Page 16: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/16.jpg)
Key Takeaways
Ampool complements CDAP…
• CDAP simplifies the development of complex big data pipelines and offers extensibility at multiple layers
• In-memory technology such as Geode promises higher performance in certain use cases
• Ampool, powered by Geode, is able to show immediate performance gains without any pipeline re-engineering!
• Future…
![Page 17: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/17.jpg)
Compatible with the Future
![Page 18: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/18.jpg)
Customer Behavior
Predictive Modeling
[Diagram: pipeline stages Ingest, ETL, Analytics, App Use, with ampool and HDFS]
Data ingestion flows:
• Click streams (Kafka)
• Dim. tables (Sqoop)
2-stage MR pipeline:
• Cleanse data
• Sessionize clickstream
HAWQ stages:
• Data import (PXF)
• Exp. features (MADlib)
Spark modeling stages:
• Feature analysis (MLlib)
• Scoring (R/HAWQ)
![Page 19: Turbocharging CDAP Applications with Ampool](https://reader034.vdocuments.net/reader034/viewer/2022042517/5882b6f31a28abd75a8b75e5/html5/thumbnails/19.jpg)
Security Analytics
Big Data Insights
[Diagram: pipeline stages Ingest, ETL, Analytics, App Use, with ampool and HDFS]
Data ingestion flows:
• Security Logs (Flume)
Pig data processing:
• Joins logs w/ catalog
• Stores denorm. logs
Kylin stages:
• Pre-aggregations
• Export to HBase
Downstream Apps:
• Drill-down API for logs
• Web app integration