![Page 1: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/1.jpg)
Real-time Stream Processing Architecture for Comcast IP Video
Strata Conference + Hadoop World 2013
Chris Lintz, Gabriel Commeau
![Page 2: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/2.jpg)
Agenda
o Comcast VIPER Overview
o Architecture Overview
o Q & A
![Page 3: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/3.jpg)
Comcast Video IP Engineering and Research (VIPER)
[Diagram labels: Packaging, Origination, Storage, Transcoding; iOS, Android, Xbox Live, Samsung; Storm]
![Page 4: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/4.jpg)
Why Do We Focus on Real-time?
• Proactively diagnose issues
• Form real-time intelligence
• Help deliver best possible video experience
[Chart labels: Prime Time, Viewership]
![Page 5: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/5.jpg)
Video Player Analytics Protocol
• Live and On Demand
• JSON event objects
• Key metrics
  • Bitrate
  • Frame rate
  • Fragments
  • Errors
We collect and use all data in accordance with best consumer privacy practices and applicable laws
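To make the idea of a JSON event object concrete, here is a minimal Python sketch of what one such event might look like and how it could be decoded and checked. The field names (`session_id`, `stream_type`, `bitrate_kbps`, `frame_rate`, `fragments`, `errors`) are hypothetical; the slides do not show the actual protocol schema.

```python
import json

# Hypothetical event; field names are assumptions, not the real protocol schema.
SAMPLE_EVENT = json.dumps({
    "session_id": "abc-123",
    "stream_type": "live",      # "live" or "on_demand"
    "bitrate_kbps": 3200,
    "frame_rate": 29.97,
    "fragments": 12,
    "errors": [],
})

REQUIRED_FIELDS = (
    "session_id", "stream_type", "bitrate_kbps",
    "frame_rate", "fragments", "errors",
)

def parse_event(raw):
    """Decode one JSON event and check that the expected metric fields are present."""
    event = json.loads(raw)
    missing = [field for field in REQUIRED_FIELDS if field not in event]
    if missing:
        raise ValueError(f"event is missing fields: {missing}")
    return event

event = parse_event(SAMPLE_EVENT)
```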
![Page 6: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/6.jpg)
Player Sessions: Key In Understanding Video Experience
![Page 7: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/7.jpg)
High Level Architecture And Data Flow
![Page 8: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/8.jpg)
Flume: Data collection Tier
o Collect, aggregate and move large amounts of data
o Distributed, scalable, reliable, customizable
o Multi-tier architecture
![Page 9: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/9.jpg)
Storm: Stream Processing Tier
![Page 10: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/10.jpg)
Player Sessions in Real-time
o Sessions in Flume?
  • Technical issues: consistent hash and exactly-once semantics
  • Design goals
  • Separation of concerns
o Session write-through rate?
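The slide names consistent hashing as one of the technical issues in building sessions. As a rough illustration of the technique itself (not of how Comcast or Flume implements it), the Python sketch below routes every event of a session to the same worker using a hash ring with virtual nodes. The class name, worker names, and replica count are all assumptions made for the example.

```python
import bisect
import hashlib

def _hash(key):
    """Map a string to a stable integer position on the ring."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)

class ConsistentHashRing:
    def __init__(self, nodes, replicas=100):
        # Each node gets `replicas` virtual points to even out the distribution.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def node_for(self, session_id):
        """Return the node owning the first ring point at or after the key's hash."""
        idx = bisect.bisect_left(self._points, _hash(session_id)) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["worker-1", "worker-2", "worker-3"])
owner = ring.node_for("session-abc-123")
```

The property that matters for sessions is that removing a worker only remaps the sessions that worker owned; all other sessions keep their owner.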
![Page 11: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/11.jpg)
Flume Edge Tier: Video Player Analytics End Point
o Analytics events over HTTPS
o HTTP Source
o Re-batch with inner sink and source
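The slide does not say how re-batching is configured, so the following is only a conceptual Python sketch of the idea: many small incoming posts are regrouped into batches of a fixed size before being passed on. The function name and the batch size are assumptions, and nothing here is Flume API.

```python
def rebatch(incoming_batches, batch_size):
    """Regroup events from arbitrarily sized incoming batches into batches of `batch_size`."""
    buffer = []
    for batch in incoming_batches:
        for event in batch:
            buffer.append(event)
            if len(buffer) == batch_size:
                yield buffer
                buffer = []
    if buffer:
        yield buffer  # flush the final partial batch

# Five events arriving in four small posts are regrouped into batches of two.
small_posts = [["e1"], ["e2", "e3"], ["e4"], ["e5"]]
batches = list(rebatch(small_posts, batch_size=2))
```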
![Page 12: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/12.jpg)
Flume Mid Tier: Processing and Routing Data
o Video Player Event processing
  • Geo-location, asset metadata, validation, to-storm
o Replication channel processor:
  • HDFS sink
  • Storm sink
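One way to picture the mid tier is as a validate, enrich, replicate pipeline. The Python sketch below is a simplified model under stated assumptions: the lookup tables, field names, and the use of plain lists as stand-ins for the HDFS and Storm sinks are all hypothetical and are not taken from the talk.

```python
def validate(event):
    """Accept only events carrying the fields the later steps rely on."""
    return "session_id" in event and "client_ip" in event

def enrich(event, geo_lookup, asset_lookup):
    """Return a copy of the event with geo-location and asset metadata attached."""
    enriched = dict(event)
    enriched["geo"] = geo_lookup.get(event["client_ip"], "unknown")
    enriched["asset"] = asset_lookup.get(event.get("asset_id"), {})
    return enriched

def process(events, geo_lookup, asset_lookup, sinks):
    """Validate and enrich each event, then replicate it to every sink."""
    for event in events:
        if not validate(event):
            continue  # drop invalid events
        enriched = enrich(event, geo_lookup, asset_lookup)
        for sink in sinks:
            sink.append(enriched)

# Plain lists stand in for the HDFS and Storm sinks.
hdfs_sink, storm_sink = [], []
process(
    [{"session_id": "s1", "client_ip": "10.0.0.1", "asset_id": "a1"}, {"bad": True}],
    geo_lookup={"10.0.0.1": "Philadelphia"},
    asset_lookup={"a1": {"title": "Example Show"}},
    sinks=[hdfs_sink, storm_sink],
)
```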
![Page 13: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/13.jpg)
Bridging Flume to Storm: Flume2Storm Connector
o Service discovery
o Distributed, scalable and reliable
o Low latency
![Page 14: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/14.jpg)
Simplified Video Player Storm Topology
![Page 15: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/15.jpg)
Requirements for Read/Writes from Storm Bolts
o Functionality beyond key/value stores
o Real-time and historic window queries
o Speed of in-memory writes and durability of disk
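To show what a window query over event data can look like, and why it goes beyond a key/value lookup, here is a small Python sketch. It uses the standard-library `sqlite3` module with an in-memory database purely as a stand-in for a SQL store; the table layout, column names, and timestamps are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (session_id TEXT, ts INTEGER, bitrate_kbps INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [("s1", 100, 3000), ("s1", 160, 2000), ("s2", 170, 4000), ("s2", 20, 1000)],
)

def avg_bitrate(conn, start_ts, end_ts):
    """Average bitrate over events with start_ts <= ts < end_ts."""
    row = conn.execute(
        "SELECT AVG(bitrate_kbps) FROM events WHERE ts >= ? AND ts < ?",
        (start_ts, end_ts),
    ).fetchone()
    return row[0]

recent = avg_bitrate(conn, 120, 180)   # "real-time" window: the last 60 seconds
historic = avg_bitrate(conn, 0, 180)   # historic window: everything so far
```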
![Page 16: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/16.jpg)
Utilizing MemSQL for Persistence
• Distributed in-memory SQL database
• ACID, highly available, fault tolerant
• Aggregators route queries to leaves
• Leaves are auto-sharded
• Solves our intense read/writes
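The aggregator and leaf roles described above can be pictured with a toy model. The Python sketch below is not MemSQL code and makes no claim about its internals: it assumes rows are placed on a leaf by hashing a shard key, and that an aggregator answers a query by asking every leaf and merging the partial results. All class and method names are hypothetical.

```python
import zlib

class Leaf:
    """Holds one shard of the data."""
    def __init__(self):
        self.rows = []

class Aggregator:
    """Routes writes to the owning leaf and fans queries out to all leaves."""
    def __init__(self, leaves):
        self.leaves = leaves

    def _leaf_for(self, shard_key):
        # A stable hash of the shard key picks the leaf that owns the row.
        return self.leaves[zlib.crc32(shard_key.encode("utf-8")) % len(self.leaves)]

    def insert(self, shard_key, row):
        self._leaf_for(shard_key).rows.append(row)

    def count_where(self, predicate):
        # Scatter the query to every leaf, then merge the partial counts.
        return sum(sum(1 for row in leaf.rows if predicate(row)) for leaf in self.leaves)

leaves = [Leaf(), Leaf(), Leaf()]
aggregator = Aggregator(leaves)
for session_id, bitrate in [("s1", 3000), ("s1", 2000), ("s2", 4000), ("s3", 1000)]:
    aggregator.insert(session_id, {"session_id": session_id, "bitrate_kbps": bitrate})

high_bitrate_events = aggregator.count_where(lambda row: row["bitrate_kbps"] >= 2000)
```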
![Page 17: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/17.jpg)
Isolated Analysts and Ingest Aggregators
![Page 18: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/18.jpg)
Achievements In Utilizing MemSQL
• Complex queries in milliseconds
• Fault-tolerant Storm bolt state
• Joins now available outside of Storm bolts
• Foreign key shards
• Complex data streams
• Dynamic alters without locks or down time
• JSON type
![Page 19: Real-time Stream Processing Architecture for Comcast IP Video Strata Conference + Hadoop World 2013 Chris Lintz Gabriel Commeau](https://reader030.vdocuments.net/reader030/viewer/2022032705/56649dbf5503460f94ab30a8/html5/thumbnails/19.jpg)
Wrapping Up
o Real-time at Comcast scale
  • Millions of video players
  • Horizontal scale everywhere
  • Aggregated metrics across US and complex analysis
  • Real-time API
o Builds foundation
  • Advanced real-time analytics
  • Better platform for innovation
    – Alerts on complex objects
    – Supplemental real-time data back to clients
    – Popularity-based CDN