hadoop @ facebook - qcon tokyo 2016...
TRANSCRIPT
![Page 1: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/1.jpg)
Hadoop @ Facebook
Usage, Issues, Solutions & Beyond
Joydeep Sen Sarma
![Page 2: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/2.jpg)
About Me Facebook: Ran/Managed Hadoop ~ 3 years
Co-Author Hive
Mentor/PM Hadoop Fair-Scheduler
Used Hadoop/Hive (as Warehouse/ETL Dev)
Re-wrote significant chunks of Hadoop Job Scheduling (incl. Corona)
Qubole: Running World’s largest Hadoop and Presto clusters on AWS
2007
2014
![Page 3: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/3.jpg)
Hadoop @ FB
2007-2011
![Page 4: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/4.jpg)
Use Cases
• Reporting (SQL)
- Complex ETL (Joins)
- Simple Summaries (Counters)
• Index Generation (Map-Reduce/Java)
• Data Mining (C++)
• Adhoc Queries (SQL)
• Data Driven Applications (PYMK)
![Page 5: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/5.jpg)
Friend Map
![Page 6: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/6.jpg)
Success Factors: Hadoop
• Scales Linearly, Centrally Managed
• Operationally Simple (15 people – 25 PB)
• Can do almost anything ..
![Page 7: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/7.jpg)
Success Factors: Hive
• SQL is easy (add scripts for map-reduce)
• Browser Based Interface
![Page 8: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/8.jpg)
Success Factors: Scribe
• Log data using Scribe from any application
• Simple to add attributes to user page views
• JSON encoded logging allows schema evolution
![Page 9: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/9.jpg)
Issues
![Page 10: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/10.jpg)
#1: Hadoop Scheduling
• “Have you no Hadoop Etiquettes?” (c 2007) (reducer count capped in response)
• User takes down entire Cluster (OOM) (c 2007-09)
• Bad Job slows down entire Cluster (c 2009)
• Steady State Latencies get intolerable (c 2010-)
• ”How do I know I am getting my fair share?” (c 2011)
• “Too few reducer slots, cluster idle” (c 2013)
![Page 11: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/11.jpg)
#2: Hive Latency
• Queries take at least 30s
- Users expect MySql like latency
• Hive is too slow for BI tools like Tableau
• Difficult to write/test complex SQL applications
• Not possible to do drill downs into detail data
- No indices and latency too high
![Page 12: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/12.jpg)
#3: Real Time Metrics
• Nightly or hourly aggregations not enough
- Need Real-Time
- Driven by Advertiser requirements
• MySql not great to store long-tail summaries
![Page 13: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/13.jpg)
#3: Cluster Too Big
• Single HDFS runs into space/power/network limits
• Hard to do Queries across Data Centers
#4: Not fit for all applications
• In-memory grid better for Graph applications
![Page 14: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/14.jpg)
Solutions
![Page 15: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/15.jpg)
Hadoop Scheduling - Isolation
• Hadoop Fair-Scheduler
- Fix Preemption and Speculation
• Separate clusters for Production Pipelines
• Platinum, Gold, Silver
• Prevent memory over-consumption
- Slaves, JobTracker, NameNode
![Page 16: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/16.jpg)
Hadoop Scheduling - Corona
• Overlaps with Apache YARN
• Separate Cluster Resource Allocation
- from Map-Reduce Task Scheduling
• Push Tasks to Slaves
- Reduced Latency and Idle Time
• Highly Concurrent Design
- Optimistic Locking
- Fast and Scales to thousands of nodes
![Page 17: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/17.jpg)
Fast Queries – Presto! Hive Presto
Uses Hadoop MR for execution Pipelined execution model
Spills intermediate data to FS Intermediate data in memory
Can tolerate failures Does not tolerate failures
Automatic join ordering User-specified join ordering
Can handle joins of two large tables
One table needs to fit in memory
Supports grouping sets Does not support GS
Plug in custom code Cannot plug in custom code
More data types Limited data types
![Page 18: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/18.jpg)
Presto vs Hive Perf
- Presto is 2.5-7x faster
- Some queries run out of memory
![Page 19: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/19.jpg)
Other FB Projects
• Calligraphus (Scribe++)
• Puma
- Real-Time Processing
- Hbase for storage
• Apache Giraph
- In-Memory graph computations
![Page 20: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/20.jpg)
Scaling Issues – Cloud!
• Easily start large Hadoop clusters
- Isolate large ad-hoc workloads from Production
- Qubole auto-scales Hadoop Up/Down
• AWS S3 is much easier to use than HDFS
- also: GCE and Azure Storage
- Use HDFS as Cache (Qubole Columnar Cache)
![Page 21: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/21.jpg)
insert overwrite table dest
select … from ads join
campaigns on …group by …;
21
Auto-Scaling
StarCluster
Map Tasks
ReduceTasks
Demand
Supply
AWS / GCE
Progress
Master
Slaves
Job Tracker
![Page 22: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/22.jpg)
Hive Latency - Qubole
• Works with Small Files!
– Faster Split Computation (8x)
–Prefetching S3 files (30%)
• Direct writes to S3
–HIVE-1620
• Multi-Tenant Hive Server
• JVM Reuse!
– Fix re-entrancy issues
–1.2-2x speedup
• Columnar Cache
–Use HDFS as cache for S3
–Upto 5x faster for JSON data
![Page 24: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/24.jpg)
Appendix
![Page 25: Hadoop @ Facebook - QCon Tokyo 2016 …qcontokyo.com/tokyo-2014/data_2014/SS2-3_JoydeepSenSarma.pdf · Scaling Issues – Cloud! •Easily start large Hadoop clusters -Isolate large](https://reader031.vdocuments.net/reader031/viewer/2022030711/5afa20267f8b9aff288e1267/html5/thumbnails/25.jpg)
In Production @