TRANSCRIPT
The Past, Present, and Future of Hadoop @ LinkedIn
Carl Steinbach
Senior Staff Software Engineer
Data Analytics Infrastructure Group
LinkedIn
The (Not So) Distant Past
PYMK (People You May Know)
First version implemented in 2006
• 6-8 million members
• Ran on Oracle (foreshadowing!)
• Found various overlaps: school, work, etc.
• Used common connections: triangle closing
Triangle Closing
[Diagram: Mary and Steve are each connected to Dave; a "?" edge between Mary and Steve marks the suggested connection.]
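The triangle-closing idea above can be sketched as a friend-of-friend count. This is a toy illustration, not PYMK's actual implementation; the class, method, and data structures are invented for the example:

```java
import java.util.*;

// Hypothetical sketch of triangle closing (names and API invented for
// illustration): if Mary and Steve are both connected to Dave but not to
// each other, each becomes a candidate suggestion for the other, ranked
// by the number of common connections.
public class TriangleClosing {
    // graph: member -> set of direct connections (assumed symmetric)
    public static Map<String, Integer> suggest(Map<String, Set<String>> graph, String member) {
        Map<String, Integer> candidates = new HashMap<>();
        for (String friend : graph.getOrDefault(member, Set.of())) {
            for (String fof : graph.getOrDefault(friend, Set.of())) {
                if (fof.equals(member)) continue;               // skip self
                if (graph.get(member).contains(fof)) continue;  // already connected
                candidates.merge(fof, 1, Integer::sum);         // one more common connection
            }
        }
        return candidates; // candidate -> common-connection count
    }

    public static void main(String[] args) {
        Map<String, Set<String>> graph = Map.of(
            "Mary",  Set.of("Dave"),
            "Steve", Set.of("Dave"),
            "Dave",  Set.of("Mary", "Steve"));
        // Steve is suggested to Mary via one common connection (Dave)
        System.out.println(suggest(graph, "Mary"));
    }
}
```

The real PYMK computation at this scale is a distributed join over the connection graph, but the per-member logic reduces to this friend-of-friend counting.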
PYMK Problems
By 2008, 40-50 million members
• Still running on Oracle
• Failed often
• Infrequent data refresh: 6 weeks to 6 months!
Humble Beginnings Back in ‘08
Success! (circa 2009)
• Apache Hadoop 0.20
• 20-node cluster (repurposed hardware)
• PYMK in 3 days!
The Present
Hadoop @ LinkedIn Circa 2016
• > 10 clusters
• > 10,000 nodes
• > 1,000 users
• Thousands of workflows, datasets, and ad-hoc queries
• MR, Pig, Hive, Gobblin, Cubert, Scalding, Tez, Spark, Presto, …
Two Types of Scaling Challenges
Machines
People and Processes
Scaling Machines
Some Tough Talk About HDFS
Conventional wisdom holds that HDFS:
• Scales to > 4k nodes without federation*
• Scales to > 8k nodes with federation*
What's been our experience?
• Many Apache releases won't scale past a couple thousand nodes
• Vendor distros usually aren't much better
Why?
• Scale testing happens after the release, not before
• Most vendors have only a handful of customers with clusters larger than 1k nodes
* Heavily dependent on NN RPC workload, block size, average file size, average container size, etc.
March 2015 Was Not a Good Month
What Happened?
• We rapidly added 500 nodes to a 2,000-node cluster (don't do this!)
• NameNode RPC queue length and wait time skyrocketed
• Jobs crawled to a halt
What Was the Cause?
• A subtle performance/scale regression was introduced upstream
• The bug was included in multiple releases
• It increased the time to allocate a new file
• The more nodes you had, the worse it got
How We Used to Do Scale Testing
1. Deploy the release to a small cluster (num_nodes = 100)
2. See if anything breaks
3. If no, then deploy to the next largest cluster and go to step 2
4. If yes, figure out what went wrong and fix it
Problems with this approach:
• Expensive: developer time + hardware
• Risky: sometimes you can't roll back!
• Doesn't always work: overlooks non-linear regressions
HDFS Dynamometer
• Scale testing and performance investigation tool for HDFS
• High fidelity in all the dimensions that matter
• Focused on the NameNode
• Completely black-box
• Accurately fakes thousands of DNs on a small fraction of the hardware
• More details in a forthcoming blog post
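The talk defers Dynamometer's internals to the forthcoming blog post. As a loose illustration of the stated idea (faking thousands of DataNodes on a fraction of the hardware), this hypothetical sketch drives many DataNode identities from a single process, so NameNode-facing load scales with simulated identities rather than physical hosts. All names here are invented:

```java
import java.util.*;
import java.util.function.Consumer;

// Hypothetical sketch, not Dynamometer's actual code: a DataNode that only
// needs to exercise the NameNode's RPC paths can report fabricated blocks
// instead of storing real data, so one host can impersonate many DataNodes.
public class SimulatedDataNodes {
    // Send one heartbeat per simulated DataNode identity; returns count sent.
    static int heartbeatAll(List<String> dnIds, Consumer<String> rpcToNameNode) {
        for (String id : dnIds) {
            rpcToNameNode.accept(id); // stand-in for the real heartbeat RPC
        }
        return dnIds.size();
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 1000; i++) ids.add("dn-" + i); // 1000 fake DNs, one host
        int sent = heartbeatAll(ids, id -> { /* would call the NameNode here */ });
        System.out.println(sent + " heartbeats sent from one process");
    }
}
```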
Scaling People and Processes
Hadoop Performance Tuning
Why Are Most User Jobs Poorly Tuned?
• Too many dials!
• Lots of frameworks: each one is slightly different
• Performance can change over time
• Tuning requires constant monitoring and maintenance!
* Tuning decision tree from "Hadoop in Practice"
Dr Elephant: Running Light Without Overbyte
Automated Performance Troubleshooting for Hadoop Workflows
● Detects common MR and Spark pathologies:
○ Mapper Data Skew
○ Reducer Data Skew
○ Mapper Input Size
○ Mapper Speed
○ Reducer Time
○ Shuffle & Sort
○ More!
● Explains the cause of the disease
● Guided treatment process
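Dr Elephant's real heuristics live in its repository; as a toy sketch of the idea behind a data-skew check (threshold, class, and method names invented here), compare the largest task's input against the mean task input:

```java
// Minimal sketch of a data-skew heuristic (names and threshold invented;
// see github.com/linkedin/dr-elephant for the real implementations): if a
// few tasks process far more bytes than the average task, the job's runtime
// is dominated by stragglers, and the key distribution or partitioning
// likely needs attention.
public class SkewHeuristic {
    // true if the largest task's input exceeds `factor` times the mean
    static boolean isSkewed(long[] taskInputBytes, double factor) {
        long max = 0, sum = 0;
        for (long b : taskInputBytes) {
            max = Math.max(max, b);
            sum += b;
        }
        double mean = (double) sum / taskInputBytes.length;
        return max > factor * mean;
    }

    public static void main(String[] args) {
        long[] balanced = {100, 110, 90, 105};  // MB per mapper, roughly even
        long[] skewed   = {100, 110, 90, 2000}; // one mapper doing 20x the work
        System.out.println(isSkewed(balanced, 3.0)); // false
        System.out.println(isSkewed(skewed, 3.0));   // true
    }
}
```

The value of packaging checks like this in a service is the automation the slide describes: every completed job gets scanned, so users don't have to know which dials to look at.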
Dr Elephant is Now Open Source
• Grab the source code: github.com/linkedin/dr-elephant
• Read the blog post: engineering.linkedin.com/blog
Upgrades are Hard
A totally fictional story:
• The Hadoop team pushes a new Pig upgrade
• The next day thirty flows fail with ClassNotFoundExceptions
• Angry users riot
• Property damage exceeds $30mm
What happened? The flows depended on a third-party UDF that depended on a transitive dependency provided by the old version of Pig, but not the new version.
Bringing Shading Out of the Shadows
What most people think it is:
• Package an artifact and all of its dependencies in the same JAR, and rename some or all of the package names
What it really is:
• Static linking for Java
• Unfairly maligned by many people
We built an improved Gradle plugin that makes shading easier for inexperienced users.
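The talk doesn't name LinkedIn's improved plugin. As a generic illustration of what shading configuration looks like in Gradle, the widely used open-source Shadow plugin relocates packages like this (version number illustrative):

```groovy
plugins {
    id 'java'
    id 'com.github.johnrengelman.shadow' version '7.1.2'
}

shadowJar {
    // Bundle dependencies into one JAR and rename ("relocate") their
    // packages so they cannot collide with the versions Pig or Hadoop
    // provide on the cluster classpath.
    relocate 'com.google.common', 'myapp.shaded.com.google.common'
}
```

The relocation step is what makes shading "static linking for Java": the flow carries its own renamed copy of the dependency, so a platform upgrade cannot pull it out from under the UDF.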
Byte-Ray: "X-Ray Goggles for JAR Files"
• Audit Hadoop flows for incompatible and unnecessary dependencies
• Predict failures before they happen by scanning for dependencies that won't be satisfied post-upgrade
• Proved extremely useful during the Hadoop 2 migration
Byte-Ray in Action
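Byte-Ray itself isn't shown in the transcript. As a toy illustration of the core check the slide describes (class and method names invented), subtract the classes the post-upgrade platform will provide from the classes a flow's JARs reference; anything left over is a future ClassNotFoundException:

```java
import java.util.*;

// Toy illustration of a dependency audit (names invented; Byte-Ray's actual
// analysis works on JAR bytecode): compare a flow's referenced classes
// against the classes the cluster will provide after an upgrade.
public class DependencyAudit {
    // Returns the referenced classes that nothing will provide post-upgrade.
    static Set<String> unsatisfied(Set<String> referenced, Set<String> providedAfterUpgrade) {
        Set<String> missing = new TreeSet<>(referenced);
        missing.removeAll(providedAfterUpgrade);
        return missing;
    }

    public static void main(String[] args) {
        Set<String> referenced = Set.of(
            "org.apache.pig.EvalFunc",       // direct Pig dependency
            "com.example.json.JsonParser");  // transitive dep the old Pig happened to ship
        Set<String> newPlatform = Set.of("org.apache.pig.EvalFunc");
        // [com.example.json.JsonParser] -> will fail after the upgrade
        System.out.println(unsatisfied(referenced, newPlatform));
    }
}
```

Running this scan across every flow before the upgrade, rather than after the riots, is the "predict failures before they happen" claim on the previous slide.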
SoakCycle: Real World Integration Testing
The Future?
Dali
• 2015 was the year of the table
• We want to make 2016 the year of the view
• Learn more at the Dali talk tomorrow
©2014 LinkedIn Corporation. All Rights Reserved.