Tailored for Spark
Hadoop Summit Dublin 2016
Petr Igrevski, John Scheibmeir
eBay
eBay - Tailored for Spark 2
How to tailor Spark for maximum impact
1. Optimal infrastructure
2. Customized user experience
Outline
1. eBay, Analytics, Hadoop, and Spark
2. Spark Opportunities at eBay
3. Q/A
BACKGROUND
eBay
Q4 2015
Analytics at eBay
Analytics
BI
Kylin MicroStrategy Tableau R / SAS
ETL
Ab Initio
Data Platform
Hadoop Teradata
Streaming Spark
Hadoop at eBay
1. Search Index
2. Log Management
3. Operation Metric Management
4. Analytics
Hadoop Hardware
Multiple Generations
12-18 Cores
72-128GB RAM
24-72TB Storage
Provisioned by cabinet
Spark at eBay
• Uses
  – Spark 1.4 to Spark 1.6
• Methods
  – YARN
• Current utilization
  – 20% analytic clusters
• Use Cases
  – Purchase Suggestions
  – Marketing Optimization
  – Customer Interests, Consistency, and Similarity
  – Kylin Cube Building
Spark Challenges
• Capacity Management and Efficiency
  – MapReduce => YARN
  – Job Sizing
• Support
  – Missing vendor support
  – Missing expertise
• Deployment
  – Library conflicts
  – Configuration challenges
  – Distribution sprawl
• Integration
  – Configuration
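The job-sizing challenge above can be made concrete with a little arithmetic. The following is a minimal sketch, not eBay's method: it fits Spark executors into one YARN node using the node sizes from the Hadoop Hardware slide (12-18 cores, 72-128GB RAM). The function name `plan_executors`, the four cores per executor, and the cores and memory reserved for the OS and Hadoop daemons are all assumptions for illustration. The overhead rule (10% of executor memory, at least 384MB) is assumed to match the default of `spark.yarn.executor.memoryOverhead` in Spark 1.6.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExecutorPlan:
    executors_per_node: int
    cores_per_executor: int
    executor_memory_mb: int  # candidate value for spark.executor.memory
    overhead_mb: int         # off-heap headroom YARN must grant on top


def plan_executors(node_cores, node_mem_gb, cores_per_executor=4,
                   reserved_cores=2, reserved_mem_gb=8,
                   overhead_fraction=0.10, min_overhead_mb=384):
    """Fit equally sized executors into one node (all defaults are assumptions)."""
    usable_cores = node_cores - reserved_cores
    executors = usable_cores // cores_per_executor
    if executors < 1:
        raise ValueError("node too small for the requested executor shape")

    # Each executor gets an equal share of the memory left after the reserve.
    container_mb = (node_mem_gb - reserved_mem_gb) * 1024 // executors

    # The container must hold heap + overhead, where overhead is
    # max(min_overhead_mb, overhead_fraction * heap).
    heap_mb = int(container_mb / (1 + overhead_fraction))
    overhead_mb = max(min_overhead_mb, container_mb - heap_mb)
    heap_mb = container_mb - overhead_mb
    return ExecutorPlan(executors, cores_per_executor, heap_mb, overhead_mb)


if __name__ == "__main__":
    # Smallest and largest node generations from the hardware slide.
    for cores, mem_gb in [(12, 72), (18, 128)]:
        print(cores, mem_gb, plan_executors(cores, mem_gb))
```

Under these assumptions a 12-core, 72GB node holds two 4-core executors and an 18-core, 128GB node holds four. With multiple hardware generations in one cluster, an executor shape has to fit the smallest generation or be chosen per queue.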
TAILORING SPARK
Simple things should be simple. Complex things should be possible.
Alan Kay
We can
• Copy
• Test
• Run
Opportunities for Spark
• Flexibility
• Usability
• Simplicity
• Speed
• Transparency
On YARN
• Security
• Multitenancy
• Reliability
• Experience
• Performance
[Diagram: YARN and Spark over HDFS, HDFS, SWIFT, and NFS storage, with Kerberos]
Does it fit?
• Compute
• Storage
• Network
• Provisioning
Shared Compute resources
Independently scalable storage
Flat Network
Can we make it feel better?
• Standard ADLC
• Test to your level of comfort
• Single click deployment
• Watch every step
• Certify your job
• Let it run
• Did you say UI?
[Diagram: lifecycle stages Development, Test, Packaging, Certification, Runtime; components Register, Repos, CI, Metadata DB, Provisioning, Runtime farm, Orchestrator]
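The lifecycle stages named on this slide (Development, Test, Packaging, Certification, Runtime) can be read as an ordered sequence of gates a job passes before it is left to run. The sketch below models only that ordering. The class `JobLifecycle`, its `promote` method, and the idea that each stage advances on a single pass/fail check are hypothetical and are not taken from the talk.

```python
from enum import IntEnum


class Stage(IntEnum):
    # Stage names come from the slide; the numeric order is an assumption.
    DEVELOPMENT = 0
    TEST = 1
    PACKAGING = 2
    CERTIFICATION = 3
    RUNTIME = 4


class JobLifecycle:
    """Hypothetical tracker that moves a job through the stages in order."""

    def __init__(self, name):
        self.name = name
        self.stage = Stage.DEVELOPMENT
        self.history = [Stage.DEVELOPMENT]  # every stage reached, in order

    def promote(self, checks_passed):
        """Advance one stage if the current stage's checks passed."""
        if self.stage is Stage.RUNTIME:
            raise ValueError(f"{self.name} is already in RUNTIME")
        if not checks_passed:
            return self.stage  # stay put until the checks pass
        self.stage = Stage(self.stage + 1)
        self.history.append(self.stage)
        return self.stage


if __name__ == "__main__":
    job = JobLifecycle("purchase-suggestions")
    job.promote(True)   # Development -> Test
    job.promote(False)  # failed tests: the job stays in Test
    print(job.stage.name, [s.name for s in job.history])
```

Keeping the full history supports the "Watch every step" bullet: a UI can show which gates a job has cleared and where it is waiting.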
Q/A