Apache Spark and Object Stores — for London Spark User Group


Page 1: Apache Spark and Object Stores —for London Spark User Group

1 © Hortonworks Inc. 2011 – 2017 All Rights Reserved

Apache Spark and Object Stores — What you need to know
Steve Loughran, [email protected], @steveloughran

March 2017

Page 2: Apache Spark and Object Stores —for London Spark User Group


Why Care?

⬢ It's the only persistent store for in-cloud clusters
⬢ Affordable cloud applications: agile clusters
⬢ Object stores are the source and final destination of work
⬢ + Asynchronous data exchange
⬢ + External data sources

Page 3: Apache Spark and Object Stores —for London Spark User Group


[Diagram: "Elastic ETL" use case; labels: inbound, external, HDFS, ORC/Parquet datasets]

Page 4: Apache Spark and Object Stores —for London Spark User Group


[Diagram: "Notebooks" use case; labels: external, datasets, library]

Page 5: Apache Spark and Object Stores —for London Spark User Group


Streaming

Page 6: Apache Spark and Object Stores —for London Spark User Group


A Filesystem: Directories, Files, Data

/
└─ work
   ├─ pending
   │  ├─ part-00
   │  └─ part-01
   └─ complete

rename("/work/pending/part-01", "/work/complete")

Page 7: Apache Spark and Object Stores —for London Spark User Group


Object Store: hash(name) -> blob

hash("/work/pending/part-01") -> ["s02", "s03", "s04"]
hash("/work/pending/part-00") -> ["s01", "s02", "s04"]

"rename" has to be emulated as copy + delete:

copy("/work/pending/part-01", "/work/complete/part01")
delete("/work/pending/part-01")

Page 8: Apache Spark and Object Stores —for London Spark User Group


REST APIs

HEAD /work/complete/part-01

PUT /work/complete/part01
x-amz-copy-source: /work/pending/part-01

DELETE /work/pending/part-01

PUT /work/pending/part-01
... DATA ...

GET /work/pending/part-01
Content-Length: 1-8192

GET /?prefix=/work&delimiter=/
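For comparison, a minimal sketch of what a "rename" becomes when driven through the AWS SDK against those REST calls; the bucket name is made up and error handling is omitted:

import com.amazonaws.services.s3.AmazonS3Client

val s3 = new AmazonS3Client()   // credentials resolved from the default provider chain
val bucket = "example-bucket"

// "rename" = copy the blob to its new key, then delete the original:
// two non-atomic operations, and the copy takes time proportional to the object size
s3.copyObject(bucket, "work/pending/part-01", bucket, "work/complete/part01")
s3.deleteObject(bucket, "work/pending/part-01")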

Page 9: Apache Spark and Object Stores —for London Spark User Group


Often: Eventually Consistent

DELETE /work/pending/part-00 -> 200
GET /work/pending/part-00 -> 200
GET /work/pending/part-00 -> 200

(the deleted object can stay visible for a while)

Page 10: Apache Spark and Object Stores —for London Spark User Group


org.apache.hadoop.fs.FileSystem

hdfs | s3a | wasb | adl | swift | gs

Page 11: Apache Spark and Object Stores —for London Spark User Group


History of Object Storage Support, 2006–2017

s3://    "inode on S3"
s3n://   "Native" S3
s3://    Amazon EMR S3
swift:// OpenStack
wasb://  Azure WASB
s3a://   replaces s3n
adl://   Azure Data Lake
oss://   Aliyun
gs://    Google Cloud

Phase I: Stabilize S3A
Phase II: speed & scale
Phase III: scale & consistency

Page 12: Apache Spark and Object Stores —for London Spark User Group


Four Challenges with Object Stores

1. Classpath

2. Credentials

3. Code

4. Commitment

Page 13: Apache Spark and Object Stores —for London Spark User Group


Azure Storage: wasb://
Consistent, with locking and fast COPY. A full substitute for HDFS.

Page 14: Apache Spark and Object Stores —for London Spark User Group


Classpath: fix “No FileSystem for scheme: wasb”

hadoop-azure-2.7.x.jar
azure-storage-2.2.0.jar
+ (jackson-core, httpcomponents, hadoop-common)

Page 15: Apache Spark and Object Stores —for London Spark User Group


Credentials: core-site.xml / spark-default.conf

<property>
  <name>fs.azure.account.key.example.blob.core.windows.net</name>
  <value>0c0d44ac83ad7f94b0997b36e6e9a25b49a1394c</value>
</property>

spark.hadoop.fs.azure.account.key.example.blob.core.windows.net 0c0d44ac83ad7f94b0997b36e6e9a25b49a1394c

wasb://[email protected]

NEVER: share, check in to SCM, paste in bug reports…
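The same key can also be wired up programmatically; a minimal sketch, assuming the secret lives in an environment variable (AZURE_STORAGE_KEY is a made-up name) rather than in a file:

import org.apache.spark.sql.SparkSession

// spark.hadoop.* options are copied into the Hadoop configuration,
// so the wasb:// connector sees the account key at runtime
val spark = SparkSession.builder()
  .appName("wasb-demo")
  .config("spark.hadoop.fs.azure.account.key.example.blob.core.windows.net",
    sys.env("AZURE_STORAGE_KEY"))   // keeps the secret out of files and source control
  .getOrCreate()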

Page 16: Apache Spark and Object Stores —for London Spark User Group


Code: Azure Storage and Streaming

import org.apache.spark.streaming.{Seconds, StreamingContext}

val streaming = new StreamingContext(sparkConf, Seconds(10))
val azure = "wasb://[email protected]/in"
val lines = streaming.textFileStream(azure)
val matches = lines.map { line =>
  println(line)
  line
}
matches.print()
streaming.start()

* PUT into the streaming directory
* keep that directory clean
* size window for slow scans
* checkpoints slow — reduce frequency

Page 17: Apache Spark and Object Stores —for London Spark User Group


Azure Datalake: adl://

HDInsight & Hadoop 2.8

hadoop-azure-datalake-2.8.0.jar
azure-data-lake-store-sdk-2.1.jar
okhttp-2.4.0.jar
okio-1.4.0.jar

Page 18: Apache Spark and Object Stores —for London Spark User Group


Amazon S3: use the S3A connector
(EMR: use Amazon's s3://)

Page 19: Apache Spark and Object Stores —for London Spark User Group


Classpath: “No FileSystem for scheme: s3a”

hadoop-aws-2.7.x.jar

aws-java-sdk-1.7.4.jar
joda-time-2.9.3.jar
(jackson-*-2.6.5.jar)

See SPARK-7481

Get Spark with Hadoop 2.7+ JARs

Page 20: Apache Spark and Object Stores —for London Spark User Group


Credentials

core-site.xml or spark-default.conf:

spark.hadoop.fs.s3a.access.key MY_ACCESS_KEY
spark.hadoop.fs.s3a.secret.key MY_SECRET_KEY

spark-submit propagates Environment Variables

export AWS_ACCESS_KEY=MY_ACCESS_KEY
export AWS_SECRET_KEY=MY_SECRET_KEY

NEVER: share, check in to SCM, paste in bug reports…
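If the keys do have to be injected from inside a job, one option is to copy them from the environment into the Hadoop configuration; a sketch reusing the variable names above:

val hadoopConf = spark.sparkContext.hadoopConfiguration
hadoopConf.set("fs.s3a.access.key", sys.env("AWS_ACCESS_KEY"))
hadoopConf.set("fs.s3a.secret.key", sys.env("AWS_SECRET_KEY"))
// never log or print these values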

Page 21: Apache Spark and Object Stores —for London Spark User Group


Authentication Failure: 403

com.amazonaws.services.s3.model.AmazonS3Exception: The request signature we calculated does not match the signature you provided. Check your key and signing method.

1. Check joda-time.jar & JVM version
2. Credentials wrong
3. Credentials not propagating
4. Local system clock (more likely on VMs)

Page 22: Apache Spark and Object Stores —for London Spark User Group


Code: just use the URL of the object store

val csvData = spark.read.options(Map(
    "header" -> "true",
    "inferSchema" -> "true",
    "mode" -> "FAILFAST"))
  .csv("s3a://landsat-pds/scene_list.gz")

Page 23: Apache Spark and Object Stores —for London Spark User Group


DataFrames

val landsat = "s3a://stevel-demo/landsat"
csvData.write.parquet(landsat)

val landsatOrc = "s3a://stevel-demo/landsatOrc"
csvData.write.orc(landsatOrc)

val df = spark.read.parquet(landsat)
val orcDf = spark.read.orc(landsatOrc)

* list inconsistency
* commit time O(data)

Page 24: Apache Spark and Object Stores —for London Spark User Group


Finding dirty data with Spark SQL

val sqlDF = spark.sql(
  "SELECT id, acquisitionDate, cloudCover" +
  s" FROM parquet.`${landsat}`")

val negativeClouds = sqlDF.filter("cloudCover < 0")
negativeClouds.show()

* filter columns and data early
* whether/when to cache()?
* copy popular data to HDFS (see the sketch below)
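A minimal sketch of that last bullet, caching a hot S3 dataset on HDFS before running repeated queries over it (the hdfs:// path is illustrative):

// read once from S3, write to HDFS, then point later queries at the HDFS copy
spark.read.parquet(landsat)
  .write.mode("overwrite").parquet("hdfs:///cache/landsat")
val cachedLandsat = spark.read.parquet("hdfs:///cache/landsat")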

Page 25: Apache Spark and Object Stores —for London Spark User Group


spark-default.conf

spark.sql.parquet.filterPushdown true
spark.sql.parquet.mergeSchema false
spark.hadoop.parquet.enable.summary-metadata false

spark.sql.orc.filterPushdown true
spark.sql.orc.splits.include.file.footer true
spark.sql.orc.cache.stripe.details.size 10000

spark.sql.hive.metastorePartitionPruning true

Page 26: Apache Spark and Object Stores —for London Spark User Group


Notebooks? Classpath & Credentials

Page 27: Apache Spark and Object Stores —for London Spark User Group


Hadoop 2.8 transforms I/O performance!

// forward seek by skipping stream
spark.hadoop.fs.s3a.readahead.range 256K

// faster backward seek for ORC and Parquet input
spark.hadoop.fs.s3a.experimental.input.fadvise random

// PUT blocks in separate threads
spark.hadoop.fs.s3a.fast.output.enabled true

Page 28: Apache Spark and Object Stores —for London Spark User Group


The Commitment Problem

⬢ rename() used for the atomic commitment transaction
⬢ time to copy() + delete() is O(data * files)
⬢ S3: 6+ MB/s

spark.speculation false
spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version 2
spark.hadoop.mapreduce.fileoutputcommitter.cleanup.skipped true
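The same mitigations applied when building the session, as a sketch:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .config("spark.speculation", "false")   // speculation is risky when commits are not atomic
  .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
  .config("spark.hadoop.mapreduce.fileoutputcommitter.cleanup.skipped", "true")
  .getOrCreate()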

Page 29: Apache Spark and Object Stores —for London Spark User Group


Page 30: Apache Spark and Object Stores —for London Spark User Group


S3Guard: fast, consistent S3 metadata
HADOOP-13445
Hortonworks + Cloudera + Western Digital

Page 31: Apache Spark and Object Stores —for London Spark User Group


DynamoDB as fast, consistent metadata store

[Diagram: the same object store shards, now fronted by a DynamoDB metadata table; DELETE, HEAD and PUT requests for part-00 get consistent 200/404 answers]

Page 32: Apache Spark and Object Stores —for London Spark User Group


Page 33: Apache Spark and Object Stores —for London Spark User Group


Netflix Staging Committer

1. Saves output to file://

2. Task commit: upload to S3A as multipart PUT —but does not commit the PUT, just saves the information about it to hdfs://

3. Normal commit protocol manages task and job data promotion

4. Final Job committer reads pending information and generates final PUT —possibly from a different host

Outcome:

⬢ No visible overwrite until final job commit: resilience and speculation

⬢ Task commit time = data/bandwidth
⬢ Job commit time = POST * #files
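A hypothetical sketch of the underlying two-phase idea, written directly against the AWS SDK rather than the committer itself; bucket, key and file names are placeholders:

import java.io.File
import scala.collection.JavaConverters._
import com.amazonaws.services.s3.AmazonS3Client
import com.amazonaws.services.s3.model._

val s3 = new AmazonS3Client()
val (bucket, key) = ("example-bucket", "work/complete/part-01")

// task commit: upload the locally buffered file as a multipart PUT,
// but do NOT complete it; record the upload id and etags somewhere durable
val init = s3.initiateMultipartUpload(new InitiateMultipartUploadRequest(bucket, key))
val part = s3.uploadPart(new UploadPartRequest()
  .withBucketName(bucket)
  .withKey(key)
  .withUploadId(init.getUploadId)
  .withPartNumber(1)
  .withFile(new File("/tmp/part-01")))

// job commit (possibly on a different host): read the pending upload info back
// and issue the final completion; only now does the object become visible
s3.completeMultipartUpload(new CompleteMultipartUploadRequest(
  bucket, key, init.getUploadId, List(part.getPartETag).asJava))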

Page 34: Apache Spark and Object Stores —for London Spark User Group


Demo!

Page 35: Apache Spark and Object Stores —for London Spark User Group


Summary

⬢ Object Stores look just like any other Filesystem URL

⬢ …but do need classpath and configuration

⬢ Issues: performance, commitment

⬢ Tune to reduce I/O

⬢ Keep those credentials secret!

Finally: keep an eye out for S3Guard!

Page 36: Apache Spark and Object Stores —for London Spark User Group


Questions?

Page 37: Apache Spark and Object Stores —for London Spark User Group


Backup Slides

Page 38: Apache Spark and Object Stores —for London Spark User Group


S3 Encryption

⬢ SSE-S3: each object encrypted by a unique key using the AES-256 cipher
⬢ SSE-KMS and SSE-C in Hadoop 2.9, maybe 2.8.1?
⬢ Client-side encryption: this will break a lot of code
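A minimal sketch of requesting the SSE-S3 case through the existing S3A configuration mechanism (assuming a Hadoop 2.7+ s3a client):

// ask S3 to encrypt objects written through s3a:// with SSE-S3 (AES256)
spark.sparkContext.hadoopConfiguration
  .set("fs.s3a.server-side-encryption-algorithm", "AES256")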

Page 39: Apache Spark and Object Stores —for London Spark User Group


Advanced authentication

<property>
  <name>fs.s3a.aws.credentials.provider</name>
  <value>
    org.apache.hadoop.fs.s3a.SimpleAWSCredentialsProvider,
    org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider,
    com.amazonaws.auth.EnvironmentVariableCredentialsProvider,
    com.amazonaws.auth.InstanceProfileCredentialsProvider,
    org.apache.hadoop.fs.s3a.AnonymousAWSCredentialsProvider
  </value>
</property>

+ encrypted credentials in JCEKS files on HDFS
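A sketch of that JCEKS option, assuming the keys were already stored with the hadoop credential command (the store path is illustrative):

// point the configuration at an encrypted credential store on HDFS,
// so fs.s3a.access.key / fs.s3a.secret.key never sit in plain-text files
spark.sparkContext.hadoopConfiguration.set(
  "hadoop.security.credential.provider.path",
  "jceks://hdfs@namenode/secure/aws-keys.jceks")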

Page 40: Apache Spark and Object Stores —for London Spark User Group
