Spark in YARN-managed multi-tenant clusters
Pravin Mittal ([email protected])
Rajesh Iyer ([email protected])
Spark on Azure HDInsight
Fully Managed Service
• 100% open source Apache Spark and Hadoop bits
• Latest releases of Spark
• Fully supported by Microsoft and Hortonworks
• 99.9% Azure Cloud SLA; 24/7 Managed Service
• Certifications: PCI, ISO 27018, SOC, HIPAA, EU-MC
Optimized for experimentation and development
• Jupyter Notebooks (Scala, Python, automatic data visualizations)
• IntelliJ plugin (job submission, remote debugging)
• ODBC connector for Power BI, Tableau, Qlik, SAP, Excel, etc.
Make Spark Simple - Integrated with Azure Ecosystem
• Microsoft R Server – multi-threaded math libraries and transparent parallelization mean R Server can handle up to 1000x more data at up to 50x faster speeds than open source R. Because it is based on open source R, it requires no changes to existing R scripts.
• Azure Data Lake Store – HDFS for the cloud, optimized for massive throughput, with ultra-high capacity, low latency, and secure ACL support
• Azure Data Factory orchestrates Spark ETL pipelines
• Power BI connector for Spark for rich visualization. New in Power BI is a streaming connector that lets you publish real-time events from Spark Streaming directly to Power BI.
• Event Hubs connector as a data source for Spark Streaming
• Azure SQL Data Warehouse & HBase connectors for fast & scalable storage
Jupyter-Spark Integration via Livy
• Sparkmagic is an open source library that Microsoft is incubating under the Jupyter Incubator program
• Thousands of Spark clusters in production provide feedback to further improve the experience
https://github.com/jupyter-incubator/sparkmagic
Spark Execution Model
Each Spark application is an instance of SparkContext and gets its own executor processes, which live for the lifetime of the application
Spark is agnostic to the cluster manager, as long as it can acquire executor processes that can communicate with each other
The driver program must listen for and accept incoming connections from its executors throughout its lifetime
The driver is responsible for scheduling tasks on the cluster
Why YARN as Cluster Manager?
Microsoft, Cloudera, Hortonworks, IBM, and many others are actively working to improve YARN
YARN allows you to dynamically share and centrally configure the same pool of cluster resources between all frameworks that run on YARN.
YARN is the only cluster manager for Spark that supports security. With YARN, Spark can run against Kerberized Hadoop clusters and uses secure authentication between its processes.
YARN allows us to have richer resource management policies
• Allows maximizing cluster utilization, fair resource sharing, and dynamic preemption when running multiple concurrent applications, and can provide different resource guarantees for batch and interactive workloads
[1] http://blog.cloudera.com/blog/2014/05/apache-spark-resource-management-and-yarn-app-models/
SparkSubmit starts and talks to the cluster's ResourceManager
The ResourceManager makes a single container request on behalf of the application
The ApplicationMaster starts running within that container
The ApplicationMaster then requests subsequent containers from the ResourceManager; these are allocated as Spark executors that run tasks for the application
For Spark batch applications, the ApplicationMaster and all Spark executor containers are freed when the application completes
For Spark interactive applications (with dynamic allocation enabled), Spark executors are freed after an idle timeout, but the ApplicationMaster remains until the Spark driver exits
YARN Allocation Model for Spark
https://blog.cloudera.com/blog/2015/09/untangling-apache-hadoop-yarn-part-1/
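The container flow above is what happens behind an ordinary YARN-mode submission. As a minimal sketch (the application class com.example.MyApp, the jar name, and the executor sizes are illustrative placeholders, not from the talk):

```shell
# Sketch: submit a Spark batch application to YARN in cluster mode.
# The first container runs the ApplicationMaster (and, in cluster mode,
# the driver); it then asks the ResourceManager for executor containers.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --num-executors 4 \
  --executor-memory 4g \
  --executor-cores 2 \
  --class com.example.MyApp \
  my-app.jar
```

When the job finishes, YARN frees the executor containers and the ApplicationMaster, matching the batch case described above.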
Running Spark on YARN in HDInsight
• Requirements
  • Maximize cluster utilization, i.e. reduce idle resources
  • Fair resource sharing between different Spark applications
  • Resource guarantees
Maximize cluster utilization
• Reduce allocation of idle resources
• An application should be able to use the entire cluster if necessary
• Should work with cluster scaling
• What should the number of executors be set to for any given Spark application?
  • Spark static allocation: set spark.executor.instances to a large value
  • Spark dynamic allocation: set spark.dynamicAllocation.enabled = true and spark.dynamicAllocation.maxExecutors to a large value
  • YARN capacity scheduler queue: set yarn.scheduler.capacity.<parent queue>.<child queue>.maximum-capacity to 100
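These settings map directly onto spark-submit flags. A sketch of the dynamic-allocation variant (the maxExecutors value and the class/jar names are arbitrary placeholders; dynamic allocation also requires the external shuffle service to be enabled on the NodeManagers):

```shell
# Sketch: dynamic allocation so the application can grow toward the
# whole cluster and shrink back to zero executors when idle.
spark-submit \
  --master yarn \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.dynamicAllocation.minExecutors=0 \
  --conf spark.dynamicAllocation.maxExecutors=1000 \
  --conf spark.shuffle.service.enabled=true \
  --class com.example.MyApp \
  my-app.jar
```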
Fair resource sharing
• Concurrent applications should be able to share resources
• Use separate YARN capacity scheduler queues for different Spark contexts
  • Queues are statically created
  • Allocated resources are not shared between different Spark contexts
  • Need a way to reclaim allocated resources when another Spark context comes along
• YARN preemption AND Spark dynamic allocation
  • Spark dynamic allocation gives up only idle resources
  • YARN preemption reclaims in-use resources (yarn.resourcemanager.scheduler.monitor.enable & yarn.resourcemanager.scheduler.monitor.policies)
  • YARN preemption is predictable with yarn.scheduler.capacity.resource-calculator = DefaultResourceCalculator (see YARN JIRA YARN-4390)
Fair resource sharing
• Use separate Spark resource pools within the same Spark context
  • Resource pools are dynamically created per context
  • Allocated resources are shared between different Spark jobs
  • No need to reclaim allocated resources when another Spark job comes along
• Combine the above to support concurrently running Notebook, Batch, and BI workloads in the same cluster
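Within a single Spark context, such pools are defined with Spark's fair scheduler. A sketch of a fairscheduler.xml (the pool names, weights, and scheduling modes are illustrative, not from the talk); Spark picks the file up via spark.scheduler.allocation.file, and a job is routed to a pool by setting the spark.scheduler.pool local property on the submitting thread:

```xml
<!-- Sketch: fairscheduler.xml with illustrative pools. A pool that is
     not defined here is still created on demand with default settings. -->
<allocations>
  <pool name="notebooks">
    <schedulingMode>FAIR</schedulingMode>
    <weight>2</weight>
    <minShare>2</minShare>
  </pool>
  <pool name="batch">
    <schedulingMode>FIFO</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
</allocations>
```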
Resource guarantee
• Every Spark application should be able to run immediately
• Combination of:
  • Separate YARN capacity queues, with yarn.scheduler.capacity.<parent queue>.<child queue>.capacity used to guarantee resources for different Spark applications
  • Separate Spark resource pools within the same Spark application
  • YARN preemption to ensure that in-use resources can be reclaimed
  • Spark dynamic allocation to ensure that idle resources can be reclaimed
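A sketch of the corresponding capacity-scheduler.xml queue definitions (the queue names "batch"/"interactive" and the percentages are illustrative placeholders): each queue is guaranteed its capacity but may borrow up to maximum-capacity, and preemption claws borrowed resources back when the other queue needs its guarantee:

```xml
<!-- Sketch: two child queues under root with guaranteed shares.
     Queue names and percentages are illustrative placeholders. -->
<property>
  <name>yarn.scheduler.capacity.root.queues</name>
  <value>batch,interactive</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.batch.capacity</name>
  <value>60</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.batch.maximum-capacity</name>
  <value>100</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.interactive.capacity</name>
  <value>40</value>
</property>
<property>
  <name>yarn.scheduler.capacity.root.interactive.maximum-capacity</name>
  <value>100</value>
</property>
```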
Working configuration
• Spark settings
  spark.executor.instances = <very large value>
  OR
  spark.dynamicAllocation.enabled = true
  spark.dynamicAllocation.initialExecutors = 0
  spark.dynamicAllocation.minExecutors = 0
  spark.dynamicAllocation.maxExecutors = <very large value>
• YARN settings
  yarn.resourcemanager.scheduler.monitor.enable = true
  yarn.resourcemanager.scheduler.monitor.policies = org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy
  yarn.scheduler.capacity.resource-calculator = org.apache.hadoop.yarn.util.resource.DefaultResourceCalculator
  yarn.scheduler.capacity.root.queues = default,<n queues>
  yarn.scheduler.capacity.<parent_queue>.<child_queue>.capacity
  yarn.scheduler.capacity.<parent_queue>.<child_queue>.maximum-capacity
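Putting the working configuration together, a single interactive submission might look like the following sketch (the queue name, class, and jar are placeholders; the YARN settings above live cluster-side in yarn-site.xml / capacity-scheduler.xml):

```shell
# Sketch: interactive application on its own capacity queue, scaling
# from 0 executors up to a very large cap via dynamic allocation.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --queue interactive \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.dynamicAllocation.initialExecutors=0 \
  --conf spark.dynamicAllocation.minExecutors=0 \
  --conf spark.dynamicAllocation.maxExecutors=1000 \
  --conf spark.shuffle.service.enabled=true \
  --class com.example.InteractiveApp \
  my-app.jar
```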
DEMO