Building Production Spark Streaming Applications

Joey Echeverria, Platform Technical Lead - @fwiffo
Data Day Texas 2017


Joey
- Where I work: Rocana (Platform Technical Lead)
- Where I used to work: Cloudera (11-15), NSA
- Distributed systems, security, data processing, big data


Context
- We built a system for large-scale, real-time collection, processing, and analysis of event-oriented machine data
- On prem or in the cloud, but not SaaS
- Supportability is a big deal for us:
  - Predictability of performance under load and failures
  - Ease of configuration and operation
  - Behavior in wacky environments


YMMV
- Not necessarily true for you
- Enterprise software: shipping stuff to people
- Fine-grained events: logs, user behavior, etc.
- For everything: solving the problem of enterprise-wide ops, so it's everything from everywhere from everyone for all time (until they run out of money for nodes)
- This isn't a condemnation of general-purpose search engines as much as what we had to do for our domain

Apache Spark Streaming


Spark Streaming overview
- Stream processing API built on top of the Spark execution engine
- Micro-batching:
  - Every n milliseconds, fetch records from the data source
  - Execute Spark jobs on each input batch
- DStream API:
  - Wrapper around the RDD API
  - Lets the developer think in terms of transformations on a stream of events
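As a rough, minimal sketch (not from the deck) of what the DStream API looks like in Java: the 15-second batch interval, socket source, and ERROR filter below are arbitrary placeholder choices.

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    public class MinimalDStreamApp {
      public static void main(String[] args) throws InterruptedException {
        SparkConf conf = new SparkConf().setAppName("minimal-dstream-example");
        // Micro-batch interval: fetch and process records every 15 seconds.
        JavaStreamingContext jssc = new JavaStreamingContext(conf, Durations.seconds(15));

        // Placeholder source; a production job would typically read from Kafka.
        JavaDStream<String> lines = jssc.socketTextStream("localhost", 9999);

        // Think in terms of transformations on the stream, backed by RDD operations.
        JavaDStream<String> errors = lines.filter(line -> line.contains("ERROR"));
        errors.print();

        jssc.start();
        jssc.awaitTermination();
      }
    }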


[Diagram: Input Batch -> Spark Batch Engine -> Output Batch]


Structured Streaming
- New streaming API for Spark
- Re-uses the DataFrames API for streaming
- API was too new when we started:
  - First release was an alpha
  - No Kafka support at the time
- Details won't apply, but the overall approach should be in the ballpark
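For comparison, a hedged sketch of what the DataFrames-based streaming API looks like in later Spark 2.x releases; the socket source and console sink are placeholder choices, not what the deck used.

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import org.apache.spark.sql.streaming.StreamingQuery;

    public class MinimalStructuredStreamingApp {
      public static void main(String[] args) throws Exception {
        SparkSession spark = SparkSession.builder().appName("structured-streaming-sketch").getOrCreate();

        // Streaming DataFrame from a placeholder socket source.
        Dataset<Row> lines = spark.readStream()
            .format("socket")
            .option("host", "localhost")
            .option("port", 9999)
            .load();

        // Same DataFrame operations as a batch query would use.
        Dataset<Row> counts = lines.groupBy("value").count();

        StreamingQuery query = counts.writeStream()
            .outputMode("complete")
            .format("console")
            .start();
        query.awaitTermination();
      }
    }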


Other notes
- Our experience is with Spark 1.6.2
  - 2.0.0 was released after we started our Spark integration
- We use the Apache release of Spark
  - Supports both CDH and HDP without recompiling
- We run Spark on YARN, so we're decoupled from other users on the cluster


Use Case
Real-time alerting on IT operational data


Our typical customer use cases
- >100K events/sec (8.6B events/day), sub-second end-to-end latency, full-fidelity retention, critical use cases
- Quality of service - are credit card transactions happening fast enough?
- Fraud detection - detect, investigate, prosecute, and learn from fraud
- Forensic diagnostics - what really caused the outage last Friday?
- Security - who's doing what, where, when, why, and how, and is that OK?
- User behavior - capture and correlate user behavior with system performance, then feed it to downstream systems in real time


Overall architecture
[Diagram: weirdo formats -> transformation 1 (weirdo format -> event) -> Avro events -> transformation 2 (event -> storage-specific) -> storage-specific representation of events]
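A loose illustration of the two transformation steps as DStream operations; the Event and StorageRecord types and the parsing/serialization bodies are hypothetical stand-ins, not Rocana's actual code.

    import java.io.Serializable;
    import java.nio.charset.StandardCharsets;
    import org.apache.spark.streaming.api.java.JavaDStream;

    public final class PipelineSketch {
      // Hypothetical stand-ins for Avro events and the storage-specific representation.
      public static class Event implements Serializable { public String body; }
      public static class StorageRecord implements Serializable { public byte[] bytes; }

      // Transformation 1: weirdo format -> event (parsing is a placeholder).
      public static JavaDStream<Event> toEvents(JavaDStream<String> raw) {
        return raw.map(line -> { Event e = new Event(); e.body = line; return e; });
      }

      // Transformation 2: event -> storage-specific representation.
      public static JavaDStream<StorageRecord> toStorage(JavaDStream<Event> events) {
        return events.map(e -> {
          StorageRecord r = new StorageRecord();
          r.bytes = e.body.getBytes(StandardCharsets.UTF_8);
          return r;
        });
      }
    }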


Real-time alerting
- Define aggregations, conditions, and actions
- Use cases:
  - Send me an e-mail when the number of failed login events from a user is > 3 within an hour
  - Create a ServiceNow ticket when CPU utilization spikes to > 95% for 10 minutes
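A hedged sketch of how the first use case could be expressed directly against the DStream API (the deck does not show the product's rule engine); LoginEvent, its accessors, and notifyAdmin are hypothetical.

    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaDStream;
    import org.apache.spark.streaming.api.java.JavaPairDStream;
    import scala.Tuple2;

    public final class FailedLoginAlert {
      // Hypothetical event type; eventType() and user() stand in for real field accessors.
      public interface LoginEvent extends java.io.Serializable {
        String eventType();
        String user();
      }

      // Hypothetical action; a real rule would send e-mail or open a ticket.
      static void notifyAdmin(String user, int failures) {
        System.out.printf("ALERT: %d failed logins for %s in the last hour%n", failures, user);
      }

      public static void wire(JavaDStream<LoginEvent> events) {
        // Aggregation: failed logins per user over a sliding one-hour window,
        // re-evaluated every micro-batch (assuming a 15-second batch interval).
        JavaPairDStream<String, Integer> failuresPerUser = events
            .filter(e -> "login_failure".equals(e.eventType()))
            .mapToPair(e -> new Tuple2<>(e.user(), 1))
            .reduceByKeyAndWindow((a, b) -> a + b, Durations.minutes(60), Durations.seconds(15));

        // Condition (> 3 in the window) and action (notify).
        failuresPerUser
            .filter(t -> t._2() > 3)
            .foreachRDD(rdd -> rdd.foreach(t -> notifyAdmin(t._1(), t._2())));
      }
    }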


UI


Architecture


Packaging, Deployment, and Execution


Packaging
- Application classes and dependencies
- Two options:
  - Shade all dependencies into an uber jar
    - Make sure Hadoop and Spark dependencies are marked provided
  - Submit application jars and dependent jars when submitting
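For the uber-jar option, the idea in the pom.xml is to mark Spark and Hadoop artifacts as provided so they are not shaded in; the artifact IDs and versions below are illustrative only.

    <!-- Illustrative only: artifact IDs and versions depend on your build. -->
    <dependency>
      <groupId>org.apache.spark</groupId>
      <artifactId>spark-streaming_2.10</artifactId>
      <version>1.6.2</version>
      <scope>provided</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-client</artifactId>
      <version>2.7.3</version>
      <scope>provided</scope>
    </dependency>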


Deployment modes
- Standalone:
  - Manually start up head and worker services
  - Resource control depends on options selected when launching daemons
  - Difficult to mix versions
- Apache Mesos:
  - Coarse-grained run mode, launch executors as Mesos tasks
  - Can use dynamic allocation to launch executors on demand
- Apache Hadoop YARN:
  - Best choice if your cluster is already running YARN


Spark on YARN
- Client mode versus cluster mode:
  - Client mode == Spark driver on the local server
  - Cluster mode == Spark driver in the YARN ApplicationMaster
- Spark executors run in YARN containers (one JVM per executor): spark.executor.instances
- Each executor core uses one YARN vCore: spark.executor.cores
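An illustrative spark-submit invocation for YARN cluster mode; the class name, jar, and sizing values are placeholders.

    # 50 executors with 8 cores each = 400 task slots (see the sizing math later).
    spark-submit \
      --master yarn \
      --deploy-mode cluster \
      --conf spark.executor.instances=50 \
      --conf spark.executor.cores=8 \
      --class com.example.alerting.Main \
      alerting-app.jar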


Job submission
- Most documentation covers spark-submit
  - OK for testing, but not great for production
- We use the Spark submitter APIs
  - Built an easier-to-use wrapper API
  - Hides some of the details of configuration
- Some configuration parameters aren't respected when using the submitter API:
  - spark.executor.cores, spark.executor.memory
  - spark.driver.cores, spark.driver.memory
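One programmatic submission route in this Spark line is org.apache.spark.launcher.SparkLauncher; a minimal sketch follows (paths, class names, and sizing are placeholders, and this is not Rocana's wrapper API).

    import org.apache.spark.launcher.SparkAppHandle;
    import org.apache.spark.launcher.SparkLauncher;

    public class SubmitJob {
      public static void main(String[] args) throws Exception {
        SparkAppHandle handle = new SparkLauncher()
            .setAppResource("/opt/app/alerting-app.jar")   // placeholder path
            .setMainClass("com.example.alerting.Main")     // placeholder class
            .setMaster("yarn")
            .setDeployMode("cluster")
            .setConf("spark.executor.instances", "50")
            // Per the slide above, spark.executor.cores/memory and spark.driver.cores/memory
            // may not be respected this way and can need to be set through other means.
            .startApplication();

        System.out.println("Submitted, current state: " + handle.getState());
      }
    }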


Job monitoring
- Streaming applications are always on
- Need to monitor the job for failures:
  - Restart the job on recoverable failures
  - Notify an admin on fatal failures (e.g. misconfiguration)
- Validate as much up front as possible
  - Our application runs rules through a type checker and query planner before saving
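If submission goes through SparkLauncher, one way to watch for failures is a SparkAppHandle.Listener registered via startApplication(new JobWatcher()); the restart/notify logic below is a placeholder sketch.

    import org.apache.spark.launcher.SparkAppHandle;

    // Minimal sketch: reacts to terminal states reported by the launcher handle.
    public class JobWatcher implements SparkAppHandle.Listener {
      @Override
      public void stateChanged(SparkAppHandle handle) {
        SparkAppHandle.State state = handle.getState();
        if (state.isFinal() && state != SparkAppHandle.State.FINISHED) {
          // Placeholder: decide whether the failure is recoverable, then
          // resubmit the job or page an admin.
          System.err.println("Streaming job ended in state " + state);
        }
      }

      @Override
      public void infoChanged(SparkAppHandle handle) {
        // No-op; application ID and other info updates arrive here.
      }
    }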


Instrumentation, Metrics, and Monitoring


Instrumentation
You can't fix what you don't measure


Instrumentation APIs
- Spark supports Dropwizard (née Coda Hale) metrics
  - Collect both application and framework metrics
  - Supports most popular metric types: counters, gauges, histograms, timers, etc.
- Use your own APIs
  - Best option if you have existing metric collection infrastructure


Custom metrics
- Implement the org.apache.spark.metrics.source.Source interface
- Register your source with sparkEnv.metricsSystem().registerSource()
- If you're measuring something during execution, you need to register the metric on the executors
  - Register executor metrics in a static block
  - You can't register a metrics source until the SparkEnv has been initialized

    SparkEnv sparkEnv = SparkEnv.get();
    if (sparkEnv != null) {
      // create and register source
    }
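A fuller sketch of the pattern described above, assuming Java and Dropwizard metrics; the class, source name, and counter are illustrative, and note that the Source interface is treated as internal in some Spark versions.

    import com.codahale.metrics.Counter;
    import com.codahale.metrics.MetricRegistry;
    import org.apache.spark.SparkEnv;
    import org.apache.spark.metrics.source.Source;

    // Illustrative custom source; the class and metric names are made up.
    public class AlertingMetrics implements Source {
      private static final AlertingMetrics INSTANCE = new AlertingMetrics();

      // Executor-side registration: runs when the class is first loaded on an executor,
      // and only if the SparkEnv has already been initialized.
      static {
        SparkEnv sparkEnv = SparkEnv.get();
        if (sparkEnv != null) {
          sparkEnv.metricsSystem().registerSource(INSTANCE);
        }
      }

      private final MetricRegistry registry = new MetricRegistry();
      private final Counter eventsProcessed = registry.counter("eventsProcessed");

      public static void markEvent() {
        INSTANCE.eventsProcessed.inc();
      }

      @Override
      public String sourceName() {
        return "alerting";
      }

      @Override
      public MetricRegistry metricRegistry() {
        return registry;
      }
    }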


Metrics collection
- Configure $SPARK_HOME/conf/metrics.properties
- Built-in sinks:
  - ConsoleSink
  - CsvSink
  - JmxSink
  - MetricsServlet
  - GraphiteSink
  - Slf4jSink
  - GangliaSink
- Or build your own
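An illustrative metrics.properties fragment (the Graphite host, port, and period are placeholder values):

    # Report metrics from all instances to JMX and to a Graphite server.
    *.sink.jmx.class=org.apache.spark.metrics.sink.JmxSink
    *.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
    *.sink.graphite.host=graphite.example.com
    *.sink.graphite.port=2003
    *.sink.graphite.period=10
    *.sink.graphite.unit=seconds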


Build your own
- Implement the org.apache.spark.metrics.sink.Sink interface
- We built a KafkaEventSink that sends the metrics to a Kafka topic formatted as Osso* events
- Our system has a metrics collector:
  - Aggregates metrics in a Parquet table
  - Query and visualize metrics using SQL
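A generic skeleton of a custom sink (not Rocana's KafkaEventSink); it assumes the 1.6/2.0-era convention that Spark reflectively looks for a (Properties, MetricRegistry, SecurityManager) constructor, and it simply logs metrics via Dropwizard's Slf4jReporter.

    import com.codahale.metrics.MetricRegistry;
    import com.codahale.metrics.Slf4jReporter;
    import java.util.Properties;
    import java.util.concurrent.TimeUnit;
    import org.apache.spark.SecurityManager;
    import org.apache.spark.metrics.sink.Sink;
    import org.slf4j.LoggerFactory;

    public class LoggingSink implements Sink {
      private final Slf4jReporter reporter;

      // Spark instantiates sinks reflectively with this constructor shape.
      public LoggingSink(Properties properties, MetricRegistry registry, SecurityManager securityMgr) {
        this.reporter = Slf4jReporter.forRegistry(registry)
            .outputTo(LoggerFactory.getLogger("spark.metrics"))
            .build();
      }

      @Override
      public void start() {
        reporter.start(10, TimeUnit.SECONDS); // reporting period is an arbitrary example
      }

      @Override
      public void stop() {
        reporter.stop();
      }

      @Override
      public void report() {
        reporter.report();
      }
    }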

*http://www.osso-project.org


Report and visualize


Gotcha
- Due to the order of metrics subsystem initialization, your collection plugin must be on the system classpath, not the application classpath
  - https://issues.apache.org/jira/browse/SPARK-18115
- Options:
  - Deploy the library on cluster nodes (e.g. add to HADOOP_CLASSPATH)
  - Build a custom Spark assembly jar
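For the first option, the deployment step might look like this on each node (the jar path is a placeholder):

    export HADOOP_CLASSPATH="$HADOOP_CLASSPATH:/opt/myapp/lib/metrics-kafka-sink.jar"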


Custom Spark assembly
- Maven shade plugin:
  - Merge the upstream Spark assembly JAR with your library and dependencies
  - Shade/rename library packages
    - Might break configuration parameters as well: *.sink.kafka.com_rocana_assembly_shaded_kafka_brokers
  - Mark any dependencies already in the assembly as provided
- Ask me about our akka.version fiasco
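An illustrative maven-shade-plugin relocation of the kind that produces renamed configuration keys like the one above; the pattern and shaded package here are guesses, not Rocana's actual build.

    <!-- Illustrative only: relocation pattern and shaded package are placeholders. -->
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <configuration>
        <relocations>
          <relocation>
            <pattern>org.apache.kafka</pattern>
            <shadedPattern>com.example.assembly.shaded.org.apache.kafka</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </plugin>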


Configuration and Tuning


Architecture


Predicting CPU/task resources
- Each output operation creates a separate batch job when processing a micro-batch
  - number of jobs = number of output ops
- Each data shuffle/re-partitioning creates a separate stage
  - number of stages per job = number of shuffles + 1
- Each partition in a stage creates a separate task
  - number of tasks per job = number of stages * number of partitions


Resources for alerting
- Each rule has a single output operation (write to Kafka)
- Each rule has 3 stages:
  1. Read from Kafka, project, filter, and group data for aggregation
  2. Aggregate values, filter (conditions), and group data for triggers
  3. Aggregate trigger results and send trigger events to Kafka
- First stage partitions = number of Kafka partitions
- Stages 2 and 3 use spark.default.parallelism partitions


Example
- 100 rules, Kafka partitions = 50, spark.default.parallelism = 50
- number of jobs = 100
- number of stages per job = 3
- number of tasks per job = 3 * 50 = 150
- total number of tasks = 100 * 150 = 15,000


Task slots
- number of task slots = spark.executor.instances * spark.executor.cores
- Example: 50 instances * 8 cores = 400 task slots


Waves
- The jobs processing the micro-batches will run in waves based on available task slots
- number of waves = total number of tasks / number of task slots
- Example: number of waves = 15,000 / 400 = 38 waves


Max time per wave
- maximum time per wave = micro-batch duration / number of waves
- Example:
  - 15-second micro-batch duration
  - maximum time per wave = 15,000 ms / 38 waves = 394 ms per wave
- If the average task time is > 394 ms, the batches take longer than the micro-batch interval and the application falls behind