TensorFrames: Google TensorFlow on Apache Spark

Tim Hunter

Meetup 08/2016 - Salesforce

How familiar are you with Spark?

1. What is Apache Spark?

2. I have used Spark

3. I am using Spark in production or I contribute to its development

How familiar are you with TensorFlow?

1. What is TensorFlow?

2. I have heard about it

3. I am training my own neural networks

About Databricks

Founded by the team who created Apache Spark

Offers a hosted service:
- Apache Spark in the cloud
- Notebooks
- Cluster management
- Production environment

About me

Software engineer at Databricks

Apache Spark contributor

Ph.D. UC Berkeley in Machine Learning

(and Spark user since Spark 0.5)

Outline

• Numerical computing with Apache Spark
• Using GPUs with Spark and TensorFlow
• Performance details
• The future

Numerical computing for Data Science

• Queries are data-heavy
• However, algorithms are computation-heavy
• They operate on simple data types: integers, floats, doubles, vectors, matrices

The case for speed

• Numerical bottlenecks are good targets for optimization
  • Let data scientists get faster results
  • Faster turnaround for experimentation
• How can we run these numerical algorithms faster?

Evolution of computing power

Failure is not an option: it is a fact

When you can afford your dedicated chip

Scale out

Evolution of computing power

NLTK, Theano

Today’s talk: Spark + TensorFlow

Evolution of computing power

• Processor speed cannot keep up with memory and network improvements
• Access to the processor is the new bottleneck
• Project Tungsten in Spark: leverage the processor’s heuristics for executing code and fetching memory
• Does not account for the fact that the problem is numerical

Asynchronous vs. synchronous

• Asynchronous algorithms perform updates concurrently
• Spark is a synchronous model; deep learning frameworks are usually asynchronous
• A large number of ML computations are synchronous
• Even deep learning may benefit from synchronous updates
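To make the synchronous model concrete, here is a minimal plain-Python sketch of one synchronous update step. It is an illustration only, not Spark or TensorFlow code: every partition computes its gradient, the results are combined at a barrier, and a single update is applied. The function names, the toy objective (estimating a mean) and the learning rate are all assumptions made for this example.

```python
def partition_gradient(w, partition):
    # Gradient of 0.5 * mean((w - v)^2) over one partition of the data
    return sum(w - v for v in partition) / len(partition)


def synchronous_step(w, partitions, lr):
    # "Map" stage: each partition computes its gradient independently
    grads = [partition_gradient(w, p) for p in partitions]
    # "Reduce" stage: a barrier, all gradients are combined before any update
    avg_grad = sum(grads) / len(grads)
    # One update per step, applied after every partition has reported
    return w - lr * avg_grad


# Two equally sized (hypothetical) partitions; the overall mean is 2.5
partitions = [[1.0, 2.0], [3.0, 4.0]]
w = 0.0
for _ in range(200):
    w = synchronous_step(w, partitions, lr=0.1)
```

An asynchronous variant would let each partition apply its own update as soon as it finishes, without waiting at the barrier.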

GPGPUs

•Graphics Processing Units for General Purpose computations

[Charts: theoretical peak throughput and theoretical peak bandwidth, GPU vs. CPU]

Google TensorFlow

• Library for writing “machine intelligence” algorithms
• Very popular for deep learning and neural networks
• Can also be used for general purpose numerical computations
• Interface in C++ and Python

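Numerical dataflow with TensorFlow

import tensorflow as tf

# Build the graph: two int32 inputs and one output, z = x + 3 * y
x = tf.placeholder(tf.int32, name="x")
y = tf.placeholder(tf.int32, name="y")
output = tf.add(x, 3 * y, name="z")

# Run the graph with concrete values for the inputs
session = tf.Session()
output_value = session.run(output, {x: 3, y: 5})

[Graph inputs: x: int32, y: int32]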

Numerical dataflow with Spark

import tensorflow as tf
import tensorframes as tfs

df = sqlContext.createDataFrame(…)

x = tf.placeholder(tf.int32, name="x")
y = tf.placeholder(tf.int32, name="y")
output = tf.add(x, 3 * y, name="z")

# Apply the graph to every row; the result is appended as a new column z
output_df = tfs.map_rows(output, df)
output_df.collect()

df: DataFrame[x: int, y: int]
output_df: DataFrame[x: int, y: int, z: int]

[Graph inputs: x: int32, y: int32]
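The row-wise semantics of this slide can be mimicked without Spark or TensorFlow. The sketch below is a plain-Python stand-in, not the TensorFrames implementation: `map_rows_sketch` is a hypothetical helper that applies a function to the columns of each row and appends the result under a new column name, which is what the `df` and `output_df` schemas above suggest.

```python
def map_rows_sketch(fn, rows, out_name):
    """Apply fn to the columns of each row and append the result as a new column.

    Hypothetical helper for illustration; rows are plain dicts, not a DataFrame.
    """
    return [dict(row, **{out_name: fn(**row)}) for row in rows]


# Stand-in for a DataFrame[x: int, y: int]
rows = [{"x": 3, "y": 5}, {"x": 1, "y": 2}]

# Same computation as the graph on the slide: z = x + 3 * y
output_rows = map_rows_sketch(lambda x, y: x + 3 * y, rows, "z")
```

Each output row keeps its original columns and gains `z`, matching the schema `DataFrame[x: int, y: int, z: int]`.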

It is a communication problem

[Diagram: data passing between the Spark worker process and the worker Python process; representations shown: Java object, Tungsten binary format, Python pickle (on both sides), C++ buffer]

TensorFrames: native embedding of TensorFlow

[Diagram: inside the Spark worker process, a Java object and a C++ buffer]
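The difference between the two diagrams can be illustrated with the Python standard library alone. The sketch below is an analogy under stated assumptions, not how Spark or TensorFrames move data: it contrasts a generic object serialization (pickle) with a contiguous buffer of doubles that numerical code can read in place. The sample values are made up.

```python
import pickle
from array import array

values = [0.5 * i for i in range(1000)]

# Path 1: generic object serialization, as in the pickle-based exchange.
# Every value is encoded on one side and decoded into a new object on the other.
pickled = pickle.dumps(values)
restored = pickle.loads(pickled)

# Path 2: one contiguous buffer of doubles.
# The bytes can be reinterpreted as doubles without a per-element decode step.
buf = array("d", values)
raw = buf.tobytes()
view = memoryview(raw).cast("d")
```

The buffer holds exactly `buf.itemsize` bytes per value and nothing else, which is the kind of layout a native numerical library expects.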

An example: kernel density scoring

• Estimation of distribution from samples
• Non-parametric
• Unknown bandwidth parameter
• Can be evaluated with goodness of fit

• In practice, compute:

• In a nutshell: a complex numerical function
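The formula on the "In practice, compute" slide did not survive extraction. Read off the scoring code in this deck (points $z_k$, bandwidth $b$, $N$ points), the quantity being computed appears to be the following; treat it as a reconstruction from the code rather than the slide's own notation:

```latex
\mathrm{score}(x) = \log\left( \frac{1}{b\,N} \sum_{k=1}^{N} \exp\left( -\frac{(x - z_k)^2}{2 b^2} \right) \right)
```

The implementations evaluate the log of the sum by first subtracting a reference value from every exponent and adding it back afterwards, which leaves the result unchanged mathematically.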

Speedup

[Chart: speedup of Scala UDF, Scala UDF (optimized), TensorFrames, and TensorFrames + GPU]

Scala UDF:

def score(x: Double): Double = {
  val dis = points.map { z_k => - (x - z_k) * (x - z_k) / (2 * b * b) }
  // Subtract the maximum before exponentiating so that exp cannot overflow
  val maxDis = dis.max
  val exps = dis.map(d => math.exp(d - maxDis))
  maxDis - math.log(b * N) + math.log(exps.sum)
}

val scoreUDF = sqlContext.udf.register("scoreUDF", score _)
sql("select sum(scoreUDF(sample)) from samples").collect()

Scala UDF (optimized):

def score(x: Double): Double = {
  // Preallocated array and while loops instead of collection operations
  val dis = new Array[Double](N)
  var idx = 0
  while (idx < N) {
    val z_k = points(idx)
    dis(idx) = - (x - z_k) * (x - z_k) / (2 * b * b)
    idx += 1
  }
  // Subtract the maximum before exponentiating so that exp cannot overflow
  val maxDis = dis.max
  var expSum = 0.0
  idx = 0
  while (idx < N) {
    expSum += math.exp(dis(idx) - maxDis)
    idx += 1
  }
  maxDis - math.log(b * N) + math.log(expSum)
}

val scoreUDF = sqlContext.udf.register("scoreUDF", score _)
sql("select sum(scoreUDF(sample)) from samples").collect()

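TensorFrames:

# TensorFlow 1.x-era op names; X (the kernel points) and N (their count) are assumed to be defined elsewhere
from tensorflow import constant, exp, identity, log, reduce_max, reduce_sum, square
import tensorframes as tfs

def cost_fun(block, bandwidth):
    distances = - square(constant(X) - block) / (2 * bandwidth * bandwidth)
    m = reduce_max(distances, 0)
    x = log(reduce_sum(exp(distances - m), 0))
    return identity(x + m - log(bandwidth * N), name="score")

sample = tfs.block(df, "sample")
score = cost_fun(sample, bandwidth=0.5)
df.agg(sum(tfs.map_blocks(score, df))).collect()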

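TensorFrames + GPU:

# TensorFlow 1.x-era op names; X (the kernel points) and N (their count) are assumed to be defined elsewhere
from tensorflow import constant, device, exp, identity, log, reduce_max, reduce_sum, square
import tensorframes as tfs

def cost_fun(block, bandwidth):
    distances = - square(constant(X) - block) / (2 * bandwidth * bandwidth)
    m = reduce_max(distances, 0)
    x = log(reduce_sum(exp(distances - m), 0))
    return identity(x + m - log(bandwidth * N), name="score")

# Same graph as before, placed on the GPU
with device("/gpu"):
    sample = tfs.block(df, "sample")
    score = cost_fun(sample, bandwidth=0.5)
df.agg(sum(tfs.map_blocks(score, df))).collect()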
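For checking any of the scoring variants on small inputs, a plain-Python reference can be handy. The sketch below is an assumption-laden stand-in, not the code that was benchmarked: `score` mirrors the max-shifted log-sum-exp form of the TensorFrames version, `naive_score` evaluates the same quantity directly, and the sample points and bandwidth are made up.

```python
import math


def score(x, points, b):
    # Shifted log-sum-exp: subtract the largest exponent before calling exp
    n = len(points)
    dis = [-(x - z) ** 2 / (2 * b * b) for z in points]
    m = max(dis)
    return m - math.log(b * n) + math.log(sum(math.exp(d - m) for d in dis))


def naive_score(x, points, b):
    # Direct evaluation; fails once every exp term underflows to zero
    n = len(points)
    total = sum(math.exp(-(x - z) ** 2 / (2 * b * b)) for z in points)
    return math.log(total / (b * n))


points = [0.0, 1.0, 2.0]
stable = score(0.5, points, 0.5)
naive = naive_score(0.5, points, 0.5)

# Far from all points the shifted form still returns a finite value
far = score(100.0, points, 0.5)
```

Both functions agree where the direct form is well behaved; for a sample far from every point only the shifted form gives a usable answer, because the direct form ends up taking the log of zero.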

Demo: Deep dreams

Improving communication

[Diagram: inside the Spark worker process, a Java object and a C++ buffer, with direct memory copy and columnar storage]

The future

• Integration with Tungsten:
  • Direct memory copy
  • Columnar storage
• Better integration with MLlib data types
• GPU instances in Databricks: Official support coming this fall
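As a rough picture of what columnar storage means for the Tungsten integration item above, the following plain-Python sketch turns row-oriented records into one contiguous typed array per column. It says nothing about Tungsten's actual format; `to_columns` and the sample rows are hypothetical.

```python
from array import array


def to_columns(rows):
    # One contiguous typed array per column instead of one object per row
    xs = array("q", (r[0] for r in rows))  # 64-bit signed integers
    ys = array("d", (r[1] for r in rows))  # 64-bit floats
    return xs, ys


# Row-oriented input: each tuple is one (x, y) record
rows = [(1, 2.0), (3, 4.0), (5, 6.0)]
xs, ys = to_columns(rows)
```

With this layout a whole column can be handed to numerical code as a single buffer, which is what makes a direct memory copy possible.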

Recap

• Spark: an efficient framework for running computations on thousands of computers
• TensorFlow: high-performance numerical framework
• Get the best of both with TensorFrames:
  • Simple API for distributed numerical computing
  • Can leverage the hardware of the cluster

Try these demos yourself

• TensorFrames source code and documentation:
  github.com/databricks/tensorframes
  spark-packages.org/package/databricks/tensorframes
• Demo notebooks available on Databricks
• The official TensorFlow website: www.tensorflow.org

Spark Summit EU 2016 15% Discount Code: DatabricksEU16

Thank you.
