Do More, Code Less with Parallel Computing Libraries

Post on 10-Aug-2015


Do More, Code Less with Parallel Computing Libraries

Fu Jie, 2012.Dec.15

General Artificial Intelligence

A Large-Scale Model of the Functioning Brain, Science, 2012

How can we get a competitive advantage with data?
• More data
• Better algorithms

HOW?

If you have a lot of time on your hands

Parallel Computing with Jacket GPU library

Easy GPU Acceleration of MATLAB code

No GPU-specific stuff involved

no kernels, no threads, no blocks, just regular M code
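Jacket itself accelerates MATLAB's M code, so the snippet below is not Jacket code; it is a minimal Python sketch of the same drop-in philosophy, using NumPy's array API (which GPU array libraries such as CuPy deliberately mirror, so the same lines could target a GPU by swapping the import):

```python
import numpy as np  # GPU libraries such as CuPy mirror this API

# Plain array code: no kernels, no threads, no blocks.
A = np.arange(16.0).reshape(4, 4)
B = np.eye(4)

C = A @ B               # matrix multiply, dispatched to the library's backend
total = float(C.sum())  # reduction, also handled by the library
print(total)            # sum of 0..15 = 120.0
```

The point mirrors the slide: the programmer writes ordinary array expressions, and each new library release can speed them up without any change to this code.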

Easy to Maintain

• Each new library release improves the speed of our code, without any code modification
• Each new library release leverages the latest GPU hardware, without any code modification

Needless to Say, We Need Machine Learning for Big Data

48 Hours a Minute: YouTube
24 Million Wikipedia Pages
750 Million Facebook Users
6 Billion Flickr Photos

“… data a new class of economic asset, like currency or gold.”

How will we design and implement parallel learning systems?

Big Learning

A Shift Towards Parallelism

GPUs Multicore Clusters Clouds Supercomputers

• ML experts repeatedly solve the same parallel design challenges:
  • Race conditions, distributed state, communication…
• The resulting code is:
  • difficult to maintain, extend, debug…

Graduate students

Avoid these problems by using high-level abstractions

Data Parallelism (MapReduce)

[Figure: a table of values split across CPU 1–4, each core computing on its own slice of the data]

Solve a huge number of independent subproblems
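The slide's picture of data split across four CPUs can be sketched with Python's standard-library process pool (the data and the `square` function are illustrative): the map phase runs the independent subproblems in parallel, and the reduce phase combines the partial results.

```python
from multiprocessing import Pool

def square(x):
    # Each subproblem is independent, so workers need no coordination.
    return x * x

if __name__ == "__main__":
    data = [12.9, 42.3, 21.3, 25.8, 24.1, 84.3]
    with Pool(4) as pool:                # four workers, like CPU 1..4 on the slide
        mapped = pool.map(square, data)  # map phase: runs in parallel
    result = sum(mapped)                 # reduce phase: combine partial results
    print(result)
```

Because the subproblems share no state, the same program scales from one multicore machine to a cluster simply by swapping the pool implementation.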

What is this?

It’s next to this…

Addressing Graph-Parallel ML

Data-Parallel (Map Reduce):
• Cross Validation
• Feature Extraction
• Computing Sufficient Statistics

Graph-Parallel (Map Reduce? Graph-Parallel Abstraction):
• Graphical Models: Gibbs Sampling, Belief Propagation, Variational Opt.
• Semi-Supervised Learning: Label Propagation, CoEM
• Data-Mining: PageRank, Triangle Counting
• Collaborative Filtering: Tensor Factorization

• Designed specifically for ML:
  • Graph dependencies
  • Iterative
  • Asynchronous
  • Dynamic
• Simplifies design of parallel programs:
  • Abstract away hardware issues
  • Automatic data synchronization
  • Addresses multiple hardware architectures
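As a concrete instance of the vertex-centric style such an abstraction encourages, here is PageRank on a toy graph (the graph and variable names are illustrative, not any framework's actual API): each vertex repeatedly updates its own value from its neighbors' values, and the framework, rather than the ML expert, would handle scheduling and synchronization.

```python
# Toy directed graph: vertex -> list of vertices it links to (illustrative data).
links = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
vertices = list(links)
d = 0.85  # standard PageRank damping factor

rank = {v: 1.0 / len(vertices) for v in vertices}

for _ in range(50):  # synchronous rounds; a real engine can schedule asynchronously
    incoming = {v: 0.0 for v in vertices}
    for v, outs in links.items():
        for u in outs:                       # scatter this vertex's rank to neighbors
            incoming[u] += rank[v] / len(outs)
    # "Vertex program": each vertex updates itself from gathered neighbor values.
    rank = {v: (1 - d) / len(vertices) + d * incoming[v] for v in vertices}

print(max(rank, key=rank.get))  # "c" collects the most incoming links here
```

The per-vertex update is the only code the ML expert writes; race conditions, distributed state, and communication stay inside the engine.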

Know how to solve ML problem on 1 machine → Efficient parallel predictions
