
Page 1: Big Data & Artificial Intelligence

Big Data & Artificial Intelligence: 2014 Technology Review and Primer

Zavain Dar

Page 2: Big Data & Artificial Intelligence

High Level

Data —> Infrastructure —> Enables more Data —> Analytics, Applications, & Artificial Intelligence

If we buy the above, we see ‘AI’, ‘Big Data’, ‘Deep Learning’, etc. not as buzzwords, but as the logical next step in the technological progress of the past 20 years


Page 3: Big Data & Artificial Intelligence

Outline

• Historical Context: The Web, Big Data, & Distributed Computing

• Modern Infrastructure

• Artificial Intelligence

• Learnings & Thesis Directions


Page 4: Big Data & Artificial Intelligence

Computing Infrastructure pre Web

• Storage Paradigm: Relational Databases (Oracle, MySQL, etc…)

• Access Paradigm: Relational Algebra (SQL), sketched below

• Each computer owned its data; computation was generally done on a single machine

[Diagram: individual computers (C), each with its own local data store (D)]
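A minimal sketch of this pre-Web setup in Python, using the built-in sqlite3 module as a stand-in for Oracle or MySQL (the table and query are illustrative, not from the deck): a single machine owns a local relational database and queries it with SQL.

    import sqlite3

    # One machine, one local relational database file: the pre-Web storage paradigm.
    conn = sqlite3.connect("local.db")
    conn.execute("CREATE TABLE IF NOT EXISTS users (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
    conn.execute("INSERT INTO users (name, city) VALUES (?, ?)", ("Ada", "London"))
    conn.commit()

    # The access paradigm: relational algebra expressed as SQL (selection + projection).
    for row in conn.execute("SELECT name FROM users WHERE city = ?", ("London",)):
        print(row)

    conn.close()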

Page 5: Big Data & Artificial Intelligence

1984: 100 Nodes convert to TCP/IP

• Until 1984, there was no unified ‘internet’, but rather a collection of fragmented networks using one-off protocols

• In 1984, the 100 most-connected nodes switched to TCP/IP, and the modern Internet was born

Page 6: Big Data & Artificial Intelligence

The Web as a ‘Big Data’-base

• We can view the Web itself as the first big database

• Storage Paradigm: HTML, DOM, Relational Databases (Oracle, MySQL)

• Access Paradigm: HTTP

[Diagram: many computers (C) across the network, each serving its own data (D) over HTTP]

Page 7: Big Data & Artificial Intelligence

The Web emerged as the first ‘Big Data’-set

• Other than HTTP requests, which were slow and clunky, we had no way to index and parse web content

• A handful of search engines came and went, but all struggled to effectively deploy algorithms atop this massive distributed data set


Page 8: Big Data & Artificial Intelligence

Google in 1998

• Data uniformly distributed across computers

• Storage Paradigm: GFS (Google File System)

• Access Paradigm: ???

• Google kept Access Paradigm proprietary for years

[Diagram: a cluster of commodity computers (C) sharing one distributed data set (D)]

Page 9: Big Data & Artificial Intelligence

2004: Big Data leaves Google’s confines

• Jeff Dean and Sanjay Ghemawat publish a seminal paper outlining MapReduce, a distributed data access paradigm (a toy sketch follows below)

• Storage Paradigm: GFS

• Access Paradigm: MapReduce
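A toy, single-process sketch of the MapReduce idea in Python (word count, the canonical example). This only illustrates the map, shuffle, and reduce phases; the real system distributes them across the cluster and handles sharding, fault tolerance, and data locality.

    from collections import defaultdict

    documents = ["the web as a big database", "big data and the web"]

    # Map phase: each document is turned into (key, value) pairs independently,
    # so this step can run on whichever machine already holds the document.
    def map_phase(doc):
        return [(word, 1) for word in doc.split()]

    # Shuffle: group intermediate pairs by key.
    grouped = defaultdict(list)
    for doc in documents:
        for key, value in map_phase(doc):
            grouped[key].append(value)

    # Reduce phase: combine all values for a key into one result.
    def reduce_phase(key, values):
        return key, sum(values)

    counts = dict(reduce_phase(k, v) for k, v in grouped.items())
    print(counts)  # {'the': 2, 'web': 2, 'big': 2, 'as': 1, 'a': 1, ...}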

Page 10: Big Data & Artificial Intelligence

Modern Big Data

• Apache Hadoop was born as an open-source project from Yahoo in 2005. It followed Google’s GFS and MapReduce implementations

• Hadoop consisted of HDFS (the Hadoop Distributed File System) and Hadoop MapReduce

• It took years for the open-source framework to become enterprise-ready. In the interim, Cloudera and Hortonworks began offering enterprise solutions based around Hadoop

• Others wrote completely black-box, proprietary versions based on GFS and MapReduce. Examples: Palantir and Discovery Engine

• Palantir only recently switched over to Hadoop-based code.

Page 11: Big Data & Artificial Intelligence

Emergent Themes

• Commoditization of Infrastructure

• Early infrastructure providers have plateaued in value; Hortonworks is a recent example, with a down-round IPO

• DevOps

• As computing models move from local machines to distributed, heterogeneous hardware, new solutions emerge to keep pace with innovation

• ‘Appification’ and Analytics atop Hadoop


Page 12: Big Data & Artificial Intelligence

DevOps: Docker

• Programming and testing on a laptop is different from running on Dell x86 clusters or a mobile + HP server

• Docker creates a portable container around an application, making it easy to port to heterogeneous environments

[Diagram: the same application packaged in containers, running unchanged on a laptop, x86 clusters, an HP server, and iOS]

Page 13: Big Data & Artificial Intelligence

DevOps: Mesosphere

• The old world had Virtual Machines which sliced single computers into numerous ‘virtual instances’ for security, debugging, etc…

• Now we need the opposite: to view entire clusters as a single computer with shared and (hence) optimized storage, network, and compute

[Diagram: many machines (C) abstracted into a single logical computer (C’)]

Page 14: Big Data & Artificial Intelligence

Artificial Intelligence

Traditional AI is broken into two categories:

1. Computational Logic (this guy!) & Search+Planning

2. Machine Learning


Page 15: Big Data & Artificial Intelligence

Computational Logic + Planning

• Based on implementing static rules for a computer to follow. The end algorithm and rules are independent of the data

• Old school (Chomskyan) NLP and chess playing followed this approach

• Planning based on route optimization and ‘graph search’

• E.g., how do you efficiently plan a UPS route, or guide a robotic arm around the obstacles of a course known in advance? (A toy graph-search sketch follows below.)
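A toy illustration of planning as graph search: breadth-first search over a small grid with obstacles (the grid, names, and numbers are illustrative, not from the deck).

    from collections import deque

    # 0 = free cell, 1 = obstacle; the course is fully known in advance.
    grid = [
        [0, 0, 0, 1],
        [1, 1, 0, 1],
        [0, 0, 0, 0],
    ]

    def shortest_path(start, goal):
        """Breadth-first search: length of the shortest obstacle-free path, or None."""
        rows, cols = len(grid), len(grid[0])
        queue = deque([(start, 0)])
        seen = {start}
        while queue:
            (r, c), dist = queue.popleft()
            if (r, c) == goal:
                return dist
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0 and (nr, nc) not in seen:
                    seen.add((nr, nc))
                    queue.append(((nr, nc), dist + 1))
        return None  # goal unreachable

    print(shortest_path((0, 0), (2, 3)))  # 5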


Page 16: Big Data & Artificial Intelligence

Computational Logic + Planning

• From the 1940s through the early 1990s, this was the preferred methodology for AI

• Key assumption: the world is guided by rules, and it’s just a matter of time before we can encode the minimal viable set, after which computers can deduce future outcomes and propositions

• AI slowed in results, and hence funding, from the 70s through the 80s. This was known as the AI Winter, and it was largely due to the heavy academic emphasis on these methods

• The early 90s saw a shift in focus to statistical methods, commonly dubbed the Bayesian Revolution

• This led to the proliferation and growth of machine learning


Page 17: Big Data & Artificial Intelligence

Machine Learning

• Premise for machine learning:

• Have a dataset D

• Have a learning algorithm f

• f applied to the dataset D gives a new function (the model) m

• m applied to any input i predicts an output o


[Diagram: dataset D fed into the learning algorithm f]

Page 18: Big Data & Artificial Intelligence

Machine Learning (Pictorially)


D → f → m(i) → o

1. The machine learning algorithm f is applied to the dataset D, giving the model m

2. For any input i, the model m predicts an output o
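A minimal sketch of this pipeline in Python, assuming scikit-learn as the tooling (the deck names no particular library; the dataset is a made-up toy). D is a set of <input, output> pairs, f is linear regression, m is the fitted model, and o is a prediction.

    from sklearn.linear_model import LinearRegression

    # D: a toy dataset of <input, output> pairs (say, square meters -> price)
    D_inputs = [[50], [80], [120], [200]]
    D_outputs = [150, 240, 360, 600]

    # f: the learning algorithm; f(D) produces the model m
    f = LinearRegression()
    m = f.fit(D_inputs, D_outputs)

    # m(i): apply the model to a new input i to predict an output o
    i = [[100]]
    o = m.predict(i)
    print(o)  # ~300 for this perfectly linear toy data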

Page 19: Big Data & Artificial Intelligence

3 Types of Machine Learning

1) Supervised Learning

• D consists of pairs of inputs and outputs: <i, o>

• The larger D is, the more general and accurate the end model m becomes

• Learn by example

D → f → m(i) → o

Page 20: Big Data & Artificial Intelligence

3 Types of Machine Learning

2) Unsupervised (Topological) Learning

• D consists of just inputs: <i>

• Generally end up with a partitioning of D

• Good at finding patterns (a clustering sketch follows below)

D → f → m(i) → o
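A minimal sketch of the unsupervised case, again assuming scikit-learn (the points and k = 2 are illustrative). D contains only inputs, and the resulting ‘model’ is a partitioning of the data.

    from sklearn.cluster import KMeans

    # D: inputs only, no labels
    D = [[1.0, 1.1], [0.9, 1.0], [1.2, 0.8],    # one cloud of points
         [8.0, 8.2], [7.9, 8.1], [8.3, 7.8]]    # another cloud

    # f: k-means; f(D) yields a model m that partitions D into k groups
    m = KMeans(n_clusters=2, n_init=10, random_state=0).fit(D)

    print(m.labels_)                 # cluster assignment for each point in D
    print(m.predict([[8.1, 8.0]]))   # which partition a new input falls into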

Page 21: Big Data & Artificial Intelligence

3 Types of Machine Learning

3) Reinforcement Learning

• You add some derivative of the output back to the initial dataset, and reoptimize your model

• E.g., learning to play chess by playing over and over again; ideally, the more you play, the less you lose (a toy sketch follows below)


D → f → m(i) → o, with the output o fed back into D
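A toy sketch in the spirit of this slide (not a full reinforcement-learning algorithm): a two-armed bandit where each round’s outcome is fed back into the dataset D and the ‘model’ (the estimate of which move is better) is re-optimized. All names and probabilities are illustrative.

    import random

    true_p = [0.4, 0.6]   # hidden win probability of each of the two "moves"
    D = []                # running record of <move, reward> pairs, grown as we play

    def play(move):
        return 1 if random.random() < true_p[move] else 0

    def best_move():
        """Re-optimize the model: pick the move with the best observed average reward."""
        averages = []
        for move in (0, 1):
            rewards = [r for m, r in D if m == move]
            averages.append(sum(rewards) / len(rewards) if rewards else 0.0)
        return averages.index(max(averages))

    random.seed(0)
    for _ in range(1000):
        # Explore occasionally, otherwise exploit the current best estimate
        move = random.randrange(2) if random.random() < 0.1 else best_move()
        reward = play(move)
        D.append((move, reward))   # feed the outcome back into the dataset

    print(best_move())  # typically 1, the move with the higher true win rate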

Page 22: Big Data & Artificial Intelligence

Deep Learning

• Deep Learning and Neural Nets are synonymous

• Deep Learning is a subset of machine learning: it is a class of functions f from the previous slides

• Deep learning algorithms take in a data set and spit out another function, or model, m

• Can be deployed in supervised, unsupervised, and reinforcement contexts (a minimal sketch follows below)
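A minimal sketch of a deep-learning-style f, using scikit-learn’s small multi-layer perceptron purely as a stand-in (2014-era deep learning in practice meant GPU frameworks such as Theano, Caffe, or cuda-convnet). Structurally it is still just f(D) → m: the XOR data below is a toy problem a linear model cannot fit but a small neural net can.

    from sklearn.neural_network import MLPClassifier

    # D: a toy supervised dataset (XOR)
    D_inputs = [[0, 0], [0, 1], [1, 0], [1, 1]]
    D_outputs = [0, 1, 1, 0]

    # f: a small neural net with two hidden layers of 8 units each
    f = MLPClassifier(hidden_layer_sizes=(8, 8), solver="lbfgs", max_iter=2000, random_state=0)
    m = f.fit(D_inputs, D_outputs)

    print(m.predict([[0, 1], [1, 1]]))  # typically [1, 0]; tiny nets can be sensitive to initialization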


Page 23: Big Data & Artificial Intelligence

Deep Learning

• First theorized and worked on in the 80s

• However, lacked the infrastructure and data to meaningfully deploy

• Has seen a massive resurgence 2009 onwards

• Loosely inspired by (vague) knowledge of the brain: layers of abstraction

Page 24: Big Data & Artificial Intelligence

Deep Learning

• Useful for noisy, large, human-generated data

• That is, data for which even the correct form of the model input i can be tricky to characterize

• When I see a picture of a human face, I immediately recognize eyes, a nose and ears … hence a face

• When a computer receives the same image, it’s a rectangular grid of RGB values. How do we map the computer’s input space to our semantic space?

• Types of data that this makes sense for: Text, Visual (images & video), Audio, User behavior (my patterns on Twitter or Facebook), Basketball (player millisecond movement), etc…


Page 25: Big Data & Artificial Intelligence

Deep Learning


[Image slides illustrating functions artificial neural nets can learn: image models; audio transcription (‘sh ang hai res taur aun ts’ as ‘shanghai restaurants’); good fine-grained classification (‘hibiscus’ vs ‘dahlia’); sensible errors (‘snake’ vs ‘dog’); powerful word embeddings that cluster related forms (fall/fell/fallen, draw/drew/drawn, take/took/taken, give/gave/given); PCA of LSTM sentence representations for end-to-end translation, linearly separable with respect to subject vs object; and generating image captions from pixels (human: ‘A young girl asleep on the sofa cuddling a stuffed bear.’; model samples: ‘A close up of a child holding a stuffed animal.’ and ‘A baby is asleep next to a teddy bear.’), work in progress by Oriol Vinyals et al.]

Page 26: Big Data & Artificial Intelligence

Current Landscape

GPUs, FPGAs, ASICs (user wants specialized deployments, either for the learning function f or the end model m):

Select examples: Nervana Systems, TerraDeep, Qualcomm Neuromorphic Group

APIs, SDKs (user wants to use pre-written algorithms on their own datasets):

Select examples: Metamind, Skymind.io, Vicarious, Deep Mind

Vertical (Technology is black-boxed from user):

Select examples: Clarifai, Butterfly Networks, Binatix, etc…


Page 27: Big Data & Artificial Intelligence

Artificial Intelligence


Computational Logic & Planning

Machine Learning

Statistical Regressions

Deep Learning

etc…

Applications

• NLP

• Computer Vision

• Robotics

• Audio

• Sports

• Genetics

• Finance

• Anomaly Detection

Page 28: Big Data & Artificial Intelligence

Learnings

Static software commoditizes:

• Early big data infrastructure providers stagnating

• Google’s algorithms are essentially public (PageRank, etc.)

• Deep Learning algorithms are an arms race and a race to the bottom

Defensibility, and the ability to grow into a large 100M+ company, lies in owning proprietary data from which you can train better models and/or build network or scale effects

Why is now special? We’re sitting at the intersection of:

1. a matured big data infrastructure driven by well understood distributed storage and data access paradigms

2. data continues to explode, not only through the web, but also via noisy sensor data and human-generated data

3. we have the AI tools necessary to make sense of unstructured and noisy datasets whose features don’t map well to our a priori intuition


Page 29: Big Data & Artificial Intelligence

‘Virtuous’ Feedback Loops

Going back to Google:


[Diagram: Google’s cluster (C) holds data D; f(D) yields the model m; serving m(i) → o generates new data D’ that flows back into D]

Page 30: Big Data & Artificial Intelligence

‘Virtuous’ Feedback Loops

Going back to Google:


[Same diagram as above, with the cluster infrastructure and the learning algorithm labeled ‘Commoditized’]

Page 31: Big Data & Artificial Intelligence

Feedback Loops

• Google collects click-data with each user; this enables better search for the next user: the (n+1)th user has a better experience than the nth user

• Google increases its margin over the competition the more we use it

• Leads to a run-away effect

• Can explain Google’s monopoly in search

• Same analogy with Facebook/Twitter ads and other large tech co’s

• Prediction: early movers who can bootstrap the initial feedback loop will be big, potentially winner-take-all, winners (a toy sketch of the loop follows below)
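A toy sketch of such a loop in Python, reusing the D → f → m pipeline from earlier: each user interaction produces new data D’ that is folded back into D, so every retrained model sees more data than the last. The ranking rule, page names, and click simulation are all illustrative.

    import random

    random.seed(0)

    def f(D):
        """Learning step: 'train' a model that ranks results by observed click rate."""
        clicks, shows = {}, {}
        for result, clicked in D:
            shows[result] = shows.get(result, 0) + 1
            clicks[result] = clicks.get(result, 0) + clicked
        return lambda results: max(results, key=lambda r: clicks.get(r, 0) / shows.get(r, 1))

    def simulate_user(shown_result):
        """Users click the genuinely better result more often."""
        true_quality = {"page_a": 0.2, "page_b": 0.7}
        return 1 if random.random() < true_quality[shown_result] else 0

    D = [("page_a", 0), ("page_b", 1)]   # tiny seed dataset
    for _ in range(500):
        m = f(D)                           # retrain on everything seen so far
        top = m(["page_a", "page_b"])      # serve the current best result
        clicked = simulate_user(top)       # user behavior generates new data D'
        D.append((top, clicked))           # D' flows back into D

    print(f(D)(["page_a", "page_b"]))      # the loop locks in on "page_b"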


Page 32: Big Data & Artificial Intelligence

Data —> Infrastructure —> Enables more Data —> Analytics, Applications, & Artificial Intelligence


Page 33: Big Data & Artificial Intelligence

Empirical Timeline

MapReduce

Page 34: Big Data & Artificial Intelligence

Empirical Timeline

Hadoop

Page 35: Big Data & Artificial Intelligence

Empirical Timeline

Big Data

Page 36: Big Data & Artificial Intelligence

Empirical Timeline

Deep Learning