Deep Learning and Angular, Angular Meetup (06/14/2017), Google (Mountain View), Oswald Campesato


Page 1: Angular and Deep Learning

Deep Learning and Angular

Angular Meetup (06/14/2017)

Google (Mountain View)

Oswald Campesato


Page 2: Angular and Deep Learning

The Data/AI Landscape

Page 3: Angular and Deep Learning

Gartner Hype Curve: Where is Deep Learning?

Page 4: Angular and Deep Learning

The Impact of AI

“Robot trucks will kill far fewer people (if any).

Machines don’t get distracted or look at phones instead of the road.

Machines don’t drink alcohol, do drugs, or things that contribute to accidents.”

Robot trucks don’t need salaries, vacations, health insurance, rest periods, or sick time.

The only costs will be upkeep of the machinery.

Page 5: Angular and Deep Learning

AI/ML/DL: How They Differ

Traditional AI (20th century):

based on collections of rules

Led to expert systems in the 1980s

The era of LISP and Prolog

Page 6: Angular and Deep Learning

AI/ML/DL: How They Differ

Machine Learning:

Started in the 1950s (approximate)

Alan Turing and “learning machines”

Data-driven (not rule-based)

Many types of algorithms

Involves optimization

Page 7: Angular and Deep Learning

AI/ML/DL: How They Differ

Deep Learning:

Started in the 1950s (approximate)

The “perceptron” (basis of NNs)

Data-driven (not rule-based)

large (even massive) data sets

Involves neural networks (CNNs: ~1970s)

Lots of heuristics

Heavily based on empirical results

Page 8: Angular and Deep Learning

The Rise of Deep Learning

Massive and inexpensive computing power

Huge volumes of data/Powerful algorithms

The “big bang” in 2009:

“deep-learning neural networks and NVidia GPUs”

Andrew Ng’s Stanford group trained deep networks on NVidia GPUs (2009); Google Brain followed in 2011

Page 9: Angular and Deep Learning

AI/ML/DL: Commonality

All of them involve a model

A model represents a system

Goal: a good predictive model

The model is based on:

Many rules (for AI)

data and algorithms (for ML)

large sets of data (for DL)

Page 10: Angular and Deep Learning

A Basic Model in Machine Learning

Let’s perform the following steps:

1) Start with a simple model (2 variables)

2) Generalize that model (n variables)

3) See how it might apply to a NN

Page 11: Angular and Deep Learning

Linear Regression

One of the simplest models in ML

Fits a line (y = m*x + b) to data in 2D

Finds best line by minimizing MSE (mean squared error):

m = covariance(x, y) / variance(x)

b = mean(y) - m * mean(x) (also a closed form solution)
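The least-squares fit of a line can be sketched in a few lines of NumPy. This is a minimal illustration, not code from the talk: the function names `fit_line` and `mse` and the sample points are made up for the example.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y = m*x + b; returns (m, b)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = x.mean(), y.mean()
    # Slope: covariance of x and y divided by the variance of x.
    m = np.sum((x - x_mean) * (y - y_mean)) / np.sum((x - x_mean) ** 2)
    # Intercept: the fitted line passes through the point of means.
    b = y_mean - m * x_mean
    return m, b

def mse(x, y, m, b):
    """Mean squared error of the line y = m*x + b on the data."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.mean((y - (m * x + b)) ** 2))

# Made-up points that lie exactly on y = 2x + 1
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
m, b = fit_line(xs, ys)
error = mse(xs, ys, m, b)
```

On noisy data the same two formulas still give the line with the smallest MSE; only the error stops being zero.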

Page 12: Angular and Deep Learning

Linear Regression in 2D: example

Page 13: Angular and Deep Learning

Linear Regression: alternatives

Fitting a polynomial (degree 2, 3, …)

Can lead to overfitting

Polynomials diverge faster than lines

Can reduce predictive accuracy

NB: Linear Regression != Curve Fitting

Page 14: Angular and Deep Learning

Linear Regression: example #1

One feature (independent variable):

X = number of square feet

Predicted value (dependent variable):

Y = cost of a house

A very “coarse grained” model

We can devise a much better model

Page 15: Angular and Deep Learning

Linear Regression: example #2

Multiple features:

X1 = # of square feet

X2 = # of bedrooms

X3 = # of bathrooms (dependency?)

X4 = age of house

X5 = cost of nearby houses

X6 = corner lot (or not): Boolean

a much better model (6 features)

Page 16: Angular and Deep Learning

Linear Multivariate Analysis

General form of multivariate equation:

Y = w1*x1 + w2*x2 + . . . + wn*xn + b

w1, w2, . . . , wn are numeric values

x1, x2, . . . , xn are variables (features)

Properties of variables:

Can be independent (Naïve Bayes)

weak/strong dependencies can exist
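The multivariate equation is just a dot product plus a bias. The sketch below applies it to the six house features from example #2; the weights, the bias, and the sample house are invented values for illustration only, not fitted from any data.

```python
import numpy as np

# Hypothetical weights, one per feature (made-up values):
# square feet, bedrooms, bathrooms, age, nearby cost, corner lot
w = np.array([150.0, 10000.0, 5000.0, -1000.0, 0.5, 20000.0])
b = 50000.0  # hypothetical bias term

def predict(x, w, b):
    """Y = w1*x1 + w2*x2 + ... + wn*xn + b, written as a dot product."""
    return float(np.dot(w, x) + b)

# A made-up house: 2000 sq ft, 3 bed, 2 bath, 10 years old,
# nearby houses cost 400000, corner lot (Boolean encoded as 1.0)
house = np.array([2000.0, 3.0, 2.0, 10.0, 400000.0, 1.0])
price = predict(house, w, b)
```

Training a linear model means finding the values of `w` and `b` that minimize the cost over many such houses.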

Page 17: Angular and Deep Learning

Neural Network with 3 Hidden Layers

Page 18: Angular and Deep Learning

Neural Networks: equations

Node “values” in first hidden layer:

N1 = w11*x1+w21*x2+…+wn1*xn

N2 = w12*x1+w22*x2+…+wn2*xn

N3 = w13*x1+w23*x2+…+wn3*xn

. . .

Nn = w1n*x1+w2n*x2+…+wnn*xn

Similar equations for other pairs of layers

Page 19: Angular and Deep Learning

Neural Networks: Matrices

From inputs to first hidden layer:

Y1 = W1*X + B1 (X/Y1/B1: vectors; W1: matrix)

From first to second hidden layers:

Y2 = W2*Y1 + B2 (Y1/Y2/B2: vectors; W2: matrix)

From second to third hidden layers:

Y3 = W3*Y2 + B3 (Y2/Y3/B3: vectors; W3: matrix)

Apply an “activation function” to each layer’s Y values before passing them to the next layer
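A forward pass through the layers can be sketched as a loop over (W, B) pairs, where each layer consumes the activated output of the previous one. The layer sizes, the random weights, and the helper name `forward` are assumptions made for this example.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def forward(x, layers, activation=relu):
    """Feed x through a list of (W, B) pairs, one pair per layer."""
    y = x
    for W, B in layers:
        # Each layer uses the previous layer's output as its input.
        y = activation(W @ y + B)
    return y

rng = np.random.default_rng(0)
sizes = [4, 5, 5, 3]  # hypothetical: 4 inputs, then layers of 5, 5 and 3 nodes
layers = [(rng.normal(scale=0.1, size=(n_out, n_in)), np.zeros(n_out))
          for n_in, n_out in zip(sizes[:-1], sizes[1:])]
x = rng.normal(size=4)
out = forward(x, layers)
```

Without the activation function the three matrix products would collapse into a single linear map, which is why the nonlinearity between layers matters.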

Page 20: Angular and Deep Learning

Neural Networks (general)

Multiple hidden layers:

Layer composition is your decision

Activation functions: sigmoid, tanh, ReLU

https://en.wikipedia.org/wiki/Activation_function

Back propagation (1980s)

https://en.wikipedia.org/wiki/Backpropagation

=> Initial weights: small random numbers

Page 21: Angular and Deep Learning

Activation Functions (Examples)

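import numpy as np

# Illustrative weight matrix W and input vector x (made-up values)
W = np.array([[0.5, -1.0], [1.5, 2.0]])
x = np.array([1.0, 2.0])

# Python sigmoid example:
z = 1 / (1 + np.exp(-np.dot(W, x)))

# Python tanh example:
z = np.tanh(np.dot(W, x))

# Python ReLU example:
z = np.maximum(0, np.dot(W, x))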

Page 22: Angular and Deep Learning

What’s the “Best” Activation Function?

Initially sigmoid was popular

then tanh became popular

Now ReLU is preferred for hidden layers (cheap to compute, less prone to vanishing gradients)

NB: sigmoid + tanh are used in LSTMs

Page 23: Angular and Deep Learning

Sample Cost Function #1

Page 24: Angular and Deep Learning

Sample Cost Function #2

Page 25: Angular and Deep Learning

How to Select a Cost Function

1) Depends on the learning type:

=> supervised/unsupervised/RL

2) Depends on the activation function

3) Other factors

Example:

cross-entropy cost function for supervised

learning on multiclass classification
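As a rough sketch of the cross-entropy example, the cost is the mean negative log-probability that the model assigns to the correct class. The softmax step, the function names, and the sample logits below are assumptions added for illustration.

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1 per row."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)  # numerical stability
    exp = np.exp(shifted)
    return exp / np.sum(exp, axis=-1, keepdims=True)

def cross_entropy(probs, labels, eps=1e-12):
    """Mean negative log-probability of the correct class.

    probs: array of shape (n_samples, n_classes)
    labels: integer class ids of shape (n_samples,)
    """
    n = probs.shape[0]
    picked = probs[np.arange(n), labels]
    return float(-np.mean(np.log(picked + eps)))

# Made-up scores for 2 samples and 3 classes
logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 2.5, 0.3]])
labels = np.array([0, 1])
probs = softmax(logits)
loss = cross_entropy(probs, labels)
```

The loss shrinks toward 0 as the probability of the correct class approaches 1, and equals log(n_classes) for a model that guesses uniformly.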

Page 26: Angular and Deep Learning

GD versus SGD

SGD (Stochastic Gradient Descent):

+ involves a SUBSET of the dataset per update

+ one sample (pure SGD) or a small batch (Minibatch Stochastic Gradient Descent)

GD (Gradient Descent):

+ involves the ENTIRE dataset

More details:

http://cs229.stanford.edu/notes/cs229-notes1.pdf
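The difference between the two can be sketched on a small linear model: both loops use the same gradient function, and only the rows passed to it change. The synthetic data, learning rate, batch size, and function names are assumptions chosen for this illustration.

```python
import numpy as np

def gradient(w, X, y):
    """Gradient of the MSE cost for the linear model X @ w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def gd(X, y, lr=0.05, epochs=300):
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        w -= lr * gradient(w, X, y)  # ENTIRE dataset per update
    return w

def minibatch_sgd(X, y, lr=0.05, epochs=300, batch_size=8, seed=0):
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        order = rng.permutation(len(y))  # reshuffle every epoch
        for start in range(0, len(y), batch_size):
            idx = order[start:start + batch_size]
            w -= lr * gradient(w, X[idx], y[idx])  # SUBSET per update
    return w

# Noise-free synthetic data generated from known weights
rng = np.random.default_rng(1)
X = rng.normal(size=(64, 2))
true_w = np.array([3.0, -2.0])
y = X @ true_w

w_gd = gd(X, y)
w_sgd = minibatch_sgd(X, y)
```

In this sketch GD makes one update per epoch while the minibatch version makes eight, each computed from fewer rows; on large datasets that trade-off is the main reason minibatches are used.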

Page 27: Angular and Deep Learning

What are Hyper Parameters?

higher level concepts about the model such as

complexity, or capacity to learn

Cannot be learned directly from the data in the

standard model training process

must be predefined

Page 28: Angular and Deep Learning

Hyper Parameters (examples)

# of hidden layers in a neural network

the learning rate (in many models)

# of leaves or depth of a tree

# of latent factors in a matrix factorization

# of clusters in a k-means clustering

Page 29: Angular and Deep Learning

How Many Layers in a DNN?

Algorithm #1 (from Geoffrey Hinton):

1) add layers until you start overfitting your training set

2) now add dropout or some other regularization method

Algorithm #2 (Yoshua Bengio):

“Add layers until the test error does not improve anymore.”

Page 30: Angular and Deep Learning

How Many Hidden Nodes in a DNN?

Based on a relationship between:

# of input and # of output nodes

Amount of training data available

Complexity of the cost function

The training algorithm

Page 31: Angular and Deep Learning

Use Cases for Neural Networks

CNNs (Convolutional NNs):

Good for image processing

2000: CNNs processed 10-20% of all checks

=> Approximately 60% of all NNs

RNNs (Recurrent NNs):

Good for NLP and audio

Page 32: Angular and Deep Learning

CNN: Sample Filters

Page 33: Angular and Deep Learning

CNN Filters (examples)
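What a CNN filter does can be sketched by sliding a small kernel over an image and summing the element-wise products. The Sobel kernel below is a standard edge-detection filter used here as an example; the toy image and the helper name `conv2d` are assumptions. In a trained CNN the kernel values are learned rather than hand-picked.

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2D cross-correlation (what DL frameworks call convolution)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the kernel with the patch under it and sum the result.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Sobel filter that responds to left-to-right brightness changes
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

# Toy 5x6 image: dark left half (0), bright right half (1)
image = np.zeros((5, 6))
image[:, 3:] = 1.0

edges = conv2d(image, sobel_x)
```

The output is nonzero only in the columns where the window straddles the dark-to-bright boundary, which is what "detecting a vertical edge" means here.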

Page 34: Angular and Deep Learning

Types of RNNs

LSTMs (Long Short Term Memory)

GRUs (Gated Recurrent Units)

NB: ResNets (Residual NNs) are not RNNs; they are feed-forward networks with skip connections

Page 35: Angular and Deep Learning

Features of LSTMs

Used in Google speech recognition

input/output/forget gates

they mitigate the vanishing gradient problem

Can track 1000s of discrete time steps

Used by international competition winners

Often combined with CTC
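The input/output/forget gates can be sketched as one time step of a standard LSTM cell. This is a minimal NumPy version under stated assumptions: the gate ordering inside `W`, the sizes, and the random weights are choices made for the example, not taken from any particular library.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step.

    W: (4*hidden, hidden + inputs), b: (4*hidden,)
    Rows are ordered: input gate, forget gate, output gate, candidate.
    """
    hidden = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    i = sigmoid(z[0 * hidden:1 * hidden])  # input gate
    f = sigmoid(z[1 * hidden:2 * hidden])  # forget gate
    o = sigmoid(z[2 * hidden:3 * hidden])  # output gate
    g = np.tanh(z[3 * hidden:4 * hidden])  # candidate cell values
    c = f * c_prev + i * g                 # new cell state
    h = o * np.tanh(c)                     # new hidden state
    return h, c

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4                      # hypothetical sizes
W = rng.normal(scale=0.1, size=(4 * n_hidden, n_hidden + n_in))
b = np.zeros(4 * n_hidden)
h = np.zeros(n_hidden)
c = np.zeros(n_hidden)
for x in rng.normal(size=(5, n_in)):       # a sequence of 5 time steps
    h, c = lstm_step(x, h, c, W, b)
```

The cell state `c` is updated additively (`f * c_prev + i * g`), which is the part that helps gradients survive across many time steps.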

Page 36: Angular and Deep Learning

Inside an LSTM

Page 37: Angular and Deep Learning

Inside an LSTM

Page 38: Angular and Deep Learning

Inside an LSTM

Page 39: Angular and Deep Learning

Keras/LSTM Code Snippet

import numpy

from keras.datasets import imdb

from keras.models import Sequential

from keras.layers import Dense, LSTM, Embedding

from keras.preprocessing import sequence

...

Page 40: Angular and Deep Learning

GANs: Generative Adversarial Networks

Page 41: Angular and Deep Learning

GANs: Generative Adversarial Networks

Adversarial examples (a related idea):

Make imperceptible changes to images

Can fool many NNs, not only the one they were crafted against

Fooled NNs can have an extremely high error rate

Some images create optical illusions

https://www.quora.com/What-are-the-pros-and-cons-of-using-generative-adversarial-networks-a-type-of-neural-network

Page 42: Angular and Deep Learning

ML/DL Frameworks

Caffe (templates instead of code)

Theano (influenced TensorFlow)

TensorFlow

TensorFlow Lite (release date?)

Keras (“layer” over Theano+TF)

Tefla (mini framework over TF)

Torch (Lua) + PyTorch (Facebook)

MXNet (Amazon)

CNTK (Microsoft)

Page 43: Angular and Deep Learning

Languages for ML/DL

Popular languages for ML:

R (popular among statisticians)

Python (sklearn/pandas/etc)

Popular languages for DL:

Python (Keras/Theano/TF modules)

some Java/C++/Go

Page 44: Angular and Deep Learning

“Challenges” in Deep Learning

overfitting/underfitting of a model

vanishing/exploding gradient

learning rate (too high or too low)

Debugging NNs (good luck)

Page 45: Angular and Deep Learning

Miscellaneous Topics

* Data versus algorithms:

Option A: good data + average algorithm

Option B: average data + good algorithm

=> Option A is preferred over Option B

* “Cleaning” a dataset:

De-duplicate and fix invalid/missing data (how?)

* Dimensionality reduction:

eliminate “unimportant” features (columns)
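One possible answer to the "how?" above, sketched with NumPy: drop identical rows, fill missing values with the column mean, and remove columns that never vary. These three choices are assumptions made for illustration; real projects often need domain-specific rules instead, and the function name `clean` is invented here.

```python
import numpy as np

def clean(rows):
    """De-duplicate rows, fill NaNs with column means, drop constant columns."""
    # 1) De-duplicate: keep the first copy of each identical row.
    seen, unique = set(), []
    for row in rows:
        key = tuple(None if np.isnan(v) else v for v in row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    data = np.array(unique, dtype=float)

    # 2) Missing data: replace NaN with the mean of the known values in that column.
    col_means = np.nanmean(data, axis=0)
    data = np.where(np.isnan(data), col_means, data)

    # 3) Crude dimensionality reduction: a column with zero variance
    #    carries no information, so it is dropped.
    keep = data.std(axis=0) > 0
    return data[:, keep]

# Made-up dataset: one duplicate row, one missing value, one constant column
raw = [
    [1.0, 5.0, 7.0],
    [1.0, 5.0, 7.0],
    [2.0, float("nan"), 7.0],
    [3.0, 9.0, 7.0],
]
cleaned = clean(raw)
```

Dropping zero-variance columns is the simplest form of feature elimination; techniques such as PCA go further by combining correlated columns.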

Page 46: Angular and Deep Learning

Miscellaneous Topics

• XOR requires a hidden layer to solve: a network with no hidden layer cannot (why?)

• A dataset whose columns are interchangeable is a poor fit for a CNN (why?)

• Second generation TPUs

• TensorFlow Lite (open source later in 2017)

www.tensorflow.org/tutorials
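On the XOR question above: XOR is not linearly separable, so a single layer of weights cannot compute it, but one hidden layer with two nodes is enough. The sketch below uses hand-picked weights (an assumption for illustration, not learned values) in which one hidden node computes OR and the other computes AND.

```python
import numpy as np

def step(z):
    """Threshold activation: 1 where z > 0, else 0."""
    return (z > 0).astype(int)

def xor_net(x1, x2):
    x = np.array([x1, x2])
    # Hidden layer with hand-picked weights: h[0] = OR, h[1] = AND
    W_hidden = np.array([[1, 1],
                         [1, 1]])
    b_hidden = np.array([-0.5, -1.5])
    h = step(W_hidden @ x + b_hidden)
    # Output node: fires when OR is true and AND is false
    w_out = np.array([1, -1])
    b_out = -0.5
    return int(step(w_out @ h + b_out))

truth_table = [xor_net(a, b) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]]
```

The hidden layer remaps the four inputs so that the two classes become separable by a single line, which is exactly what the output node then draws.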

Page 47: Angular and Deep Learning

D3 Fun Samples

D3 Animation effects:

MouseMoveFadeAnim1Back1.html

SVG tiger:

svg-tiger-d3.svg

D3 and SVG tiger:

svg-tiger-d3.html

Page 48: Angular and Deep Learning

Deep Learning Playground

TF playground home page:

http://playground.tensorflow.org

Demo #1:

https://github.com/tadashi-aikawa/typescript-playground

Converts playground to TypeScript

Page 49: Angular and Deep Learning

D3/TypeScript/Deep Learning

Download playground_master.zip

npm install

npm start

Demo converts playground to TypeScript

Page 50: Angular and Deep Learning

D3/TypeScript/Deep Learning

TypeScript files in ‘src’ directory:

state.ts

seedrandom.d.ts

playground.ts

linechart.ts

heatmap.ts

dataset.ts

nn.ts (<= activations/nodes in a neural net)

Page 51: Angular and Deep Learning

Activations in TypeScript (nn.ts)

export class Activations {
  public static TANH: ActivationFunction = {
    output: x => (Math as any).tanh(x),
    der: x => {
      let output = Activations.TANH.output(x);
      return 1 - output * output;
    }
  };
  public static RELU: ActivationFunction = {
    output: x => Math.max(0, x),
    der: x => x <= 0 ? 0 : 1
  };

Page 52: Angular and Deep Learning

Activations in TypeScript (nn.ts)

  public static SIGMOID: ActivationFunction = {
    output: x => 1 / (1 + Math.exp(-x)),
    der: x => {
      let output = Activations.SIGMOID.output(x);
      return output * (1 - output);
    }
  };
  public static LINEAR: ActivationFunction = {
    output: x => x,
    der: x => 1
  };
}
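Each activation is stored as a pair: the function itself and its derivative, which backpropagation needs. A quick way to convince yourself that such pairs are consistent is to compare each derivative against a finite-difference estimate. The Python sketch below mirrors the four pairs; the dictionary layout and helper names are assumptions for this example and are not part of nn.ts.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# name -> (output function, derivative function)
ACTIVATIONS = {
    "tanh": (math.tanh, lambda x: 1 - math.tanh(x) ** 2),
    "relu": (lambda x: max(0.0, x), lambda x: 0.0 if x <= 0 else 1.0),
    "sigmoid": (sigmoid, lambda x: sigmoid(x) * (1 - sigmoid(x))),
    "linear": (lambda x: x, lambda x: 1.0),
}

def numeric_derivative(f, x, h=1e-6):
    """Central finite-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

def max_derivative_error(name, points=(-2.0, -0.5, 0.5, 2.0)):
    """Largest gap between the stored derivative and the numeric estimate.

    The sample points avoid 0, where ReLU has no derivative.
    """
    f, der = ACTIVATIONS[name]
    return max(abs(der(x) - numeric_derivative(f, x)) for x in points)

errors = {name: max_derivative_error(name) for name in ACTIVATIONS}
```

All four errors come out tiny, which confirms that each `der` matches the slope of its `output`.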

Page 53: Angular and Deep Learning

Angular/Deep Learning App (Demo #2)

Create NGDeepLearning via the Angular CLI: ng new NGDeepLearning

Copy ./src/*ts files from playground_master into NGDeepLearning/src subdirectory

Merge the two package.json files

Merge the two index.html files

install d3: npm install d3 --save

Page 54: Angular and Deep Learning

Angular/Deep Learning

Add import * as d3 from 'd3'; to the files:

dataset.ts

heatmap.ts

linechart.ts

playground.ts

Launch the app: ng serve

Page 55: Angular and Deep Learning

Deep Learning and Art/“Stuff”

“Convolutional Blending” images:

=> 19-layer Convolutional Neural Network

www.deepart.io

Bots created their own language:

https://www.recode.net/2017/3/23/14962182/ai-learning-language-open-ai-research

https://www.fastcodesign.com/90124942/this-google-engineer-taught-an-algorithm-to-make-train-footage-and-its-hypnotic

Page 56: Angular and Deep Learning

About Me

I provide training for the following:

=> Deep Learning/TensorFlow/Keras

=> Android

=> Angular 4

Page 57: Angular and Deep Learning

Recent/Upcoming Books

1) HTML5 Canvas and CSS3 Graphics (2013)

2) jQuery, CSS3, and HTML5 for Mobile (2013)

3) HTML5 Pocket Primer (2013)

4) jQuery Pocket Primer (2013)

5) HTML5 Mobile Pocket Primer (2014)

6) D3 Pocket Primer (2015)

7) Python Pocket Primer (2015)

8) SVG Pocket Primer (2016)

9) CSS3 Pocket Primer (2016)

10) Android Pocket Primer (2017)

11) Angular Pocket Primer (2017)