neural networks - deep learning...neural networks - deep learning arti cial intelligence @ allegheny...

48
Neural Networks - Deep Learning Artificial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 1 / 35

Upload: others

Post on 20-May-2020

52 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Neural Networks - Deep Learning

Artificial Intelligence @ Allegheny College

Janyl Jumadinova

March 4-6, 2020

Credit: Google Workshop

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 1 / 35

Page 2: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Neural Networks

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 2 / 35

Page 3: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Neural Networks

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 3 / 35

Page 4: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Neural Networks

Neural computing requires a number of neurons, to be connectedtogether into a neural network.

Neurons are arranged in layers.

Two main hyperparameters that control the architecture or topology ofthe network: 1) the number of layers, and 2) the number of nodes in eachhidden layer.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 4 / 35

Page 5: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Neural Networks

Neural computing requires a number of neurons, to be connectedtogether into a neural network.

Neurons are arranged in layers.

Two main hyperparameters that control the architecture or topology ofthe network: 1) the number of layers, and 2) the number of nodes in eachhidden layer.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 4 / 35

Page 6: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Activation Functions

The activation function is generally non-linear.

Linear functions are limited because the output is simply proportionalto the input.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 5 / 35

Page 7: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Activation Functions

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 6 / 35

Page 8: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Network structures

Two phases in each iteration:

1 Calculating the predicted output y, known as feed-forward

2 Updating the weights and biases, known as backpropagation

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 7 / 35

Page 9: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Feed-forward example

Feed-forward networks:

Single-layer perceptronsMulti-layer perceptrons

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 8 / 35

Page 10: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Feed-forward example

Feed-forward networks:

Single-layer perceptronsMulti-layer perceptrons

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 8 / 35

Page 11: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Single-layer Perceptrons

Output units all operate separately – no shared weights.

Adjusting weights moves the location, orientation, and steepness of cliff.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 9 / 35

Page 12: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Multi-layer Perceptrons

Layers are usually fully connected.

Numbers of hidden units typically chosen by hand.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 10 / 35

Page 13: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Neural Networks

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 11 / 35

Page 14: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Neural Networks

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 12 / 35

Page 15: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Neural Networks

A fully connected NN layer

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 13 / 35

Page 16: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Implementation as Matrix Multiplication

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 14 / 35

Page 17: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Non-Linear Data Distributions

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 15 / 35

Page 18: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 16 / 35

Page 19: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Deep Learning

Most current machine learning works well because of human-designedrepresentations and input features.

Machine learning becomes just optimizing weights to best make afinal prediction.

Deep learning algorithms attempt to learn multiple levels ofrepresentation of increasing complexity/abstraction.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 17 / 35

Page 20: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Deep Learning

Most current machine learning works well because of human-designedrepresentations and input features.

Machine learning becomes just optimizing weights to best make afinal prediction.

Deep learning algorithms attempt to learn multiple levels ofrepresentation of increasing complexity/abstraction.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 17 / 35

Page 21: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Deep Learning

Each neuron implements a relatively simple mathematical function.

y = g(w · x + b)

The composition of 106 − 109 such functions is powerful.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 18 / 35

Page 22: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Deep Learning

Each neuron implements a relatively simple mathematical function.

y = g(w · x + b)

The composition of 106 − 109 such functions is powerful.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 18 / 35

Page 23: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Deep Learning

Book: http://www.deeplearningbook.org/

Chapter 5

“A core idea in deep learning is that we assume that the data wasgenerated by the composition of factors or features, potentially at multiplelevels in a hierarchy.”

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 19 / 35

Page 24: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Results get better (to a degree) with:

more data

bigger models

more computation

Better algorithms, new insights and improved methods help, too!

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 20 / 35

Page 25: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Results get better (to a degree) with:

more data

bigger models

more computation

Better algorithms, new insights and improved methods help, too!

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 20 / 35

Page 26: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 21 / 35

Page 27: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Adoption of Deep Learning Tools on GitHub

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 22 / 35

Page 28: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Epoch: a training iteration (one pass through the dataset).

Batch: Portion of the dataset (number of samples after dataset hasbeen divided).

Regularization: a set of techniques that helps learning models toconverge(http://www.godeep.ml/regularization-using-tensorflow/).

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 23 / 35

Page 29: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Operates over tensors: n-dimensional arrays

Using a flow graph: data flow computation framework

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 24 / 35

Page 30: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Operates over tensors: n-dimensional arrays

Using a flow graph: data flow computation framework

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 24 / 35

Page 31: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Operates over tensors: n-dimensional arrays

Using a flow graph: data flow computation framework

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 24 / 35

Page 32: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

5.7 ← Scalar

Number, Float, etc.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 25 / 35

Page 33: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 26 / 35

Page 34: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 27 / 35

Page 35: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Tensors have a Shape that is described with a vector

[1000, 256, 256, 3]

10000 Images

Each Image has 256 Rows

Each Row has 256 Pixels

Each Pixel has 3 values (RGB)

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 28 / 35

Page 36: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Tensors have a Shape that is described with a vector

[1000, 256, 256, 3]

10000 Images

Each Image has 256 Rows

Each Row has 256 Pixels

Each Pixel has 3 values (RGB)

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 28 / 35

Page 37: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Computation is a dataflow graph

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 29 / 35

Page 38: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Computation is a dataflow graph with tensors

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 30 / 35

Page 39: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Computation is a dataflow graph with state

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 31 / 35

Page 40: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Core TensorFlow data structures and concepts

Graph: A TensorFlow computation, represented as a dataflow graph:- collection of ops that may be executed together as a group.

Operation: a graph node that performs computation on tensors

Tensor: a handle to one of the outputs of an Operation:- provides a means of computing the value in a TensorFlow Session.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 32 / 35

Page 41: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Core TensorFlow data structures and concepts

Graph: A TensorFlow computation, represented as a dataflow graph:- collection of ops that may be executed together as a group.

Operation: a graph node that performs computation on tensors

Tensor: a handle to one of the outputs of an Operation:- provides a means of computing the value in a TensorFlow Session.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 32 / 35

Page 42: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

Core TensorFlow data structures and concepts

Graph: A TensorFlow computation, represented as a dataflow graph:- collection of ops that may be executed together as a group.

Operation: a graph node that performs computation on tensors

Tensor: a handle to one of the outputs of an Operation:- provides a means of computing the value in a TensorFlow Session.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 32 / 35

Page 43: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Constants

Placeholders: must be fed with data on execution.

Variables: a modifiable tensor that lives in TensorFlow’s graph ofinteracting operations.

Session: encapsulates the environment in which Operation objectsare executed, and Tensor objects are evaluated.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 33 / 35

Page 44: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Constants

Placeholders: must be fed with data on execution.

Variables: a modifiable tensor that lives in TensorFlow’s graph ofinteracting operations.

Session: encapsulates the environment in which Operation objectsare executed, and Tensor objects are evaluated.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 33 / 35

Page 45: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Constants

Placeholders: must be fed with data on execution.

Variables: a modifiable tensor that lives in TensorFlow’s graph ofinteracting operations.

Session: encapsulates the environment in which Operation objectsare executed, and Tensor objects are evaluated.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 33 / 35

Page 46: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Constants

Placeholders: must be fed with data on execution.

Variables: a modifiable tensor that lives in TensorFlow’s graph ofinteracting operations.

Session: encapsulates the environment in which Operation objectsare executed, and Tensor objects are evaluated.

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 33 / 35

Page 47: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 34 / 35

Page 48: Neural Networks - Deep Learning...Neural Networks - Deep Learning Arti cial Intelligence @ Allegheny College Janyl Jumadinova March 4-6, 2020 Credit: Google Workshop Janyl Jumadinova

TensorFlow

https://playground.tensorflow.org

Janyl Jumadinova Neural Networks - Deep Learning March 4-6, 2020 35 / 35