neural networkshic/cs7616/pdf/lecture10.pdf · introduction neural networks - architecture network...

28
Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary Neural Networks Henrik I Christensen Robotics & Intelligent Machines @ GT Georgia Institute of Technology, Atlanta, GA 30332-0280 [email protected] Henrik I Christensen (RIM@GT) Neural Networks 1 / 28

Upload: others

Post on 07-Jul-2020

7 views

Category:

Documents


1 download

TRANSCRIPT

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Neural Networks

    Henrik I Christensen

    Robotics & Intelligent Machines @ GTGeorgia Institute of Technology,

    Atlanta, GA [email protected]

    Henrik I Christensen (RIM@GT) Neural Networks 1 / 28

    mailto:[email protected]

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Outline

    1 Introduction

    2 Neural Networks - Architecture

    3 Network Training

    4 Small Example - ZIP Codes

    5 Summary

    Henrik I Christensen (RIM@GT) Neural Networks 2 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Introduction

    Initial motivation for design from modelling of neural systems

    Perceptrons emerged about same time as we started to have realneural data

    Studies of functional specialization in the brain

    Henrik I Christensen (RIM@GT) Neural Networks 3 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Neurons - the motivation

    Henrik I Christensen (RIM@GT) Neural Networks 4 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Neural Code Example

    Henrik I Christensen (RIM@GT) Neural Networks 5 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Outline

    Outline of ANN architecture

    Formulation of the criteria function

    Optimization of weights

    Example from image analysis

    Next time: Bayesian Neural Networks

    Henrik I Christensen (RIM@GT) Neural Networks 6 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Outline

    1 Introduction

    2 Neural Networks - Architecture

    3 Network Training

    4 Small Example - ZIP Codes

    5 Summary

    Henrik I Christensen (RIM@GT) Neural Networks 7 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Data Process w. Two-Layer Neural Network

    wTx h(.) wTz σ(x)

    Henrik I Christensen (RIM@GT) Neural Networks 8 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Neural Net Architecture as a Graph

    x0

    x1

    xD

    z0

    z1

    zM

    y1

    yK

    w(1)MD w

    (2)KM

    w(2)10

    hidden units

    inputs outputs

    Henrik I Christensen (RIM@GT) Neural Networks 9 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Neural Network Equations

    Consider an input layer

    aj =D∑

    i=0

    w(1)ji xi

    where wj0 and x0 represent the bias weight / term

    The activation, aj , is mapped by an activation function

    zj = h(aj)

    which typically is a Sigmoid or tanh

    The output is considered the hidden activations

    Output unit activations are computed, similarly

    ak =M∑

    j=0

    w(2)kj zj

    Henrik I Christensen (RIM@GT) Neural Networks 10 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Sigmoid - 1/(1 + exp(−v))

    −10 −8 −6 −4 −2 0 2 4 6 8 100

    0.1

    0.2

    0.3

    0.4

    0.5

    0.6

    0.7

    0.8

    0.9

    1

    value

    sigm

    oid

    Sigmoid

    Henrik I Christensen (RIM@GT) Neural Networks 11 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Neural Networks - A few more details

    The full system is then

    yk(x ,w) = σ

    M∑j=0

    w(2)kj h

    (D∑

    i=0

    w(1)ji xi

    )The information is flowing “forward” through the system

    Naming is sometimes complicated!

    3-layer networksingle-hidden-layer networktwo-layer network (input/output)

    Henrik I Christensen (RIM@GT) Neural Networks 12 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Outline

    1 Introduction

    2 Neural Networks - Architecture

    3 Network Training

    4 Small Example - ZIP Codes

    5 Summary

    Henrik I Christensen (RIM@GT) Neural Networks 13 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Training Neural Networks

    For optimization we consider the error function:

    E (w) =N∑

    n=1

    ||y(xn,w)− tn||2

    The optimization is similar to earlier searches

    Objective∇E (w) = 0

    Due to non-linearity closed form solution is a challenge

    Newton-Raphson type solutions are possible

    ∆w = −H−1∇Ew

    Often an iterated solution is realistic

    w (τ+1) = w (τ) − η∇E (w (τ))

    Henrik I Christensen (RIM@GT) Neural Networks 14 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Error Backpropagation

    Consider the error composed of parts

    E (w) =N∑

    n=1

    En(w)

    Considering errors by parts we get

    yk =∑

    i

    wkixi

    with the error

    En =1

    2

    ∑k

    (ynk − tnk)2

    the associated gradient is

    ∂En∂wji

    = (ynj − tnj)xni

    Henrik I Christensen (RIM@GT) Neural Networks 15 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Computing gradients

    Givenaj =

    ∑i

    wjizi

    andzj = h(aj)

    The gradient is (using chain rule)

    ∂En∂wji

    =∂En∂aj

    ∂aj∂wji

    We already know∂En∂aj

    = (yk − tj) = δj

    and∂aj∂wji

    = zi

    Henrik I Christensen (RIM@GT) Neural Networks 16 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Updating of weights

    Updating backwards in the systems

    zi

    zj

    δjδk

    δ1

    wji wkj

    Error Propagation

    δj = h′(aj)

    ∑k

    wkjδk

    Henrik I Christensen (RIM@GT) Neural Networks 17 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Update Algorithm

    1 Enter a training sample xn, propagate and compare to expected valuetn, y(xn)

    2 Evaluate δk at all outputs

    3 Backpropagate δ to correct hidden unit weights

    4 Evaluate derivatives to correct input level weights

    Henrik I Christensen (RIM@GT) Neural Networks 18 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Issues related to training of networks

    The Sigmoid is “linear” at 0 so random values around 0 is a goodstart.

    Be aware that training a network too much could result in over fitting

    There can be multiple hidden layers

    Henrik I Christensen (RIM@GT) Neural Networks 19 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Outline

    1 Introduction

    2 Neural Networks - Architecture

    3 Network Training

    4 Small Example - ZIP Codes

    5 Summary

    Henrik I Christensen (RIM@GT) Neural Networks 20 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Small Example

    From (Le Cun 1989) on state of the art of ANN’s for recognition

    Recognition of handwritten characters has been widely studied

    Still considered an important benchmark for new recognition methods

    Henrik I Christensen (RIM@GT) Neural Networks 21 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    ZIP code data

    Data normalized to 16x16 pixels

    320 digits in training set and 160 digits in test set

    Henrik I Christensen (RIM@GT) Neural Networks 22 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Different types of networks

    No hidden layer - pure 1 level regression

    1 hidden layer with 12 hidden units - fully connected

    2 hidden layers and local connectivity

    2 hidden layers, locally connected and weight sharing

    2 hidden layers, locally connected and 2 level weight sharing

    Henrik I Christensen (RIM@GT) Neural Networks 23 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Example - Net Architectures

    Henrik I Christensen (RIM@GT) Neural Networks 24 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Example Results

    Henrik I Christensen (RIM@GT) Neural Networks 25 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Example - Summary

    Careful design of network architectures is important

    Neural Networks offer a rich variety of solutions

    Later results have shown improved performance with SVN’s

    Henrik I Christensen (RIM@GT) Neural Networks 26 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Outline

    1 Introduction

    2 Neural Networks - Architecture

    3 Network Training

    4 Small Example - ZIP Codes

    5 Summary

    Henrik I Christensen (RIM@GT) Neural Networks 27 / 28

  • Introduction Neural Networks - Architecture Network Training Small Example - ZIP Codes Summary

    Summary

    Neural networks are general approximators

    Useful both for regression and discrimination

    Some would term them - “self-parameterized lookup tables”

    There is a rich community engaged in design of systems

    Rich variety of optimization techniques

    Henrik I Christensen (RIM@GT) Neural Networks 28 / 28

    IntroductionNeural Networks - ArchitectureNetwork TrainingSmall Example - ZIP CodesSummary