Post on 01-May-2020
Introduction to Artificial Intelligence
COMP307 Neural Network 1: perceptron
Yi Mei (yi.mei@ecs.vuw.ac.nz)
Outline
• Why ANN?
• Origin
• Perceptron
• Perceptron learning
• What the perceptron can (and cannot) learn
• Extending the perceptron
Why ANN?
• Killer applications in many areas
  – Computer vision / image processing
  – Game playing (AlphaGo, Watson, …)
  – Big data
Origin
• The human brain shows amazing capability in
  – Learning
  – Perception
  – Adaptability
  – …
• Simulate the human brain to achieve the above functionalities
• ANN models are built for this purpose
Artificial Neuron
z_j = Σ_{i=1}^{m} w_{ji} x_i + b_j

y_j = φ(z_j)
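The weighted-sum-then-activation computation above can be sketched in plain Python. The function names and example numbers here are illustrative, not from the slides:

```python
import math

def neuron_output(weights, bias, inputs, activation):
    # z_j = sum over i of w_ji * x_i, plus the bias b_j
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # y_j = phi(z_j): pass the weighted sum through the activation function
    return activation(z)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Example: 2 inputs, so 2 weights and 1 bias
y = neuron_output([0.5, -0.3], 0.1, [1.0, 2.0], sigmoid)
```

Here z = 0.5·1 − 0.3·2 + 0.1 = 0, so the sigmoid output is exactly 0.5.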
Activation Functions
Threshold:  y_j = 1 if z_j > 0, else 0

Sigmoid:    y_j = 1 / (1 + e^{−z_j})
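The two activation functions above are easy to write out directly (a minimal sketch; the names are illustrative):

```python
import math

def threshold(z):
    # Step function: outputs 1 when z_j > 0, otherwise 0
    return 1 if z > 0 else 0

def sigmoid(z):
    # Squashes z_j into the open interval (0, 1): 1 / (1 + e^(-z_j))
    return 1.0 / (1.0 + math.exp(-z))
```

The threshold gives a hard binary decision; the sigmoid is a smooth, differentiable approximation of it, which matters later for training multi-layer networks.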
Perceptron
• A special type of artificial neuron
  – Real-valued inputs
  – Binary output
  – Threshold activation function
y_j = 1 if Σ_{i=1}^{m} w_{ji} x_i + b_j > 0, else 0

Example: y = 1 if x_1 + x_2 > 0, else 0
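A perceptron's prediction rule can be sketched in a few lines of Python (function name illustrative):

```python
def perceptron_predict(weights, bias, inputs):
    # Fires (outputs 1) iff the weighted sum plus bias is strictly positive
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if z > 0 else 0

# The example y = 1 iff x1 + x2 > 0 corresponds to weights (1, 1) and bias 0
```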
Perceptron
• Performs linear classification
  – 2 inputs: a line
  – 3 inputs: a plane
• Can do online learning
  – Update w_{ji} and b_j along with new examples
y_j = 1 if Σ_{i=0}^{m} w_{ji} x_i > 0, else 0
Learning Perceptron
• How do we get the optimal weights and bias?
• Only consider accuracy
  – Optimal if 100% accuracy on the training set
  – There can be many optimal solutions
• To simplify notation, transform the bias into a weight: w_{j0} = b_j, with x_0 = 1 always
y_j = 1 if Σ_{i=1}^{m} w_{ji} x_i + b_j > 0, else 0

b_j = w_{j0} · 1 = w_{j0} x_0
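The bias-as-weight trick above amounts to prepending a constant 1 to every input vector. A small sketch (helper names are illustrative):

```python
def augment(inputs):
    # Prepend the constant x_0 = 1 so the bias b_j becomes weight w_j0
    return [1.0] + list(inputs)

def predict(weights, inputs):
    # weights[0] plays the role of the bias; no separate bias term needed
    z = sum(w * x for w, x in zip(weights, augment(inputs)))
    return 1 if z > 0 else 0
```

With this transformation, the learning rule only has to update one weight vector, with no special case for the bias.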
Learning Perceptron
• Idea
  – Initialise weights and threshold randomly (or all zeros)
  – Given a new example (x_1, x_2, …, x_m, d):
    • Input feature vector: x_1, x_2, …, x_m
    • Output (class label): d
    • Predicted output: y
  – If y = 0 and d = 1:
    • increase b = w_0, increase w_i for each positive x_i, decrease w_i for each negative x_i
  – If y = 1 and d = 0:
    • decrease b = w_0, decrease w_i for each positive x_i, increase w_i for each negative x_i
  – Repeat the process for each new example until the desired behaviour is achieved
y = 1 if Σ_{i=0}^{m} w_i x_i > 0, else 0
Learning Perceptron
• Implementation
  – Initialise weights and threshold randomly (or all zeros)
  – Given a new example (x_1, x_2, …, x_m, d):
    • Input feature vector: x_1, x_2, …, x_m
    • Output (class label): d
    • Predicted output: y
  – Update: w_i ← w_i + η(d − y)x_i,  i = 0, 1, 2, …, m
  – where η ∈ [0, 1] is called the learning rate
  – Repeat the process for each new example until the desired behaviour is achieved

y = 1 if Σ_{i=0}^{m} w_i x_i > 0, else 0
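The full learning loop above can be sketched as follows. This is a minimal illustration, not course-provided code; the learning rate and epoch cap are arbitrary choices:

```python
def train_perceptron(examples, eta=0.25, epochs=100):
    # examples: list of (feature list, desired label d) pairs.
    # Returns the augmented weight vector; w[0] is the bias (x_0 = 1).
    m = len(examples[0][0])
    w = [0.0] * (m + 1)                  # initialise to all zeros
    for _ in range(epochs):
        converged = True
        for x, d in examples:
            xa = [1.0] + list(x)         # augmented input
            y = 1 if sum(wi * xi for wi, xi in zip(w, xa)) > 0 else 0
            if y != d:
                converged = False
                # Update rule: w_i <- w_i + eta * (d - y) * x_i
                w = [wi + eta * (d - y) * xi for wi, xi in zip(w, xa)]
        if converged:                    # 100% training accuracy reached
            break
    return w

# AND is linearly separable, so the algorithm converges on it
and_data = [([0, 0], 0), ([0, 1], 0), ([1, 0], 0), ([1, 1], 1)]
w = train_perceptron(and_data)
```

Note that (d − y) is +1 on a false negative and −1 on a false positive, which reproduces exactly the increase/decrease cases from the previous slide.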
Problem with Perceptron
• What can the perceptron learn?
• Perceptron convergence theorem: the perceptron learning algorithm will converge if and only if the training set is linearly separable.
• It cannot learn XOR (Minsky and Papert, 1969)
Multi-Layer Perceptron
• Add one hidden node between the inputs and the output

XOR truth table:
x1 | x2 | y (class)
 0 |  0 | 0
 1 |  0 | 1
 0 |  1 | 1
 1 |  1 | 0

Threshold unit: y = 1 if Σ_i w_i x_i − T > 0, else 0
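One way to realise this with a single hidden threshold node is to let the hidden node compute AND(x1, x2) and subtract it from an OR-like sum. The specific weights and thresholds below are one hand-picked solution, not taken from the slides:

```python
def step(z):
    return 1 if z > 0 else 0

def xor_net(x1, x2):
    # Hidden node: computes AND(x1, x2) - fires only when both inputs are 1
    h = step(x1 + x2 - 1.5)
    # Output node: OR-like sum x1 + x2, suppressed by the AND hidden node
    return step(x1 + x2 - 2 * h - 0.5)
```

The hidden node carves out the (1, 1) corner that makes XOR non-linearly-separable, which is exactly what a single perceptron cannot do.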
Neural Network
• Add more hidden layers
• Add more nodes to the hidden layers
COMP307 ML5 (Neural Networks):
Multilayer Perceptron (Neural Networks)
• Change one or two layers of nodes to three or more layers
– Multilayer perceptron (MLP)
– Feed forward neural networks
– Standard feed forward networks: nodes in neighbouring layers are fully connected

[Figure: Input layer → Hidden layer → Hidden layer → Output layer]
– Input layer: input patterns/features
– Output layer: output patterns/class labels
– Hidden layer(s): high level features
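A forward pass through such a fully connected feed-forward network can be sketched as below. The layer sizes and weight values are arbitrary placeholders for illustration:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(weights, biases, inputs):
    # Fully connected: each node combines every output of the previous layer
    return [sigmoid(sum(w * x for w, x in zip(node_w, inputs)) + b)
            for node_w, b in zip(weights, biases)]

def mlp_forward(layers, inputs):
    # layers: one (weights, biases) pair per layer, applied input -> output
    a = inputs
    for weights, biases in layers:
        a = layer_forward(weights, biases, a)
    return a

# 2 inputs -> hidden layer of 2 nodes -> 1 output node
net = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.1, -0.2]),
    ([[1.0, -1.0]], [0.0]),
]
out = mlp_forward(net, [1.0, 0.5])
```

The input layer holds the raw features, each hidden layer computes higher-level features from the layer before it, and the output layer produces the class label(s).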
Summary
• ANN has many killer applications
  – Image analysis, game playing, …
• Perceptron – the simplest neural network
• How to learn a perceptron
• Limitations of the perceptron, and general neural networks
• Reading: textbook Section 20.5 (2nd edition) or Section 18.7 (3rd edition), or Web materials