Neural Networks: Backpropagation algorithm

Data Mining and Semantic Web

University of Belgrade, School of Electrical Engineering, Chair of Computer Engineering and Information Theory

Miroslav Tišma

You see this:

But the camera sees this:

What is this?

23.12.2011. Miroslav Tišma 2/21

Computer Vision: Car detection

Testing:

What is this?

Cars

Not a car


[Figure: cars and “non”-cars from the raw images, plotted by pixel 1 intensity against pixel 2 intensity]

50 x 50 pixel images → 2500 pixels (7500 if RGB)

$x = [\text{pixel 1 intensity},\ \text{pixel 2 intensity},\ \dots,\ \text{pixel 2500 intensity}]^T$

Quadratic features ($x_i \times x_j$): ≈3 million features
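The "≈3 million" figure can be checked with a short calculation. This is a sketch that assumes the quadratic features are all distinct products $x_i x_j$ with $i \le j$; the function name is hypothetical.

```python
def quadratic_feature_count(n):
    """Number of distinct products x_i * x_j with i <= j for n raw features."""
    return n * (n + 1) // 2

grayscale = quadratic_feature_count(50 * 50)      # 2500 pixel intensities
rgb = quadratic_feature_count(50 * 50 * 3)        # 7500 channel intensities

print(grayscale)  # 3126250, i.e. about 3 million
print(rgb)        # 28128750
```

Even for small grayscale images, the feature count grows quadratically with the number of pixels.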

[Figure: raw image → learning algorithm; plot axes: pixel 1, pixel 2]


Neural Networks

• Origins: Algorithms that try to mimic the brain

• Were very widely used in the 1980s and early 1990s; popularity diminished in the late 1990s.

• Recent resurgence: State-of-the-art technique for many applications


Neurons in the brain

Dendr(I)tes: the neuron's (I)nput wires

Ax(O)n: the neuron's (O)utput wire


Neuron model: Logistic unit

Sigmoid (logistic) activation function.

$h_\Theta(x) = \frac{1}{1 + e^{-\Theta^T x}}$

$g(z) = \frac{1}{1 + e^{-z}}$

“bias unit”

“output”

“input wires”

“weights” - parameters
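A minimal sketch of the logistic unit, assuming the bias unit is a constant input $x_0 = 1$ whose weight comes first in the weight list. The names `sigmoid` and `logistic_unit` are illustrative.

```python
import math

def sigmoid(z):
    """Logistic activation g(z) = 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def logistic_unit(theta, x):
    """Output h_Theta(x) = g(Theta^T x) of a single neuron.

    theta: weights [theta_0, theta_1, ..., theta_n]; theta_0 multiplies the bias unit
    x:     inputs [x_1, ..., x_n]; the bias unit x_0 = 1 is prepended here
    """
    inputs = [1.0] + list(x)
    z = sum(t * xi for t, xi in zip(theta, inputs))
    return sigmoid(z)

print(sigmoid(0.0))                                 # 0.5
print(logistic_unit([0.0, 1.0, -1.0], [2.0, 2.0]))  # z = 0, so 0.5
```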


Neural Network

Layer 1: “input layer”
Layer 2: “hidden layer”
Layer 3: “output layer”

“bias unit”

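Neural Network

$a_i^{(l)}$ = “activation” of unit $i$ in layer $l$

$\Theta^{(l)}$ = matrix of weights controlling the function mapping from layer $l$ to layer $l+1$

If the network has $s_l$ units in layer $l$ and $s_{l+1}$ units in layer $l+1$, then $\Theta^{(l)}$ will be of dimension $s_{l+1} \times (s_l + 1)$.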
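The dimension rule can be illustrated with one layer-to-layer step. This sketch assumes each row of the weight matrix belongs to one unit of the next layer and that the bias weight comes first in the row; the sizes and weights are made up.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(theta, a_prev):
    """Map the activations of one layer to the activations of the next.

    theta:  one row per unit of the next layer, each with (units in this layer + 1) weights
    a_prev: activations of this layer, without the bias unit
    """
    with_bias = [1.0] + list(a_prev)
    for row in theta:
        if len(row) != len(with_bias):
            raise ValueError("each row of theta needs one weight per unit plus one for the bias")
    return [sigmoid(sum(w * a for w, a in zip(row, with_bias))) for row in theta]

# Hypothetical sizes: 3 input units and 2 hidden units, so the matrix is 2 x (3 + 1).
theta1 = [[0.1, 0.2, -0.3, 0.4],
          [-0.5, 0.6, 0.7, -0.8]]
a2 = layer_forward(theta1, [1.0, 0.0, 1.0])
print(len(theta1), len(theta1[0]))  # 2 4
print(len(a2))                      # 2
```

The extra column holds the bias weights, which is why the second dimension is one larger than the number of units in the source layer.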


Simple example: AND

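Weights: -30 (bias unit), +20 ($x_1$), +20 ($x_2$)

$h_\Theta(x) = g(-30 + 20 x_1 + 20 x_2)$

x1   x2   h_Θ(x)
0    0    g(-30) ≈ 0
0    1    g(-10) ≈ 0
1    0    g(-10) ≈ 0
1    1    g(+10) ≈ 1

$h_\Theta(x) \approx x_1 \text{ AND } x_2$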
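The arithmetic behind the AND unit can be reproduced directly. A small sketch using the weights from the slide; the function name `and_unit` is illustrative.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def and_unit(x1, x2):
    """h_Theta(x) = g(-30 + 20*x1 + 20*x2), using the weights from the slide."""
    return sigmoid(-30 + 20 * x1 + 20 * x2)

# Print the weighted sum z and the unit's output for every input pair.
for x1 in (0, 1):
    for x2 in (0, 1):
        z = -30 + 20 * x1 + 20 * x2
        print(x1, x2, z, and_unit(x1, x2))
```

Only the input (1, 1) gives a positive weighted sum, so only that row produces an output close to 1.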

Example: OR function

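Weights: -10 (bias unit), +20 ($x_1$), +20 ($x_2$)

$h_\Theta(x) = g(-10 + 20 x_1 + 20 x_2)$

x1   x2   h_Θ(x)
0    0    g(-10) ≈ 0
0    1    g(+10) ≈ 1
1    0    g(+10) ≈ 1
1    1    g(+30) ≈ 1

$h_\Theta(x) \approx x_1 \text{ OR } x_2$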
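Compared with AND, only the bias weight changes. The sketch below thresholds the output at 0.5 (the threshold is an assumption, not from the slides) and compares it with Python's own `or`.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def or_unit(x1, x2):
    """h_Theta(x) = g(-10 + 20*x1 + 20*x2), using the weights from the slide."""
    return sigmoid(-10 + 20 * x1 + 20 * x2)

# Threshold the output at 0.5 (an assumed decision rule) for every input pair.
truth_table = [(x1, x2, or_unit(x1, x2) >= 0.5) for x1 in (0, 1) for x2 in (0, 1)]
for x1, x2, predicted in truth_table:
    print(x1, x2, predicted, predicted == bool(x1 or x2))
```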


Multiple output units: One-vs-all.

Pedestrian Car Motorcycle Truck

Want $h_\Theta(x) \approx [1\ 0\ 0\ 0]^T$ when pedestrian, $h_\Theta(x) \approx [0\ 1\ 0\ 0]^T$ when car, $h_\Theta(x) \approx [0\ 0\ 1\ 0]^T$ when motorcycle, etc.
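A minimal sketch of the one-vs-all target encoding. The class order follows the slide; picking the class with the largest output is an assumed prediction rule, and the function names are hypothetical.

```python
CLASSES = ["pedestrian", "car", "motorcycle", "truck"]

def one_hot(label):
    """Target vector y with a 1 at the label's position and 0 elsewhere."""
    return [1 if name == label else 0 for name in CLASSES]

def predict(h):
    """Pick the class whose output unit has the largest activation (assumed rule)."""
    best = max(range(len(h)), key=lambda k: h[k])
    return CLASSES[best]

print(one_hot("car"))                     # [0, 1, 0, 0]
print(predict([0.05, 0.10, 0.80, 0.20]))  # motorcycle
```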

Neural Network (Classification)

Binary classification

1 output unit

Layer 1 Layer 2 Layer 3 Layer 4

Multi-class classification (K classes)

K output units

$L$ = total no. of layers in network

$s_l$ = no. of units (not counting bias unit) in layer $l$

E.g. $y = [1\ 0\ 0\ 0]^T$ (pedestrian), $[0\ 1\ 0\ 0]^T$ (car), $[0\ 0\ 1\ 0]^T$ (motorcycle), $[0\ 0\ 0\ 1]^T$ (truck)


Cost function

Logistic regression:


Neural network:
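A minimal sketch of both costs, assuming the standard unregularized cross-entropy form $J = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K} \left[ y_k^{(i)} \log h_k^{(i)} + (1 - y_k^{(i)}) \log(1 - h_k^{(i)}) \right]$, with $m$ training examples and $K$ output units ($K = 1$ for logistic regression). Any regularization term is left out, and the function names are hypothetical.

```python
import math

def logistic_cost(predictions, labels):
    """Average cross-entropy for scalar outputs h in (0, 1) and labels y in {0, 1}."""
    m = len(labels)
    total = 0.0
    for h, y in zip(predictions, labels):
        total += y * math.log(h) + (1 - y) * math.log(1 - h)
    return -total / m

def network_cost(predictions, labels):
    """Same cost summed over K output units; each prediction and label is a length-K list."""
    m = len(labels)
    total = 0.0
    for h_vec, y_vec in zip(predictions, labels):
        for h, y in zip(h_vec, y_vec):
            total += y * math.log(h) + (1 - y) * math.log(1 - h)
    return -total / m

# With K = 1 the network cost reduces to the logistic regression cost.
h = [0.9, 0.2, 0.6]
y = [1, 0, 1]
print(logistic_cost(h, y))
print(network_cost([[v] for v in h], [[v] for v in y]))
```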

Gradient computation

Need code to compute:
- $J(\Theta)$
- $\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta)$


Our goal is to minimize the cost function

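Given one training example $(x, y)$:

Forward propagation:

$a^{(1)} = x$
$z^{(2)} = \Theta^{(1)} a^{(1)}$
$a^{(2)} = g(z^{(2)})$ (add $a_0^{(2)}$)
$z^{(3)} = \Theta^{(2)} a^{(2)}$
$a^{(3)} = g(z^{(3)})$ (add $a_0^{(3)}$)
$z^{(4)} = \Theta^{(3)} a^{(3)}$
$a^{(4)} = h_\Theta(x) = g(z^{(4)})$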
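Forward propagation through a four-layer network can be sketched as a loop over the weight matrices. The layer sizes and weights below are made up, and the bias unit is assumed to be added before each matrix multiplication.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward_propagation(thetas, x):
    """Return the activations [a1, a2, ..., aL] for one input x.

    Each entry of thetas holds the weights from one layer to the next; the first
    weight in each row multiplies the bias unit, which is added here and not stored.
    """
    activations = [list(x)]
    for theta in thetas:
        with_bias = [1.0] + activations[-1]
        z = [sum(w * a for w, a in zip(row, with_bias)) for row in theta]
        activations.append([sigmoid(v) for v in z])
    return activations

# Hypothetical 4-layer network with 2, 2, 2 and 1 units; the weights are made up.
thetas = [
    [[0.1, 0.3, -0.2], [-0.4, 0.5, 0.6]],   # layer 1 -> layer 2: 2 x 3
    [[0.2, -0.1, 0.4], [0.7, 0.3, -0.5]],   # layer 2 -> layer 3: 2 x 3
    [[-0.3, 0.8, 0.1]],                     # layer 3 -> layer 4: 1 x 3
]
a1, a2, a3, a4 = forward_propagation(thetas, [1.0, 0.0])
print(a4)  # the network output, a single value between 0 and 1
```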


Backpropagation algorithm

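Intuition: $\delta_j^{(l)}$ = “error” of node $j$ in layer $l$.

For each output unit (layer $L = 4$):

$\delta_j^{(4)} = a_j^{(4)} - y_j$, where $a_j^{(4)} = (h_\Theta(x))_j$

$\delta^{(3)} = (\Theta^{(3)})^T \delta^{(4)} \odot g'(z^{(3)})$

$\delta^{(2)} = (\Theta^{(2)})^T \delta^{(3)} \odot g'(z^{(2)})$

Here $\odot$ is the element-wise multiplication operator and $z^{(l)}$ is the weighted input to layer $l$, so that $a^{(l)} = g(z^{(l)})$. The derivative of the activation function can be written as $g'(z^{(l)}) = a^{(l)} \odot (1 - a^{(l)})$.

$\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta) = a_j^{(l)} \delta_i^{(l+1)}$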
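The identity for the sigmoid derivative and one backward step can be checked numerically. This is a sketch under the assumption that the bias weight sits in column 0 of each weight matrix and that no error term is computed for the bias unit; the helper names are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_gradient(a):
    """g'(z) expressed through the activation a = g(z): a * (1 - a)."""
    return a * (1.0 - a)

def hidden_delta(theta, delta_next, a):
    """Errors of a hidden layer, computed from the errors of the layer after it.

    theta:      weights from this layer to the next, bias weight first in each row
    delta_next: errors of the next layer
    a:          activations of this layer, without the bias unit
    """
    delta = []
    for j, a_j in enumerate(a):
        # Column j + 1 of theta holds the weights leaving unit j (column 0 is the bias).
        back = sum(row[j + 1] * d for row, d in zip(theta, delta_next))
        delta.append(back * sigmoid_gradient(a_j))
    return delta

# Compare the closed form with a central finite difference at z = 0.7.
z, eps = 0.7, 1e-6
numeric = (sigmoid(z + eps) - sigmoid(z - eps)) / (2 * eps)
print(abs(numeric - sigmoid_gradient(sigmoid(z))) < 1e-8)  # True

print(hidden_delta([[0.5, 2.0, -1.0]], [0.1], [0.5, 0.5]))
```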


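Backpropagation algorithm

Training set $\{(x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)})\}$

Set $\Delta_{ij}^{(l)} = 0$ (for all $l, i, j$). $\Delta$ is used to compute $\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta)$.

For $i = 1$ to $m$:
- Set $a^{(1)} = x^{(i)}$
- Perform forward propagation to compute $a^{(l)}$ for $l = 2, 3, \dots, L$
- Using $y^{(i)}$, compute $\delta^{(L)} = a^{(L)} - y^{(i)}$
- Compute $\delta^{(L-1)}, \delta^{(L-2)}, \dots, \delta^{(2)}$
- $\Delta_{ij}^{(l)} := \Delta_{ij}^{(l)} + a_j^{(l)} \delta_i^{(l+1)}$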
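The loop over the training set can be sketched in plain Python. Assumptions: the cost is the unregularized cross-entropy, the accumulated $\Delta$ is divided by $m$ to obtain the gradient, and the bias weight is the first entry in each row. All names and numbers are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(thetas, x):
    """Activations of every layer for input x (bias units are added on the fly)."""
    acts = [list(x)]
    for theta in thetas:
        with_bias = [1.0] + acts[-1]
        acts.append([sigmoid(sum(w * a for w, a in zip(row, with_bias))) for row in theta])
    return acts

def cost(thetas, X, Y):
    """Unregularized cross-entropy cost, averaged over the m examples."""
    total = 0.0
    for x, y in zip(X, Y):
        h = forward(thetas, x)[-1]
        total += sum(yk * math.log(hk) + (1 - yk) * math.log(1 - hk) for hk, yk in zip(h, y))
    return -total / len(X)

def backprop_gradients(thetas, X, Y):
    """Return D, where D[l][i][j] is the partial derivative of the cost w.r.t. thetas[l][i][j]."""
    # Start every accumulator at zero.
    Delta = [[[0.0] * len(row) for row in theta] for theta in thetas]
    for x, y in zip(X, Y):
        acts = forward(thetas, x)
        # Output layer error: activation minus target.
        delta = [a - yk for a, yk in zip(acts[-1], y)]
        # Walk backwards over the weight matrices.
        for l in range(len(thetas) - 1, -1, -1):
            with_bias = [1.0] + acts[l]
            for i, d in enumerate(delta):
                for j, a in enumerate(with_bias):
                    Delta[l][i][j] += a * d
            if l > 0:
                # Error of the previous layer (bias column skipped), using g'(z) = a * (1 - a).
                delta = [
                    sum(thetas[l][i][j + 1] * d for i, d in enumerate(delta)) * a * (1 - a)
                    for j, a in enumerate(acts[l])
                ]
    m = len(X)
    return [[[v / m for v in row] for row in layer] for layer in Delta]

# Hypothetical 2-2-1 network and a two-example training set; all numbers are made up.
thetas = [[[0.1, 0.3, -0.2], [-0.4, 0.5, 0.6]], [[-0.3, 0.8, 0.1]]]
X, Y = [[1.0, 0.0], [0.0, 1.0]], [[1], [0]]
D = backprop_gradients(thetas, X, Y)

# Gradient check on one weight with a central finite difference.
eps = 1e-5
thetas[0][1][2] += eps
plus = cost(thetas, X, Y)
thetas[0][1][2] -= 2 * eps
minus = cost(thetas, X, Y)
thetas[0][1][2] += eps
print(abs((plus - minus) / (2 * eps) - D[0][1][2]) < 1e-7)  # True
```

Comparing against a finite difference is a common way to catch indexing mistakes in a backpropagation implementation before using the gradients for minimization.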


Advantages:
- Relatively simple implementation
- Standard method that generally works well
- Many practical applications:
  * handwriting recognition
  * autonomous driving

Disadvantages:
- Slow and inefficient
- Can get stuck in local minima, resulting in sub-optimal solutions


Literature:

- http://en.wikipedia.org/wiki/Backpropagation
- http://www.ml-class.org
- http://home.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html



Thank you for your attention!
