Neural Networks: Backpropagation algorithm. Data Mining and Semantic Web, University of Belgrade, School of Electrical Engineering, Chair of Computer Engineering and Information Theory. Miroslav Tišma (tisma.etf@gmail.com)


Page 1: Data Mining and Semantic Web

Neural Networks: Backpropagation algorithm

Data Mining and Semantic Web

University of Belgrade
School of Electrical Engineering
Chair of Computer Engineering and Information Theory

Miroslav Tišma
tisma.etf@gmail.com

Page 2: Data Mining and Semantic Web

You see this:

But the camera sees this:

What is this?

23.12.2011. Miroslav Tišma 2/21

Page 3: Data Mining and Semantic Web

Computer Vision: Car detection

Cars

Not a car

Testing:

What is this?

Page 4: Data Mining and Semantic Web

[Figure: a raw image is passed to a learning algorithm; scatter plot of cars and "non"-cars with axes "pixel 1" and "pixel 2"]

50 × 50 pixel images → 2500 pixels (7500 if RGB)

$x = [\text{pixel 1 intensity},\ \text{pixel 2 intensity},\ \ldots,\ \text{pixel 2500 intensity}]^T$

Quadratic features ($x_i \times x_j$): ≈ 3 million features
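The "≈ 3 million" figure can be checked with a short count. This is a minimal sketch, assuming "quadratic features" means all distinct products of two pixel intensities, squares included; the function name is hypothetical.

```python
def count_quadratic_features(n: int) -> int:
    """Number of distinct products x_i * x_j with i <= j for n raw features."""
    return n * (n + 1) // 2


# 50 x 50 grayscale image: 2500 raw features.
grayscale = count_quadratic_features(50 * 50)
# Same image in RGB: 7500 raw features.
rgb = count_quadratic_features(50 * 50 * 3)

print(grayscale)  # 3126250
print(rgb)        # 28128750
```

The count grows quadratically with the number of pixels, which is why a feature-engineering approach quickly becomes impractical for images.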

Page 5: Data Mining and Semantic Web

Neural Networks

• Origins: Algorithms that try to mimic the brain

• Were very widely used in the 1980s and early 1990s; popularity diminished in the late 1990s.

• Recent resurgence: State-of-the-art technique for many applications


Page 6: Data Mining and Semantic Web

Neurons in the brain

Dendrites (inputs)

Axon (output)

Page 7: Data Mining and Semantic Web

Neuron model: Logistic unit

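[Figure labels: "input wires" and the "bias unit" feed the unit; the "weights" $\Theta$ are its parameters; the unit produces the "output" $h_\Theta(x)$]

$$h_\Theta(x) = \frac{1}{1 + e^{-\Theta^T x}}$$

Sigmoid (logistic) activation function:

$$g(z) = \frac{1}{1 + e^{-z}}$$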
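A single logistic unit can be sketched in a few lines. This is only an illustration, assuming the bias weight is stored first in the weight list and that the bias input equals 1; the function names are hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function g(z) = 1 / (1 + e^(-z)), written to avoid overflow."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # safe: z < 0, so e is in (0, 1)
    return e / (1.0 + e)


def logistic_unit(theta, x):
    """Output of one unit: g(theta_0 * 1 + theta_1 * x_1 + ... + theta_n * x_n)."""
    z = theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))
    return sigmoid(z)


print(sigmoid(0.0))                                  # 0.5
print(logistic_unit([0.0, 1.0, -1.0], [2.0, 2.0]))   # 0.5, since z = 0
```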

Page 8: Data Mining and Semantic Web

Neural Network

Layer 1: "input layer"
Layer 2: "hidden layer"
Layer 3: "output layer"

"bias unit"

Page 9: Data Mining and Semantic Web

Neural Network

$a_i^{(j)}$ = "activation" of unit $i$ in layer $j$

$\Theta^{(j)}$ = matrix of weights controlling the function mapping from layer $j$ to layer $j+1$

If the network has $s_j$ units in layer $j$ and $s_{j+1}$ units in layer $j+1$, then $\Theta^{(j)}$ will be of dimension $s_{j+1} \times (s_j + 1)$.
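The dimension rule is easy to get wrong by one, so a small helper may be useful. The sketch below assumes the usual convention that each weight matrix has one row per unit in the next layer and one column per unit in the current layer plus one for the bias unit; `theta_shapes` is a hypothetical name.

```python
def theta_shapes(layer_sizes):
    """Shapes (rows, cols) of the weight matrices for the given layer sizes.

    layer_sizes[j] is the number of units in layer j+1, not counting the bias unit.
    """
    return [
        (layer_sizes[j + 1], layer_sizes[j] + 1)  # +1 column for the bias unit
        for j in range(len(layer_sizes) - 1)
    ]


# Three inputs, three hidden units, one output unit.
print(theta_shapes([3, 3, 1]))  # [(3, 4), (1, 4)]
```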

Page 10: Data Mining and Semantic Web

Simple example: AND

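Weights: −30 (bias unit), +20 ($x_1$), +20 ($x_2$)

$$h_\Theta(x) = g(-30 + 20x_1 + 20x_2)$$

x1  x2  h_Θ(x)
0   0   g(−30) ≈ 0
0   1   g(−10) ≈ 0
1   0   g(−10) ≈ 0
1   1   g(10) ≈ 1

$$h_\Theta(x) \approx x_1 \text{ AND } x_2$$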

Page 11: Data Mining and Semantic Web

Example: OR function

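Weights: −10 (bias unit), +20 ($x_1$), +20 ($x_2$)

$$h_\Theta(x) = g(-10 + 20x_1 + 20x_2)$$

x1  x2  h_Θ(x)
0   0   g(−10) ≈ 0
0   1   g(10) ≈ 1
1   0   g(10) ≈ 1
1   1   g(30) ≈ 1

$$h_\Theta(x) \approx x_1 \text{ OR } x_2$$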
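Both gates can be checked numerically with the weights given on the slides (−30, +20, +20 for AND and −10, +20, +20 for OR). The sketch below is illustrative only; the helper names and the rounding of the output to 0 or 1 are assumptions made for readability.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function g(z), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


# (bias weight, weight for x1, weight for x2), taken from the slides.
GATES = {"AND": (-30.0, 20.0, 20.0), "OR": (-10.0, 20.0, 20.0)}


def truth_table(name):
    """Rows (x1, x2, rounded output) for the named gate."""
    bias, w1, w2 = GATES[name]
    return [
        (x1, x2, round(sigmoid(bias + w1 * x1 + w2 * x2)))
        for x1 in (0, 1)
        for x2 in (0, 1)
    ]


for name in GATES:
    print(name, truth_table(name))
```

Only the bias weight differs between the two units: it sets how many inputs must be on before the weighted sum becomes positive.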

Page 12: Data Mining and Semantic Web

Multiple output units: One-vs-all.

Pedestrian Car Motorcycle Truck

Want $h_\Theta(x) \approx [1\ 0\ 0\ 0]^T$ when pedestrian, $h_\Theta(x) \approx [0\ 1\ 0\ 0]^T$ when car, $h_\Theta(x) \approx [0\ 0\ 1\ 0]^T$ when motorcycle, etc.
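One way to read the one-vs-all setup: each class gets its own output unit, the target is a vector with a single 1, and the predicted class is the unit with the largest output. The sketch below assumes that reading; the class order and function names are illustrative.

```python
CLASSES = ["pedestrian", "car", "motorcycle", "truck"]


def one_hot(label):
    """Target vector with a 1 in the position of the given class."""
    return [1 if c == label else 0 for c in CLASSES]


def predict(outputs):
    """Class whose output unit has the highest activation."""
    best = max(range(len(outputs)), key=lambda k: outputs[k])
    return CLASSES[best]


print(one_hot("car"))                    # [0, 1, 0, 0]
print(predict([0.1, 0.2, 0.9, 0.05]))    # motorcycle
```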

Page 13: Data Mining and Semantic Web

Neural Network (Classification)

$L$ = total no. of layers in network

$s_l$ = no. of units (not counting bias unit) in layer $l$

[Figure: network with Layer 1, Layer 2, Layer 3, Layer 4]

Binary classification

$y \in \{0, 1\}$

1 output unit

Multi-class classification (K classes)

$y \in \mathbb{R}^K$, e.g. $[1\ 0\ 0\ 0]^T$ (pedestrian), $[0\ 1\ 0\ 0]^T$ (car), $[0\ 0\ 1\ 0]^T$ (motorcycle), $[0\ 0\ 0\ 1]^T$ (truck)

K output units

Page 14: Data Mining and Semantic Web

Cost function

Logistic regression:


Neural network:
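For reference, the two costs are commonly written as below. This is a sketch under assumptions: it uses the regularized cross-entropy form, with $m$ training examples, $n$ features, regularization strength $\lambda$, $K$ output units, and $s_l$ units in layer $l$; the regularization terms in particular are an assumption and can be dropped by setting $\lambda = 0$.

```latex
% Logistic regression (regularized cross-entropy)
J(\theta) = -\frac{1}{m} \sum_{i=1}^{m}
  \Big[ y^{(i)} \log h_\theta(x^{(i)})
      + (1 - y^{(i)}) \log\big(1 - h_\theta(x^{(i)})\big) \Big]
  + \frac{\lambda}{2m} \sum_{j=1}^{n} \theta_j^2

% Neural network with K output units: the same cost summed over the outputs
J(\Theta) = -\frac{1}{m} \sum_{i=1}^{m} \sum_{k=1}^{K}
  \Big[ y_k^{(i)} \log \big(h_\Theta(x^{(i)})\big)_k
      + (1 - y_k^{(i)}) \log\Big(1 - \big(h_\Theta(x^{(i)})\big)_k\Big) \Big]
  + \frac{\lambda}{2m} \sum_{l=1}^{L-1} \sum_{i=1}^{s_l} \sum_{j=1}^{s_{l+1}}
    \big(\Theta_{ji}^{(l)}\big)^2
```

Here $\big(h_\Theta(x)\big)_k$ denotes the $k$-th output of the network, and the bias weights are left out of the regularization sums.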

Page 15: Data Mining and Semantic Web

Gradient computation

Our goal is to minimize the cost function: $\min_\Theta J(\Theta)$

Need code to compute:
- $J(\Theta)$
- $\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta)$
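The first of the two quantities, the cost itself, can be sketched as follows. This assumes an unregularized cross-entropy cost and takes the network outputs as given; the function name and argument layout are hypothetical.

```python
import math


def cross_entropy_cost(predictions, targets):
    """Average cross-entropy over m examples, summed over the output units.

    predictions[i][k] is the k-th network output for example i (strictly
    between 0 and 1); targets[i][k] is the matching 0/1 label.
    """
    m = len(predictions)
    total = 0.0
    for h, y in zip(predictions, targets):
        for h_k, y_k in zip(h, y):
            total += y_k * math.log(h_k) + (1 - y_k) * math.log(1 - h_k)
    return -total / m


# A maximally uncertain prediction costs log(2) per output unit.
uncertain = cross_entropy_cost([[0.5]], [[1]])
# A confident, correct prediction costs almost nothing.
confident = cross_entropy_cost([[0.99, 0.01]], [[1, 0]])
print(uncertain, confident)
```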

Page 16: Data Mining and Semantic Web

Backpropagation algorithm

Given one training example $(x, y)$:

Forward propagation:

[Figure: network with Layer 1, Layer 2, Layer 3, Layer 4 and activations $a^{(1)}, a^{(2)}, a^{(3)}, a^{(4)}$]
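Forward propagation can be sketched as a loop over the weight matrices: start from $a^{(1)} = x$, prepend the bias unit, compute $z^{(l+1)} = \Theta^{(l)} a^{(l)}$, and apply the sigmoid to get $a^{(l+1)}$. The code below is a minimal illustration under those assumptions, using nested lists for the matrices; the all-zero weights are placeholders chosen only to make the output easy to verify.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function g(z), written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(thetas, x):
    """Return the activations a^(1), ..., a^(L), bias units excluded."""
    activations = [list(x)]
    for theta in thetas:
        a_with_bias = [1.0] + activations[-1]  # bias unit is always 1
        z = [sum(w * a for w, a in zip(row, a_with_bias)) for row in theta]
        activations.append([sigmoid(v) for v in z])
    return activations


# Four layers with 2, 2, 2 and 1 units; every weight is zero, so z = 0 everywhere.
zero_thetas = [
    [[0.0] * 3, [0.0] * 3],
    [[0.0] * 3, [0.0] * 3],
    [[0.0] * 3],
]
acts = forward(zero_thetas, [1.0, 0.0])
print(acts[-1])  # [0.5]
```

The last entry of the returned list is the network output $h_\Theta(x)$; the intermediate activations are kept because backpropagation needs them.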

Page 17: Data Mining and Semantic Web

Backpropagation algorithm

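Intuition: $\delta_j^{(l)}$ = "error" of node $j$ in layer $l$.

[Figure: network with Layer 1, Layer 2, Layer 3, Layer 4]

For each output unit (layer L = 4):

$$\delta_j^{(4)} = a_j^{(4)} - y_j, \qquad a_j^{(4)} = \big(h_\Theta(x)\big)_j$$

$$\delta^{(3)} = (\Theta^{(3)})^T \delta^{(4)} \odot g'(z^{(3)})$$

$$\delta^{(2)} = (\Theta^{(2)})^T \delta^{(3)} \odot g'(z^{(2)})$$

$\odot$: element-wise multiplication operator

The derivative of the activation function can be written as

$$g'(z^{(l)}) = a^{(l)} \odot (1 - a^{(l)})$$

For a single training example, ignoring regularization:

$$\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta) = a_j^{(l)} \delta_i^{(l+1)}$$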
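The compact form of the sigmoid derivative follows from a short calculation, shown here for the scalar case with $g(z) = 1 / (1 + e^{-z})$:

```latex
g'(z) = \frac{d}{dz}\,(1 + e^{-z})^{-1}
      = \frac{e^{-z}}{(1 + e^{-z})^{2}}
      = \frac{1}{1 + e^{-z}} \cdot \frac{e^{-z}}{1 + e^{-z}}
      = g(z)\,\big(1 - g(z)\big)
```

Since $a = g(z)$ is already available from forward propagation, the derivative costs only one extra multiplication per unit.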

Page 18: Data Mining and Semantic Web

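Backpropagation algorithm

Training set $\{(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})\}$

Set $\Delta_{ij}^{(l)} = 0$ (for all $l, i, j$). These accumulators are used to compute $\frac{\partial}{\partial \Theta_{ij}^{(l)}} J(\Theta)$.

For $i = 1$ to $m$:
- Set $a^{(1)} = x^{(i)}$
- Perform forward propagation to compute $a^{(l)}$ for $l = 2, 3, \ldots, L$
- Using $y^{(i)}$, compute $\delta^{(L)} = a^{(L)} - y^{(i)}$
- Compute $\delta^{(L-1)}, \delta^{(L-2)}, \ldots, \delta^{(2)}$
- $\Delta_{ij}^{(l)} := \Delta_{ij}^{(l)} + a_j^{(l)} \delta_i^{(l+1)}$

After the loop, $\frac{1}{m} \Delta_{ij}^{(l)}$ is the gradient of the unregularized cost.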
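The whole procedure fits in a short pure-Python sketch. It assumes sigmoid units throughout, an unregularized cross-entropy cost, and weight matrices stored as nested lists with the bias weight in column 0; the weights and the four toy examples are arbitrary placeholders. A finite-difference comparison at the end is included only as a sanity check on one gradient entry.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(thetas, x):
    """Activations a^(1), ..., a^(L), bias units excluded."""
    acts = [list(x)]
    for theta in thetas:
        a_bias = [1.0] + acts[-1]
        acts.append([sigmoid(sum(w * a for w, a in zip(row, a_bias))) for row in theta])
    return acts


def cost(thetas, examples):
    """Unregularized cross-entropy cost averaged over the examples."""
    total = 0.0
    for x, y in examples:
        h = forward(thetas, x)[-1]
        total -= sum(t * math.log(o) + (1 - t) * math.log(1 - o) for o, t in zip(h, y))
    return total / len(examples)


def backprop_gradients(thetas, examples):
    """Gradient of the cost above with respect to every weight."""
    m = len(examples)
    acc = [[[0.0] * len(row) for row in theta] for theta in thetas]  # the Delta terms
    for x, y in examples:
        acts = forward(thetas, x)
        delta = [a - t for a, t in zip(acts[-1], y)]  # output-layer error
        for l in range(len(thetas) - 1, -1, -1):
            a_bias = [1.0] + acts[l]
            for i, d in enumerate(delta):
                for j, a in enumerate(a_bias):
                    acc[l][i][j] += a * d  # Delta_ij += a_j * delta_i
            if l > 0:
                # Error of layer l: (Theta^T delta) .* a .* (1 - a), bias column skipped.
                delta = [
                    sum(thetas[l][i][j + 1] * delta[i] for i in range(len(delta)))
                    * acts[l][j] * (1.0 - acts[l][j])
                    for j in range(len(acts[l]))
                ]
    return [[[v / m for v in row] for row in layer] for layer in acc]


# Hypothetical 2-2-1 network and four toy examples.
thetas = [[[0.1, -0.2, 0.3], [-0.4, 0.5, -0.6]], [[0.7, -0.8, 0.9]]]
examples = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]), ([1.0, 0.0], [1.0]), ([1.0, 1.0], [0.0])]
grads = backprop_gradients(thetas, examples)

# Central finite difference for one weight, as a sanity check.
eps = 1e-5
thetas[0][1][2] += eps
cost_plus = cost(thetas, examples)
thetas[0][1][2] -= 2 * eps
cost_minus = cost(thetas, examples)
thetas[0][1][2] += eps  # restore the original weight
numeric = (cost_plus - cost_minus) / (2 * eps)
print(abs(numeric - grads[0][1][2]))  # should be very small
```

The gradients returned here would then be handed to a minimizer such as gradient descent, which is outside the scope of this sketch.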

Page 19: Data Mining and Semantic Web

Advantages:
- Relatively simple implementation
- Standard method that generally works well
- Many practical applications, e.g. handwriting recognition, autonomous driving

Disadvantages:
- Training can be slow and inefficient
- Can get stuck in local minima, resulting in sub-optimal solutions

Page 20: Data Mining and Semantic Web

Literature:

- http://en.wikipedia.org/wiki/Backpropagation
- http://www.ml-class.org
- http://home.agh.edu.pl/~vlsi/AI/backp_t_en/backprop.html

Page 21: Data Mining and Semantic Web


Thank you for your attention!