Computer vision and machine learning for perception and decision making in driverless cars
Ljubljana, 12.06. 2018
Danijel Skočaj
Visual Cognitive Systems Lab
Faculty of Computer and Information Science
University of Ljubljana
Driverless cars as intelligent agents
Requirements
Perception-action loop
Perception and decision making
Computer vision
Deep learning
Computer vision and machine learning for perception and decision making in driverless cars
Stone age of driverless driving
DARPA Grand Challenge 2004
Goal: autonomous driving of a car through the Mojave Desert
Main prize: 1,000,000 USD (it was not awarded!)
In 2004, most of the algorithms did not work well in uncontrolled environments!
Increase robustness, adaptability, and intelligence!
Driverless cars today
2017
Waymo
Enabling technologies
Improved sensors!
Improved perception!
Improved decision making!
Computer vision
Machine learning
Decision making
Perception-action cycle
Driverless cars = intelligent robots
Sensors online
Cognitive robot systems
[Diagram: cognitive robots positioned between industrial robots, SF, and humans; capabilities: perception, action, attention, goals, planning, reasoning, learning, communication]
Perception
Perception
Visual information (image, video; RGB, BW, IR,…)
Sound (speech, music, noise, …)
Haptic information (haptic sensors, collision detectors, etc.)
Range/depth/space information (range images, 3D models, 3D maps, …)
Many different modalities – very multimodal system
Attention
Selective attention
Handling complexity of input signals
Representation of visual information
image ≈ mean image + a1·u1 + a2·u2 + a3·u3 + … (a linear combination of basis images)
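The decomposition above (mean image plus a weighted sum of basis images) can be illustrated with PCA; the data below are synthetic random vectors, and the array sizes are arbitrary assumptions for the sketch:

```python
import numpy as np

# Synthetic "images": 100 samples, 64 pixels each (illustrative data only)
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 64))

mean = X.mean(axis=0)
Xc = X - mean

# Basis images (principal components) via SVD of the centered data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 3                          # keep the first 3 basis images
a = Xc @ Vt[:k].T              # coefficients a1..a3 for every image
X_hat = mean + a @ Vt[:k]      # reconstruction: mean + sum_i(ai * ui)

# Relative reconstruction error with only 3 basis images
err = np.linalg.norm(X - X_hat) / np.linalg.norm(X)
print(X_hat.shape, err < 1.0)
```

With all 64 components kept, the reconstruction is exact up to floating-point error; truncating to a few components gives the compact representation shown on the slide.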
Representation of space
Metric information
Topological map
Hierarchical representation
Recognition
Recognition of
objects
properties
faces
rooms
affordances
actions
speech
relations
intentions,…
Categorisation
Multimodal recognition
Learning
Building representations
Continuous learning
Different learning modes
Multimodal learning
Forgetting, unlearning
Robustness
Nature vs. nurture
Learning: training samples → representation
Recognition: test samples → result
Continuous learning: representation + new training sample → updated representation
Reasoning, planning, decision making
In an unpredictable environment
With incomplete information
With robot limitations
In a dynamic environment
Considering different modalities
In real time
An example of a cognitive system
Autonomous car
City drive
Competencies
Perception (image, 3D, collision)
Planning
Reasoning
Learning
Navigation
Obstacle avoidance
Action
Flexibility
Robustness
Efficiency
…
Google self-driving car
Intelligent agents
Perception
Action
Reasoning, planning, decision making
Autonomy
Environment
[Diagram: AGENT and ENVIRONMENT loop: the agent sees the current state, selects an action, and the environment moves to the next state]
Perception action cycle
Significant abstraction of the real world
sense → perception → modelling → planning → task execution → motor control → act
Simulation of robot perception and control
Sensors
Range sensors
Object recognition
Bumper (collision detector)
Odometer
Planning and control
Planning
Control
Deep learning
Excellent results
ILSVRC results
Deep learning era
Modern deep learning
More data!
More computational power!
Improved learning details!
The main concept
Zeiler and Fergus, 2014
Shift in paradigm
Hand-crafted / engineered
Data-driven / learning-based
Solutions
Features
Applications
Perceptron
Rosenblatt, 1957
Binary inputs and output
Weights
Threshold
Bias
Very simple!
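A perceptron with binary inputs, weights, and a bias can be sketched in a few lines; the weights below are hand-picked for illustration, not learned:

```python
def perceptron(x, w, b):
    """Binary output: 1 if the weighted sum plus bias exceeds the threshold 0."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if z > 0 else 0

# Hand-picked weights implementing a logical AND of two binary inputs
w, b = [1.0, 1.0], -1.5
print([perceptron(x, w, b) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]])  # [0, 0, 0, 1]
```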
Sigmoid neurons
Real-valued inputs and outputs from the interval [0,1]
Activation function: sigmoid function σ(z) = 1 / (1 + e^(−z))
output = σ(Σⱼ wⱼxⱼ + b) = 1 / (1 + exp(−Σⱼ wⱼxⱼ − b))
Sigmoid neurons
Small changes in weights and biases cause small changes in the output
This enables learning!
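The smoothness claim is easy to check directly: a tiny weight change produces a tiny output change (a minimal sketch with arbitrary example values):

```python
import math

def sigmoid(z):
    """Sigmoid activation: maps any real z into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(x, w, b):
    """Sigmoid neuron: smooth output, unlike the perceptron's hard threshold."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

out1 = neuron([0.5, 0.8], [1.0, -2.0], 0.1)
out2 = neuron([0.5, 0.8], [1.01, -2.0], 0.1)  # tiny change in one weight
print(abs(out2 - out1) < 0.01)  # small change in output
```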
Feedforward neural networks
Network architecture:
Example: recognizing digits
MNIST database of handwritten digits
28×28 pixels (= 784 input neurons)
10 digits
50,000 training images
10,000 validation images
10,000 test images
http://neuralnetworksanddeeplearning.com
Loss function
Given: the desired output y(x) for all training images x
Loss function:
C(w, b) = 1/(2n) · Σₓ ‖y(x) − a‖²
(mean squared error, the quadratic loss function)
Find weights w and biases b that for a given input x produce an output a that minimizes the loss function C
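The quadratic loss can be written directly from the formula above (the output and target arrays are made-up illustration values):

```python
import numpy as np

def quadratic_loss(Y, A):
    """C = 1/(2n) * sum over the n samples of ||y - a||^2."""
    n = Y.shape[0]
    return np.sum((Y - A) ** 2) / (2 * n)

Y = np.array([[0.0, 1.0], [1.0, 0.0]])   # desired outputs y(x)
A = np.array([[0.1, 0.9], [0.8, 0.2]])   # network outputs a
print(quadratic_loss(Y, A))
```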
Gradient descent
Find the minimum of C(v)
Change of C: ΔC ≈ ∇C · Δv
Gradient of C: ∇C = (∂C/∂v₁, …, ∂C/∂vₙ)ᵀ
Change v in the opposite
direction of the gradient: Δv = −η∇C
Algorithm:
Initialize v
Until the stopping criterion is reached
Apply the update rule: v → v − η∇C
η is the learning rate
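The update rule v → v − η∇C as a minimal loop; the loss function, learning rate, and iteration count below are toy choices for illustration:

```python
import numpy as np

def grad_C(v):
    """Gradient of the toy loss C(v) = v1^2 + 2*v2^2 (minimum at the origin)."""
    return np.array([2.0 * v[0], 4.0 * v[1]])

eta = 0.1                      # learning rate
v = np.array([1.0, 1.0])       # initial point
for _ in range(200):           # simple fixed-iteration stopping criterion
    v = v - eta * grad_C(v)    # update rule: v -> v - eta * grad C

print(np.allclose(v, [0.0, 0.0], atol=1e-6))  # converged to the minimum
```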
Gradient descent in neural networks
Loss function C(w, b)
Update rules:
wₖ → wₖ − η ∂C/∂wₖ
bₗ → bₗ − η ∂C/∂bₗ
Consider all training samples
Very many parameters => computationally very expensive
Use stochastic gradient descent instead
Compute the gradient only for a subset (mini-batch) of m training samples
Backpropagation
All we need is the gradient of the loss function
Rate of change of C w.r.t. a change in any weight: ∂C/∂w
Rate of change of C w.r.t. a change in any bias: ∂C/∂b
How to compute the gradient?
Numerically
Simple, approximate, extremely slow
Analytically for the entire C
Fast, exact, intractable
Chain individual parts of the network
Fast, exact, doable
Backpropagation!
Main principle
We need the gradient of the loss function
Two phases:
Forward pass (propagation): the input sample is propagated through the network and the error at the final layer is obtained
Backward pass (weight update): the error is backpropagated to the individual layers, the contribution of each neuron to the error is calculated, and the weights are updated accordingly
Backpropagation and SGD
For a number of epochs
Until all training images are used
Select a mini-batch of m training samples
For each training sample x in the mini-batch
Input: set the corresponding activation a^{x,1}
Feedforward: for each l = 2, 3, …, L compute z^{x,l} = w^l a^{x,l−1} + b^l and a^{x,l} = σ(z^{x,l})
Output error: compute δ^{x,L} = ∇ₐC_x ⊙ σ′(z^{x,L})
Backpropagation: for each l = L−1, L−2, …, 2 compute δ^{x,l} = ((w^{l+1})ᵀ δ^{x,l+1}) ⊙ σ′(z^{x,l})
Gradient descent: for each l = L, L−1, …, 2 update:
w^l → w^l − (η/m) Σₓ δ^{x,l} (a^{x,l−1})ᵀ
b^l → b^l − (η/m) Σₓ δ^{x,l}
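The whole loop above fits in a small NumPy sketch: a two-layer sigmoid network trained with backpropagation and gradient descent on a toy XOR problem (the network size, learning rate, and data are illustrative assumptions, not the MNIST setup from the slides):

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_prime(z):
    s = sigmoid(z)
    return s * (1.0 - s)

def forward(X, W1, b1, W2, b2):
    z1 = X @ W1 + b1; a1 = sigmoid(z1)   # hidden layer
    z2 = a1 @ W2 + b2; a2 = sigmoid(z2)  # output layer
    return z1, a1, z2, a2

# Toy data: XOR, learned by a 2-4-1 network
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)
W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)
eta, m = 2.0, X.shape[0]

loss0 = np.sum((forward(X, W1, b1, W2, b2)[3] - Y) ** 2) / (2 * m)
for _ in range(10000):
    z1, a1, z2, a2 = forward(X, W1, b1, W2, b2)
    d2 = (a2 - Y) * sigmoid_prime(z2)        # output error delta^L
    d1 = (d2 @ W2.T) * sigmoid_prime(z1)     # backpropagated delta^l
    # Gradient descent step, averaged over the mini-batch (here: all 4 samples)
    W2 -= eta / m * (a1.T @ d2); b2 -= eta / m * d2.sum(axis=0)
    W1 -= eta / m * (X.T @ d1);  b1 -= eta / m * d1.sum(axis=0)
loss1 = np.sum((forward(X, W1, b1, W2, b2)[3] - Y) ** 2) / (2 * m)
print(loss1 < loss0)  # the quadratic loss decreased during training
```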
Convolutional neural networks
From feedforward fully-connected neural networks
To convolutional neural networks
Sparse connectivity
Local connectivity – neurons are only locally connected (receptive field)
Reduces memory requirements
Improves statistical efficiency
Requires fewer operations
(receptive field seen from below vs. from above)
The receptive field of the units in the deeper layers is large => indirect connections!
Parameter sharing
Neurons share weights!
Tied weights
Every element of the kernel is used at every position of the input
All the neurons at the same level detect the same feature (everywhere in the input)
Greatly reduces the number of parameters!
Equivariance to translation
Shift, convolution = convolution, shift
Object moves => representation moves
Fully connected network with an infinitely strong prior over its weights
Tied weights
Weights are zero outside the kernel region
=> learns only local interactions and is equivariant to translations
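Parameter sharing and equivariance can be checked directly: one small kernel is reused at every position, and shifting the input shifts the feature map by the same amount (a minimal 1-D sketch with a made-up kernel and input):

```python
import numpy as np

def conv1d_valid(x, k):
    """1-D valid convolution: the same kernel weights are reused at every position."""
    n = len(x) - len(k) + 1
    return np.array([np.dot(x[i:i + len(k)], k) for i in range(n)])

kernel = np.array([1.0, 0.0, -1.0])      # one shared set of 3 weights
x = np.zeros(10); x[3] = 1.0             # an "object" at position 3
shifted = np.roll(x, 2)                  # the same object moved by 2

y1 = conv1d_valid(x, kernel)
y2 = conv1d_valid(shifted, kernel)
# Equivariance: the feature map of the shifted input is the shifted feature map
print(np.allclose(y2[2:], y1[:-2]))
```

The kernel has only 3 parameters regardless of the input length, which is exactly the parameter reduction the slide describes.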
CNN layers
Layers used to build ConvNets:
INPUT: raw pixel values
CONV: convolutional layer
ReLU: introducing nonlinearity
POOL: downsampling
FC: for computing class scores
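The ReLU and POOL layers above reduce to tiny array operations (a schematic sketch on a hand-made 4×4 feature map, not a full framework):

```python
import numpy as np

def relu(x):
    """ReLU: introduces nonlinearity by zeroing negative responses."""
    return np.maximum(x, 0.0)

def max_pool2x2(x):
    """POOL: 2x2 max-pooling halves each spatial dimension."""
    h, w = x.shape
    return x[:h - h % 2, :w - w % 2].reshape(h // 2, 2, w // 2, 2).max(axis=(1, 3))

fmap = np.array([[1.0, -2.0, 3.0,  0.0],
                 [0.0,  5.0, -1.0, 2.0],
                 [-3.0, 1.0,  4.0, -4.0],
                 [2.0,  0.0,  1.0, 6.0]])

pooled = max_pool2x2(relu(fmap))
print(pooled.shape)  # (2, 2)
```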
CNN architecture
Stack the layers in an appropriate order
Hu et al.
Babenko et al.
Detection of traffic signs
DFG database
200 categories
Tabernik, Skočaj, Deep Learning for Large-Scale Traffic Sign Detection and Recognition, submitted
Detection of traffic signs
Data augmentation
Mask R-CNN +
Online hard-example mining
Distribution of selected training samples
Sample weighting
Adjusting region pass-through during detection
Experimental results
Swedish traffic sign database DFG traffic sign database
Traffic sign detection
Knowing everything
Data, big data!
Machine learning!
Make sense of huge amounts of data!
Digitalisation of decision making
Different problem complexities
Simple, well defined problems
Complex, vaguely defined problems
Complexity
Rule-based decision making → Programming
Data-driven decision making → Machine learning
Digital decision making
Decision making based on history
Decision making modelling based on previous decisions
Explicit capturing + digitalisation of all important attributes
Bias in favor of past decisions
Size and representativeness of the training set
Updating the model through time
Combining the training model with new rules
Explainability of decisions
Conclusion
(Timeline: T−60, T−30, T, T+30)