computer vision...computer vision from traditional approaches to deep neural networks stanislav...
TRANSCRIPT
![Page 1: Computer Vision...Computer Vision From traditional approaches to deep neural networks Stanislav Frolov München, 27.02.2018](https://reader034.vdocuments.net/reader034/viewer/2022052102/603c52b65877394cc63c157b/html5/thumbnails/1.jpg)
Computer Vision
From traditional approaches to deep neural networks
Stanislav Frolov München, 27.02.2018
Outline of this talk
What we are going to talk about
● Computer vision
● Human vision
● Traditional approaches and methods
● Artificial neural networks
● Summary
Stanislav Frolov
Big Data Engineer @inovex
● Trained deep neural networks for object detection during master's thesis
● Still fascinated and interested
What is computer vision
General
● Teach computers how to see
● Automatic extraction, analysis and understanding of images
● Infer useful information, interpret and make decisions
● Automate tasks that the human visual system can do
● One of the most exciting fields in AI and ML
What is computer vision
Motivation
● Era of pixels
● Internet consists mostly of images
● Explosion of visual data
● Cannot be labeled by humans
What is computer vision
Drivers
● Two drivers for computer vision explosion
○ Compute (faster and cheaper)
○ Data (more data > algorithms)
What is computer vision
Interdisciplinary field
[Diagram: computer vision at the intersection of Computer Science, Mathematics, Engineering, Physics, Biology and Psychology, with related areas Information Retrieval, Machine Learning, Graphs, Algorithms, Systems Architecture, Robotics, Speech, NLP, Image Processing, Optics, Solid-State Physics, Neuroscience, Cognitive Sciences and Biological vision]
Synonyms?
What is computer vision
Related fields - image processing
● Imaging for statistical pattern recognition
● Image transformations such as pixel-by-pixel operations
○ Contrast enhancement
○ Edge extraction
○ Noise reduction
○ Geometrical and spatial operations (e.g. rotations)
What is computer vision
Related fields - computer graphics
● Creates new images from scene descriptions
● Produces image data from 3D models
● “Inverse” of computer vision
● AR as a combination of both
What is computer vision
Related fields - machine vision
● Mainly manufacturing applications
● Image-based automatic inspection, process control, robot guidance
● Usually employs strong assumptions (colour, shape, light, structure, orientation, ...) -> works very well
● Output often pass/fail or good/bad
● Additionally numerical/measurement data, counts
What is computer vision
Related fields - AI
● Create “intelligent” systems
● Studying computational aspects of intelligence
● Make computers do things at which, at the moment, people are better
● Many techniques play an important role (ML, ANNs)
● Currently does a few things better/faster at scale than humans can
● Whether it can do anything “human” is still an open question
What is computer vision
Related fields - summary
● Related fields have a large intersection
● Basic techniques used, developed and studied are very similar
Short trip to human vision
What is human vision
General
● Two-stage process
○ Eyes take in light reflected off objects and the retina converts 3D objects into 2D images
○ Brain’s visual system interprets 2D images and “rebuilds” a 3D model
What is human vision
Stereoscopic vision
● A pair of 2D images with slightly different views allows depth to be inferred
● The position of nearby objects varies more across the two images than the position of more distant objects
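The relationship between position shift (disparity) and distance can be made concrete with the standard pinhole stereo formula Z = f * B / d. The sketch below is only an illustration for a rectified camera pair; the function name and the focal length, baseline and disparity values are assumptions, not figures from the talk.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth Z = f * B / d for a rectified pinhole stereo pair.

    focal_px:     focal length in pixels (assumed value)
    baseline_m:   distance between the two cameras/eyes in metres (assumed value)
    disparity_px: horizontal shift of the same point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


# A large shift between the two views means the object is close,
# a small shift means it is far away.
near = depth_from_disparity(800.0, 0.125, 50.0)
far = depth_from_disparity(800.0, 0.125, 5.0)
print(near, far)
```

Nearby objects produce large disparities and therefore small depths, which is the effect described in the second bullet.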
What is human vision
Prior knowledge
● Prior knowledge of relative sizes and depths is often key for understanding and interpretation
What is human vision
Texture pattern
● Texture and texture changes help with depth perception
What is human vision
Biases and illusions in human perception
● Shadows make all the difference in interpretation
● Gradual changes in light are ignored so as not to be misled by shadows
What is human vision
A few more illusions
● Two arrows with different orientations have the same length
What is human vision
Biases and illusions in human perception
● Assumptions and familiarity (distorted room)
● Face recognition bias
● Up-down orientation bias
What is human vision
Summary
● Illusions are fun, but the puzzle of understanding human vision is far from complete
Back to computer vision
What is computer vision
Typical tasks
● Recognition
● Localization
● Detection
● Segmentation
What is computer vision
Typical tasks
● Part-based detection
○ Deformable parts model
○ Pose estimation and poselets
What is computer vision
Typical tasks
● Image captioning (actions, attributes)
What is computer vision
Typical tasks
● Motion analysis
○ Egomotion (camera)
○ Optical flow (pixels)
What is computer vision
Typical tasks
● Scene understanding and reconstruction
What is computer vision
Typical tasks
● Image restoration
● Colouring black & white photos
Solving this is useful for many applications
What is computer vision
Typical applications
● Assistance systems for cars and people
● Surveillance
● Navigation (obstacle avoidance, road following, path planning)
● Photo interpretation
● Military (“smart” weapons)
● Manufacturing (inspection, identification)
● Robotics
● Autonomous vehicles (dangerous zones)
What is computer vision
Typical applications
● Recognition and tracking
● Event detection
● Interaction (man-machine interfaces)
● Modeling (medical, manufacturing, training, education)
● Organizing (database index, sorting/clustering)
● Fingerprint and biometrics
● …
Why so difficult?
What is computer vision
Why it is difficult
● Occlusion
● Deformation
● Scale
● Clutter
● Illumination
● Viewpoint
● Object pose
● Tons of classes and variants
● Often n:1 mapping
● Computationally expensive
● Full understanding of biological vision is missing
System overview
What is computer vision
System overview
● Input: image(s) + labels
● Output: semantic data, labels
● Digital image pixels usually have three channels [R,G,B], each in the range [0...255], plus a location [x,y]
● Digital images are just vectors
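To illustrate the statement that digital images are just vectors, here is a small sketch that flattens a tiny RGB image into one list of numbers and reads a pixel back by its [x,y] location. The 2x2 image, the row-major layout and the helper names are assumptions chosen for illustration.

```python
# A tiny 2x2 RGB image: rows of pixels, each pixel is [R, G, B] with values 0..255.
image = [
    [[255, 0, 0], [0, 255, 0]],      # red,  green
    [[0, 0, 255], [255, 255, 255]],  # blue, white
]


def flatten(img):
    """Turn rows x columns x channels into one flat vector (row-major order)."""
    return [value for row in img for pixel in row for value in pixel]


def pixel_at(vector, width, x, y, channels=3):
    """Recover the [R, G, B] values at location (x, y) from the flat vector."""
    start = (y * width + x) * channels
    return vector[start:start + channels]


vector = flatten(image)
print(len(vector))                # 2 * 2 * 3 values
print(pixel_at(vector, 2, 1, 0))  # the green pixel
```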
What is computer vision
System overview
1. Image acquisition (camera, sensors)
2. Pre-processing (sampling, noise reduction, augmentation)
3. Feature extraction (lines, edges, regions, points)
4. Detection and segmentation
5. Post-processing (verification, estimation, recognition)
6. Decision making
● -> Ability of a machine to step back and interpret the big picture of those pixels
Some history
What is computer vision
History
1950s
● 2D imaging for statistical pattern recognition
● Theory of optical flow based on a fixed point towards which one moves
What is computer vision
Traditional approaches
Image processing
● Histograms
● Filtering
● Stitching
● Thresholding
● ...
What is computer vision
History
1960s
● Desire to extract 3D structure from 2D images for scene understanding
● Began at pioneering AI universities to mimic the human visual system as a stepping stone for intelligent robots
● Summer vision project at MIT: attach a camera to a computer and have it “describe what it saw”
What is computer vision
History: summer vision project @MIT 1966
● Given to 10 undergraduate students
● … an attempt to use our summer workers effectively …
● … construction of a significant part of a visual system …
● … task can be segmented into sub-problems …
● … participate in the construction of a system complex enough to be a real landmark in the development of “pattern recognition” …
What is computer vision
History: summer vision project @MIT 1966
● Goal: analyse scenes and identify objects
● Structure of system:
○ Region proposal
○ Property lists for regions
○ Boundary construction
○ Match with properties
○ Segment
● Basic foreground/background segmentation with simple objects (cubes, cylinders, …)
What is computer vision
History: summer vision project @MIT 1966
● Unlike general intelligence, computer vision seemed tractable
● Amusing anecdote, but it never aimed to “solve” computer vision
● Computer vision today differs from what it was thought to be in 1966
What is computer vision
History
1970s
● Many of the algorithms that exist today were formed
● Edges, lines and objects as interconnected structures
What is computer vision
Traditional approaches
Edge detection based on
● Brightness
● Gradients
● Geometry
● Illumination
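As one possible illustration of gradient-based edge detection, the following sketch applies the well-known Sobel kernels to a grayscale image and thresholds the gradient magnitude. The Sobel operator, the threshold value and the toy image are choices made here for illustration; the slide does not name a specific operator.

```python
import math

# Sobel kernels approximate the horizontal and vertical brightness gradient.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def gradient_magnitude(gray):
    """Gradient magnitude per pixel; border pixels are left at 0."""
    h, w = len(gray), len(gray[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    value = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * value
                    gy += SOBEL_Y[dy][dx] * value
            out[y][x] = math.hypot(gx, gy)
    return out


def edge_map(gray, threshold):
    """Mark a pixel as edge when its gradient magnitude exceeds the threshold."""
    return [[m > threshold for m in row] for row in gradient_magnitude(gray)]


# Toy image: dark left half, bright right half, so there is one vertical edge.
gray = [[0, 0, 0, 255, 255, 255] for _ in range(5)]
edges = edge_map(gray, threshold=100)
print(edges[2])
```

Only the two columns next to the brightness jump are marked as edges; flat regions have zero gradient.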
What is computer vision
Traditional approaches - part-based detector
● Objects composed of features of parts and their spatial relationship
● Challenge: how to define and combine
What is computer vision
History
1980s
● More rigorous mathematical analysis and quantitative aspects
● Optical character recognition
● Sliding window approaches
● Usage of artificial neural networks
What is computer vision
Traditional approaches - HOG detection (histogram of oriented gradients)
● Concept from the 80s but used only from 2005
● Create HOG descriptors (object generalizations)
● One feature vector per object
● Train with SVM
● Sliding window @multiple scales
What is computer vision
Traditional approaches - HOG detection (histogram of oriented gradients)
● Computation of HOG descriptors:
1. Compute gradients
2. Compute histograms on cells
3. Normalize histograms
4. Concatenate histograms
● Requires a lot of engineering
● Must build ensembles of feature descriptors
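The four steps above can be sketched in a few lines. This is a deliberately simplified version and not the full HOG pipeline: it normalizes each cell on its own (real HOG implementations normalize over overlapping blocks of cells) and uses hard binning without interpolation. The cell size of 4 pixels, the 9 orientation bins and the function name are assumptions.

```python
import math


def hog_descriptor(gray, cell=4, bins=9):
    """Simplified HOG: one normalized orientation histogram per cell, concatenated."""
    h, w = len(gray), len(gray[0])
    descriptor = []
    for cy in range(0, h - cell + 1, cell):
        for cx in range(0, w - cell + 1, cell):
            hist = [0.0] * bins
            for y in range(cy, cy + cell):
                for x in range(cx, cx + cell):
                    # 1. Gradients via central differences (clamped at the border).
                    gx = gray[y][min(x + 1, w - 1)] - gray[y][max(x - 1, 0)]
                    gy = gray[min(y + 1, h - 1)][x] - gray[max(y - 1, 0)][x]
                    magnitude = math.hypot(gx, gy)
                    # Unsigned orientation in [0, 180) degrees.
                    angle = math.degrees(math.atan2(gy, gx)) % 180.0
                    # 2. Histogram on the cell: vote weighted by gradient magnitude.
                    hist[int(angle / (180.0 / bins)) % bins] += magnitude
            # 3. Normalize the histogram (L2 norm; epsilon avoids division by zero).
            norm = math.sqrt(sum(v * v for v in hist)) + 1e-6
            # 4. Concatenate into the final feature vector.
            descriptor.extend(v / norm for v in hist)
    return descriptor


# Toy 8x8 image with a vertical edge: 4 cells x 9 bins = 36 values.
gray = [[0] * 4 + [255] * 4 for _ in range(8)]
desc = hog_descriptor(gray)
print(len(desc))
```

In a detector such a vector would be computed per sliding window and passed to a classifier such as an SVM, as described on the previous slide.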
What is computer vision
History
1990s
● Significant interaction with computer graphics (rendering, morphing, stitching)
● Approaches using statistical learning
● Eigenfaces (ghost faces) through principal component analysis (PCA)
What is computer vision
Traditional approaches - deformable parts model (DPM)
● Objects are constructed from their parts
● First match whole object, then refine on the parts
● HOG + part-based + modern features
● Slow but good at difficult objects
● Involves many heuristics
What is computer vision
Features
● Feature points
○ Small area of pixels with certain properties
● Feature detection
○ Use features for identification
○ Activate if “object” present
● Examples:
○ Lines, edges, colours, blobs, …
○ Animals, faces, cars, ...
What is computer vision
Traditional approaches - classical recognition
● Init: extract features for objects in different scales, colours, orientations, rotations, occlusion levels
● Inference: extract features from query image and find closest match in database or train a classifier
● Computationally expensive (hundreds of features in image, millions in database) and complex due to errors and mismatches
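The "find closest match in database" step can be sketched as a brute-force nearest-neighbour search over feature vectors. The feature values, labels and the choice of Euclidean distance below are made up for illustration; real systems compare many descriptors per image and typically use approximate search structures to cope with millions of entries.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def closest_match(query, database):
    """Return (label, distance) of the database entry nearest to the query."""
    best_label, best_dist = None, float("inf")
    for label, feature in database:
        dist = euclidean(query, feature)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label, best_dist


# Hypothetical 3-dimensional descriptors extracted in the "init" phase.
database = [
    ("cat", [0.9, 0.1, 0.3]),
    ("car", [0.1, 0.8, 0.7]),
    ("face", [0.5, 0.5, 0.1]),
]

label, dist = closest_match([0.2, 0.7, 0.6], database)
print(label, round(dist, 3))
```

The linear scan makes the cost visible: every query feature is compared against every stored feature, which is why this approach is computationally expensive.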
What is computer vision
History
Before the new era
● Bags of features
● Handcrafted ensembles
[Diagram: Input -> Feature Extraction (Feat. 1, Feat. 2, ..., Feat. n) -> Final Decision]
The new era of computer vision
Artificial neural networks
Fundamentals - artificial neuron
● Elementary building block
● Inspired by biological neurons
● Mathematical function y=f(wx+b)
● Learnable weights
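A minimal sketch of the function y=f(wx+b): a weighted sum of the inputs plus a bias, passed through an activation function f. The sigmoid activation and the example weights are assumptions; the slide leaves f unspecified.

```python
import math


def sigmoid(z):
    """One common choice for the activation function f."""
    return 1.0 / (1.0 + math.exp(-z))


def neuron(x, w, b, f=sigmoid):
    """Compute y = f(w . x + b) for input vector x, weights w and bias b."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return f(z)


# The weighted sum here is 0.5 * 1.0 + (-0.25) * 2.0 + 0.0 = 0.0.
y = neuron(x=[1.0, 2.0], w=[0.5, -0.25], b=0.0)
print(y)
```

The weights w and the bias b are the learnable parameters; training adjusts them while f stays fixed.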
Artificial neural networks
Fundamentals - artificial neural networks
● Collection of neurons organized in layers
● Universal approximators
● Fully-connected network here
Artificial neural networks
Fundamentals - training
● Basically an optimization problem
● Find minimum of a loss function by an iterative process (training)
● Designing the loss function is sometimes tricky
Artificial neural networks
Fundamentals - training
Simple optimizer algorithm:
1. Forward pass with a batch of data
2. Calculate error between actual and wanted output
3. Nudge weights in proportion to error into the right direction (same data would result in smaller error)
4. Repeat until convergence
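The optimizer steps can be sketched with plain gradient descent on the simplest possible model, a single linear unit y = w*x + b with a mean squared error loss. The model, learning rate, fixed number of iterations (instead of a convergence check) and the toy data are all assumptions made to keep the example short.

```python
def train(data, lr=0.05, epochs=2000):
    """Fit y = w * x + b to (x, y) pairs with batch gradient descent on MSE."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(epochs):
        # 1. Forward pass with the whole batch of data.
        predictions = [w * x + b for x, _ in data]
        # 2. Error between actual and wanted output.
        errors = [p - y for p, (_, y) in zip(predictions, data)]
        # 3. Nudge weights in proportion to the error (gradient of the MSE loss).
        grad_w = 2.0 / n * sum(e * x for e, (x, _) in zip(errors, data))
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
        # 4. Repeat; a fixed epoch count stands in for a convergence test.
    return w, b


# Toy data generated from y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
w, b = train(data)
print(round(w, 4), round(b, 4))
```

Deep networks follow the same loop, with backpropagation computing the gradients for all layers.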
Artificial neural networks
Fundamentals - CNN
● Local neighborhood contributes to activation
● Exploit spatial information
● Hierarchical feature extractors
● Fewer parameters
[Figure labels: input, filters, receptive field, activation]
Artificial neural networks
Fundamentals - CNN
● Filter of size 3x3 applied to an input of 7x7
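A short sketch of what applying a 3x3 filter to a 7x7 input means, assuming stride 1 and no padding (the slide does not state either). Under these assumptions the filter fits at 5 positions per axis, so the output is 5x5. As in most deep learning frameworks, the operation below is technically a cross-correlation (the filter is not flipped).

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image (stride 1, no padding) and sum the products."""
    ih, iw = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    output = []
    for y in range(ih - kh + 1):
        row = []
        for x in range(iw - kw + 1):
            # Each output value only depends on a local kh x kw neighbourhood.
            row.append(sum(image[y + i][x + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        output.append(row)
    return output


image = [[1] * 7 for _ in range(7)]    # 7x7 input of ones
kernel = [[1] * 3 for _ in range(3)]   # 3x3 filter of ones
output = conv2d_valid(image, kernel)
print(len(output), len(output[0]))     # output height and width
```

The same 9 filter weights are reused at every position, which is why convolutional layers need far fewer parameters than fully-connected ones.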
Artificial neural networks
Fundamentals - pooling
● Max-pooling
● Dimension reduction/adaption
● Existence is more important than location
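Max-pooling can be sketched as taking the maximum over small windows of a feature map. The 2x2 window with stride 2 and the example values are assumptions; they are a common configuration, not necessarily the one shown on the slide.

```python
def max_pool(feature_map, size=2, stride=2):
    """Keep only the largest activation in each size x size window."""
    h, w = len(feature_map), len(feature_map[0])
    return [[max(feature_map[y + i][x + j]
                 for i in range(size) for j in range(size))
             for x in range(0, w - size + 1, stride)]
            for y in range(0, h - size + 1, stride)]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 1],
    [0, 0, 5, 6],
    [1, 2, 7, 8],
]
pooled = max_pool(feature_map)
print(pooled)  # 4x4 input reduced to 2x2
```

The output records that a strong activation exists somewhere in each window but not exactly where, which matches the idea that existence is more important than location.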
Artificial neural networks
Fundamentals - pooling
● Zero-padding
● Controlling dimensions
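How zero-padding controls dimensions can be shown with the usual output-size formula, out = (n + 2p - k) / s + 1 (rounded down), for input size n, kernel size k, padding p and stride s. The helper names below are made up for this sketch.

```python
def zero_pad(image, pad):
    """Surround a 2D image with a border of `pad` zeros on every side."""
    width = len(image[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]
    padded += [[0] * pad + list(row) + [0] * pad for row in image]
    padded += [[0] * width for _ in range(pad)]
    return padded


def conv_output_size(n, k, pad=0, stride=1):
    """Output size along one axis for input n, kernel k, padding and stride."""
    return (n + 2 * pad - k) // stride + 1


print(zero_pad([[1, 2], [3, 4]], 1))
print(conv_output_size(7, 3))          # without padding the output shrinks
print(conv_output_size(7, 3, pad=1))   # padding of 1 keeps the 7x7 size
```

With a 3x3 filter, one pixel of padding preserves the spatial size, so many layers can be stacked without the feature maps shrinking away.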
Artificial neural networks
Fundamentals - general network architecture
[Diagram: Input image -> convolutional layers -> ... -> Final decision]
Artificial neural networks
Fundamentals - hierarchical feature extractors
Activations for:
● First layers: lines, edges, blobs, colours, ...
● Deeper layers: parts of abstract objects, abstract objects
Modern history of object recognition
Benchmark
Datasets - PASCAL VOC
● Classification and detection
○ 27k images
○ 20 classes
■ person, bird, cat, cow, dog, horse, sheep, aeroplane, bicycle, boat, bus, car, motorbike, train, bottle, chair, dining table, potted plant, sofa, tv/monitor
Benchmark
Datasets - ImageNet
● Challenges on a subset of ImageNet
○ 14M labeled images
○ 20k object categories
● ILSVRC* usually on 1k categories including 90 out of 120 dog breeds
*ImageNet Large Scale Visual Recognition Challenge
Artificial neural networks
Roadmap - AlexNet
● ILSVRC 2012 winner by a large margin, reducing the top-5 error from 25% to 16%
● Proved effectiveness of CNNs and kicked off a new era
● 8 layers, 650k neurons, 60M parameters
Artificial neural networks
Roadmap - ZFNet
● ILSVRC 2013 winner with a best top-5 error of 11.6%
● AlexNet but using smaller 7x7 kernels (instead of 11x11) in the first layer to keep more information in deeper layers
Artificial neural networks
Roadmap - OverFeat
● ILSVRC 2013 localization winner
● Uses AlexNet on multi-scale input images with sliding window approach
● Accumulates bounding boxes for final detection (instead of non-max suppression)
Artificial neural networks
Roadmap - RCNN (region-based CNN)
● 2k proposals generated by selective search
● SVM trained for classification
● Multi-stage pipeline
Artificial neural networks
Roadmap - VGGNet
● Not a winner but famous due to simplicity and effectiveness
● Replace large-kernel convolutions by stacking several small-kernel convolutions
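A quick back-of-the-envelope check shows why stacking small kernels is attractive: three stacked 3x3 convolutions (stride 1) see the same 7x7 region as a single 7x7 convolution but need fewer weights. The channel count of 64 is an arbitrary assumption and biases are ignored.

```python
def conv_weights(kernel, channels_in, channels_out):
    """Number of weights in one convolutional layer (biases ignored)."""
    return kernel * kernel * channels_in * channels_out


def stacked_receptive_field(kernel, layers):
    """Receptive field of `layers` stacked stride-1 convolutions of the same size."""
    return 1 + layers * (kernel - 1)


C = 64  # assumed number of input and output channels
single_7x7 = conv_weights(7, C, C)      # 49 * C * C weights
three_3x3 = 3 * conv_weights(3, C, C)   # 27 * C * C weights

print(stacked_receptive_field(3, 3))    # same 7x7 field of view
print(single_7x7, three_3x3)
```

The stacked version also applies a non-linearity after each small convolution, which adds representational power on top of the saved parameters.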
Artificial neural networks
Roadmap - InceptionNet (GoogLeNet)
● ILSVRC 2014 winner
● Stacks up “inception” modules
● 22 layers, 5M parameters
![Page 76: Computer Vision...Computer Vision From traditional approaches to deep neural networks Stanislav Frolov München, 27.02.2018](https://reader034.vdocuments.net/reader034/viewer/2022052102/603c52b65877394cc63c157b/html5/thumbnails/76.jpg)
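● Jointly learns classification and bounding-box regression in a single network (region proposals still come from an external method such as selective search)
● Employs region of interest (RoI) pooling, which allows the convolutional computations to be reused across proposals
76
Artificial neural networks
Roadmap - Fast RCNN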
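A rough sketch of RoI max pooling may help here. It works on a single-channel feature map stored as a list of lists; the function name `roi_max_pool` and the exact bin rounding are assumptions for illustration, not the reference implementation. The point is that the feature map is computed once and every proposal is pooled from it to the same fixed size.

```python
def roi_max_pool(feature_map, roi, out_size):
    """Max-pool the region roi = (top, left, bottom, right) of a 2D feature map
    into an out_size x out_size grid. Bottom and right bounds are exclusive."""
    top, left, bottom, right = roi
    height, width = bottom - top, right - left
    pooled = []
    for i in range(out_size):
        # Row range of bin i (floor for the start, ceil for the end).
        r0 = top + (i * height) // out_size
        r1 = max(top + ((i + 1) * height + out_size - 1) // out_size, r0 + 1)
        row = []
        for j in range(out_size):
            c0 = left + (j * width) // out_size
            c1 = max(left + ((j + 1) * width + out_size - 1) // out_size, c0 + 1)
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1)
                           for c in range(c0, c1)))
        pooled.append(row)
    return pooled


# One shared 4x4 feature map (values 0..15), two different proposals.
feature_map = [[4 * r + c for c in range(4)] for r in range(4)]
whole_map = roi_max_pool(feature_map, (0, 0, 4, 4), 2)
sub_region = roi_max_pool(feature_map, (1, 1, 4, 4), 2)
print(whole_map, sub_region)
```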
![Page 77: Computer Vision...Computer Vision From traditional approaches to deep neural networks Stanislav Frolov München, 27.02.2018](https://reader034.vdocuments.net/reader034/viewer/2022052102/603c52b65877394cc63c157b/html5/thumbnails/77.jpg)
● Directly predicts all objects and classes in one shot
● Very fast
● Processes images at ~40 FPS on a Titan X GPU
● First real-time state-of-the-art detector
● Divides input images into multiple grid cells which are then classified
77
Artificial neural networks
Roadmap - YOLO (you only look once)
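The grid idea can be sketched as follows: the cell that contains an object's center is the one responsible for predicting it. The helper name `responsible_cell` and the default 7x7 grid are assumptions for illustration (7x7 is the grid size reported in the original YOLO paper).

```python
def responsible_cell(box_center, image_size, grid_size=7):
    """Return (row, col) of the grid cell containing the box center.

    box_center: (cx, cy) in pixels; image_size: (width, height) in pixels.
    """
    cx, cy = box_center
    width, height = image_size
    # Clamp so a center exactly on the right or bottom edge stays inside the grid.
    col = min(int(cx / width * grid_size), grid_size - 1)
    row = min(int(cy / height * grid_size), grid_size - 1)
    return row, col


print(responsible_cell((224, 224), (448, 448)))
```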
![Page 78: Computer Vision...Computer Vision From traditional approaches to deep neural networks Stanislav Frolov München, 27.02.2018](https://reader034.vdocuments.net/reader034/viewer/2022052102/603c52b65877394cc63c157b/html5/thumbnails/78.jpg)
● ILSVRC 2015 winner with a 3.6% error rate (human performance is 5-10%)
● Employs residual blocks, which allow building very deep networks (hundreds of layers)
● Additional identity mapping
78
Artificial neural networks
Roadmap - ResNet (Microsoft)
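The identity mapping can be sketched in a few lines. This is a simplification under stated assumptions: it uses small fully connected layers on plain Python lists instead of convolutions, and the helper names are hypothetical. The block computes F(x) + x, so when the learned part F outputs zero, the input simply passes through.

```python
def relu(vector):
    return [max(0.0, value) for value in vector]


def linear(weights, vector):
    """Matrix-vector product; `weights` is a list of rows."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def residual_block(x, w1, w2):
    """Compute relu(F(x) + x), where F is two linear layers with a ReLU in between."""
    residual = linear(w2, relu(linear(w1, x)))           # F(x), the learned part
    return relu([r + v for r, v in zip(residual, x)])    # add the identity shortcut


zeros = [[0.0, 0.0], [0.0, 0.0]]
# With all-zero weights F(x) = 0, so the block reduces to the identity (for x >= 0).
print(residual_block([1.0, 2.0], zeros, zeros))
```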
![Page 79: Computer Vision...Computer Vision From traditional approaches to deep neural networks Stanislav Frolov München, 27.02.2018](https://reader034.vdocuments.net/reader034/viewer/2022052102/603c52b65877394cc63c157b/html5/thumbnails/79.jpg)
● Not a recognition network
● A region proposal network
● Popularized prior/anchor boxes (found through clustering) to predict offsets
● Much better strategy than starting the predictions with random coordinates
● Since then heuristic approaches have been gradually fading out and being replaced
79
Artificial neural networks
Roadmap - MultiBox
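Predicting offsets relative to a prior box can be illustrated with a small decoding step. The parameterization below is the one described in the Faster R-CNN paper and is used here only as an assumed example; MultiBox itself may encode offsets differently. Boxes are given as (center x, center y, width, height).

```python
import math


def decode_box(anchor, offsets):
    """Turn predicted offsets (tx, ty, tw, th) into a box, relative to an anchor."""
    xa, ya, wa, ha = anchor
    tx, ty, tw, th = offsets
    return (
        xa + tx * wa,        # shift the center by a fraction of the anchor width
        ya + ty * ha,        # shift the center by a fraction of the anchor height
        wa * math.exp(tw),   # scale the width (log-space offset)
        ha * math.exp(th),   # scale the height (log-space offset)
    )


# Zero offsets reproduce the anchor, so the network starts from a sensible guess.
print(decode_box((50, 50, 20, 40), (0, 0, 0, 0)))
```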
![Page 80: Computer Vision...Computer Vision From traditional approaches to deep neural networks Stanislav Frolov München, 27.02.2018](https://reader034.vdocuments.net/reader034/viewer/2022052102/603c52b65877394cc63c157b/html5/thumbnails/80.jpg)
● Fast RCNN with heuristic region proposal replaced by region proposal network (RPN) inspired by MultiBox
● RPN shares full-image convolutional features with the detection network (cost-free region proposal)
● RPN uses “attention” mechanism to tell where to look
● ~5 FPS on a Tesla K40 GPU
● End-to-end training
80
Artificial neural networks
Roadmap - Faster RCNN
![Page 81: Computer Vision...Computer Vision From traditional approaches to deep neural networks Stanislav Frolov München, 27.02.2018](https://reader034.vdocuments.net/reader034/viewer/2022052102/603c52b65877394cc63c157b/html5/thumbnails/81.jpg)
● SSD leverages the Faster RCNN’s RPN to directly classify objects inside each prior box (similar to YOLO)
● Predicts category scores and box offsets for a fixed set of default bounding boxes
● Fixes the predefined grid cells used in YOLO by using multiple aspect ratios
● Produces predictions of different scales
● ~59 FPS
81
Artificial neural networks
Roadmap - SSD (single shot multibox detector)
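A fixed set of default boxes with multiple aspect ratios could be generated roughly like this. The width and height formulas (scale times or divided by the square root of the aspect ratio) follow the SSD paper; the function name, the 2x2 feature map and the chosen scale are assumptions for illustration. Coordinates are relative to the image, in the range 0 to 1.

```python
import math


def default_boxes(feature_map_size, scale, aspect_ratios):
    """Return (cx, cy, w, h) default boxes, one per cell and aspect ratio."""
    boxes = []
    for row in range(feature_map_size):
        for col in range(feature_map_size):
            # Center of the cell in relative image coordinates.
            cx = (col + 0.5) / feature_map_size
            cy = (row + 0.5) / feature_map_size
            for ratio in aspect_ratios:
                width = scale * math.sqrt(ratio)
                height = scale / math.sqrt(ratio)
                boxes.append((cx, cy, width, height))
    return boxes


boxes = default_boxes(2, 0.5, [1.0, 2.0, 0.5])
print(len(boxes))  # 2 * 2 cells * 3 aspect ratios
```

Running the same function on several feature maps of different sizes, each with its own scale, is what yields predictions at different scales.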
![Page 82: Computer Vision...Computer Vision From traditional approaches to deep neural networks Stanislav Frolov München, 27.02.2018](https://reader034.vdocuments.net/reader034/viewer/2022052102/603c52b65877394cc63c157b/html5/thumbnails/82.jpg)
● Open-source software library for machine learning applications
● TensorFlow Object Detection API
  ○ A collection of pretrained models
  ○ Construct, train and deploy object detection models
82
Artificial neural networks
TensorFlow object detection API
![Page 83: Computer Vision...Computer Vision From traditional approaches to deep neural networks Stanislav Frolov München, 27.02.2018](https://reader034.vdocuments.net/reader034/viewer/2022052102/603c52b65877394cc63c157b/html5/thumbnails/83.jpg)
Summary
83
![Page 84: Computer Vision...Computer Vision From traditional approaches to deep neural networks Stanislav Frolov München, 27.02.2018](https://reader034.vdocuments.net/reader034/viewer/2022052102/603c52b65877394cc63c157b/html5/thumbnails/84.jpg)
● Humans are good at understanding the big picture
● Neural networks are good at details
● But they can be fooled...
84
Summary
Human vs machine
![Page 85: Computer Vision...Computer Vision From traditional approaches to deep neural networks Stanislav Frolov München, 27.02.2018](https://reader034.vdocuments.net/reader034/viewer/2022052102/603c52b65877394cc63c157b/html5/thumbnails/85.jpg)
● Need a large amount of data
● Lots of engineering
● Trial and error
● Long training time
● Still lots of hyperparameter tuning
● No general network (generalization not answered)
● Little mathematical foundation
85
Summary
Computer vision is still difficult
![Page 86: Computer Vision...Computer Vision From traditional approaches to deep neural networks Stanislav Frolov München, 27.02.2018](https://reader034.vdocuments.net/reader034/viewer/2022052102/603c52b65877394cc63c157b/html5/thumbnails/86.jpg)
● Despite all of these advances, the dream of having a computer interpret an image at the same level as a human remains unrealized
86
Summary
Computer vision is hard
![Page 87: Computer Vision...Computer Vision From traditional approaches to deep neural networks Stanislav Frolov München, 27.02.2018](https://reader034.vdocuments.net/reader034/viewer/2022052102/603c52b65877394cc63c157b/html5/thumbnails/87.jpg)
Thank You
Stanislav Frolov
Big Data Engineer
0173 318 11 35
inovex GmbH
Lindberghstraße 3
80939 München