Machine learning & category recognition

Cordelia Schmid

Jakob Verbeek

Content of the course

• Visual object recognition

• Robust image description

• Machine learning

Visual recognition - Objectives

• Particular objects and scenes, large databases

• Object classes and categories (intra-class variability)

Visual object recognition

[Figure: example images annotated with object, scene, and action labels: candle, person, drinking, indoors, kidnapping, house, street, outdoors, car, enter, road, field, countryside, car crash, exit through a door, building, people]

• Human motion and actions

Difficulties: within-object variations

Variability: camera position, illumination, internal parameters

Within-object variations

Difficulties: within-class variations

Visual recognition

• Robust image description

– Appropriate descriptors for objects and categories

• Statistical modeling and machine learning for vision

– Selection and adaptation of existing techniques

Robust image description

• Invariant detectors and descriptors

• Scale and affine-invariant keypoint detectors

Matching of descriptors

Significant viewpoint change

Basis: contour segment network

edgel-chains partitioned into straight contour segments

segments connected at edgel-chains’ endpoints and junctions

Ferrari et al. ECCV 2006

Contour features

[Ferrari, Fevrier, Jurie & Schmid, PAMI’07]

Localization of “shape” categories

Window descriptor + SVM Horse localization

Why machine learning?

• Early approaches: simple features + handcrafted models

• Can handle only a few images and simple tasks

L. G. Roberts, Machine Perception of Three Dimensional Solids, Ph.D. thesis, MIT Department of Electrical Engineering, 1963.

• Early approaches: manual programming of rules

• Tedious, limited, and does not take the data into account

Y. Ohta, T. Kanade, and T. Sakai, “An Analysis System for Scenes Containing Objects with Substructures,” International Joint Conference on Pattern Recognition, 1978.

• Today: lots of data, complex tasks

Internet images, personal photo albums

Movies, news, sports

Surveillance and security

Medical and scientific images

• Instead of trying to encode rules directly, learn them from examples of inputs and desired outputs

Types of learning problems

• Supervised

– Classification

– Regression

• Unsupervised

• Semi-supervised

• Reinforcement learning

• Active learning

• …

Supervised learning

• Given training examples of inputs and corresponding outputs, produce the “correct” outputs for new inputs

• Two main scenarios:

– Classification: outputs are discrete variables (category labels). Learn a decision boundary that separates one class from the other

– Regression: also known as “curve fitting” or “function approximation.” Learn a continuous input-output mapping from examples (possibly noisy)
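The two scenarios above can be sketched on toy data. This is a minimal illustration, not a method taken from the course: the perceptron and the least-squares line fit are stand-ins chosen for brevity, and all data points, function names, and parameter values are made up for the example.

```python
# Toy illustration of the two supervised scenarios (all data below is made up).

def train_perceptron(points, labels, epochs=20, lr=1.0):
    """Learn a linear decision boundary w.x + b = 0 for labels in {-1, +1}."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), y in zip(points, labels):
            # Update only when the current boundary misclassifies the example.
            if y * (w[0] * x1 + w[1] * x2 + b) <= 0:
                w[0] += lr * y * x1
                w[1] += lr * y * x2
                b += lr * y
    return w, b


def classify(w, b, point):
    """Return the predicted label (+1 or -1) for a new input."""
    return 1 if w[0] * point[0] + w[1] * point[1] + b > 0 else -1


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Classification: discrete outputs (category labels).
train_points = [(1.0, 1.0), (2.0, 1.5), (1.5, 2.0), (5.0, 5.0), (6.0, 5.5), (5.5, 6.5)]
train_labels = [-1, -1, -1, 1, 1, 1]
w, b = train_perceptron(train_points, train_labels)
predicted_labels = [classify(w, b, p) for p in [(1.2, 1.4), (5.8, 6.0)]]

# Regression: continuous outputs learned from noisy examples.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
slope, intercept = fit_line(xs, ys)
predicted_y = slope * 5.0 + intercept
```

In both cases the learned parameters are then applied to inputs that were not in the training set, which is the "new inputs" part of the definition.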

Unsupervised Learning

• Given only unlabeled data as input, learn some sort of structure

• The objective is often more vague or subjective than in supervised learning. This is more of an exploratory/descriptive data analysis

• Clustering

– Discover groups of “similar” data points

• Quantization

– Map a continuous input to a discrete (more compact) output

• Dimensionality reduction, manifold learning

– Discover a lower-dimensional surface on which the data lives

• Density estimation

– Find a function that approximates the probability density of the data (i.e., value of the function is high for “typical” points and low for “atypical” points)

– Can be used for anomaly detection
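Three of the tasks listed above can be sketched in a few lines on 1-D toy data. The choice of k-means for clustering and quantization and of a single Gaussian for density estimation is an assumption made for illustration; the slides do not name specific algorithms here, and the data values and function names are hypothetical.

```python
import math


def nearest(centers, x):
    """Quantization: map a continuous value to the index of its nearest center."""
    return min(range(len(centers)), key=lambda k: abs(x - centers[k]))


def kmeans_1d(values, centers, iterations=10):
    """Clustering with k-means (Lloyd's algorithm) on 1-D data."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for x in values:
            groups[nearest(centers, x)].append(x)
        # Move each center to the mean of its group; keep it if the group is empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers


def fit_gaussian(values):
    """Density estimation with one Gaussian (maximum-likelihood mean and variance)."""
    mean = sum(values) / len(values)
    var = sum((x - mean) ** 2 for x in values) / len(values)
    return mean, var


def gaussian_density(x, mean, var):
    """Value of the fitted density at x: high for typical points, low for atypical ones."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)


# Clustering and quantization on made-up unlabeled data.
values = [1.0, 1.2, 0.8, 5.0, 5.3, 4.7]
centers = kmeans_1d(values, centers=[0.0, 6.0])
codes = [nearest(centers, x) for x in values]

# Density estimation: a point far from the data gets a much lower density.
mean, var = fit_gaussian([1.0, 1.2, 0.8, 1.1, 0.9])
typical = gaussian_density(1.0, mean, var)
atypical = gaussian_density(3.0, mean, var)
```

Thresholding the density value is one simple way to flag anomalies: a point such as 3.0, whose density is far below that of the training points, would be reported as atypical.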

Other types of learning

• Semi-supervised learning: lots of data is available, but only a small portion is labeled (e.g. since labeling is expensive)

– Why is learning from labeled and unlabeled data better than learning from labeled data alone?

• Active learning: the learning algorithm can choose its own training examples, or ask a “teacher” for an answer on selected inputs

• Reinforcement learning: an agent takes inputs from the environment, and takes actions that affect the environment. Occasionally, the agent gets a scalar reward or punishment. The goal is to learn to produce action sequences that maximize the expected reward (e.g. driving a robot without bumping into obstacles)
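The reinforcement learning setting described above can be sketched with a toy environment. Tabular Q-learning is used here as one well-known algorithm for this setting; it is an assumption of this sketch, not something the slides specify. The corridor environment, its rewards, and all parameter values are hypothetical.

```python
import random

N_STATES = 5          # corridor cells 0..4; cell 4 is the goal
ACTIONS = (-1, +1)    # move left or move right


def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    nxt = state + action
    if nxt < 0:
        return state, -1.0, False      # bumped into the wall: punishment
    if nxt == N_STATES - 1:
        return nxt, 1.0, True          # reached the goal: reward
    return nxt, 0.0, False


def q_learning(episodes=500, alpha=0.5, gamma=0.9, max_steps=200, seed=0):
    """Estimate action values from interaction, using random exploration."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(max_steps):
            a = rng.randrange(2)       # explore with uniformly random actions
            nxt, reward, done = step(state, ACTIONS[a])
            # Target: immediate reward plus discounted value of the best next action.
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if done:
                break
    return q


q = q_learning()
# Greedy policy: in each non-goal cell, pick the action with the highest value.
greedy_policy = [ACTIONS[max((0, 1), key=lambda a: q[s][a])] for s in range(N_STATES - 1)]
```

The agent is never told which action is correct. It only receives the scalar reward or punishment, and the learned values end up favoring the action sequence that reaches the goal without bumping into the wall.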

Visual object recognition - tasks

• Image classification: assigning a label to the image

Car: present

Cow: present

Bike: not present

Horse: not present

…

• Object localization: define the location and the category

Car CowLocatio
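The difference between the two tasks shows up in the shape of their outputs. The sketch below is only an illustration: the `Detection` type, the box convention, and the pixel coordinates are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    category: str
    box: tuple  # (x_min, y_min, x_max, y_max) in pixels; a hypothetical convention


# Image classification: one present / not present decision per category.
classification_output = {"car": True, "cow": True, "bike": False, "horse": False}

# Object localization: each object comes with a category and a location.
# The coordinates below are made up for the example.
localization_output = [
    Detection("car", (40, 120, 310, 260)),
    Detection("cow", (350, 90, 520, 240)),
]

present = sorted(c for c, is_present in classification_output.items() if is_present)
localized = sorted(d.category for d in localization_output)
```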
