vision january 10, 2007. today's agenda ● some general notes on vision ● colorspaces ●...

Vision

January 10, 2007

Today's Agenda

● Some general notes on vision

● Colorspaces

● Numbers and Java

● Feature detection

● Rigid body motion

Some general notes on vision

Vision is difficult

● Animal vision is very complex

● Right: Circuitry of macaque visual cortex (Felleman and Van Essen, 1991)

Object recognition is difficult

A theory of object recognition has to specify:

● The nature of stored visual representations in long-term memory

● The nature of the intermediate representations● A computational account of how each intermediate

representation can be derived from the previous one

● A determination of whether the answers to the above points are different for different kinds of objects

What makes it difficult?

● A single image can be cast by many different objects

● A single object can cast many different images

● Two main challenges:

1. Invariance/Tolerance: Generalizing across changes in size, orientation, lighting, etc. to realize that these images are all of the same thing

2. Specificity: Appreciating the distinction between different categories

Theories on solving this problem

● Inverse optics:● Try to solve the problem of determining what shape

may have cast the image● This is an ill-posed problem

● Association:● Store each possible version of an object● Brute force

● Other intermediate descriptors (e.g. Parts, image fragments)

But...

Luckily, in MASLab, your problem is greatly simplified!

Colorspaces

Representing Colors

● RGB used to represent colors of light

● CMYK used to represent colors of pigment

● ...but both mix color, tint, brightness

HSV Colorspace

● Hue (color): 360 degrees mapped 0 to 255. Note that red is both 0 and 255.

● Saturation (amount of color)● Value (amount of light and dark)● We provide the code to convert to HSV● Note: White is low saturation but can have

any hue● Note: Black is low value but can have any

hue

Tips for differentiating colors

● Calibrate for different lighting conditions (26-100 is darker than the lab)

● Globally define thresholds● Use the gimp/bot client on real images● Use a large sample set

You don't HAVE to use HSV. Past teams have used other models such as RGB.

Numbers and Java

How values are stored

Uses Hexadecimal (base 16)● 0x12 = 18

A color is 4 bytes = 8 hex numbers

For HSV, these bytes are:● Alpha● Hue● Saturation● Value

Manipulating HSV values

Use masks to pick out parts.● 0x12345678 & 0x0000FF00 = 0x00005600

Shift to move parts.● 0x12345678 >> 8 = 0x00123456

Example:Hue = (X >> 16) & 0xFF

Notes on Java...

All Java types are signed!● A byte ranges from -128 to 127● Coded in two's complement: to change sign, flip

every bit and add 1

Don't forget higher-order bits!● (int) 0x0000FF00 = (int) 0xFF00● (int) ((byte) 0xFF) = (int) 0xFFFFFFFF

Watch out for shifts!● 0xFD000000 >> 8 = 0xFFFD0000

Performance

Getting an image performs a copy● Int[] = bufferedImage.getRGB(…)

Getting a pixel performs a multiplication● int v = bufferedImage.RGB(x,y)● offset = y*width + x

Go across rows, down columns (memory in rows, not columns)

Feature Detection

MASLab Features

● Red balls● Yellow goals● Blue lines● Blue ticks● Green-and-black bar codes

Ideas for dealing with blue lines

● Search for N blue pixels in a column● Make sure there's wall-white below● Candidate voting:

● In each column, list places where you think line might be

● Find shortest left-to-right path through candidates

Ideas for dealing with bar codes

● Look for green and black● Look for not-white under blue line● Check along a column to determine colors● RANdom SAmple Consensus (RANSAC):

● Pick random pixels within bar code● Are they black or green?

Looking for an object

● Look for a red patch● Set center to current

coordinates● Loop:

– Find new center based on pixels within d of old

– Enlarge d and recompute

– Stop if increasing d doesn't add enough red pixels

Distance

Remember:● Closer objects are bigger● Closer objects are lower

Other considerations

Find ways to deal with:

● Noise● Occlusion● Different viewing angles● Overly large thresholds

Rigid Body Motion

Rigid Body Motion

● Going from data association to motion

● Given– Starting x1, y1, θ1– Set of objects

visible in both images

● What are x2, y2, θ2?

More rigid body motion

● If we know angles but not distances, the problem is hard

● Assume distances are known

But if angles and distances are known...

...we can construct triangles

Rigid body motion math

● Apply the math for a rotation:

x1i = cos(θ) * x2i + sin(θ) * y2i + x0y1i = cos(θ) * y2i - sin(θ) * x2i + y0

● Solve for x0, y0, θ with least squares:

Σ (x1i - cos(θ) * x2i - sin(θ) * y2i – x0)^2 +(y1i - cos(θ) * y2i + sin(θ) * x2i – y0)^2

● Need at least two objects to solve

Advantages and Disadvantages

Advantages:● Relies on the world, not odometry● Can use many or few associations

Disadvantages:● Can take time to compute

vision january 10, 2007. today's agenda ● some general notes on vision ● colorspaces ●...

Documents

vision slide

hue slide

java slide

colorspaces slide

xff slide

brightness slide

0xfffd0000 slide

feature detection slide