Post on 01-Apr-2015

Video Surveillance: Legally Blind?

Peter Kovesi

School of Computer Science & Software Engineering

The University of Western Australia

Questions

• What image quality do we need for identification?

• How do you measure image quality?

• What is the image quality from a video camera?

• What is the effect on image quality when you:
    • record to video tape?
    • use image compression?

Humans are very bad at recognizing unfamiliar faces

• Kemp, Towell and Pike (1997) tested the value of having photos on credit cards. When a user presented a card with a photograph of someone else that had some resemblance to the user, they were challenged less than 40% of the time.

• Bruce et al. (1999, 2001) have tested the ability of people to match good quality CCTV images of unfamiliar faces under a variety of scenarios. Correct recognition rates are typically only 70-80%.

Good quality photograph of target

Array of 10 good quality CCTV images

Bruce et al (1999).

Is this person in the array? If they are present, match the person.

When the target was present in the array, 12% picked the wrong person and 18% said they were not present (overall only 70% correct).

When the target was not present in the array, 70% still matched the target to someone in the array.

• Face recognition performance by humans is poor.

• Face recognition performance by machine is becoming quite good - but only if the images are of good quality.

• Surveillance video rarely provides good quality images.

What image quality is needed for face identification?

Image quality is defined by many attributes

• Minimum feature size that can be resolved

• Noise level

• Quality of luminance reproduction

• Quality of colour reproduction.

Human Face Recognition

In humans it has been found that face recognition is tuned to a set of spatial frequencies ranging from about 20 cycles per face width down to about 5 cycles per face width (Hayes, Morrone and Burr 1986; Costen, Parker and Craw 1996; Nasanen 1999).

[Figure: face images labelled 20 cycles, 10 cycles and 5 cycles. Face width is ~ 160mm; the 8mm and 16mm markers correspond to 20 and 10 cycles per face width.]

Maximum sensitivity is centred around 8 to 13 cycles/face width.

To recognize with confidence you need to be able to resolve down to 20 cycles/face width.
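
The 8mm and 16mm markers are consistent with dividing the face width by the number of cycles. A minimal sketch of that arithmetic, assuming the ~160mm face width quoted above and assuming the markers denote the width of one full cycle (the function name is illustrative, not from the talk):

```python
FACE_WIDTH_MM = 160.0  # approximate face width quoted on the slide


def cycle_size_mm(cycles_per_face_width, face_width_mm=FACE_WIDTH_MM):
    """Width in mm of one full cycle at the given spatial frequency."""
    return face_width_mm / cycles_per_face_width


for cycles in (20, 10, 5):
    print(f"{cycles} cycles/face width: {cycle_size_mm(cycles):.0f}mm per cycle")
```

On this reading, resolving 20 cycles per face width means resolving detail at a scale of about 8mm on the face itself.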

1951 USAF Chart

Groupings of 6 pairs of bars. Each successive set is half the size of the previous.

[Figure: 1951 USAF chart with the 16mm and 8mm bar groups marked.]

Eye charts also provide a simple way of measuring the minimum feature size that can be resolved.

20/20 Vision…

… or in metric, 6/6 vision

Snellen fraction: 6/6

Numerator: distance at which you can read the line on the chart.

Denominator: distance at which you should be able to read the line.

Minimum Angle of Resolution

The logMAR chart (Ian Bailey and Jan Lovie)

Letter height and Snellen fraction for the chart lines marked on the slide:

• 88mm: 6/60 (legally blind)
• 72mm: 6/48
• 58mm
• 44mm
• 36mm: 6/24
• 18mm: 6/12
• 9mm: 6/6

For comparison: number plate letters are 80mm high; average eye spacing is 65mm.
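
The letter heights on the chart can be roughly reproduced from the Snellen fraction. The sketch below assumes the standard Snellen design, in which a 6/6 letter subtends 5 minutes of arc at the viewing distance; the function name is illustrative and the computed values are geometric estimates, not measurements from the talk.

```python
import math

ARCMIN_FOR_6_6 = 5.0  # a 6/6 letter subtends 5 minutes of arc (standard Snellen design)


def letter_height_mm(numerator_m, denominator_m):
    """Letter height for a Snellen line, viewed from numerator_m metres."""
    angle_arcmin = ARCMIN_FOR_6_6 * denominator_m / numerator_m
    angle_rad = math.radians(angle_arcmin / 60.0)
    return numerator_m * 1000.0 * math.tan(angle_rad)


for denominator in (6, 12, 24, 48, 60):
    height = letter_height_mm(6, denominator)
    print(f"6/{denominator}: about {height:.1f}mm")
```

The results come out close to 9mm for the 6/6 line and a little under 90mm for the 6/60 line, which puts 80mm number plate letters and 65mm eye spacing near the "legally blind" end of the chart.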

Tests conducted with Pulnix TM6CN 1/2” CCD camera positioned 6m from the target.

Images were digitized directly from the camera using a Data Translation 3155 frame grabber.

C-mount lenses: 4mm, 6mm, 8.5mm, 12.5mm, 16mm

4mm lens

6mm lens

8.5mm lens

12.5mm lens

16mm lens
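
To get a feel for what each lens delivers, the number of pixels across a face at 6m can be estimated with the pinhole approximation. This is a rough sketch under stated assumptions: a 6.4mm sensor width (the nominal figure for a 1/2” format sensor), a digitized frame 768 pixels wide (not stated in the talk), and a ~160mm face width. All names are illustrative.

```python
SENSOR_WIDTH_MM = 6.4   # nominal width of a 1/2" format sensor (assumption)
IMAGE_WIDTH_PX = 768    # assumed digitized frame width; not stated in the talk
DISTANCE_MM = 6000.0    # camera-to-target distance used in the tests
FACE_WIDTH_MM = 160.0   # approximate face width


def pixels_across(object_mm, focal_mm, distance_mm=DISTANCE_MM):
    """Pixels spanned by an object, using the pinhole approximation."""
    image_mm = object_mm * focal_mm / distance_mm  # object size on the sensor
    return image_mm * IMAGE_WIDTH_PX / SENSOR_WIDTH_MM


def max_cycles(pixels):
    """Sampling limit: at least two pixels are needed per cycle."""
    return pixels / 2.0


for focal in (4.0, 6.0, 8.5, 12.5, 16.0):
    px = pixels_across(FACE_WIDTH_MM, focal)
    print(f"{focal}mm lens: {px:.0f} px across face, "
          f"at most {max_cycles(px):.0f} cycles/face width")
```

Under these assumptions the 12.5mm lens gives about 40 pixels across the face, so it only just reaches the 20 cycles/face width needed for confident recognition, and the shorter lenses fall below it.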

Camera image recorded to video, then played back and digitized. (Look at the USAF chart)

Camera image digitized directly.

Expect to lose quality when images are recorded to video.

(cropped images taken with 12.5mm lens)

Compression is problematic.

Test targets survive compression well, but faces do not.

[Figure: original PNG image (190kB) compared with JPEG at image quality 0 (14kB) and JPEG at image quality 4 (24kB).]

JPEG images compressed using Photoshop. Image ‘quality’ can range from 0 to 12.

What Does Compression Do? (JPEG and MPEG)

• Image is divided into 8x8 blocks.

• Discrete Cosine Transform is applied to each block.

• The transform coefficients are quantized; many will be rounded to zero.

• When reconstructed, the amplitude and phase of the spatial frequencies within each 8x8 block will be altered.

[Figure: the 64 basis functions of an 8x8 Discrete Cosine Transform.]
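
The steps above can be sketched for a single 8x8 block. This is a simplified illustration, not the codec used in the talk: it applies one uniform quantization step to every coefficient (an assumption; real JPEG uses a per-frequency quantization table followed by entropy coding), and the block contents and step size are made up.

```python
import numpy as np

N = 8  # DCT block size used by JPEG and MPEG


def dct_matrix(n=N):
    """Orthonormal DCT-II matrix: row k holds the k-th cosine basis function."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c


C = dct_matrix()


def compress_block(block, step):
    """Transform, quantize with a uniform step (assumption), and reconstruct."""
    coeffs = C @ block @ C.T              # 2D DCT of the block
    quantized = np.round(coeffs / step)   # small coefficients round to zero
    recon = C.T @ (quantized * step) @ C  # inverse DCT of the dequantized values
    return quantized, recon


# Illustrative block: a smooth diagonal brightness ramp.
x = np.arange(N, dtype=float)
block = 100.0 + 10.0 * np.add.outer(x, x)

quantized, recon = compress_block(block, step=40.0)
zeros = int(np.sum(quantized == 0))
max_error = float(np.max(np.abs(recon - block)))
print(f"{zeros} of {N * N} coefficients rounded to zero")
print(f"largest pixel error after reconstruction: {max_error:.1f}")
```

A larger step zeroes more coefficients and raises the reconstruction error, which is the amplitude and phase distortion the last bullet refers to.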

[Figures: face imaged with the 12.5mm lens at 6m, ~ 40 pixels across the face, shown with no compression, 18:1 compression and 31:1 compression.]

40 pixels across face = 5 DCT blocks

Spatial frequencies from 5 cycles/face width upwards are all corrupted

This is exactly the range that is most important for face recognition!
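
The arithmetic behind this can be written out as follows. It is one reading of the argument, under the assumption that any spatial frequency whose period fits inside a single 8-pixel block is carried by that block's quantized coefficients; the function names are illustrative.

```python
BLOCK_SIZE_PX = 8  # JPEG/MPEG DCT block size


def blocks_across(face_px, block_px=BLOCK_SIZE_PX):
    """Number of DCT blocks spanning the face."""
    return face_px / block_px


def cycles_per_face(period_px, face_px):
    """Spatial frequency, in cycles per face width, of a pattern with this period."""
    return face_px / period_px


face_px = 40
print(blocks_across(face_px))                   # 5.0 blocks across the face
print(cycles_per_face(BLOCK_SIZE_PX, face_px))  # 5.0: one cycle per block
print(cycles_per_face(2, face_px))              # 20.0: finest detail 40 pixels can hold
```

So for a face 40 pixels wide, everything between 5 and 20 cycles per face width is represented inside individual blocks and is exposed to quantization.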

A Real Surveillance Camera Installation…

4.8 m

Image quality is defined by many attributes

• Minimum feature size that can be resolved

• Noise level

• Quality of luminance reproduction

• Quality of colour reproduction.

Original laser scanned faces

Same shape, varying pigmentation

Same pigmentation, varying shape

(Russell et al 2007)

Luminance and colour cues are at least as important as shape cues

People perform about equally well using just shape information or just pigmentation cues.

Hue values as greyscale

16x16 macro-blocks

Image compression typically quantizes colour information very heavily…
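
One common way codecs reduce colour information is chroma subsampling, where each 2x2 group of pixels shares a single colour sample, so one 8x8 block of colour data covers a 16x16 pixel macro-block. The sketch below illustrates that idea only; it is an assumption that the system in the talk used this scheme, and the array contents are made up. Image dimensions are assumed to be even.

```python
import numpy as np


def subsample_2x2(chroma):
    """Average each 2x2 group of pixels into one chroma sample."""
    h, w = chroma.shape
    return chroma.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))


def upsample_nearest(chroma_small):
    """Expand each chroma sample back to a 2x2 group of identical pixels."""
    return np.repeat(np.repeat(chroma_small, 2, axis=0), 2, axis=1)


# Illustrative 16x16 chroma channel (one macro-block).
chroma = np.arange(16 * 16, dtype=float).reshape(16, 16)
small = subsample_2x2(chroma)       # 8x8 samples now cover the 16x16 area
restored = upsample_nearest(small)  # back to 16x16, with fine colour detail lost
print(small.shape, restored.shape)
```

Any DCT quantization is then applied on top of this already reduced colour resolution.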

Conclusions

• Surveillance video, as it is currently used, is almost useless for identification.

• Face recognition in low resolution images is badly affected by compression artifacts.

• Image quality standards are needed for surveillance camera installations.

Conan O’Brien, US talk show host

Tarja Halonen, President of Finland
