1

Incremental Detection of Text on Road Signs from Video

Wen Wu

Joint work with Xilin Chen and Jie Yang

2

Acquire Text, Process Text

[Diagram labels: Corpus, Web, Visual, Speech; Language (Text); NLP, Translation, IR/IE, Multimedia, Speech]

3

Text helps to understand images

4

Why interested in text on signs?

• Signs are everywhere in daily life: shop names, billboards, street names, etc.

• Like other information devices, road signs are placed to convey information to people for different purposes.

• Text may be the most flexible way to express dynamic information.

• Why not have computers understand that text and assist people further?

5

Too many signs cause problems

6

It happened in Pittsburgh too!

7

Task

• Automatically detect text on road signs from video.

8

Related work

9

What makes us detect a sign?

10

What do you think?

11

Vertical plane property of signs

12

Divide-and-Conquer Strategy

• Decompose the original task into two sub-tasks: localization of road signs and detection of text;

• Propose an algorithm for each sub-task, and integrate the two by mapping corresponding feature points;

• Use features not only from individual 2D images but also from the temporal dependency between them.
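The idea of carrying a detection forward by mapping corresponding feature points can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the point matches are already given, and it models the motion between two frames as a pure translation (a real sign also grows in scale as the vehicle approaches). The function name `propagate_box` and the box layout `(x, y, w, h)` are hypothetical.

```python
from statistics import median


def propagate_box(box, prev_pts, curr_pts):
    """Shift a box (x, y, w, h) by the median displacement of matched points.

    prev_pts[i] and curr_pts[i] are assumed to be the same physical point
    seen in the previous and the current frame. The median keeps a few bad
    matches from dragging the box away.
    """
    if not prev_pts or len(prev_pts) != len(curr_pts):
        raise ValueError("need equally sized, non-empty point lists")
    dx = median(c[0] - p[0] for p, c in zip(prev_pts, curr_pts))
    dy = median(c[1] - p[1] for p, c in zip(prev_pts, curr_pts))
    x, y, w, h = box
    return (x + dx, y + dy, w, h)


# Two consistent matches and one outlier: the box follows the majority.
prev_pts = [(110, 60), (120, 55), (130, 65)]
curr_pts = [(113, 58), (123, 53), (180, 90)]
new_box = propagate_box((100, 50, 40, 20), prev_pts, curr_pts)
```

Using the median rather than the mean is a design choice of this sketch: a single wrong correspondence then has no effect on the propagated box.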

13

Incremental Detection Framework

14

Why incremental?

• Computation requirement
  – Detection is a computationally expensive step;
  – In contrast, mapping correspondence points is a cheap step;

• Video resolution
  – Detection requires low resolution
  – OCR requires high resolution

[Diagram: Localize, Detect, Recognize along a time axis]
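The cost argument above can be sketched as a loop that calls the expensive detector only occasionally and uses the cheap point-mapping step on all other frames. This is a sketch under stated assumptions, not the presented system: the fixed interval `redetect_every` is an assumption of this example, and `detect` and `track` are hypothetical callables supplied by the caller.

```python
def process_video(frames, detect, track, redetect_every=10):
    """Run `detect` on some frames and the cheaper `track` on the rest.

    detect(frame) -> box or None   (expensive; hypothetical callable)
    track(box, frame) -> box       (cheap; hypothetical callable)
    Returns the per-frame boxes and the number of detector calls.
    """
    boxes = []
    detector_calls = 0
    current = None
    for i, frame in enumerate(frames):
        # Re-detect at a fixed interval, or whenever nothing is being tracked.
        if current is None or i % redetect_every == 0:
            current = detect(frame)
            detector_calls += 1
        else:
            current = track(current, frame)
        boxes.append(current)
    return boxes, detector_calls


# Toy stand-ins: frames are integers, the "sign" moves one pixel per frame.
frames = list(range(25))
toy_detect = lambda frame: (frame, 0, 10, 10)
toy_track = lambda box, frame: (box[0] + 1, box[1], box[2], box[3])
boxes, calls = process_video(frames, toy_detect, toy_track, redetect_every=10)
```

With 25 frames and an interval of 10, the detector runs on frames 0, 10 and 20 only; the remaining 22 frames use the cheap step.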

15

System Implementation Prototype

Built on a PC with an Intel Pentium 4 CPU @ 1.8 GHz and 1 GB memory, running Windows XP.

Data:

1) Captured by a DV camera mounted on a minivan.

2) Video frame size is 640*480.

3) The database includes about 3 hours of video captured under different conditions, i.e., in the morning, in the afternoon, and at dusk.

16

A Demo

Demo

17

Sequences of the Demo

18

Incremental vs. Non-incremental

Another demo

19

Summary of Evaluation

• 22 video sequences with different driving situations;

• Vehicle speed varies from 20 to 55 MPH;

• Testing data contain ~90 road signs and > 300 words.

Table 1. Sign localization performance

# of signs   Hit rate   False hits
92           92.4%      17.9%

Table 2. Text detection performance

Method            Hit rate   False hits   Speed
Non-incremental   80.2%      85.6%        2-6 fps
Incremental       88.9%      9.2%         8-16 fps
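The slides do not define how the percentages are computed. A minimal sketch, assuming that the hit rate is the number of correctly localized signs divided by the number of signs in the test data: under that assumption, 85 of 92 signs is the only whole-number count that rounds to the reported 92.4%. The count 85 is inferred here, not reported. The denominator for the false-hit figures is not stated either, so the helper below takes it as an explicit argument.

```python
def percentage(count, total):
    """Return count / total as a percentage; total must be positive."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * count / total


# Assumed definition: hit rate = correctly localized signs / all signs.
total_signs = 92  # from Table 1
inferred_hits = 85  # inferred from the reported 92.4%, not stated in the slides
hit_rate = round(percentage(inferred_hits, total_signs), 1)

# Check that no other whole-number count gives the same rounded rate.
matching_counts = [
    k for k in range(total_signs + 1)
    if round(percentage(k, total_signs), 1) == 92.4
]
```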

20

Contributions

• Proposed a unified framework for automatically detecting text on road signs from video based on the natural characteristics of the task;

• Exploited features for text detection not only from individual 2D images but also from temporal dependency in video;

• Made connection between understanding visual information and understanding language (text).

21

Conclusions & Future Work

• Automatic detection of text on road signs could be very useful in various applications;

• Experiments have shown that the new framework can significantly improve the robustness and efficiency of an existing text detection algorithm;

• Future work: Apply various language methods to detected texts in video, e.g., translation, IR, etc.

22

Questions?

Thank You
