Post on 22-Dec-2015
1
Incremental Detection of Text on Road Signs from Video
Wen Wu
Joint work with Xilin Chen and Jie Yang
2
Acquire Text, Process Text
[Diagram labels: Corpus, Web, Visual, Speech; Language (Text); NLP, Translation, IR/IE, Multimedia, Speech]
3
Text helps to understand images
4
Why interested in text on signs?
• Signs are everywhere in our daily life: shop names, billboards, street names, etc.;
• Like other information devices, road signs are placed to convey information to people for different purposes;
• Text could be the most flexible way to express dynamic information.
• Why not make computers understand this text and use it to assist people?
5
Too many signs cause problems
6
It happened in Pittsburgh too!
7
Task
• Automatically detect text on road signs from video.
8
Related work
9
What makes us detect a sign?
10
What do you think?
11
Vertical plane property of signs
12
Divide-and-Conquer Strategy
• Decompose the original task into two sub-tasks: localization of road signs and detection of text;
• Propose an algorithm for each sub-task and integrate the two by mapping corresponding feature points;
• Use features not only from individual 2D images but also from the temporal dependency between them.
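The integration step can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the sign's motion between two frames is a translation plus uniform scaling only, and the function name `propagate_box` and its inputs are hypothetical. Given feature points matched between consecutive frames, it carries a sign region found in the previous frame into the current one, so that text detection can be confined to that region.

```python
from math import sqrt


def propagate_box(box, prev_pts, curr_pts):
    """Carry a box from the previous frame into the current frame.

    box: (x_min, y_min, x_max, y_max) in the previous frame.
    prev_pts, curr_pts: equal-length lists of (x, y) feature points,
    where prev_pts[i] corresponds to curr_pts[i].
    Assumption: motion is a translation plus uniform scaling only.
    """
    n = len(prev_pts)
    if n < 2 or n != len(curr_pts):
        raise ValueError("need at least two matched point pairs")

    # Centroids of the matched points in each frame.
    pcx = sum(x for x, _ in prev_pts) / n
    pcy = sum(y for _, y in prev_pts) / n
    qcx = sum(x for x, _ in curr_pts) / n
    qcy = sum(y for _, y in curr_pts) / n

    # Scale is the ratio of the point spreads about their centroids.
    prev_spread = sum((x - pcx) ** 2 + (y - pcy) ** 2 for x, y in prev_pts)
    curr_spread = sum((x - qcx) ** 2 + (y - qcy) ** 2 for x, y in curr_pts)
    if prev_spread == 0:
        raise ValueError("previous points are all identical")
    scale = sqrt(curr_spread / prev_spread)

    # Map each box corner: re-centre, scale, move to the new centroid.
    x0, y0, x1, y1 = box
    return (
        qcx + scale * (x0 - pcx),
        qcy + scale * (y0 - pcy),
        qcx + scale * (x1 - pcx),
        qcy + scale * (y1 - pcy),
    )


# Toy example: the sign doubles in size and shifts 5 pixels to the right.
prev_points = [(10, 10), (20, 10), (10, 20), (20, 20)]
curr_points = [(10, 5), (30, 5), (10, 25), (30, 25)]
new_box = propagate_box((10, 10, 20, 20), prev_points, curr_points)
```

A real system would need a more general motion model and would have to reject mismatched points; the sketch only shows how matched points can link the two sub-tasks across frames.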
13
Incremental Detection Framework
14
Why incremental?
• Computation requirement
– Detection is a computationally expensive step;
– In contrast, mapping correspondence points is a cheap step;
• Video resolution
– Detection requires low resolution
– OCR requires high resolution
[Timeline figure: Localize, Detect, Recognize along a Time axis]
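The computation argument can be made concrete with a small cost model. This is only a sketch under stated assumptions: the per-frame costs are made-up numbers, and running the detector once every `detect_every` frames is a hypothetical schedule, not one described on the slides.

```python
from dataclasses import dataclass


@dataclass
class Costs:
    detect_ms: float  # hypothetical cost of one full detection pass
    map_ms: float     # hypothetical cost of mapping correspondence points


def non_incremental_cost(n_frames, costs):
    """Total time when every frame gets a full detection pass."""
    return n_frames * costs.detect_ms


def incremental_cost(n_frames, costs, detect_every):
    """Total time when detection runs on every `detect_every`-th frame
    and the remaining frames only map correspondence points."""
    if detect_every < 1:
        raise ValueError("detect_every must be at least 1")
    detections = -(-n_frames // detect_every)  # ceiling division
    mappings = n_frames - detections
    return detections * costs.detect_ms + mappings * costs.map_ms


# Illustrative numbers only: 200 ms per detection, 20 ms per mapping.
costs = Costs(detect_ms=200.0, map_ms=20.0)
full = non_incremental_cost(30, costs)
sparse = incremental_cost(30, costs, detect_every=5)
```

With these illustrative numbers, 30 frames cost 6000 ms when detection runs on every frame and 1680 ms under the sparse schedule; the real saving depends on the actual ratio between the two step costs.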
15
System Implementation Prototype
Built on a PC with an Intel Pentium 4 CPU @ 1.8 GHz, 1 GB memory, and Windows XP.
Data:
1) Captured by a DV camera mounted on a minivan.
2) Video frame size is 640*480.
3) The database includes about 3 hours of video captured under different conditions, i.e., in the morning, in the afternoon, and at dusk.
16
A Demo
Demo
17
Sequences of the Demo
18
Incremental vs. Non-incremental
Another demo
19
Summary of Evaluation
• 22 video sequences with different driving situations;
• Vehicle’s speed varies from 20 to 55 MPH
• Testing data contain ~90 road signs and > 300 words.

Table 1. Sign localization performance
# of signs    Hit rate    False hits
92            92.4%       17.9%

Table 2. Text detection performance
                   Hit rate    False hits    Speed
Non-incremental    80.2%       85.6%         2-6 fps
Incremental        88.9%       9.2%          8-16 fps
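As a reading aid for Table 1, the arithmetic below assumes that "hit rate" means the fraction of the 92 signs that were correctly localized. The slides do not define the metric, so this is an assumption, and the implied count is derived here rather than reported in the source.

```python
# Figures taken from Table 1.
num_signs = 92
hit_rate = 0.924

# Assumption: hit rate = correctly localized signs / total signs.
implied_hits = round(hit_rate * num_signs)

# Sanity check: convert the implied count back into a percentage.
recomputed_rate_pct = round(100 * implied_hits / num_signs, 1)
```

Under that assumption the reported rate corresponds to 85 of the 92 signs. The "false hits" column is not recomputed because its denominator is not stated.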
20
Contributions
• Proposed a unified framework for automatically detecting text on road signs from video based on the natural characteristics of the task;
• Exploited features for text detection not only from individual 2D images but also from temporal dependency in video;
• Made a connection between understanding visual information and understanding language (text).
21
Conclusions & Future Work
• Automatic detection of text on road signs could be very useful in various applications;
• Experiments have shown that the new framework can significantly improve the robustness and efficiency of an existing text detection algorithm;
• Future work: apply various language methods to text detected in video, e.g., translation, IR, etc.
22
Questions?
Thank You