speech balloon contour classification in comics
DESCRIPTION
In this work we detail a novel approach for classifying speech balloon in scanned comics book pages based on their contour time series.TRANSCRIPT
Lehigh University, Bethlehem, PA, USA - August 21 2013
Speech balloon contour classification in comics
Christophe RigaudDimosthenis KaratzasJean-Christophe Burie
Jean-Marc Ogier
GREC'13 - christophe-rigaud.com 2
Summary
● Project
● Speech balloons
● Detection
● Classification
● Dataset
● Evaluation
● Conclusion
http://www.tumblr.com
GREC'13 - christophe-rigaud.com 3
Project
L3i project: eBDtheque● June 2011 – September 2014
● Participants
– 2 doctoral researchers
– 5 assistant professors
– 3 professors
● Comic books
– Cultural heritage
– Need to be valorized by the new technologies
● Objective: comics content understanding
– Augmented reading experience
– Information retrieval (e.g. semantic query, full text search)
– New dataset
● Progress
– Panels, text lines, balloons, people
GREC'13 - christophe-rigaud.com 4
Speech balloonsShapes and contours
Smooth contour: dialogue, conversation...
Zigzag contour: exclamation, event, action...
Wavy contour:Thought, dream, insinuation...
Image credits: eBDtheque dataset
GREC'13 - christophe-rigaud.com 5
Detection
Initialization
Detection
Results
min(E)=min(Einternal+Eexternal+Etext)
Active contour model[1]
[1] C. Rigaud, D. Karatzas, J. Van de Weijer, J-C Burie, J-M Ogier. An active contour model for speech balloon detection in comics. In 12th International Conference on Document Analysis and Recognition (ICDAR), 2013
GREC'13 - christophe-rigaud.com 6
ClassificationShape/contour separation
Var = 3.97
Var = 0.35
Barycentre (red), start (green), anticlockwise.
Barycentre (red), start (green), anticlockwise.
GREC'13 - christophe-rigaud.com 7
Dataset
● eBDtheque subset● 22 speech balloons● Pixel level ground truth● Type {oval, rectangle, peak, cloud}● Tail direction
http://ebdtheque.univ-lr.fr
GREC'13 - christophe-rigaud.com 8
Predicted class
Smooth Wavy Zigzag
Actualclass
Smooth 13 1 0
Wavy 1 2 0
Zigzag 1 0 4
Evaluation
● Label correspondences
● Confusion matrix
● Accuracy: 86.3%
Ground truth Classification Variance threshold
Oval, rectangle Smooth < 1.5
Cloud Wavy 1.5 < var <= 2
Peak Zigzag > 2
GREC'13 - christophe-rigaud.com 9
Conclusion
● One step further in the comics content understanding
● High dependence to balloon detection
● Shape/contour separation
● Contours are more discriminant than shapes
● Next:● Normalize the metric according to the size
● Frequency domain information
● More data
● Tail detection and speakers localization