2011 International Conference on Document Analysis and Recognition (ICDAR 2011)

Beijing, China, 18 – 21 September 2011

IEEE Catalog Number: CFP11227-PRT
ISBN: 978-1-4577-1350-7

Pages 1-758 (1/2)



11th International Conference on Document Analysis and Recognition

ICDAR 2011

Table of Contents

Welcome from the General Chairs ..... xxvii
Welcome from the Program Chairs ..... xxviii
Conference Committees ..... xxix
Reviewers ..... xxxii
Sponsors ..... xxxv

Document Image Processing

A Tool for Tuning Binarization Techniques ..... 1
Vavilis Sokratis and Ergina Kavallieratou

A Laplacian Energy for Document Binarization ..... 6
Nicholas R. Howe

An MRF Model for Binarization of Natural Scene Text ..... 11
Anand Mishra, Karteek Alahari, and C.V. Jawahar

Stroke-Like Pattern Noise Removal in Binary Document Images ..... 17
Mudit Agrawal and David Doermann

Combination of Document Image Binarization Techniques ..... 22
Bolan Su, Shijian Lu, and Chew Lim Tan

Determining Document Skew Using Inter-line Spaces ..... 27
Boris Epshtein

Datasets and Performance Evaluation

When is a Problem Solved? ..... 32
Daniel Lopresti and George Nagy

CASIA Online and Offline Chinese Handwriting Databases ..... 37
Cheng-Lin Liu, Fei Yin, Da-Han Wang, and Qiu-Feng Wang


An Open Architecture for End-to-End Document Analysis Benchmarking ..... 42
Bart Lamiroy and Daniel Lopresti

Aletheia - An Advanced Document Layout and Text Ground-Truthing System for Production Environments ..... 48
C. Clausner, S. Pletschacher, and A. Antonacopoulos

HMM-Based Alignment of Inaccurate Transcriptions for Historical Documents ..... 53
Andreas Fischer, Emanuel Indermühle, Volkmar Frinken, and Horst Bunke

Transcript Mapping for Handwritten Text Lines Using Conditional Random Fields ..... 58
Xiang-Dong Zhou, Fei Yin, Da-Han Wang, Qiu-Feng Wang, Masaki Nakagawa, and Cheng-Lin Liu

Document Retrieval (1)

Browsing Heterogeneous Document Collections by a Segmentation-Free Word Spotting Method ..... 63
Marçal Rusiñol, David Aldavert, Ricardo Toledo, and Josep Lladós

Fast Key-Word Searching via Embedding and Active-DTW ..... 68
Raid Saabni and Alex Bronstein

Keyword Spotting in Online Handwritten Documents Containing Text and Non-text Using BLSTM Neural Networks ..... 73
Emanuel Indermühle, Volkmar Frinken, Andreas Fischer, and Horst Bunke

Keyword Spotting in Offline Chinese Handwritten Documents Using a Statistical Model ..... 78
Liang Huang, Fei Yin, Qing-Hu Chen, and Cheng-Lin Liu

BLSTM Neural Network Based Word Retrieval for Hindi Documents ..... 83
Raman Jain, Volkmar Frinken, C.V. Jawahar, and R. Manmatha

A Method for Removing Inflectional Suffixes in Word Spotting of Mongolian Kanjur ..... 88
Hongxi Wei, Guanglai Gao, and Yulai Bao

Poster Session 1

A Handwritten Character Extraction Algorithm for Multi-language Document Image ..... 93
Yonghong Song, Guilin Xiao, Yuanlin Zhang, Lei Yang, and Liuliu Zhao

Retrieval of Envelope Images Using Graph Matching ..... 99
Li Liu, Yue Lu, and Ching Y. Suen

Automatic Estimation of the Legibility of Binarised Historic Documents for Unsupervised Parameter Tuning ..... 104
M. Stommel and G. Frieder


Segmentation of Handwritten Textlines in Presence of Touching Components ..... 109
Jayant Kumar, Le Kang, David Doermann, and Wael Abd-Almageed

Ternary Entropy-Based Binarization of Degraded Document Images Using Morphological Operators ..... 114
T. Hoang Ngan Le, Tien D. Bui, and Ching Y. Suen

Large Scale Page-Based Book Similarity Clustering ..... 119
Nemanja Spasojević and Guillaume Poncin

A New Gradient Based Character Segmentation Method for Video Text Recognition ..... 126
Palaiahnakote Shivakumara, Souvik Bhowmick, Bolan Su, Chew Lim Tan, and Umapada Pal

Video Character Recognition through Hierarchical Classification ..... 131
Palaiahnakote Shivakumara, Trung Quy Phan, Shijian Lu, and Chew Lim Tan

Robust Vanishing Point Detection for MobileCam-Based Documents ..... 136
Xu-Cheng Yin, Hong-Wei Hao, Jun Sun, and Satoshi Naoi

A Benchmark Kannada Handwritten Document Dataset and Its Segmentation ..... 141
Alireza Alaei, P. Nagabhushan, and Umapada Pal

A Digital Ink Recognition Server for Handwritten Japanese Text ..... 146
Daqing Wang, Bilan Zhu, and Masaki Nakagawa

A Novel Method for Embedded Text Segmentation Based on Stroke and Color ..... 151
Xiufei Wang, Lei Huang, and Changping Liu

Using Ontologies to Reduce the Semantic Gap between Historians and Image Processing Algorithms ..... 156
Mickaël Coustaty, Alain Bouju, Karell Bertet, and Georges Louis

Probabilistic Mathematical Formula Recognition Using a 2D Context-Free Graph Grammar ..... 161
Mehmet Celik and Berrin Yanikoglu

Chromatic / Achromatic Separation in Noisy Document Images ..... 167
Asma Ouji, Yann Leydier, and Frank Lebourgeois

Novel Data Representation for Text Extraction from Multispectral Historical Document Images ..... 172
Rachid Hedjam and Mohamed Cheriet

Text Detection in Natural Scene Images by Stroke Gabor Words ..... 177
Chucai Yi and Yingli Tian

A Shape Descriptor Combining Logarithmic-Scale Histogram of Radon Transform and Phase-Only Correlation Function ..... 182
Makoto Hasegawa and Salvatore Tabbone

Segmentation of Graphical Objects as Maximally Stable Salient Regions ..... 187
Su Yang and Yuanyuan Wang


An Optimized Multi-stream Decoding Algorithm for Handwritten Word Recognition ..... 192
Yousri Kessentini, Thierry Paquet, and Ahmed Guermazi

Indexing On-line Handwritten Texts Using Word Confusion Networks ..... 197
Sebastián Peña Saldarriaga and Mohamed Cheriet

Circle Text Expansion as Low-Rank Textures ..... 202
Xin Zhang and Fuchun Sun

MRG-OHTC Database for Online Handwritten Tibetan Character Recognition ..... 207
Long-long Ma, Hui-dan Liu, and Jian Wu

Using Readers' Highlighting on Monochromatic Documents for Automatic Text Transcription and Summarization ..... 212
Ricardo da Silva Barboza, Rafael Dueire Lins, and Victor Matheus de S. Pereira

Color-Mixing Correction of Overlapped Colors in Scanner Images ..... 217
Misako Suwa

Enhanced Active Contour Method for Locating Text ..... 222
Yaakov Navon, Vladimir Kluzner, and Boaz Ophir

Updating Knowledge in Feedback-Based Multi-classifier Systems ..... 227
D. Impedovo and G. Pirlo

A New Feature Optimization Method Based on Two-Directional 2DLDA for Handwritten Chinese Character Recognition ..... 232
Xue Gao, Wenhuan Wen, and Lianwen Jin

Development of Template-Free Form Recognition System ..... 237
Junichi Hirayama, Hiroshi Shinjo, Toshikazu Takahashi, and Takeshi Nagasaki

Data Extraction from Web Tables: The Devil is in the Details ..... 242
George Nagy, Sharad Seth, Dongpu Jin, David W. Embley, Spencer Machado, and Mukkai Krishnamoorthy

A Method of Evaluating Table Segmentation Results Based on a Table Image Ground Truther ..... 247
Yanhui Liang, Yizhou Wang, and Eric Saund

The SCRIBO Module of the Olena Platform: A Free Software Framework for Document Image Analysis ..... 252
Guillaume Lazzara, Roland Levillain, Thierry Géraud, Yann Jacquelet, Julien Marquegnies, and Arthur Crépin-Leblond

A Semi-supervised Ensemble Learning Approach for Character Labeling with Minimal Human Effort ..... 259
Szilárd Vajda, Akmal Junaidi, and Gernot A. Fink

Character Recognition Based on DTW-Radon ..... 264
K.C. Santosh


A Mixed Approach for Handwritten Documents Structural Analysis ..... 269
Vincent Malleron and Véronique Eglin

Binarization of Color Character Strings in Scene Images Using K-Means Clustering and Support Vector Machines ..... 274
Toru Wakahara and Kohei Kita

Efficient Cut-Off Threshold Estimation for Word Spotting Applications ..... 279
A. L. Kesidis and B. Gatos

Improvement of On-line Recognition Systems Using a RBF-Neural Network Based Writer Adaptation Module ..... 284
Lobna Haddad, Tarek M. Hamdani, Monji Kherallah, and Adel M. Alimi

Distortion Measurement for Automatic Document Verification ..... 289
Joost van Beusekom and Faisal Shafait

Composite Script Identification and Orientation Detection for Indian Text Images ..... 294
Shamita Ghosh and Bidyut B. Chaudhuri

A Painting Based Technique for Skew Estimation of Scanned Documents ..... 299
Alireza Alaei, Umapada Pal, P. Nagabhushan, and Fumitaka Kimura

Database Development and Recognition of Handwritten Devanagari Legal Amount Words ..... 304
R. Jayadevan, S. R. Kolhe, P. M. Patil, and Umapada Pal

Sample-Dependent Feature Selection for Faster Document Image Categorization ..... 309
Jérôme Louradour and Christopher Kermorvant

Co-training for Handwritten Word Recognition ..... 314
Volkmar Frinken, Andreas Fischer, Horst Bunke, and Alicia Fornés

Web Multimedia Object Clustering via Information Fusion ..... 319
Wenting Lu, Lei Li, Tao Li, Honggang Zhang, and Jun Guo

A New Text-Line Alignment Approach Based on Piece-Wise Painting Algorithm for Handwritten Documents ..... 324
Alireza Alaei, P. Nagabhushan, and Umapada Pal

Baseline Dependent Percentile Features for Offline Arabic Handwriting Recognition ..... 329
Pradeep Natarajan, David Belanger, Rohit Prasad, Matin Kamali, Krishna Subramanian, and Prem Natarajan

Stroke-Based Performance Metrics for Handwritten Mathematical Expressions ..... 334
Richard Zanibbi, Amit Pillay, Harold Mouchère, Christian Viard-Gaudin, and Dorothea Blostein


An Application of the 2D Gaussian Filter for Enhancing Feature Extraction in Off-line Signature Verification ..... 339
Vu Nguyen and Michael Blumenstein

alpha-Shape Based Classification with Applications to Optical Character Recognition ..... 344
Eli Packer, Asaf Tzadok, and Vladimir Kluzner

Bags of Strokes Based Approach for Classification and Indexing of Drop Caps ..... 349
Thi Thuong Huyen Nguyen, Mickaël Coustaty, and Jean-Marc Ogier

Robust Cell Extraction Method for Form Documents Based on Intersection Searching and Global Optimization ..... 354
Hiroshi Tanaka, Hiroaki Takebe, and Yoshinobu Hotta

Hypothesis Preservation Approach to Scene Text Recognition with Weighted Finite-State Transducer ..... 359
Takafumi Yamazoe, Minoru Etoh, Takeshi Yoshimura, and Kousuke Tsujino

Statistical Grouping for Segmenting Symbols Parts from Line Drawings, with Application to Symbol Spotting ..... 364
Nibal Nayef and Thomas M. Breuel

Overlapped Handwriting Input on Mobile Phones ..... 369
Yanming Zou, Yingfei Liu, Ying Liu, and Kongqiao Wang

A Robust Color-Independent Text Detection Method from Complex Videos ..... 374
Yan Zhao, Tong Lu, and Wujun Liao

Handwritten and Audio Information Fusion for Mathematical Symbol Recognition ..... 379
Sofiane Medjkoune, Harold Mouchère, Simon Petitrenaud, and Christian Viard-Gaudin

A Novel Skew Detection Technique Based on Vertical Projections ..... 384
A. Papandreou and B. Gatos

Discrimination of Old Document Images Using Their Style ..... 389
Mickael Coustaty and Jean-Marc Ogier

Performance Evaluation of Algorithms for Newspaper Article Identification ..... 394
Roberto Beretta and Luigi Laura

Table Detection in Noisy Off-line Handwritten Documents ..... 399
Jin Chen and Daniel Lopresti

A Model-Based Ruling Line Detection Algorithm for Noisy Handwritten Documents ..... 404
Jin Chen and Daniel Lopresti

An On-line Arabic Handwriting Recognition System: Based on a New On-line Graphemes Segmentation Technique ..... 409
Hesham M. Eraqi and Sherif Abdel Azeem


Keynote Speech 1

Document Recognition without Strong Models ..... 414
Henry S. Baird

Text Extraction (1)

Detection and Segmentation of Antialiased Text in Screen Images ..... 424
Sivan Gleichman, Boaz Ophir, Amir Geva, Mattias Marder, Ella Barkan, and Eli Packer

AdaBoost for Text Detection in Natural Scene ..... 429
Jung-Jin Lee, Pyoung-Hean Lee, Seong-Whan Lee, Alan Yuille, and Christof Koch

Dot Text Detection Based on FAST Points ..... 435
Yuning Du, Haizhou Ai, and Shihong Lao

Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning ..... 440
Adam Coates, Blake Carpenter, Carl Case, Sanjeev Satheesh, Bipin Suresh, Tao Wang, David J. Wu, and Andrew Y. Ng

Mathematics Recognition

Math Spotting: Retrieving Math in Technical Documents Using Handwritten Query Images ..... 446
Richard Zanibbi and Li Yu

HAMEX - A Handwritten and Audio Dataset of Mathematical Expressions ..... 452
Solen Quiniou, Harold Mouchère, Sebastián Peña Saldarriaga, Christian Viard-Gaudin, Emmanuel Morin, Simon Petitrenaud, and Sofiane Medjkoune

HMM-Based Recognition of Online Handwritten Mathematical Symbols Using Segmental K-Means Initialization and a Modified Pen-Up/Down Feature ..... 457
Lei Hu and Richard Zanibbi

Comparing Approaches to Mathematical Document Analysis from PDF ..... 463
Josef B. Baker, Alan P. Sexton, Volker Sorge, and Masakazu Suzuki

Applications (1)

Preservative License Plate De-identification for Privacy Protection ..... 468
Liang Du and Haibin Ling

Evaluation of Voting with Form Dropout Techniques for Ballot Vote Counting ..... 473
Elisa H. Barney Smith, Shatakshi Goyal, Robbie Scott, and Daniel Lopresti

Conversion of PDF Books in ePub Format ..... 478
Simone Marinai, Emanuele Marino, and Giovanni Soda


Handwritten Street Name Recognition for Indian Postal Automation ..... 483
Umapada Pal, Ramit Kumar Roy, and Fumitaka Kimura

Layout Analysis

Table Content Understanding in SmartFIX ..... 488
Florian Deckert, Benjamin Seidler, Markus Ebbecke, and Michael Gillmann

Continuous CRF with Multi-scale Quantization Feature Functions Application to Structure Extraction in Old Newspaper ..... 493
David Hebert, Thierry Paquet, and Stephane Nicolas

Classifying Textual Components of Bilingual Documents with Decision-Tree Support Vector Machines ..... 498
Xiao-Rong Lin, Chien-Yang Guo, and Fu Chang

Iterative Analysis of Pages in Document Collections for Efficient User Interaction ..... 503
Joseph Chazalon, Bertrand Coüasnon, and Aurélie Lemaitre

Layout Analysis for Historical Manuscripts Using Sift Features ..... 508
Angelika Garz, Robert Sablatnig, and Markus Diem

Handwritten Text Recognition

Joint Optimization of Hidden Conditional Random Fields and Non Linear Feature Extraction ..... 513
Antoine Vinel, Trinh Minh Tri Do, and Thierry Artières

Improving Handwritten Chinese Text Recognition by Confidence Transformation ..... 518
Qiu-Feng Wang, Fei Yin, and Cheng-Lin Liu

Concurrent Optimization of Context Clustering and GMM for Offline Handwritten Word Recognition Using HMM ..... 523
Tomoyuki Hamamura, Bunpei Irie, Takuya Nishimoto, Nobutaka Ono, and Shigeki Sagayama

Dempster-Shafer Based Rejection Strategy for Handwritten Word Recognition ..... 528
Thomas Burger, Yousri Kessentini, and Thierry Paquet

Handwritten Text Recognition for Marriage Register Books ..... 533
Verónica Romero, Joan Andreu Sánchez, Nicolás Serrano, and Enrique Vidal

Character Recognition (1)

Limits on the Application of Frequency-Based Language Models to OCR ..... 538
Ray Smith


An Impact of OCR Errors on Automated Classification of OCR Japanese Texts with Parts-of-Speech Analysis ..... 543
Akihiro Kokawa, Lazaro S.P. Busagala, Wataru Ohyama, Tetsushi Wakabayashi, and Fumitaka Kimura

Recognizing Characters with Severe Perspective Distortion Using Hash Tables and Perspective Invariants ..... 548
Pan Pan, Yuanping Zhu, Jun Sun, and Satoshi Naoi

An Automatic Method for Enhancing Character Recognition in Degraded Historical Documents ..... 553
Gabriel Pereira e Silva and Rafael Dueire Lins

Discriminative Bernoulli Mixture Models for Handwritten Digit Recognition ..... 558
Adrià Giménez, J. Andrés-Ferrer, Alfons Juan, and Nicolás Serrano

Document Segmentation

Language-Independent Text Lines Extraction Using Seam Carving ..... 563
Raid Saabni and Jihad El-Sana

Template Based Segmentation of Touching Components in Handwritten Text Lines ..... 569
Le Kang and David Doermann

Graph Clustering-Based Ensemble Method for Handwritten Text Line Segmentation ..... 574
Vasant Manohar, Shiv N. Vitaladevuni, Huaigu Cao, Rohit Prasad, and Prem Natarajan

Text-Line Extraction Using a Convolution of Isotropic Gaussian Filter with a Set of Line Filters ..... 579
Syed Saqib Bukhari, Faisal Shafait, and Thomas M. Breuel

Fast Rule-Line Removal Using Integral Images and Support Vector Machines ..... 584
Jayant Kumar and David Doermann

Online Handwriting Recognition

A Generative Model for Handwritings Based on Enhanced Feature Desynchronization ..... 589
Seiichi Uchida, Toru Sasaki, and Feng Yaokai

Objective Function Design for MCE-Based Combination of On-line and Off-line Character Recognizers for On-line Handwritten Japanese Text Recognition ..... 594
Bilan Zhu, JinFeng Gao, and Masaki Nakagawa

A Weighted Finite-State Transducer (WFST)-Based Language Model for Online Indic Script Handwriting Recognition ..... 599
Suhan Chowdhury, Utpal Garain, and Tanushyam Chattopadhyay


On-line Handwritten Japanese Characters Recognition Using a MRF Model with Parameter Optimization by CRF ..... 603
Bilan Zhu and Masaki Nakagawa

Symbol Knowledge Extraction from a Simple Graphical Language ..... 608
Jinpeng Li, Harold Mouchere, and Christian Viard-Gaudin

Forensic Document Analysis

Segmentation and Normalisation in Grapheme Codebooks ..... 613
Tara Gilliam, Richard C. Wilson, and John A. Clark

Evaluating the Rarity of Handwriting Formations ..... 618
Sargur N. Srihari

Multi-fractal Modeling for On-line Text-Independent Writer Identification ..... 623
Aymen Chaabouni, Houcine Boubaker, Monji Kherallah, Adel M. Alimi, and Haikal El Abed

Writer Retrieval - Exploration of a Novel Biometric Scenario Using Perceptual Features Derived from Script Orientation ..... 628
Vlad Atanasiu, Laurence Likforman-Sulem, and Nicole Vincent

Quality Analysis of Dynamic Signature Based on the Sigma-Lognormal Model ..... 633
Javier Galbally, Julian Fierrez, Marcos Martinez-Diaz, and Réjean Plamondon

Poster Session 2

An Improved Method Based on Weighted Grid Micro-structure Feature for Text-Independent Writer Recognition ..... 638
Lu Xu, Xiaoqing Ding, Liangrui Peng, and Xin Li

A Multi-scale Text Line Segmentation Method in Freestyle Handwritten Documents ..... 643
Yangdong Gao, Xiaoqing Ding, and Changsong Liu

Digit/Symbol Pruning and Verification for Arabic Handwritten Digit/Symbol Spotting ..... 648
Nicola Nobile, Chun Lei He, Malik Waqas Sagheer, Louisa Lam, and Ching Y. Suen

Modified Two-Class LDA Based Compound Distance for Similar Handwritten Chinese Characters Discrimination ..... 653
Yunxue Shao, Chunheng Wang, Baihua Xiao, Rongguo Zhang, and Linbo Zhang

Error Correction with In-domain Training across Multiple OCR System Outputs ..... 658
William B. Lund and Eric K. Ringger

Effects of Generating a Large Amount of Artificial Patterns for On-line Handwritten Japanese Character Recognition ..... 663
Bin Chen, Bilan Zhu, and Masaki Nakagawa

A Novel Short Merged Off-line Handwritten Chinese Character String Segmentation Algorithm Using Hidden Markov Model ....................668

Zhiwei Jiang, Xiaoqing Ding, Changsong Liu, and Yanwei Wang

Binarization of Textual Content in Video Frames ....................673

Konstantinos Ntirogiannis, Basilis Gatos, and Ioannis Pratikakis

Word Retrieval in Historical Document Using Character-Primitives ....................678

Partha Pratim Roy, Jean-Yves Ramel, and Nicolas Ragot

An On-line Handwritten Text Search Method Based on Directional Feature Matching ....................683

Pasitthideth Luangvilay, Bilan Zhu, and Masaki Nakagawa

Text Localization in Real-World Images Using Efficiently Pruned Exhaustive Search ....................687

Lukáš Neumann and Jiří Matas

Classical Mongolian Words Recognition in Historical Document ....................692

Guanglai Gao, Xiangdong Su, Hongxi Wei, and Yeyun Gong

A Novel Italic Detection and Rectification Method for Chinese Advertising Images ....................698

Jie Liu, Heping Li, Shuwu Zhang, and Wei Liang

Minimizing User Annotations in the Generation of Layout Ground-Truthed Data ....................703

Karim Hadjar and Rolf Ingold

An Improved Scene Text Extraction Method Using Conditional Random Field and Optical Character Recognition ....................708

Hongwei Zhang, Changsong Liu, Cheng Yang, Xiaoqing Ding, and KongQiao Wang

Identification of Indic Scripts on Torn-Documents ....................713

Sukalpa Chanda, Katrin Franke, and Umapada Pal

A Contour-Based Method for Logo Detection ....................718

The Anh Pham, Mathieu Delalandre, and Sabine Barrat

A Contour-Based Progressive Technique for Shape Recognition ....................723

Stefano Ferilli, Teresa M.A. Basile, Floriana Esposito, and Marenglen Biba

Localization of Digit Strings in Farsi/Arabic Document Images Using Structural Features and Syntactical Analysis ....................728

Ali Abedi and Karim Faez

Text/Graphics Segmentation in Architectural Floor Plans ....................734

Sheraz Ahmed, Markus Weber, Marcus Liwicki, and Andreas Dengel

OCR-Driven Writer Identification and Adaptation in an HMM Handwriting Recognition System ....................739

Huaigu Cao, Rohit Prasad, and Prem Natarajan

Handwritten and Typewritten Text Identification and Recognition Using Hidden Markov Models ....................744

Huaigu Cao, Rohit Prasad, and Prem Natarajan

Metadata Extraction System for Chinese Books ....................749

Liangcai Gao, Yuan Zhong, Yingmin Tang, Zhi Tang, Xiaofan Lin, and Xuan Hu

A Fast Alignment Scheme for Automatic OCR Evaluation of Books ....................754

Ismet Zeki Yalniz and R. Manmatha

Improving Scene Text Detection by Scale-Adaptive Segmentation and Weighted CRF Verification ....................759

Yi-Feng Pan, Yuanping Zhu, Jun Sun, and Satoshi Naoi

Progressive Alignment and Discriminative Error Correction for Multiple OCR Engines ....................764

William B. Lund, Daniel D. Walker, and Eric K. Ringger

Offline Writer Identification Using K-Adjacent Segments ....................769

Rajiv Jain and David Doermann

Binarizing the Courtesy Amount Field on Color Chinese Bank Check Images ....................774

Dong Liu and Youbin Chen

A Table Detection Method for Multipage PDF Documents via Visual Seperators and Tabular Structures ....................779

Jing Fang, Liangcai Gao, Kun Bai, Ruiheng Qiu, Xin Tao, and Zhi Tang

Look Inside the World of Parts of Handwritten Characters ....................784

Wang Song, Seiichi Uchida, and Marcus Liwicki

Chinese Keyword Spotting Using Knowledge-Based Clustering ....................789

Yong Xia, Kuanquan Wang, and Mingwei Li

Text Segmentation of Consumer Magazines in PDF Format ....................794

Jian Fan

On-line Chinese Character Recognition System for Overlapping Samples ....................799

Xiang Wan, Changsong Liu, and Yanming Zou

On-line Signature Verification Using Segment-to-Segment Graph Matching ....................804

Kaiyue Wang, Yunhong Wang, and Zhaoxiang Zhang

Snap and Translate Using Windows Phone ....................809

Jun Du, Qiang Huo, Lei Sun, and Jian Sun

Comparative Study of Part-Based Handwritten Character Recognition Methods ....................814

Wang Song, Seiichi Uchida, and Marcus Liwicki

A Keypoint-Based Approach toward Scenery Character Detection ....................819

Seiichi Uchida, Yuki Shigeyoshi, Yasuhiro Kunishige, and Feng Yaokai

Three Dimensional Rotation-Free Recognition of Characters ....................824

Ryo Narita, Wataru Ohyama, Tetsushi Wakabayashi, and Fumitaka Kimura

A Novel Approach for Graphics Recognition Based on Galois Lattice and Bag of Words Representation ....................829

Amani Boumaiza and Salvatore Tabbone

Effects of Line Densities on Nonlinear Normalization for Online Handwritten Japanese Character Recognition ....................834

Truyen Van Phan, JinFeng Gao, Bilan Zhu, and Masaki Nakagawa

Super-Resolved Binarization of Text Based on the FAIR Algorithm ....................839

Thibault Lelore and Frédéric Bouchara

Writer Identification Using TF-IDF for Cursive Handwritten Word Recognition ....................844

Quang Anh Bui, Muriel Visani, Sophea Prum, and Jean-Marc Ogier

Creation and Analysis of a Corpus of Text Rich Indian TV Videos ....................849

T. Chattopadhyay, Soumik Sengupta, Aniruddha Sinha, and Nisha Rampuria

Text Classification and Document Layout Analysis of Paper Fragments ....................854

Markus Diem, Florian Kleber, and Robert Sablatnig

Touching Character Separation in Chinese Handwriting Using Visibility-Based Foreground Analysis ....................859

Liang Xu, Fei Yin, Qiu-Feng Wang, and Cheng-Lin Liu

Improved Automatic Analysis of Architectural Floor Plans ....................864

Sheraz Ahmed, Marcus Liwicki, Markus Weber, and Andreas Dengel

Subgraph Spotting through Explicit Graph Embedding: An Application to Content Spotting in Graphic Document Images ....................870

Muhammad Muzzamil Luqman, Jean-Yves Ramel, Josep Lladós, and Thierry Brouard

Exploiting Collection Level for Improving Assisted Handwritten Word Transcription of Historical Documents ....................875

Laurent Guichard, Joseph Chazalon, and Bertrand Coüasnon

Embedding a Mathematical OCR Module into OCRopus ....................880

Shinpei Yamazaki, Fumihiro Furukori, Qinzheng Zhao, Keiichiro Shirai, and Masayuki Okamoto

Handwriting Character Recognition as a Service: A New Handwriting Recognition System Based on Cloud Computing ....................885

Yan Gao, Lanwen Jin, Cong He, and Guibin Zhou

Page Curling Correction for Scanned Books Using Local Distortion Information ....................890

Vladimir Kluzner and Asaf Tzadok

Image Enhancement for Degraded Binary Document Images ....................895

Zhixin Shi, Srirangaraj Setlur, and Venu Govindaraju

Hybrid Approach to Adaptive OCR for Historical Books ....................900

Vladimir Kluzner, Asaf Tzadok, Dan Chevion, and Eugene Walach

Restoration of Arbitrarily Warped Historical Document Images Using Flow Lines ....................905

Maryam Rahnemoonfar and Apostolos Antonacopoulos

Towards Improving the Accuracy of Telugu OCR Systems ....................910

P. Pavan Kumar, Chakravarthy Bhagvati, Atul Negi, Arun Agarwal, and B. L. Deekshatulu

Correcting Specular Noise in Multiple Images of Photographed Documents ....................915

Ednardo Mariano, Rafael Dueire Lins, Gabriel de França Pereira e Silva, Jian Fan, Peter Majewicz, and Marcelo Thielo

A Study on Automatic Chinese Text Classification ....................920

Xi Luo, Wataru Ohyama, Tetsushi Wakabayashi, and Fumitaka Kimura

A New System for Recognition of Handwritten Persian Bank Checks ....................925

Javad Sadri, Younes Akbari, Mohammad J. Jalili, Ahmad Farahi, and Maliheh Habibi

Dynamic Text Line Segmentation for Real-Time Recognition of Chinese Handwritten Sentences ....................931

Da-Han Wang and Cheng-Lin Liu

Character Enhancement for Historical Newspapers Printed Using Hot Metal Typesetting ....................936

Iuliu Konya, Stefan Eickeler, and Christoph Seibert

Character n-Gram Spotting in Document Images ....................941

M. Sudha Praveen, K. Pramod Sankar, and C. V. Jawahar

Use of Semantic and Physical Constraints in Bayesian Networks for Form Recognition ....................946

Philippot Emilie, Belaïd Yolande, and Belaïd Abdel

Keynote Speech 2

Chinese Paleography, Calligraphy, and Pattern Recognition: Styles and Scripts in Excavated Ancient Chinese Documents ....................951

Xing Wen

Applications (2)

MCS for Online Mode Detection: Evaluation on Pen-Enabled Multi-touch Interfaces ....................957

Markus Weber, Marcus Liwicki, Yannik T. H. Schelske, Christopher Schoelzel, Florian Strauß, and Andreas Dengel

Discovering Legible Chinese Typefaces for Reading Digital Documents ....................962

Bing Zhang, Ying Li, Ching Y. Suen, and Xuemin Zhang

Detecting Figure-Panel Labels in Medical Journal Articles Using MRF ....................967

Daekeun You, Sameer Antani, Dina Demner-Fushman, Venu Govindaraju, and George R. Thoma

A New Method on the Segmentation and Recognition of Chinese Characters for Automatic Chinese Seal Imprint Retrieval ....................972

Chao Ren, Dong Liu, and Youbin Chen

Graphics Recognition

CalliGUI: Interactive Labeling of Calligraphic Character Images ....................977

George Nagy and Xiafen Zhang

Symbol Spotting in Line Drawings through Graph Paths Hashing ....................982

Anjan Dutta, Josep Lladós, and Umapada Pal

A Non-rigid Feature Extraction Method for Shape Recognition ....................987

Jon Almazán, Alicia Fornés, and Ernest Valveny

Low Resolution QR-Code Recognition by Applying Super-Resolution Using the Property of QR-Codes ....................992

Yuji Kato, Daisuke Deguchi, Tomokazu Takahashi, Ichiro Ide, and Hiroshi Murase

Character Recognition (2)

Tuning between Exponential Functions and Zones for Membership Functions Selection in Voronoi-Based Zoning for Handwritten Character Recognition ....................997

S. Impedovo and G. Pirlo

Multiple Instance Learning Based Method for Similar Handwritten Chinese Characters Discrimination ....................1002

Yunxue Shao, Chunheng Wang, Baihua Xiao, Rongguo Zhang, and Yang Zhang

Perceptron Learning of Modified Quadratic Discriminant Function ....................1007

Tong-Hua Su, Cheng-Lin Liu, and Xu-Yao Zhang

Similar Handwritten Chinese Character Recognition Using Discriminative Locality Alignment Manifold Learning ....................1012

Dapeng Tao, Lingyu Liang, Lianwen Jin, and Yan Gao

Keynote Speech 3

The Four and a Half Challenges of Humanities Data ....................1017

Marc Wilhelm Küster

Text Extraction (2)

A Gradient Vector Flow-Based Method for Video Character Segmentation ....................1024

Trung Quy Phan, Palaiahnakote Shivakumara, Bolan Su, and Chew Lim Tan

Text Extraction from Video Using Conditional Random Fields ....................1029

Xujun Peng, Huaigu Cao, Rohit Prasad, and Premkumar Natarajan

Scene Text Extraction by Superpixel CRFs Combining Multiple Character Features ....................1034

Min Su Cho, Jae-Hyun Seok, Seonghun Lee, and Jin Hyung Kim

Bayesian Approach to Photo Time-Stamp Recognition ....................1039

Asif Shahab, Faisal Shafait, and Andreas Dengel

A Chinese Character Localization Method Based on Intergrating Structure and CC-Clustering for Advertising Images ....................1044

Jie Liu, Shuwu Zhang, Heping Li, and Wei Liang

Scenery Character Detection with Environmental Context ....................1049

Yasuhiro Kunishige, Feng Yaokai, and Seiichi Uchida

Document Retrieval (2)

Real-Time Document Image Retrieval for a 10 Million Pages Database with a Memory Efficient and Stability Improved LLAH ....................1054

Kazutaka Takeda, Koichi Kise, and Masakazu Iwamura

Document Image Classification and Labeling Using Multiple Instance Learning ....................1059

Jayant Kumar, Jaishanker Pillai, and David Doermann

A Lattice-Based Method for Keyword Spotting in Online Chinese Handwriting ....................1064

Heng Zhang and Cheng-Lin Liu

A Graph Lattice Approach to Maintaining Dense Collections of Subgraphs as Image Features ....................1069

Eric Saund

Similar Manga Retrieval Using Visual Vocabulary Based on Regions of Interest ....................1075

Weihan Sun and Koichi Kise

Case Study in Hebrew Character Searching ....................1080

Irina Rabaev, Ofer Biller, Jihad El-Sana, Klara Kedem, and Itshak Dinstein

Character Recognition (3)

Multiscale Histogram of Oriented Gradient Descriptors for Robust Character Recognition ....................1085

Andrew J. Newell and Lewis D. Griffin

A Coarse Classifier Construction Method from a Large Number of Basic Recognizers for On-line Recognition of Handwritten Japanese Characters ....................1090

Bilan Zhu and Masaki Nakagawa

Affine-Invariant Recognition of Handwritten Characters via Accelerated KL Divergence Minimization ....................1095

Toru Wakahara and Yukihiko Yamashita

MQDF Discriminative Learning Based Offline Handwritten Chinese Character Recognition ....................1100

Yanwei Wang, Xiaoqing Ding, and Changsong Liu

A Semi-supervised SVM Framework for Character Recognition ....................1105

Amit Arora and Anoop M. Namboodiri

Efficient Word Recognition Using a Pixel-Based Dissimilarity Measure ....................1110

Sebastian Colutto and Basilis Gatos

Poster Session 3

A Compression Scheme for Handwritten Patterns Based on Curve Fitting ....................1115

Kamal Gupta, Manish Bansal, and Santanu Chaudhury

Edge-Based Features for Localization of Artificial Urdu Text in Video Images ....................1120

Akhtar Jamil, Imran Siddiqi, Fahim Arif, and Ahsen Raza

Stamp Detection in Color Document Images ....................1125

Barbora Micenková and Joost van Beusekom

Trie-Lexicon-Driven Recognition for On-line Handwritten Japanese Disease Names Using a Time-Synchronous Method ....................1130

Bilan Zhu and Masaki Nakagawa

Convolutional Neural Network Committees for Handwritten Character Classification ....................1135

Dan Claudiu Cireşan, Ueli Meier, Luca Maria Gambardella, and Jürgen Schmidhuber

Semantic Logging: Towards Explanation-Aware DAS ....................1140

Björn Forcher, Stefan Agne, Andreas Dengel, Michael Gillmann, and Thomas Roth-Berghofer

A Novel Preprocessing Method for Hectography Prints Based on Independent Component Analysis ....................1145

Thomas Kurbiel, Iuliu Konya, and Stefan Eickeler

An Empirical Evaluation on HIT-OR3C Database ....................1150

Shusen Zhou, Qingcai Chen, Xiaolong Wang, Xinyi Guo, and Hui Li

Greek Polytonic OCR Based on Efficient Character Class Number Reduction ....................1155

B. Gatos, G. Louloudis, and N. Stamatopoulos

Adaptive Zoning Features for Character and Word Recognition ....................1160

B. Gatos, A. L. Kesidis, and A. Papandreou

A New Fourier-Moments Based Video Word and Character Extraction Method for Recognition ....................1165

Deepak Rajendran, Palaiahnakote Shivakumara, Bolan Su, Shijian Lu, and Chew Lim Tan

Signature Segmentation from Machine Printed Documents Using Conditional Random Field ....................1170

Ranju Mandal, Partha Pratim Roy, and Umapada Pal

Lexicon-Free, Novel Segmentation of Online Handwritten Indic Words ....................1175

Suresh Sundaram and A. G. Ramakrishan

Scale Space Binarization Using Edge Information Weighted by a Foreground Estimation ....................1180

Florian Kleber, Markus Diem, and Robert Sablatnig

Multi Resolution Layout Analysis of Medieval Manuscripts Using Dynamic MLP ....................1185

Micheal Baechler and Rolf Ingold

Document Images Indexing with Relevance Feedback: An Application to Industrial Context ....................1190

O. Augereau, N. Journet, and J.-P. Domenger

Interactive Competitive Breadth-First Exploration for Sketch Interpretation ....................1195

Achraf Ghorbel, Sébastien Macé, Aurélie Lemaitre, and Eric Anquetil

Document Image Indexing Using Edit Distance Based Hashing ....................1200

Ehtesham Hassan, Santanu Chaudhury, and M. Gopal

New Binarization Approach Based on Text Block Extraction ....................1205

Ines Ben Messaoud, Hamid Amiri, Haikal El Abed, and Volker Märgner

Searching OCR'ed Text: An LDA Based Approach ....................1210

Ehtesham Hassan, Vikram Garg, S. K. Mirajul Haque, Santanu Chaudhury, and M. Gopal

A CRF Based Scheme for Overlapping Multi-colored Text Graphics Separation ....................1215

Ritu Garg, Ehtesham Hassan, Santanu Chaudhury, and M. Gopal

Fuzzy Relative Positioning Templates for Symbol Recognition ....................1220

Adrien Delaye and Eric Anquetil

Recognition of Printed Mathematical Expressions Using Two-Dimensional Stochastic Context-Free Grammars ....................1225

Francisco Álvaro, Joan-Andreu Sánchez, and José-Miguel Benedí

Document Recto-verso Registration Using a Dynamic Time Warping Algorithm ....................1230

Rabeux Vincent, Journet Nicholas, and Domenger Jean Philippe

Automatic Content Extraction on Semi-structured Documents ....................1235

José Eduardo Bastos dos Santos

Video Script Identification Based on Text Lines ....................1240

Trung Quy Phan, Palaiahnakote Shivakumara, Zhang Ding, Shijian Lu, and Chew Lim Tan

Extending Page Segmentation Algorithms for Mixed-Layout Document Processing ....................1245

Amy Winder, Tim Andersen, and Elisa H. Barney Smith

Better Digit Recognition with a Committee of Simple Neural Nets ....................1250

Ueli Meier, Dan Claudiu Cireşan, Luca Maria Gambardella, and Jürgen Schmidhuber

Towards Improved Paper-Based Election Technology ....................1255

Elisa H. Barney Smith, Daniel Lopresti, George Nagy, and Ziyan Wu

An Evaluation of HMM-Based Techniques for the Recognition of Screen Rendered Text ....................1260

Sheikh Faisal Rashid, Faisal Shafait, and Thomas M. Breuel

A System for an Automatic Reading of Student Information Sheets ....................1265

Afef Kacem, Asma Saïdani, and Abdel Belaïd

Wall Patch-Based Segmentation in Architectural Floorplans ....................1270

Lluís-Pere de las Heras, Joan Mas, Gemma Sánchez, and Ernest Valveny

High Performance Layout Analysis of Arabic and Urdu Document Images ....................1275

Syed Saqib Bukhari, Faisal Shafait, and Thomas M. Breuel

Automatically Discriminating between Digital and Scanned Photographs ....................1280

Rafael Dueire Lins, Gabriel de França Pereira e Silva, and Steven J. Simske

A Discriminative Model for On-line Handwritten Japanese Text Retrieval ....................1285

Cheng Cheng, Bilan Zhu, and Masaki Nakagawa

A Circular Grid-Based Rotation Invariant Feature Extraction Approach for Off-line Signature Verification ....................1289

Marianela Parodi, Juan C. Gómez, and Abdel Belaïd

Fringe Map Based Text Line Segmentation of Printed Telugu Document Images ....................1294

Vijaya Kumar Koppula and Atul Negi

Combining of Off-line and On-line Feature Extraction Approaches for Writer Identification ....................1299

Aymen Chaabouni, Houcine Boubaker, Monji Kherallah, Adel M. Alimi, and Haikal El Abed

On-line Arabic Handwritten Personal Names Recognition System Basedon HMM ..................................................................................................................................................1304

Sherif Abdelazeem and Hesham M. Eraqi

Using Earth Mover's Distance in the Bag-of-Visual-Words Modelfor Mathematical Symbol Retrieval .........................................................................................................1309

Simone Marinai, Beatrice Miotti, and Giovanni Soda

Enhancing Handwritten Word Segmentation by Employing Local SpatialFeatures ..................................................................................................................................................1314

Fotini Simistira, Vassilis Papavassiliou, Themos Stafylakis, and Vassilis Katsouros

Symbol Recognition by Multiresolution Shape Context Matching ..........................................................1319Feng Su, Tong Lu, and Ruoyu Yang

On-line Arabic Handwriting Recognition System Based on HMM ..........................................................1324Hany Ahmed and Sherif Abdel Azeem

Recognizing Text Elements for SVG Comic Compression and Its NovelApplications ............................................................................................................................................1329

Chung-Yuan Su, Ray-I Chang, and Jen-Chang Liu

Facilitating Understanding of Large Document Collections ....................................................................1334Jae Hyeon Bae, Weijia Xu, and Maria Esteva

Translation-Inspired OCR .......................................................................................................................1339Dmitriy Genzel, Ashok C. Popat, Nemanja Spasojevic, Michael Jahr,Andrew Senior, Eugene Ie, and Frank Yung-Fong Tang

Towards Searchable Digital Urdu Libraries - A Word Spotting Based RetrievalApproach ................................................................................................................................................1344

Ali Abidi, Imran Siddiqi, and Khurram Khurshid

Word Warping for Offline Handwriting Recognition ................................................................................1349

Douglas J. Kennard, William A. Barrett, and Thomas W. Sederberg

Script-Free Text Line Segmentation Using Interline Space Model for Printed Document Images ...................................................................................................................................1354

Minwoo Kim and Il-Seok Oh

Text Localization in Web Images Using Probabilistic Candidate Selection Model ......................................................................................................................................................1359

Liangji Situ, Ruizhe Liu, and Chew Lim Tan

Functional-Based Table Category Identification in Digital Library ..........................................................1364

Seongchan Kim and Ying Liu

Chinese Chess Character Recognition with Radial Harmonic Fourier Moments ...................................1369

Wang Kejia, Zhang Honggang, Ping Ziliang, and Haiying

Non-rigid Registration and Restoration of Double-Sided Historical Manuscripts ...................................1374

Jie Wang and Chew Lim Tan

A Fast Appearance-Based Full-Text Search Method for Historical Newspaper Images ....................................................................................................................................................1379

Kengo Terasawa, Takahiro Shima, and Toshio Kawashima

Reliable Online Stroke Recovery from Offline Data with the Data-EmbeddingPen ..........................................................................................................................................................1384

Marcus Liwicki, Yoshida Akira, Seiichi Uchida, Masakazu Iwamura,Shinichiro Omachi, and Koichi Kise

Online Handwriting Recognition of Tamil Script Using Fractal Geometry ..............................................1389

Rituraj Kunwar and A. G. Ramakrishnan

Connected Component Level Discrimination of Handwritten and Machine-Printed Text Using Eigenfaces ..........................................................................................1394

Samuel J. Pinson and William A. Barrett

Recognition of Multi-oriented, Multi-sized, and Curved Text ..................................................................1399

Yao-Yi Chiang and Craig A. Knoblock

Scenario Driven In-depth Performance Evaluation of Document Layout Analysis Methods ....................................................................................................................................1404

C. Clausner, S. Pletschacher, and A. Antonacopoulos

Recognition of Multiple Characters in a Scene Image Using Arrangement of Local Features ....................................................................................................................................1409

Masakazu Iwamura, Takuya Kobayashi, and Koichi Kise

Quality Evaluation of Character Image Database and Its Application ....................................................1414

Hiroyuki Hase

Mathematical Formula Identification in PDF Documents ........................................................................1419

Xiaoyan Lin, Liangcai Gao, Zhi Tang, Xiaofan Lin, and Xuan Hu

Panel Discussion

Evaluation of Fonts for Digital Publishing and Display ...........................................................................1424

C. Y. Suen, N. Dumont, M. Dyson, Y.-C. Tai, and X. Lu

Competitions

International Conference on Document Analysis and Recognition (ICDAR 2011) - Competitions Overview ..............................................................................................................1437

Haikal El Abed, Liu Wenyin, and Volker Märgner

ICDAR 2011 - Arabic Handwriting Recognition Competition ..................................................................1444

Volker Märgner and Haikal El Abed

ICDAR 2011 - Arabic Recognition Competition: Multi-font Multi-size DigitallyRepresented Text ...................................................................................................................................1449

Fouad Slimane, Slim Kanoun, Haikal El Abed, Adel M. Alimi, Rolf Ingold,and Jean Hennebert

Online Arabic Handwriting Recognition Competition ..............................................................................1454

Monji Kherallah, Najiba Tagougui, Adel M. Alimi, Haikal El Abed, and Volker Märgner

ICDAR 2011 - French Handwriting Recognition Competition .................................................................1459

Emmanuèle Grosicki and Haikal El-Abed

ICDAR 2011 Chinese Handwriting Recognition Competition .................................................................1464

Cheng-Lin Liu, Fei Yin, Qiu-Feng Wang, and Da-Han Wang

The ICDAR2011 Arabic Writer Identification Contest .............................................................................1470

Abdelâali Hassaïne, Somaya Al-Maadeed, Jihad Mohamad Alja'am, Ali Jaoua, and Ahmed Bouridane

ICDAR 2011 Writer Identification Contest ..............................................................................................1475

G. Louloudis, N. Stamatopoulos, and B. Gatos

Signature Verification Competition for Online and Offline Skilled Forgeries (SigComp2011) .......................................................................................................................................1480

Marcus Liwicki, Muhammad Imran Malik, C. Elisa van den Heuvel, Xiaohong Chen, Charles Berger, Reinoud Stoel, Michael Blumenstein, and Bryan Found

ICDAR 2011 Robust Reading Competition - Challenge 1: Reading Text in Born-Digital Images (Web and Email) .................................................................................................1485

D. Karatzas, S. Robles Mestre, J. Mas, F. Nourbakhsh, and P. Pratim Roy

ICDAR 2011 Robust Reading Competition Challenge 2: Reading Text in Scene Images .....................................................................................................................................1491

Asif Shahab, Faisal Shafait, and Andreas Dengel

CROHME2011: Competition on Recognition of Online Handwritten Mathematical Expressions ......................................................................................................................1497

Harold Mouchère, Christian Viard-Gaudin, Dae Hwan Kim, Jin Hyung Kim, and Utpal Garain

ICDAR 2011 Book Structure Extraction Competition ..............................................................................1501

Antoine Doucet, Gabriella Kazai, and Jean-Luc Meunier

ICDAR 2011 Document Image Binarization Contest (DIBCO 2011) ......................................................1506

Ioannis Pratikakis, Basilis Gatos, and Konstantinos Ntirogiannis

The ICDAR 2011 Music Scores Competition: Staff Removal and Writer Identification ............................................................................................................................................1511

Alicia Fornés, Anjan Dutta, Albert Gordo, and Josep Lladós

Historical Document Layout Analysis Competition .................................................................................1516

A. Antonacopoulos, C. Clausner, C. Papadopoulos, and S. Pletschacher

Document Analysis Algorithm Contributions in End-to-End Applications: Report on the ICDAR 2011 Contest .......................................................................................................1521

Bart Lamiroy, Daniel Lopresti, and Tao Sun

Author Index