
Page 1: TC10 and TC11 - UAB Barcelonadimos/downloads/DataAnalysis_v03.pdfSubmissions to TC10/11 events 115 161 454 91 162 422 131 158 429 138 192 401 135 409 131 125 403 80 118 277 65 117

TC10 and TC11: Overview, current status

Data compiled in Q4 2019

Page 2:

Membership

[Chart: membership over time, quarterly from 01/06/2011 to 01/09/2019; vertical axis 0 to 1800. Series: Membership TC11, Mailing list, Twitter Followers, Membership TC10. Data labels on the chart: 1537, 1544, 1039, 1039, 1057, 1057, 894, 1204, 1235, 1210, 116, 154, 170.]

Page 3:

Structure

Chair: Dimosthenis Karatzas

Vice chair: Gernot Fink

Dataset curator: Joseph Chazalon

Communications: Andreas Fischer

Education officer: Michael Blumenstein

Chair: Alicia Fornés

Vice chair: Jean-Christophe Burie

Dataset curator: Partha Pratim Roy

Communications: Christophe Rigaud

Education officer: K.C. Santosh

Webmaster: Muzzamil Luqman

Page 4:

DAR Events

Page 5:

Participants in TC10/11 events

Event       Participants
DAS2008     119
ICFHR2008   0
ICDAR2009   378
DAS2010     99
ICFHR2010   129
ICDAR2011   387
DAS2012     106
ICFHR2012   150
ICDAR2013   500
DAS2014     145
ICFHR2014   150
ICDAR2015   0
DAS2016     126
ICFHR2016   151
ICDAR2017   464
DAS2018     141
ICFHR2018   140
ICDAR2019   433
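The participant counts can be aggregated per DAS / ICFHR / ICDAR cycle to see the overall reach of the events. The following is a minimal sketch, assuming that the chart's data labels map to the events in the order they are listed (DAS2008 first, ICDAR2019 last) and that a label of 0 means no figure was reported. The names `PARTICIPANTS`, `per_cycle` and `even_year_total` are illustrative and do not come from the slides.

```python
# Participant counts read off the chart labels, in event order (assumption).
PARTICIPANTS = {
    "DAS2008": 119, "ICFHR2008": 0, "ICDAR2009": 378,
    "DAS2010": 99, "ICFHR2010": 129, "ICDAR2011": 387,
    "DAS2012": 106, "ICFHR2012": 150, "ICDAR2013": 500,
    "DAS2014": 145, "ICFHR2014": 150, "ICDAR2015": 0,
    "DAS2016": 126, "ICFHR2016": 151, "ICDAR2017": 464,
    "DAS2018": 141, "ICFHR2018": 140, "ICDAR2019": 433,
}


def per_cycle(counts):
    """Sum participants over each cycle of three consecutive events."""
    events = list(counts.items())
    totals = {}
    for i in range(0, len(events), 3):
        cycle = events[i:i + 3]
        # Build a label such as "2018-19" from the first and last event names.
        label = f"{cycle[0][0][-4:]}-{cycle[-1][0][-2:]}"
        totals[label] = sum(n for _, n in cycle)
    return totals


def even_year_total(counts, year):
    """Combined DAS + ICFHR attendance for one even year."""
    return counts[f"DAS{year}"] + counts[f"ICFHR{year}"]


if __name__ == "__main__":
    for label, total in per_cycle(PARTICIPANTS).items():
        print(label, total)
    print("DAS + ICFHR 2018:", even_year_total(PARTICIPANTS, 2018))
```

Under these assumptions, DAS and ICFHR together drew 281 participants in 2018, in line with the "about 300" quoted in Kialo Discussion #1.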

Page 6:

Participants Flow

Page 7:

Submissions to TC10/11 events

Event       Submitted  Accepted  Oral
DAS2008     115        80        24
ICFHR2008   161        118       39
ICDAR2009   454        277       87
DAS2010     91         65        28
ICFHR2010   162        117       37
ICDAR2011   422        278       88
DAS2012     131        91        36
ICFHR2012   158        132       48
ICDAR2013   429        265       81
DAS2014     138        73        27
ICFHR2014   192        127       40
ICDAR2015   401        234       90
DAS2016     162        78        32
ICFHR2016   135        99        33
ICDAR2017   409        212       52
DAS2018     131        77        32
ICFHR2018   125        97        32
ICDAR2019   403        228       52

Page 8:

Acceptance Rates of TC10/11 events

[Chart: acceptance rates per two-year cycle (2008-09, 2010-11, 2012-13, 2014-15, 2016-17, 2018-19); vertical axis 0% to 90%. Series: ICDAR Acceptance Rate, ICDAR Oral Acceptance Rate, DAS Acceptance Rate, DAS Oral Acceptance Rate, ICFHR Acceptance Rate, ICFHR Oral Acceptance Rate.]
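The rates plotted on this slide follow from the submission counts: the acceptance rate is accepted / submitted, and the oral acceptance rate is oral / submitted. The following is a minimal sketch for the 2018-19 cycle only. It assumes that the labels on the submissions slide read, in event order, as (submitted, accepted, oral) triples, and that the oral rate is taken relative to submissions rather than to accepted papers. The names `SUBMISSIONS` and `acceptance_rates` are illustrative.

```python
# (submitted, accepted, oral) per event, as read off the submissions chart
# labels in event order (assumption; fused labels were split by position).
SUBMISSIONS = {
    "DAS2018": (131, 77, 32),
    "ICFHR2018": (125, 97, 32),
    "ICDAR2019": (403, 228, 52),
}


def acceptance_rates(submitted, accepted, oral):
    """Return (overall, oral) acceptance rates as fractions of submissions."""
    if submitted <= 0:
        raise ValueError("submitted must be positive")
    return accepted / submitted, oral / submitted


if __name__ == "__main__":
    for event, counts in SUBMISSIONS.items():
        overall, oral = acceptance_rates(*counts)
        print(f"{event}: accepted {overall:.1%}, oral {oral:.1%}")
```

Events with a submission count of 0 are rejected explicitly, so a missing figure cannot silently turn into a division error or a misleading rate.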

Page 9:

TC10/11 Conference Quality

ICDAR (biennial): CORE A

ICFHR, DAS (biennial): CORE B

GREC (biennial): not listed

cf.

ICPR (biennial): CORE B

CVPR (annual): CORE A*

ICCV (biennial): CORE A*

Page 10:

Program Chairs (2008 – 19)

Page 11:

Participation in top roles (organisers / program Chairs)

Page 12:

Participation in Organising Committees

Page 13:

# DAR Papers in non-IAPR Forums

[Chart: number of publications per year, 2010 to 2019; vertical axis 0 to 35. Series: ACMM, ACCV, ECCV, ICCV, CVPR.]

Page 14:

DAR Topics in non-IAPR Forums

[Chart: DAR papers by topic. Topics: Scene Text, Document restoration, Document understanding, Fine-grained classification, Forensics, Human-Document Interaction, OCR, Graphical documents, Font recognition, Handwriting recognition, Text style. Data labels in extraction order: 75, 17, 2, 3, 3, 4, 4, 3, 1, 4, 3.]

Page 15:

Other offerings of DAR events

• Organised by TC10/11 members

  • CVPR2020 Workshop on Text and Documents in the Deep Learning Era
    • Satellite to CVPR
    • https://cvpr2020text.wordpress.com/

  • IWRR – Int. W. on Robust Reading (2014, 2016, 2018)
    • Satellite to ACCV and ECCV
    • http://www.cvc.uab.es/iwrr2018/

  • ASAR – Int. W. on Arabic and derived Script Analysis and Recognition (2017, 2018, 2019)
    • Co-sponsored by IEEE
    • https://asar.ieee.tn/

  • DataDoc – Summer School on data science for document analysis and understanding (July 1-26, 2019)
    • https://datadoc.univ-lr.fr/

• Other events of interest

  • Document Recognition and Retrieval (DRR)
    • http://drr2016.loria.fr/

  • ACM DocEng
    • https://doceng.org/doceng2019/

  • Document Intelligence 2019
    • Satellite to NeurIPS 2019
    • https://www.aclweb.org/portal/content/document-intelligence-2019-workshop-neurips

Page 16:

Calendar of Events

2017
• ICDAR 2017: Nov. 13-15, Kyoto, Japan

2018
• DAS 2018: 24-27 April, Vienna, Austria
• ICFHR 2018: August 5-8, Niagara Falls, USA
• ICPR 2018 (Beijing, China)
• CVPR 2018 (Salt Lake City, USA)

2019
• ICDAR 2019 / GREC: September 22-25, Sydney, Australia
• ICCV 2019 (Seoul, Korea)
• CVPR 2019 (Long Beach, USA)

2020
• DAS 2020: May 17-20, Wuhan, China
• ICFHR 2020: September 8-10, Dortmund, Germany
• ICPR 2020 (Milan, Italy)

2021
• ICDAR/GREC 2021: September 5-10, Lausanne, Switzerland

2022
• DAS 2022: ???, ???
• ICFHR 2022: September 8-10, Hyderabad, India
• ICPR 2022 (Montreal, Canada)

Summer School: ???, ???

Page 17:

Forthcoming bids

Summer School 2021 : Q4 2019 (published)

ICDAR 2025 : during 2020

ICFHR 2024 : Q1 2020

DAS 2022 : Q3 2020

Page 18:

ICDAR 2019 survey

Page 19:

Main Conference

Page 20:

Main Conference

• Posters too close

• Orals too long (spotlights better)

• Quality of some posters better than some orals

• Rather expensive (registration, hotels, extra tickets)

• Info (program, rooms, social events…) too late (some already had the flights)

• Proceedings available (online) before the event starts

• Scientific quality sometimes low (old topics such as binarization, or simply engineering to reach 1% better than SoA)

• Papers describing datasets should not receive best paper awards

Page 21:

Program

• Posters too close

• Some posters without presenter

• Journal sessions NOT overlapped!

• Low quality of some papers

• Track for VIDEOs

• Posters in the afternoon → low attendance

Page 22:

Review Process

Page 23:

Review Process

Page 24:

Review Process

• Reviews low quality, based on opinions, not facts, reading too fast

• Reviews of workshops of low quality

• Area chairs should take care of poor reviews

• Rebuttals are ignored

• Open review

Page 25:

Journal Track

• Authors who submitted:

  • Good initiative

  • Some journal papers of lower quality than some orals

  • Reviewers not aligned with the CFP of this journal track

  • Review process long

  • Instead of IJDAR → PR is more attractive

• Other comments:

  • Start earlier (CFP and submission)

  • No coherence in the sessions for IJDAR (no common topic)

Page 26:

Workshops

GREC 2019: The 13th IAPR International Workshop on Graphics Recognition

HIP: The 5th International Workshop on Historical Document Imaging

ICDAR-OST: The 2nd International Workshop on Open Services and Tools for Document Analysis

FDAR: 2nd Future of Document Analysis and Recognition Workshop

HDI 2019: The 2nd International Workshop on Human-Document Interaction

CBDAR: The 8th International Workshop on Camera-Based Document Analysis and Recognition

ICDAR-WML: The 2nd International Workshop on Machine Learning

ASAR 2019: The 3rd International Workshop on Arabic and derived Script Analysis and Recognition

WIADAR: Workshop on Industrial Applications of Document Analysis and Recognition

IWCDF 2019: Second International Workshop on Computational Document Forensics

Page 27:

Workshops

• Too many interesting workshops in parallel

• Some rooms too small

• Workshop schedules before the travel arrangements

• Workshops should be merged in main tracks

• ICDAR-OST needs more time

• GREC again more fun and informative than expected (the best part of ICDAR)

• More time for discussions (like in forensics)

• HIP: the quiz very good

• WIADAR (Industrial apps) needs more promotion

• Most workshops with low quality papers

• Workshops should be a place to discuss, not to present rejected papers

• Some high quality papers could go in main ICDAR tracks (HIP, CBDAR, WML)

Page 28:

Tutorials

A. GMPRDIA: Graph-based Methods in Pattern Recognition and Document Image Analysis Recognition

B. Vision and Language: the Text Modality in Computer Vision

C. Deep Learning for Document Analysis, Text Recognition, and Language Modeling

D. Creating a Reproducible Research Environment

E. The NLP Canvas and AQuA System for Documents

• Tutorial schedules before the travel arrangements

• The Deep Learning tutorial (C) was too basic

Page 29:

Competitions

• Good initiative

• Too many competitions

• Too many specialized topics (low impact)

• Deadlines too close to the conference

• Longer time periods (timeline)

• More visibility

• More continuity

• Competition session longer. Winners more time to explain the method, participants have time to talk to each other

• Organizer: competition chairs slow responding, website not properly updated

Page 30:

Doctoral Consortium

• Good initiative, very useful for students

• Attendance very low (only some mentors)

• Reduce overlapping with other activities to increase participation

• Mentors not giving feedback to the student

• Mentoring period too short (and half time lost because of holidays)

• Instead of explaining the poster, do exercise/game

• Update the presentation made to students

• Info on contributions, writing, career preparation…

• Special track for phd supervisors ;-)

Page 31:

TC10/11 Education / Training

Page 32:

Doctoral Consortium

[Chart: number of students and mentors at the Doctoral Consortium in 2011, 2013, 2015, 2017 and 2019; vertical axis 0 to 30. Series: Students, Mentors.]

Page 33:

1st IAPR TC10 / TC11 Summer School on Document Analysis: Document Informatics

• Organizers: Santanu Chaudhury (India), Venu Govindaraju (USA), C. V. Jawahar (India)

• Participating students: 80 (69 Indian, 11 international)

• Program overview:

• Lectures of invited speakers (4 Indian, 4 international) from academia and industry on

• indexing of large document collections,

• content representation and manipulation,

• information and document retrieval,

• machine learning and analytics for large document repositories

• Hands-on practical sessions accompanying lectures

• Poster presentations by participating students

• Social events (sightseeing tours, banquet)

Birla Institute, Jaipur, India, January 23-28, 2017

http://cvit.iiit.ac.in/SSDA/

Page 34:

2nd IAPR TC10 / TC11 Summer School on Document Analysis

• Organizers: Jean-Marc Ogier, Jean-Christophe Burie

• Participating students: 25 (France, Austria, Tunisia, Indonesia, Vietnam, Pakistan, Finland, Sweden, Germany, and Italy)

• Program overview:

• Lectures of invited speakers (4 French, 5 international) from academia and industry on

• Content representation and manipulation

• Document indexing/retrieval in large corpus of documents

• Machine learning for document analysis and understanding

• Review of OCR methods and handwriting recognition techniques

• Historical documents and new challenges

• Human document interactions

• Text and graphics recognition in a complex environment

La Rochelle, France, July 2-6, 2018

http://cvit.iiit.ac.in/SSDA/

Page 35:

3rd IAPR TC10 / TC11 Summer School on Document Analysis: Document Informatics

• Theme: Deep Learning Applications for Document Analysis

• Organizers: Pakistan Pattern Recognition Society (PPRS) – Faisal Shafait

• Participating students: 129 (30 Industry, 6 international students, 93 local students)

• Programming competition: Sketch recognition system, over the 4 days of the school

• Poster Competition

• Scholarships: 13 grants (disadvantaged backgrounds, remote areas, academic excellence, international)

National University of Sciences and Technology (NUST), Islamabad, Pakistan, August 19-23, 2019

www.pprs.org.pk/events/ssda2019.html

Page 36:

Tutorials

[Chart: number of tutorials per event; vertical axis 0 to 6. Events: DAS2014, ICFHR2014, ICDAR2015, DAS2016, ICFHR2016, ICDAR2017, DAS2018, ICFHR2018, ICDAR2019. Data labels in extraction order: 2, 3, 2, 2, 3, 4, 3, 5.]

Page 37:

TC10/11 Resources

Page 38:

Datasets (TC11)

Page 39:

Competitions

Event        # Competitions
ICDAR 2011   22
ICDAR 2013   19
ICDAR 2015   11
ICDAR 2017   25
ICDAR 2019   27

Page 40:

Summary of the Discussions

2nd Workshop on the Future of Document Analysis and Recognition

Dimosthenis Karatzas (Computer Vision Centre, Spain)

C.V. Jawahar (IIIT, India)

Koichi Kise (Osaka Prefecture Univ., Japan)

Page 41:

Towards a Vision for the Future

• Joint Planning

• Improve the Decision Making Process

• Broaden the Research Themes

• Strengthen the Community and Improve the Diversity

• Improve the Impact of Research / Publications

• Identify and Solve Grand Challenges

Get the white paper here: http://www.cvc.uab.es/fi2019/files/Future_of_DAR_v01.pdf

Page 42:

Discussions on Kialo

1. ICDAR / ICFHR / DAS / GREC into a single, annual conference

2. Decision making process like the model of TC PAMI’s CVPR / ICCV.

3. ICDAR / IJDAR journal track

4. ICDAR is still a good name

5. ICDAR should not accept non-image analysis papers

Participate: http://www.cvc.uab.es/fi2019/?c=technical

Deadline: October 31st

Page 43:

Kialo Discussion #1

We should fuse ICDAR / ICFHR / DAS / GREC into a single, annual conference.

In even years, our community holds three different events: ICFHR, DAS and GREC, while in parallel the key conference of IAPR is organised. GREC is already co-located with ICDAR. ICFHR and DAS are quickly growing into fully fledged conferences. ICFHR started as a workshop series and was formally renamed a conference in 2008, and it now organises a full range of satellite activities. DAS maintains its workshop character mainly through the working group discussions. Between them, the two events gather about 300 participants (there is some overlap, although many people have to choose one or two of the many events organised in even years). Do you think that organising a single, big conference every year, perhaps co-locating these events to start with, would serve the community better?

Page 44:

Kialo Discussion #2

We should adopt a decision making process in the community involving increased direct participation of a general assembly.

The IAPR DAR community comprises the members of TC10 and TC11. The TC10 and TC11 chairs can decide on the structure of their leadership teams. ICDAR has an advisory board, which provides feedback to TC10 and TC11 chairs. The TC10/11 meeting that takes place during ICDAR tends to be informative, and involves the community (all registered participants) only in voting on the next venue for ICDAR.

Other communities, like the TC PAMI, make use of a general assembly with wider decision powers. Any member of the community can bring a motion for vote to the general assembly. The vote result is binding.

Both models have advantages and disadvantages. Do you consider that we should revise the decision making process of our community, to enable more direct participation for a general assembly?

Page 45:

Kialo Discussion #3

A journal publication should be linked with the ICDAR conference. The ICDAR / IJDAR track introduced in 2019 is a good idea.

In some countries, it is more important to have journal publications than conference ones. Thus journal publications are combined with the conferences (instead of “proceedings”, special issues are produced, for example). At ICDAR 2019 we started a “journal track”, which provides space in the conference to present a paper previously accepted at IJDAR.

Page 46:

Kialo Discussion #4

ICDAR is still a good name for the conference. It projects our activity well to the rest of the research community.

Is “document analysis” a topic that resonates well? In other circles, like machine learning or computer vision conferences, it seems to be growing “old”, being associated with half-century-old OCR research.

The term “document” is understood by our community as a non-restrictive term that refers to any type of written communication (e.g. we all accept reading text in scene images as a valid topic for ICDAR), but maybe the term is not perceived in the same way in other communities, making us the “paper OCR people”.

At the same time, there seems to be renewed interest in “document analysis” elsewhere: the topic resurged as a subject area in CVPR 2019 for example, while a workshop on “Document Intelligence” is organised in the context of NeurIPS 2019.

Page 47:

Kialo Discussion #5

Papers that do not explicitly involve image analysis should NOT be accepted in ICDAR.

Our community has historically tackled a core computer vision problem: interpreting written communication in images. But there is a lot in Document Analysis that does not imply Document Image Analysis. Should we be more open, and allow works that apply machine learning and pattern recognition to document understanding at large? Would analysing all tweets or Facebook posts out there be “document analysis”? Are analysing the behaviour of readers, or representing knowledge extracted from a collection of (digitised or physical) documents, topics that should be accepted in ICDAR? Or are these tackled in other forums, so that it is better to restrict ourselves to the more comfortable, narrow definition requiring “image analysis”?

Page 48:

During the Workshop

Phase 0: Ice Breaking activities

Page 49:

Ice breaking activities

• Multiple choice game (Kahoot)

• Ranking questions

• I go to both DAS and ICFHR most of the time.

• ICDAR is a well-known conference in the wider computer vision and pattern recognition research community.

• I publish my best quality research in ICDAR.

• I like the way strategic decisions are being made in the community.

Page 50:

During the Workshop

Phase 1: Creating a common vision

Page 51:

Proposed Challenge

How can we realize this?

2000 participants in ICDAR 2025

Page 52:

Creating a common vision

1 hour during which we worked in the “world full of opportunity” before coming back to reality. We have no limits in what we can do (money is not a limit, reaching out to 5 million researchers is not a problem). Challenges and barriers will be discussed in the second half, at this point we want to capture all possible ideas. We should avoid “idea stoppers”.

We asked participants to write statements on post-its. These could be in the form “We must … so that …” or “Wouldn’t it be nice if ….?”.

We read out aloud post-its offered to us, and grouped them together. Broader topics arise.

The resulting groups provide us the themes that inform our common vision statement.

Before the break, we asked people to put stickers on the resulting groups to indicate what themes they considered more important (a max of 2 or 3 stickers per person)

Synthesizing the themes into a single statement gave us the common vision (see next slide)

Page 53:
Page 54:

Themes and Common Vision

THEMES

• Diversity
• Free Access to Publications
• Interesting Problems
• Young Researchers
• Better Quality
• Better Participation
• Industry
• Visibility

VOTES: 52642275

COMMON VISION

We envisage a DIVERSE community, with good links with the INDUSTRY, attracting YOUNG researchers

Our research is driven by INTERESTING PROBLEMS, and leads to HIGH QUALITY PUBLICATIONS.

Our events are WELL KNOWN outside the document analysis community, well PARTICIPATED, by academic researchers and industry alike, and welcome researchers from DIFFERENT DISCIPLINES.

Page 55:

During the Workshop

Phase 2: Discussion in groups

Page 56:

5 Bold Steps

All tables are given the same task, which is to come up with 5 bold steps for reaching the common vision.

This way the activity is scalable, and we can have a comparison / contrasting of views between the different tables at the end. All tables are given a specific template to work on.

Moderators were selected.

This phase is the time to get realistic and identify the Challenges (negative points) and Supports (positive points), for each of the themes that informed the common vision.

We expected from each group 5 bold steps, that we should take to reach the common goal.

Vision, Mission and the 5 bold steps

[Template board: "Our Common Vision" at the centre, surrounded by six boxes labelled VISION THEME #1 to #6, with five numbered slots (1 to 5) for the bold steps.]

1 Vision themes and vision statement

Copy over to your board the themes created in the 1st part of the workshop. Enter these themes into the vision boxes around the vision statement. Copy over the vision statement at the centre of the board.

2 Supports and challenges

Looking at the vision themes, identify the supports and challenges: what will help you, or slow you down, in reaching your vision? Write these around the vision themes. You can use green post-its for supports and red post-its for challenges, for example.

3 Five bold steps

Based on all the discussions you have had and the supports and challenges you have identified, create an action plan. What are the next 5 steps to take in order to reach your aims? Propose bold but realistic steps and filter them down to 5.

Page 57:
Page 58:

Team 1

Moderator: Andreas Dengel

• ICDAR web site as a portal

• Collect large dataset by asking industry

• Define new research subjects by collecting datasets

• One day event to invite industry

• Define grand challenges

Page 59: TC10 and TC11 - UAB Barcelonadimos/downloads/DataAnalysis_v03.pdfSubmissions to TC10/11 events 115 161 454 91 162 422 131 158 429 138 192 401 135 409 131 125 403 80 118 277 65 117

Team 2Moderator: Wataru Ohyama

• Having a joint TC11 / ACM DocEng committee

• Encourage young researchers

• Improve the review process
  • define a review guideline

• Large datasets with industry

• Improve visibility
  • stronger press connection, etc.

Team 3

Moderator: Bertrand Couasnon

• Diversity improvement
  • organizing workshops in different conferences

• Making an annual big event OR co-locating smaller conferences and workshops

• Industry connection

Team 4

Moderator: Rajiv Jain

• What is the definition of documents?
  • having a workshop together?

• Real world competition datasets, working with industry
  • hackathon

• ICDAR competition platform
  • like Kaggle
  • sharing the platform
  • single point of access

• ICDAR related workshops with other communities (Vision / HCI / NLP / ML)

During the Workshop

Phase 3: Final Presentations

Final Presentations

Major areas where action was demanded

1. Better coordination between TC10/11 events
  • ICDAR, ICFHR, DAS, GREC
  • Mid-long term strategy
  • Clear decision making processes

2. Improved diversity and participation
  • Connection with industry
  • Young researchers
  • Expanding the scope of DAR

3. Grand Challenges
  • Definition
  • Resources
  • Promotion

Task Force

Strategic plan 2025

White paper v2.0

Online Questionnaire

Google form sent to approx. 60 people

Q1. Who are we? How would you define the DAR community? What do we do / what do we research on?

• Document IMAGE Analysis (computer vision)

• Definition of Document? Now mainly IMAGE

• Document Engineering, but we move too slowly to higher levels of DAR (processing, understanding info)

• Our community should be close to NLP and multimedia docs (video, camera-based), scene text, digital born (pdf, html)

Present State

Please rate from 1 to 5 these topics as relevant or not relevant to publish in ICDAR. If you reviewed a (good) paper on these topics, would you accept it for publication in ICDAR, or reject it as out of topic?

1. Reading behaviour analysis (e.g. analysis of gaze sequences during reading)

2. Analysis of social media (text and image) postings

3. Electronic document analysis (Web pages, EDI, PDF, ...)

4. Augmented documents and human-document interaction (e.g. dynamically projecting information on a document)

5. Recognising sign language (gestures in video)

6. Document summarisation

7. Extracting and linking information from document collections or between documents and external big data

8. Ontologies for modelling document content

9. Crowdsourcing (e.g. user interfaces for annotation, gamification, serious games)

10. Legibility and layout quality analysis (e.g. presentation modes, influences of typography)

AMBITION. What should be the Grand Challenges driving the activities of our community?

• Systems that understand contents → components in intelligent agents

• End-to-end pipeline for understanding and interacting with document collections

• Info extraction, question-answering, summarization

• Multimodal retrieval

• Optimal Human-in-the-loop systems

• Fake news

• Ground-truth free OCR

• Layout analysis, table recognition

• Public BIG datasets

• Connecting our research with communities in NLP, DH, brain studies…

• Address some of the new topics (broader scope)

• Attract industry, visibility

Future

What are your thoughts about the FUTURE of our community? Where would you like to see our community in 5 years’ time?

• Move to new challenges (some problems are solved)

• Integrate research from NLP

• Encompass more people from digital libraries, semantic doc processing

• Being an AA conference

• Be active in machine learning community

• Lack of CORE research (we are testers of CVPR novelties).

• Interaction with other communities (NLP, HCI, …)

• More topics (e.g. way of writing and Parkinson's)

Are there any concrete actions that you think TC10 and TC11 should take in the mid/long term future? Think in terms of organisation of our community (TC10/TC11), our events (ICDAR, DAS, ICFHR, GREC), academic and industry participation, young researchers in our field, etc.

• TCs should analyze data on a regular basis (no static snapshots)

• Joint commission with NLP (ACL), DH, Library science

• Extend topics for ICDAR

• Annual ICDAR, merge DAS and ICFHR

• Merge TC10 and TC11

• Competitions on more challenging tasks

• Too many events, workshops integrated into ICDAR

• Committees with senior and young people for fostering changes

• More target-oriented marketing

• More participation from industry
  • short stays of PhDs in companies, industrial challenges and datasets (privacy)

• Fewer but longer-running competition cycles

• Organize workshops in related conferences (CVPR, KDD, NAACL)

What are the key strengths (differentiating added values) of our communities?

• International

• Interdisciplinary
  • We also have experience in computer vision, NLP, knowledge management…

• Expertise in vision and document analysis

• Technically interesting work in a friendly community (welcoming to newbies)

• Real applications (real challenges in industry)

SWOT

What are the key weaknesses of our community?

• Focussing too much on gaining 0.2% in accuracy

• Too isolated compared to NLP

• Lack of diversity. Limited application if only document IMAGES

• Low impact of ICDAR, IJDAR

• We need to be more open to industrial needs

• Self-centered community (No renewal in committee and people)

• Lacking front-end research (we apply techniques from other computer vision fields)

• Low ability to attract new talent. Low visibility.

• Lack of clarity (what is a document, what is the task)

What are important opportunities we should be taking advantage of?

• Interactions with other communities: NLP, multimedia

• Hot topics (cultural heritage preservation) involve DAR

• Industry and society are interested in our topics

• Open challenges: Semantic Analysis. We should address this before other communities do it

• Make the document strong again (many aspects are important)

• Smaller conferences (ICDAR compared to CVPR) can foster interaction

• Increase visibility to attract outsiders and foster citation of our work

• AI is everywhere

What important threats do you see for our community?

• Ongoing funding for the different lines of research

• No further progress without NLP, DH…

• Others reinventing our wheels, being embedded in other communities

• NLP is extending to process document IMAGES

• DAR researchers are moving to Computer Vision (no more Docs)

• DAR seen as secondary for CV (Good papers to CVPR, ICDAR is second league)

• Becoming isolated researchers not addressing real world problems

• Big old guys control the community

• A fresh vision for the future

• Stay away from AI rise

• Lack of participation from new groups/researchers