Mathematical Webs Accessibility CSUN 2014

Miquel Centelles, Mireia Ribera, Inmaculada Rodríguez Adaptabit Group – University of Barcelona CSUN Conference 2014

Upload: adaptabit

Post on 27-May-2015


Category:

Education



DESCRIPTION

This paper presents research on converting non-accessible web pages containing mathematical formulae into accessible versions through an OCR (Optical Character Recognition) tool. The objective of this research is twofold. First, to establish criteria for evaluating the potential accessibility of mathematical web sites, i.e. the feasibility of converting non-accessible (non-MathML) math sites into accessible (MathML) ones. Second, to propose a data model and a mechanism to publish evaluation results, making them available to the educational community, who may use them as a quality measure when selecting learning material. Results show that conversion using OCR tools is not viable for math web pages, mainly for two reasons: many of these pages are designed to be interactive, which makes a correct conversion difficult, if not almost impossible; and formulae (either images or text) have been written without taking into account the standards of mathematical writing, so OCR tools do not properly recognize math symbols and expressions. In spite of these results, we think the proposed methodology to create and publish evaluation reports may be useful in other accessibility assessment scenarios.

TRANSCRIPT

Page 1: Mathematical Webs Accessibility CSUN 2014

Miquel Centelles, Mireia Ribera, Inmaculada Rodríguez

Adaptabit Group – University of Barcelona

CSUN Conference 2014

Page 2: Mathematical Webs Accessibility CSUN 2014

▪ The vision
▪ The problem
▪ Setting the stage
▪ Our point of view
▪ Our solution: MathML, OCR (InftyReader) and linked data
▪ Results and future work

March 2014 CSUN 2014 - Potential accessibility of mathematical webs 2

Page 3: Mathematical Webs Accessibility CSUN 2014

Teaching methodologies have shifted from content-based to skills-based learning.

A key task for teachers is selecting web resources that serve as reference, extension and motivation for their students.

This selection is mainly based on content and source quality, but rarely considers accessibility criteria.

Accessibility criteria will be increasingly important.


Page 4: Mathematical Webs Accessibility CSUN 2014

Still lots of non-accessible math web sites.

Why?
▪ Interactivity.
▪ Formulae in graphics or videos.
▪ Authoring tools' automatic conversion to MathML
  ▪ Not fully reliable
  ▪ Often convert to images

Page 5: Mathematical Webs Accessibility CSUN 2014

Concerning accessibility in maths, several initiatives have been driven by publishers and libraries.


Page 6: Mathematical Webs Accessibility CSUN 2014

Concerning the semantic meaning of mathematical formulae, several initiatives link MathML with the semantic web:
▪ Christoph Lange: a proposal to describe generic mathematical formulae.
▪ OpenMath: a lightweight ontology to endorse the meaning of mathematical symbols.
▪ HELM: the pioneer in representing structures of mathematical knowledge in RDF.

Page 7: Mathematical Webs Accessibility CSUN 2014

▪ To assess the potential accessibility of non-accessible math web sites.
  ▪ Definition: potential accessibility = the feasibility of converting them into accessible webs.
▪ To publish assessment results in a semantically-empowered way.

Page 8: Mathematical Webs Accessibility CSUN 2014

How to accessibilize a Math web? Converting to MathML

▪ Re-writing formulae

<mrow>
  <mi>a</mi> <mo>&InvisibleTimes;</mo> <msup><mi>x</mi> <mn>2</mn></msup> <mo>+</mo>
  <mi>b</mi> <mo>&InvisibleTimes;</mo> <mi>x</mi> <mo>+</mo> <mi>c</mi>
</mrow>

▪ Describing formulae in alternative text

“a times x squared plus b times x plus c”

▪ Converting through OCR

<mrow>
  <mi>a</mi> <mo>&InvisibleTimes;</mo> <msup><mi>x</mi> <mn>2</mn></msup> <mo>+</mo>
  <mi>b</mi> <mo>&InvisibleTimes;</mo> <mi>x</mi> <mo>+</mo> <mi>c</mi>
</mrow>
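The relation between the MathML markup and the alternative-text option on this slide can be illustrated with a minimal Python sketch. It assumes a fully tokenized Presentation MathML fragment (`<mi>`, `<mn>`, `<mo>`, `<msup>`) and a deliberately tiny, hypothetical operator table (`SPOKEN_OPERATORS`); it is not how a screen reader or any particular tool actually speaks mathematics.

```python
import xml.etree.ElementTree as ET

# Presentation MathML for a*x^2 + b*x + c.
# "\u2062" is the INVISIBLE TIMES character (the &InvisibleTimes; entity).
MATHML = (
    "<mrow>"
    "<mi>a</mi><mo>\u2062</mo><msup><mi>x</mi><mn>2</mn></msup><mo>+</mo>"
    "<mi>b</mi><mo>\u2062</mo><mi>x</mi><mo>+</mo><mi>c</mi>"
    "</mrow>"
)

# Hypothetical operator-to-speech table, kept tiny on purpose.
SPOKEN_OPERATORS = {"\u2062": "times", "+": "plus", "-": "minus"}


def speak(node: ET.Element) -> str:
    """Return a naive spoken rendering of a small MathML subtree."""
    if node.tag in ("mi", "mn"):
        return (node.text or "").strip()
    if node.tag == "mo":
        symbol = node.text or ""
        return SPOKEN_OPERATORS.get(symbol, symbol)
    if node.tag == "msup":
        base, exponent = list(node)
        if exponent.tag == "mn" and (exponent.text or "").strip() == "2":
            return f"{speak(base)} squared"
        return f"{speak(base)} to the power {speak(exponent)}"
    # <mrow> and any other container: speak the children in document order.
    return " ".join(speak(child) for child in node)


alt_text = speak(ET.fromstring(MATHML))
print(alt_text)
```

Because the structure is explicit in the markup, the description can be derived mechanically; a formula published only as an image offers nothing comparable to walk over.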

Page 9: Mathematical Webs Accessibility CSUN 2014

OCR is the best option:
▪ Does not require expertise
▪ OK for low resources
▪ Can be done by the student

Page 10: Mathematical Webs Accessibility CSUN 2014

Page 11: Mathematical Webs Accessibility CSUN 2014

▪ First: conceived for Web pages.
▪ After: general format for exchanging mathematics.
▪ Now: provides accessible content for people with disabilities.

Page 12: Mathematical Webs Accessibility CSUN 2014

MathML support still incomplete but…


Page 13: Mathematical Webs Accessibility CSUN 2014

Pros
▪ Recognizes math symbols, converts them into
  ▪ LaTeX,
  ▪ XHTML and MathML.

Cons
▪ Strong requirements:
  ▪ Pure B&W
  ▪ High resolution (600 dpi)
  ▪ Standard ISO 80000-2:2009

Page 14: Mathematical Webs Accessibility CSUN 2014

The four principles of the linked data publishing model:
▪ Use URIs as names for things.
▪ Use HTTP URIs so that people can look up those names.
▪ When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
▪ Include links to other URIs, so that they can discover more things.
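As a rough illustration of the first, second and fourth principles, the following Python sketch checks a handful of triples. The `example.org` URIs, the sample triples and both helper functions are hypothetical; the third principle (serving useful data when a URI is looked up) depends on a live server and is not covered here.

```python
from urllib.parse import urlparse


def is_http_uri(name: str) -> bool:
    """True if the name is an HTTP(S) URI that a client could look up."""
    parts = urlparse(name)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def external_links(triples):
    """Objects that are HTTP URIs hosted somewhere other than their subject."""
    return [
        obj
        for subj, _pred, obj in triples
        if is_http_uri(obj) and urlparse(obj).netloc != urlparse(subj).netloc
    ]


# Hypothetical data: (subject, predicate, object) tuples.
REPORT = "http://example.org/report/1"
TRIPLES = [
    (REPORT, "http://www.w3.org/1999/02/22-rdf-syntax-ns#type",
     "http://www.w3.org/ns/earl#Assertion"),
    (REPORT, "http://www.w3.org/ns/earl#subject",
     "http://example.org/site/algebra"),
    (REPORT, "http://purl.org/dc/terms/title",
     "Evaluation of an algebra site"),
]

# Principles 1 and 2: things and relations are named with HTTP URIs.
names_ok = all(is_http_uri(s) and is_http_uri(p) for s, p, _o in TRIPLES)
# Principle 4: the data links out to URIs on other hosts.
links_out = external_links(TRIPLES)
print(names_ok, links_out)
```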


Page 15: Mathematical Webs Accessibility CSUN 2014

Opportunities for the data in accessibility reports:
▪ To customize answers to user queries.
▪ To generate data-enriched reports for managers, technicians (webmasters) and policy decision makers.
▪ To enrich search engine results with accessibility results used as website quality indicators.

Page 16: Mathematical Webs Accessibility CSUN 2014

TWO KEY DECISIONS

Reuse a formal vocabulary: EARL

Use an open-source CMS: Drupal.


Page 17: Mathematical Webs Accessibility CSUN 2014

EARL = Evaluation and Report Language

It is a simple vocabulary that describes test results, such as those generated by web accessibility evaluation tools.

It uses the RDF data model to define terms for expressing test results.

It is a W3C Working Draft.
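To make the shape of an EARL test result concrete, here is a small Python sketch that fills in a Turtle template. The class and property names (`earl:Assertion`, `earl:assertedBy`, `earl:subject`, `earl:test`, `earl:result`, `earl:TestResult`, `earl:outcome`) and the outcome values come from the EARL vocabulary; the function name and the `example.org` URIs are assumptions made for illustration and are not taken from the presentation.

```python
# Outcome values defined by the EARL vocabulary.
EARL_OUTCOMES = {"passed", "failed", "cantTell", "inapplicable", "untested"}

TEMPLATE = """@prefix earl: <http://www.w3.org/ns/earl#> .

[] a earl:Assertion ;
   earl:assertedBy <{assertor}> ;
   earl:subject <{subject}> ;
   earl:test <{test}> ;
   earl:result [ a earl:TestResult ; earl:outcome earl:{outcome} ] .
"""


def earl_assertion(assertor: str, subject: str, test: str, outcome: str) -> str:
    """Return one EARL assertion serialized as Turtle."""
    if outcome not in EARL_OUTCOMES:
        raise ValueError(f"unknown EARL outcome: {outcome!r}")
    return TEMPLATE.format(
        assertor=assertor, subject=subject, test=test, outcome=outcome
    )


# Hypothetical URIs for the evaluator, the evaluated site and the criterion.
turtle = earl_assertion(
    assertor="http://example.org/evaluator/1",
    subject="http://example.org/site/algebra",
    test="http://example.org/criteria/C2-INFTYREADER",
    outcome="failed",
)
print(turtle)
```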

Page 18: Mathematical Webs Accessibility CSUN 2014

Page 19: Mathematical Webs Accessibility CSUN 2014

A new controlled vocabulary for the Test Case class:
▪ Containing 10 suitable criteria based on the requirements of both the InftyReader OCR software and ISO 80000-2:2009

Examples:
▪ C2-INFTYREADER: Image resolution must be equal to or greater than 600 dpi.
▪ C3-ISO: An explicitly defined function not depending on the context is printed in Roman (upright) type, e.g. sin, exp, ln.
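A criterion such as C2-INFTYREADER lends itself to an automated check. The Python sketch below is a minimal illustration: the `FormulaImage` record and the `failed_criteria` function are hypothetical, the 600 dpi threshold comes from C2-INFTYREADER, and the black-and-white check uses a made-up identifier (`PURE-BW`) because the slides do not give that criterion's ID.

```python
from dataclasses import dataclass

MIN_DPI = 600  # threshold stated by criterion C2-INFTYREADER


@dataclass
class FormulaImage:
    """Hypothetical metadata record for one formula image."""
    dpi: int
    bit_depth: int  # 1 means pure black and white


def failed_criteria(image: FormulaImage) -> list:
    """Return the IDs of the image-level criteria this image does not meet."""
    failed = []
    if image.dpi < MIN_DPI:
        failed.append("C2-INFTYREADER")
    if image.bit_depth != 1:
        # Assumed ID: the slides list "Pure B&W" as a requirement without a code.
        failed.append("PURE-BW")
    return failed


web_image = FormulaImage(dpi=96, bit_depth=24)   # a typical screen-resolution image
scan_image = FormulaImage(dpi=600, bit_depth=1)  # meets both requirements
print(failed_criteria(web_image), failed_criteria(scan_image))
```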


Page 20: Mathematical Webs Accessibility CSUN 2014

A new controlled vocabulary for the Test Result class:
▪ Containing 5 categories, based on the percentage of formulae correctly converted into MathML.

Examples:
▪ Failed conversion [0%-20%]: This web has a very low potential accessibility.
▪ Successful conversion [80%-100%]: This web has the maximum potential accessibility.
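The mapping from a conversion percentage to a result category can be sketched in Python as follows. Only the first and last category names appear in the slides; the three intermediate labels are placeholders, and the 20-point bands with the lower bound inclusive (so exactly 80% counts as successful) are assumptions.

```python
# Only the first and last names come from the slides; the middle three are
# placeholders because the slides do not name them.
CATEGORIES = [
    "Failed conversion",       # assumed band: 0% up to (not including) 20%
    "Band 2 (name not given)",
    "Band 3 (name not given)",
    "Band 4 (name not given)",
    "Successful conversion",   # assumed band: 80% up to 100% inclusive
]


def conversion_rate(converted: int, total: int) -> float:
    """Percentage of formulae correctly converted into MathML."""
    if total <= 0 or not 0 <= converted <= total:
        raise ValueError("need 0 <= converted <= total and total > 0")
    return 100.0 * converted / total


def categorize(rate: float) -> str:
    """Map a percentage to one of the five result categories."""
    if not 0 <= rate <= 100:
        raise ValueError("rate must be between 0 and 100")
    index = min(int(rate // 20), len(CATEGORIES) - 1)
    return CATEGORIES[index]


low = categorize(conversion_rate(1, 10))   # 10% of formulae converted
high = categorize(conversion_rate(9, 10))  # 90% of formulae converted
print(low, "|", high)
```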

Page 21: Mathematical Webs Accessibility CSUN 2014

Page 22: Mathematical Webs Accessibility CSUN 2014

Drupal is the first CMS with Semantic Web services
▪ Usable by non-experts
▪ Drupal 7 publishes data in RDF format.

Data model in Drupal 7    Data model in RDF
Content type              RDF class
Field                     RDF property
Node                      RDF resource
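The correspondence between the two data models can be illustrated with a short Python sketch that turns a node-like dictionary into N-Triples. This is not Drupal's own RDF module: the content type name, field names, base URI and both mapping tables are hypothetical, and literal escaping is reduced to backslashes and double quotes.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mappings: content type -> RDF class, field -> RDF property.
TYPE_TO_CLASS = {"evaluation_report": "http://www.w3.org/ns/earl#Assertion"}
FIELD_TO_PROPERTY = {
    "title": "http://purl.org/dc/terms/title",
    "field_subject": "http://www.w3.org/ns/earl#subject",
}


def node_to_ntriples(node: dict, base: str = "http://example.org/node/") -> list:
    """Serialize one node as N-Triples lines (node -> RDF resource)."""
    resource = f"<{base}{node['nid']}>"
    lines = [f"{resource} <{RDF_TYPE}> <{TYPE_TO_CLASS[node['type']]}> ."]
    for field, value in node["fields"].items():
        prop = FIELD_TO_PROPERTY[field]
        if value.startswith(("http://", "https://")):
            obj = f"<{value}>"  # link to another resource
        else:
            # Simplified literal escaping: backslash and double quote only.
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            obj = f'"{escaped}"'
        lines.append(f"{resource} <{prop}> {obj} .")
    return lines


node = {
    "nid": 7,
    "type": "evaluation_report",
    "fields": {
        "title": "Evaluation of an algebra site",
        "field_subject": "http://example.org/site/algebra",
    },
}
triples = node_to_ntriples(node)
print("\n".join(triples))
```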

Page 23: Mathematical Webs Accessibility CSUN 2014

Page 24: Mathematical Webs Accessibility CSUN 2014

Page 25: Mathematical Webs Accessibility CSUN 2014

Page 26: Mathematical Webs Accessibility CSUN 2014

Page 27: Mathematical Webs Accessibility CSUN 2014

Page 28: Mathematical Webs Accessibility CSUN 2014

Page 29: Mathematical Webs Accessibility CSUN 2014

Page 30: Mathematical Webs Accessibility CSUN 2014

Page 31: Mathematical Webs Accessibility CSUN 2014

Most websites not accessible at all
▪ Interactive
▪ Non-interactive
  ▪ Formula images
    ▪ With very poor quality
    ▪ Without alternative text
  ▪ Formulae not following standards => OCR not working

Page 32: Mathematical Webs Accessibility CSUN 2014

Page 33: Mathematical Webs Accessibility CSUN 2014

Useful methodology?

Future work:
▪ Using math ontologies
  ▪ OpenMath
  ▪ OMDoc
▪ Adapt formulas to RDFa

Page 34: Mathematical Webs Accessibility CSUN 2014

Questions, suggestions, [email protected]

http://www.ub.edu/adaptabit/

[email protected]

Page 35: Mathematical Webs Accessibility CSUN 2014

Specification:
▪ Analysis of data sources.
▪ URI design.
▪ Publishing license.

Vocabularies and ontologies

Transform into RDF

Publication

Exploitation