Virtual Biodiversity ViBRANT

Document Mark-Up for Different Users and Purposes

David King, The Open University, UK, David.King@open.ac.uk

David Morse, The Open University, UK, [email protected]

MTSR2013, Thursday 21 November 2013

The Biologia Centrali-Americana

Why is it useful?

Baseline of species in Central America

• climate change

• invasive species

Original manuscripts, and in so many forms

• images

• re-keyed

• OCR

Taxonomists

Want to understand species

• Name

• Concept

• Context

• Habitat

• Location

• Relationships

• Traits

Computer Scientists

Want to understand data

• Content

• Disambiguation

• Structure

• Cues

• Variations

• Intended – Stylistic

• Unintended – Processing Errors

Differences in Practice

The taxonomist wants this:

<div type="taxon synonymy">
  <p elementid="BCA-aves-v3p1-2240">
    <hi rend="genus"> <hi rend="italic">Vireolanius</hi> </hi>
    <hi rend="species"> <hi rend="italic">melitophrys</hi> </hi>,
    <bibl rend="primary"> <author>Du Bus</author>, <title>Esq. Orn.</title>

But the computer scientist is happy with this:

T25 genus 1647 1658 Vireolanius
T26 specificepithet 1659 1670 melitophrys
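The two representations on this slide carry overlapping information: the inline markup wraps each name in nested elements, while the standoff lines record a label plus character offsets into the plain text. A minimal Python sketch of going from one to the other is given below. It is not the ViBRANT tooling. The closed-up `SNIPPET`, the `LABELS` mapping and both function names are assumptions made for illustration, and the offsets it produces are relative to the snippet rather than to the full volume.

```python
import xml.etree.ElementTree as ET

# Closed-up version of the slide's truncated fragment. The closing tags are
# added here only so that the example parses; they are an assumption.
SNIPPET = (
    '<p elementid="BCA-aves-v3p1-2240">'
    '<hi rend="genus"><hi rend="italic">Vireolanius</hi></hi> '
    '<hi rend="species"><hi rend="italic">melitophrys</hi></hi>, '
    '<bibl rend="primary"><author>Du Bus</author>, '
    '<title>Esq. Orn.</title></bibl></p>'
)

# Hypothetical mapping from inline rend values to standoff labels.
LABELS = {"genus": "genus", "species": "specificepithet"}


def to_standoff(xml_text):
    """Return (plain_text, annotations) for an inline-marked-up fragment.

    Each annotation is (label, start, end, covered_text), with offsets
    counted in characters from the start of the extracted plain text.
    """
    root = ET.fromstring(xml_text)
    chunks = []   # pieces of plain text, in document order
    spans = []    # (label, start, end)
    pos = 0

    def walk(elem):
        nonlocal pos
        start = pos
        if elem.text:
            chunks.append(elem.text)
            pos += len(elem.text)
        for child in elem:
            walk(child)
            if child.tail:
                chunks.append(child.tail)
                pos += len(child.tail)
        label = LABELS.get(elem.get("rend"))
        if label:
            spans.append((label, start, pos))

    walk(root)
    text = "".join(chunks)
    return text, [(lab, s, e, text[s:e]) for lab, s, e in spans]


def format_standoff(annotations, first_id=1):
    """Render annotations as tab-separated 'T<n>' lines."""
    return [
        f"T{first_id + i}\t{lab} {s} {e}\t{covered}"
        for i, (lab, s, e, covered) in enumerate(annotations)
    ]


if __name__ == "__main__":
    text, annotations = to_standoff(SNIPPET)
    print(text)
    print("\n".join(format_standoff(annotations, first_id=25)))
```

The sketch keeps only the labels listed in `LABELS`, so the bibliographic structure (`bibl`, `author`, `title`) is flattened into plain text. That loss is one concrete form of the trade-off the slide points at.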

More Differences in Practice

Dr. Merrill also found a nest⁵; but this was in a bunch of Spanish moss (Tillandsia) about eight feet from the ground.

The nest is a beautiful structure placed on or close to the ground, and is composed of loose moss (Hypnum) intermingled with dead leaves and stems; the lining is composed of the fruit-stalks of the moss thickly felted together¹³.

They approach the Shrikes (Laniidæ); and, indeed, we think it not at all improbable that their more immediate relationship with the African genus Laniarius, which they strongly resemble in many points of coloration, will some day have to be reconsidered; but to do so here would lead us into a discussion far beyond the limits of the present work.

One specimen he killed was in the midst of an innumerable column of Tepegua ants (Eciton mexicanum), upon which he says it was doubtless feeding.
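One way to read these passages is as a test of a simple textual cue: in each of them a scientific name appears in parentheses after a vernacular one. The Python sketch below tries that cue on excerpts from the four passages. The pattern, the excerpts chosen and the function name are assumptions made for illustration, not a description of how the ViBRANT corpus was annotated.

```python
import re

# Assumed cue: a capitalised Latin word, optionally followed by a single
# lower-case epithet, enclosed in parentheses. The ligatures æ and œ are
# included because the source text uses them (for example "Laniidæ").
PAREN_NAME = re.compile(r"\(([A-Z][a-zæœ]+(?: [a-zæœ]+)?)\)")

# Excerpts from the four passages on this slide.
PASSAGES = [
    "a bunch of Spanish moss (Tillandsia) about eight feet from the ground",
    "composed of loose moss (Hypnum) intermingled with dead leaves and stems",
    "They approach the Shrikes (Laniidæ); and, indeed, we think it not at "
    "all improbable that their more immediate relationship with the "
    "African genus Laniarius",
    "an innumerable column of Tepegua ants (Eciton mexicanum)",
]


def parenthetical_names(text):
    """Return the names matched by the parenthesis cue, in order."""
    return PAREN_NAME.findall(text)


if __name__ == "__main__":
    for passage in PASSAGES:
        print(parenthetical_names(passage))
```

The cue picks up the four parenthetical names but not "Laniarius" in the third passage, which is written without parentheses. It also says nothing about the role each name plays in its sentence.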

Conclusions

Exploit or Explore

Ease of Immediate Use or Future Flexibility

For data scientists:

The ViBRANT corpus:

https://git.scratchpads.eu/v/vibrantcorpus.git

paper in preparation

Thank you to:

BHL (www.biodiversitylibrary.org)

For scanning and hosting the original material

INOTAXA (www.inotaxa.org)

For making the re-keyed data available

ViBRANT (vbrant.eu)

For funding this work as part of workpackage 7

ViBRANT is funded by the European Union 7th Framework Programme within the Research Infrastructures group. Contract no. RI-261532. Period: Dec. 2010 to Nov. 2013. Coordinator: Dr Vince Smith.

Any questions?

David King

David.King@open.ac.uk