biscicol ievobio

John Deck, University of California, Berkeley; Brian Stucky, Colorado University, Boulder; Nico Cellinese, University of Florida, Gainesville; Neil Davies, University of California, Berkeley; Rob Guralnick, Colorado University, Boulder; Chris Meyer, Smithsonian Institution; Tom Orrell, Smithsonian Institution; Richard Pyle, Bishop Museum; Kate Rachwal, University of Florida, Gainesville; Russell Watkins, University of Florida, Gainesville

BiSciCol: a Tagging and Tracking Infrastructure for Biological Science Collections

Upload: john-deck

Post on 10-May-2015


DESCRIPTION

A general presentation of the BiSciCol project showing use cases, describing technical background, and describing application possibilities.

TRANSCRIPT

Page 1: BiSciCol ievobio

John Deck, University of California, Berkeley

Brian Stucky, Colorado University, Boulder

Nico Cellinese, University of Florida, Gainesville

Neil Davies, University of California, Berkeley

Rob Guralnick, Colorado University, Boulder

Chris Meyer, Smithsonian Institution

Tom Orrell, Smithsonian Institution

Richard Pyle, Bishop Museum

Kate Rachwal, University of Florida, Gainesville

Russell Watkins, University of Florida, Gainesville

BiSciCol: a Tagging and Tracking Infrastructure for Biological Science Collections

Page 2: BiSciCol ievobio

Biological Science Collections Tracker: working towards building an infrastructure designed to tag and track scientific collections and all of their derivatives.

National Science Foundation funded 2010 – 2014

Partners are University of Florida, Colorado University, Bishop Museum, UC Berkeley, Smithsonian Institution, University of Arizona

Relies on globally unique identifiers (GUIDs) to track objects

Implements a Linked Data approach

Provides support for the Global Names Architecture

Page 3: BiSciCol ievobio

Outline

Use Cases – Why this is important

Technical Background

Globally Unique Identifiers (GUIDs)

RDF / BiSciCol Implementation

Combining Graphs

BiSciCol Taxonomy / GNUB Integration

Tools and Potential Tools in Development

How the BiSciCol Application Works

Online search interface

Alert system

Annotation Network Service

How to Get Involved

Page 4: BiSciCol ievobio

Use Case: Notify people interested in a collecting event that identifications have been made

[Diagram. Objects: Biocode Event, Essig Museum Specimen, Smithsonian Tissue, Bishop Museum Tissue, Genbank Sequence, BOLD Barcode. Identifications: Key/Person, Blast, Blast, each shown with a Taxon.]

Page 5: BiSciCol ievobio

Use Case: For Taxon X, find specimens with recent modification dates for image or tissue samples.

[Diagram. Nodes: Taxon, Identification, Essig Museum Specimen, Bishop Museum Tissue, CalPhotos Image. Two of the nodes carry DateLastModified values: "2011-06-01" and "2011-06-20".]

Page 6: BiSciCol ievobio

Use Case: Map all place names/localities associated with Taxon X

[Diagram. Nodes: Taxon, two Identifications, two Specimens, two Locations.]

Page 7: BiSciCol ievobio

Use Case: Provide collection permit information and use restrictions on tissue samples from Specimen X

[Diagram. Nodes: Collecting Permit, Biocode Event, Essig Specimen, Bishop Museum Tissue, Smithsonian Tissue.]

Page 8: BiSciCol ievobio

Technical Background

Globally Unique Identifiers (GUIDs)

RDF / BiSciCol Implementation

Combining Graphs

BiSciCol Taxonomy / GNUB Integration

Page 9: BiSciCol ievobio

Creating Globally Unique Identifiers (GUIDs)

Globally unique (mandatory)

Persistent (not mandatory, but very helpful)

Resolvable (not mandatory, but very helpful)

Resolution/Domain + Identifier

Examples:

Identifiers:

JDeckSpecimen1 (a named identifier)

+1-541-914-4739 (unique, at least for phones)

7217D220-836A-11DF-8395-0800200C9A66 (opaque)

Resolution/Domain:

http://mycollection.org/specimen/

http://example.org/urn:lsid:example.org:specimen/

Resolution/Domain + Identifier:

http://mycollection.org/specimen/JDeckSpecimen1

http://mycollection.org/specimen/uuid=7217D220-836A-11DF-8395-0800200C9A66

http://example.org/urn:lsid:example.org:specimen/7217D220-836A-11DF-8395-0800200C9A66
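The "Resolution/Domain + Identifier" pattern can be illustrated with a minimal Python sketch. The helper name `make_guid` is hypothetical and is not part of BiSciCol; the prefix and the named identifier are taken from the slide, and the opaque identifier is a freshly generated UUID rather than the one shown there.

```python
import uuid


def make_guid(resolution_prefix: str, identifier: str) -> str:
    """Join a resolution/domain prefix and a local identifier.

    Hypothetical helper for illustration only.
    """
    # Make sure exactly one slash separates the two parts.
    return resolution_prefix.rstrip("/") + "/" + identifier.lstrip("/")


PREFIX = "http://mycollection.org/specimen/"

# A named identifier, as in the slide's first example.
named = make_guid(PREFIX, "JDeckSpecimen1")

# An opaque identifier: a random UUID, upper-cased to match the slide's style.
opaque = make_guid(PREFIX, str(uuid.uuid4()).upper())

print(named)
print(opaque)
```

The named form is easier to read, while the opaque form avoids baking meaning into the identifier that may later change.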

Page 10: BiSciCol ievobio

BiSciCol Implementation of the Resource Description Framework (RDF)

An RDF Statement: Subject, Predicate, Object

OR: GUID1, Predicate, GUID2

relatedTo (Transitive):

GUID1 relatedTo GUID2 relatedTo GUID3

relatedTo: GUID1 <-> GUID2, GUID2 <-> GUID3, GUID1 <-> GUID3

A Simple BiSciCol Graph (graph = set of RDF Statements):

[Diagram. GUID1, GUID2 and GUID3 are linked by two relatedTo predicates. Each GUID has a type ("a": Event, Specimen, Tissue) and a Date ("2011-05-01", "2011-06-01", "2011-06-20").]
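As a rough illustration of the transitive relatedTo predicate, the following Python sketch stores statements as (subject, predicate, object) tuples and derives every pair of GUIDs that relatedTo connects. It treats relatedTo as both symmetric and transitive, following the `<->` notation on the slide. The type statements are assumptions added only to show that other predicates are ignored; the slide does not make clear which GUID has which type.

```python
RELATED_TO = "relatedTo"

# A graph is a set of RDF-style statements (subject, predicate, object).
triples = {
    ("GUID1", RELATED_TO, "GUID2"),
    ("GUID2", RELATED_TO, "GUID3"),
    # Assumed type assignments, for illustration only.
    ("GUID1", "a", "Event"),
    ("GUID2", "a", "Specimen"),
}


def related_pairs(statements):
    """Return all ordered pairs of nodes connected through relatedTo."""
    # Build an undirected adjacency map from the relatedTo statements.
    neighbours = {}
    for subject, predicate, obj in statements:
        if predicate == RELATED_TO:
            neighbours.setdefault(subject, set()).add(obj)
            neighbours.setdefault(obj, set()).add(subject)

    pairs = set()
    for start in neighbours:
        # Depth-first walk collects everything reachable from `start`.
        seen = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in neighbours[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        pairs |= {(start, other) for other in seen if other != start}
    return pairs


closure = related_pairs(triples)
print(("GUID1", "GUID3") in closure)
```

GUID1 and GUID3 are never linked directly, yet the pair appears in the result because both are related to GUID2.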

Page 11: BiSciCol ievobio

Combining Graphs

SPARQL: query language for RDF; queries multiple graphs

A set of institutions we are interested in

[Diagram. Sources, each exposing a graph: XML (Graph), N3/Turtle (Graph), Web Page Tags (RDFa/Microformats) (Graph).]
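The idea of querying several institutional graphs as one can be sketched in plain Python. This is a stand-in for what SPARQL does over real RDF stores, not SPARQL itself: graphs are sets of triples, combining them is a set union, and `match` is a hypothetical pattern-matching helper. The two institution graphs and their contents are invented for illustration.

```python
def merge_graphs(*graphs):
    """Combine several graphs (sets of triples) into a single graph."""
    merged = set()
    for graph in graphs:
        merged |= set(graph)
    return merged


def match(graph, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Hypothetical graphs published by two institutions.
museum_a = {("GUID1", "relatedTo", "GUID2"), ("GUID1", "a", "Event")}
museum_b = {("GUID2", "relatedTo", "GUID3"), ("GUID3", "a", "Tissue")}

combined = merge_graphs(museum_a, museum_b)
links = match(combined, p="relatedTo")
print(links)
```

Because both institutions refer to GUID2 by the same identifier, the merged graph connects GUID1 to GUID3 even though neither source holds the whole chain.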

Page 12: BiSciCol ievobio

BiSciCol / Global Names Architecture Integration

Building a framework for linking taxon concepts to GUIDs

Linking Occurrences to Taxons

GUID1 (Occurrence) relatedTo GUID2 (Identification: Key/Person/Blast) relatedTo GUID3 (Taxon)

The Global Names Architecture is a BiSciCol partner that is creating resolvable identifiers for all taxons through its Global Names Usage Bank (GNUB) service.

Linking Taxons to GNUB grants us search capabilities for taxon names across all systems.

Taxon = defined by Darwin Core Taxon Class

Linking Taxons to Taxons (e.g. GNUB to myAuthority)

[Diagram. Nodes: GUID1 (Identification), GUID2 (Taxon), GUID3 (Taxon), GUID4 (Identification), connected by three relatedTo links.]
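A small Python sketch of how an occurrence might be resolved to taxa by following relatedTo links. The first three GUIDs mirror the Occurrence, Identification, Taxon chain on the slide; the fourth node, `myAuthority:taxon1`, and the taxon-to-taxon link are assumptions standing in for a GNUB-to-myAuthority relationship. Links are followed in one direction only, which is also an assumption.

```python
# Directed relatedTo links as (subject, object) pairs.
edges = [
    ("GUID1", "GUID2"),               # Occurrence -> Identification
    ("GUID2", "GUID3"),               # Identification -> Taxon
    ("GUID3", "myAuthority:taxon1"),  # Taxon -> Taxon (hypothetical)
]

# Type of each node; the last entry is hypothetical.
types = {
    "GUID1": "Occurrence",
    "GUID2": "Identification",
    "GUID3": "Taxon",
    "myAuthority:taxon1": "Taxon",
}


def taxa_for(start, links, node_types):
    """Collect every Taxon reachable from `start` along relatedTo links."""
    found = []
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for subject, obj in links:
            if subject == node and obj not in seen:
                seen.add(obj)
                stack.append(obj)
                if node_types.get(obj) == "Taxon":
                    found.append(obj)
    return sorted(found)


print(taxa_for("GUID1", edges, types))
```

Linking taxon records across authorities means a search starting from one occurrence can surface the equivalent taxon in every linked system.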

Page 13: BiSciCol ievobio

Tools & Potential Tools in Development

What the BiSciCol Application Does

Online search interface

Alert system

Annotation Network Service

Page 14: BiSciCol ievobio

What the BiSciCol Application Does

http://code.google.com/p/biscicol/

[Architecture diagram]

Data sources, reached over the Internet via SPARQL: XML (Graph), N3/Turtle (Graph), (Graph), RDFa web tags (indexer -> graph)

BiSciCol Application (Java):

Model Class: combine graph query results into a single graph (model)

Object Class: work with graph results in memory (Ancestors, Siblings, Descendants)

Render Class: provides image/text representation of results

Services: Search Service, Map Service
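The in-memory operations named for the Object Class (ancestors, siblings, descendants) can be sketched as follows. This is an illustrative Python sketch, not the project's Java code; it assumes relationships are directed from parent to child, and the node names are hypothetical.

```python
# Directed (parent, child) links; names are hypothetical.
edges = [
    ("event", "specimen"),
    ("specimen", "tissueA"),
    ("specimen", "tissueB"),
]


def _reachable(node, links):
    """All nodes reachable from `node` by following links forward."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent, child in links:
            if parent == current and child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def descendants(node, links):
    return _reachable(node, links)


def ancestors(node, links):
    # Walk the same graph with every link reversed.
    return _reachable(node, [(child, parent) for parent, child in links])


def siblings(node, links):
    """Nodes that share at least one direct parent with `node`."""
    parents = {p for p, c in links if c == node}
    return {c for p, c in links if p in parents and c != node}


print(descendants("event", edges))
print(ancestors("tissueA", edges))
print(siblings("tissueA", edges))
```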

Page 15: BiSciCol ievobio

Online Search Interface Demonstration

Page 16: BiSciCol ievobio

Alert System (Proposed)

BiSciCol Application: discovers/traverses relationships

Sources: XML (Graph), N3/Turtle (Graph), (Graph)

Services: RSS feed, emails, etc.

Application to store identifiers:

1) User X stores objects: JDeckSpecimen1, JDeckEvent1, etc.

2) Schedules jobs

3) Runs queries: recent changes, new identifications
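The "recent changes" query in the proposed alert system might look roughly like this Python sketch. The identifiers come from the slide; the modification dates assigned to them, the `recent_changes` helper and the idea of comparing against the date of the last scheduled run are all assumptions made for illustration.

```python
from datetime import date

# Identifiers a user has asked to watch (from the slide).
watched = ["JDeckSpecimen1", "JDeckEvent1"]

# Hypothetical last-modified dates gathered from the source graphs.
last_modified = {
    "JDeckSpecimen1": "2011-06-20",
    "JDeckEvent1": "2011-05-01",
}


def recent_changes(identifiers, modified, since):
    """Return watched identifiers modified after the `since` date."""
    return sorted(
        guid for guid in identifiers
        if guid in modified and date.fromisoformat(modified[guid]) > since
    )


# Date of the previous scheduled run (assumed).
last_run = date(2011, 6, 1)
alerts = recent_changes(watched, last_modified, last_run)
print(alerts)
```

A scheduled job could run this check periodically and pass the resulting list to whatever delivery service is chosen, such as an RSS feed or email.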

Page 17: BiSciCol ievobio

Annotation Network Service (Proposed)

Facilitating Feedback from Third Parties

Generate annotation and store in a service

Relate Annotation GUID to Specimen GUID

Now discoverable and linked into BiSciCol
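The three annotation steps can be sketched in Python under some assumptions: the "service" is just an in-memory set of triples, the annotation GUID is a random UUID under a hypothetical `http://example.org/annotation/` prefix, and the `comment` predicate is invented for the example. Only relatedTo comes from the slides.

```python
import uuid


def annotate(store, specimen_guid, text,
             base="http://example.org/annotation/"):
    """Create an annotation, store it, and relate it to a specimen GUID."""
    # Step 1: generate the annotation and give it its own GUID.
    annotation_guid = base + str(uuid.uuid4())
    store.add((annotation_guid, "comment", text))
    # Step 2: relate the annotation GUID to the specimen GUID.
    store.add((annotation_guid, "relatedTo", specimen_guid))
    return annotation_guid


store = set()
specimen = "http://mycollection.org/specimen/JDeckSpecimen1"
note = annotate(store, specimen, "Identification looks doubtful")

# Step 3: the annotation is now discoverable from the specimen GUID.
found = [s for s, p, o in store if p == "relatedTo" and o == specimen]
print(found == [note])
```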

Page 18: BiSciCol ievobio

How to Get Involved

“Create stable identifiers, link them to other stable identifiers, and put them on the web.”

http://biscicol.blogspot.com/

http://code.google.com/p/biscicol/