How the Encyclopedia of Life is wrangling organismal attribute data

Post on 01-Sep-2014

DESCRIPTION

Lightning talk presented at iEvoBio 2013 in Snowbird, Utah

TRANSCRIPT

eol.org @eol @cydparr

How the Encyclopedia of Life is wrangling organismal attribute data

How EOL works

EOL

Crowds

Harvest

Third party applications

EOL Today

Key Milestones in 2013

1.1 million species pages

240+ content providers

3.3 million unique annual visitors from 235 countries

[Bar chart: number of text objects (axis from 0 to 800,000) by subject of text object. Subjects shown: Distribution, Molecular Biology, Multiple topics, Type Information, Habitat, Conservation Status, Threats, Morphology, Conservation, Management, Trends, Size, Associations, Uses, Trophic Strategy, Cyclicity & Life Cycle, Population Biology, Reproduction, Migration, Taxonomy, Life Expectancy, Identification, Behaviour, Ecology, Diseases]

Text mining, crowdsourcing, standardizing: see http://eol.org/info/fellows

Co-occurrence, term extraction & linked data

Thessen & Devries

EnvO habitat terms Pafilis et al.

Altitude Specificity of Flower Coloration

Wright

Morphological impacts of extinction risk in fish

Chang

Butterfly-hostplant associations Ferrer-Paris et al.

Species Interactions Poelen & Mungall et al.

14 datasets containing 25k taxa, 422k interactions, for 3k locations

alpha version of ingestion, normalization, aggregation

alpha version of web API

alpha version of data exports

Dr. Katy Börner led Information Visualization MOOC

GLoBI http://globalbioticinteractions.wordpress.com/

EOL TraitBank

Funded: Marine focus

Virtuoso triple store, re-using URIs where possible

5 datasets, 128,050 data points for 20,896 taxa

Harvest and display on data tab

Downloads, fancy searching

Machine access

Uploads & harvests will be by spreadsheet and Darwin Core Archive

Support for annotation and curation

Please contact me to be part of the private beta
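
As a rough illustration of what storing a trait as triples with re-used URIs can look like, the sketch below turns one spreadsheet-style row into subject/predicate/object statements. The Darwin Core term names (taxonID, measurementType, measurementValue, measurementUnit) are real terms; the example.org URIs, the row contents, and the `row_to_triples` helper are hypothetical and are not TraitBank's actual schema.

```python
# Sketch only: one spreadsheet-style trait row expressed as triples.
# All example.org URIs and the row values are made-up placeholders.
DWC = "http://rs.tdwg.org/dwc/terms/"  # Darwin Core terms namespace


def row_to_triples(row):
    """Return (subject, predicate, object) tuples for one measurement row."""
    subject = row["measurement_uri"]
    return [
        (subject, DWC + "taxonID", row["taxon_uri"]),
        (subject, DWC + "measurementType", row["trait_uri"]),
        (subject, DWC + "measurementValue", row["value"]),
        (subject, DWC + "measurementUnit", row["unit_uri"]),
    ]


# Hypothetical row, as it might appear in an uploaded spreadsheet.
example_row = {
    "measurement_uri": "http://example.org/measurement/1",
    "taxon_uri": "http://example.org/taxon/123",
    "trait_uri": "http://example.org/trait/body_length",
    "value": "42",
    "unit_uri": "http://example.org/unit/millimeter",
}

triples = row_to_triples(example_row)
for triple in triples:
    print(triple)
```

Pointing the predicates and units at existing vocabulary URIs, rather than minting new ones per dataset, is what lets records from different providers be queried together.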

Easy access to analyzable trait data

“Are blue organisms more common in high altitudes?”

“Does the evolution of mammalian bacula appear to be related to the pattern of promiscuous mating?”

“What organisms should I collect to fill in gaps in genome quality tissue collections?”

• Look for trait, download for all taxa
• Create a collection of taxa, download all data
• Use Reol: an R interface to EOL (Banbury, O’Meara) http://reolblog.wordpress.com/
• Find more specialized data repositories
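
Once a trait table has been downloaded, a question such as the one about blue organisms and altitude reduces to a small grouping exercise. The sketch below is a minimal example of that step, assuming a CSV with hypothetical columns `taxon`, `flower_color`, and `altitude_m`; the rows are invented toy values, not EOL data, and the 2,000 m threshold is an arbitrary choice.

```python
import csv
import io

# Toy stand-in for a downloaded trait table; columns and rows are invented.
SAMPLE_CSV = """taxon,flower_color,altitude_m
taxon_a,blue,2600
taxon_b,red,300
taxon_c,blue,2900
taxon_d,yellow,2400
taxon_e,blue,150
taxon_f,white,500
"""


def blue_fraction_by_band(csv_text, threshold_m=2000):
    """Return the share of blue taxa below and at/above an altitude threshold."""
    counts = {"low": [0, 0], "high": [0, 0]}  # band -> [blue count, total count]
    for row in csv.DictReader(io.StringIO(csv_text)):
        band = "high" if float(row["altitude_m"]) >= threshold_m else "low"
        counts[band][1] += 1
        if row["flower_color"] == "blue":
            counts[band][0] += 1
    # Skip empty bands to avoid dividing by zero.
    return {band: blue / total for band, (blue, total) in counts.items() if total}


fractions = blue_fraction_by_band(SAMPLE_CSV)
print(fractions)
```

A real analysis would also need to account for shared ancestry among the taxa and for uneven sampling across altitudes; this only shows the shape of the data handling.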

But also . . .

Thanks

Funding & other contributions

Sloan Foundation

Smithsonian Institution

David Rubenstein

Marine Biological Laboratory

Harvard University

Our content partners

Thousands of individual contributors, and hundreds of volunteer curators

Image credits

Jenny from Taipei

Cynthia Parr, Chief Scientist @eol

@cydparr parrc@si.edu

Alexandria Archive: Sarah Kansa, Eric Kansa, 34 other zooarchaeologists

GLoBI: Jorrit Poelen (lead/software), Chris Mungall (ontologies), James Simons (biologist) and Robert Reiz (software). Datasets shared by: Peter D. Roopnarine, Rachel Hertog, Carlos García-Robledo, James Simons, Jenny L. Wrast, C. Barnes, International Council for the Exploration of the Sea (ICES), Jose R. Ferrer Paris, Senol Akin, Malcolm Storey (BioInfo.org.uk), Ivy E. Baremore, Joel Sachs (SPIRE), Colt W. Cook, David A. Blewett

Quick math

In Phenoscape, 57 publications had 565,158 anatomical trait descriptions for 2,527 kinds of organisms = 223 traits/organism

In ZFIN, 38,189 trait descriptions for 4,727 genes for zebrafish

1.9 million species on the planet

= LOTS OF TRAITS
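
The arithmetic behind this slide can be spelled out as below. The three input figures are the ones quoted above; the projection assumes, purely for illustration, that every species would be described as densely as the organisms in Phenoscape, which is a simplification rather than a claim made in the talk. The slide's 223 is the whole-number part of the ratio.

```python
# Figures quoted on the slide.
phenoscape_descriptions = 565_158
phenoscape_organisms = 2_527
species_on_planet = 1_900_000

# Average number of anatomical trait descriptions per kind of organism.
traits_per_organism = phenoscape_descriptions / phenoscape_organisms

# Assumption: every species is described as densely as Phenoscape's organisms.
projected_traits = traits_per_organism * species_on_planet

print(f"{traits_per_organism:.1f} trait descriptions per organism")
print(f"{projected_traits / 1e6:.0f} million trait descriptions projected")
```

Even under this crude assumption the projection lands in the hundreds of millions of trait descriptions, and that covers anatomical traits alone.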

Anatolia Zooarchaeology Case Study led by Alexandria Archive Institute

1. 14 different sites
2. 34+ zooarchaeologists
3. Decoding, cleanup, metadata documentation
4. 220,000+ specimens
5. 450 entities linked to 143 EOL taxon concepts
6. Anatomical entities linked to Uberon.org
7. Biometrics linked to measurement ontology
8. Collaborative analysis

http://opencontext.org/
