NIF Resource Curation and Automated Resource Discovery?

Upload: jed

Post on 25-Feb-2016



TRANSCRIPT

Page 1: NIF Resource Curation and Automated Resource Discovery?

NIF Resource Curation and Automated Resource Discovery?

Page 2: NIF Resource Curation and Automated Resource Discovery?

NIF Resources

• NIF is cataloging websites that house information about databases, atlases, software tools, data, transgenic mice and other things that we consider of value to the neuroscience community.

Page 3: NIF Resource Curation and Automated Resource Discovery?

Definition of Resource

• Individual resource boundary: a site shall be considered an individual resource if it is maintained by a single entity and consists of one or more individual web pages that are related by a theme and HTML links.

Page 4: NIF Resource Curation and Automated Resource Discovery?

Resource Nomination

• Registry (4500)
• Public Registry (2300)
• NIF Web (499,952)
• Level 2/3 (29)
• User Feedback
• *Automated tools
• Web Crawl
• Registry Subset
• Nomination
• Check: Links, Annotation, Vocabulary
• *Automated updates Level 2 tools

*In Development

Page 5: NIF Resource Curation and Automated Resource Discovery?

Resource is Nominated: NIF Staff, Contact at Meetings, Web Form

In NIF already?

Assign Metadata
- short name, long name, url
- description (short description 1-3 sentences, longer description)
- parent organization (physical location, university)
- support (grant numbers)
- keywords (species, technique, structure, age, level, disease, topic)

Decision: Should it be included?

Assign resource type

Do not include: Keep Record
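
The nomination workflow above can be sketched as a small record type plus a decision function. This is only an illustration: the names `NominatedResource` and `curate`, the field names, and the outcome strings are hypothetical, chosen to mirror the metadata listed on the slide, and are not taken from the actual NIF or CINdy implementation.

```python
from dataclasses import dataclass, field


@dataclass
class NominatedResource:
    """Hypothetical record mirroring the metadata fields on the slide."""
    short_name: str
    long_name: str
    url: str
    description: str = ""
    parent_organization: str = ""
    support: list = field(default_factory=list)   # grant numbers
    keywords: list = field(default_factory=list)  # species, technique, ...
    resource_type: str = ""                       # assigned only if included


def curate(nominee, registry, rejected, include, resource_type=""):
    """Walk one nominee through the workflow and return the outcome.

    registry: dict mapping url -> NominatedResource (already in NIF)
    rejected: list of nominees that were declined but kept on record
    include:  the curator's answer to "Should it be included?"
    """
    if nominee.url in registry:
        return "already in NIF"
    if not include:
        rejected.append(nominee)  # "Do not include: Keep Record"
        return "not included, record kept"
    nominee.resource_type = resource_type
    registry[nominee.url] = nominee
    return "included"
```

The include/exclude decision is passed in as an argument because, on the slides, that step is made by a human curator rather than computed.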

Page 6: NIF Resource Curation and Automated Resource Discovery?

Resources Difficult to Categorize

• Link aggregates
• Large organizations (NIH)
• Poorly documented databases
• Private data sites
• Clinical trials that are still recruiting
– Experimental protocol
• Commercial entities
• Journals
– JOVE
– supplemental materials

Page 7: NIF Resource Curation and Automated Resource Discovery?

CINdy the resource curation tool

Page 8: NIF Resource Curation and Automated Resource Discovery?
Page 9: NIF Resource Curation and Automated Resource Discovery?

Resource Ontology (BRO)

• Data Resource: provides access to data; database, atlas, book
• Software Resource: software programs or source code
• Material Resource: reagents, tissue samples or organisms
• Funding Resource: grants or contracts
• Training Resource: educational materials, training programs
• Job Resource: employment opportunities
• People Resource: access to individual people’s web sites

Page 10: NIF Resource Curation and Automated Resource Discovery?

NIF Service vs BRO Service

Page 11: NIF Resource Curation and Automated Resource Discovery?

Solutions: Consolidating Classes

• Synonyms where appropriate: e.g., Material storage service vs. Material storage repository.
• Temporary mapping, where appropriate
– *Deprecated terms must be maintained*
• Data loss
• Moving forward with a joint descriptive terminology!
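
A minimal sketch of how synonyms and maintained deprecated terms could be resolved to one preferred class label. The function `resolve` and both tables are hypothetical. Treating "material storage repository" as the preferred label is an arbitrary assumption made for the example, since the slide does not say which of the two terms wins.

```python
# Hypothetical synonym table; the two labels come from the slide's example.
SYNONYMS = {
    "material storage service": "material storage repository",
}


def resolve(term, synonyms, deprecated):
    """Return (preferred_label, was_deprecated) for a class label.

    deprecated maps an old label to its replacement. Old labels stay in
    the table instead of being deleted, so existing annotations still
    resolve after classes are consolidated.
    """
    key = term.strip().lower()
    was_deprecated = False
    seen = set()
    # Follow the deprecation chain, guarding against accidental cycles.
    while key in deprecated and key not in seen:
        seen.add(key)
        key = deprecated[key]
        was_deprecated = True
    return synonyms.get(key, key), was_deprecated
```

Returning the `was_deprecated` flag lets a curation tool report which annotations still use an old label without rejecting them.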

Page 12: NIF Resource Curation and Automated Resource Discovery?

Evolution of the NIF Resource Ontology

• Object
– Materials (Biomaterials, Reagents)
– Software
– People
– Grants
– Jobs
– Information
• Function
– Service (Storage, Production)
– Funding
– Job Service
– Community-building
• Target Audience
– General
– Kids
– Student
– Medical
– Researcher
• Data Type
– Structured (Database, Atlas)
– Unstructured (Journal, Webpage)
• Data Format
– Text
– RDF Text
– Picture
– Video

Page 13: NIF Resource Curation and Automated Resource Discovery?
Page 14: NIF Resource Curation and Automated Resource Discovery?

Resource Boundary?

• Software Library
– Software tool
  • Plugin: I2B2
• Our solution: use url as a uniqueness qualifier
– Our problem: a single url may house several resources
– Individual plugins can have individual urls
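
To illustrate the url-as-uniqueness-qualifier idea and the problem it runs into, here is a small sketch. The helpers `url_key` and `find_collisions` are hypothetical, and the normalization rules (ignore scheme and query string, lowercase the host, drop a leading "www." and a trailing slash) are assumptions for the example, not NIF's actual rules.

```python
from urllib.parse import urlsplit


def url_key(url):
    """Normalize a URL into a comparison key (host plus path only)."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/")
    return host + path


def find_collisions(resources):
    """Group resource names by URL key.

    resources: iterable of (name, url) pairs.
    Returns only the keys that are shared by more than one name, i.e. the
    cases where a single url houses several resources.
    """
    by_key = {}
    for name, url in resources:
        by_key.setdefault(url_key(url), []).append(name)
    return {key: names for key, names in by_key.items() if len(names) > 1}
```

A plugin with its own url gets its own key and is treated as a separate resource, while resources that share a url are flagged for a human curator to separate.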

Page 15: NIF Resource Curation and Automated Resource Discovery?

Boundary cont.

• Individual resource boundary: a site shall be considered an individual resource if it is maintained by a single entity and consists of one or more individual web pages that are related by a theme and HTML links.
• Solution to random boundary problem: Human Curator

Page 16: NIF Resource Curation and Automated Resource Discovery?

Issues of Scope

• Single line or short paragraph + keywords
– Resource discovery problem
*The Stanford ontologies description is very short (as are many); finding this resource by keyword will be difficult unless we index the content of the website.
• Data dump
– Small vs. large databases
– Updates
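
The resource discovery problem above can be made concrete with a toy inverted index: a keyword query that fails against a one-line description can succeed once the site content is indexed too. The functions and the sample texts are hypothetical and are not taken from any real registry entry.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into a set of lowercase alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_index(docs):
    """docs: dict of resource name -> text. Returns word -> set of names."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the names whose text contains every word of the query."""
    words = tokenize(query)
    if not words:
        return set()
    hits = [index.get(word, set()) for word in words]
    return set.intersection(*hits)
```

For instance, a resource described only as "Ontology collection." is not found by the query "brain anatomy", but it is found if the indexed text also includes a page from the site that mentions those words.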

Page 17: NIF Resource Curation and Automated Resource Discovery?

Internal referencing

• Stanford example:
– License: “same as bioportal”; this does not match any license type in any list.
– Problem: non-standard terminology, reference to another project (no url), can create loops
  • also true in publications: e.g., used the same protocol as paper X, which used the same protocol as paper Y
– Automated text mining tools have a hard time recognizing these
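
A sketch of why "same as X" references are awkward for automated tools: each one has to be followed until it reaches something concrete, and the chain may loop. The function `resolve_reference` and the mapping it takes are hypothetical. Building that mapping from free text is the hard part and is not shown here.

```python
def resolve_reference(start, refers_to):
    """Follow "same as X" references until a concrete item or a loop.

    refers_to: dict mapping a project or paper to the one it points at.
    Returns (final, chain). final is None when the chain loops back on
    itself; chain lists every item visited, in order.
    """
    chain = [start]
    seen = {start}
    current = start
    while current in refers_to:
        current = refers_to[current]
        chain.append(current)
        if current in seen:
            return None, chain  # loop detected
        seen.add(current)
    return current, chain
```

With `{"paper X": "paper Y"}` the protocol of paper X resolves to paper Y, while a pair of projects that each say "same as" the other yields `None`.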

Page 18: NIF Resource Curation and Automated Resource Discovery?

What can we gain from automated systems?

• Basic information: Name, url, contact info

• Some keywords
• Some descriptive text
• No resource boundary
• No resource description

Page 19: NIF Resource Curation and Automated Resource Discovery?

How do we help the computers?

• Common naming project (neurocommons): http://sharedname.org/page/Main_Page
• Automated URIs
• Community building:
– Shared data models
– Shared ontology
– RDF entity tags? (mouse vs mouse)
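
One way to read the "mouse vs mouse" point is that a bare label is ambiguous and an entity tag (a URI) is what tells two meanings apart. The sketch below assumes that reading. The animal versus input device interpretation, the example.org URIs, and the functions `candidates` and `tag` are all hypothetical placeholders.

```python
# Hypothetical entity table: two different things that share one label.
# The example.org URIs are placeholders, not real identifiers.
ENTITIES = {
    "http://example.org/entity/mouse-organism": {"label": "mouse", "kind": "organism"},
    "http://example.org/entity/mouse-device": {"label": "mouse", "kind": "input device"},
}


def candidates(label, entities):
    """Return every URI whose label matches; more than one means ambiguity."""
    wanted = label.strip().lower()
    return sorted(uri for uri, info in entities.items() if info["label"] == wanted)


def tag(label, kind, entities):
    """Pick the single URI matching both label and kind, or None."""
    matches = [uri for uri in candidates(label, entities)
               if entities[uri]["kind"] == kind]
    return matches[0] if len(matches) == 1 else None
```

The label alone returns two candidates; adding the kind narrows it to one URI, which is the information a shared tagging convention would need to carry.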