vivo researcher networking update
VIVO Researcher Networking Update. April 5, 2011, 1-2 p.m. Leslie McIntosh, VIVO National Evaluator, Washington University. Ellen J. Cramer, Special Projects Lead, Cornell University. Jonathan Corson-Rikert, VIVO Development Lead, Cornell University.
April 5, 2011 1-2 p.m.
VIVO Researcher Networking Update
Leslie McIntosh, VIVO National Evaluator, Washington University
Jonathan Corson-Rikert, VIVO Development Lead, Cornell University
Ellen J. Cramer, Special Projects Lead, Cornell University
VIVO Collaboration
Cornell University
Dean Krafft (Cornell PI)
Manolo Bevia, Jim Blake, Nick Cappadona, Brian Caruso, Elly Cramer, Medha Devare, Elizabeth Hines, Huda Khan, Brian Lowe, Joseph McEnerney, Holly Mistlebauer, Stella Mitchell, Anup Sawant, Christopher Westling, Tim Worrall, Rebecca Younes, Jon Corson-Rikert
University of Florida
Mike Conlon (VIVO and UF PI)
Beth Auten, Chris Barnes, Cecilia Botero, Kerry Britt, Erin Brooks, Amy Buhler, Ellie Bushhousen, Linda Butson, Chris Case, Christine Cogar, Valrie Davis, Mary Edwards, Nita Ferree, Rolando Garcia-Milan, George Hack, Chris Haines, Sara Henning, Rae Jesano, Margeaux Johnson, Meghan Latorre, Yang Li, Paula Markes, Hannah Norton, Narayan Raum, Alexander Rockwell, Sara Russell Gonzalez, Nancy Schaefer, Dale Scheppler, Nicholas Skaggs, Syraj Syed, Matthew Tedder, Michele R. Tennant, Alicia Turner, Stephen Williams
Indiana University
Katy Borner (IU PI)
Kavitha Chandrasekar, Bin Chen, Shanshan Chen, Ryan Cobine, Jeni Coffey, Suresh Deivasigamani, Ying Ding, Russell Duhon, Jon Dunn, Poornima Gopinath, Julie Hardesty, Brian Keese, Namrata Lele, Micah Linnemeier, Nianli Ma, Robert H. McDonald, Asik Pradhan Gongaju, Mark Price, Michael Stamper, Yuyin Sun, Chintan Tank, Alan Walsh, Brian Wheeler, Feng Wu, Angela Zoss
Ponce School of Medicine
Richard J. Noel, Jr. (Ponce PI)
Ricardo Espada Colon, Damaris Torres Cruz, Michael Vega Negrón
This project is funded by the National Institutes of Health, U24 RR029822, “VIVO: Enabling National Networking of Scientists.”
The Scripps Research Institute
Gerald Joyce (Scripps PI)
Catherine Dunn, Brant Kelley, Paula King, Angela Murrell, Barbara Noble, Cary Thomas, Michaeleen Trimarchi
Washington University School of Medicine in St. Louis
Rakesh Nagarajan (WUSTL PI)
Kristi L. Holmes, Caerie Houchins, George Joseph, Sunita B. Koul, Jasmine Owens, Leslie D. McIntosh
Weill Cornell Medical College
Curtis Cole (Weill PI)
Paul Albert, Victor Brodsky, Mark Bronnimann, Adam Cheriff, Oscar Cruz, Dan Dickinson, Richard Hu, Chris Huang, Itay Klaz, Kenneth Lee, Peter Michelini, Grace Migliorisi, John Ruffing, Jason Specland, Tru Tran, Vinay Varughese, Virgil Wong
VIVO is an open-source semantic web application that enables the discovery of research and scholarship across disciplines in an institution.
It is populated with detailed profiles of faculty and researchers, displaying items such as publications, teaching, service, and professional affiliations.
It provides powerful search functionality for locating people and information within or across institutions.
Participating Institutions
Institution                               Acad. Staff  Student Pop.  City Pop.  Public/Private  Med School
Cornell (Ithaca)                          1,639        20.9K         100K       Both
University of Florida                     4,534        50.7K         258K       Public          Yes
Indiana University (Bloomington)          2,973        42.4K         175K       Public
Ponce School of Medicine                  200          475           442K       Private         Yes
The Scripps Research Institute            225          ~225          43K        Private
Washington University School of Medicine  1,772        ~500          2.8M       Private         Yes
Weill-Cornell Medical College             1,235        410           8.2M       Private         Yes
Lessons Learned in VIVO Implementation
Data, Data, Data
Get the Data
• Who owns the data?
• Where are the data sources?
• What permissions do you need to use the data?
Manage the Data
• Who owns the data now?
• Do you need to create a data management system?
• How will you refresh your data? How often?
Your data are only as good as the source.
Manage Expectations
Contribute to the Community
More to open source than contributing code
– Data
– Documentation
– IRC communication
– Listservs
– Lessons learned
vivoweb.org
vivo.sourceforge.net
VIVO Cornell: In-house to National Cloud
2003–2007: Developed research profiles using ontologies in a database-driven website to meet the needs of the Life Sciences initiative.
2007: Converted to Semantic Web standards. Expanded to include disciplines across the institution.
2007–2011+: With the NIH grant, moved to a national and international network of institutions and organizations and their faculty and researcher profiles.
VIVO Cornell: Data Sources
Repurposing and re-using data
Local Outreach
• Provost Office - institutional support
• Data providers - HR, annual faculty reporting, grants, courses, other
• Librarian VIVO liaisons - subject areas
• Web developers - repurposing of data
• Department editors - training
Networking
Other sites piloting or adopting VIVO technology
Arizona State University, Duke University, IICA, Los Alamos National Laboratory, Northwestern University, Stony Brook University, University of Arkansas, University at Buffalo, University of Colorado – Boulder, University of Delaware, University of Oregon, University of Virginia, USDA
Integration partners
APA (Digital Trust), Duke (Widgets), Harvard University (Harvard Profiles), Indiana University (HUBzero), ORCID, Stony Brook University (UMLS), University of Hong Kong (Knowledge Exchange), University of Pittsburgh (Digital Vita), Weill Cornell Medical College (Google Refine)
International efforts
• ANDS-Vitro Consortium (Griffith, QUT, University of Melbourne, VeRSI)
• Chinese Academy of Sciences
• IICA (Inter-American Institute for Cooperation on Agriculture) is considering options like VIVO for a researcher network for its SIDALC application, and there is a pilot VIVO implementation at El Colegio de Postgraduados in Mexico.
VIVO update part III
• VIVO core design principles
• Enhancements during the NIH grant
• Planned development
• VIVO at web scale
• Mini-grants and collaborations
• Building community and sustainability
First, it’s about data
• Consistent formatting, in a language of the Web
• Self-describing
– Ontology
– Context inherent in the data
• Distributed
• De-referenceable
• Reusable without (or with) modification
• Persistent independently of any application
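To make "consistent formatting, in a language of the Web" concrete, here is a minimal sketch that writes a few statements about one individual as N-Triples, one of the standard RDF serializations. The individual's URI and name are hypothetical; the rdf:type, rdfs:label, and foaf:Person URIs are the standard W3C and FOAF ones. The `Triple` type and `to_ntriples` helper are illustrative names, not part of VIVO.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str             # URI of the thing being described
    predicate: str           # URI of a property defined in an ontology
    obj: str                 # URI of another thing, or a literal value
    is_literal: bool = False

def to_ntriples(triples):
    """Serialize triples as N-Triples text, one statement per line."""
    lines = []
    for t in triples:
        if t.is_literal:
            # Escape backslashes and double quotes inside literal values.
            escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
            o = f'"{escaped}"'
        else:
            o = f"<{t.obj}>"
        lines.append(f"<{t.subject}> <{t.predicate}> {o} .")
    return "\n".join(lines)

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
FOAF_PERSON = "http://xmlns.com/foaf/0.1/Person"

# Hypothetical individual; a real VIVO uses its institution's own namespace.
PERSON = "http://vivo.example.edu/individual/n1234"

EXAMPLE = [
    Triple(PERSON, RDF_TYPE, FOAF_PERSON),
    Triple(PERSON, RDFS_LABEL, "Jane Doe", is_literal=True),
]

if __name__ == "__main__":
    print(to_ntriples(EXAMPLE))
```

Every statement names its subject and property by URI, so the data carries its own context and can be reused by any application that reads RDF.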
VIVO is not just people or profiles
• Anything can be a type (and have individuals)
• All individuals have the same structure
– Varying attributes & relationships
– Inheritance
• Extend the ontology without modifying the app
– Tradeoffs of generality vs. optimal interface
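The idea that types and inheritance live in the data, not in the application, can be sketched as follows. This is a simplified illustration and not VIVO's implementation: the class names and individual identifiers are hypothetical, and each class is assumed to have a single parent, whereas a real ontology allows several.

```python
# Class hierarchy expressed as data: child class -> parent class.
# All class names here are hypothetical examples.
SUBCLASS_OF = {
    "FacultyMember": "Person",
    "Person": "Agent",
    "Grant": "Agreement",
}

# Each individual records only its most specific type.
INDIVIDUAL_TYPE = {
    "n1234": "FacultyMember",
    "n5678": "Grant",
}

def all_types(individual, subclass_of=SUBCLASS_OF, individual_type=INDIVIDUAL_TYPE):
    """Return the individual's asserted type followed by every inherited type."""
    types = []
    seen = set()
    current = individual_type.get(individual)
    while current is not None and current not in seen:  # guard against cycles
        types.append(current)
        seen.add(current)
        current = subclass_of.get(current)
    return types

def individuals_of(type_name, subclass_of=SUBCLASS_OF, individual_type=INDIVIDUAL_TYPE):
    """Return every individual whose asserted or inherited types include type_name."""
    return sorted(
        i for i in individual_type
        if type_name in all_types(i, subclass_of, individual_type)
    )
```

Adding a new kind of thing means adding entries to the two dictionaries; `all_types` and `individuals_of` stay unchanged, which is the sense in which the ontology can be extended without modifying the app.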
Highlights of recent improvements
[Diagram; labels: Linked Open Data; Application; navigation; theming; scalability; MVC structure; VIVO Core Ontology; eagle-i research resources; self-editing; external authentication; Harvester; Visualizations; page templates; grants; HR data; PubMed; Drupal importer]
Deliverables by August 2011
[Diagram; labels: Linked Open Data; Application; navigation; theming; scalability; MVC structure; VIVO Core Ontology; self-editing; external authentication; Visualizations; page templates; Map of Science; GeoMap; role-based authorization; aggregator software; RDF to Solr indexer; local/national search UI; linking between VIVOs; Search-related functionalities; BioPortal submission; Harvester; more pub formats; national grant data; Drupal importer]
“National” search
• NIH mandated no reliance on sustained centralized infrastructure
• Aggregation of RDF from multiple sources
– Harvard Profiles, Collexis, and likely others
• Solr indexing leveraging the VIVO ontology
• Aggregator and indexing will be configurable to harvest any desired set of sources
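The aggregate-then-index flow described above can be sketched in a few lines. This is only an illustration under stated assumptions: the source names, URIs, and triples are hypothetical, triples are plain tuples, and the "index documents" are ordinary dictionaries standing in for whatever would actually be sent to Solr. No Solr API is used here.

```python
def aggregate(sources, enabled):
    """Merge triples from the enabled sources, dropping exact duplicates."""
    merged = set()
    for name in enabled:
        merged.update(sources[name])
    return merged

def build_index_docs(triples):
    """Group triples by subject into one flat, multi-valued document each."""
    docs = {}
    for subject, predicate, obj in sorted(triples):
        doc = docs.setdefault(subject, {"id": subject})
        doc.setdefault(predicate, []).append(obj)
    return docs

# Hypothetical sources: name -> list of (subject, predicate, object) triples.
SOURCES = {
    "vivo_a": [
        ("http://vivo.a.example/individual/n1", "rdf:type", "foaf:Person"),
        ("http://vivo.a.example/individual/n1", "rdfs:label", "Jane Doe"),
    ],
    "profiles_b": [
        ("http://profiles.b.example/person/42", "rdf:type", "foaf:Person"),
        ("http://profiles.b.example/person/42", "rdfs:label", "John Roe"),
    ],
}

if __name__ == "__main__":
    # The "enabled" list is the configuration point: index any desired set.
    docs = build_index_docs(aggregate(SOURCES, ["vivo_a", "profiles_b"]))
    for doc in docs.values():
        print(doc)
```

Because the set of sources is just a list passed in at run time, the same code could serve a national index, a regional one, or a single consortium without a fixed central registry.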
National networking & search
[Diagram; labels: Ponce VIVO; WashU VIVO; IU VIVO; Cornell Ithaca VIVO; Weill Cornell VIVO; UF VIVO; Scripps VIVO; Other VIVOs; Other CTSA VIVOs; Harvard Profiles RDF; Other RDF; VIVO aggregator triple store; Future CTSA triple store; Future state or regional triple store; Solr search index; Future CTSA Solr index; future Solr index; Linked Open Data; VIVO national network search]
VIVO at web scale
• Connections directly between VIVOs
– Multiple campuses of 1 institution
– Multiple institutions within a consortium
– Data resides & served from home institution
• Individuals linked by URI or common identifier
• Updates via linked data harvesting or pingback
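Linking individuals by a common identifier can be pictured with a small sketch. Everything here is an assumption for illustration: the record layout, the `shared_id` field name, and the URIs are hypothetical, and a real deployment would also need disambiguation for people without a shared identifier.

```python
def link_by_identifier(records_a, records_b, key="shared_id"):
    """Return (uri_a, uri_b) pairs for records that carry the same identifier."""
    # Index the second site's records by identifier, skipping records without one.
    by_id = {r[key]: r["uri"] for r in records_b if r.get(key)}
    links = []
    for r in records_a:
        ident = r.get(key)
        if ident and ident in by_id:
            links.append((r["uri"], by_id[ident]))
    return links

# Hypothetical records published by two separate VIVO installations.
SITE_A = [
    {"uri": "http://vivo.a.example/individual/n1", "shared_id": "id-0001"},
    {"uri": "http://vivo.a.example/individual/n2", "shared_id": None},
]
SITE_B = [
    {"uri": "http://vivo.b.example/individual/p9", "shared_id": "id-0001"},
    {"uri": "http://vivo.b.example/individual/p7", "shared_id": "id-0002"},
]

if __name__ == "__main__":
    for uri_a, uri_b in link_by_identifier(SITE_A, SITE_B):
        print(uri_a, "is the same individual as", uri_b)
```

Only the pair of URIs is stored as a link; each profile stays at, and is served from, its home institution.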
As the linked data cloud grows
• Search enhanced by authoritative, structured, and updated data
– Retrieval and filtering by type & relationship, not just text
– Enables better data mining and analysis
– Reduces reporting burden
• Unique semantic advantages
– Categorization implicit in defined ontologies
– Common references to shared terminologies
– ORCID and other initiatives leading to common references to individuals
Community development
• VIVOweb.org
• VIVO on SourceForge
– Fully open source (BSD license)
– Subversion repository – download or check out
– Active development and implementation mail lists & forums
– Installation and upgrade documentation
– Wiki-based documentation effort
– Supplemental materials
• Many ways to contribute and benefit
Mini-grants address key areas
• Controlled vocabularies (Stony Brook)
• Author IDs and disambiguation (ORCID)
• Widgets to re-use VIVO data in standard web pages (Duke)
• Direct output to biosketches and CVs (Pittsburgh)
• Connection to the HUBzero scientific simulation and grid services platform, via Joomla CMS (IU)
• Google Refine for data cleanup and export (Weill Cornell)
VIVO Ecosystem Evolution
Community collaborations
• ORCID
• Connections to institutional repositories, as other libraries implement VIVO
• Library of Congress support for Exhibit API with VIVO as one target
• Dataset metadata discovery and registry work, with Australian VIVO consortium
Questions yet to address
• What access points and services need to be provided for national (or international) research networking to succeed?
– How will people be able to integrate this data into their daily workflow and research process?
– How will boundaries between public and private data and services work?
• Federating group privileges as well as identities across multiple VIVOs and to other research-enabling tools
Thank you