Penn VIVO Ecosystem
Bob Krall
Michael Winkler
University of Pennsylvania
Strategic Origins of Penn VIVO
The Penn Libraries undertook a strategic agenda of deliverables
Research engagement as theme
Operate at the enterprise level
Develop researcher profiling infrastructure
Institute research data services
Develop actionable insight and intelligence of researcher activities, priorities, and needs
Establish collaboration infrastructure with researchers
Penn VIVO: an extensible set of tools that will
evolve into a containing framework
Aggregate, manage, and publish research, research data, and researcher profiles
Not an administrative mandate!
Photo credit: danblackonleadership
Insight as an Enterprise Asset
Penn becomes a Clinical & Translational
Science Award (CTSA) site
Research profile mandate
Specification of VIVO ontology
FEDS requires refactoring to support new
demands & scale
Beyond CTSA, new demands develop on
campus for activity data
Provost’s office and school administration
seek research intelligence and tracking
Competitive grant renewals
Accreditation efforts
Funding agency compliance
Focus on gaining insight into the output and
impact of faculty and researcher activities
and scholarship
Researcher data & networking is
recognized as a competitive enterprise
asset
Photo Credit: https://asunews.asu.edu/20120619_arizonabiomedicalcenter
Evolution of Penn VIVO
PSOM innovates FEDS to track researcher activities, generate C.V.s, & produce websites
ITMAT & CTSA need to satisfy NIH researcher public profile requirements
ITMAT and Libraries collaborate to implement VIVO for PSOM
Libraries establish Penn VIVO for University-wide researcher profiling system
Penn VIVO Beginnings - FEDS
Faculty Expertise Data System (FEDS)
developed in Penn School of Medicine in
2000
Designed to aggregate information on
faculty expertise for tenure and promotion
Manual input, data feeds, and semi-
automated harvesting of PubMed citations
Not a public facing system
Used by PSOM & other health science
schools at Penn
Over time, other products have been
added – Used to build web pages, C.V.s
Enhanced over the years, but no
systematic refactoring
PSOM looking at costs to extend FEDS
Photo Credit: http://commons.wikimedia.org/wiki/File:Eniac.jpg
Evolution to Penn VIVO
Data complexity
Many sources, internal & external
Data integrations require efforts for disambiguation,
resolution, and cleansing
Require more diverse set of publication sources
More types of activities and data
Diverse targets for data
New requirements for profile data and emphasis
Websites for individuals, departments, labs, centers,
schools
Individual researcher C.V.s and BioSketch
Internal intelligence, external comparatives
Integrate various administrative workflows:
Proposal preparation
Data management
Profile management
Research cores and facilities use
Compliance efforts
Publication of output
Branding
We need a tool that is stable, ergonomic, and well supported
by a community for sustainability, and that gives us
a comprehensive and flexible understanding of researcher
impact and efforts.
- Dean Denis Kinane, Penn School of Dental Medicine
Ecosystem for Researcher Data
System must be:
Enterprise level & discipline
comprehensive
Comprehensive locus of relevant data
aggregation
Ergonomic – faculty time is expensive
Hub for researcher administrative efforts
Ecosystem includes:
Diverse data sources – Enterprise, school,
external sources
Profile, website, & C.V. synchronization
Ability to aggregate & disambiguate
Tie into data management infrastructure
Facilities, equipment, protocols
Researcher ID – registration & tracking
Photo Credit: http://www.oercommons.org/courses/mimicry-the-orchid-and-the-bee/view
Why VIVO for Penn?
Previous attempts to deploy profiling services in PSOM were unsuccessful
Libraries had been investigating VIVO as an enterprise solution
VIVO provides an easy-to-understand hook for the ecosystem
VIVO is extensible enough to meet heterogeneous needs, but encourages source system rationalization
Embarked on VIVO pilot in partnership with PSOM
Why Symplectic Elements?
Initially, we needed to replace labor
intensive publication gathering
process
But we saw that Elements could:
Support FEDS functionality & more
Scale to all schools at Penn
FEDS covers only a subset of faculty &
researchers
Broader publication harvesting
Inclusion of other relevant activity
data
Provide one stop management &
aggregation hub
Host non-public data
Linking Investigators & Resources
Faith Coldren • [email protected]
Other parts of ecosystem
Developing research data services
Data management planning
Data publication & dissemination
service
Data management tool investigation
Researcher ID management service –
ORCID
Publication linking & acquisitions
Teaching & training activities
Assessment as a core service
Impact of research & activities
Collect, sustain, and expose rich set of
data on researchers and faculty
activities
Metridoc as BI infrastructure to
produce analysis of activities
Photo Credit: http://digitalheritage.org/2010/10/apples/
Why the Penn Libraries?
Decentralized, RCM budgeting and IT
environment at Penn
Libraries bring sustainability through
shared funding & partnerships
Variety of accrediting bodies
Existing research repository services
Routing publications to domain or local
repos
Capacity for research data services
○ Data management planning
○ Data standards and best practice consulting
○ Data repository service
Strategic engagement of researchers –
changing nature of academic &
research libraries
Photo Credit: http://www.upenn.edu/gazette/1112/feature5_1.html
Penn VIVO
Budgeting through
Penn’s allocated cost
model
Establishes stable and
reliable service model
Transformative for
library services
Embed library services
with researcher activities
The Libraries are uniquely centered to
support researcher and faculty
profiling services at Penn that are rich
sources of insight, easy to use and
well supported, fiscally sustainable,
and sensitive to the diverse methods
of scholarly communications.
- H. Carton Rogers, Director and Vice
Provost of Libraries
Penn VIVO
[Diagram: Penn VIVO – Parts. Labels: Sources; Elements; Faculty Enhancement & Approval; Faculty Identifiers (ORCID); Scholarly Commons; VIVO; Targets]
Sources
• Diverse set of internal & external sources
• Difficult to gather
• Inconsistent formatting
• Requires disambiguation & consistency
enhancement
• Significant effort for researchers to maintain
VIVO
• Public facing website
• Integrates with repository as avenue for
research output delivery
• Links researcher profile data together in
semantic network across boundaries
• Safely expose researcher activity data to
the network
Elements
• Scales data collection to diverse University-wide
and school-specific sources
• Automates the collection of publications
• Aggregates data for comprehensive view of
individuals, teams, labs, departments,
schools
• Ergonomic review and editing
• Hub for use and reuse of profile data
• Profile management partnership with
researchers is a platform for deeper
collaboration
Targets
• Expose Penn research to semantic discovery
• Format & publish Curriculum Vitae
• Expose network accessible researcher
profiles, NIH Biosketch, & C.V.s
• Content for individual, department, or school
webpages
• Deposit research output in Penn and/or
domain repositories – e.g. PubMed Central
• Comply with federally-mandated open access
repositories & OSTP open data requirements
• Support researcher assessment & reporting,
and build activities intelligence
Next Steps
VIVO live in July 2013
Penn joins DuraSpace as a VIVO Founding
Sponsor
Elements live in Jan 2014
Feed FEDS until C.V. production & website
generation are available
Replace FEDS during 2014
Research Data Services Hub – Fall 2013
Developed with Penn Office of Research Services
Data management planning
Data repository services
Data collection & management tools
Data publication & persistence services
Rolling deployment of Penn VIVO to the
University - 2014
Onboard next client schools
Abramson Cancer Center Proposal
Launch Penn Researcher Activities and Profiling
Services as a comprehensive set of tools
Photo Credit: Hansueli Krapf
See Penn’s VIVO in Action
Read Ivy Leaves about the VIVO Pilot with Perelman School of Medicine
http://vivo.upenn.edu/
Robert Krall – [email protected]
Michael Winkler – [email protected]
More About Penn VIVO
Introduction to VIVO
Open source semantic web
application
Enables the discovery of
researchers across institutions
Showcases researcher activities
& accomplishments
Data reside & are controlled
locally
Penn’s VIVO is live & available
Penn Libraries have joined the
VIVO community
Penn’s VIVO has more researcher
publications (213,000+) than all other
VIVO institutions combined!
VIVO is about Brands
VIVO describes:
Research
Publications
Teaching
Funding
Expertise
Facilities
Of:
Individuals
Teams
Departments
Schools
Universities
Domains
Introducing Symplectic Elements
Leading Research Information Management System
Aggregates data from heterogeneous systems – local and external
Automatic harvesting of publication data
Provides an ergonomic user interface for faculty & researchers
Secure internal system can host non-public data
Slated to replace FEDS
Feeds data to VIVO, assessment systems, websites, C.V.s, and funding agencies
ELEMENTS IS ABOUT DATA
Local data
Contact, Title, Identifiers, & Appointments
Research interests and expertise
Research facilities and resources
Research grants and results
Teaching activities
Public data
Editorial activities
Presentations & Collaborations
Researcher IDs
Publications
Peer reviewed articles
Monographs and books
Conference proceedings & Working papers
Software
Curated data sets
Elements is about Publications
Aggregates from multiple data sources
Assists in author disambiguation,
de-duplication, and publication format
classification
Discovers publications in books, articles,
conference proceedings, and patents
Sources include:
• arXiv
• CiNii
• DBLP
• Mendeley
• CrossRef
• AltMetrics
• PubMed
• RePEc
• Scopus
• Web of Science
• British Library
• Google Books
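The de-duplication step mentioned above can be illustrated with a minimal sketch. This is not how Elements works internally (the slides do not say); it assumes a simple rule: records sharing a DOI are the same publication, and records without a DOI are matched on normalized title plus year. The `Publication` class, the function names, and the sample records are all hypothetical.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Publication:
    """One harvested record (hypothetical shape, for illustration only)."""
    title: str
    year: int
    source: str                 # e.g. "PubMed", "Scopus"
    doi: Optional[str] = None   # not every source supplies a DOI


def dedup_key(pub: Publication) -> str:
    """Build a matching key: the DOI if present, else normalized title + year."""
    if pub.doi:
        return "doi:" + pub.doi.strip().lower()
    # Lowercase the title and collapse punctuation/whitespace runs to one space.
    title = re.sub(r"[^a-z0-9]+", " ", pub.title.lower()).strip()
    return f"title:{title}|{pub.year}"


def deduplicate(pubs: List[Publication]) -> Dict[str, List[str]]:
    """Group records by key, keeping the list of sources that reported each."""
    merged: Dict[str, List[str]] = {}
    for pub in pubs:
        merged.setdefault(dedup_key(pub), []).append(pub.source)
    return merged


if __name__ == "__main__":
    # Made-up records: the first two differ only in DOI case and title punctuation.
    records = [
        Publication("Semantic Web for Research Networking", 2012, "PubMed", "10.1000/XYZ123"),
        Publication("Semantic web for research networking.", 2012, "Scopus", "10.1000/xyz123"),
        Publication("A Different Paper", 2013, "CrossRef"),
    ]
    for key, sources in deduplicate(records).items():
        print(key, sources)
```

One limitation of this sketch: a record with a DOI and a copy of the same record without one get different keys and are not merged. Real systems typically add fuzzy matching and human review for such cases.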
Next Steps
Implement Elements – Due in Oct 2013
Working with Abramson Cancer Center
Demonstrate funded research output
Select next implementation partner
Vet, Dental, GSE – Currently use FEDS
Nursing – Motivated administration, teaching
SEAS – Trust relationship as early adopter
Participate in governance of VIVO community
Work with PSOM to replace FEDS – 2014
Incorporate Penn VIVO into publication repository workflow
Work with Symplectic to develop Penn C.V. template
Use Penn VIVO to produce NIH Biosketch
Roll out Penn VIVO to the University
Hansueli Krapf – Used under CC-BY License
IN SUMMARY, PENN VIVO
Improves ability to collect researcher activity data from heterogeneous sources
Provides ergonomic workflow for researchers to manage, maintain, & enhance profiles
Elements can, in time, replace FEDS, and is available to the whole of Penn
Provides platform that leverages researcher activities data
Assess research impact
Expose contributions to new knowledge
Improves collection of Penn research output
Improves compliance with federal deposit mandates
Increases researcher brand & networking
Scales to encompass more data, more schools, & more researchers
FIND OUT MORE ABOUT VIVO AT PENN
See Penn’s VIVO in Action
Read Ivy Leaves about the VIVO Pilot with Perelman School of Medicine
Learn about the VIVO Community
Talk to the Penn Libraries – [email protected]
http://vivo.upenn.edu/
DISCOVERY ENHANCED WITH SEMANTIC DATA
Google – semantics on display
Simple understanding of relationships
VIVO – built for semantic searching
Sophisticated description of researcher activities
VIVO networking – scaling discovery across institutions
VIVO IS ABOUT NETWORKS
VIVO semantically links:
People
Publications
Organizations
And more!
And supports:
Searching & browsing inside & outside institutions
Network visualizations
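As a rough illustration of what "semantically links" means, linked data can be pictured as subject–predicate–object statements that a query walks across. The sketch below uses plain Python tuples rather than real RDF; the identifiers and the predicate names `authorOf` and `memberOf` are invented for the example and are not terms from the VIVO ontology.

```python
from collections import defaultdict

# Hypothetical statements: (subject, predicate, object).
triples = [
    ("person:alice", "authorOf", "pub:paper1"),
    ("person:bob", "authorOf", "pub:paper1"),
    ("person:alice", "memberOf", "org:dept-a"),
    ("person:bob", "memberOf", "org:dept-b"),
]


def coauthors(statements, person):
    """Return everyone who shares at least one publication with `person`."""
    authors_by_pub = defaultdict(set)
    for subject, predicate, obj in statements:
        if predicate == "authorOf":
            authors_by_pub[obj].add(subject)
    found = set()
    for authors in authors_by_pub.values():
        if person in authors:
            found |= authors - {person}
    return found


if __name__ == "__main__":
    # Alice and Bob sit in different departments but are linked through a paper.
    print(coauthors(triples, "person:alice"))
```

Because the link runs through the shared publication rather than through an organizational chart, the same traversal works across departments, and in principle across institutions that publish compatible data.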
VIVO IS POWERED BY DATA
Local data
Contact, Title, Identifiers, & Appointments
Research interests and expertise
Research grants and results
Teaching activities
Public data
Editorial activities
Presentations
Collaborations
Publications
Peer reviewed articles
Monographs and books
Conference proceedings
Working papers
Software
Curated data sets
Industry standard ontology