Penn VIVO Ecosystem

Bob Krall

Michael Winkler

University of Pennsylvania

Strategic Origins of Penn VIVO

The Penn Libraries undertook a strategic agenda of deliverables

Research engagement as theme

Operate at the enterprise level

Develop researcher profiling infrastructure

Institute research data services

Develop actionable insight and intelligence of researcher activities, priorities, and needs

Establish collaboration infrastructure with researchers

Penn VIVO: an extensible set of tools that will evolve into a containing framework

Aggregate, manage, and publish research, research data, and researcher profiles

Not an administrative mandate!

Photo credit: danblackonleadership

http://danblackonleadership.info/archives/417/spyglass-on-a-map

Insight as an Enterprise Asset

Penn becomes a Clinical & Translational Science Award (CTSA) site

Research profile mandate

Specification of VIVO ontology

FEDS requires refactoring to support new demands & scale

Beyond CTSA, new demands develop on campus for activity data

Provost’s office and school administration seek research intelligence and tracking

Competitive grant renewals

Accreditation efforts

Funding agency compliance

Focus on gaining insight into the output and impact of faculty and researcher activities and scholarship

Researcher data & networking is recognized as a competitive enterprise asset

Photo Credit: https://asunews.asu.edu/20120619_arizonabiomedicalcenter

Evolution of Penn VIVO

PSOM innovates FEDS to track researcher activities, generate C.V.s, & produce websites

ITMAT & CTSA need to satisfy NIH researcher public profile requirements

ITMAT and Libraries collaborate to implement VIVO for PSOM

Libraries establish Penn VIVO for University-wide researcher profiling system

Penn VIVO Beginnings - FEDS

Faculty Expertise Data System (FEDS) developed in Penn School of Medicine in 2000

Designed to aggregate information on faculty expertise for tenure and promotion

Manual input, data feeds, and semi-automated harvesting of PubMed citations

Not a public facing system

Used by PSOM & other health science schools at Penn

Over time, other products have been added – used to build web pages, C.V.s

Enhanced over the years, but no systematic refactoring

PSOM looking at costs to extend FEDS

Photo Credit: http://commons.wikimedia.org/wiki/File:Eniac.jpg

Evolution to Penn VIVO

Data complexity

• Many sources, internal & external

• Data integrations require efforts for disambiguation, resolution, and cleansing

• Require more diverse set of publication sources

• More types of activities and data

Diverse targets for data

• New requirements for profile data and emphasis

• Websites for individuals, departments, labs, centers, schools

• Individual researcher C.V.s and BioSketch

• Internal intelligence, external comparatives

Integrate various administrative workflows:

• Proposal preparation

• Data management

• Profile management

• Research cores and facilities use

• Compliance efforts

• Publication of output

• Branding

We need a tool that is stable, ergonomic, and well supported by a community for sustainability, and that gives us a comprehensive and flexible understanding of researcher impact and efforts.

- Dean Denis Kinane, Penn School of Dental Medicine

Ecosystem for Researcher Data

System must be:

Enterprise level & discipline comprehensive

Comprehensive locus of relevant data aggregation

Ergonomic – faculty time is expensive

Hub for researcher administrative efforts

Ecosystem includes:

Diverse data sources – Enterprise, school, external sources

Profile, website, & C.V. synchronization

Ability to aggregate & disambiguate

Tie into data management infrastructure

Facilities, equipment, protocols

Researcher ID – registration & tracking

Photo Credit: http://www.oercommons.org/courses/mimicry-the-orchid-and-the-bee/view

Why VIVO for Penn?

Previous attempts to deploy profiling services in PSOM were unsuccessful

Libraries had been investigating VIVO as an enterprise solution

VIVO provides an easy-to-understand hook for the ecosystem

VIVO is extensible enough to meet heterogeneous needs, but encourages source system rationalization

Embarked on VIVO pilot in partnership with PSOM

Why Symplectic Elements?

Initially, we needed to replace a labor-intensive publication gathering process

But we saw that Elements could:

Support FEDS functionality & more

Scale to all schools at Penn

FEDS covers only a subset of faculty & researchers

Broader publication harvesting

Inclusion of other relevant activity data

Provide one stop management & aggregation hub

Host non-public data

Linking Investigators & Resources

Faith Coldren • [email protected]

Other parts of ecosystem

Developing research data services

Data management planning

Data publication & dissemination service

Data management tool investigation

Researcher ID management service – ORCID

Publication linking & acquisitions

Teaching & training activities

Assessment as a core service

Impact of research & activities

Collect, sustain, and expose rich set of data on researchers and faculty activities

Metridoc as BI infrastructure to produce analysis of activities

Photo Credit: http://digitalheritage.org/2010/10/apples/

Why the Penn Libraries?

Decentralized, RCM budgeting and IT environment at Penn

Libraries bring sustainability through shared funding & partnerships

Variety of accrediting bodies

Existing research repository services

Routing publications to domain or local repos

Capacity for research data services

○ Data management planning

○ Data standards and best practice consulting

○ Data repository service

Strategic engagement of researchers – changing nature of academic & research libraries

Photo Credit: http://www.upenn.edu/gazette/1112/feature5_1.html

Penn VIVO

Budgeting through Penn’s allocated cost model

Establishes stable and reliable service model

Transformative for library services

Embed library services with researcher activities

The Libraries are uniquely centered to support researcher and faculty profiling services at Penn that are rich sources of insight, easy to use and well supported, fiscally sustainable, and sensitive to the diverse methods of scholarly communications.

- H. Carton Rogers, Director and Vice Provost of Libraries

PENN VIVO – PARTS

[Diagram: Sources, Elements, Faculty Enhancement & Approval, Faculty Identifiers (ORCID), Scholarly Commons, VIVO, Targets]

Sources

• Diverse set of internal & external sources

• Difficult to gather

• Inconsistent formatting

• Requires disambiguation & consistency enhancement

• Significant effort for researchers to maintain

VIVO

• Public facing website

• Integrates with repository as avenue for research output delivery

• Links researcher profile data together in semantic network across boundaries

• Safely expose researcher activity data to the network

Elements

• Scales data collection to diverse University-wide and school-specific sources

• Automates the collection of publications

• Aggregates data for comprehensive view of individuals, teams, labs, departments, schools

• Ergonomic review and editing

• Hub for use and reuse of profile data

• Profile management partnership with researchers is a platform for deeper collaboration

Targets

• Expose Penn research to semantic discovery

• Format & publish Curriculum Vitae

• Expose network accessible researcher profiles, NIH Biosketch, & C.V.s

• Content for individual, department, or school webpages

• Deposit research output in Penn and/or domain repositories – e.g. PubMed Central

• Comply with federally-mandated open access repositories & OSTP open data requirements

• Support researcher assessment & reporting, and build activities intelligence

Next Steps

VIVO live in July 2013

Penn joins DuraSpace as a VIVO Founding Sponsor

Elements live in Jan 2014

Feed FEDS until C.V. production & website generation are available

Replace FEDS during 2014

Research Data Services Hub – Fall 2013

Developed with Penn Office of Research Services

Data management planning

Data repository services

Data collection & management tools

Data publication & persistence services

Rolling deployment of Penn VIVO to the University – 2014

Onboard next client schools

Abramson Cancer Center Proposal

Launch Penn Researcher Activities and Profiling Services as a comprehensive set of tools

Photo Credit: Hansueli Krapf

http://commons.wikimedia.org/wiki/File:2012-01-11_11-40-34_Spain_Canarias_Jand%C3%ADa.jpg

More About Penn VIVO

See Penn’s VIVO in Action: https://vivo.upenn.edu/

Read Ivy Leaves about the VIVO Pilot with Perelman School of Medicine: http://www.library.upenn.edu/docs/publications/ivyleaves/ILfall2012.pdf

Robert Krall – [email protected]

Michael Winkler – [email protected]


Introduction to VIVO

Open source semantic web application

Enables the discovery of researchers across institutions

Showcases researcher activities & accomplishments

Data reside & are controlled locally

Penn’s VIVO is live & available

Penn Libraries have joined the VIVO community

Penn’s VIVO has more researcher publications (213,000+) than all other VIVO institutions combined!

http://beta.vivosearch.org/statistics

VIVO is about Brands

VIVO describes:

• Research

• Publications

• Teaching

• Funding

• Expertise

• Facilities

Of:

• Individuals

• Teams

• Departments

• Schools

• Universities

• Domains

Introducing Symplectic Elements

Leading Research Information Management System

Aggregates data from heterogeneous systems – local and external

Automatic harvesting of publication data

Provides an ergonomic user interface for faculty & researchers

Secure internal system can host non-public data

Slated to replace FEDS

Feeds data to VIVO, assessment systems, websites, C.V.s, and funding agencies

ELEMENTS IS ABOUT DATA

Local data

• Contact, Title, Identifiers, & Appointments

• Research interests and expertise

• Research facilities and resources

• Research grants and results

• Teaching activities

Public data

• Editorial activities

• Presentations & Collaborations

• Researcher IDs

Publications

• Peer reviewed articles

• Monographs and books

• Conference proceedings & Working papers

• Software

• Curated data sets

Elements is about Publications

Aggregates from multiple data sources

Assists in author disambiguation, de-duplication, and publication format classification

Discovers publications in books, articles, conference proceedings, and patents

Sources include:

• arXiv

• CiNii

• Dblp

• Mendeley

• CrossRef

• AltMetrics

• PubMed

• RePEc

• Scopus

• Web of Science

• British Library

• Google Books
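
The aggregation and de-duplication step on this slide can be illustrated with a minimal Python sketch. It is not how Symplectic Elements works internally; the record fields (`source`, `title`, `year`, `doi`), the matching rule, and the sample records are all assumptions made up for illustration.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def dedupe(records):
    """Merge records that share a DOI, or a normalized title and year.

    A record with a DOI is keyed only by that DOI, so it will not merge
    with a DOI-less copy of the same paper; a real system needs more care.
    """
    merged = {}
    for rec in records:
        doi = (rec.get("doi") or "").strip().lower()
        if doi:
            key = ("doi", doi)
        else:
            key = ("title", normalize_title(rec["title"]), rec.get("year"))
        entry = merged.setdefault(
            key,
            {"title": rec["title"], "year": rec.get("year"),
             "doi": doi or None, "sources": []},
        )
        entry["sources"].append(rec["source"])
    return list(merged.values())


# Made-up records for illustration only; not real Penn data.
records = [
    {"source": "PubMed", "title": "An Example Study of Orchids", "year": 2012, "doi": "10.1234/EXAMPLE.1"},
    {"source": "Scopus", "title": "An example study of orchids.", "year": 2012, "doi": "10.1234/example.1"},
    {"source": "arXiv", "title": "A Working Paper on Bees", "year": 2013, "doi": None},
    {"source": "RePEc", "title": "A working paper on bees", "year": 2013, "doi": None},
]

for pub in dedupe(records):
    print(pub["title"], pub["sources"])
```

Each merged entry keeps the list of sources it was found in, which is the kind of provenance a researcher would want when reviewing a harvested publication.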


Next Steps

Implement Elements – Due in Oct 2013

• Working with Abramson Cancer Center

• Demonstrate funded research output

Select next implementation partner

• Vet, Dental, GSE – Currently use FEDS

• Nursing – Motivated administration, teaching

• SEAS – Trust relationship as early adopter

Participate in governance of VIVO community

Work with PSOM to replace FEDS – 2014

Incorporate Penn VIVO into publication repository workflow

Work with Symplectic to develop Penn C.V. template

Use Penn VIVO to produce NIH Biosketch

Roll out Penn VIVO to the University

Hansueli Krapf – Used under CC-BY License

http://commons.wikimedia.org/wiki/File:2012-01-11_11-40-34_Spain_Canarias_Jand%C3%ADa.jpg

IN SUMMARY, PENN VIVO

Improves ability to collect researcher activity data from heterogeneous sources

Provides ergonomic workflow for researchers to manage, maintain, & enhance profiles

• Elements can, in time, replace FEDS, and is available to the whole of Penn

Provides platform that leverages researcher activities data

• Assess research impact

• Expose contributions to new knowledge

Improves collection of Penn research output

Improves compliance with federal deposit mandates

Increases researcher brand & networking

Scales to encompass more data, more schools, & more researchers

FIND OUT MORE ABOUT VIVO AT PENN

See Penn’s VIVO in Action

Read Ivy Leaves about the VIVO Pilot with Perelman School of Medicine: http://www.library.upenn.edu/docs/publications/ivyleaves/ILfall2012.pdf

Learn about the VIVO Community: http://vivoweb.org/

Talk to the Penn Libraries – [email protected]


DISCOVERY ENHANCED WITH SEMANTIC DATA

Google – semantics on display

Simple understanding of relationships

VIVO – built for semantic searching

Sophisticated description of researcher activities

VIVO networking – scaling discovery across institutions

https://google.com/

http://vivosearch.org/

VIVO IS ABOUT NETWORKS

VIVO semantically links

• People

• Publications

• Organizations

• And more!

And supports

• Searching & browsing inside & outside institutions

• Network visualizations
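
The idea of semantically linking people, publications, and organizations can be sketched with plain subject-predicate-object triples in Python. This is only an illustration: the identifiers and the predicate names `authorOf` and `memberOf` are hypothetical placeholders, not terms from the VIVO ontology, and VIVO itself stores such data in an RDF triple store rather than a Python list.

```python
from collections import defaultdict

# Hypothetical triples (subject, predicate, object); all identifiers and
# predicate names are invented for this sketch.
TRIPLES = [
    ("person:a", "authorOf", "pub:1"),
    ("person:b", "authorOf", "pub:1"),
    ("person:a", "memberOf", "org:x"),
    ("person:c", "authorOf", "pub:2"),
    ("person:b", "authorOf", "pub:2"),
]


def coauthors(triples, person):
    """Return the sorted list of people sharing a publication with `person`."""
    authors_by_pub = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == "authorOf":
            authors_by_pub[obj].add(subject)
    found = set()
    for authors in authors_by_pub.values():
        if person in authors:
            found |= authors - {person}
    return sorted(found)


print(coauthors(TRIPLES, "person:b"))
```

Because every fact is a link between two identified things, a co-author network falls out of a simple traversal; the same links are what network visualizations and cross-institution search build on.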

VIVO IS POWERED BY DATA

Local data

• Contact, Title, Identifiers, & Appointments

• Research interests and expertise

• Research grants and results

• Teaching activities

Public data

• Editorial activities

• Presentations

• Collaborations

Publications

• Peer reviewed articles

• Monographs and books

• Conference proceedings

• Working papers

• Software

• Curated data sets

Industry standard ontology
