Taming the Wilderness of Open Research Information
DESCRIPTION
Results of a project with students at Hochschule Hannover, plus current development work on research information (VIVO) at the Open Science Lab of TIB. Talk by Dr. Ina Blümel and Gabriel Birke at the i-Know conference 2014 (https://i-know.tugraz.at/) on September 18, 2014 in Graz.
TRANSCRIPT
Dr. Ina Blümel, Gabriel Birke
i-Know conference
September 18, 2014
Taming the Wilderness of Open Research Information
Student project at HS Hannover, participants: Wendinda Carine Donessonne, Felix Kommnick, Elena Liventsova, Rahima Medshid, Bengt Olschewski, Anna Petersmeier, Tatiana Walther, Jana Wolf
Research Information: Paradigms
• Institutional
  • research management as driving force: reporting tools, etc.
  • mostly proprietary CRIS implementations at institutional and partly national level (Pure, Converis, et al.)
  • “closed world”
• Community-based / discovery layer
  • merging & linking research information from various sources
  • supporting scientists to establish networks, see success of ResearchGate, academia.edu, etc.
VIVO
• Model for linkable research information with LOD ontologies
• Open source software
• Originally developed at Cornell with NSF funding, now supported by a consortium at DuraSpace
• Numerous implementations, previously primarily in the English-language bio/medical area (CTSA)
• Research profiles, visualisations, …
“feed” VIVO
1. External data sources, esp. websites (harvesting)
2. Internal data sources (Web API or other type of access)
3. Individual customization to suit professional needs
Challenge:
• From the vast array of research information objects on the web to structured research information
• If possible, automatically
Sources
Science 2.0 community
• Websites with publications, projects, information about organizations, persons, ...
• with structured and unstructured information
Identify websites with repetitive, similarly structured content: these are worth setting up a harvesting pipeline for!
Setting, Task
• 16-week project
• 6th-semester bachelor students of library and information science
• supported by an information scientist and a computer scientist
• identify and document research information items on the websites
• map to the VIVO ontology
• certain steps were redefined or split up while the project was running, according to the students' needs and prior knowledge
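As a rough illustration of the "map to the VIVO ontology" step, the sketch below turns one harvested publication record into N-Triples. It assumes the BIBO vocabulary (which VIVO reuses) for the publication type and volume, and rdfs:label for the title. The record layout, the function names and the example.org URI are hypothetical, not the project's actual pipeline.

```python
# Minimal sketch: serialize one harvested record as N-Triples.
# Assumptions: bibo:AcademicArticle / bibo:volume for typing and volume,
# rdfs:label for the title; the record dict layout is hypothetical.
BIBO = "http://purl.org/ontology/bibo/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def literal(value):
    """Quote a plain string literal, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def publication_to_ntriples(uri, record):
    """Return a list of N-Triples lines describing one publication."""
    triples = [
        f"<{uri}> <{RDF_TYPE}> <{BIBO}AcademicArticle> .",
        f"<{uri}> <{RDFS_LABEL}> {literal(record['title'])} .",
    ]
    # The volume is optional in harvested data, so only emit it when present.
    if record.get("volume"):
        triples.append(f"<{uri}> <{BIBO}volume> {literal(record['volume'])} .")
    return triples


if __name__ == "__main__":
    example = {"title": "An example article", "volume": "12"}
    for line in publication_to_ntriples("http://example.org/individual/pub1", example):
        print(line)
```

A real mapping would also have to cover authorships, journals and dates, which VIVO models as separate linked individuals.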
Steps
Challenges
• inconsistent publication data, entered as freeform text in CMS, e.g., up to 13 different versions of journal volume representation
• templates don’t provide research information (RI) in machine-readable formats
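To make the freeform-text problem concrete, here is a small sketch of how differently written journal volumes might be reduced to one value. The variants it handles ("Vol. 12", "Jg. 12", "Bd.12", "12(3)", ...) are assumed examples, not the 13 versions actually found in the project, and `normalize_volume` is a hypothetical helper.

```python
import re

# Labelled forms such as "Vol. 12", "Volume 12", "Jg. 12", "Band 12" (assumed variants).
LABELLED = re.compile(
    r"\b(?:volume|vol|jahrgang|jg|band|bd)\.?\s*(\d+)", re.IGNORECASE
)
# Unlabelled forms where the string simply starts with the number, e.g. "12(3)".
BARE = re.compile(r"^\s*(\d+)")


def normalize_volume(text):
    """Return the volume number as a string, or None if none is recognized."""
    match = LABELLED.search(text) or BARE.search(text)
    return match.group(1) if match else None


if __name__ == "__main__":
    for raw in ["Vol. 12", "Jg. 12 (2013)", "Bd.12", "12(3)", "2013, Volume 12"]:
        print(repr(raw), "->", normalize_volume(raw))
```

Checking the labelled pattern first keeps a leading year from being mistaken for the volume. Strings that match neither pattern return None and would need manual review.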
Challenges
• Variable content, stable structures
• Duplicates with different structure (publications, persons, …)
http://www.hiig.de/ausgewahlte-publikationen/
http://www.hiig.de/ausgewaehlte-veroffentlichungen/
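One possible way to catch duplicates that arrive with different structure is to compare records on a normalized key instead of their raw text. The sketch below assumes each harvested record is a dict with `title` and `year` fields. This layout and the helper names are hypothetical.

```python
import re
import unicodedata


def dedup_key(title, year):
    """Build a comparison key that ignores case, diacritics and punctuation."""
    decomposed = unicodedata.normalize("NFKD", title.casefold())
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return (re.sub(r"[^a-z0-9]+", "", without_marks), str(year))


def deduplicate(records):
    """Keep the first record seen for each key, preserving input order."""
    seen = {}
    for record in records:
        seen.setdefault(dedup_key(record["title"], record["year"]), record)
    return list(seen.values())


if __name__ == "__main__":
    harvested = [
        {"title": "Open Science: Ein Überblick", "year": 2013},
        {"title": "open science - ein überblick.", "year": "2013"},
        {"title": "Another Title", "year": 2014},
    ]
    print(len(deduplicate(harvested)))
```

Title plus year is a deliberately coarse key. It misses duplicates whose titles were abbreviated or translated, so it is a first filter at best.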
Man and machine drawing same conclusions?
http://www.hiig.de/kooperationen/
Partners are marked with a logo (image). Luckily, “alt” attributes are available.
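Reading partner names from logo images could look roughly like the sketch below, which collects the non-empty alt text of every img element using Python's standard html.parser module. The sample markup is invented for illustration and the real page structure may differ.

```python
from html.parser import HTMLParser


class LogoAltCollector(HTMLParser):
    """Collect the non-empty alt text of all <img> elements."""

    def __init__(self):
        super().__init__()
        self.partner_names = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        # attrs is a list of (name, value) pairs; value may be None.
        alt = dict(attrs).get("alt")
        if alt and alt.strip():
            self.partner_names.append(alt.strip())


def extract_partner_names(html):
    collector = LogoAltCollector()
    collector.feed(html)
    return collector.partner_names


if __name__ == "__main__":
    # Invented sample markup, not taken from the actual website.
    sample = (
        '<ul><li><img src="a.png" alt="Example University"></li>'
        '<li><img src="b.png" alt=""></li>'
        '<li><img src="c.png"/></li></ul>'
    )
    print(extract_partner_names(sample))
```

In practice the collector would need to be restricted to the part of the page that holds the partner list, since site logos and decorative images carry alt text too.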
Results
• Discovery layer with aggregated research information
• Also approach for bootstrapping institutional research information systems from available web sources
• no substitute, but complementary to those systems
Some activities (besides VIVO implementation)
• Community building
  • VIVOcamp13, first workshop for EU VIVO community, SWIB13 satellite, November 2013
  • VIVO Bootcamp at ELAG Conference (European Library Automation Group) in Bath, June 2014, “hands-on”
  • euroCRIS LOD group participation
• Policy & Standards Making: Position paper DINI AG FIS
• Supervising bachelor thesis for extending VIVO ontology
• DFG application: “German Academic Web”
Thank you for your attention!