Rapid Organisation of Health Research Data (ROHRD) Phase 1
Michael Soljak (1), Vasa Curcin (2), Azeem Majeed (1), Yutong Cai (1)
(1) Department of Primary Care and Public Health, (2) Department of Computing, Imperial College London
rapidhealthdata.wordpress.com
Overview
Public health, biomedical and health services research will be critical for the UK’s future development. However, the data sources used in research vary significantly in quality, coding dictionaries, security restrictions, and access methods. ROHRD will introduce a systematic, semantically enabled archive of the health data sets available within different medical departments at Imperial College, together with standardized profiling services for data investigation, with a view to facilitating internal collaborations and external collaborations with other academic institutions and the NHS.
Methods
Structured interviews with stakeholders and data owners, supported by data gathering infrastructure, initially using EndNote’s database reference function, which is widely used and available amongst health researchers
An ontology for capturing the data source metadata: we are investigating the use of the Protégé ontology software and, within it, SNOMED-CT clinical terms (http://www.ihtsdo.org/snomed-ct/) to enable interoperability
Provenance information tracking the transformations applied to raw source data to produce user-accessible specialised result datasets
Standardized analytical workflows for data profiling and associated web reporting components
A data access and analysis portal to the repository for the ICL School of Public Health
Online and group-based training to access and use data via the portal
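The provenance tracking and data profiling listed under Methods could be sketched roughly as follows. This is a minimal illustration only, not the project's implementation: the class names `ProvenanceStep` and `TrackedDataset`, the `profile` function, and the example records are all hypothetical and invented for this sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class ProvenanceStep:
    """One transformation applied to a dataset (hypothetical structure)."""
    description: str
    rows_in: int
    rows_out: int
    applied_at: str


@dataclass
class TrackedDataset:
    """Rows plus the ordered log of transformations that produced them."""
    name: str
    rows: list
    provenance: list = field(default_factory=list)

    def transform(self, description, func):
        """Apply func to the rows and return a new dataset with the step logged."""
        new_rows = func(self.rows)
        step = ProvenanceStep(
            description=description,
            rows_in=len(self.rows),
            rows_out=len(new_rows),
            applied_at=datetime.now(timezone.utc).isoformat(),
        )
        return TrackedDataset(self.name, new_rows, self.provenance + [step])


def profile(dataset):
    """Return a basic profile: row count and missing-value count per field."""
    fields = sorted({key for row in dataset.rows for key in row})
    missing = {
        f: sum(1 for row in dataset.rows if row.get(f) is None) for f in fields
    }
    return {"rows": len(dataset.rows), "missing": missing}


# Invented example records, not real patient data.
raw = TrackedDataset("example_practice_extract", [
    {"practice": "A", "age": 54, "sbp": 142},
    {"practice": "A", "age": None, "sbp": 131},
    {"practice": "B", "age": 67, "sbp": None},
])

with_age = raw.transform(
    "drop rows with missing age",
    lambda rows: [r for r in rows if r["age"] is not None],
)

print(profile(with_age))
for step in with_age.provenance:
    print(step.description, step.rows_in, "->", step.rows_out)
```

Each transformation returns a new dataset rather than modifying the raw one, so the source data stays untouched and the result carries a full record of how it was derived.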
Objectives
To develop metadata for current and planned data sources at Imperial’s Department of Primary Care and Public Health (PCPH)
To produce a data discovery report for the research data held by the School of Public Health
To develop a Research Data Management Plan for PCPH
To integrate the PCPH primary care and national data repositories within the same access framework
To create a generic, extensible data access portal
To develop standardized analytical workflows for standard aggregation, filtering, selection and reporting tasks from the data
To produce policies and forms for reuse of aggregated health research data
Key challenges
Confidentiality issues in dealing with person-level healthcare data: current national information governance policy requires this to be held on a standalone workstation.
Local policies approved by data providers (e.g. NHS IC) for reusing or repurposing health data are not fit for the digital age.
Lack of methods for interactive investigation of available data sets.
Guaranteeing the high-quality research data needed to increase the impact of research and to maintain excellence.
Contacts
For further details visit: rapidhealthdata.wordpress.com
Dr Michael Soljak, [email protected]
Dr Vasa Curcin, [email protected]
Mr Yutong Cai, [email protected]
A project funded under JISC’s Managing Research Data Programme 2011-13