

Rapid Organisation of Health Research Data (ROHRD) Phase 1

Michael Soljak¹, Vasa Curcin², Azeem Majeed¹, Yutong Cai¹

¹ Department of Primary Care and Public Health, ² Department of Computing, Imperial College London

rapidhealthdata.wordpress.com

Overview

Public health, biomedical and health services research will be critical for the UK's future development. However, the data sources used in research vary significantly in quality, coding dictionaries, security restrictions, and access methods. ROHRD will introduce a systematic, semantically-enabled archive of the health data sets available within different medical departments at Imperial College, together with standardized profiling services for data investigation, with a view to facilitating internal collaborations and external collaborations with other academic institutions and the NHS.

Methods

Structured interviews with stakeholders and data owners, supported by data gathering infrastructure, initially using EndNote's database reference function, which is widely used and available amongst health researchers

Ontology for capturing the data source metadata: we are investigating the use of Protégé ontology software, and within it the use of SNOMED-CT clinical terms (http://www.ihtsdo.org/snomed-ct/) to enable interoperability

Provenance information tracking the transformations applied to raw source data to produce user-accessible specialised result datasets

Standardized analytical workflows for data profiling and associated web reporting components

A data access and analysis portal to the repository for the ICL School of Public Health

Online and group-based training to access and use data via the portal
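The metadata and provenance items above can be illustrated with a minimal sketch. It is not the project's ontology or implementation: the class names, field names and sample values (DatasetRecord, ProvenanceStep, "example_raw_extract" and so on) are hypothetical, chosen only to show how a record for a derived dataset might carry the chain of transformations applied to its raw source.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProvenanceStep:
    """One transformation applied on the way from raw data to a result dataset."""
    description: str
    performed_by: str


@dataclass
class DatasetRecord:
    """Hypothetical metadata record for one data source in the archive."""
    name: str
    owner: str
    coding_dictionary: str  # e.g. "SNOMED CT"
    access_method: str
    provenance: List[ProvenanceStep] = field(default_factory=list)

    def derive(self, new_name: str, step: ProvenanceStep) -> "DatasetRecord":
        """Return a record for a derived dataset, extending the provenance chain."""
        return DatasetRecord(
            name=new_name,
            owner=self.owner,
            coding_dictionary=self.coding_dictionary,
            access_method=self.access_method,
            provenance=self.provenance + [step],
        )


# Illustrative values only; none of these names come from the project.
raw = DatasetRecord("example_raw_extract", "PCPH", "SNOMED CT", "standalone workstation")
aggregated = raw.derive(
    "example_practice_counts",
    ProvenanceStep("aggregated person-level rows to practice level", "analyst"),
)

for step in aggregated.provenance:
    print(f"{aggregated.name}: {step.description} (by {step.performed_by})")
```

Deriving a new record rather than modifying the original leaves the raw source's metadata untouched, so each result dataset keeps its own complete history.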

Objectives

To develop metadata for current and planned data sources at Imperial's Department of Primary Care and Public Health (PCPH)

To produce a data discovery report for the research data held by the School of Public Health

To develop a Research Data Management Plan for PCPH

To integrate the PCPH primary care and national data repositories within the same access framework

To create a generic, extensible data access portal

To develop standardized analytical workflows for standard aggregation, filtering, selection and reporting tasks from the data

To produce policies and forms for reuse of aggregated health research data
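As a rough illustration of what a standardized workflow for filtering, selection, aggregation and reporting could look like, the sketch below chains the four steps over a handful of invented rows. The function name run_workflow, its parameters and the sample data are assumptions made for this example and do not describe the project's actual workflows or data.

```python
from collections import Counter
from typing import Callable, Dict, List

Row = Dict[str, str]


def run_workflow(
    rows: List[Row],
    predicate: Callable[[Row], bool],
    columns: List[str],
    group_by: str,
) -> List[str]:
    """Filter rows, keep selected columns, count per group, and format a report."""
    filtered = [r for r in rows if predicate(r)]                 # filtering
    selected = [{c: r[c] for c in columns} for r in filtered]    # selection
    counts = Counter(r[group_by] for r in selected)              # aggregation
    return [f"{key}: {n}" for key, n in sorted(counts.items())]  # reporting


# Invented sample rows; an empty code stands for a missing value.
rows = [
    {"practice": "A", "code": "X1"},
    {"practice": "A", "code": ""},
    {"practice": "B", "code": "X1"},
    {"practice": "B", "code": "Y2"},
]

report = run_workflow(
    rows,
    predicate=lambda r: r["code"] != "",
    columns=["practice", "code"],
    group_by="practice",
)
print(report)
```

Only aggregated counts leave the function, which fits the objective of producing reusable aggregated outputs rather than person-level rows.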

Key challenges

Confidentiality issues in dealing with person-level healthcare data: current national information governance policy requires this to be held on a standalone workstation.

Local policies approved by data providers (e.g. NHS IC) for reusing or repurposing health data are not fit for the digital age.

Lack of methods for interactive investigation of available data sets.

Guaranteeing the high-quality research data needed to increase the impact of research and to maintain excellence.

Contacts

For further details visit: rapidhealthdata.wordpress.com

Dr Michael Soljak, [email protected]

Dr Vasa Curcin, [email protected]

Mr Yutong Cai, [email protected]

A project funded under JISC’s Managing Research Data Programme 2011-13
