Comparison Among EU-ADR, OMOP, Mini-Sentinel and MATRICE Strategies for Data Extraction and Management
Rosa Gini (1) (2), Patrick B Ryan, Jeffrey S Brown, Edoardo Vacchi, Massimo Coppola, Walter Cazzola, Preciosa M Coloma, Roberto Berni, Gayo Diallo, Paul Avillach, Gianluca Trifiro, Jose L Oliveira, Peter R Rijnbeek, Johan van der Lei, Miriam CJM Sturkenboom, Martijn J Schuemie
(1) Agenzia regionale di sanita della Toscana, Florence
(2) Department of Medical Informatics, Erasmus Medical Center, Rotterdam
Montreal, August 2013
Comparison of Data Management in Networks
Disclosure
My PhD is funded by the Italian Ministry of Health in the context of the MATRICE project
All coauthors participate in one or more of the EU-ADR, MATRICE, Mini-Sentinel and OMOP networks
The content of this presentation is pertinent to the objectives of EMIF, a project involving some coauthors and myself and funded by the Innovative Medicines Initiative, a joint undertaking between the European Union and the pharmaceutical industry association EFPIA
The final version of the presentation is my sole responsibility
Challenges in studies from a network of databases
[Diagram: causal graph with nodes E, Y and C]
Studies investigate causality between exposure E and response Y by measuring association conditional on observed confounding C
Challenges in studies from a network of databases
[Diagram: causal graph with nodes E, Y, ME, MY and U]
In a database study
E and Y are replaced by their measurements ME and MY recorded in the database
Deriving ME from E and MY from Y might add unobserved confounding
Challenges in studies from a network of databases
[Diagram: E and Y with database-specific measurements M1E, M1Y, M2E, M2Y, M3E, M3Y, M4E, M4Y, M5E, M5Y, M6E, M6Y]
In a network, database-specific data derivation must be handled
specific data items are collected from local healthcare and public health data sources
individual-level data cannot be pooled: data management is local and must be documented so that investigators have the relevant information at hand
technological solutions are adopted
Objective of this presentation
Introduce a conceptual framework to compare how this process is handled in four networks
Four networks
OMOP, EU-ADR, MATRICE, Mini-Sentinel
Classified by domain: Pharmacoepi/Safety vs Public Health/Health Services Research
Classified by scope: National vs International
Conceptual framework
[Diagram: D1 Original DBs -> T1 Reorganization -> D2 Global schema -> T2 Data derivation -> D3 Derived data -> T3 Implementation of study design -> D4 Datasets for analysis]
Split the local data management into three data transformation steps
T1 Reorganization
T2 Data derivation
T3 Implementation of study design
T1: what is reorganization?
Original data: understand the data items originally collected in each node of the network
Global schema: create a global schema
Lossless map: create a map from the original local data into the global schema, with no loss of information
Possibly recode: duplicate coded data items into a common coding system
Study independent: does not depend on the specific study
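The T1 step can be sketched as follows. This is a minimal illustration and not the procedure of any of the four networks: the local field names (`id`, `admission`, `diag`, `coding_system`), the `GlobalDiagnosis` record and the one-entry recoding map are all hypothetical. The point is that the original code is kept verbatim (lossless map) and the recoded value is stored as a duplicate next to it.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical recoding map, for illustration only (not taken from any network).
# ICD-10 I21 and ICD-9-CM 410 both denote acute myocardial infarction.
ICD10_TO_ICD9CM = {"I21": "410"}


@dataclass(frozen=True)
class GlobalDiagnosis:
    """One diagnosis record in a hypothetical global schema."""
    person_id: str
    start_date: str
    original_code: str           # kept verbatim, so the map loses no information
    original_system: str
    common_code: Optional[str]   # recoded duplicate; None when no mapping is known


def reorganize(local_row: dict) -> GlobalDiagnosis:
    """Map one local record into the global schema, independently of any study."""
    code = local_row["diag"]
    return GlobalDiagnosis(
        person_id=local_row["id"],
        start_date=local_row["admission"],
        original_code=code,
        original_system=local_row["coding_system"],
        common_code=ICD10_TO_ICD9CM.get(code),
    )


row = {"id": "P1", "admission": "2012-03-01", "diag": "I21", "coding_system": "ICD-10"}
record = reorganize(row)
unmapped = reorganize({**row, "diag": "Z99"})
```

Keeping `original_code` next to `common_code` means a later study can still apply an algorithm written against the local coding system.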
T2: what is data derivation?
Create study variables: study variables that are not among the data items originally collected need to be derived
Examples: acute myocardial infarction, upper gastro-intestinal bleeding, diabetes
Algorithms: to derive new variables, apply algorithms to the data items represented in the global schema
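A derivation algorithm of this kind might look like the sketch below. It is an assumption-laden example and not a validated case definition: the record layout loosely follows the HOSP table of the MATRICE global schema, the function `derive_ami` is hypothetical, and the code list (ICD-9-CM codes starting with 410) is chosen only for illustration.

```python
from datetime import date


def derive_ami(hosp_records, code_prefixes=("410",), use_secondary=False):
    """Return {person_id: date of first hospitalisation matching the code list}.

    Illustrative algorithm only: the code list and the choice between main and
    secondary diagnoses are assumptions, not a validated definition.
    """
    first_event = {}
    for rec in hosp_records:
        codes = [rec["main_diagnosis"]]
        if use_secondary:
            codes += rec.get("sec_diagnoses", [])
        if any(code.startswith(code_prefixes) for code in codes):
            pid, start = rec["person_id"], rec["start_date"]
            if pid not in first_event or start < first_event[pid]:
                first_event[pid] = start
    return first_event


# Made-up records in the global schema.
hosp = [
    {"person_id": "P1", "start_date": date(2012, 5, 2), "main_diagnosis": "410.1", "sec_diagnoses": []},
    {"person_id": "P1", "start_date": date(2011, 1, 9), "main_diagnosis": "410.9", "sec_diagnoses": []},
    {"person_id": "P2", "start_date": date(2012, 7, 7), "main_diagnosis": "428.0", "sec_diagnoses": ["410.7"]},
]

narrow = derive_ami(hosp)                     # main diagnosis only
broad = derive_ami(hosp, use_secondary=True)  # main or secondary diagnosis
```

The two calls give different study variables from the same data, which is why the choice of algorithm needs to be documented and validated.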
T3: what is implementation of study design?
Avoid pooling individual-level data: perform locally as much data preparation as possible
Operations: person-time splitting, matching, aggregating, (estimating?)
Transformation results: might be local estimates or datasets for further pooled analysis
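The aggregating operation can be sketched as below, assuming each node holds rows with hypothetical fields `exposed`, `event` and `days_at_risk`. Only the aggregated table leaves the node, and the pooled analysis works on the sum of the tables. The function names are illustrative and do not correspond to Jerboa, TheMatrix or any other tool.

```python
from collections import defaultdict


def aggregate_locally(person_rows):
    """Collapse individual-level rows into events and person-years per exposure level."""
    totals = defaultdict(lambda: {"events": 0, "person_years": 0.0})
    for row in person_rows:
        cell = totals[row["exposed"]]
        cell["events"] += row["event"]
        cell["person_years"] += row["days_at_risk"] / 365.25
    return dict(totals)


def pool(aggregated_tables):
    """Sum the aggregated tables received from the nodes."""
    pooled = defaultdict(lambda: {"events": 0, "person_years": 0.0})
    for table in aggregated_tables:
        for exposure, cell in table.items():
            pooled[exposure]["events"] += cell["events"]
            pooled[exposure]["person_years"] += cell["person_years"]
    return dict(pooled)


# Made-up individual-level data held at two nodes.
node_a = [
    {"exposed": True, "event": 1, "days_at_risk": 365.25},
    {"exposed": False, "event": 0, "days_at_risk": 730.5},
]
node_b = [{"exposed": True, "event": 0, "days_at_risk": 365.25}]

pooled = pool([aggregate_locally(node_a), aggregate_locally(node_b)])
rate_exposed = pooled[True]["events"] / pooled[True]["person_years"]
```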
Comparing T1: reorganization
Original data: rather homogeneous in the two national networks (Mini-S, MATRICE); in OMOP, combinations of EMR and claims data; in EU-ADR, all sorts of combinations
Global schema: in EU-ADR and MATRICE, tables organized per setting of data collection (hospitals vs GPs vs pharmacies etc.); in OMOP, tables organized per content (diagnoses vs procedures vs drugs etc.); in Mini-S, a mixed approach
Recoding in global schema: OMOP yes, others no
Documentation: in EU-ADR, informal documents; in the others, transformation executed by coded procedures in SQL, SAS or ad-hoc programming languages
Metadata on local context: none
Comparing T2: data derivation
Which study variables: in MATRICE, chronic diseases; in the others, mostly acute conditions
Assessment: all networks, internal/external comparison of incidence/prevalence rates
Diversity: EU-ADR exploits local diversity; the others use homogeneous algorithms
Validation: different strategies
  PPV from external gold standard: EU-ADR and Mini-S, chart review of recorded diagnostic codes/free text
  All indices from internal gold standard: MATRICE, population-based study within the network
  From performance: OMOP, performing best in terms of study results
Metadata on validity: none
Execution: in EU-ADR, local autonomous procedures; in the others, execution of a common script (SAS, SQL, novel Domain Specific Language)
Comparing T3: implementation of study design
Process: automatic in all networks
Tools: in OMOP and Mini-S, existing tools (SAS, SQL, R); in EU-ADR and MATRICE, developed ad hoc (Jerboa and TheMatrix/Morpheus)
Results: in OMOP, study results; in the others, datasets for pooled analysis
Discussion
Bias in measurement: OMOP was able to quantitatively estimate the bias in the measurement of some acute adverse drug reactions, and the bias is relevant
Not a big deal? According to evidence from OMOP and EU-ADR, improving the data derivation strategy doesn't improve detection of acute, short-term adverse reactions to drug exposure
Calibration: this is likely not the case in general. What about storing and automatically using validity indices for calibration?
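One way stored validity indices could be used for calibration is sketched below. The presentation does not name a method. The example uses the standard correction of an observed prevalence for misclassification (the Rogan-Gladen estimator), chosen here purely as an illustration, and the function name and input values are hypothetical.

```python
def calibrated_prevalence(observed, sensitivity, specificity):
    """Correct an observed prevalence for misclassification (Rogan-Gladen estimator).

    The inputs stand for validity indices that would be stored alongside the
    derived variable; storing them is an assumption of this sketch.
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    estimate = (observed + specificity - 1.0) / denom
    # Sampling error can push the estimate outside [0, 1]; clamp it.
    return min(1.0, max(0.0, estimate))


# Made-up values: 8% observed prevalence, sensitivity 0.70, specificity 0.98.
corrected = calibrated_prevalence(0.08, sensitivity=0.70, specificity=0.98)
```

With these made-up inputs the corrected prevalence is about 0.088, higher than the observed 0.08 because the algorithm misses 30% of true cases.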
Wrap up
Framework: a conceptual framework was introduced, splitting local data management into three steps
T3 Implementation of study design: very similar across networks, ad-hoc vs existing scripting tools
T1 Reorganization: EU-ADR pooled the most heterogeneous data with the least formal documentation; differences in global schemas are not substantial
T2 Data derivation: differences in the data derivation process and in the rationale for its validity
Exploiting diversity in EU-ADR: additional information
Incidence rate based on the recommended query (columns HOSP-main, GP) and incidence rate based on additional data, with % increase (columns: additional information from DEATH, additional information from concept with refinement)
Event  Database  Recommended query  Additional data (% increase)
AMI    Aarhus    101.4              126.5 (+25%)
       ARS       77.8               90.2 (+15%)
       HSD       58.7               59.1 (+0.5%)
       IPCI      148.4
       PHARMO    93.4
       Lombardy  82.5
Avillach, Coloma et al, 2012
Population-based validation study in MATRICE
[Diagram labels: MINHEALTH, LHU, GP, public key, TheMatrix, Morpheus]
At the LHU
ABC has IHD according to both algorithm 1 and 2
CBA has IHD according to 1 but not according to 2
BAC has IHD according to no algorithm
CAB has IHD according to no algorithm
Pseudonymized with the public key (A ↦ X, B ↦ Y, C ↦ W) and sent centrally
ID   IHD1  IHD2
XYW  1     1
WYX  1     0
YXW  0     0
WXY  0     0
At the GP
ABC has IHD
CBA does not have IHD
BAC has IHD
CAB does not have IHD
Pseudonymized with the public key (A ↦ X, B ↦ Y, C ↦ W) and sent centrally
ID   IHD
XYW  1
WYX  0
YXW  1
WXY  0
Centrally
ID   IHD1  IHD2  IHD
XYW  1     1     1
WYX  1     0     0
YXW  0     0     1
WXY  0     0     0
P1 has IHD according to both algorithm 1 and 2, and has a diagnosis
P2 has IHD according to 1 but not according to 2, and has no diagnosis
P3 has IHD according to no algorithm, but has a diagnosis
P4 has IHD according to no algorithm, and has no diagnosis
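Once the linked table is available centrally, all validity indices of each algorithm can be computed against the GP diagnosis. The sketch below uses the four toy records of the slide and treats the GP column `IHD` as the gold standard. The function `validity` is hypothetical and is not part of TheMatrix or Morpheus.

```python
# Central table from the slide: pseudonymized ID, two algorithms, GP diagnosis.
central = [
    {"id": "XYW", "IHD1": 1, "IHD2": 1, "IHD": 1},
    {"id": "WYX", "IHD1": 1, "IHD2": 0, "IHD": 0},
    {"id": "YXW", "IHD1": 0, "IHD2": 0, "IHD": 1},
    {"id": "WXY", "IHD1": 0, "IHD2": 0, "IHD": 0},
]


def validity(rows, algorithm, gold="IHD"):
    """Compute sensitivity, specificity, PPV and NPV of one algorithm column."""
    tp = sum(1 for r in rows if r[algorithm] and r[gold])
    fp = sum(1 for r in rows if r[algorithm] and not r[gold])
    fn = sum(1 for r in rows if not r[algorithm] and r[gold])
    tn = sum(1 for r in rows if not r[algorithm] and not r[gold])

    def ratio(num, den):
        return num / den if den else None  # undefined when the denominator is 0

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
    }


indices_1 = validity(central, "IHD1")
indices_2 = validity(central, "IHD2")
```

Because the study is population-based, negatives are observed as well as positives, so sensitivity, specificity and NPV are available in addition to PPV.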
Impact of acute myocardial infarction derivation on performance
[Figure: AUC for pairs with MDRR <= 1.25, by outcome definition]
Ryan, 2012
Risk of upper GI bleeding in patients exposed to different drugs, different derivation strategies
[Figure: relative risk estimate (y-axis, 0 to 9) for Heparin, Prednisolone, Indometacin, Ibuprofen and Aspirin, each under the UGIB, UGIB25, UGIB50 and UGIB75 definitions]
UGIB: all eligible codes to identify UGIB
UGIB25: only codes with PPV of more than 25%
UGIB50: only codes with PPV of more than 50%
UGIB75: only codes with PPV of more than 75%
Valkhoff et al, 2012
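The four definitions amount to filtering one code list by a PPV threshold, which can be sketched as follows. The code names and PPV values are placeholders invented for this example and are not the validated codes of the cited study.

```python
# Placeholder codes with made-up PPVs, for illustration only.
CODE_PPV = {"code_A": 0.90, "code_B": 0.60, "code_C": 0.30, "code_D": 0.10}


def codes_for_definition(code_ppv, min_ppv=None):
    """Return the sorted codes whose PPV is strictly above min_ppv.

    With min_ppv=None every eligible code is kept (the plain UGIB definition).
    """
    if min_ppv is None:
        return sorted(code_ppv)
    return sorted(code for code, ppv in code_ppv.items() if ppv > min_ppv)


definitions = {
    "UGIB": codes_for_definition(CODE_PPV),
    "UGIB25": codes_for_definition(CODE_PPV, 0.25),
    "UGIB50": codes_for_definition(CODE_PPV, 0.50),
    "UGIB75": codes_for_definition(CODE_PPV, 0.75),
}
```

Each stricter definition is a subset of the previous one, so raising the threshold trades sensitivity for PPV.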
Non causal associations of drugs with UGIB and ALI
Schuemie et al, 2013
A conjecture
[Diagram: causal graph with nodes Y, MY, RDC, PDC, PCL, RIC, I, E and ME]
Y: outcome; MY: measure of Y; RDC: remote direct cause of Y; PDC: proximal direct cause; RIC: remote indirect cause; PCL: proximal cause mediating RIC
OMOP
Where: US
Goal: methodological research about the use of electronic healthcare data to explore the real-world effects of medical products
What: the Observational Medical Outcomes Partnership (OMOP) is a public-private partnership initiated in 2008, managed by the Foundation for the National Institutes of Health and chaired by the Food and Drug Administration
Support: pharmaceutical industry, with active engagement from academia, industry and healthcare providers in the US and internationally
Mini-Sentinel
Where: US
Goal: create an active surveillance system to monitor the safety of FDA-regulated medical products
Funding: FDA
Network: health insurers, including several integrated delivery systems
Exploring and Understanding Adverse Drug Reactions by Integrative Mining of Clinical Records and Biomedical Knowledge (EU-ADR) Project
Where: Europe
Objective: design, development, and validation of a computerized system that exploits data from electronic healthcare records and biomedical databases for the early detection of adverse drug reactions
Funding: Information and Communication Technologies (ICT) area of the European Commission under the VII Framework Programme
Network: project completed in 2012, network still in place for new studies
Partners of the original project: Aarhus University Hospital, Aarhus Sygehus, Denmark; Agenzia regionale di Sanita, Italy; AstraZeneca AB, Sweden; Erasmus University Medical Center, Netherlands; Fundacio IMIM, Spain; Health Search - Italian College of General Practitioners, Italy; London School of Hygiene & Tropical Medicine, UK; PHARMO Cooperatie UA, Netherlands; Societa Servizi Telematici SRL, Italy; Tel-Aviv University, Israel; Universita di Milano-Bicocca, Italy; Universite Victor-Segalen Bordeaux II, France; University of Aveiro IEETA, Portugal; University of Nottingham, UK; University of Santiago de Compostela, Spain; University Pompeu Fabra, Spain
MATRICE
Where: Italy
Goal: design and develop an automatic system to support local clinical governance of chronic disease management quality assessment and regional/national chronic disease quality of care surveillance
Funding: Italian Ministry of Health
Partners: National Agency for Regional Health Services, Italian Ministry of Health, Regional Agency for Public Health of Tuscany, National Research Council, 5 Local Health Units, College of Italian General Practitioners, Medical Informatics Department of Erasmus Medical Center
Timeframe: 2011-2014
Network: Italian Local Health Units and Italian Regions, network of GPs
Global schema in MATRICE: Italian Administrative Databases
PERSONS: PERSON ID, GENDER, DATE OF BIRTH, START DATE, END DATE, GP ID
HOSP: PERSON ID, START DATE, MAIN DIAGNOSIS, SEC DIAGNOSIS 1-5, PROCEDURE CODE 1-6, PROCEDURE DATE 1-6
EXE: PERSON ID, EXEMPTION CODE, EXE START DATE
DRUGS: PERSON ID, DRUG EXP START DATE, ATC, DDD
OUTPAT: PERSON ID, PROC CODE, PROC START DATE
OMOP Common Data Model
[Figure: diagram of the OMOP Common Data Model]
Mini-Sentinel Common Data Model
Table: Enrollment
Description: contains records for all individuals who were health plan members of the data partner during the period included in the data extract
Key data elements: unique person identifier; start and end dates of coverage; flags to indicate medical and pharmacy coverage

Table: Demographics
Description: includes everyone in the data partner database and is not limited to members included in the enrollment table
Key data elements: unique person identifier; date of birth; sex, race, and ethnicity

Table: Outpatient pharmacy dispensing
Description: includes each outpatient pharmacy dispensing picked up by an individual
Key data elements: unique person identifier; dispensed date; NDC; days supplied and amount dispensed

Table: Encounter
Description: contains one record for each time an individual sees a provider in the ambulatory setting or is hospitalized; multiple encounters per day are possible if they occur in different care settings
Key data elements: unique person identifier; encounter identifier; encounter type (e.g., inpatient, outpatient, emergency department); start and end date for encounter; discharge status and disposition

Table: Diagnosis
Description: linked to the encounter table in a many-to-one relationship so that all of the associated diagnoses are recorded in the diagnosis table
Key data elements: unique person identifier; encounter identifier; diagnosis code; type of code (e.g., ICD-9-CM)

Table: Procedure
Description: linked to the encounter table in a many-to-one relationship so that all of the associated procedures are recorded in the procedure table
Key data elements: unique person identifier; encounter identifier; procedure code; type of code (e.g., ICD-9-CM, CPT4)

Table: Death
Description: contains one record per death
Key data elements: unique person identifier; date of death; data source (e.g., National Death Index, State)

Table: Cause of Death
Description: contains one record per cause of death
Key data elements: unique person identifier; cause of death diagnosis code; data source (e.g., National Death Index, State)
Curtis et al, 2012
EU-ADR Global schema
Avillach, Coloma et al, 2012