AN ONLINE STATISTICAL TOOL TO GENERATE RESEARCH HYPOTHESES ON PHELAN-MCDERMID SYNDROME

i2b2 / tranSMART

Phelan-McDermid Syndrome Data Network (PMS_DN)

PHELAN-MCDERMID SYNDROME FOUNDATION WEBSITE: http://22q13.org

PMS_DN I2B2/TRANSMART: https://pmsdn.hms.harvard.edu

USER GUIDE

Version 1.0 11.2016

TABLE OF CONTENTS

Getting Started: A Quick Reference Tutorial

1. Introduction

1.1 Phelan-McDermid Syndrome Data Network (PMS_DN)

1.2 I2b2/tranSMART

2. Applying for Access to PMS_DN

2.1 Level 1 Access

2.2 Level 2 Access

3. Using i2b2/tranSMART statistical tools to query PMS_DN data

3.1 PMS_DN i2b2/tranSMART Home Page

3.2 Generate Summary Statistics: Use Case Example

3.3 Advanced Workflow

3.4 Data Export (Level 2 access only)

Appendix A: Extract-Transform-Load Process

Contributors

Acknowledgement

GETTING STARTED: A Quick Reference Tutorial

REGISTERING – Level 1 Access

Go to https://pmsdn.hms.harvard.edu. Use your eRA Commons or Google account. First-time users of the platform will be required to complete a short registration request form and agree to the PMS_DN Terms of Use. Your request will then be sent to an administrator to process. Once access is granted, you will receive an email notification. You will then be able to log in and will be presented with the list of data integrated in the PMS_DN i2b2/tranSMART system. Summary Statistics and Advanced Workflow options will now be available for access.

ORGANIZATION

STEP 1: Explore ontologies by opening yellow folders to view data

The home page is divided into two sections: the left side contains the search tree for registry data and the right side contains the cohort selection boxes (Subset 1 and Subset 2).

COHORT (SUBSET) SELECTION

STEP 2: Drag and drop criteria from the left to select a subset of individuals

Subset selection criteria can be very simple or more comprehensive, using combinations of the Boolean logic ‘and’ (entries in stacked subset boxes), ‘or’ (entries in the same subset box), and ‘not’ (by clicking the Exclude option for the contents of a box).

USING STATISTICAL TOOLS TO QUERY DATA

STEP 3: Generate Summary Statistics

1) After selecting cohort(s), click on Generate Summary Statistics.

2) Subsets can be verified at the top of the Summary Statistics section. The i2b2/tranSMART application automatically generates a table with subject totals and statistical analysis by age, sex and race for each subset, if data are available.

3) Drag and drop any of the variables (from left side to anywhere on the right side of the home page) to generate statistical analysis based upon that variable.

DOWNLOADING DATA – Level 2 Access

Level 2 Access allows researchers to see individual patient-level data and download the anonymized information to their own computer. Once approved for Level 1 access, you may begin the pre-submission application for Level 2 access.

Log in to your Level 1 account at https://pmsdn.hms.harvard.edu. Click the Profile dropdown menu in the top right corner of the screen, and select Level 2. Complete and submit the Level 2 request form. The Data Network Specialist will then contact you to discuss your study and provide the Data Access Application. Each applicant must submit an application that includes a technical proposal, a lay summary of the work, the applicant’s CV, and a copy of the approval or exemption letter from your current Institutional Review Board (IRB), or a letter showing that the IRB determined the study to be non-Human Subjects Research. Applications can be reviewed without IRB approval; however, IRB approval or exemption is required prior to Level 2 approval.

THE FUNDAMENTALS https://pmsdn.hms.harvard.edu

PMS_DN i2b2/tranSMART HOME PAGE

1. INTRODUCTION

1.1 Phelan-McDermid Syndrome Data Network (PMS_DN)

The Phelan-McDermid Syndrome Data Network (PMS_DN) is a joint venture between the Phelan-McDermid Syndrome Foundation and the Department of Biomedical Informatics at Harvard Medical School that builds on the Phelan-McDermid Syndrome International Registry (PMSIR). The PMSIR is a patient registry and database containing patient-reported clinical and developmental history on over 1,000 people with PMS. The PMSIR also collects genetic reports in order to establish a detailed, accurate genetic data set. Data exports and permission to recruit study participants through the PMSIR are granted through an application and review process. For more information, please contact [email protected].

PMSIR family members have the option to consent to participate in the PMS_DN. The PMS_DN is one of 20 Patient Powered Research Networks funded by the Patient-Centered Outcomes Research Institute (PCORI) to participate in PCORnet, a national patient-centered clinical research network. The PMS_DN integrates 1) patient reported outcomes from the PMSIR, 2) curated genetic reports, and 3) knowledge extracted from EHR clinical notes through Natural Language Processing (NLP).

1.2 I2b2/tranSMART

I2b2 (Informatics for Integrating Biology and the Bedside) is a scalable informatics framework designed primarily for translational research. TranSMART is an application layer on top of i2b2. It provides all of the functionality of i2b2, plus the ability to perform complex statistical analysis, to load data from a simple Excel spreadsheet, and to integrate genomic data and additional phenotype data from multiple sources.

The primary objectives of i2b2/tranSMART are the integration of clinical, biological, and ‘omics data (including such data as genomic test results) in one place, and the generation of hypotheses by investigators interested in using the data for their research.

2. APPLYING FOR ACCESS TO PMS_DN

Investigators have two levels of access. Level 1 users can generate summary statistics from de-identified patient-reported outcomes and from knowledge extracted from clinical notes, in order to develop new hypotheses about PMS. Level 2 users may see and download de-identified patient-level data, including curated genetic data, which are not viewable in Level 1. They can also view the de-identified sentences from which the NLP engine extracted knowledge. Level 2 users can opt to cross-check the knowledge extracted by the NLP engine against the sentences from which it was extracted, as part of a knowledge validation workflow.

ACCESS CONTROL LEVELS WITHIN THE PMS_DN I2B2/TRANSMART USER INTERFACE

Level 0: No access

Level 1: Access to aggregated counts only
Only online statistical tools from i2b2/tranSMART.
No view or download of patient-level data.
Need an account and a few sentences on why the investigator is interested in accessing the data.
No IRB approval needed for the investigator.

Level 2: Access to patient-level data
Level 1 access, plus access to patient-level data to view or download.
Need an IRB-approved protocol for this study and approval from PMSF data access committee.
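The access levels above can be read as an ordered set of permissions. The following is a minimal illustrative sketch in Python, not the actual PMS_DN implementation; the names `AccessLevel`, `can_generate_summary_statistics` and `can_export_patient_data` are hypothetical and chosen only for this example.

```python
from enum import IntEnum


class AccessLevel(IntEnum):
    """Hypothetical encoding of the access levels described in this guide."""
    NONE = 0       # Level 0: no access
    AGGREGATE = 1  # Level 1: aggregated counts and online statistical tools only
    PATIENT = 2    # Level 2: Level 1 access plus patient-level view and download


def can_generate_summary_statistics(level: AccessLevel) -> bool:
    # Summary statistics are available from Level 1 upward.
    return level >= AccessLevel.AGGREGATE


def can_export_patient_data(level: AccessLevel) -> bool:
    # Viewing or downloading patient-level data requires Level 2.
    return level >= AccessLevel.PATIENT
```

Because each level includes everything granted by the level below it, an ordered enumeration keeps the permission checks to simple comparisons.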

2.1 Level 1 Access

To apply for Level 1 access (required before applying for Level 2 access):

1. Visit https://pmsdn.hms.harvard.edu and click the button that matches the type of email address you plan to use for registration. For validation purposes, you must register with an e-mail address from one of the following: eRA Commons, NIH, Google (Gmail), GitHub, Harvard Medical School, University of Pittsburgh or Boston Children's Hospital. If you do not have an e-mail account from one of these sources, we recommend that you create a Gmail account.

2. You will then be asked to complete a short registration request form and agree to our Terms of Use. Note: IRB approval or exemption is not required for Level 1 access.

3. Once submitted, your request will be sent to our Data Network Specialist, who will ensure your application is reviewed in a timely manner. Our aim is to grant approval within 2-3 business days.

4. You may apply for Level 2 access once Level 1 access is granted.

5. Please download the PMS_DN Quick user guide for an overview and instructions: https://s3.amazonaws.com/hms-dbmi-docs/PMSDN_Quick_guide.pdf

Login page

Registration pages

2.2 Level 2 Access

Once approved for Level 1 access, you may begin the pre-submission application for Level 2 access through the following steps:

1. Log in to your Level 1 account at https://pmsdn.hms.harvard.edu.

2. Click the Profile dropdown menu in the top right corner of the screen, and select Level 2.

3. Complete and submit the Level 2 request form.

4. The Data Network Specialist will then contact you to discuss your study and provide the Data Access Application. Each applicant must submit an application that includes a technical proposal, a lay summary of the work, the applicant’s CV, and a copy of the approval or exemption letter from your current Institutional Review Board (IRB), or a letter showing that the IRB determined the study to be non-Human Subjects Research. Applications can be reviewed without IRB approval; however, IRB approval or exemption is required prior to Level 2 approval.

For questions regarding any of the application processes described above, please contact the Data Network Specialist at [email protected].

3. USING I2B2/TRANSMART STATISTICAL TOOLS TO QUERY PMS_DN DATA

Below you will find examples of potential research questions and directions on how to navigate through the system to query the data.

3.1 PMS_DN i2b2/tranSMART Home Page

The home page is the primary interface for data exploration and cohort formation. This page is divided into two sections: the left side contains the ranked search tree for data and the right side contains the cohort selection boxes (Subset 1 and Subset 2).

Home page

3.1.1 Left side of home page

The left side of the home page contains the search tree of data. Data are organized as they would be in a medical record and are divided into data from the registry (patient-reported outcomes) and data from clinical notes.

Left side of home page: Search tree divided into registry data and clinical notes

3.1.2 Right side of home page

The right side of the home page is organized in two columns of boxes, where data subsets can be selected and entered.

3.2 Generate Summary Statistics: Use Case Example

The following pages present an example of how cohorts can be selected and entered into the subset boxes for statistical analysis of specific research questions.

Subset selection criteria can be quite complex, using combinations of the logical ‘and’ (entries in stacked subset boxes), ‘or’ (entries in the same subset box), and ‘not’ (by clicking the ‘exclude’ option for the contents of a box).
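The way stacked boxes, entries within a box, and the exclude option combine can be sketched with ordinary set operations. This is only an illustration of the Boolean logic described above, not the query engine used by i2b2/tranSMART; the function `select_cohort`, the patient identifiers and the concept names are all hypothetical.

```python
def select_cohort(all_patients, boxes):
    """Combine subset boxes into one cohort.

    `boxes` is a list of (concepts, exclude) pairs, where `concepts` is a
    list of sets of patient identifiers, one set per entry in the box.
    """
    cohort = set(all_patients)
    for concepts, exclude in boxes:
        matched = set().union(*concepts)  # 'or': entries in the same box
        if exclude:
            cohort -= matched             # 'not': box marked as excluded
        else:
            cohort &= matched             # 'and': stacked boxes
    return cohort


# Made-up patient identifiers, for illustration only.
all_patients = {1, 2, 3, 4, 5, 6}
hypotonia_yes = {1, 2, 3}
hypotonia_unsure = {4}
seizures = {2, 3, 5}

# (hypotonia 'Yes' OR 'Unsure') AND NOT seizures
boxes = [([hypotonia_yes, hypotonia_unsure], False), ([seizures], True)]
cohort = select_cohort(all_patients, boxes)
```

In this made-up example the first box matches patients 1 to 4, and the excluded second box then removes patients 2 and 3.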

Right side of home page: Subsets selection

USE CASE: HAS THE PATIENT BEEN DIAGNOSED WITH HYPOTONIA?

STEP 1 Select the cohort of patients into the subsets, Subset 1 and Subset 2.

In this example, two subsets were created: 1) patients NOT diagnosed with hypotonia in Subset 1 and 2) patients diagnosed with hypotonia in Subset 2. The value ‘Unsure’ is not used in this case.

STEP 2 Click on ‘Generate Summary Statistics.’

Subsets can be verified at the top of the ‘Summary Statistics’ section. The i2b2/tranSMART application automatically generates a table with subject totals and statistical analysis by age, sex and race for each subset, if data are available.

Summary Statistics: Subject totals and analysis with age as a variable

SUBSET2: Diagnosed with Hypotonia

SUBSET1: Not diagnosed with Hypotonia

Summary Statistics: Statistical analysis with sex and race

IS THERE AN ASSOCIATION BETWEEN SEX AND HYPOTONIA?

STEP 3 ‘Drag and drop’ any of the variables (from left side to the right side of the home page) to generate statistical analysis – in this case ‘sex’.

Analysis generated after dragging and dropping the variable ‘sex’

SUBSET2: Diagnosed with Hypotonia

SUBSET1: Not diagnosed with Hypotonia

A chi-squared analysis is run and a p-value is generated. For this research question, the p-value is not significant at the 95% confidence level: the sex ratio in Subset 1 (not diagnosed with hypotonia) is not significantly different from the sex ratio in Subset 2 (diagnosed with hypotonia).
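To make the chi-squared step concrete, the sketch below computes Pearson's chi-squared statistic and its p-value for a 2x2 table (two subsets by two sex categories) using only the Python standard library. It assumes no continuity correction and one degree of freedom; whether i2b2/tranSMART computes the test in exactly this way is not stated in this guide. The counts are invented for illustration and are not PMS_DN data.

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared test for a 2x2 table, without continuity correction."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Illustrative counts only: rows are Subset 1 and Subset 2,
# columns are (female, male).
counts = [[20, 30], [45, 55]]
chi2, p_value = chi_squared_2x2(counts)
significant = p_value < 0.05  # 95% confidence level
```

A p-value at or above 0.05 corresponds to the "not significant at the 95% confidence level" outcome described above.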

3.3 Advanced Workflow

STEP 1 Select the cohort of patients into Subset 1, or into both Subset 1 and Subset 2 (depending upon the analysis).

Some more advanced analytical workflows require two cohorts to be identified.

STEP 2 Click on “Advanced Workflow” and select the type of analysis.

Advanced Workflow

Advanced Workflow: Analysis

STEP 3 Select independent variable and dependent variable (you can transform a categorical variable into a continuous variable and vice versa).

Then you can drag and drop variables as dependent and independent variables for the correlation analysis.

To transform the variable from this starting point, click on ‘Enable’ next to the ‘Binning’ category.
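Binning turns a continuous variable into a categorical one by assigning each value to a range. The sketch below illustrates the general idea in Python; it does not reproduce the options offered by the ‘Binning’ interface, and the function `bin_values`, the bin edges and the labels are hypothetical.

```python
import bisect


def bin_values(values, edges, labels):
    """Map continuous values onto category labels.

    `edges` must be sorted; each edge is the inclusive lower bound of the
    next bin, so there must be exactly one more label than edges.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    return [labels[bisect.bisect_right(edges, v)] for v in values]


# Made-up ages in years, for illustration only.
ages = [2, 5, 11, 17, 30]
age_groups = bin_values(
    ages,
    edges=[6, 12, 18],
    labels=["0-5", "6-11", "12-17", "18+"],
)
```

The resulting categories can then be used wherever an analysis expects a categorical variable.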

Select variables

‘Binning’ interface

STEP 4 Click on ‘Run.’

Results of the analysis appear in a separate window, with a boxplot and a table for the ANOVA analysis.
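For readers unfamiliar with ANOVA, the following sketch shows how a one-way ANOVA F statistic is obtained from the values of a continuous variable grouped by a categorical variable. It is a textbook calculation written with the Python standard library, not the code run by the Advanced Workflow; the function name and the data are made up for illustration, and the p-value (which requires the F distribution) is left out.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F statistic, between-group df, within-group df)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k = len(groups)      # number of groups
    n = len(all_values)  # total number of observations
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented measurements for three categories, for illustration only.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(groups)
```

A larger F statistic means the group means differ more than would be expected from the spread of values within each group.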

3.4 Data Export (Level 2 access only)

To view and export clinical data from selected cohorts, click ‘Grid View’ after generating summary statistics. A flexible table will be generated with all subjects defined by the cohorts, along with associated clinical data.

You can deselect any column and remove it from the table through the use of a dropdown menu available in the column headers. Subsets of rows or the entire data table can be exported as an Excel-compatible table by clicking the export button at the bottom of the page.
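Once a table has been exported, it can be processed outside the application. The sketch below assumes, purely for illustration, that the export was saved as comma-separated text; the guide only states that the table is Excel-compatible, so the actual file format may differ. The column names and rows are invented and the function `load_rows` is hypothetical.

```python
import csv
import io

# Stand-in for an exported file; in practice this would be opened from disk.
exported = io.StringIO(
    "subject_id,subset,sex,hypotonia\n"
    "S001,subset1,F,No\n"
    "S002,subset2,M,Yes\n"
    "S003,subset2,F,Yes\n"
)


def load_rows(handle, keep_columns):
    """Read the exported table and keep only the requested columns."""
    reader = csv.DictReader(handle)
    return [{col: row[col] for col in keep_columns} for row in reader]


rows = load_rows(exported, keep_columns=["subject_id", "hypotonia"])
```

Keeping only the columns needed for a given analysis mirrors the column deselection available in Grid View.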

Grid view and column editing

APPENDIX A: EXTRACT-TRANSFORM-LOAD PROCESS

The Extract-Transform-Load process: The process used to gather data from the source registry and integrate them into i2b2/tranSMART.

CONTRIBUTORS

Megan O’Boyle, PI

Liz Horn, PhD, MBI, Co-PI, Network Director

Paul Avillach, MD, PhD, Co-PI

Geraldine Bliss, MSc, Research Director

Andria Cornell Mann, MSPH, Project Manager

Rebecca Davis, MS, LGC, Data Network Specialist

Jackie Malasky, MPH, Family Engagement Specialist

Cartik Saravanamuthu, MS, MS, PhD

Maxime Wack, MD, MSc

Claire Hassen-Khodja, MD, MSc

Thomas DeSain

Andre Rosa

Cassandra Perry, MS, CGC

ACKNOWLEDGEMENT

This work was conducted with support from the Patient-Centered Outcomes Research Institute (PCORI). PCORI is an independent, non-profit organization authorized by Congress in 2010. Its mission is to fund research that will provide patients, their caregivers and clinicians with the evidence-based information needed to make better-informed health care decisions. PCORI is committed to continuously seeking input from a broad range of stakeholders to guide its work. More information is available at www.pcori.org.