A Formal Representation for Numerical Data Presented in Published Clinical Trial Reports

Maurine Tong BS, William Hsu PhD, Ricky K Taira PhD
Medical Imaging Informatics Group (MII), University of California, Los Angeles (UCLA)

Page 2: Problem: Querying Free Text CTRs

[Diagram labels: Clinical Trial Reports (CTRs); Representation; Query Processor; Informatics Applications; Patient Recruitment; Internal/External Validity Testing; Disease Modeling]

Page 3: Why Focus on Numerical Info

• Predictive disease modeling
  • Ex: Bayesian Belief Networks
• Key to identifying trial quality
  • Hypothesis testing context and measures
• Key to synthesizing evidence
  • What is the context for reported probabilities?
  • P(effect | cause, context)

[Diagram labels: Internal Validity; Disease Modeling; Patient Recruitment]

Page 4: Background and Prior Work

• Ontologies for Experiments and Clinical Trials
  • Ontology of Clinical Research (OCRe), Sim et al.
  • Ontology of Scientific Experiments (EXPO), Soldatova et al.
• Standardizing and sharing clinical trial data
  • BRIDG, CDISC, SNOMED CT
• Representing individual sections of a clinical trial report
  • Eligibility criteria: EliXR, Weng et al.
  • Scientific claims: Blake et al.

These systems primarily help to improve patient recruitment. Our focus is on modeling numerical information for quality assessment and disease modeling.

Page 5: Problem: Fragmentation

Page 6: Methods: Requirements Analysis

• What are the queries to be supported by the representation?
  • Study Quality
  • Disease Modeling

Page 7: Methods: Requirements Analysis

• Study quality queries
  • What is the p-value (population parameter associated with the hypothesis)?
  • What is the statistical test used to calculate the p-value?
  • What is the power of the sample size tested?
  • …

Study Quality: consulted textbooks and experts (James Sayre, PhD, Biostatistician)

Page 8: Methods: Requirements Analysis

• Disease modeling queries
  • What are the prior probabilities?
  • Can we estimate posterior probabilities from p-values or other reported information?
  • …

Disease Modeling: consulted experts, textbooks and literature (Thomas Belin, PhD, Biostatistician)
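The disease modeling queries concern prior and posterior probabilities. As a minimal sketch of how a reported prior could be updated once likelihoods are available, the snippet below applies Bayes' rule. The function name `posterior` and every number are hypothetical illustrations, not values from any trial, and the sketch does not address the open question of deriving posteriors from p-values.

```python
def posterior(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Bayes' rule: P(H | E) = P(E | H) * P(H) / P(E)."""
    # Total probability of the evidence under both hypotheses.
    p_e = p_e_given_h * prior + p_e_given_not_h * (1.0 - prior)
    if p_e == 0.0:
        raise ValueError("evidence has zero probability under both hypotheses")
    return p_e_given_h * prior / p_e


# Hypothetical inputs for illustration only.
print(posterior(prior=0.5, p_e_given_h=0.8, p_e_given_not_h=0.2))
```

A representation that stores the prior together with its context (population, intervention, measurement method) is what would make the inputs to such a calculation retrievable.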

Page 9: Methods: Initial Design

• Conceptual model of representation

• Domain: Metastatic Melanoma

Flaherty KT. et al. N Engl J Med. 2010 Aug 26;363(9):809-19

Page 10:

[Table, flattened in extraction; values are listed in the order they appear]

Column groups: Pop. Stats; Sample Pop.; Intervention; Baseline Measurements
Dose columns: <240 mg; 240 mg; 320 / 360 mg; 720 mg

Variables:
• Prevalence of MAP kinase pathway mutation: 40-60%
• Age: 23-86
• Confirmed histology refractory to standard treatment: 0:5, 1:16, 2:5, >2:23
• PLX4032 Formulation
  • Crystalline: n=3/6, n=3/6, n=3/6, n=3/6
  • Microprecipitated bulk powder: n=34, n=34, n=34, n=34
• Plasma samples (uM x hr): 100 +/- 50; 350 +/- 78; 650 +/- 100; 1500 +/- 1000
• CT Studies
  • Total Response Rate: 100%, 34%, 67%, 80%
  • Partial Response: 02,6 02,4

Page 11:

[The Page 10 table is repeated with the callout "A: Process Model"]

Page 12:

[The Page 10 table is repeated with the callout "B: Global Variable List"]

Page 13:

[The Page 10 table is repeated with the callout "C: Variable Characterization"]

Page 14:

[The Page 10 table is repeated with the callout "… D: Statistical Hypothesis Testing"]

Page 15: Results: Implementation

Page 16: Example 1: Capturing context

• Demonstration of how the representation captures context for the observations of an intervention group.
• Query
  • Domain: Lung Cancer
  • In Johnson et al., what is the context (e.g., intervention, population characteristics, measurement methodology) associated with progression free survival (PFS) in the high dose group (HDG)?

Johnson DH. et al. J Clin Oncol. 2004 Jun 1;22(11):2184-91.

Page 17: Steps to Capture Context

1. Find the node in the process model
2. Find the corresponding column
3. Find the variable of interest
4. Backtrack through the process model to obtain context for the observations, and get the data associated with each backtracked node
5. Construct a logical representation of the context
6. Repeat steps 4-5 until the start node
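The six steps amount to a backward walk over a directed graph. The sketch below is one possible reading, assuming the process model is stored as a mapping from each node to the nodes that precede it. The node ids echo those shown in Example 1, but the links between them, the dictionary names, and the function `capture_context` are hypothetical; the slides do not specify the graph structure or a storage format.

```python
from collections import deque

# Hypothetical process model: node id -> ids of the nodes that precede it.
# Node ids echo the Example 1 slides; the links themselves are assumed.
PREDECESSORS = {
    740: [474],
    474: [222],
    222: [847, 628],
    847: [],
    628: [],
}

# Data "column" attached to each node (step 2); contents are illustrative.
NODE_DATA = {
    740: {"variable": "observation of interest"},
    474: {"intervention": "Bevacizumab"},
    222: {"baseline": "Baseline characteristics of the patient"},
    847: {"eligibility": "Stage 3 Recurrent NSCLC"},
    628: {"eligibility": "No Prior Chemotherapy"},
}


def capture_context(node_id: int) -> list[tuple[int, dict]]:
    """Backtrack from node_id to the start node(s), collecting node data."""
    context = []
    seen = {node_id}
    queue = deque(PREDECESSORS.get(node_id, []))
    while queue:  # steps 4-6: repeat until no predecessors remain
        current = queue.popleft()
        if current in seen:
            continue
        seen.add(current)
        context.append((current, NODE_DATA.get(current, {})))
        queue.extend(PREDECESSORS.get(current, []))
    return context


for nid, data in capture_context(740):
    print(nid, data)
```

A breadth-first walk is used here so that context nearest the observation is reported first; any traversal that visits every ancestor once would serve the same purpose.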

Page 18: Step 1: Find the node in the process model

This node represents the progression free survival time point for the high dose group.

Page 19: Step 2: Find corresponding column

This column represents the numerical data and data elements associated with this node.

Page 20: Step 3: Find variable of interest

Page 21: Step 4: Backtrack & Obtain Data

Obtain context by looking at linked nodes in the process model.

Page 22: Step 5: Construct logical context

Data modeling is straightforward from the semantics of the process model link and node.

Cell name: Bevacizumab
Cell Location #: 474

Drug: Bevacizumab
Dose: 15 mg/kg

How was it administered:
  Vehicle: Intravenous infusion
  Duration: Over 90 minutes
  Cycle: 3 weeks
  Maximum dose: 18 doses
  Exception: Well tolerated
  Resulting Action: New duration
  Duration: 30-60 minutes
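One way to hold the Step 5 record in code is a small typed structure. In the sketch below the field values are copied from the slide, while the class names (`Administration`, `InterventionCell`), the field names, and the predicate-style string returned by `to_logical_form` are hypothetical choices, not the authors' schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Administration:
    vehicle: str
    duration: str
    cycle: str
    maximum_dose: str
    # Optional rule: when `exception` holds, `resulting_duration` applies.
    exception: Optional[str] = None
    resulting_duration: Optional[str] = None


@dataclass(frozen=True)
class InterventionCell:
    cell_name: str
    cell_location: int
    drug: str
    dose: str
    administration: Administration

    def to_logical_form(self) -> str:
        """Render the cell as a flat, predicate-style string."""
        return (
            f"intervention(node={self.cell_location}, drug={self.drug!r}, "
            f"dose={self.dose!r}, vehicle={self.administration.vehicle!r})"
        )


bevacizumab = InterventionCell(
    cell_name="Bevacizumab",
    cell_location=474,
    drug="Bevacizumab",
    dose="15 mg/kg",
    administration=Administration(
        vehicle="Intravenous infusion",
        duration="Over 90 minutes",
        cycle="3 weeks",
        maximum_dose="18 doses",
        exception="Well tolerated",
        resulting_duration="30-60 minutes",
    ),
)
print(bevacizumab.to_logical_form())
```

Keeping the exception and its resulting action as explicit fields preserves the conditional nature of the shorter infusion time instead of flattening it into a second, ambiguous "Duration" entry.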

Page 23: Step 6: Repeat steps 4-5 until start

• Continue backtracking through process model
• Aggregate associated data
• Repeat until first node

Context for Adverse Event (Node #740):
• Name of n847

Page 24: Example 1: Capturing context

• Demonstration of how the representation captures context for the observations of an intervention group.
• Query
  • What is the context (e.g., intervention, population characteristics, measurement methodology) associated with progression free survival (PFS) in the high dose group?

Page 25: Example 1: Capturing context

• Data:
• Associated Context:

Context for Adverse Event (Node #740):
1) INTERVENTION:
   Bevacizumab (Node #474)
2) POPULATION CHARACTERISTICS:
   High Dose Bev (Arm #3)
   Eligibility Criteria:
     Stage 3 Recurrent NSCLC (Node #847)
     No Prior Chemotherapy (Node #628)
     Other criteria (Node #748)
   Baseline characteristics of the patient (Node #222)
3) METHODS:
   Progression Free Survival

Page 26: Example 2: Comparisons

• Comparison of outcomes in the intervention vs. control arms
• Query
  • Compare PFS for intervention and control arm
• Context from two nodes can be placed on the same chart

Page 27: Example 3: Analyses

• How was the p-value calculated?
• Visualization includes:
  • Data
  • Test Statistics
  • P-value
  • Statement

Page 28: Pilot Evaluation

• Can the representation answer user queries from the requirements analysis?
• Preliminary evaluation questions
  • Characteristics of the trial
  • Quality of the trial
  • Significance of the science

Page 29: Evaluation: Objectives

• Objective 1
  • Utility of the representation to accurately identify numerical data to support key contributions made by a clinical trial report
• Objective 2
  • Intuitiveness of the representation through reproducibility of the visualization by different users

Page 30: Evaluation: Study Design

• Study design
  • 2-arm study
  • Status quo group using paper copy
  • Intervention group using proposed representation
• Participants (n=6)
  • Graduate students in biology, biostatistics, informatics, or engineering
• Statistical methods
  • Student’s paired t-test
• Gold standard
  • Established by graduate student supervised by domain expert
• 4 clinical trial papers in NSCLC
  • J Clin Oncol. 2004 Jun 1;22(11):2184-91.
  • J Clin Oncol. 2008 May 20;26(15):2442-9.
  • Lancet Oncol. 2012 Jan;13(1):33-42.
  • J Clin Oncol. 2011 Nov 1;29(31):4113-20.
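The study design names Student's paired t-test as its statistical method. As a minimal sketch of what that test computes, the snippet below derives the t statistic and degrees of freedom from paired scores using only the Python standard library. The function name `paired_t_statistic` and all scores are hypothetical and are not data from this evaluation; converting t to a p-value is left out because the standard library has no t-distribution.

```python
import math
import statistics


def paired_t_statistic(before: list[float], after: list[float]) -> tuple[float, int]:
    """Return (t, degrees of freedom) for Student's paired t-test."""
    if len(before) != len(after) or len(before) < 2:
        raise ValueError("need two equal-length samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(after, before)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all paired differences are identical")
    n = len(diffs)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Hypothetical accuracy scores for six participants under two conditions.
status_quo = [0.70, 0.80, 0.75, 0.65, 0.85, 0.80]
representation = [0.75, 0.80, 0.85, 0.70, 0.85, 0.90]
t, df = paired_t_statistic(status_quo, representation)
print(f"t = {t:.3f}, df = {df}")
```

The pairing matters: the test works on per-participant differences, so it assumes each score in one list has a matched counterpart in the other.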

Page 31: Evaluation: Questions

• What is the purpose of this trial?

• What is the sample size for each experimental arm?

• How was the primary outcome assessed?

• How many patients experienced positive outcomes in this trial?

• How was the data analyzed?

Page 32: Evaluation: Results

• Users of the representation were able to accurately identify numerical data that support key contributions, as compared with the status quo
• User visualizations were reproducible
  • 68.1% ± 6.45% of the gold standard was reproduced by users

                 Accuracy   SD    Time   SD
Representation   79%        18%   30     9%
Status Quo       76%        9%    34     7%

Page 33: Discussion

• Our work supports queries related to study quality and disease modeling
• We developed a representation to associate appropriate context with numerical data within clinical trial reports
• The pilot evaluation shows that the utility of the representation is promising
• To extend this work:
  • Instantiate using automatic methods and capture numerical data using NLP methods
  • Develop an interface to support frequently asked queries for specific clinical trial reports
  • Test in a journal club setting

Page 34: Conclusion

• We are establishing a systematic way of extracting information from clinical trial reports in a machine-understandable way

• The overarching objective is to have a computer reason on this representation to facilitate clinical decision making

Page 35: Acknowledgements

• James Sayre, PhD, Biostatistician
• Domain experts
• Research participants
• NLM Training Grant
• NLM R01-LM009961

Page 36: THANK YOU