toward rapid learning in cancer treatment selection: an analytical

10
Toward rapid learning in cancer treatment selection: An analytical engine for practice-based clinical data Samuel G. Finlayson a , Mia Levy b , Sunil Reddy c , Daniel L. Rubin c,a Harvard Medical School, Boston, MA, United States b Vanderbilt University School of Medicine, Nashville, TN, United States c Stanford University School of Medicine, Stanford, CA, United States article info Article history: Received 30 September 2015 Revised 6 January 2016 Accepted 15 January 2016 Available online 2 February 2016 Keywords: Rapid learning Learning health system Precision medicine Melanoma Interactive data analysis abstract Objective: Wide-scale adoption of electronic medical records (EMRs) has created an unprecedented opportunity for the implementation of Rapid Learning Systems (RLSs) that leverage primary clinical data for real-time decision support. In cancer, where large variations among patient features leave gaps in tra- ditional forms of medical evidence, the potential impact of a RLS is particularly promising. We developed the Melanoma Rapid Learning Utility (MRLU), a component of the RLS, providing an analytical engine and user interface that enables physicians to gain clinical insights by rapidly identifying and analyzing cohorts of patients similar to their own. Materials and methods: A new approach for clinical decision support in Melanoma was developed and implemented, in which patient-centered cohorts are generated from practice-based evidence and used to power on-the-fly stratified survival analyses. A database to underlie the system was generated from clinical, pharmaceutical, and molecular data from 237 patients with metastatic melanoma from two aca- demic medical centers. The system was assessed in two ways: (1) ability to rediscover known knowledge and (2) potential clinical utility and usability through a user study of 13 practicing oncologists. Results: The MRLU enables physician-driven cohort selection and stratified survival analysis. The system successfully identified several known clinical trends in melanoma, including frequency of BRAF muta- tions, survival rate of patients with BRAF mutant tumors in response to BRAF inhibitor therapy, and sex-based trends in prevalence and survival. Surveyed physician users expressed great interest in using such on-the-fly evidence systems in practice (mean response from relevant survey questions 4.54/5.0), and generally found the MRLU in particular to be both useful (mean score 4.2/5.0) and useable (4.42/5.0). Discussion: The MRLU is an RLS analytical engine and user interface for Melanoma treatment planning that presents design principles useful in building RLSs. Further research is necessary to evaluate when and how to best use this functionality within the EMR clinical workflow for guiding clinical decision making. Conclusion: The MRLU is an important component in building a RLS for data driven precision medicine in Melanoma treatment that could be generalized to other clinical disorders. Ó 2016 Elsevier Inc. All rights reserved. 1. Introduction The promise of leveraging vast medical record data to guide clinical decision making has created growing support for the devel- opment of ‘‘Rapid Learning Systems” (RLSs) that gather and lever- age practice-based clinical evidence for real-time clinical decision support [1]. The need for such systems is particularly evident within the field of oncology, where controlled clinical trial evidence is only available to guide therapy in a minority of patients [2–5]. We developed an analytical engine component of the RLS for Melanoma, called the Melanoma Rapid Learning Utility (MRLU). The analytic engine and graphical user interface (GUI) enables users to quickly build and analyze cohorts of patients through an interactive web application. The MRLU could be useful as a key component of future systems aimed at providing evidence-based practice in the era of electronic medical records (EHRs). In 2012, the Institute of Medicine (IoM) released a landmark report calling for an immediate and fundamental shift in U.S. healthcare. Noting that traditional pathways of knowledge genera- tion and transmission ‘‘can no longer keep pace,” the IoM called for http://dx.doi.org/10.1016/j.jbi.2016.01.005 1532-0464/Ó 2016 Elsevier Inc. All rights reserved. Corresponding author at: 1201 Welch Rd Office P285, Stanford, CA 94305, United States. Tel.: +1 650 736 6923. E-mail address: [email protected] (D.L. Rubin). Journal of Biomedical Informatics 60 (2016) 104–113 Contents lists available at ScienceDirect Journal of Biomedical Informatics journal homepage: www.elsevier.com/locate/yjbin

Upload: truongliem

Post on 20-Jan-2017

217 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Toward rapid learning in cancer treatment selection: An analytical

Journal of Biomedical Informatics 60 (2016) 104–113

Contents lists available at ScienceDirect

Journal of Biomedical Informatics

journal homepage: www.elsevier .com/locate /y jb in

Toward rapid learning in cancer treatment selection: An analyticalengine for practice-based clinical data

http://dx.doi.org/10.1016/j.jbi.2016.01.0051532-0464/� 2016 Elsevier Inc. All rights reserved.

⇑ Corresponding author at: 1201 Welch Rd Office P285, Stanford, CA 94305,United States. Tel.: +1 650 736 6923.

E-mail address: [email protected] (D.L. Rubin).

Samuel G. Finlayson a, Mia Levy b, Sunil Reddy c, Daniel L. Rubin c,⇑aHarvard Medical School, Boston, MA, United StatesbVanderbilt University School of Medicine, Nashville, TN, United Statesc Stanford University School of Medicine, Stanford, CA, United States

a r t i c l e i n f o

Article history:Received 30 September 2015Revised 6 January 2016Accepted 15 January 2016Available online 2 February 2016

Keywords:Rapid learningLearning health systemPrecision medicineMelanomaInteractive data analysis

a b s t r a c t

Objective: Wide-scale adoption of electronic medical records (EMRs) has created an unprecedentedopportunity for the implementation of Rapid Learning Systems (RLSs) that leverage primary clinical datafor real-time decision support. In cancer, where large variations among patient features leave gaps in tra-ditional forms of medical evidence, the potential impact of a RLS is particularly promising. We developedthe Melanoma Rapid Learning Utility (MRLU), a component of the RLS, providing an analytical engine anduser interface that enables physicians to gain clinical insights by rapidly identifying and analyzingcohorts of patients similar to their own.Materials and methods: A new approach for clinical decision support in Melanoma was developed andimplemented, in which patient-centered cohorts are generated from practice-based evidence and usedto power on-the-fly stratified survival analyses. A database to underlie the system was generated fromclinical, pharmaceutical, and molecular data from 237 patients with metastatic melanoma from two aca-demic medical centers. The system was assessed in two ways: (1) ability to rediscover known knowledgeand (2) potential clinical utility and usability through a user study of 13 practicing oncologists.Results: The MRLU enables physician-driven cohort selection and stratified survival analysis. The systemsuccessfully identified several known clinical trends in melanoma, including frequency of BRAF muta-tions, survival rate of patients with BRAF mutant tumors in response to BRAF inhibitor therapy, andsex-based trends in prevalence and survival. Surveyed physician users expressed great interest in usingsuch on-the-fly evidence systems in practice (mean response from relevant survey questions 4.54/5.0),and generally found the MRLU in particular to be both useful (mean score 4.2/5.0) and useable (4.42/5.0).Discussion: The MRLU is an RLS analytical engine and user interface for Melanoma treatment planningthat presents design principles useful in building RLSs. Further research is necessary to evaluate whenand how to best use this functionality within the EMR clinical workflow for guiding clinical decisionmaking.Conclusion: The MRLU is an important component in building a RLS for data driven precision medicine inMelanoma treatment that could be generalized to other clinical disorders.

� 2016 Elsevier Inc. All rights reserved.

1. Introduction

The promise of leveraging vast medical record data to guideclinical decision making has created growing support for the devel-opment of ‘‘Rapid Learning Systems” (RLSs) that gather and lever-age practice-based clinical evidence for real-time clinical decisionsupport [1]. The need for such systems is particularly evidentwithin the field of oncology, where controlled clinical trial

evidence is only available to guide therapy in a minority of patients[2–5]. We developed an analytical engine component of the RLS forMelanoma, called the Melanoma Rapid Learning Utility (MRLU).The analytic engine and graphical user interface (GUI) enablesusers to quickly build and analyze cohorts of patients through aninteractive web application. The MRLU could be useful as a keycomponent of future systems aimed at providing evidence-basedpractice in the era of electronic medical records (EHRs).

In 2012, the Institute of Medicine (IoM) released a landmarkreport calling for an immediate and fundamental shift in U.S.healthcare. Noting that traditional pathways of knowledge genera-tion and transmission ‘‘can no longer keep pace,” the IoM called for

Page 2: Toward rapid learning in cancer treatment selection: An analytical

S.G. Finlayson et al. / Journal of Biomedical Informatics 60 (2016) 104–113 105

‘‘computing capabilities and analytic approaches to develop real-time insights from routine patient care” [6]. The IoM report high-lights a rapidly growing interest in the community to developand implement rapid learning healthcare systems (RLSs) that arecapable of gathering and leveraging clinical evidence to enablereal-time, precision decision support in the clinic [1]. The RLS isan example of the repurposing of primary clinical data to improvehealthcare, which falls under the broader term of the ‘‘learninghealth system” [7,2].

Electronic health record (EHR) data has long been recognized asbeing a potential source of ‘‘practice-based evidence” that can sup-plement traditional forms of medical evidence in guiding clinicaldecision making [8]. Rapid learning systems represent a modernparadigm for precision clinical practice, in which knowledge minedfrom electronic medical records is seamlessly integrated into theclinical workflow of physicians [9–12]. A functional RLS thereforerequires several components, including clinical databases suppliedwith EHR data, information pipelines that facilitate rapid transfor-mation and filtration of clinical data to identify cohorts of interest,analytic platforms, and clinical decision support (CDS) utilities thatprovide physicians with relevant clinical insights at point of care(Fig. 1). While CDS for precision medicine is a primary goal of rapidlearning, the implementation of RLS will also provide infrastruc-ture that will support and be enriched by nearly all areas of clinicalinformatics.

The need for rapid learning systems is particularly evident inoncology [2–5]. Tumor molecular profiling offers great opportunityto identify those patients most likely to benefit from targeted ther-apies, but it results in small sub-populations of patients that can beused for evidence generation. With many tumor molecular alter-ations occurring in as few as 1% of patients, it is challenging tolearn from the clinical outcomes of the mere 5% of cancer patientswho participate in clinical trials [13]. While randomized prospec-tive clinical trials and clinical practice guidelines remain the pre-dominant evidence base for clinical decision making in oncology,in order to realize the promise of precision cancer medicine, it willbe important to be able to learn from the experiences of all cancerpatients. Oncology is thus an ideal environment in which to deployRLS, with the prospect of informing decision making about treat-ments based on observational evidence when clinical trial data isnot available.

Fig. 1. Key components of a rapid learning system. (A) Patient data and outcomes extractData from clinical databases is consolidated and transformed. (C) Patient cohorts of inteconducted to identify or confirm clinical trends. (E) Results may be shared, motivatingClinical knowledge bases developed from a combination of the scientific literature, clinic(H) Clinical decision support utilities employing leverage clinical knowledge bases for pMRLU focuses primarily on (C) and (D), though our work also involved components (A)

Though much has been done in describing the opportunitiesand requirements of RLS, few efforts have yet been undertakento actually produce a RLS for cancer. The Athena and CancerLinQprojects have recently been undertaken to aggregate data andimplement clinical practice guidelines and initial analyses[14,15]. However, these and other related efforts remain limitedwith respect to providing an analytical engine with physician-facing analytic tools for real-time decision support; in fact, to ourknowledge, efforts to build analytical engines and physician-facing tools that enable real-time exploration of clinical data havebeen limited to date [16–18].

In this paper, we present the Melanoma Rapid Learning Utility(MRLU), which we believe is the first implementation of an analyt-ical engine and physician-facing tool that could be used at the coreof a RLS for Melanoma. We chose Melanoma as the focus of our RLSengine because Melanoma is one of the few cancers whose inci-dence rates and total deaths continue to increase [19], unlike mostother cancers, and thus efforts to improve decision making for thisdisease could be particularly impactful. The large heterogeneityboth within and between patients with respect to tumor featuresand tumor response is particularly challenging in Melanoma treat-ment planning [20,21]. This inherent complexity adds to the exist-ing challenge of finding and evaluating clinical evidence to guidepatient care, and makes Melanoma an excellent candidate for rapidlearning.

The scope of work necessary to produce and deploy a function-ing RLS in a hospital setting is large. In building the MRLU, wefocused our efforts specifically on developing an interactive ana-lytic interface that could provide easy comparisons of treatmentefficacy for patients with user-specified characteristics. By makinglatent clinical knowledge physician-accessible, such tools may fos-ter a virtuous ‘‘learning” cycle, in which data-driven decision-making is informed by clinical encounters from the past, theresults of which will be captured in the same EMR system andtherefore be used to guide care in the future. The resultant poolof continuously expanding and improving health data is centralto the notion of a learning health system.

To deploy a fully operational RLS, however, the other compo-nents of the RLS (Fig. 1) would be required, including the incorpo-ration of practice guidelines and other clinical knowledge bases.Fully operational RLS infrastructures will therefore be capable of

ed from electronic medical records and used to build up large clinical databases. (B)rest are defined and extracted from featurized clinical databases. (D) Data analysisnew basic and public health research or facilitate the design of clinical trials. (F/G)al guidelines, and dynamically generated results from rapid learning data analysis.ersonalized medicine. (I) Patient outcomes may be impacted by CDS utilities. Theand (B).

Page 3: Toward rapid learning in cancer treatment selection: An analytical

106 S.G. Finlayson et al. / Journal of Biomedical Informatics 60 (2016) 104–113

more robust rule-based and statistical learning than the MRLU. Ourwork represents an essential step toward such systems, and, whenintegrated with clinical information systems and additional com-ponents, could provide physicians with access to real-time,practice-based insights for precision medical practice.

2. Materials and methods

2.1. Data

We acquired clinical, demographic, tumor molecular profiling,and pharmacy data from clinical databases for 200 patients atthe Vanderbilt-Ingram Cancer Center and 37 patients from Stan-ford Hospitals and Clinics with metastatic melanoma. All patientdata were de-identified through the creation of new, randompatient IDs and the offsetting of all dates. The data collectedincluded:

� Clinical data (�240 rows, 1 per patient): sex, age, days to deathfrom first drug treatment, institution of origin.

� Pharmaceutical data (�2000 rows, 1 per prescription perpatient): drug class, drug name, and ordering date.

� Tumor molecular profiling data (�240 rows, 1 per patient):BRAF status, NRAS status, and a Boolean indicating whetherother mutations were reported.

Pharmaceutical data were normalized by generic, trade, anddevelopment names and mapped onto their drug class accordingto the MyCancerGenome index of anticancer agents [22]. Combina-tion therapies were identified as consisting of two or more drugs ofthe same class that were administered on the same date. Uniquecombinations of treatments were identified and representedwithin the MRLU as distinct treatment regimens. For example,we separated those patients receiving Carboplatin and Paclitaxeltogether from those patients receiving only Paclitaxel. Eightpatients were identified who received combination therapies in

Fig. 2. System diagram. Shiny application links J

which there were drugs of two or more classes. Since no more thanany 2 of these patients received the same treatment protocol, andmulti-class protocols could introduce potential biases in outcomesanalyses stratified by drug class, we excluded these 8 patients fromthe cohort. All other patients were included.

The timestamp of the first anti-cancer therapy was labeled asthe baseline time point for each patient, and each subsequenttreatment time point was considered a follow up. The baselinetime point served as a reference point used in conjunction withother timestamps in the medical records to calculate the patientage at first treatment, time to second treatment, and days to deathrelative to the baseline time point. While calculating the time tosecond treatment and the days to death relative to the initiationof first treatment, we captured the date of right censoring for allpatients (i.e., those who either left the database or whose treat-ments were ongoing at the time of our data analysis). These datawere added to the database as well as genetic results compiledfrom the EHR.

2.2. System implementation

An overview of the MRLU and its architecture are shown inFig. 2. The MRLU was implemented using a MySQL database andRStudio’s Shiny package [23]. Shiny enables the rapid prototypingof interactive JavaScript web pages back-ended by the R statisticalcomputing language. The MySQL database contains all the patientdata and serves as a data cache for accessing the analyzing the databy Shiny applications. The MRLU accesses the data in its MySQLdatabase through the DBI and RMySQL packages and displays themusing Hadley Wickham’s ggplot2 [24–26]. We deployed Shiny Ser-ver to host the MRLU on a GNU/Linux server [27].

We defined the requirements for the MRLU and its interfaceduring several consultations with practicing clinical oncologistsat the Stanford and Vanderbilt cancer centers. During this process,we determined that the core functionality required by clinicianusers of the tool could be partitioned into three basic tasks: (1)

avaScript frontend with R, MySQL backend.

Page 4: Toward rapid learning in cancer treatment selection: An analytical

Fig. 3. Cohort assembly page. Displaying summary visualizations for all for all female, NRAS – patients between the ages of 50 and 75. (Red: Patients from Stanford. Green:Patients from Vanderbilt) (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.).

S.G. Finlayson et al. / Journal of Biomedical Informatics 60 (2016) 104–113 107

cohort selection, (2) outcomes analysis, and the (3) examination anddownload of raw data. We sought in developing the MRLU to notonly maximize usability, but also to minimize the risk of inadver-tent data dredging or selection bias (see Section 3).

(1) Cohort selection (Fig. 3) is the process of identifying a groupof patients matching particular characteristics of interest.In the RLS scenario, a clinician would describe to the MRLUthe characteristics of a patient for whom decision supportabout treatment is sought. The cohort is specified by meansof a series of drop-down, slider, and checkbox filters. Asinclusion criteria are tightened and/or relaxed via the userinterface, data are fetched from the MySQL database, andan R data frame is constructed on the server-side that con-sists of those patients who match the user-specified criteria.The data frame is manipulated and displayed using an Rplotting routine (ggplot2) in a series of five plots that depictthe sex, age, institution, and treatment distributions of thecurrent cohort.

Features to enable expressivity of the user in defining cohortsand analyzing their treatments and outcomes were heavily empha-sized in design decisions of the MRLU; even excluding age-basedinclusion criteria, the simple menu options allow for more than 8million combinations of inclusion criteria, outcome variables, andstratification variables. Nevertheless, we sought to maintain a verysimple and intuitive user interface. As such, most of the menuoptions are hidden from view by default and are only dynamicallygenerated when made relevant by other query options or openedexplicitly. Furthermore, much of the content of the filters are gen-erated from the contents of the data itself, which allows the engineto be robust to changes in the content of the database. For example,the addition of pharmacy data with previously unlisted drugs or

drug classes will not necessitate any adjustment to the interfaceor server code.

(2) Outcomes analysis (Fig. 4) of a given selected patient cohort isconducted using the R survival package [28,29]. The userselects an outcome variable in the MRLU system (survivalor time to next treatment) as well as a stratification variable(drug class, drug name, mutation type, BRAF status, NRASstatus, or sex) to be used in conjunction with the filtersdefined in the cohort selection step described previously.This input is then used to construct a Cox proportional haz-ards model [30]. The MRLU builds a model that is stratifiedbased on the user-defined stratification variable, that con-trols for all other variables available, and that right censorsall patients for whom the data were cut off (due to end ofobservation in the case of survival, and end of observationor death in the case of time to next treatment). The systemdisplays a Kaplan–Meier plot of the cohort, as well as thespecific R code used to generate the model. In addition, sum-mary statistics for each included class including median sur-vival and 95% confidence intervals are provided. The MRLUalso invokes the survival package’s implementation of theMantel–Haenszel test to evaluate the equality of the survivalcurves, and it reports the chi-square and p-value results ofthe statistical test, as well as the R code used to generatethe result. To lower the risk of inadvertent data dredging(see Discussion), the MRLU also displays a warning thatencourages users to conduct only one test at a time, and toavoid adjusting parameters post hoc in search of statisticalsignificance.

(3) Examination and download of raw data, as with cohort selec-tion above, is handled through a combination of R and Shinyfunctionality. The user is presented with menus to toggle

Page 5: Toward rapid learning in cancer treatment selection: An analytical

Fig. 4. Survival outcomes analysis. The MRLU displays Kaplan–Meier curves, summary statistics, chi-squared statistics and p-values, as well as the number of patients in eachstratified cohort. In this case, the MRLU shows survival plots of patients when stratifying on drug class for all female, NRAS – patients between the ages of 50 and 75. Patientswho were treated with immunotherapy have longer survival in this instance than those who received chemotherapy or therapeutic antibody, though this particular result isnot statistically significant.

108 S.G. Finlayson et al. / Journal of Biomedical Informatics 60 (2016) 104–113

between clinical and pharmaceutical data for those patientsthat match the inclusion criteria established in the cohortselection step. If viewing pharmaceutical data, the user canchoose between displaying all drug entries in the table ormerely the first entry for each unique drug used on a patient.By default, the tables are sorted according to patient identi-fication number, though the data can be sorted or searchedon the content of any column. Finally, if the user choosesto download the cohort to a file, the underlying R data frameis converted to CSV format before being sent to the user.

2.3. System evaluation

We conducted a preliminary evaluation of the MRLU focusingon two aspects of the system: (1) computational integrity, evi-denced by the ability of the MRLU to recapitulate known biomed-ical knowledge about Melanoma treatment effectiveness forparticular genetic mutations, and (2) clinician users’ experienceusing the system to glean information about Melanoma patientsin their clinical practice. For the first evaluation, we comparedthe output of the MRLU with known statistics from the biomedicalliterature. Specifically, we used MRLU to (1) evaluate the frequencyof BRAF positivity in Melanoma patients and to compare it withpublished facts, (2) to evaluate the response rate of BRAF+ Mela-noma patients to BRAF inhibitor therapy such as Vemurafenib,comparing that with published facts, and (3) to evaluate the differ-ences of frequency of Melanoma deaths according to age and gen-der [31–33].

For the second evaluation, 13 practicing physicians at the Stan-ford and Vanderbilt Cancer Centers used the system in the contextof a hypothetical patient. Each user completed a survey thatexplained how to use the system to build a cohort of similarpatients to the one under consideration and how to run analysesto compare potential therapies. The physicians were also asked aseries of questions regarding their views on Rapid Learning Sys-tems in general and their assessments of the utility of the MRLUin particular. They assessed the usability of the MRLU interface,and provided opinions on limitations of the system and needs forfuture work using Likert scale ratings. The survey itself is availablein the Supplemental Data. To assess inter-user agreement amongsurvey respondents, pairwise agreement percentages were com-puted using both raw Likert scores, as well as scores binned as neg-ative (Likert scores 1–2), neutral (Likert score 3), or positive (Likertscores 4–5). Pairwise agreement rates were computed over allcombinations of reviewers, as well as mean agreement rates foreach reviewers versus all other reviewers, mean agreement ratesfor each question, and mean agreement rate overall.

3. Results

3.1. System Implementation

A fully functional implementation of the MRLU was completed,loaded with data, and deployed on a live server, which is publiclyavailable at http://mrlu.stanford.edu:3838/mrlu/. We outline theworkflow of the system and results from the process ofimplementation.

Page 6: Toward rapid learning in cancer treatment selection: An analytical

Fig. 5. Survey results: RLS systems at large. Results from survey of 13 practicingoncologists from Stanford and Vanderbilt, assessing the anticipated benefit of RLSsystems in various settings. Respondents were asked to provide feedback on a Likertscale from 1 to 5, with 5 being the most positive. Lines indicate the mean rating foreach question.

S.G. Finlayson et al. / Journal of Biomedical Informatics 60 (2016) 104–113 109

Upon opening the browser, the user is presented with a set ofsummary plots that show the makeup of the entire dataset withrespect to sex, age, BRAF status, NRAS status, drug name and class(of first drug used), and hospital. A corresponding set of filtersenable the user to adjust the cohort to set specific patient charac-teristics. As filters are adjusted, the summary plots update in real-time. This allows the system, before any analysis is conducted, toleverage the judgment of the physician-user in selecting a cohortthat matches the patient as closely as possible while being cog-nizant of the number of patients in the cohort (so that judgmentis based on sufficient cohort size) as well as expected distributionsfor the variables not used for matching.

With cohort inclusion criteria established via the filters, theuser selects a stratification variable (drug name, drug class, muta-tion type, BRAF status, NRAF status, or sex) as well as the outcomevariable (survival or time to next treatment), over which theKaplan–Meier analysis will be performed as described in Section 2(Fig. 4). As noted above, summary statistics, confidence intervals,chi-square statistics, and p-values are all displayed in conjunctionwith the plot itself.

Once the preliminary analysis is completed, the cohort data ismade immediately available for browsing, searching, sorting, anddownload.

3.2. Evaluation 1: Confirmation of Known Clinical Trends

An initial assessment of the Melanoma Rapid Learning Utilityconfirmed that the MRLU successfully identified several knownclinical trends. This includes the frequency of BRAF mutations inthe Melanoma population [34], as well as a survival rate of approx-imately 50% in response to BRAF inhibitor therapy such as Vemu-rafenib among those patients whose tumors had a BRAFmutation [32]. Similarly, in accordance with the literature, the sys-tem shows that males account for roughly 56% of cases and 57% ofdeaths among patients under 65, whereas males account for 68% ofcases and 67% of deaths among those patients >65 years of age[35].

Furthermore, the physicians who used our system reported thatoverall survival outcomes are close to what would be expected fora metastatic Melanoma patient receiving current therapy withimmunotherapy and/or BRAF inhibitor therapy [32,33].

3.3. Evaluation 2: Physician survey

The physician feedback on both the vision for and the imple-mentation of the MRLU system was very favorable (Figs. 5–7,Supplemental Data).

When asked various questions regarding the benefit that wouldbe provided by Rapid Learning Systems in both clinical andresearch settings, the mean response was a score of 4.54 (CI:4.38–4.70) on a Likert scale from 1 to 5, 5 being the most positive(Fig. 5). The greatest enthusiasm (4.62/5, CI: 4.31–4.92) was for thebenefit that would be provided to non-academic centers that couldleverage data from academic centers for clinical decision support.

Assessment of the utility of the MRLU was also positive, with amean rating of 4.21/5 (CI: 4.00–4.41) (Fig. 6a). Enthusiasm wasgreatest for the MRLU in context of clinical research, with averageratings of 4.38/5 (CI: 3.99–4.78) in its current state, and 4.62/5 (CI:4.22–5.01) if only additional variables were made available. Whileresponses were also generally positive for the utility of the systemin a clinical setting, the mean ratings were lower in this context,4/5 (CI: 3.34–4.66) ‘‘as is,” and 4.23/5 (CI: 3.79–4.67) with addi-tional variables. When asked separately if the system currentlyimplements the analytic capabilities necessary for clinical usage,the mean score was 3.75/5 (CI: 3.27–4.23), which could indicatethat further capabilities would be requested by many clinicians

were the MRLU to be rolled out into clinical practice. Cliniciansdid, however, find the system usable, with a mean rating of4.42/5 (CI: 4.14 – 4.71) for interpretability and navigability(Fig. 6b). Taken together, these data (overall mean score 4.37/5,CI: 4.25–4.49) indicate that clinicians are generally eager to seeRapid Learning Systems in both the clinical and research settings,and view the progress made toward that aim by the MRLU as valu-able in its current state and potentially even more valuable withsimple additions to its underlying database.

We were particularly interested to gauge physician attitudestoward what would be required from a practice-based evidencetool like the MRLU in order for it to be trusted in practice(Fig. 7). When asked howmany patients matching their own wouldbe expected in order for their outcomes to be clinically useful,there was wide discordance: 23% of respondents indicated thatthey would use outcomes data from even one similar patient,whereas 30% indicated that they would expect more than 500patients. Strikingly, only one clinician felt that (s)he could onlyanswer this question if able to perform a power analysis on thedata. In contrast, expectations for confounding variables repre-sented in the database were much more uniform, with a majorityof clinicians indicated that they would want to see primary tumorcharacteristics, co-occurring medical conditions, location of metas-tases, and additional genetic features. However, the lack of featureswas generally not listed a major hindrance to the future clinicaladoption of these systems, with the strongest concern being anefficient workflow in using the system and, potential distrust ofthe data by clinicians.

Inter-rater reliability results were mixed, though generally pos-itive (Supplemental Data). While the rates of exact agreement weremoderately low for the raw 1–5 scale Likert data (overall mean0.41, CI 0.37–0.44), agreement rates were much higher when thescores were binned into negative, neutral, and positive ratings(mean agreement 0.80, CI: 0.75–0.85). This indicates that discor-dance among the respondents was likely due to the subjectivityinherent, for example, in distinguishing a positive rating of 4/5from an extremely positive rating of 5/5 – two responses which,taken together, comprise 87.3% of all recorded ratings. Neverthe-less, we do note that, unsurprisingly, the two questions with thelowest average rating also demonstrated the lowest rate of agree-ment. These questions, which asked respondents to rate the clini-cal utility of the MRLU in its current implementation, each hadonly 47% agreement of respondents on the 1–3 scale of binned

Page 7: Toward rapid learning in cancer treatment selection: An analytical

Fig. 6. Survey results: MRLU system. Survey results assessing (a) the utility of MRLU, and (b) the usability of the MRLU interface. Lines indicated mean ratings.

Fig. 7. Clinical adoption survey results. Breakdown of survey responses to questions pertaining to clinical adoption (questions were select all that apply with option to inputadditional custom ‘‘other” responses). ⁄Manual entry as ‘‘other” option.

110 S.G. Finlayson et al. / Journal of Biomedical Informatics 60 (2016) 104–113

scores. Notably, however, this discordance largely vanished (meanpairwise agreement 72%) when respondents were asked to assessthe clinical utility of the system if additional patient variables wereadded. Similarly, the distribution of one-versus-all reviewer agree-ment rates was somewhat left-skewed, with two reviewers havingaverage agreement rates more than two standard deviations belowthe mean. Given the presence of these apparent outliers, evalua-tions of future systems could be performed to study whetherreviewer characteristics such as age or computer-experience, otherreviewer factors such as practice attitudes, or mere chance drivesdiscordance.

4. Discussion

The MRLU we developed is a prototype of the analytical enginecomponent of a RLS for Melanoma (Fig. 1). This analytical engineenables physician users, who in many cases lack statistical pro-gramming skills or user-friendly tools that access clinical data, to

use primary clinical data to interactively build cohorts and run ret-rospective outcomes analyses tailored to patients like their own.Our results are promising in that our system was able to recapitu-late known facts about Melanoma in an observation cohort. Thisraises confidence that the MRLU may be useful clinically in provid-ing insight into the disease for patient characteristics for whichthere is not yet published evidence. In addition, the results of oursurvey suggest that our interface is usable and potentially clinicallyuseful. However, additional studies with a larger amount of patientdata and more detailed evaluation of potential clinical benefit willbe needed. In particular, the utility of the MRLU may also be lim-ited by the absence of a number of important clinical features inits database. Our clinical experts identified a number of variablesthat should be included in MRLU before it would be contemplatedfor clinical use. Since our system is extensible, adding such vari-ables would not be expected to be difficult if available in theEMR. Further, our user evaluation indicates that many clinicalpractitioners desired to use such systems.

Page 8: Toward rapid learning in cancer treatment selection: An analytical

S.G. Finlayson et al. / Journal of Biomedical Informatics 60 (2016) 104–113 111

4.1. MRLU in context of existing systems

The MRLU is complementary to recently developed enterprise-level platforms such as i2b2, STRIDE, RedCAP, and tranSMART [36–39]. These are data warehouse systems that also provide tools forintegrating and querying medical record and research data at alarge scale. They can fulfill an important role as one of the compo-nents of the RLS (Component A in Fig. 1); however, while some ofthese systems do provide analytics features, they have been gener-ally been designed and optimized for research rather than real-time use by physicians for rapid learning in a clinical setting[40]. Similarly, several industry-sponsored visual analytic toolshave also been designed to facilitate the transformation and navi-gation of electronic medical record data, as well as for visualizing‘‘care pathways” among similar patients in order to facilitate inter-vention planning (Component B in Fig. 1) [16–18]. These lattertools lack analyses needed by a RLS analytical engine (e.g.,treatment-stratified Kaplan–Meier, etc.).

Another effort related to our work is the CancerLinQ project,which has created a system based on aggregating primary clinicaldata from a group of community oncology practices [15]. Their sys-tem provides many features of the RLS (Components A, B, and C inFig. 1), but is still in the early stages of development. Its analyticalengine only permits rudimentary data exploration, such as provid-ing histograms of patient characteristics, and it lacks the types ofanalytical capabilities we have developed in MRLU, such as com-parative survival analysis in user-defined patient cohorts. Proto-type efforts such as our MRLU could inform further developmentof efforts such as CancerLinQ. Similar, disease-specific RLS toolslike the MRLU have also been reported for other diseases, includingfor lung cancer [41].

4.2. Statistical concerns

Interactive data analysis for real-time clinical decision supportis complicated by a range of statistical challenges. As highlightedabove, we designed the MRLU analytical engine for flexibility—the user can easily adjust almost all aspects of the analysis. In mak-ing this design decision, the MRLU can accommodate a wide rangeof clinical questions that can be quickly posed by the user andanswered in near real-time. However, this flexibility raises the riskof inadvertent errors in judgment by users lacking statisticalexpertise. Furthermore, as discussed above, our survey corrobo-rates previous work suggesting that many clinicians may not beextremely concerned by statistical limitations while using thesesystems [12]. As such, before deploying tools such as ours in theclinic, careful considerations should be made to ensure that statis-tical models are appropriately designed and interpreted.

The first statistical challenge faced by the MRLU arises duringcohort selection in the form of the bias-variance tradeoff. In orderfor models built with the MRLU to be relevant to a given patient,selection criteria must be applied to avoid high bias. Conversely,excessive selection criteria will lead to overfit models of high vari-ance. This tradeoff is particularly complicated to manage in thecase of Melanoma decision support, as even the largest institutionswill have relatively small, yet still widely heterogeneous, datasets.Ultimately this problem can only be addressed by integrating datafrom many participating institutions, which is the vision for alllearning health systems [7].

In order to assist the user in managing the bias-variance trade-off as effectively as possible, the MRLU’s summary plots on thecohort selection page (Fig. 3) convey descriptive information. Asprospective users apply filters to narrow in on patients of a targetdemographic, plots depict the makeup of the cohort they havebuilt. A note encourages the physician to confirm that the cohortboth reflects the characteristics of the patient and fits with his or

her professional judgment. Nevertheless, it is impossible to providesufficient automation to completely preclude the inadvertent mis-use of the system by inexperienced users.

Similarly, the system provides no advanced statistical analysesfor managing potential pitfalls such as outlier effects and violationsof the proportional hazards assumption. As such, these considera-tions can only be managed through the manual inspection and/oranalysis of the cohort data, and the MRLU thus provides this capa-bility. Though our interface is designed to make this as easy as pos-sible, physicians will likely not have the time nor the expertise toconduct such an analysis in a clinical setting. In addition, whileour analytical engine seeks to limit confounders by controllingfor all variables other than the stratification variable, the inherentcomplexity of data gathering in routine care leaves open the possi-bility of other forms of latent confounding [42].

Another risk inherent within any interactive statistical analysisplatform is that of inadvertent data dredging. A user of the MRLUcan quickly examine many clinical hypotheses in a single sitting.Due to the principle of multiple hypothesis testing, this ease ofuse can be dangerous, since it could enable a user running suffi-cient queries to find statistically significant trends by chance[43]. While the risk of data dredging can never be fully eliminated,the physical separation of the cohort selection page from that ofthe outcomes analysis was designed to encourage users to care-fully formulate a cohort for a single question of interest before see-ing whether or not significant results will be obtained. The MRLUproduces a warning to users about multiple testing, encouragingthem to utilize the tool for one test at a time, and warning themof the false-positive risk in rapid and repeated adjustments ofthe query in search of statistical significance. Nevertheless, proba-bly the best way to prevent statistical misuse of analytical enginessuch as MRLU is to engage experts in interpreting the analyses.

To further limit the statistical risks inherent in rapid learning,we will present, in a future work, a hybrid system in which a rapidMRLU analytical engine will be linked to a series of interfaces,some of which will allow expert statisticians to design statisticalmodels, others of which will allow physicians to apply these mod-els to their patients without the capacity to define patient cohortsor new models themselves.

4.3. Looking forward

Broader philosophical and legal questions that surround RLSpertain to where exactly the output of these tools should standwithin the existing hierarchy of evidence-based practice[11,12,44,45]. The results from controlled clinical trials providethe highest, most reliable evidence, while retrospective, observa-tional evidence can be less reliable. Rapid learning tools based onengines such as the MRLU are no substitute for traditional formsof medical evidence when they exist and are relevant to the patientof interest. In the future, we plan to implement in MRLU a mecha-nism to identify and present clinical guidelines and prominent arti-cles from the literature that relate to the user’s query. However, inthe (common) circumstance of cancer patients for which there isno clinical trial evidence available, the MRLU could be particularlyuseful, by leveraging evidence from clinical practice to provideinformation to physicians that may help in making treatment deci-sions. While our survey indicates that RLS analytical engines suchas the MRLU would likely see clinical use, prospective researchstudies that relate their use to clinical outcomes will ultimatelybe needed to show clinical value.

As highlighted in Section 4.2, an increasingly important compo-nent of systems such as MRLU will be the ability to expand thenumber of patients and data types used for analyses across manyinstitutions. We are currently expanding and generalizing MRLUto address this challenge by building a scalable, decentralized

Page 9: Toward rapid learning in cancer treatment selection: An analytical

112 S.G. Finlayson et al. / Journal of Biomedical Informatics 60 (2016) 104–113

system. In this system, a front-end interface derived from theMRLU will be used to execute distributed statistical models usingdata from participating institutions. A server at each institutionwill compute intermediate statistical results on local data, andcomputational results will be sent back to the coordinating server.This approach will allow centers to participate in inter-institutional computations without sharing any granular patientdata. Each site would do a one-time mapping of certain key patientdata fields to those used by the system, and this could expand overtime to include new data types in future. The menus and utilities inthe system that use these fields would dynamically update basedon the data types available from the connected institutions. Thissystem could scale up to including many patients as more sitesparticipate, and these institutions would have the freedom to with-draw at any time.

Finally, while the MRLU was developed specifically for use inMelanoma, the key functionality – integrating genetic variants,treatments, and survival outcomes – is relevant to many types ofcancer (and other disease). As such, small adaptations to thecovariates stored in and analyzed by the system would allow itto scale across cancer types. Since the menus and model can easilybe adapted to fit the data at hand, the rate-limiting steps in suchadaptation would almost certainly be data acquisition and clinicianinterest.

Our MRLU is just a portion of the complete RLS (Components Cand D in Fig. 1). Clearly, the other components are needed, and theMRLU must be combined with the other infrastructure in order torealize the RLS. On the other hand, we believe our results provideuseful insights into design considerations, feasibility and potentialutility of the analytical engine component of the RLS.

5. Conclusion

The MRLU is an analytical engine and user interface that repre-sents a component of the RLS. It can provide real-time, data-drivenclinical decision support for Melanoma treatment planning. In apreliminary evaluation, the MRLU successfully recapitulatedknown biomedical knowledge about Melanoma treatment, and itshowed promise for clinical utility when used by oncologists.Given its flexible architecture, it is extensible to other types of can-cer and to incorporating more and richer data for greater futureclinical utility in the future. We plan to incorporate the MRLU intothe rest of the learning system infrastructure and may ultimatelyenable EHR-driven evidence to be incorporated into medicalpractice.

Conflict of Interest

None.

Acknowledgments

This project has been funded from National Cancer Institute,National Institutes of Health, under Grant U01CA142555 and seedgrant from the Big Data for Human Health Stanford University andOxford University. This project was also supported by award Num-ber T32GM007753 from the National Institute of General MedicalSciences. The content is solely the responsibility of the authorsand does not necessarily represent the official views of theNational Institute of General Medical Sciences or the NationalInstitutes of Health.

Philip Lavori, PhD and Balasubramanian Narasimhan, PhD fromStanford University provided consultation in the development ofthe MRLU. Vanessa Sochat, Linda Szabo, and Luke Yancy Jr. fromStanford University also provided valuable collaboration on a

previous rapid-learning prototype. Finally, Denis Agniel providedhelpful feedback as we designed the inter-rater reliability analysis.

Appendix A. Supplementary material

Supplementary data associated with this article can be found, inthe online version, at http://dx.doi.org/10.1016/j.jbi.2016.01.005.

References

[1] L.M. Etheredge, A rapid-learning health system, Health Aff. (Millwood) 26(2007) w107–w118, http://dx.doi.org/10.1377/hlthaff.26.2.w107.

[2] A.P. Abernethy, L.M. Etheredge, P.A. Ganz, P. Wallace, R.R. German, C. Neti,et al., Rapid-learning system for cancer care, J. Clin. Oncol. 28 (2010) 4268–4274, http://dx.doi.org/10.1200/JCO.2010.28.5478.

[3] P. Lambin, E. Roelofs, B. Reymen, E.R. Velazquez, J. Buijsen, C.M.L. Zegers, et al.,‘‘Rapid learning health care in oncology” – an approach towards decisionsupport systems enabling customised radiotherapy, Radiother. Oncol. 109(2013) 159–164, http://dx.doi.org/10.1016/j.radonc.2013.07.007.

[4] A.R. Shaikh, A.J. Butte, S.D. Schully, W.S. Dalton, M.J. Khoury, B.W. Hesse,Collaborative biomedicine in the age of big data: the case of cancer, J. Med.Internet Res. 16 (2014) e101, http://dx.doi.org/10.2196/jmir.2496.

[5] J. Shrager, J.M. Tenenbaum, Rapid learning for precision oncology, Nat. Rev.Clin. Oncol. 11 (2014) 109–118, http://dx.doi.org/10.1038/nrclinonc.2013.244.

[6] M. Smith, R. Saunders, L. Stuckhardt, J.M. McGinnis, America C on the LHCS in,Medicine I of. A Continuously Learning Health Care System, 2013.

[7] C.P. Friedman, A.K. Wong, D. Blumenthal, Achieving a nationwide learninghealth system, Sci. Transl. Med. 2 (2010) 57cm29, http://dx.doi.org/10.1126/scitranslmed.3001456.

[8] F.R. Margison, Measurement and psychotherapy: evidence-based practice andpractice-based evidence, Br. J. Psychiatry 177 (2000) 123–130, http://dx.doi.org/10.1192/bjp.177.2.123.

[9] R.A. Nordyke, C.A. Kulikowski, An informatics-based chronic disease practice:case study of a 35-year computer-based longitudinal record system, J. Am.Med. Inform. Assoc. 5 (1998) 88–103.

[10] S. Ebadollahi, J. Sun, D. Gotz, J. Hu, D. Sow, C. Neti, Predicting patient’strajectory of physiological data using temporal trends in similar patients: asystem for near-term prognostics, in: AMIA Annu Symp Proc, vol. 2010, 2010,pp. 192–196.

[11] J. Frankovich, C.A. Longhurst, S.M. Sutherland, Evidence-based medicine in theEMR era, N. Engl. J. Med. 365 (2011) 1758–1759, http://dx.doi.org/10.1056/NEJMp1108726.

[12] C.A. Longhurst, R.A. Harrington, N.H. Shah, A ‘‘green button” for usingaggregate patient data at the point of care, Health Aff. (Millwood) 33 (2014)1229–1235, http://dx.doi.org/10.1377/hlthaff.2014.0099.

[13] R. Finn, Surveys identify barriers to participation in clinical trials, J. Natl.Cancer Inst. 92 (2000) 1556–1558, http://dx.doi.org/10.1093/jnci/92.19.1556.

[14] S.L. Elson, R.A. Hiatt, H. Anton-Culver, L.P. Howell, A. Naeim, B.A. Parker, et al.,The athena breast health network: developing a rapid learning system inbreast cancer prevention, screening, treatment, and care, Breast Cancer Res.Treat. 140 (2013) 417–425, http://dx.doi.org/10.1007/s10549-013-2612-0.

[15] G.W. Sledge, R.S. Miller, R. Hauser, CancerLinQ and the future of cancer care,Am. Soc. Clin. Oncol. Educ. Book (2013) 430–434, http://dx.doi.org/10.1200/EdBook_AM.2013.33.430.

[16] Catherine Plaisant RMASJLDHBSKPC. LifeLines: Using Visualization to EnhanceNavigation and Analysis of Patient Records, n.d.

[17] D. Gotz, K. Wongsuphasawat, Interactive intervention analysis, in: AMIA AnnuSymp Proc, vol. 2012, 2012, pp. 274–280.

[18] A. Perer, D. Gotz, Data-driven exploration of care plans for patients, in: CHI ’13Ext. Abstr. Hum. Factors Comput. Syst. – CHI EA ’13, New York, ACM Press,New York, USA, 2013, p. 439, http://dx.doi.org/10.1145/2468356.2468434.

[19] A. Jemal, E.P. Simard, C. Dorell, A.-M. Noone, L.E. Markowitz, B. Kohler, et al.,Annual report to the nation on the status of cancer, 1975–2009, featuring theburden and trends in human papillomavirus (HPV)-associated cancers andHPV vaccination coverage levels, JNCI J. Natl. Cancer Inst. 105 (2013) 175–201,http://dx.doi.org/10.1093/jnci/djs491.

[20] M. Yancovitz, A. Litterman, J. Yoon, E. Ng, R.L. Shapiro, R.S. Berman, et al., Intra-and inter-tumor heterogeneity of BRAF(V600E))mutations in primary andmetastatic melanoma, PLoS ONE 7 (2012) e29336, http://dx.doi.org/10.1371/journal.pone.0029336.

[21] A.M. Menzies, L.E. Haydu, M.S. Carlino, M.W.F. Azer, P.J.A. Carr, R.F. Kefford,et al., Inter- and intra-patient heterogeneity of response and progression totargeted therapy in metastatic melanoma, PLoS ONE 9 (2014) e85004, http://dx.doi.org/10.1371/journal.pone.0085004.

[22] J.L. Ashton, C.M. Lovly, Anticancer agents. My cancer genome, n.d. <http://www.mycancergenome.org/content/other/molecular-medicine/anticancer-agents> (accessed 27.02.14).

[23] W. Chang, J. Cheng, J.J. Allaire, Y. Xie, J. McPherson, shiny: Web ApplicationFramework for R, 2015.

[24] on Databases RSIG. DBI: R Database Interface, 2014.[25] J. Ooms, D. James, S. DebRoy, H. Wickham, J. Horner, RMySQL: Database

Interface and ‘‘MySQL” Driver for R, 2015.

Page 10: Toward rapid learning in cancer treatment selection: An analytical

S.G. Finlayson et al. / Journal of Biomedical Informatics 60 (2016) 104–113 113

[26] H. Wickham, ggplot2: Elegant Graphics for Data Analysis, Springer, New York,2009.

[27] I. RStudio, Shiny Server, n.d. <https://www.rstudio.com/products/rstudio/#Server (accessed 17.09.15).

[28] Terry M. Therneau, Patricia M. Grambsch, Modeling Survival Data: Extendingthe {C}ox Model, Springer, New York, 2000.

[29] T.M. Therneau, A Package for Survival Analysis in S, 2015.[30] J. Fox, Cox proportional-hazards regression for survival data. Appendix to An R

and S-PLUS Companion to Applied Regression, 2002. <http://StatEthzch/CRAN/doc/contrib/Fox-Companion/appendix-Cox-Regression.Pdf> (accessedApril 2013).

[31] J.A. Curtin, J. Fridlyand, T. Kageshita, H.N. Patel, K.J. Busam, H. Kutzner, et al.,Distinct sets of genetic alterations in melanoma, N. Engl. J. Med. 353 (2005)2135–2147, http://dx.doi.org/10.1056/NEJMoa050092.

[32] K.T. Flaherty, I. Puzanov, K.B. Kim, A. Ribas, G.A. McArthur, J.A. Sosman, et al.,Inhibition of mutated, activated BRAF in metastatic melanoma, N. Engl. J. Med.363 (2010) 809–819, http://dx.doi.org/10.1056/NEJMoa1002011.

[33] C.M. Balch, J.E. Gershenwald, S.-J. Soong, J.F. Thompson, S. Ding, D.R. Byrd,et al., Multivariate analysis of prognostic factors among 2,313 patients withstage III melanoma: comparison of nodal micrometastases versusmacrometastases, J. Clin. Oncol. 28 (2010) 2452–2459, http://dx.doi.org/10.1200/JCO.2009.27.1627.

[34] C.M. Balch, J.E. Gershenwald, S.-J. Soong, J.F. Thompson, M.B. Atkins, D.R. Byrd,et al., Final version of 2009 AJCC melanoma staging and classification,J. Clin. Oncol. 27 (2009) 6199–6206, http://dx.doi.org/10.1200/JCO.2009.23.4799.

[35] D. Polsky, S.Q. Wang, A. Hendi, C. Kim, J. Karen, Skin cancer facts. Ski cancerfound, n.d. <http://www.skincancer.org/skin-cancer-information/skin-cancer-facts#Melanoma (accessed 1003.13).

[36] S.N. Murphy, M.E. Mendis, D.A. Berkowitz, I. Kohane, H.C. Chueh, Integration ofclinical and genetic data in the i2b2 architecture, in: AMIA Annu Symp Proc2006, p. 1040

[37] H.J. Lowe, T.A. Ferris, P.M. Hernandez, S.C. Weber, STRIDE – an integratedstandards-based translational research informatics platform, in: AMIA AnnuSymp Proc, vol. 2009, 2009, pp. 391–395.

[38] P.A. Harris, R. Taylor, R. Thielke, J. Payne, N. Gonzalez, J.G. Conde, Researchelectronic data capture (REDCap) – a metadata-driven methodology andworkflow process for providing translational research informatics support, J.Biomed. Inform. 42 (2009) 377–381, http://dx.doi.org/10.1016/j.jbi.2008.08.010.

[39] S. Szalma, V. Koka, T. Khasanova, E.D. Perakslis, Effective knowledgemanagement in translational medicine, J. Transl. Med. 8 (2010) 68, http://dx.doi.org/10.1186/1479-5876-8-68.

[40] V. Canuel, B. Rance, P. Avillach, P. Degoulet, A. Burgun, Translational researchplatforms integrating clinical and omics data: a review of publicly availablesolutions, Brief. Bioinform. 16 (2015) 280–290, http://dx.doi.org/10.1093/bib/bbu006.

[41] A. Dekker, S. Vinod, L. Holloway, C. Oberije, A. George, G. Goozee, et al., Rapidlearning in practice: a lung cancer survival decision support system in routinepatient care data, Radiother. Oncol. 113 (2014) 47–53, http://dx.doi.org/10.1016/j.radonc.2014.08.013.

[42] M.A. Brookhart, T. Stürmer, R.J. Glynn, J. Rassen, S. Schneeweiss, Confoundingcontrol in healthcare database research, Med. Care 48 (2010) S114–S120,http://dx.doi.org/10.1097/MLR.0b013e3181dbebe3.

[43] K.L. Sainani, The problem of multiple testing, PM&R 1 (2009) 1098–1103,http://dx.doi.org/10.1016/j.pmrj.2009.10.004.

[44] R.L. Schilsky, D.L. Michels, A.H. Kearbey, P.P. Yu, C.A. Hudis, Building a rapidlearning health care system for oncology: the regulatory framework ofCancerLinQ, J. Clin. Oncol. 32 (2014) 2373–2379, http://dx.doi.org/10.1200/JCO.2014.56.2124.

[45] G.S. Ginsburg, N.M. Kuderer, Comparative effectiveness research, genomics-enabled personalized medicine, and rapid learning health care: a commonbond, J. Clin. Oncol. 30 (2012) 4233–4242, http://dx.doi.org/10.1200/JCO.2012.42.6114.