Big data, big knowledge: big data for personalized healthcare

Post on 16-Feb-2017







    Big Data, Big Knowledge: Big Data for Personalized Healthcare

    Marco Viceconti, Peter Hunter, and Rod Hose

    Abstract: The idea that the purely phenomenological knowledge that we can extract by analyzing large amounts of data can be useful in healthcare seems to contradict the desire of VPH researchers to build detailed mechanistic models for individual patients. But in practice no model is ever entirely phenomenological or entirely mechanistic. We propose in this position paper that big data analytics can be successfully combined with VPH technologies to produce robust and effective in silico medicine solutions. In order to do this, big data technologies must be further developed to cope with some specific requirements that emerge from this application. Such requirements are: working with sensitive data; analytics of complex and heterogeneous data spaces, including nontextual information; distributed data management under security and performance constraints; specialized analytics to integrate bioinformatics and systems biology information with clinical observations at tissue, organ, and organism scales; and specialized analytics to define the physiological envelope during the daily life of each patient. These domain-specific requirements suggest a need for targeted funding, in which big data technologies for in silico medicine become the research priority.

    Index Terms: Big data, healthcare, virtual physiological human.


    The birth of big data, as a concept if not as a term, is usually associated with a META Group report by Doug Laney entitled "3-D Data Management: Controlling Data Volume, Velocity, and Variety," published in 2001 [1]. Further developments now suggest big data problems are identified by the so-called "5V": volume (quantity of data), variety (data from different categories), velocity (fast generation of new data), veracity (quality of the data), and value (in the data) [2].

    For a long time the development of big data technologies was inspired by business intelligence [3] and by big science (such as the Large Hadron Collider at CERN) [4]. But when in 2009 Google Flu, simply by analyzing Google queries, predicted flu-like illness rates as accurately as the CDC's enormously complex and expensive monitoring network [5], some analysts started to claim that all problems of modern healthcare could be solved by big data [6].

    Manuscript received September 29, 2014; revised December 14, 2014; accepted February 6, 2015. Date of publication February 24, 2015; date of current version July 23, 2015.

    M. Viceconti is with the VPH Institute for Integrative Biomedical Research, and the Insigneo Institute for In Silico Medicine, University of Sheffield, Sheffield S1 3JD, U.K. (e-mail:

    P. Hunter is with the Auckland Bioengineering Institute, University of Auckland, 1010 Auckland, New Zealand (e-mail:

    R. Hose is with the Insigneo Institute for In Silico Medicine, University of Sheffield, Sheffield S1 3JD, U.K. (e-mail:

    Digital Object Identifier 10.1109/JBHI.2015.2406883

    In 2005, the term virtual physiological human (VPH) was introduced to indicate a framework of methods and technologies that, once established, will make possible the collaborative investigation of the human body as a single complex system [7], [8]. The idea was quite simple:

    1) To reduce the complexity of living organisms, we decompose them into parts (cells, tissues, organs, organ systems) and investigate one part in isolation from the others. This approach has produced, for example, the medical specialties, where the nephrologist looks only at your kidneys, and the dermatologist only at your skin; this makes it very difficult to cope with multiorgan or systemic diseases, to treat multiple diseases (so common in the ageing population), and in general to unravel systemic emergence due to genotype-phenotype interactions.

    2) But if we can recompose with computer models all the data and all the knowledge we have obtained about each part, we can use simulations to investigate how these parts interact with one another, across space and time and across organ systems.

    Though this may be conceptually simple, the VPH vision contains a tremendous challenge, namely, the development of mathematical models capable of accurately predicting what will happen to a biological system. To tackle this huge challenge, multifaceted research is necessary: around medical imaging and sensing technologies (to produce quantitative data about the patient's anatomy and physiology) [9]-[11], data processing to extract from such data information that in some cases is not immediately available [12]-[14], biomedical modeling to capture the available knowledge into predictive simulations [15], [16], and computational science and engineering to run huge hypermodels (orchestrations of multiple models) under the operational conditions imposed by clinical usage [17]-[19]; see also the special issue entirely dedicated to multiscale modeling [20].

    But the real challenge is the production of that mechanistic knowledge: quantitative, defined over space, time, and multiple space-time scales, and capable of being predictive with sufficient accuracy. After ten years of research this has produced a complex impact scenario in which a number of target applications, where such knowledge was already available, are now being tested clinically; some examples of VPH applications that reached the clinical assessment stage are:

    1) The VPHOP consortium developed a multiscale modeling technology based on conventional diagnostic imaging methods that makes it possible, in a clinical setting, to predict for each patient the strength of their bones, how this strength is likely to change over time, and the probability that they will overload their bones during daily life. With these three predictions, the evaluation of the absolute risk of bone fracture in patients affected by osteoporosis will be much more accurate than any prediction based on external and indirect determinants, as it is in current clinical practice [21].

    2168-2194 © 2015 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See standards/publications/rights/index.html for more information.

    2) More than 500 000 end-stage renal disease patients in Europe live on chronic intermittent haemodialysis treatment. A successful treatment critically depends on a well-functioning vascular access, a surgically created arterio-venous shunt used to connect the patient circulation to the artificial kidney. The ARCH project aimed to improve the outcome of vascular access creation and long-term function with an image-based, patient-specific computational modeling approach. ARCH developed patient-specific computational models for vascular surgery that make it possible to plan such surgery in advance on the basis of the patient's data, and to obtain a prediction of the vascular access function outcome, allowing an optimization of the surgical procedure and a reduction of associated complications such as nonmaturation. A prospective study is currently running, coordinated by the Mario Negri Institute in Italy. Preliminary results on 63 patients confirm the efficacy of this technology [22].

    3) Percutaneous coronary intervention (PCI) guided by fractional flow reserve (FFR) is superior to standard assessment alone to treat coronary stenosis. FFR-guided PCI results in improved clinical outcomes, a reduction in the number of stents implanted, and reduced cost. However, currently FFR is used in few patients, because it is invasive and it requires special instrumentation. A less invasive FFR would be a valuable tool. The VirtuHeart project developed a patient-specific computer model that accurately predicts myocardial FFR from angiographic images alone, in patients with coronary artery disease. In a phase 1 study the methods showed an accuracy of 97%, when compared to standard FFR [23]. A similar approach, but based on computed tomography imaging, is at an even more advanced stage, having recently completed a phase 2 trial [24].
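To make the kind of comparison in item 3 concrete: FFR is conventionally defined as the ratio of mean pressure distal to a stenosis to mean aortic pressure under hyperaemia, and a lesion is commonly treated as physiologically significant when FFR is at or below 0.80. The Python sketch below is not the VirtuHeart method. It only illustrates, under those assumptions, how model-predicted FFR values could be scored against invasive measurements as a binary classification. The function names, the 0.80 default threshold, and all sample values are hypothetical and are not data from [23].

```python
from typing import Sequence


def ffr(p_distal: float, p_aortic: float) -> float:
    """Return FFR as mean distal pressure over mean aortic pressure.

    Both pressures must be in the same units (e.g. mmHg).
    """
    if p_aortic <= 0:
        raise ValueError("aortic pressure must be positive")
    return p_distal / p_aortic


def diagnostic_accuracy(
    predicted: Sequence[float],
    measured: Sequence[float],
    threshold: float = 0.80,  # assumed clinical cut-off, not taken from the paper
) -> float:
    """Fraction of lesions where predicted and measured FFR agree on
    which side of the threshold the lesion falls."""
    if len(predicted) != len(measured) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    agree = sum(
        (p <= threshold) == (m <= threshold)
        for p, m in zip(predicted, measured)
    )
    return agree / len(predicted)


# Made-up example values, for illustration only.
measured = [ffr(68.0, 90.0), ffr(85.0, 92.0), ffr(70.0, 95.0), ffr(88.0, 93.0)]
predicted = [0.74, 0.90, 0.82, 0.95]
print(diagnostic_accuracy(predicted, measured))  # 0.75
```

In this toy data set the third lesion is measured as significant but predicted as not significant, so three of four classifications agree.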

    While these and some other VPH projects have reached the clinical assessment stage, quite a few other projects are still in the technological development, or preclinical assessment phase. But in some cases the mechanistic knowledge currently available simply turned out to be insufficient to develop clinically relevant models.

    So it is perhaps not surprising that recently, especially in the area of personalized healthcare (so promising but so challenging), some people have started to advocate the use of big data technologies as an alternative approach, in order to reduce the complexity that developing reliable, quantitative mechanistic knowledge involves.

    This trend is fascinating from an epistemological point of view. The VPH was born around the need to overcome the limitations of a biology founded on the collection of a huge amount of observational data, frequently affected by considerable noise, and boxed into a radical reductionism that prevented most researchers from looking at anything bigger than a single cell [25], [26]. Suggesting that we revert to a phenomenological approach where a predictive model is supposed to emerge not from mechanistic theories but by only doing high-dimensional big data analysis, may be perceived by some as a step toward that empiricism the VPH was created to overcome.

    In the following, we will explain why the use of big data methods and technologies could actually empower and strengthen current VPH approach

