Transcript
Page 1: Healthcare Data Analytics Implementation

INNOVATE INTEGRATE TRANSFORM

March 31, 2016

PREDICTIVE ANALYTICS IN HEALTHCARE

Page 2: Healthcare Data Analytics Implementation

©ALTEN Calsoft Labs

Overview

HIMSS describes healthcare analytics as the “systematic use of data and related clinical and business (C&B) insights developed through applied analytical disciplines such as statistical, contextual, quantitative, predictive, and cognitive spectrums to drive fact-based decision making for planning, management, measurement and learning.”

Objectives

o Healthcare providers are improving the clinical outcomes of patients via treatments and protocols

o Promotion of wellness and disease management

The key objective of healthcare analytics is to gain insight for making informed healthcare decisions.

Page 3: Healthcare Data Analytics Implementation

Patient Re-admission within 30 days – A case study

Business Problem

A healthcare facility wants to estimate the chance that a patient will be re-admitted within 30 days of discharge.

Benefits

Clinicians can prepare better post-discharge care for patients who are likely to be re-admitted, and hospitals can obtain benefits from the Government.

Page 4: Healthcare Data Analytics Implementation

Definitions

Patient admission: If a patient is hospitalized for more than 24 hrs, it is considered a patient admission.

Re-Admission: The patient is admitted for more than 24 hrs within 30 days of the last discharge date. If a patient comes back to the hospital after 30 days, it is not considered a re-admission.
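The two definitions on this slide can be sketched as a small labelling routine. This is only an illustration in standard-library Python: the function names and the (admit, discharge) tuple layout are assumptions, and so is the choice to ignore visits of 24 hrs or less when tracking the last discharge date.

```python
from datetime import datetime, timedelta

MIN_STAY = timedelta(hours=24)        # a stay longer than 24 hrs counts as an admission
READMIT_WINDOW = timedelta(days=30)   # measured from the last discharge date


def is_admission(admit, discharge):
    """True when the hospital stay lasted more than 24 hrs."""
    return discharge - admit > MIN_STAY


def flag_readmissions(stays):
    """Label each qualifying stay of ONE patient as a re-admission or not.

    stays: list of (admit, discharge) datetime pairs (hypothetical layout).
    Returns a list of (admit, discharge, is_readmission), ordered by admit time.
    Assumption: visits of 24 hrs or less are dropped before labelling.
    """
    admissions = sorted(s for s in stays if is_admission(*s))
    labelled = []
    last_discharge = None
    for admit, discharge in admissions:
        readmitted = (
            last_discharge is not None
            and admit - last_discharge <= READMIT_WINDOW
        )
        labelled.append((admit, discharge, readmitted))
        last_discharge = discharge
    return labelled


example_stays = [
    (datetime(2016, 1, 1, 9), datetime(2016, 1, 5, 9)),     # first admission
    (datetime(2016, 1, 20, 9), datetime(2016, 1, 23, 9)),   # 15 days after discharge
    (datetime(2016, 4, 1, 9), datetime(2016, 4, 4, 9)),     # more than 30 days later
    (datetime(2016, 4, 10, 10), datetime(2016, 4, 10, 20)), # 10 hr visit, not an admission
]
labels = [flag for _, _, flag in flag_readmissions(example_stays)]
```

In this example only the second stay is labelled as a re-admission; the April stay falls outside the 30-day window and the 10-hour visit is not an admission at all.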

Page 5: Healthcare Data Analytics Implementation

Data Analysis Process

The figure below shows the typical process of data analysis on a dataset.

1. Receive the Datasets (.csv)

2. Process the Datasets for Analysis

3. Analyse the Datasets

4. Build the Model

5. Visualize the Analysed data

Page 6: Healthcare Data Analytics Implementation

The Predictors – Predictive Analytics

To predict re-admission, the following data fields/predictors were considered:

o Demographics – Age, Sex

o Lab data – Includes lab tests

o Vitals – Includes BP, Sugar, Weight, etc.

o Visit types – Emergency, In-patient, Outpatient

o Diagnosis – Diseases/ailments – Heart, Pneumonia

o Previous hospital visit

o Length of stay

Page 7: Healthcare Data Analytics Implementation

Data Processing …

The data was received as a set of .csv files giving the complete details of demographics, admissions, vitals and lab tests for a selected sample of patients over a period of time.

The processing of the data included the following activities:

o Removing commas and uploading the .csv files to HDFS (Hortonworks)

o Writing the required DDL scripts in Hive

o Writing the necessary joins

o Producing the refined datasets

The refined datasets are passed on to the Data Analysis team for analysis.
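The deck performs this step with Hive on HDFS. Purely to illustrate the idea of cleaning the .csv input and joining the sources into a refined dataset, here is a small stand-in using only the Python standard library. The column names (patient_id, age, sex, visit_type, charges) and the sample rows are hypothetical, and "removing commas" is interpreted here as stripping commas embedded inside field values.

```python
import csv
import io


def read_clean(csv_text):
    """Parse CSV text and strip commas embedded inside field values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{key: value.replace(",", "") for key, value in row.items()}
            for row in reader]


def inner_join(left, right, key):
    """Inner join of two lists of row dicts on a shared key column."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    return [{**l_row, **r_row}
            for l_row in left
            for r_row in index.get(l_row[key], [])]


# Hypothetical sample inputs standing in for the received .csv files.
demographics_csv = "patient_id,age,sex\n1,64,F\n2,51,M\n"
admissions_csv = (
    "patient_id,visit_type,charges\n"
    '1,Emergency,"1,200"\n'
    '1,In-patient,"3,450"\n'
    "3,Outpatient,80\n"
)

refined = inner_join(read_clean(demographics_csv),
                     read_clean(admissions_csv),
                     "patient_id")
```

Only patient 1 appears in both inputs, so the refined dataset holds that patient's two visits with the embedded commas removed from the charges field.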

Page 8: Healthcare Data Analytics Implementation

Predictive Analysis Process

Use the Module – Analyse the dataset

Identify the Suitable Algorithm

Build the model

Evaluate/Deploy the model

Monitor/Refactor the model

Page 9: Healthcare Data Analytics Implementation

Datasets

The refined datasets are divided into train and test datasets in order to build the Model

[Figure: chart of the Train/Test split, with segments labelled 30% and 70%]
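A train/test split of this kind can be sketched in a few lines. The version below uses only the standard library and assumes the common convention of 70% for training and 30% for testing; the slide does not state which share is which, and the function name and fixed seed are illustrative choices.

```python
import random


def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle the rows reproducibly and split them into (train, test).

    Assumption: 30% of the rows are held out for testing.
    """
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    n_test = round(len(rows) * test_fraction)
    return rows[n_test:], rows[:n_test]


train_rows, test_rows = train_test_split(range(10))
```

Shuffling before the cut avoids a split that simply follows the order in which the records were stored.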

Page 10: Healthcare Data Analytics Implementation

Evaluate Models

The best model was selected by testing the data with different classifiers and calculating the precision, recall and F1 score for each classifier.

o Gradient Boosting

o Random Forest

o Support Vector Machines

o Logistic Regression

o K-Nearest Neighbor

o Ridge
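For reference, the three metrics named on this slide can be computed from binary labels as follows. This is a plain-Python sketch, not the deck's evaluation code; the label 1 is assumed to mean "re-admitted", and the sample labels are made up for illustration.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels, 1 = positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Made-up labels: 3 true positives, 1 false positive, 1 false negative.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)
```

Computing the same three numbers for every classifier in the list gives a like-for-like basis for choosing between them.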

Page 11: Healthcare Data Analytics Implementation

Model - Process

[Figure: flow diagram with the labels Training Dataset, Test Dataset, Model and Final Analyzed Dataset]

Page 12: Healthcare Data Analytics Implementation

Model – Fine tuning with K-Fold Cross-Validation

For tuning parameters and model selection, k-fold cross-validation was used, in which the data was split into K equal partitions. One fold was used for testing and the remaining folds for training. This was repeated K times (K = 4), and the average testing accuracy was used.

[Figure: the dataset partitioned into K folds]
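The k-fold procedure described on this slide can be sketched as below, with K = 4 as in the deck. The function names and the score_fn callback are hypothetical, and the sketch does not shuffle the data before partitioning, which a real pipeline would often do.

```python
def k_fold_indices(n_samples, k=4):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


def cross_validate(n_samples, score_fn, k=4):
    """Average the score returned by score_fn(train, test) over the k folds."""
    scores = [score_fn(train, test)
              for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)


folds = list(k_fold_indices(8, k=4))
# Dummy scorer for illustration: the fraction of samples held out per fold.
mean_score = cross_validate(8, lambda train, test: len(test) / 8, k=4)
```

In practice score_fn would train a classifier on the train indices and return its accuracy on the test indices; every sample is then used for testing exactly once.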

Page 13: Healthcare Data Analytics Implementation

Accuracy

Accuracy is measured by the area under the ROC curve. Random Forest achieved an accuracy of 0.77, as shown in the curve below.

[Figure: ROC curve for the Random Forest model]
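The area under the ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. A minimal sketch of that pairwise calculation follows; it is not the deck's code, and the toy labels and scores are illustrative only and unrelated to the 0.77 result.

```python
def roc_auc(y_true, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the share of (positive, negative) pairs in which the positive
    case gets the higher score; ties count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 positive/negative pairs are ranked correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise form is quadratic in the number of samples, so it suits small illustrations rather than large datasets.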

Page 14: Healthcare Data Analytics Implementation

Process – Data Analysis

The figure below shows the typical process of data analysis on a dataset.

[Figure: architecture diagram with the components Rest Client, Rest API, CAF Analytics Engine and Staging Data Mart, carrying the annotations below]

1. Offline: processed data is dumped into the Staging Data Mart

1. Builds the model by running Python scripts

2. Scores the model

Page 15: Healthcare Data Analytics Implementation

Data Visualization – Report on the Model

Page 16: Healthcare Data Analytics Implementation

Actual Report on a Dataset

Page 17: Healthcare Data Analytics Implementation

Screen shots

Page 18: Healthcare Data Analytics Implementation

Write to US @ [email protected]

Copyright 2016 © ALTEN Calsoft Labs. All such documents and related graphics are provided "as is" without warranty of any kind and are subject to change without prior notice. ALTEN Calsoft Labs reserves the right, in its sole discretion, to correct any errors or omissions in any portion of this document.

Visit: www.Altencalsoftlabs.com
