Big Data and Smart Healthcare

Sujan Perera, Kno.e.sis Center, Wright State University
Wright State Honors Institute Symposium


Post on 09-Jan-2017








Healthcare is Changing

• Introduction of new federal rules and incentive programs
  – Hospitals are forced to change their processes (30-day readmission, ICD-10 adoption, quality measures)
• Free and open health information
  – Rise of discussions, forums, and social media
  – 70-75% of Americans online have used the internet to find health information [1]
• Rapid growth of health-related devices
  – Variety of cheap sensors for health status and activity monitoring
• IBM Watson
  – Adaptation of Watson technology to healthcare


Challenges on the way

• Huge amount of data being generated
  – Scientific knowledge, social forums, patient records
• Variety of data formats (text, images, videos)
• Finding the signal in the noise (actionable information)
• Experts can’t keep up with the new information
• Expert knowledge is needed to interpret data (esp. combinations of observations)
• Trustworthiness
  – Especially on social forums
• Privacy

It is clear that we need mechanisms to automate some parts of data processing and help humans in decision making.


This talk will concentrate on how to improve the machine understanding of unstructured data

Structured vs Unstructured Data

Patient    Disorders                ICD-9 Code
Patient1   Hypertension             401
Patient2   Atrial fibrillation      427.31
Patient1   Pulmonary hypertension   416
Patient3   Edema                    782.3
Patient4   Hyperthyroidism          242.9

Coronary artery disease, status post four-vessel coronary artery bypass graft surgery on , by Dr. X with a left internal mammary artery to the left anterior descending artery, sequential vein graft to the ramus and first diagonal, and a vein graft to the posterior descending artery. He had normal left ventricular function. He is having some symptoms that are unclear if they are angina or not. I am therefore going to get him scheduled for an exercise Cardiolite stress test.
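The contrast between the two forms can be sketched in a few lines. This is only an illustration: the records are copied from the table above, the note is abridged from the example, and the helper names `patients_with_code` and `mentions` are made up for this sketch.

```python
# Structured rows copied from the table above.
records = [
    {"patient": "Patient1", "disorder": "Hypertension", "icd9": "401"},
    {"patient": "Patient2", "disorder": "Atrial fibrillation", "icd9": "427.31"},
    {"patient": "Patient1", "disorder": "Pulmonary hypertension", "icd9": "416"},
    {"patient": "Patient3", "disorder": "Edema", "icd9": "782.3"},
    {"patient": "Patient4", "disorder": "Hyperthyroidism", "icd9": "242.9"},
]

# Unstructured text, abridged from the clinical note above.
note = ("He had normal left ventricular function. He is having some symptoms "
        "that are unclear if they are angina or not.")


def patients_with_code(rows, code_prefix):
    """Return the sorted patient ids whose ICD-9 code starts with code_prefix."""
    return sorted({r["patient"] for r in rows if r["icd9"].startswith(code_prefix)})


def mentions(text, term):
    """Naive substring check; it cannot tell a confirmed finding from an uncertain one."""
    return term.lower() in text.lower()


# Structured data answers an exact question directly.
hypertension_patients = patients_with_code(records, "401")

# The text search only says the word occurs, not whether the patient has angina.
angina_mentioned = mentions(note, "angina")
```

The structured lookup is exact, while the keyword hit on "angina" misses that the note itself calls the symptom unclear. Closing that gap is the machine-understanding problem the talk is about.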


Unstructured Data

• Structured data is incomplete and not accurate [2, 3]
• 80% of patient data is unstructured [1]
• Stakeholders interested in unstructured data
  – Medical professionals
  – Scientists
  – Insurance companies
  – Policy makers
• Interesting applications
  – Search
  – Prediction
  – Applications like CAC (computer-assisted coding) and CDI (clinical documentation improvement)
  – Data and knowledge mining
  – Decision support

[1] … and Limitations of CMS Administrative Data in Research
[3] Comparison of clinical and administrative data sources for hospital coronary artery bypass graft surgery report cards

Patient Data Distribution

[Figure labels: Structured data; Unstructured data; Lab results: HbA1C, BP, …]


How Important is Unstructured Data

• Key indicators for readmission prediction reside in unstructured patient notes
  – facilities: “Holter monitor was ordered by Lisa. She failed to get this because she did not have transportation”
  – non-compliance: “Atrial fibrillation with poorly controlled ventricular rate due to noncompliance.”
  – financial status: “The patient mentioned that Bystolic is expensive and cannot afford it now.”
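To make the readmission indicators concrete, here is a minimal sketch that flags the three quoted sentences with keyword patterns. The patterns in `CUES` and the function `flag_indicators` are hypothetical, chosen only to match these examples. This is not the method used in the talk, and plain keyword matching would be far too brittle for real notes.

```python
import re

# Hypothetical cue patterns, one per indicator named on the slide.
CUES = {
    "facilities": re.compile(r"\btransportation\b", re.IGNORECASE),
    "non-compliance": re.compile(r"\bnon-?compliance\b", re.IGNORECASE),
    "financial status": re.compile(r"\b(expensive|cannot afford)\b", re.IGNORECASE),
}


def flag_indicators(sentence):
    """Return the sorted indicator names whose cue pattern occurs in the sentence."""
    return sorted(name for name, pattern in CUES.items() if pattern.search(sentence))


# The three sentences quoted on the slide.
notes = [
    "Holter monitor was ordered by Lisa. She failed to get this because "
    "she did not have transportation",
    "Atrial fibrillation with poorly controlled ventricular rate due to noncompliance.",
    "The patient mentioned that Bystolic is expensive and cannot afford it now.",
]

flags = [flag_indicators(sentence) for sentence in notes]
```

None of these signals would appear in a diagnosis-code table, which is the point of the slide.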

How Important is Unstructured Data

• ICD-10 adoption: need to understand the relationships between conditions

  E08 - Diabetes mellitus due to underlying condition
    E08.0 - Diabetes mellitus due to underlying condition with hyperosmolarity
      E08.00 - without nonketotic hyperglycemic-hyperosmolar coma (NKHHC)
      E08.01 - with coma
    E08.1 - Diabetes mellitus due to underlying condition with ketoacidosis
      E08.10 - without coma
      E08.11 - with coma

• The underlying condition can be congenital rubella, Cushing's syndrome, cystic fibrosis, malignant neoplasm, malnutrition, or pancreatitis
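The E08 hierarchy above can be modelled as a lookup table whose parent codes are found by shortening the code. The sketch below assumes that a parent is always a prefix of its child, which holds for the codes listed here. The names `ICD10`, `ancestors`, and `describe` are illustrative only.

```python
# Codes and labels copied from the slide.
ICD10 = {
    "E08": "Diabetes mellitus due to underlying condition",
    "E08.0": "Diabetes mellitus due to underlying condition with hyperosmolarity",
    "E08.00": "without nonketotic hyperglycemic-hyperosmolar coma (NKHHC)",
    "E08.01": "with coma",
    "E08.1": "Diabetes mellitus due to underlying condition with ketoacidosis",
    "E08.10": "without coma",
    "E08.11": "with coma",
}


def ancestors(code):
    """Return known parent codes, nearest first, by repeatedly trimming the code."""
    chain = []
    current = code[:-1].rstrip(".")
    while current:
        if current in ICD10:
            chain.append(current)
        current = current[:-1].rstrip(".")
    return chain


def describe(code):
    """Spell out a code from the root category down to the code itself."""
    path = list(reversed(ancestors(code))) + [code]
    return " > ".join(f"{c}: {ICD10[c]}" for c in path)


full_meaning = describe("E08.11")
```

A coder choosing E08.11 must establish three things from the note: the underlying condition, ketoacidosis, and coma. Those details typically live in the narrative text.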

[Figure labels: Search; Mining; Decision Support; Knowledge Discovery; Prediction]



The Solution

• Semantic Web
  – Provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries
  – Offers mechanisms to query data and reason over them
• Natural Language Processing
  – Enables computers to understand natural language



An Example

He is off both Diovan and Lotrel. I am unsure if it is due to underlying renal insufficiency. He has actually been on atenolol alone for his hypertension.

Raw Text

[Figure: terms from the text linked to knowledge-base concepts]
Terms in the text: diovan, lotrel, renal insufficiency, atenolol, hypertension
Knowledge-base concepts: antihypertensive agent; tenormin; atenix; kidney failure; renal insufficiency; kidney disease; blood pressure disorder; systolic hypertension; pulmonary hypertension
Relationship labels: is used to treat; is a
Derived statements:
• Patient taking diovan for hypertension
• Patient has kidney disease
• Patient is on antihypertensive drugs
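One way to picture the example is as a small set of subject-predicate-object triples with a query that follows "is a" edges. The edges below are assumptions made for illustration; the slide shows the two relationship labels but not the exact edge list. The sketch uses plain Python rather than an actual Semantic Web stack, and `broader_terms` and `treats` are hypothetical helpers.

```python
# Assumed knowledge-base edges, using the two relationship labels from the figure.
TRIPLES = {
    ("diovan", "is a", "antihypertensive agent"),
    ("lotrel", "is a", "antihypertensive agent"),
    ("atenolol", "is a", "antihypertensive agent"),
    ("atenolol", "is used to treat", "hypertension"),
    ("renal insufficiency", "is a", "kidney disease"),
    ("hypertension", "is a", "blood pressure disorder"),
}


def broader_terms(term):
    """Return every concept reachable from term by following 'is a' edges."""
    found, frontier = set(), [term]
    while frontier:
        node = frontier.pop()
        for subject, predicate, obj in TRIPLES:
            if subject == node and predicate == "is a" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def treats(drug):
    """Return the conditions the drug is used to treat, per the assumed edges."""
    return {o for s, p, o in TRIPLES if s == drug and p == "is used to treat"}


# From the raw text: the patient is currently on atenolol alone.
current_meds = ["atenolol"]
on_antihypertensive = any(
    "antihypertensive agent" in broader_terms(med) for med in current_meds
)
```

With these edges, a statement such as "Patient is on antihypertensive drugs" follows from the mention of atenolol plus background knowledge, even though the note never uses the word "antihypertensive".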




ezHealth Platform

ezKB

<problem value="Asthma" cui="C0004096"/>
<med value="Losartan" code="52175:RXNORM" />
<med value="Spiriva" code="274535:RXNORM" />
<procedure value="EKG" cui="C1623258" />

ezFIND, ezMeasure, ezCDI, ezCAC

Health Outcome Prediction
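The annotation elements shown above are easy for downstream applications to consume. As a minimal sketch, the fragment can be read with Python's standard `xml.etree.ElementTree`. The wrapping `<annotations>` root is an assumption added here only because the fragment has several top-level elements; the slide does not show the real enclosing document.

```python
import xml.etree.ElementTree as ET

# Annotation elements copied from the slide.
fragment = """
<problem value="Asthma" cui="C0004096"/>
<med value="Losartan" code="52175:RXNORM" />
<med value="Spiriva" code="274535:RXNORM" />
<procedure value="EKG" cui="C1623258" />
"""

# Wrap the fragment in a made-up root element so it parses as one document.
root = ET.fromstring(f"<annotations>{fragment}</annotations>")

# Medication codes are written as "<code>:<vocabulary>"; split them apart.
meds = []
for med in root.findall("med"):
    code, vocabulary = med.get("code").split(":")
    meds.append((med.get("value"), vocabulary, code))

# Problems and procedures carry a concept identifier in the cui attribute.
problems = [(p.get("value"), p.get("cui")) for p in root.findall("problem")]
procedures = [(p.get("value"), p.get("cui")) for p in root.findall("procedure")]
```

Once mentions are normalized to identifiers like these, search, coding, and prediction components can work on concepts instead of raw strings.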

Thank You

Visit us: