Mining primary care EMRs

Understanding patient experiences from mining primary care data. Centre for Health Informatics. Filippo Galgani, Adam Dunn, Margaret Williamson, Malcolm Gillies, Guy Tsafnat.

Post on 14-Apr-2017

TRANSCRIPT

Page 1: Mining primary care EMRs

Understanding patient experiences from mining primary care data

Centre for Health Informatics

Filippo Galgani

Adam Dunn

Margaret Williamson

Malcolm Gillies

Guy Tsafnat

Page 2: Mining primary care EMRs

General Practice EMRs

• Aim: measure quality of care for a range of conditions in a diverse population using GP EMR data.

• Dataset: longitudinal data (2.5 million Australian patients) including prescriptions, diagnoses, pathologies, referrals.

• Patients’ journey: grouping patients by experience to detect relevant patterns in data over time.

Page 3: Mining primary care EMRs

Big Data Problems

• Data collected to keep patient history:

– Dealing with missing information

– Inconsistency

– Combination of short text fields (not coded) and numerical values

• Doctors’ time constraints make data entry inaccurate

• Progress notes not available (privacy issue)

• Patients may visit other practices (thus missing information)

• Events happen irregularly

Page 4: Mining primary care EMRs

Continuity of care

Page 5: Mining primary care EMRs

Reasons for Prescription

Chart: 123571 / 162357 (Some Reason Given / Reason Missing)

1974 different reasons given for PPI prescriptions

GORD (Gastro-oesophageal Reflux Disease) 50842

Reflux - gastro-oesophageal 13596

Reflux oesophagitis 6285

GOR (Gastro-oesophageal Reflux) 6047

Gastritis 5755

Gastro-oesophageal Reflux 4356

… …

Page 6: Mining primary care EMRs

Textual inconsistency: Natural Language Processing

• Normalization of case and punctuation: "gord", "GORD", "gord;" → "gord"

• Stopword Filtering: Gastro-oesophageal Reflux Disease → Gastro-oesophageal Reflux

• Spelling Correction: oesophygitis → oesophagitis
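The three cleaning steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the pipeline the authors built: the stopword list, the vocabulary, the similarity cutoff, and the function names are all hypothetical, and the spelling correction uses fuzzy matching from the standard library's `difflib` as a stand-in for whatever method was actually used.

```python
import re
from difflib import get_close_matches

# Hypothetical stopword list and reference vocabulary, chosen only to
# reproduce the examples shown on the slide.
STOPWORDS = {"disease"}
VOCABULARY = ["gord", "gastro-oesophageal", "reflux", "oesophagitis", "gastritis"]


def normalize(text):
    """Lower-case the text, replace punctuation (except hyphens) with spaces,
    and split into tokens, dropping tokens that are only hyphens."""
    text = re.sub(r"[^\w\s-]", " ", text.lower())
    tokens = (token.strip("-") for token in text.split())
    return [token for token in tokens if token]


def remove_stopwords(tokens):
    """Drop tokens that appear in the (assumed) stopword list."""
    return [token for token in tokens if token not in STOPWORDS]


def correct_spelling(tokens, cutoff=0.8, min_length=5):
    """Replace each sufficiently long token with its closest vocabulary
    entry, if one is at least `cutoff` similar; otherwise keep the token."""
    corrected = []
    for token in tokens:
        if len(token) < min_length:
            # Short tokens such as acronyms are left alone to avoid
            # merging distinct terms by accident.
            corrected.append(token)
            continue
        matches = get_close_matches(token, VOCABULARY, n=1, cutoff=cutoff)
        corrected.append(matches[0] if matches else token)
    return corrected


def clean_reason(text):
    """Apply normalization, stopword filtering and spelling correction."""
    return " ".join(correct_spelling(remove_stopwords(normalize(text))))
```

With these assumed tables, `clean_reason("gord;")` and `clean_reason("GORD")` both give `"gord"`, and `clean_reason("oesophygitis")` gives `"oesophagitis"`.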

Page 7: Mining primary care EMRs

Textual inconsistency: Natural Language Processing

• Lemmatization: Oesophagitis ulcerative, Oesophagitis ulcerating → Oesophagitis ulcer

• Acronym Expansion: GORD = GORD (Gastro-oesophageal Reflux Disease) = Gastro-oesophageal Reflux Disease

• Synonyms: Reflux oesophagitis = Gastro-oesophageal Reflux
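One simple way to picture lemmatization, acronym expansion and synonym mapping is as a chain of table lookups. The sketch below assumes the input has already been lower-cased and stripped of punctuation; the three lookup tables and the function names are hypothetical and contain only the examples from this slide, whereas a real system would rely on a proper lemmatizer and a clinical terminology.

```python
# Hypothetical lookup tables holding only the examples from the slide.
LEMMAS = {"ulcerative": "ulcer", "ulcerating": "ulcer"}
ACRONYMS = {
    "gord": "gastro-oesophageal reflux disease",
    "gor": "gastro-oesophageal reflux",
}
SYNONYMS = {"reflux oesophagitis": "gastro-oesophageal reflux"}


def lemmatize(tokens):
    """Map each inflected token to its base form when one is known."""
    return [LEMMAS.get(token, token) for token in tokens]


def expand_acronyms(tokens):
    """Replace each known acronym with the tokens of its expansion."""
    expanded = []
    for token in tokens:
        expanded.extend(ACRONYMS.get(token, token).split())
    return expanded


def map_synonym(phrase):
    """Map a whole phrase to its preferred synonym when one is known."""
    return SYNONYMS.get(phrase, phrase)


def canonicalize(phrase):
    """Run lemmatization, acronym expansion and synonym mapping in order."""
    tokens = expand_acronyms(lemmatize(phrase.split()))
    return map_synonym(" ".join(tokens))
```

Under these assumptions, `canonicalize("oesophagitis ulcerative")` and `canonicalize("oesophagitis ulcerating")` both return `"oesophagitis ulcer"`, so the two variants are counted as the same reason.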

Page 8: Mining primary care EMRs

Reasons for Prescription

Chart: 123571 / 162357 (Some Reason Given / Reason Missing)

1974 different reasons given for PPI prescriptions

GORD (Gastro-oesophageal Reflux Disease) 50842

Reflux - gastro-oesophageal 13596

Reflux oesophagitis 6285

GOR (Gastro-oesophageal Reflux) 6047

Gastritis 5755

Gastro-oesophageal Reflux 4356

… …

→ NLP pipeline →

GORD (Gastro-oesophageal Reflux Disease) 87217
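The effect of the pipeline on the counts can be illustrated by merging raw reason strings under a canonical label and summing their frequencies. The counts below are the ones listed on the slide; the mapping of variants to the GORD label is an assumption made for illustration, since the slide does not state which strings were merged.

```python
from collections import Counter

# Raw reason strings and their counts, as listed on the slide.
RAW_COUNTS = {
    "GORD (Gastro-oesophageal Reflux Disease)": 50842,
    "Reflux - gastro-oesophageal": 13596,
    "Reflux oesophagitis": 6285,
    "GOR (Gastro-oesophageal Reflux)": 6047,
    "Gastritis": 5755,
    "Gastro-oesophageal Reflux": 4356,
}

GORD = "GORD (Gastro-oesophageal Reflux Disease)"

# Hypothetical mapping: which raw strings count as GORD is an assumption.
CANONICAL = {
    "GORD (Gastro-oesophageal Reflux Disease)": GORD,
    "Reflux - gastro-oesophageal": GORD,
    "Reflux oesophagitis": GORD,
    "GOR (Gastro-oesophageal Reflux)": GORD,
    "Gastro-oesophageal Reflux": GORD,
}


def merge_counts(raw_counts, canonical):
    """Sum the counts of all raw strings that share a canonical label;
    strings without a mapping keep their own label."""
    merged = Counter()
    for reason, count in raw_counts.items():
        merged[canonical.get(reason, reason)] += count
    return merged


merged = merge_counts(RAW_COUNTS, CANONICAL)
print(merged[GORD])  # 81126 with this assumed mapping
```

The five listed reflux variants add up to 81126. The slide reports 87217 after the NLP pipeline, so the difference presumably comes from variants hidden behind the "…" row.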

Page 9: Mining primary care EMRs

Reasons for Prescription

Chart: 123571 / 162357 (Some Reason Given / Reason Missing)

?

Page 10: Mining primary care EMRs

Missing Information: Machine Learning Approach

Random set of PPI patients annotated by experts with respect to GORD
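The slide does not say which model or features were used, so the following is only a sketch of the general idea: train a classifier on the expert-annotated sample, then use it to predict GORD status where the reason is missing. It swaps in a small Bernoulli naive Bayes classifier written in plain Python. The feature names and the six training records are invented placeholders, not real data or clinical guidance.

```python
import math
from collections import defaultdict


def train(examples):
    """Count labels and per-label feature occurrences.

    `examples` is a list of (set_of_feature_names, label) pairs,
    where label is True (GORD) or False (not GORD)."""
    label_counts = defaultdict(int)
    feature_counts = {True: defaultdict(int), False: defaultdict(int)}
    vocabulary = set()
    for features, label in examples:
        label_counts[label] += 1
        for feature in features:
            feature_counts[label][feature] += 1
            vocabulary.add(feature)
    return label_counts, feature_counts, vocabulary


def predict(model, features):
    """Return True if GORD is the more probable label under a Bernoulli
    naive Bayes model with add-one (Laplace) smoothing."""
    label_counts, feature_counts, vocabulary = model
    total = sum(label_counts.values())
    scores = {}
    for label in (True, False):
        n = label_counts[label]
        score = math.log((n + 1) / (total + 2))
        for feature in vocabulary:
            p = (feature_counts[label][feature] + 1) / (n + 2)
            score += math.log(p if feature in features else 1 - p)
        scores[label] = score
    return scores[True] > scores[False]


# Invented toy annotations standing in for the expert-labelled sample.
annotated = [
    ({"heartburn_diagnosis", "long_term_ppi"}, True),
    ({"heartburn_diagnosis"}, True),
    ({"long_term_ppi", "endoscopy_referral"}, True),
    ({"nsaid_coprescription"}, False),
    ({"nsaid_coprescription", "short_course"}, False),
    ({"short_course"}, False),
]

model = train(annotated)
print(predict(model, {"heartburn_diagnosis"}))
print(predict(model, {"nsaid_coprescription", "short_course"}))
```

In practice the features would be derived from the prescriptions, diagnoses, pathology results and referrals in the EMR, and the model would be evaluated on held-out annotated patients before being applied to records with a missing reason.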

Page 11: Mining primary care EMRs

Grouping Patients by Journey

Page 12: Mining primary care EMRs

Conclusion

• Data mining on GP EMRs is challenging due to the noisy, messy and sparse nature of the data

• Analyzing journeys is possible, but it requires:

– Temporal reasoning (infer missing events)

– Natural Language Processing (resolve textual inconsistencies)

– Machine Learning (predict missing information)

– Domain knowledge (for modeling)

Page 13: Mining primary care EMRs

Acknowledgment

• This research was funded by the Australian Department of Health and Ageing through NPS MedicineWise as part of the MedicineInsight Program.

• I wish to express my gratitude to Malcolm Gillies and Margaret Williamson from NPS, and Adam Dunn and Guy Tsafnat from UNSW.

• Thank you for your attention