graphconnect nyc

Graphs Opening Medical Care Information @davefauth www.intelliwareness.org

Upload: davefauth

Post on 11-Jun-2015

Category:

Technology


DESCRIPTION

GraphConnect NYC talk on Mortar/Hadoop/DocGraph

TRANSCRIPT

Page 1: GraphConnect NYC

Graphs Opening Medical Care Information

@davefauth www.intelliwareness.org

Page 2: GraphConnect NYC

About Me

• My Blog: http://www.intelliwareness.org
• Find me on Twitter: @davefauth
• Email me: [email protected]
• GitHub: http://github.com/davidfauth

Page 3: GraphConnect NYC

Not talking about this….

Page 4: GraphConnect NYC

Or this….

Page 5: GraphConnect NYC

But we want to talk about this:

Page 6: GraphConnect NYC

Ryan Weald – isurfsoftware.com

And this:

Page 7: GraphConnect NYC

I’ll try not to do this…

Page 8: GraphConnect NYC

Or this….

Page 9: GraphConnect NYC

Where we are today

Page 10: GraphConnect NYC

Healthcare Data

• Recommend watching Fred Trotter speak at GraphConnect – SF

• Moving from no data -> bad data -> better data -> good data

• Claims Data
– Hard to accurately describe what a doctor is doing and how they are getting paid without claims data
– Limited and not a good data set by any standard

Page 11: GraphConnect NYC

Examples of Bad Data

• Not enough data
– More transparency without having to FOIA

• State level data is hard to get

Page 12: GraphConnect NYC

Better Data Sets

• DocGraph Data
– One of the “best” available
– “Best” does not mean “good”
• DocGraph Rx
– Prescribing patterns for Medicare Part D patients
• NPPES
• NUCC

Page 13: GraphConnect NYC

DocGraph Dataset

• DocGraph by the numbers
– Directed graph
– Average total degree 52.8
– 940,492 providers (graph nodes/vertices)
– 49,685,810 shared edges
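A quick arithmetic check on these figures, using only the node and edge counts quoted on the slide: dividing edges by providers gives about 52.8, which suggests the quoted average counts each directed edge once per provider. Counting both endpoints of every edge would roughly double it. The variable names are illustrative.

```python
# Node and edge counts as quoted on the slide.
providers = 940_492
edges = 49_685_810

# Each directed edge adds one out-degree (source) and one in-degree (target).
edges_per_provider = edges / providers          # each edge counted once
in_plus_out_degree = 2 * edges / providers      # both endpoints counted

print(f"edges per provider: {edges_per_provider:.1f}")
print(f"in-degree + out-degree per provider: {in_plus_out_degree:.1f}")
```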

Page 14: GraphConnect NYC

DocGraph Data

Page 15: GraphConnect NYC

Doctor Detail (docNPI.com)

Page 16: GraphConnect NYC

Doctor Detail

Page 17: GraphConnect NYC

NPPES

• National Plan and Provider Enumeration System
• Source of NPI (National Provider Identifier)
• No cost download
• Information is entered and updated by provider
– Data quality ranges from good to poor
• CSV file with 314 columns

Page 18: GraphConnect NYC

NUCC

• National Uniform Claim Committee
– Healthcare Provider Taxonomy
– No cost download
• CSV file with 5 columns and 830 rows
– Link taxonomy to NPPES reported taxonomy
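Linking the NUCC taxonomy to the taxonomy code reported in NPPES amounts to a lookup join on the code. The sketch below is a minimal illustration, not the pipeline used in the talk: the column names, provider names, and NPI values are made-up stand-ins, and the real files have many more columns.

```python
import csv
import io

# Tiny in-memory stand-ins for the two downloads. The column names and all
# values here are illustrative assumptions, not the real file layouts.
NPPES_CSV = """NPI,Provider Name,Taxonomy Code
1000000001,Example Provider A,207Q00000X
1000000002,Example Provider B,207R00000X
1000000003,Example Provider C,UNKNOWN
"""

NUCC_CSV = """Code,Classification
207Q00000X,Family Medicine
207R00000X,Internal Medicine
"""


def load_taxonomy(text):
    """Map taxonomy code -> classification label."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["Code"]: row["Classification"] for row in reader}


def label_providers(nppes_text, taxonomy):
    """Attach a specialty to each provider; None when the code is not found."""
    labeled = []
    for row in csv.DictReader(io.StringIO(nppes_text)):
        labeled.append({
            "npi": row["NPI"],
            "name": row["Provider Name"],
            "specialty": taxonomy.get(row["Taxonomy Code"]),
        })
    return labeled


taxonomy = load_taxonomy(NUCC_CSV)
providers = label_providers(NPPES_CSV, taxonomy)
for provider in providers:
    print(provider["npi"], provider["specialty"])
```

Keeping unmatched codes as `None` rather than dropping the row makes provider-entered data problems visible instead of hiding them.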

Page 19: GraphConnect NYC

DocGraph Data

Nodes
• Organizations
• Specialties
• Providers
• Locations
• CountiesZip
• Census

Relationships
* Organizations -[:PARENT_OF]- Providers -[:SPECIALTY]- Specialties
* Locations -[:LOCATED_FOR]- Providers
* Providers -[:REFERRED]- Providers
* Counties -[:INCOME_IN]- CountiesZip
* Locations -[:LOCATED_IN]- CountiesZip
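To make the model concrete, here is a minimal in-memory sketch of a labeled graph using a few of the node labels and relationship types listed above. It is not how Neo4j stores data. The node ids are invented, and the edge directions are assumptions because the slide does not show arrows.

```python
from collections import defaultdict


class Graph:
    """Minimal labeled graph: nodes carry a label, edges carry a relationship type."""

    def __init__(self):
        self.labels = {}                     # node id -> label
        self.out_edges = defaultdict(list)   # node id -> [(rel_type, target id)]

    def add_node(self, node_id, label):
        self.labels[node_id] = label

    def add_rel(self, source, rel_type, target):
        # Fail early if an edge mentions a node that was never added.
        if source not in self.labels or target not in self.labels:
            raise KeyError(f"unknown node in {source} -[:{rel_type}]-> {target}")
        self.out_edges[source].append((rel_type, target))

    def neighbors(self, node_id, rel_type):
        return [target for rel, target in self.out_edges[node_id] if rel == rel_type]


g = Graph()
g.add_node("org1", "Organizations")
g.add_node("prov1", "Providers")
g.add_node("prov2", "Providers")
g.add_node("spec1", "Specialties")
g.add_node("spec2", "Specialties")

# Edge directions below are assumed for illustration.
g.add_rel("org1", "PARENT_OF", "prov1")
g.add_rel("prov1", "REFERRED", "prov2")
g.add_rel("prov1", "SPECIALTY", "spec1")
g.add_rel("prov2", "SPECIALTY", "spec2")

# Two-hop question: which specialties does prov1 refer patients to?
referred_specialties = [
    spec
    for peer in g.neighbors("prov1", "REFERRED")
    for spec in g.neighbors(peer, "SPECIALTY")
]
print(referred_specialties)
```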

Page 20: GraphConnect NYC

DocGraph Data

Diagram: Provider nodes linked by a “refers” relationship.

Page 21: GraphConnect NYC

DocGraph Data

Diagram: nodes Provider and Specialty; relationships refers and Specializes_in.

Page 22: GraphConnect NYC

DocGraph Data

Diagram: nodes Provider, Specialty, Parent Org and Location; relationships refers, Specializes_in, Parent_Of and Location_In.

Page 23: GraphConnect NYC

DocGraph Data

Diagram: nodes Provider, Specialty, Parent Org and Location; relationships refers, Specializes_in, Parent_Of and Location_In.

Page 24: GraphConnect NYC

DocGraph Data

Diagram: nodes Provider, Specialty, Parent Org, Location, CountiesZip and Income; relationships refers, Specializes_in, Parent_Of, Location_For, Located_In and Income_In.

Page 25: GraphConnect NYC
Page 26: GraphConnect NYC

DocGraph RX Data

• Reinforcing Jonathan Freeman’s talk on Hadoop and Neo4J

Page 27: GraphConnect NYC
Page 28: GraphConnect NYC
Page 29: GraphConnect NYC

Time for Analysis

Page 30: GraphConnect NYC

Fraud Referrals

April 2013 - The owner and another senior executive of Sacred Heart Hospital and four physicians affiliated with the west side facility were arrested today for allegedly conspiring to pay and receive illegal kickbacks, including more than $225,000 in cash, along with other forms of payment, in exchange for the referral of patients insured by Medicare and Medicaid to the hospital, announced U.S. Attorney for the Northern District of Illinois Gary S. Shapiro.

Page 31: GraphConnect NYC

Hadoop Page Rank
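The slide only names the technique, so the following is a single-machine sketch of PageRank by power iteration, not the Hadoop job from the talk. The referral edges, damping factor, and iteration count are illustrative assumptions.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes with no outgoing edges is spread evenly.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1 - damping) / n + damping * dangling / n for node in nodes}
        for source in nodes:
            for target in out_links[source]:
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


# Made-up referral edges: A, B and D all refer to C; C refers back to A.
referrals = [("A", "C"), ("B", "C"), ("D", "C"), ("C", "A")]
ranks = pagerank(referrals)
for provider, score in sorted(ranks.items(), key=lambda item: -item[1]):
    print(provider, round(score, 3))
```

A provider who receives referrals from many others ends up with a high score, which is one way to surface unusual referral hubs for closer review.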

Page 32: GraphConnect NYC

DocGraph RX Data

• Originally obtained by ProPublica
• Prescribing pattern for all physicians for Medicare Part D – 2011
• Largest publicly released prescribing database
• 2 sets of data - 30M edges each
• Related to business name and NDC-9 code
– NDC 9 code allows for aggregation of drugs
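A minimal sketch of how an NDC-9 code can aggregate drugs. It assumes the input codes are already normalized to the 11-digit 5-4-2 form (labeler, product, package), so the first nine digits identify the product regardless of package size. The NPI values, codes, and claim counts are made up.

```python
from collections import Counter


def ndc9(ndc11):
    """Return the 9-digit labeler+product prefix of an 11-digit NDC."""
    digits = ndc11.replace("-", "")
    if len(digits) != 11 or not digits.isdigit():
        raise ValueError(f"expected an 11-digit NDC, got {ndc11!r}")
    return digits[:9]


# (prescriber NPI, 11-digit NDC, claim count): illustrative rows only.
rows = [
    ("1000000001", "00000-1111-01", 10),
    ("1000000001", "00000-1111-02", 5),   # same product, different package
    ("1000000002", "00000-2222-01", 7),
]

# Sum claims per (prescriber, product) so package variants collapse together.
totals = Counter()
for npi, ndc, claims in rows:
    totals[(npi, ndc9(ndc))] += claims

for (npi, product), claims in sorted(totals.items()):
    print(npi, product, claims)
```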

Page 33: GraphConnect NYC

DocGraph RX Data

Page 34: GraphConnect NYC

DocGraph RX Data

Page 35: GraphConnect NYC

DocGraph RX Data

Page 36: GraphConnect NYC

DocGraphRx Data

Diagram: nodes Provider, Specialty, Parent Org, Location, CountiesZip, Income and Drugs; relationships refers, Specializes_in, Parent_Of, Location_For, Located_In, Income_In and prescribes.

Page 37: GraphConnect NYC
Page 39: GraphConnect NYC

DocGraph RX Data

• Back to “bad data”
• http://www.albme.org/actions.html

Page 40: GraphConnect NYC
Page 41: GraphConnect NYC

Combine additional datasets

• Medical data
– Doctor referral data
– Medicare doctor prescription practices
– “Dollars for Doctors” – Drug company promotional payments
• Census Data
– Income data
– Poverty data

Page 42: GraphConnect NYC

Recommendation Engine?

• Build a graph model of the data
• Build a recommender model from the graph model
• Graphs can be visualized, explained, discussed and debugged collaboratively
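The slides do not say which recommender model is meant. As one possible reading, the sketch below recommends referral targets used by peers whose referral patterns overlap with a given provider, a simple neighborhood-style approach. The provider ids and edges are invented for illustration.

```python
from collections import Counter, defaultdict


def recommend(referrals, provider, top_n=3):
    """Suggest referral targets used by peers who share targets with `provider`."""
    targets = defaultdict(set)
    for source, target in referrals:
        targets[source].add(target)

    mine = targets.get(provider, set())
    scores = Counter()
    for peer, theirs in targets.items():
        if peer == provider:
            continue
        overlap = len(mine & theirs)
        if overlap == 0:
            continue  # peers with nothing in common carry no signal
        for candidate in theirs - mine:
            if candidate != provider:
                scores[candidate] += overlap
    return [candidate for candidate, _ in scores.most_common(top_n)]


# Made-up referral edges (source provider, target provider).
referrals = [
    ("A", "X"), ("A", "Y"),
    ("B", "X"), ("B", "Y"), ("B", "Z"),
    ("C", "X"), ("C", "W"),
]
print(recommend(referrals, "A"))
```

Because every score traces back to specific shared edges, a suggestion can be explained by pointing at the peers and referrals that produced it.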

Page 43: GraphConnect NYC