safe medicine recommendation via medical knowledge … · safe medicine recommendation via medical...

9
Safe Medicine Recommendation via Medical Knowledge Graph Embedding Meng Wang 1 , Mengyue Liu 1 , Jun Liu 1 , Sen Wang 2 , Guodong Long 3 , Buyue Qian 1 1 MOEKLINNS Lab, Xian Jiaotong University, Xi’an, China 2 Griffith Universtiy, Gold Coast Campus, Australia 3 Centre for Quantum Computation and Intelligent Systems, University of Technology Sydney 1 {wangmengsd, liumengyue}@stu.xjtu.edu.cn, {liukeen, qianbuyue}@xjtu.edu.cn 2 sen.wang@griffith.edu.au, 3 [email protected] Abstract Most of the existing medicine recommendation systems that are mainly based on electronic medical records (EMRs) are significantly assisting doctors to make better clinical deci- sions benefiting both patients and caregivers. Even though the growth of EMRs is at a lighting fast speed in the era of big data, content limitations in EMRs restrain the existed recommendation systems to reflect relevant medical facts, such as drug-drug interactions. Many medical knowledge graphs that contain drug-related information, such as Drug- Bank, may give hope for the recommendation systems. How- ever, the direct use of these knowledge graphs in the sys- tems suffers from robustness caused by the incompleteness of the graphs. To address these challenges, we stand on re- cent advances in graph embedding learning techniques and propose a novel framework, called Safe Medicine Recom- mendation (SMR), in this paper. Specifically, SMR first con- structs a high-quality heterogeneous graph by bridging EMRs (MIMIC-III) and medical knowledge graphs (ICD-9 ontol- ogy and DrugBank). Then, SMR jointly embeds diseases, medicines, patients, and their corresponding relations into a shared lower dimensional space. Finally, SMR uses the em- beddings to decompose the medicine recommendation into a link prediction process while considering the patient’s di- agnoses and adverse drug reactions. To our best knowledge, SMR is the first to learn embeddings of a patient-disease- medicine graph for medicine recommendation in the world. Extensive experiments on real datasets are conducted to eval- uate the effectiveness of proposed framework. 1 Introduction Over the last few years, medicine recommendation sys- tems have been developed to assist doctors in making ac- curate medicine prescriptions. On the one hand, many re- searchers (Chen et al. 2016; Almirall et al. 2012) adopt rule-based protocols that are defined by the clinical guide- lines and the experienced doctors. Constructing, curating, and maintaining these protocols are time-consuming and labor-intensive. Rule-based protocols might be effective for a general medicine recommendation for a specific diagnosis, but give little help to tailored recommendations for compli- cated patients. On the other hand, supervised learning al- gorithms and variations, such as Multi-Instance Multi-label (MIML) learning (Zhang et al. 2017), have been proposed to recommend medicines for patients. Both input features and ground-truth information that are extracted from massive EMRs are trained to obtain a predictive model that outputs multiple labels of the new testing data as medicine recom- mendations. It is a fact that therapies and treatments in clin- ical practices are rapidly updated. Unfortunately, supervised learning methods cannot deal with those medicines that are not included in the training phase. Incomplete training data set will be a detriment to the recommendation system per- formance. It is reported in (Panagioti et al. 2015; Juurlink et al. 2003) that patients with two or more diseases, acute or chronic, often take five or more different medicines simul- taneously and have immense health risks. Studies (Edwards and Aronson 2000; Leape et al. 1995) have shown that 3- 5% of all in-hospital misused prescriptions blame to igno- rances of adverse drug reactions, which is difficult to pro- hibit even for the highly trained and experienced clinicians. With the assistance from the conventional medicine recom- mendation systems, clinicians still need to cautiously rule out those recommendations that have potential adverse ef- fects caused by drug-drug interactions. Most of the existing works have largely ignored the exploit of medical facts in medicines, such as drug-drug interactions, which is crucial in medicine recommendation system. One possible reason might because there is little medical expert knowledge in EMRs. Content limitations in EMRs constrain the systems to barely associate accurate medical facts with the recom- mended prescriptions, which makes the final recommenda- tion less trustworthy for the complicated patients. With the increasing emergence of knowledge graphs, many world-leading researchers have successfully extracted information from huge volumes of medical databases to build up giant heterogeneous graphs that reflect medical facts of medicines and diseases. For instance, DrugBank (Law et al. 2014) is a rich source of medicine information. It contains extensive entities (drugs, drug targets, chemistry, etc.) and relationships (enzymatic pathways, drug-drug in- teractions, etc.). ICD-9 ontology (Schriml et al. 2011) repre- sents a knowledge base of human diseases and can be used to classify diagnoses of patients. Harnessing well-built med- ical knowledge graphs in EMRs-based medicine recommen- dation system might enable the reinvented system to pro- vide appropriate prescriptions for special patients, as well as alerts of possible side effects and serious drug-drug interac- arXiv:1710.05980v2 [cs.IR] 26 Oct 2017

Upload: leliem

Post on 13-Apr-2018

255 views

Category:

Documents


1 download

TRANSCRIPT

Safe Medicine Recommendation via Medical Knowledge Graph Embedding

Meng Wang1, Mengyue Liu1, Jun Liu1, Sen Wang2, Guodong Long3, Buyue Qian1

1MOEKLINNS Lab, Xian Jiaotong University, Xi’an, China2Griffith Universtiy, Gold Coast Campus, Australia

3Centre for Quantum Computation and Intelligent Systems, University of Technology Sydney1{wangmengsd, liumengyue}@stu.xjtu.edu.cn, {liukeen, qianbuyue}@xjtu.edu.cn

[email protected], [email protected]

AbstractMost of the existing medicine recommendation systems thatare mainly based on electronic medical records (EMRs) aresignificantly assisting doctors to make better clinical deci-sions benefiting both patients and caregivers. Even thoughthe growth of EMRs is at a lighting fast speed in the eraof big data, content limitations in EMRs restrain the existedrecommendation systems to reflect relevant medical facts,such as drug-drug interactions. Many medical knowledgegraphs that contain drug-related information, such as Drug-Bank, may give hope for the recommendation systems. How-ever, the direct use of these knowledge graphs in the sys-tems suffers from robustness caused by the incompletenessof the graphs. To address these challenges, we stand on re-cent advances in graph embedding learning techniques andpropose a novel framework, called Safe Medicine Recom-mendation (SMR), in this paper. Specifically, SMR first con-structs a high-quality heterogeneous graph by bridging EMRs(MIMIC-III) and medical knowledge graphs (ICD-9 ontol-ogy and DrugBank). Then, SMR jointly embeds diseases,medicines, patients, and their corresponding relations into ashared lower dimensional space. Finally, SMR uses the em-beddings to decompose the medicine recommendation intoa link prediction process while considering the patient’s di-agnoses and adverse drug reactions. To our best knowledge,SMR is the first to learn embeddings of a patient-disease-medicine graph for medicine recommendation in the world.Extensive experiments on real datasets are conducted to eval-uate the effectiveness of proposed framework.

1 IntroductionOver the last few years, medicine recommendation sys-tems have been developed to assist doctors in making ac-curate medicine prescriptions. On the one hand, many re-searchers (Chen et al. 2016; Almirall et al. 2012) adoptrule-based protocols that are defined by the clinical guide-lines and the experienced doctors. Constructing, curating,and maintaining these protocols are time-consuming andlabor-intensive. Rule-based protocols might be effective fora general medicine recommendation for a specific diagnosis,but give little help to tailored recommendations for compli-cated patients. On the other hand, supervised learning al-gorithms and variations, such as Multi-Instance Multi-label(MIML) learning (Zhang et al. 2017), have been proposed torecommend medicines for patients. Both input features and

ground-truth information that are extracted from massiveEMRs are trained to obtain a predictive model that outputsmultiple labels of the new testing data as medicine recom-mendations. It is a fact that therapies and treatments in clin-ical practices are rapidly updated. Unfortunately, supervisedlearning methods cannot deal with those medicines that arenot included in the training phase. Incomplete training dataset will be a detriment to the recommendation system per-formance.

It is reported in (Panagioti et al. 2015; Juurlink et al.2003) that patients with two or more diseases, acute orchronic, often take five or more different medicines simul-taneously and have immense health risks. Studies (Edwardsand Aronson 2000; Leape et al. 1995) have shown that 3-5% of all in-hospital misused prescriptions blame to igno-rances of adverse drug reactions, which is difficult to pro-hibit even for the highly trained and experienced clinicians.With the assistance from the conventional medicine recom-mendation systems, clinicians still need to cautiously ruleout those recommendations that have potential adverse ef-fects caused by drug-drug interactions. Most of the existingworks have largely ignored the exploit of medical facts inmedicines, such as drug-drug interactions, which is crucialin medicine recommendation system. One possible reasonmight because there is little medical expert knowledge inEMRs. Content limitations in EMRs constrain the systemsto barely associate accurate medical facts with the recom-mended prescriptions, which makes the final recommenda-tion less trustworthy for the complicated patients.

With the increasing emergence of knowledge graphs,many world-leading researchers have successfully extractedinformation from huge volumes of medical databases tobuild up giant heterogeneous graphs that reflect medicalfacts of medicines and diseases. For instance, DrugBank(Law et al. 2014) is a rich source of medicine information.It contains extensive entities (drugs, drug targets, chemistry,etc.) and relationships (enzymatic pathways, drug-drug in-teractions, etc.). ICD-9 ontology (Schriml et al. 2011) repre-sents a knowledge base of human diseases and can be usedto classify diagnoses of patients. Harnessing well-built med-ical knowledge graphs in EMRs-based medicine recommen-dation system might enable the reinvented system to pro-vide appropriate prescriptions for special patients, as well asalerts of possible side effects and serious drug-drug interac-

arX

iv:1

710.

0598

0v2

[cs

.IR

] 2

6 O

ct 2

017

Medical Knowledge Graph

interact

target

Electronic Medical Records

missing

missing

Figure 1: Left part is a medical knowledge graph with miss-ing relationships. Right part is prescription records in EMRs,and each edge indicates that a patient takes a medicine.There is no relationship between medicines in EMRs.

tions (DDIs).As shown in Figure1, linking EMRs and medical knowl-

edge graphs to generate a large and high-quality heteroge-neous graph is a promising pathway for medicine recom-mendations in a wider scope, but never easy. Specifically,the newly designed system confronts with the followingchallenges: 1. Computational Efficiency. Querying spe-cialized medical entities and relationships based on conven-tional graph-based algorithms have limitations in portabil-ity and scalability. The computational complexity becomesunfeasible when the heterogeneous graph reaches a verylarge scale. 2. Data Incompleteness. The medical knowl-edge graphs also follow the long-tail distribution as sameas other types of large-scale knowledge bases. Data incom-pleteness is another serious problem existing among entitiesand relationships in such a distribution. For example, sincethe DDIs is not usually identified in the clinical trial phase,there is a lack of significant DDIs in DrugBank which can-not support a comprehensive precaution to the medication.Last but not least, medicine recommendation suffers from3. Cold Start. As conventional systems normally recom-mend medicines based on the historical records, the pace ofrecommendation changes cannot keep up with the frequentupdates of new therapies and treatments in medical prac-tices. Little information on adverse reactions to the newlyupdated medicines in historical EMRs or even in well-builtknowledge graphs makes the evidence-based recommenda-tion model hardly support the new medicines as updated rec-ommendations.

Taking all the challenges above into account, we pro-pose a novel medicine recommendation framework basedon graph embedding techniques, inspired by the idea of linkprediction. We name our framework as Safe Medicine Rec-ommendation (SMR) throughout this paper. The recommen-dation process mainly includes:

1. A large heterogeneous graph is constructed from EMRsand medical knowledge graphs, where the nodes are en-tities (medicines, diseases, patients), and the edges (orlinks) represent various relations between entities, suchas drug-drug interactions.

2. The different parts of the generated heterogeneous graph(patient-medicine bipartite graph, patient-disease bipar-tite graph, medicine knowledge graph, disease knowledgegraph) are embedded into a shared low-dimension space

based on graph-based embedding models. Afterward, ajoint learning algorithm is proposed to optimize the inte-grated graph simultaneously.

3. Based on the learned embeddings, a new patient, repre-sented by the vectors of his/her diagnoses, is modeledas an entity in the disease-patient graph. Recommendingmedicines for the patient is translated to predict links fromthe patient to medicines.

The primary contributions of this work are summarized asfollows:

1. We have developed graph-based embedding models tolearn the effective representations of patients, diseases,and medicines in a shared low-dimension space. Therepresentation of medicines enables the proposed frame-work can even effectively recommend newly emergedmedicines for patients, which distinguishes most of theexisting works.

2. To recommend safe medicines for new patients, we pro-pose a novel method for modeling a patient based on thelearned graph embeddings and make a safe recommenda-tion by minimizing the potential adverse drug reactions.

3. We have conducted extensive experiments on large real-world datasets (MIMIC-III, DrugBank, and ICD-9 ontol-ogy) to evaluate the effectiveness of our framework. Theexperimental results have shown that the proposed frame-work outperforms all the compared methods.

4. To our best knowledge, we firstly propose a frameworkto conduct Safe Medicine Recommendation (SMR) andformulate it as a link prediction problem. The imple-mentation generates a high-quality heterogeneous graphin which relationships among patients, diseases, andmedicines can be unveiled in a wider scope.

The remainder of this paper is organized as follows: Sec-tion 2 details our proposed framework SMR. Section 3 re-ports the experimental results and Section 4 reviews relatedwork. Section 5 presents the conclusions and future work.

2 The Proposed FrameworkIn this section, we will first describe the notations and for-mulate medicine recommendation problem, and then presentgraph embedding models and how to use learned embed-dings to recommend safe medicines for patients.

2.1 Problem FormulationBefore we focus on the medicine recommendation problem,we first briefly introduce the important notations employedin the remainder of this paper. Table 1 also summarizesthem.

The medical knowledge graph describes the medical en-tities collected from the integrated sources, as well as rela-tionships among these entities. For instance, a triple (gluco-corticoid, adverse interaction, aspirin) indicates that there isa relationship adverse interaction from glucocorticoid to as-pirin in DrugBank. We define the medical knowledge graphas follow.

Datasets

Training

Jointly Learning EmbeddingsHeterogeneous Graph

Represent a new patient

DrugBank

Constructing

Medicine

Recommendation

Recommending

Disease Drug Patient

MIMIC-III

ICD-9 ontology

+ -

+ -

Disease embeddings Medicine embeddings Patient embeddings

Patient-Medicine edges Patient-Disease edges

Relation embeddings

Figure 2: Overview of our framework. Patient-disease and patient-medicine graphs are all bipartite graphs, while disease andmedicine graphs are general graphs. Patient, disease, and medicine are encoded into a low dimensional metric by graph-basedmethods. Diverse factors can be connected through patients.

Table 1: Notations.

Variable InterpretationN ,R the set of entities and relations

(h, r, t) a triple in knowledge graphh, t both are k dimensional vector embeddingsr a d dimensional relation embeddingHr a k × d dimensional projection matrix

p,m, d a patient, medicine, disease

p, m, d a k dimensional vector of a patient,a medicine or a disease

Rk,Rd k and d dimensional latent space ofentities and relations

Definition 1 (Medical Knowledge Graph) The medicalknowledge graph G = (N ,R) is a set of triples in the form(h, r, t), where N is a set of entities, R is a set of relations,h, t ∈ N and r ∈ R.

To capture the co-relationships of patients, diseases, andmedicines in EMRs, we define the patient-disease, patient-medicine bipartite graphs as follow.

Definition 2 (Patient-Medicine Bipartite Graph)The patient-medicine bipartite graph is denoted asGpm = (P ∪M, Epm), where P is a set of patients andMis a set of medicines. Epm is the set of edges. If a patientpi takes a medicine mj , there will be an edge eij betweenthem, otherwise none. The weight wij of the edge betweenpatient pi and medicine mj is defined as the total times ofpatient pi takes the medicine mj .

Definition 3 (Patient-Disease Bipartite Graph)The patient-disease bipartite graph is denoted asGpd = (P ∪ D, Epd), where P is a set of patientsandD is a set of diseases. Epd is the set of edges. If a patientpi is diagnosed with a disease dj , there will be an edge eijbetween them, otherwise none. The weight wij is set to 1when the edge eij exists.

Figure 2 illustrates a heterogeneous graph by construct-ing patient-disease, patient-medicine bipartite graphs from

MIMIC-III, and linking them to medical knowledge graphs,ICD-9 ontology, and DrugBank. Finally, we formally definethe safe medicine recommendation problem as follows.

PROBLEM 1 (Safe Medicine Recommendation) Given apatient p and his/her diagnoses dataset Dp, recommendingsafe medicines for each d ∈ Dp is predicting edges from p tomedicines datasetM. The output is a set of medicinesMp

with minimum drug-drug interactions.

2.2 Model Description and OptimizationIn this section, we propose embedding learning approachesto encode the heterogeneous graph in the latent space and itsoptimization method.

Medical Knowledge Graph Embedding A medicalknowledge graph G = (N ,R) is a multi-relational graph,in which entities N and relations R can be different types.For a triple, (h, r, t) ∈ G, we use bold letters h, r, tto denote the corresponding embedding representations ofh, r, t. Plenty of graph embedding methods has been pro-posed to encode a multi-relational graphs into a continuousvector space. Translation-based models (Bordes et al. 2013;Wang et al. 2014; Lin et al. 2015) regard the relation r ineach (h, r, t) as a translation from h to t within the low di-mensional space, i.e., h + r− t, and perform much moreeffectively and efficiently than conventional models. TransR(Lin et al. 2015) is a state-of-the-art translation-based em-bedding approach. It represents entities and relations in dis-tinct vector space bridged by relation-specific matrices to getbetter graph representations.

Consider the above reason, we set entities embeddingsh, t ∈ Rk and relations embeddings r ∈ Rd. And we set aprojection matrix Hr ∈ Rk×d, which projects entities fromentity space to relation space. We define the translations be-tween entities and get the energy function z(h, r, t) as:

z(h, r, t) = b− ‖hHr + r− tHr‖L1/L2 (1)

where b is a bias constant.

Then, the conditional probability of a triple (h, r, t) is de-fined as follows:

P (h|r, t) =exp{z(h, r, t)}∑

h∈N exp{z(h, r, t)}(2)

and P (t|h, r), P (r|h, t) can be defined in the analogousmanner. We define the likelihood of observing a triple(h, r, t) as:

L(h, r, t) = logP (h|r, t) + logP (t|h, r)+ logP (r|h, t) (3)

We define an objective function by maximizing the condi-tional likelihoods of existing triples in G:

LG =∑

(h,r,t)∈G

L(h, r, t) (4)

Based on Eq.(4), the objective functions of medicineand disease knowledge graph Gm = (Nm,Rm), Gd =(Nd,Rd) can be defined respectively:

LGm =∑

(hm,rm,tm)∈Gm

L(hm, rm, tm) (5)

LGd=

∑(hd,rd,td)∈Gd

L(hd, rd, td) (6)

Bipartite Graph Embedding Different from the medicalknowledge graph, the patient-disease, patient-medicine arebipartite graphs. A bipartite graph has only one single typeof relations, i.e., it is a homogeneous graph. For a homoge-neous graph, LINE (Tang et al. 2015) model achieves thestate-of-the-art performance of encoding the entities into acontinuous vector space while preserving co-relations infor-mation of the graph. Hence, we follow LINE and set pa-tients, medicines, and diseases embeddings p,m,d ∈ Rk.We present the process of encoding patient-medicine bipar-tite graph as follow.

Given a patient-medicine bipartite graph Gpm = (P ∪M, Epm). We first define the conditional probability of thata patient pi in set P takes medicine mj in setM as follow:

P (mj |pi) =exp{z(pi,mj)}∑

mj∈Piexp{z(pi, mj)}

(7)

where z(pi,mj) = mTj · pi, pi is the embedding vector

of the patient pi in P , and mj is the embedding vector ofmedicine mj in M. Eq. (7) defines a conditional distribu-tion P (·|pi) over all medicines inM. The empirical distri-bution P (·|pi)is defined as P (mj |pi) =

wij

sumi, where wij is

the weight of the edge eij and sumi =∑

j wij is the totaltimes that the patient pi takes medicines. We maximize thefollowing objective function:

LGpm= −

∑pi∈P

λid(P (·|pi), P (·|pi)) (8)

where d(·, ·) is the distance between two distributions. Inthis paper, we use KL-divergence to compute d(·, ·). As

sumi is different from patients, we use λi = sumi in theobjective function to represent the personalization of the pa-tient pi in the graph. After omitting some constants, we have:

LGpm=

∑eij∈Epm

wij log(P (mj |pi)) (9)

For the patient-disease bipartite graph, we can get the objectfunction LGpd

in the analogous manner:

LGpd=

∑eij∈Epd

wij log(P (dj |pi)) (10)

Optimization and Training To learn the medical knowl-edge graph and bipartite graphs embeddings simultane-ously, an intuitive approach is to collectively embed the fourgraphs (patient-medicine bipartite graph, patient-disease bi-partite graph, medicine knowledge graph, disease knowl-edge graph) by maximizing the sum of the four logarithmlikelihood objective functions just as follow:

L(X) = LGm+ LGd

+ LGpm+ LGpd

+ γC(X) (11)

where X stands for the embeddings Rk, Rd of entities andrelations in the heterogeneous graph we construct, γ is ahyper-parameter weighting the regularization factor C(X),which is defined as follows:

C(X) =∑

nm∈Nm

[||nm|| − 1]+ +∑

nd∈Nd

[||nd|| − 1]+

+∑

rm∈Rm

[||rm|| − 1]+ +∑

rd∈Rd

[||rd|| − 1]+

+∑p∈P

[||p|| − 1]+ +∑d∈D

[||d|| − 1]+ +∑

m∈M[||m|| − 1]+

(12)

where [x]+ = max(0, x) denotes the positive part of x. Theregularization factor will normalize the embeddings duringlearning. And we adopt the asynchronous stochastic gradi-ent algorithm (ASGD) (Recht et al. 2011) to maximize thetransformed objective function.

Optimizing objective functions Eq. (5), Eq. (6), Eq. (9)and Eq. (10) in Eq.(11) are computationally expensive, ascalculating them need to sum over the entire set of entitiesand relations. To address this problem, we use the negativesampling method (Mikolov et al. 2013) to transform the ob-jective functions.

For Eq.(5) and Eq.(6), we should transform logP (t|h, r),logP (r|h, t),logP (h|r, t) in Eq.(3). Taking P (t|h, r) as anexample, we maximize the following objective function in-stead of it:

log σ(z(h, r, t))

+

C1∑n=1

Ehn∼zneg({(h,r,t)})[σ(z(hn, r, t))](13)

where C1 is the number of negative examples, σ(x) =

1/(1 + exp(−x)) is the sigmoid function. {(h, r, t)} is theinvalid triple set, and zneg is a function randomly samplinginstances from {(h, r, t)}. When a positive triple (h, r, t) ∈

G is selected, to maximize Eq.(13), C1 negative triples areconstructed by sampling entities from an uniform distribu-tion over N and replacing the head of (h, r, t). The trans-formed objective of logP (r|h, t), logP (t|h, r) are maxi-mized in the same manner, but for logP (r|h, t), the nega-tive relations are sampled from a uniform distribution overR to corrupt the positive relation r ∈ (h, r, t). We iterativelyselect random mini-batch from the training set to learn em-beddings until converge.

For Eq. (9), we also use the negative sampling method totransform it to the following objective function:

log σ(z(pi,mj))

+

C2∑n=1

Emn∼zneg(m)[log σ(z(pi, mn))](14)

where σ(x) = 1/(1 + exp(−x)) is the sigmoid function,C2 is the number of negative edges. zneg(m) ∝ sum

3/4m

according to the empirical setting of (Mikolov et al. 2013),summ is the total number of times that the medicine m istaken by patients. we can simplify Eq.(10) and maximize itin the same way.

Finally, we can efficiently learn the embeddings of differ-ent types of parts in the heterogeneous graph.

2.3 Safe Medicine Recommendation ProcessIn this section, we present how to recommend safemedicines based on the learned embeddings and diagnosesof a given patient. For an existing patient p, we use thelearned embedding p to predict new medicine recommenda-tions. For a new given patient p, we first use diseases embed-dings of p’s diagnoses to represent p, and then recommendsafe medicines for p, as shown in Figure 2,.

New Patient Model We aim to present a new patient p byhis/her diagnoses embeddings. We should consider the timesequence of diseases that a patient is diagnosed, especiallyfor the patient with multiple diseases. Assume a patient p inthe hospital or on medication is associated with n ranked dis-eases according to their timestamps in an increasing order.Then, the patient embedding p can be encoded as follow:

p =

n∑t=1

exp−t · dt (15)

where dt is the t-th embedding of disease dt.

Medicine Recommendation Given a query patient p withthe query disease d, i.e., q = (p, d), we first project diseased and patient p into their latent space, and then select top-ksafe medicines1. More precisely, given a query q = (p, d),for each medicine m which could be useful for p, we com-pute its ranking score as in Eq. (16), and then select themedicinem with the top-k highest ranking scores as the rec-ommendation.

S(q,mn) = pT ·mn−n−1∑o=1

‖mn + rinteration−mo‖L1/L2

(16)1In MIMIC-III, patients in ICUs are sicker and usually need

more medicines for a diagnosis. We set k=3 in this paper.

Table 2: Entities and relations in the heterogeneous graph.

Entities Relations#Disease 6,985 #Medicine-related 71,460#Medicine 8,054 #Disease-related 170,579#Patient 46,520 #Patient-Disease 489,923#Medicine-related 305,642 #Patient-Medicine 975,647

where p is the representation of patient p and mn is then-th medicines to be considered from medicines M basedon the the already selected medicine m1, ...,mn−1.

3 Experiments and EvaluationWe attempt to demonstrate the effectiveness of our recom-mendation method in this section, which is referred to asSMR in this paper. In particular, we expect to answer “howwell does our method compare with the competing tech-niques?” in Section 3.2. The results show that our recom-mendation method significantly outperforms the three base-lines. The detailed experimental settings of our evaluationsare described in Section 3.1.

3.1 Experimental SettingsData Sets Our experiments are performed on the realEMRs datasets, MIMIC-III (Johnson et al. 2016), and twomedical knowledge graphs, ICD-9 ontology (Schriml et al.2011) and DrugBank (Law et al. 2014). These real datasetsare publicly available in different forms.

• MIMIC-III (Medical Information Mart for Intensive CareIII) collected bedside monitor trends, electronic medicalnotes, laboratory test results, and waveforms from theICUs (Intensive Care Units) of Beth Israel DeaconessMedical Center between 2001 and 2012. It contains dis-tinct 46,520 patients, 650,987 diagnoses and 1,517,702prescription records that associated with 6,985 distinctdiseases and 4,525 medicines.

• ICD-9 ontology2 (International classification of diseases-version 9) contains 13,000 international standard codes ofdiagnoses and the relationships between them.

• DrugBank is a bioinformatics/cheminformatics resourcewhich consists of medicine related entities. The medi-cal knowledge graph version3 contains 8,054 medicines,4,038 other related entities (e.g., protein or drug targets)and 21 relationships.

Heterogeneous Graph Construction We connect MIMIC-III, ICD-9 ontology, and DrugBank (medicine group 1) byconstructing the patient-medicine bipartite graph and thepatient-disease bipartite graph.

For the patient-disease bipartite graph, MIMIC-III pro-vides ICD-9 codes for diagnoses, which implicitly the di-agnoses of MIMIC-III can be linked to ICD-9 ontology bystring matching. For the patient-medicine bipartite graph,

2http://bioportal.bioontology.org/ontologies/ICD9CM

3http://wifo5-03.informatik.uni-mannheim.de/drugbank/

Table 3: Experiments on medicine group 1.

Prediction accuracy DDIs rateRule-based 0.3068 32.01%

K-Most frequent 0.3473 14.08%LEAP 0.5582 1.66%SMR 0.6113 0.17%

Table 4: Experiments on medicine group 2.

Prediction accuracy DDIs rateRule-based 0.2736 27.01%

K-Most frequent NA NALEAP NA NASMR 0.5214 2.01%

the prescriptions in MIMIC-III consist of the drug informa-tion, e.g., the names, the duration, and the dosage. However,various names to a single type of medicine in MIMIC-III ex-ist due to some noisy words (20%, 50ml, glass bottle, etc.),which becomes an obstacle to link medicine names to Drug-Bank when directly applying to the string matching method.We use an entity linking method (Wang et al. 2017) insteadto address this problem. Table 2 shows the statistic of theheterogeneous graph we construct. The heterogeneous graphwill be used to learn low-dimension representations of enti-ties and relations by the SMR framework. Afterward, we cat-egorize the medicines in the heterogeneous graph into twogroups: 1). The first group consists of all 4,525 medicinesthat are recorded in EMRs, and will be used as inputs ofthe baseline methods. 2). The second group contains 3,529medicines that haven’t been observed in EMRs, and will beused as test data for cold start recommendation.

Baselines We compare our SMR with the following base-lines:

• Rule-based method (Almirall et al. 2012) recommendsmedicines based on mappings from existing medicinecategories to diseases in the MEDI database (Wei et al.2013). For each disease, a drug is assigned to the patientaccording to the mappings.

• K-Most frequent method is a basic baseline which re-trieves the topK medicines that most frequently co-occurwith each disease as their recommendation. We setK = 3in this paper.

• LEAP method (Zhang et al. 2017) uses a Multi-InstanceMulti-Label learning framework to train a predictivemodel taking disease conditions as input features andyielding multiple medicine labels as recommendations.

3.2 Evaluation MethodsTo guarantee medicine recommendations generated by SMRwork effectively, we evaluate four indice, the prediction ac-curacy, the ability to avoid adverse drug-drug interactions,the experienced clinical doctor assessments, and the capac-ity to process cold start problem. In all experiments, the ratioof training to validation to test sets is 0.7:0.1:0.2. The hyper-parameters was adjusted by a validation set.

Prediction Accuracy and DDIs Rate We utilize JaccardCoefficient to compare the similarity of the prescriptionsgenerated by SMR and the corresponding prescriptions writ-ten by doctors. Given the recommendation medicines setMi

generated by SMR for a patient pi, Mi is the medicines setprescribed by doctors in the data. The mean of Jaccard coef-

ficient is defined as follows,

Jaccard =1

K

K∑i

|Mi ∩ Mi||Mi ∪ Mi|

(17)

where K is the number of samples in a test set. Table 3shows the accuracy of the baselines and SMR on medicinegroup 1, the rule-based method performs the worst becauseit is the only one provides a general recommendation for aspecific diagnosis and it is not able to endow personalizedrecommendations, especially for the patients with multiplediseases. The frequency of each medicine-disease pair re-mains high in ICUs. Hence, recommendations based on fre-quency,K-Most frequent method, also work deficiently. Ourmethod SMR outperforms LEAP by 1.49% because moreaccurate medical facts are involved in medical knowledgegraphs rather than the prescription information in EMRs.

We extract all adverse drug-drug interactions (DDIs) fromDrugBank to evaluate whether medicine recommendationsembrace unsafe DDIs. Table 3 shows the percentages ofdifferent medicine recommendations consisting of adverseDDIs. The result indicates that SMR can recommend mostharmless medicines for patients as its drug interaction rateis the lowest. The rule-based method and K-Most frequentmethod select medicines by a greedy strategy only regard-less of specific adverse DDIs. For the rarely used medicinesand unknown DDIs in EMRs, SMR is more reliable thanLEAP. The reason is that SMR can predict each patient-medicine link and compute potential hidden DDIs by thelearned embeddings of medical knowledge graphs.

Cold Start We evaluate the ability of baselines and SMRin addressing cold start medicine recommendations on themedicine group 2. K-Most frequent method and LEAP arenot applicable (NA) on recommending new medicines inthe cold start scenario. Since our SMR process can presentnew medicines by the learned vector representations of usedmedicines, the potential patient-medicine links between coldstart medicines and patients will be captured correspond-ingly. In other words, SMR can leverage not only the patient-medicine links in EMRs but also the medical knowledgegraphs when recommending cold start medicines.

Clinical Assessment We invited three experienced clini-cal experts to evaluate the effectiveness of the medicine rec-ommendations by scoring on a 6-point scale: 5 correspond-ing to completely cover all diagnoses without DDIs; 4 topartially (at least 50%) diagnoses include without DDIs; 3to completely cover all diagnoses with DDIs; 2 to less than50% diagnoses without DDIs; 1 to partially (at least 50%)diagnoses covered with DDIs; 0 to less than 50% diagnoses

Table 5: Examples of medicine recommendations generated by Baselines and SMR.

Diagnosis Methods Medicine Recommendations

SepsisAcute respiratry failureHypertension

Rule-based Teicoplanin, MetoprololK-Most frequent Vancomycin, Furosemide, Metoprolol, InsulinLEAP Vancomycin, Furosemide, Metoprolol TartrateSMR Vancomycin, Furosemide, Amlodipine, Norepinephrine, Acetaminophen

Type 2 diabetesRheumatoid arthritisHypertensionHyperlipidemia

Rule-based Gliclazide, Phenylbutazone, Sulfasalazine, FenofibrateK-Most frequent Furosemide, Tolbutamide, Phenylbutazone, Metoprolol, Insulin, AcetaminophenLEAP Metformin, Amethopterin, Amiloride/HCTZ, FenofibrateSMR Metformin, Insulin, Acetaminophen, Nifedipine, Fenofibrate

SMR LEAP K-Most frequent Rule-based0

1

2

3

4

5

Ave

rage

Sco

re

Medicine Group 1 Medicine Group 2

Figure 3: Clinical Assessment.

with DDIs. The average score of three experts is used as thefinal clinical assessment score for each recommendation, asshown in Figure 3.

Case Study In table 5 we illustrate two events of medicinerecommendations on medicine group 1 for patients associ-ated with multiple types of diseases. SMR is qualified to suc-ceed in all these two cases when comparing it against otherbaselines. For the first patient, SMR recommended a set ofmedicines with 100% coverage, with Vancomycin for Sep-sis, Norepinephrine, Acetaminophen for respiratry failure,Furosemide and Amlodipine for Hypertension. In contrast,other baselines are not capable of make an adequate consid-eration. The rule-based method adopted Teicoplanin, target-ing Sepsis only and not appropriate. The K-Most frequentmethod and LEAP only selected Vancomycin for Sepsis andother medicines for Hypertension. For the second patient,SMR recommends more suitable medicines than LEAP andRule-based method, i.e., Metformin and Insulin for Type2 diabetes, Acetaminophen to release Rheumatoid arthritis,Nifedipine for Hypertension, and Fenofibrate for Hyperlipi-demia. There is an adverse DDI among the medicines rec-ommended by the K-Most frequent method. Tolbutamideand Phenylbutazone can lead to harmful, potentially fatal ef-fects when taken together. This case also indicates SMR canavoid the adverse DDIs when recommending medicines.

4 Related WorkIn this section, we discuss related work, including medicinerecommendation and medical knowledge graphs.

Medicine Recommendation As introduced in Section 1,two types of methods, rule-based protocols (Chen et al.2016; Gunlicks-Stoessel et al. 2016; Almirall et al. 2012),and supervised-learning-based methods (Zhang et al. 2017;Zhang et al. 2014), are currently utilizing EMRs to recom-mending medicines. Ideally, medicine recommendation sys-tems aim to tailor treatment to the individual characteris-tics of each patient (Fernald et al. 2011). Hence, medicinerecommendation has also received attention recently in ge-netics/genomics research fields. There are already existingmedicine recommendation systems (Rosen-Zvi et al. 2008;Bennett and Hauser 2013) by leveraging genetics/genomicsinformation of patients in current practice, such informationis not yet widely available in everyday clinical practice, andis insufficient since it only addresses one of many factorsaffecting response to medication.

Medical Knowledge Graphs Recent evaluation effortson knowledge graphs have focused on automatic knowl-edge base population and completion. Some knowledgegraphs have also been constructed from huge volumesof medical databases over the last years, such as (Ernst,Siu, and Weikum 2015), Bio2RDF(Dumontier et al. 2014),and Chem2Bio2RDF(Chen et al. 2010). Medical knowl-edge graphs contain an abundance of basic medical factsof medicines and diseases and provide a pathway for med-ical discovery and applications, such as effective medicinerecommendations. Unfortunately, such medical knowledgegraphs suffer from serious data incomplete problem, whichimpedes its application in the field of clinical medicine.

5 Conclusion and Future WorkIn this paper, we propose a novel framework SMR to rec-ommend safe medicines for patients, especially for the pa-tients with multiple diseases. SMR first constructs a high-quality heterogeneous graph by bridging EMRs (MIMIC-III) and medical knowledge graphs (ICD-9 ontology andDrugBank). Then, SMR jointly embeds diseases, medicines,patients, and their corresponding relations into a sharedlower dimensional space. Finally, SMR uses the embeddingsto decompose the medicine recommendation into a linkedprediction process while considering the patient’s diagnosesand adverse drug reactions. Extensive experiments on realworld datasets are conducted and demonstrate the effective-ness of SMR. In future work, we will improve the linkingaccuracy by considering more information of patients, suchas the clinical outcomes and demographics.

References[Almirall et al. 2012] Almirall, D.; Compton, S. N.;Gunlicks-Stoessel, M.; Duan, N.; and Murphy, S. A. 2012.Designing a pilot sequential multiple assignment random-ized trial for developing an adaptive treatment strategy.Statistics in Medicine 31(17):1887–1902.

[Bennett and Hauser 2013] Bennett, C. C., and Hauser, K.2013. Artificial intelligence framework for simulating clin-ical decision-making: A markov decision process approach.Artificial Intelligence in Medicine 57(1):9–19.

[Bordes et al. 2013] Bordes, A.; Usunier, N.; Garcia-Duran,A.; Weston, J.; and Yakhnenko, O. 2013. Translating em-beddings for modeling multi-relational data. In Advances inneural information processing systems, 2787–2795.

[Chen et al. 2010] Chen, B.; Dong, X.; Jiao, D.; Wang, H.;Zhu, Q.; Ding, Y.; and Wild, D. J. 2010. Chem2bio2rdf: asemantic framework for linking and data mining chemoge-nomic and systems chemical biology data. BMC Bioinfor-matics 11(1):255.

[Chen et al. 2016] Chen, Z.; Marple, K.; Salazar, E.; Gupta,G.; and Tamil, L. 2016. A physician advisory system forchronic heart failure management based on knowledge pat-terns. Theory and Practice of Logic Programming 16(5-6):604–618.

[Dumontier et al. 2014] Dumontier, M.; Callahan, A.; Cruz-Toledo, J.; Ansell, P.; Emonet, V.; Belleau, F.; and Droit,A. 2014. Bio2rdf release 3: a larger connected networkof linked data for the life sciences. In Proceedings of the2014 International Conference on Posters & Demonstra-tions Track-Volume 1272, 401–404. CEUR-WS. org.

[Edwards and Aronson 2000] Edwards, I. R., and Aronson,J. K. 2000. Adverse drug reactions: definitions, diagnosis,and management. The Lancet 356(9237):1255–1259.

[Ernst, Siu, and Weikum 2015] Ernst, P.; Siu, A.; andWeikum, G. 2015. Knowlife: a versatile approach for con-structing a large knowledge graph for biomedical sciences.BMC Bioinformatics 16(1):157.

[Fernald et al. 2011] Fernald, G. H.; Capriotti, E.;Daneshjou, R.; Karczewski, K. J.; and Altman, R. B.2011. Bioinformatics challenges for personalized medicine.Bioinformatics 27(13):1741–1748.

[Gunlicks-Stoessel et al. 2016] Gunlicks-Stoessel, M.; Muf-son, L.; Westervelt, A.; and Almirall, D. 2016. A pilot smartfor developing an adaptive treatment strategy for adolescentdepression. Journal of Clinical Child & Adolescent Psychol-ogy 45(4):480–494.

[Johnson et al. 2016] Johnson, A. E.; Pollard, T. J.; Shen, L.;Lehman, L.-w. H.; Feng, M.; Ghassemi, M.; Moody, B.;Szolovits, P.; Celi, L. A.; and Mark, R. G. 2016. Mimic-iii, a freely accessible critical care database. Scientific Data3.

[Juurlink et al. 2003] Juurlink, D. N.; Mamdani, M.; Kopp,A.; Laupacis, A.; and Redelmeier, D. A. 2003. Drug-drug interactions among elderly patients hospitalized fordrug toxicity. Journal of The American Medical Associa-tion 289(13):1652–1658.

[Law et al. 2014] Law, V.; Knox, C.; Djoumbou, Y.; Jewi-son, T.; Guo, A. C.; Liu, Y.; Maciejewski, A.; Arndt, D.;Wilson, M.; Neveu, V.; et al. 2014. Drugbank 4.0: shed-ding new light on drug metabolism. Nucleic Acids Research42(D1):D1091–D1097.

[Leape et al. 1995] Leape, L. L.; Bates, D. W.; Cullen, D. J.;Cooper, J.; Demonaco, H. J.; Gallivan, T.; Hallisey, R.; Ives,J.; Laird, N.; Laffel, G.; et al. 1995. Systems analysis ofadverse drug events. Journal of The American Medical As-sociation 274(1):35–43.

[Lin et al. 2015] Lin, Y.; Liu, Z.; Sun, M.; Liu, Y.; and Zhu,X. 2015. Learning entity and relation embeddings forknowledge graph completion. In Proceedings of the 29thAAAI Conference on Artificial Intelligence, 2181–2187.

[Mikolov et al. 2013] Mikolov, T.; Sutskever, I.; Chen, K.;Corrado, G. S.; and Dean, J. 2013. Distributed represen-tations of words and phrases and their compositionality. InAdvances in Neural Information Processing Systems, 3111–3119.

[Panagioti et al. 2015] Panagioti, M.; Stokes, J.; Esmail, A.;Coventry, P.; Cheraghi-Sohi, S.; Alam, R.; and Bower, P.2015. Multimorbidity and patient safety incidents in pri-mary care: a systematic review and meta-analysis. PloS One10(8):e0135947.

[Recht et al. 2011] Recht, B.; Re, C.; Wright, S.; and Niu,F. 2011. Hogwild: A lock-free approach to parallelizingstochastic gradient descent. In Advances in Neural Informa-tion Processing Systems, 693–701.

[Rosen-Zvi et al. 2008] Rosen-Zvi, M.; Altmann, A.; Pros-peri, M.; Aharoni, E.; Neuvirth, H.; Sonnerborg, A.;Schulter, E.; Struck, D.; Peres, Y.; Incardona, F.; et al. 2008.Selecting anti-hiv therapies based on a variety of genomicand clinical factors. Bioinformatics 24(13):i399–i406.

[Schriml et al. 2011] Schriml, L. M.; Arze, C.; Nadendla,S.; Chang, Y.-W. W.; Mazaitis, M.; Felix, V.; Feng, G.;and Kibbe, W. A. 2011. Disease ontology: a backbonefor disease semantic integration. Nucleic Acids Research40(D1):D940–D946.

[Tang et al. 2015] Tang, J.; Qu, M.; Wang, M.; Zhang, M.;Yan, J.; and Mei, Q. 2015. Line: Large-scale informationnetwork embedding. In Proceedings of the 24th Interna-tional Conference on World Wide Web, 1067–1077. Interna-tional World Wide Web Conferences Steering Committee.

[Wang et al. 2014] Wang, Z.; Zhang, J.; Feng, J.; and Chen,Z. 2014. Knowledge graph embedding by translating onhyperplanes. In Proceedings of the 28th AAAI Conferenceon Artificial Intelligence, 1112–1119.

[Wang et al. 2017] Wang, M.; Zhang, J.; Liu, J.; Hu, W.;Wang, S.; Li, X.; and Liu, W. 2017. Pdd graph: Bridg-ing electronic medical records and biomedical knowledgegraphs via entity linking. In International Semantic WebConference. Springer.

[Wei et al. 2013] Wei, W.-Q.; Cronin, R. M.; Xu, H.; Lasko,T. A.; Bastarache, L.; and Denny, J. C. 2013. Developmentand evaluation of an ensemble resource linking medications

to their indications. Journal of the American Medical Infor-matics Association 20(5):954–961.

[Zhang et al. 2014] Zhang, P.; Wang, F.; Hu, J.; and Sor-rentino, R. 2014. Towards personalized medicine: leverag-ing patient similarity and drug similarity analytics. Proceed-ings of AMIA Summits on Translational Science 2014:132.

[Zhang et al. 2017] Zhang, Y.; Chen, R.; Tang, J.; Stewart,W. F.; and Sun, J. 2017. Leap: Learning to prescribe effec-tive and safe treatment combinations for multimorbidity. InProceedings of the 23rd ACM SIGKDD International Con-ference on Knowledge Discovery and Data Mining, 1315–1324. ACM.