Leveraging Deep Neural Networks and Semantic Similarity Measures for Medical Concept Normalization in User Reviews

Miftahutdinov Z.Sh., Tutubalina E.V.
Kazan Federal University
1 June 2018


Problem Description: Problem

Medical concept normalization: mapping a disease mention to a concept in a controlled vocabulary.

Examples:

∙ inflammation in my neck -> C0263854, Cervical arthritis
∙ very painful joints -> C3864084, Arthralgia
∙ can't sleep -> C0393758, Insomnia
∙ high BP -> C0020538, Increased venous pressure

KFU 1 june 2018 2 / 12

Problem Description: Possible Bottlenecks

∙ Social-media language
∙ Ambiguity
∙ Vocabulary variations
∙ Abbreviations
∙ Variety of target vocabularies

Background: MetaMap and DNorm

MetaMap
∙ Mapping to UMLS
∙ Rule-based

DNorm
∙ Mapping to MEDIC
∙ Pairwise learning to rank

Background: Social Media Mining for Health

Dataset
∙ Twitter
∙ Mapping to MedDRA concepts
∙ Training set: 6650 phrases, 472 concepts
∙ Test set: 2500 phrases, 254 concepts

Background: Social Media Mining for Health

Team Results

Team      Model                     Accuracy (%)
gnTeam    Logistic Regression       87.7
gnTeam    Bi-GRU                    85.5
gnTeam    Ensemble                  88.5
UKNLP     Hierarchical Char-LSTM    87.2
UKNLP     Hierarchical Char-CNN     87.7

Background: Limsopatham and Collier

∙ Data from AskaPatient
∙ Mapping to SNOMED
∙ 8411 phrases
∙ 1029 unique codes
∙ 81% accuracy

Proposed Model: Spell Checking

∙ Levenshtein distance
∙ Up to 3 misspellings
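The spell-checking step can be illustrated with a minimal sketch. It assumes that "up to 3 misspellings" means an edit distance of at most 3 between a token and a vocabulary word, which the slide does not state explicitly. The function names `levenshtein` and `correct` and the toy vocabulary are hypothetical, not taken from the authors' code.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance with unit-cost insertion, deletion, and substitution."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,               # delete ca
                curr[j - 1] + 1,           # insert cb
                prev[j - 1] + (ca != cb),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]


def correct(token, vocabulary, max_distance=3):
    """Return the closest vocabulary word within max_distance, else the token unchanged."""
    if token in vocabulary:
        return token
    # Ties are broken alphabetically so the result is deterministic.
    best = min(vocabulary, key=lambda w: (levenshtein(token, w), w))
    return best if levenshtein(token, best) <= max_distance else token


# Toy vocabulary (hypothetical); a real system would use a much larger word list.
VOCAB = ["appetite", "insomnia", "arthralgia", "inflammation"]
print(correct("apetite", VOCAB))
print(correct("insomia", VOCAB))
```

A threshold of 3 is permissive for short tokens, so a practical system might scale the threshold with token length; that refinement is not part of the slide.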

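Proposed Model: Neural Network Architecture

[Figure: network architecture. Labels recovered from the diagram: input phrase "very poor appetite" (word w1 shown); Embedding Layer; Bidirectional RNN with attention (hidden states h1, h2, h3 and h'1, h'2, h'3; attention weights a1, a2, a3); RNN Features; UMLS; Semantic similarity features; Softmax; output Medical Concept C0232462, "Decrease in appetite".]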
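The data flow suggested by the architecture diagram can be sketched roughly as follows: attention weights pool the RNN hidden states into one feature vector, that vector is concatenated with the semantic similarity features, and a softmax layer scores the candidate concepts. This is only an illustration under assumptions. The dot-product attention with a query vector, the helper names, and all numeric values are hypothetical, and the recurrent layer itself is omitted (its hidden states are passed in as plain lists).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(hidden_states, query):
    """Weighted sum of hidden states, with weights a_t = softmax(h_t . query)."""
    scores = [sum(h_k * q_k for h_k, q_k in zip(h, query)) for h in hidden_states]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [sum(w * h[k] for w, h in zip(weights, hidden_states)) for k in range(dim)]
    return pooled, weights


def classify(hidden_states, query, sim_features, W, b):
    """Concatenate pooled RNN features with similarity features, then apply softmax."""
    pooled, _ = attention_pool(hidden_states, query)
    features = pooled + sim_features  # list concatenation
    logits = [sum(w * x for w, x in zip(row, features)) + bias for row, bias in zip(W, b)]
    return softmax(logits)


# Toy inputs (hypothetical): three hidden states of size 2, two similarity
# features, and two candidate concepts.
H = [[0.1, 0.3], [0.7, 0.2], [0.4, 0.9]]
q = [1.0, 0.5]
sim = [0.8, 0.1]
W = [[0.5, -0.2, 1.0, 0.0],
     [-0.3, 0.4, 0.0, 1.0]]
b = [0.0, 0.0]

_, a = attention_pool(H, q)
probs = classify(H, q, sim, W, b)
print(a)
print(probs)
```

In a trained model, `q`, `W`, and `b` would be learned parameters and `H` would come from the bidirectional RNN over the embedded tokens.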

Proposed Model: Semantic Similarity Features
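The body of this slide did not survive extraction. Since the results refer to TFIDF variants of the models, one plausible reading (an assumption, not confirmed by the slides) is that the features are cosine similarities between TF-IDF vectors of the mention and of candidate concept names. The sketch below illustrates that reading only; the function names, the whitespace tokenizer, and the smoothed IDF formula are hypothetical choices.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Sparse TF-IDF vectors (dicts) over whitespace tokens, with smoothed IDF."""
    tokenized = [d.lower().split() for d in docs]
    n = len(tokenized)
    df = Counter(t for toks in tokenized for t in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def similarity_features(mention, concept_names):
    """One similarity score per candidate concept name."""
    vectors = tfidf_vectors(concept_names + [mention])
    mention_vec = vectors[-1]
    return [cosine(mention_vec, c) for c in vectors[:-1]]


# Concept names taken from the examples in this deck.
CONCEPTS = ["decrease in appetite", "insomnia", "arthralgia"]
feats = similarity_features("very poor appetite", CONCEPTS)
print(feats)
```

Here only the first concept shares a token ("appetite") with the mention, so it is the only one with a nonzero score.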

Results

Model                  Accuracy (%)
DNorm                  73.39
CNN                    81.41
RNN                    79.98
GRU + Attn, TFIDF      85.71

New Folds
Model                  Accuracy (%)
CNN                    46.19
LSTM                   64.51
GRU                    63.05
LSTM + Attention       65.73
GRU + Attention        67.08
LSTM + Attn, TFIDF     67.63
GRU + Attn, TFIDF      69.92

Limitations and Future Work

∙ SMM4H dataset
∙ Take context into account
∙ Ranking algorithms
