online forums dalu_nov20

Marina Sokolova Institute for Big Data Analytics DECM and School of EECS, University of Ottawa, [email protected]

Upload: viorel-dodon

Post on 12-Apr-2017



M. Sokolova, Opinion Mining of Online Medical Forums


19%-28% of Internet users participate in online health discussions.

In North America, 59% of all adults have looked online for information about a range of health issues, the most popular being specific diseases and treatments.


Personal health information (PHI) is information about one’s health discussed by a patient in a clinical setting

PHI is the most vulnerable private information posted online

I have a family history of Alzheimer's disease. I have seen what it does and its sadness is a part of my life. I am already burdened with the knowledge that I am at risk.

We're going for the basic blood tests, the NT scan, and the "Ashkenazi panel" since both XX and I are Jewish from E. European descent.


26% of adult internet users have read or watched someone else’s health experience about health or medical issues in the past 12 months.

16% of adult internet users in the U.S. have gone online in the past 12 months to find others who share the same health concerns.

Up to 49% of the users are most interested in personal testimonials related to health

< 25% of the users are interested only in facts


Understanding of PHI posted by the general public is important for the development of health care policies.

I really dont know why everyones freaking out about the H1N1 vaccine. I got it the first day it came out (about a week and a half ago) and so did 4 of my family members. None of us had any problems and were all really glad we got the vaccine.

Before social networks, PHI studies were conducted on restricted and controlled groups (e.g., nuns from the same monastery, patients of the same clinic).

Time-, event- and location-dependent!


Manual analysis (MA)

- pros: the most accurate (80-95% of labels coincide); can work with any type of data;
- cons: effort- and labor-intensive; inviting more annotators can improve or hurt accuracy; agreement depends on the topic (kappa 0.60–0.73);
- data size: up to 1000 text units per person.

Fully automated (FA)

- pros: fast and portable;
- cons: works only with certain types of data; accuracy varies widely (50-80%);
- data size: 1000–10,000 units.

Automated, with humans in the loop

- pros: relatively accurate (65-80%); can work with any type of data;
- cons: less portable than FA; less accurate than MA;
- data size: 500–10,000 units.
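The kappa values quoted for manual analysis measure agreement between annotators beyond chance. The slides do not say which kappa variant was used for that range; as a minimal sketch, assuming two annotators and Cohen's kappa, agreement could be computed as follows. The annotations are hypothetical.

```python
from collections import Counter

def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same text units."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: share of units where both annotators chose the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: expected overlap if each annotator labelled at random
    # according to their own label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical annotations of ten posts (P = positive, N = negative).
annotator_1 = list("PPPPPNNNNN")
annotator_2 = list("PPPPNNNNNP")
print(round(cohen_kappa(annotator_1, annotator_2), 2))
```

Here the annotators agree on 8 of 10 posts, but half of that agreement is expected by chance, so kappa is noticeably lower than the raw 80% agreement.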


Open-source data analytics kit:

- Social Mining – framework
- Information Extraction – knowledge-based search
- Machine Learning – processing of (extremely) large data
- Opinion Mining – semantic analysis of data

Strategically placed humans in the loop:

- data annotation by 2-3 annotators
- exhaustive verification of positive results; random verification of negative results

PHI resources:

- ontology of PHI terms
- HealthAffect lexicon


We used random posts to verify whether the messages were self-evident for sentiment annotation or required additional context.

We looked for discussions where the forum participants discussed only one topic. A preliminary analysis showed that discussions with ≤ 30 posts satisfied this condition.

We wanted discussions to be long enough to form a meaningful discourse. This condition was satisfied when a discussion had ≥ 5 messages.
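The two length conditions above amount to a simple filter. A minimal sketch, assuming each discussion is available as a list of posts; the data structure and names are illustrative, not taken from the study:

```python
MIN_POSTS = 5   # long enough to form a meaningful discourse
MAX_POSTS = 30  # short enough to stay on a single topic

def select_discussions(discussions):
    """Keep discussions whose number of posts lies in [MIN_POSTS, MAX_POSTS]."""
    return {
        title: posts
        for title, posts in discussions.items()
        if MIN_POSTS <= len(posts) <= MAX_POSTS
    }

# Hypothetical data: discussion title -> list of posts.
discussions = {
    "too short": ["post"] * 3,
    "kept": ["post"] * 12,
    "too long": ["post"] * 45,
}
selected = select_discussions(discussions)
print(sorted(selected))
```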


Social Mining identifies demographic parameters which influence and imply language parameters.

Information Extraction uses those language parameters to find and retrieve relevant information from text.

I have a family history of Alzheimer's disease. I have seen what it does and its sadness is a part of my life. I am already burdened with the knowledge that I am at risk.

Record   A1    ...   A5000   Class
1        1     ...   0       P
2        1     ...   1       N
...
3000     0     ...   1       N
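A table of this shape can be built from raw comments. The slides do not say what the attributes A1 ... A5000 are; the sketch below assumes they are binary word-occurrence indicators (a bag-of-words), and the comments and labels are hypothetical.

```python
import re

def tokenize(text):
    """Lowercase a comment and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def binary_bag_of_words(comments):
    """Return (vocabulary, rows); rows[i][j] is 1 if word j occurs in comment i."""
    token_sets = [set(tokenize(c)) for c in comments]
    vocabulary = sorted(set().union(*token_sets))
    rows = [[int(word in tokens) for word in vocabulary] for tokens in token_sets]
    return vocabulary, rows

# Hypothetical comments with hypothetical class labels (P / N as in the table).
comments = ["I am glad we got the vaccine", "I am at risk"]
labels = ["P", "N"]
vocabulary, rows = binary_bag_of_words(comments)
for record, (row, label) in enumerate(zip(rows, labels), start=1):
    print(record, row, label)
```

Each printed line corresponds to one record of the table: a record number, one 0/1 value per vocabulary word, and the class.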


In the table, each record represents one comment.

Learning modes – do we know the data labels?

Training and test stages – training and test sets should not overlap!

Algorithms – simpler often means more robust

Best model selection – it is always worth trying several parameter settings

Performance evaluation – verify results on positive AND negative classes
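Two of the points above, non-overlapping training and test sets and evaluation on both classes, can be illustrated with a small sketch. The records, the split ratio and the "always negative" classifier are hypothetical.

```python
import random

def split_records(records, test_fraction=0.2, seed=0):
    """Shuffle record indices and cut them into disjoint training and test sets."""
    indices = list(range(len(records)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    assert not set(train_idx) & set(test_idx)  # the two sets must not overlap
    return [records[i] for i in train_idx], [records[i] for i in test_idx]

def recall_per_class(true_labels, predicted_labels):
    """Recall for every class, so a weak class is not hidden by a strong one."""
    result = {}
    for cls in sorted(set(true_labels)):
        hits = sum(t == cls and p == cls for t, p in zip(true_labels, predicted_labels))
        result[cls] = hits / true_labels.count(cls)
    return result

# Hypothetical labelled records: (record id, class); 5 positives, 15 negatives.
records = [(i, "P" if i % 4 == 0 else "N") for i in range(20)]
train, test = split_records(records)

# A classifier that always answers "N" is right on 75% of these records,
# yet it never finds a single positive one.
true = [label for _, label in records]
always_n = ["N"] * len(true)
print(recall_per_class(true, always_n))
```

Reporting results per class exposes the failure that overall accuracy would hide.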


General health information: they are promoting cancer awareness particularly lung cancer

Personal health information: I had a rare condition and half of my lung had to be removed

Irrelevant: I saw a guy chasing someone and screaming at the top of his lungs

Terminology: the transfer went well - my RE did it himself which was comforting. 2 embies (grade 1 but slow in development) so I am not holding my breath for a positive

Technical terms: Someone with 50 DB hearing aid gain with a total loss of 70 DB may not know that the place is producing 107 DB since it may not appear too loud to him since he only perceives 47 DB


Sentiment: I am sickened by the thought …

Ailment: I feel sick for awhile; should see my physician

Opinion: I think it is evident that …

Improvement: The benefit is usually evident within a few days of starting it

Humor: don't forget that it's better for your health to enjoy your steak than to resent your sprouts

Complaint: After that my health deteriorated …

Non-textual features

People use few emoticons when discussing PHI

People do not create new hashtags about PHI

General purpose lexicons

WordNet’s semantics needs substantiation

SentiWordNet, WordNetAffect require considerable adjustment


Electronic medical dictionaries are developed to analyze scientific publications

the Medical Dictionary for Regulatory Activities (MedDRA): 8,561 unique terms / 86 PHI terms

the Systematized Nomenclature of Medicine Clinical Terms (SNOMED CT): 44,802 unique terms / 108 PHI terms


55 discussions, 1008 comments from North American online forums gathered in Summer 2013.

Topics discussed by many participants:

Security and safety of retention/storage of DNA data

Exploitation of DNA data by government and/or insurance companies

Anonymity of the data for research purposes

And you have no problem with this government owning their genetic code, potentially knowing illness, disabilities, strengths, weaknesses and potential? A trusting soul you are indeed.


From 7238 discourse units, we identified that the forum users

- are patient: Elaboration – 49%; Explanation and background – 8%
- want to convince other participants: Joint units – 17%; Enabling, attribution, summary – 7%
- can be skeptical: Evaluation and conditions – 10%
- can argue: Contrast – 8%


Text unit: one comment

Classification categories: positive, negative, mix, and irrelevant. Each comment was assigned one label. There were no “other” comments.

Text representations

Bag-of-words, # of words identified as positives (negatives) by SentiWordNet, General Inquirer, # of punctuation marks, emoticons, elongated words.

Lexicons: DepecheMood, SentiWordNet, General Inquirer

SVM, NB, 10-fold cross-validation.
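Some of the listed features (punctuation marks, emoticons, elongated words) can be counted with regular expressions. The patterns below are simplified assumptions for illustration; the slides do not give the exact definitions used in the study. The first example comment is taken from the forum excerpt in these slides, the second is hypothetical.

```python
import re

# Simplified, assumed patterns; real emoticon inventories are much larger.
EMOTICON = re.compile(r"[:;=][-']?[)(DPp]")
ELONGATED = re.compile(r"\b[a-zA-Z]*([a-zA-Z])\1{2,}[a-zA-Z]*\b")  # e.g. "sooo"
PUNCTUATION = re.compile(r"[!?.,;:]")

def surface_features(comment):
    """Count emoticons, elongated words and punctuation marks in one comment."""
    # Remove emoticons first so that ":)" is not also counted as punctuation.
    without_emoticons = EMOTICON.sub(" ", comment)
    return {
        "punctuation": len(PUNCTUATION.findall(without_emoticons)),
        "emoticons": len(EMOTICON.findall(comment)),
        "elongated": sum(1 for _ in ELONGATED.finditer(comment)),
    }

print(surface_features("We have our appt. Wednesday!! EEE!!!"))
print(surface_features("Thanks sooo much :)"))
```

Such counts would be appended to the bag-of-words vector of each comment before training a classifier.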


5-class flat classification

• On Bag-of-words+, accuracy 66.3%

• On sentiment lexicons, accuracy 63.9-64.2%

Two-level semi-automated classification

• Irrelevant comments removed manually

• Positive/negative/mix from the relevant comments

• Accuracy 60.8%

Hierarchical classification

• Relevant/irrelevant classifier: accuracy 77.3%

• Positive/negative/mix from the result of the above classifier: accuracy 58.7%

Irrelevant comments help!?
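The hierarchical setup chains two classifiers: the first separates relevant from irrelevant comments, the second assigns a polarity to whatever the first lets through, so first-stage errors carry over to the second stage. A minimal sketch of that control flow follows; the keyword rules are hypothetical stand-ins for the trained classifiers, which the slides do not describe in detail.

```python
def hierarchical_classify(comment, is_relevant, polarity):
    """Stage 1 filters irrelevant comments; stage 2 labels the remaining ones."""
    if not is_relevant(comment):
        return "irrelevant"
    return polarity(comment)

# Hypothetical keyword rules standing in for trained classifiers.
def toy_is_relevant(comment):
    return "dna" in comment.lower().split()

def toy_polarity(comment):
    words = set(comment.lower().split())
    positive = bool(words & {"safe", "useful"})
    negative = bool(words & {"risk", "abuse"})
    if positive and negative:
        return "mix"
    if positive:
        return "positive"
    if negative:
        return "negative"
    return "mix"  # toy fallback when no cue word is present

for text in ["Nice weather today",
             "DNA storage is safe",
             "DNA storage is safe but a risk remains"]:
    print(text, "->", hierarchical_classify(text, toy_is_relevant, toy_polarity))
```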


6 sub-forums of IVF.ca

95% of participants are women

Empirical results are obtained on the sub-forum Age 35+: 130 discussions, 1438 messages.

Each separate discussion contained a coherent discourse. Unexpected shifts in the discourse flow can be introduced by a new participant joining the discussion.

Five emotional and non-emotional categories: encouragement, gratitude, confusion, facts, and endorsement; identified bottom-up, from specific to general.


Alice: Jane - whats going on??

Jane: We have our appt. Wednesday!! EEE!!!

Beth: Good luck on your transfer! Grow embies grow!!!!

Jane: The transfer went well - my RE did it himself which was comforting. 2 embies (grade 1 but slow in development) so I am not holding my breath for a positive. This really was my worst cycle yet; it was the Antagonist protocol which is supposed to be great when you are over 40 but not so much for me!!


Manual Annotation

+ Two raters annotated each post with the dominant sentiment.
+ Only the author’s own subjective comments were marked as such;
- if the author conveyed the sentiments of others, we did not mark them.
+ We obtained Fleiss' kappa = 0.737, which indicated strong agreement between the annotators.
- The kappa values demonstrated an adequate selection of sentiment classes and appropriate annotation guidelines.

Classification category             # posts   Percent
Facts                               494       34.4%
Encouragement                       333       23.2%
Endorsement                         166       11.5%
Confusion                           146       10.2%
Gratitude                           131       9.1%
Ambiguous (i.e., raters disagree)   168       11.7%
Total                               1438      100%

Discussions usually start with a participant expressing her doubts and concerns, continue with a description of a treatment, and conclude with the announcement of the results.

All these cornerstone messages received corresponding replies.

Within discussions, messages were related: every posted message replied to one or several previous messages.


4-class classification: all 1269 unambiguous posts are classified into encouragement, gratitude, confusion, and neutral (i.e., facts and endorsement)

3-class classification: positive (encouragement, gratitude), negative (confusion), neutral (facts and endorsement)
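Both task definitions are relabelings of the five annotated categories. A minimal sketch of the two mappings, with class sizes derived from the post counts in the annotation table; the function and variable names are illustrative.

```python
from collections import Counter

FOUR_CLASS = {
    "encouragement": "encouragement",
    "gratitude": "gratitude",
    "confusion": "confusion",
    "facts": "neutral",
    "endorsement": "neutral",
}
THREE_CLASS = {
    "encouragement": "positive",
    "gratitude": "positive",
    "confusion": "negative",
    "facts": "neutral",
    "endorsement": "neutral",
}

def class_sizes(category_counts, mapping):
    """Sum per-category post counts into the classes of the given mapping."""
    sizes = Counter()
    for category, n_posts in category_counts.items():
        sizes[mapping[category]] += n_posts
    return dict(sizes)

# Post counts per annotated category, copied from the annotation table.
# They sum to 1270, one more than the 1269 posts quoted for the experiments;
# the slides do not explain the difference.
counts = {"facts": 494, "encouragement": 333, "endorsement": 166,
          "confusion": 146, "gratitude": 131}
print(class_sizes(counts, FOUR_CLASS))
print(class_sizes(counts, THREE_CLASS))
```

In both settings the neutral class is the largest, with 660 posts.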


Metrics                  4-class classification   3-class classification
microaverage F-score     0.633                    0.672
macroaverage Precision   0.593                    0.625
macroaverage Recall      0.686                    0.679
macroaverage F-score     0.636                    0.651
Baseline F-score         0.281                    0.356

Precision (P) – how many of the comments identified as C indeed belong to class C; low precision means many false hits.

Recall (Re) – how many of all comments from class C are identified as C; low recall means many misses.

F-score – the harmonic mean of P and Re.
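These definitions can be written out directly. The sketch below computes P, Re and F from per-class counts of true positives, false positives and false negatives, and shows micro- and macro-averaging. The per-class counts are hypothetical. Macro F is taken here as the harmonic mean of macro P and macro Re, which is an assumption, although the reported macroaverage F-scores (0.636 and 0.651) do match that formula to three decimals.

```python
def harmonic_mean(p, r):
    """F-score: harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if p + r else 0.0

def precision_recall(tp, fp, fn):
    """Precision and recall from true positives, false positives, false negatives."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def macro_average(per_class):
    """Average P and Re over classes (every class weighs the same), then take F."""
    scores = [precision_recall(*counts) for counts in per_class.values()]
    p = sum(s[0] for s in scores) / len(scores)
    r = sum(s[1] for s in scores) / len(scores)
    return p, r, harmonic_mean(p, r)

def micro_average(per_class):
    """Pool the counts of all classes (every comment weighs the same), then score."""
    tp = sum(c[0] for c in per_class.values())
    fp = sum(c[1] for c in per_class.values())
    fn = sum(c[2] for c in per_class.values())
    p, r = precision_recall(tp, fp, fn)
    return p, r, harmonic_mean(p, r)

# Hypothetical per-class counts: class -> (tp, fp, fn).
per_class = {"A": (8, 2, 4), "B": (5, 5, 1)}
print("macro:", macro_average(per_class))
print("micro:", micro_average(per_class))

# Consistency check against the reported macroaverage P and Re.
print(round(harmonic_mean(0.593, 0.686), 3), round(harmonic_mean(0.625, 0.679), 3))
```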

The most accurate classification occurred for gratitude. It was correctly classified in 83.6% of its occurrences.

It was most commonly misclassified as encouragement (9.7%).

The second most accurate result was achieved for encouragement. It was correctly classified in 76.7% of cases.

It was misclassified as neutral, i.e. facts + endorsement, in 9.8% of cases.

The least correctly classified class was neutral (50.8%). One possible explanation is the presence of sentiment-bearing words in the description of facts in a post that is objective in general and was marked as factual by the annotators.


Pairs                                         Occurrence   Percent
facts, facts                                  170          19.5%
encouragement, encouragement                  119          13.7%
facts, encouragement                          55           6.3%
endorsement, facts                            53           6.1%
encouragement, facts                          44           5.1%

Triads                                        Occurrence   Percent
factual, factual, factual                     94           12.8%
encouragement, encouragement, encouragement   63           8.6%
encouragement, gratitude, encouragement       18           2.4%
factual, endorsement, factual                 18           2.4%
confusion, factual, factual                   17           2.3%

The most reinforcing transition: facts -> facts – 0.47

The least reinforcing transition: gratitude -> gratitude – 0.14

The most frequent changes: confusion -> facts and gratitude -> encouragement – 0.30 each

The least frequent change: facts -> confusion – 0.02

The most frequent 1st comment: confusion – 0.57

The most frequent last comment: facts – 0.39

The most ambiguous comments: 1st – 0.26

The least ambiguous comment: endorsement – ambiguous – 0.06
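The slides do not state how these transition values were computed. One plausible reading, assumed here, is that each value is the relative frequency with which one label directly follows another within a discussion. A minimal sketch on hypothetical label sequences:

```python
from collections import Counter, defaultdict

def transition_probabilities(discussions):
    """Estimate P(next label | current label) from per-discussion label sequences."""
    pair_counts = defaultdict(Counter)
    for labels in discussions:
        # Count every pair of consecutive labels inside one discussion.
        for current, following in zip(labels, labels[1:]):
            pair_counts[current][following] += 1
    return {
        current: {nxt: n / sum(followers.values()) for nxt, n in followers.items()}
        for current, followers in pair_counts.items()
    }

# Hypothetical label sequences, one list per discussion.
discussions = [
    ["confusion", "facts", "facts", "encouragement"],
    ["confusion", "encouragement", "gratitude", "encouragement"],
    ["facts", "facts", "confusion", "facts"],
]
probs = transition_probabilities(discussions)
print(probs["confusion"])
print(probs["facts"])
```

A "reinforcing" transition corresponds to a diagonal entry such as `probs["facts"]["facts"]`; a "change" is any off-diagonal entry.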


15 most active authors posted 15–50 comments each. This comes to 387 texts, or 29% of the data.

71% of comments convey facts, endorsement and encouragement.

The remaining non-ambiguous comments are evenly split between confusion and gratitude.

11 authors had posts marked as confusion.

Only 8 authors had posts marked as gratitude.

Comments of prolific authors were more often ambiguous than those of other authors: 16% vs 12.5%.


Distribution of comment categories (four pie charts):

Category        Last comments   Prolific authors   First comments   All comments
ambiguous       16%             16%                26%              13%
confusion       1%              5%                 57%              9%
encouragement   25%             25%                0%               24%
endorsement     8%              13%                1%               12%
facts           39%             34%                16%              33%
gratitude       11%             7%                 0%               9%

Too early for conclusions

Proud to say: our group was the 1st TDM group to analyse PHI in user-generated Web content

In the future, Social Mining may play a bigger role

Eventually, PHI resources will be developed on a scale of current medical resources

Privacy protection will start at the source

Authorship attribution F-score = 0.97


Personal Health Information retrieval

- Twitter (Sokolova et al, RANLP 2013)
- MySpace (Ghazinour et al, AI 2013)

Opinion Mining on Health Care

- medical forums (Ali et al, IJCNLP 2013; Poursepanj, M.Sc. thesis, in progress)
- Commentosphere (Sobhani, Ph.D. thesis, in progress)

Sentiment Analysis of Personal Health Information

- Twitter (Bobicev et al, AI 2012)
- medical forums (Ali et al, RANLP 2013; Bobicev & Sokolova, RANLP 2013)

Sentiment propagation of Personal Health Information

- IVF forums (Bobicev et al, SocialNLP 2014)


Thank you!

Questions?
