
INSIGHTS FROM PATIENT AUTHORED TEXT:

FROM CLOSE READING TO AUTOMATED EXTRACTION

A DISSERTATION

SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE

AND THE COMMITTEE ON GRADUATE STUDIES

OF STANFORD UNIVERSITY

IN PARTIAL FULFILLMENT OF THE REQUIREMENTS

FOR THE DEGREE OF

DOCTOR OF PHILOSOPHY

DIANA LYNN MACLEAN

MARCH 2015

http://creativecommons.org/licenses/by-nc/3.0/us/

This dissertation is online at: http://purl.stanford.edu/nh030tg4542

© 2015 by Diana Lynn MacLean. All Rights Reserved.

Re-distributed by Stanford University under license with the author.

This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License.





I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.

Jeffrey Heer, Primary Adviser


Michael Bernstein


Christopher Manning


Stuart Card

Approved for the Stanford University Committee on Graduate Studies.

Patricia J. Gumport, Vice Provost for Graduate Education

This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.


Abstract

Millions of people collaborate online with others who share their health concerns. In the process, these users perform complex health-related tasks, such as differential diagnosis and treatment comparison. The result is a massive, growing and readily accessible corpus of patient authored text (PAT) that documents patients’ behavior outside of the clinical environment. As a result, PAT can provide insights into otherwise obscure topics, such as why patients follow only certain parts of a treatment protocol, or how people self-treat stigmatized conditions such as prescription drug addiction.

Despite the potential value of PAT, attempts to extract medically-relevant insights from it have been limited. PAT is notoriously noisy and challenging to work with, and there is a dearth of methods and tools for processing and analyzing it. Moreover, the specific research questions that PAT can support are not obvious: determining what data PAT encodes, and how, is a challenge in and of itself.

In this thesis, I develop methods for automatically extracting medically-relevant data from PAT. I focus specifically on the topic of addiction: a stigmatized and prevalent medical condition. Building on close readings of source text to inform schema induction, data annotation, and feature engineering, I train classifiers that accurately identify (1) medically-relevant terms in PAT; (2) users’ motivations for participating in an addiction-related online health community; (3) users’ drugs of choice; and (4) users’ transitions through relapse and recovery. Using these classifiers to scale analyses to large PAT corpora, I derive novel insights into the process of addiction, as well as the role that online health communities play in giving users informational and emotional support and, ultimately, in enabling recovery.

In concert, these contributions both underscore PAT’s latent value for illuminating poorly understood or clandestine medical topics, and offer viable methods that dramatically improve our ability to realize this value.


For Angus and June


Acknowledgements

My first and foremost thanks go to my advisor, Jeffrey Heer. Jeff has been a wonderful source of support, knowledge and inspiration during my time at Stanford, and I am deeply indebted to him for not only supporting my curiosity as my research ventured into uncharted territory, but for doing so with enthusiasm and confidence. Most importantly, however, Jeff has been an exemplary role model. I am lucky, grateful, and unquestionably better for having had the opportunity to learn from him, and am proud to be taking that with me as I start my next great adventure.

There are several people without whom this dissertation would not have been possible: Anna Lembke, who brought with her invaluable medical perspective, and whose enthusiasm, thoughtful insight and patience were instrumental in making this cross-disciplinary work a reality; Stuart Card, whose ingeniousness I aspire to, and whose advice I have had the fortune to benefit from on several occasions; Sonal Gupta, a close friend and collaborator from whom I have learned a great many things, and hope to learn many more; and Michael Bernstein and Christopher Manning, who have given generously of their time and advice, helping to steer this work from its inception through its completion.

I am fortunate to have had many wonderful co-conspirators while at Stanford. Sudheendra Hangal, whose patient support and advice were instrumental in my early graduate school years, has been a fantastic collaborator and a dear friend. Monica Lam, with whom I worked closely during my first year, remains an uplifting source of inspiration. The UW IDL group, the Stanford HCI group, and the fantastic people in the 3B wing have been a fun, dynamic and reliable source of new ideas, feedback and camaraderie, and will be greatly missed. Finally, Jillian Lentz and Monica Niemiec deserve special thanks for not only providing efficient administrative support, but also for answering even the most frantic of questions with a smile.

Finally, there are some people without whom I would not be where I am today. The inimitable Margo Seltzer who, suffice it to say, started this whole business in the first place; David Holland, whose patient and thorough technical tutelage stands me in good stead to this day; Will Phan, who helped me to see the real joy in coding; my mother, Heather, who is the embodiment of never giving up; and, of course, my husband, Isa, who inspires and challenges me to be a little better every day. It makes all the difference.


Table of Contents

1 Introduction 1

1.1 Overview & Focus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1

1.2 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4

1.3 Outline of Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6

2 The Internet and Health 9

2.1 Online Health Information Seeking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9

2.1.1 Historical Overview & Current Landscape . . . . . . . . . . . . . . . . . . . . . . 9

2.1.2 What Health Information Do Users Seek Online? . . . . . . . . . . . . . . . . . . 12

2.1.3 Who Seeks Health Information Online? . . . . . . . . . . . . . . . . . . . . . . . 12

Gender . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

Age . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

Health . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

Race . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14

Socio-Economic Status & Education . . . . . . . . . . . . . . . . . . . . . . . . . 15

Role (Patient vs. Caregiver) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

2.1.4 Where Do People Find Health Information Online? . . . . . . . . . . . . . . . . . 15

2.2 Online Health Community Participation . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.2.1 Modes of Participation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.2.2 Who Participates in OHCs? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16

2.2.3 Reasons for Participation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

Medium-Based Affordances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

Informational Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

Emotional Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2.2.4 Efficacy of Online Health Forums . . . . . . . . . . . . . . . . . . . . . . . . . . . 18

2.3 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19


3 Prior Work on Patient Authored Text 21

3.1 Patient Authored Text (PAT): Introduction & Overview . . . . . . . . . . . . . . . . . . . . 21

3.1.1 Value of PAT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

3.1.2 Challenges of Working with PAT . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

Noisiness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22

Lack of Analysis Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

Applicability to Research Questions . . . . . . . . . . . . . . . . . . . . . . . . . 23

3.2 Syndromic Surveillance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

3.2.1 Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24

3.2.2 Data Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

3.2.3 Filtering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

3.2.4 Modeling and Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

3.2.5 Real-World Evaluation Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

3.3 Pharmacovigilance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

3.3.1 Data Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3.3.2 Identifying Drugs in PAT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27

3.3.3 Identifying Adverse Events in PAT . . . . . . . . . . . . . . . . . . . . . . . . . . 28

3.3.4 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

3.4 Named Entity Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28

3.4.1 Ontology-Based Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

3.4.2 Statistical Classifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29

3.5 Thematic Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30

3.5.1 Condition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

3.5.2 Data Source . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

3.5.3 Analysis Question . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31

3.5.4 Scaling Thematic Analyses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

3.6 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33

4 Data 35

4.1 MedHelp Corpus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

4.1.1 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

4.1.2 Forum77 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37


4.2 CureTogether Corpus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

5 Identifying Medically Relevant Terms in PAT 40

5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40

5.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

5.2.1 Medical Term Identification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

5.2.2 Consumer Health Vocabularies . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43

5.3 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

5.3.1 Preparation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44

5.3.2 Samples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

5.4 Labeling Medically Relevant Terms with the Crowd . . . . . . . . . . . . . . . . . . . . . 45

5.4.1 Task Design and Pilot Study . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

5.4.2 Experiment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48

Determining a Gold Standard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

Comparing Turkers Against a Gold Standard . . . . . . . . . . . . . . . . . . . . 49

5.4.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

5.4.4 Limitations of the Crowd . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

5.5 Training a Classifier on Crowd-Labeled Data . . . . . . . . . . . . . . . . . . . . . . . . 52

5.5.1 Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

5.5.2 Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

5.5.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

Failure Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53

5.6 Example Applications of ADEPT to PAT . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

5.6.1 Summarizing Important Medical Content in MedHelp’s Arthritis Forum . . . . . . . 57

5.6.2 Navigating MedHelp’s Substance Abuse Forum (Forum77) . . . . . . . . . . . . . 57

5.7 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

6 What do People Seek on Forum77? 64

6.1 Why Study Addiction? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

6.1.1 Addiction is Highly Prevalent . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

6.1.2 Addiction is Highly Stigmatized . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

6.1.3 People are Turning Online for Help with Addiction . . . . . . . . . . . . . . . . . . 66


6.2 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

6.3 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68

6.3.1 Thematic Analysis Development Dataset . . . . . . . . . . . . . . . . . . . . . . 68

6.3.2 Labeled Training & Testing Dataset . . . . . . . . . . . . . . . . . . . . . . . . . . 68

6.4 Who Posts? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

6.5 Users’ Objectives in Initiating Discussions . . . . . . . . . . . . . . . . . . . . . . . . . . 69

6.6 Classifying Informational vs. Emotional Support . . . . . . . . . . . . . . . . . . . . . . . 70

6.6.1 Training Dataset Annotation and Agreement . . . . . . . . . . . . . . . . . . . . . 70

6.6.2 Classifier Training . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71

6.6.3 Classifier Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

6.7 Classifying Updates vs. Non-updates . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

6.7.1 Classifier Training . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

6.7.2 Classifier Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73

6.8 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74

6.8.1 Thomas’ Recipe: An Informal Collaboration . . . . . . . . . . . . . . . . . . . . . 76

6.9 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

6.9.1 Limitations and Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

6.10 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

7 Identifying Drugs of Choice 83

7.1 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

7.2 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

7.3 Automatically Identifying Drugs of Choice . . . . . . . . . . . . . . . . . . . . . . . . . . 85

7.3.1 Definition of Drug of Choice . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85

7.3.2 Data Annotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

7.3.3 Classifier Training & Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

7.3.4 Drug Term Resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87

7.4 Comparing Real-World DOC Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . 88

7.4.1 Forum77 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

7.4.2 Narcotics Anonymous . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

7.4.3 TEDS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90


7.4.4 DAWN . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

7.5 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

7.6 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93

7.6.1 Limitations & Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94

7.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

8 Quantifying Recovery and Relapse 96

8.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96

8.2 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97

8.2.1 The Prescription Drug Abuse Cycle . . . . . . . . . . . . . . . . . . . . . . . . . 97

Withdrawal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

Self-Detoxification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

Relapse & Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

8.2.2 In-Person Mutual Help Groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99

8.2.3 Inferring Health State from Social Media . . . . . . . . . . . . . . . . . . . . . . . 99

8.3 Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100

8.4 Exploring & Modeling Phases of Addiction . . . . . . . . . . . . . . . . . . . . . . . . . . 101

8.4.1 Transtheoretical Model for Behavior Change . . . . . . . . . . . . . . . . . . . . . 101

8.4.2 Rubric Development . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

8.4.3 A Taxonomy of the Phases of Addiction . . . . . . . . . . . . . . . . . . . . . . . 102

8.4.4 Labeling People, not Posts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

8.5 Characterizing the Phases of Addiction . . . . . . . . . . . . . . . . . . . . . . . . . . . 102

8.5.1 Sample & Labeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103

8.5.2 Activity Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104

8.5.3 Linguistic & Content Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

LIWC Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

Days Mentioned and Question Features . . . . . . . . . . . . . . . . . . . . . . . 105

Phase-Specific Term Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

8.5.4 Results: Activity and Linguistic Features . . . . . . . . . . . . . . . . . . . . . . . 106

8.6 Automatically Classifying Addiction Phase . . . . . . . . . . . . . . . . . . . . . . . . . . 108

8.6.1 Model & Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

8.6.2 Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109


8.6.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109

8.7 Automatically Classifying Relapse and Recovery . . . . . . . . . . . . . . . . . . . . . . 111

8.7.1 Identifying Relapse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

8.7.2 Identifying Recovery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

8.7.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

8.8 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

8.8.1 Use and Efficacy of Forum77 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116

8.8.2 Implications for Forum Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

8.8.3 Implications for Addiction Treatment . . . . . . . . . . . . . . . . . . . . . . . . . 118

8.8.4 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

8.9 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119

9 Conclusion 121

9.1 Contribution Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121

9.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

9.2.1 Supporting the Methodological Process . . . . . . . . . . . . . . . . . . . . . . . 123

Interface Support for Thematic Analysis . . . . . . . . . . . . . . . . . . . . . . . 124

Improved Tools for Annotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124

Mapping the Limits of the Crowd in PAT Annotation Tasks . . . . . . . . . . . . . 125

9.2.2 PAT Interface Design & Support . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

Expose Aggregate Data to Users . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

Support Data Entry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126

Automatically Construct User Timelines . . . . . . . . . . . . . . . . . . . . . . . 126

9.2.3 Making the Leap to Medical Discoveries . . . . . . . . . . . . . . . . . . . . . . . 126

9.3 Concluding Remarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

A ADEPT Supplementary Material 128

B F77 Purpose Supplementary Material 129

C F77 Drug of Choice Supplementary Material 130

D F77 Phase Supplementary Material 136


List of Tables

4.1 Top 40 MedHelp forums ranked by total post count. A ◦ in the Stigmatized column denotes our conservative estimate of whether the condition represented by the forum carries a stigma or is otherwise embarrassing. . . . . . 36

5.1 Majority vote at the token level over RN responses. Terms identified by RNs as medically relevant are shown in bold. Stopwords (e.g., “and”, “of”) are excluded from the vote. . . . . . 49

5.2 Turker performance against the RN gold standard. Voting threshold indicates the minimum number of Turkers who have to annotate a term as medically relevant for it to be included in the result. Maximum column values are indicated in bold. A corroborative policy of 2+ votes yields high scores across the board, and maximizes F1-score. . . . . . 50

5.3 Annotator performance against the crowd-labeled data set and the gold standards. Maximum column values are indicated in bold. . . . . . 54

5.4 Examples of ADEPT’s misclassifications in the test corpora. . . . . . 56

6.1 Summary statistics of a representative sample of online health communities focused on addiction recovery. We identified sites through Google searches and gathered statistics (if available) from site pages. Data current as of 3/1/2014. . . . . . 67

6.2 Annotator-derived taxonomy for users’ objectives in initiating a post, with % prevalence in the 1,000-post labeled sample on the right. Note that (1) labels are mutually exclusive, and (2) “w/d” stands for “withdrawal”. . . . . . 71

6.3 Descriptions and samples of taxonomy labels. Samples are synthesized in order to preserve user privacy. . . . . . 72

6.4 Classifier performance for labeling initiating posts as seeking informational support or emotional support. Performance scores are averaged over 10 folds. . . . . . 73

6.5 Classifier performance labeling posts as either update or non-update. Performance scores are averaged over 10 folds. . . . . . 74


6.6 Thomas’ Recipe (circa 2001) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81

6.7 Thomas’ Recipe (circa 2006) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82

7.1 DOC classifier performance across term categories. The classifier performs best on correctly spelled, specific drug terms; worst on general drug terms. . . . . . 87

7.2 Examples of DOCs extracted by our CRF classifier. Identified SOA terms are shown in bold in the context of their originating sentence, and the resolved drug name, generic name and category are shown on the right. . . . . . 87

7.3 Summary of similarities and differences between our Forum77, NA, TEDS and DAWN datasets. Forum77 is unique in that participation is always voluntary and that users report only substances that they deem relevant. . . . . . 88

7.4 Alignment of categories across the Forum77, NA, TEDS and DAWN datasets for comparative purposes. Exact category terms from each survey have been preserved in this table for replicability. . . . . . 89

8.1 Addiction Phase Taxonomy derived via a thematic analysis. . . . . . 103

8.2 Sample phase specific terms for the USING, WITHDRAWING and RECOVERING categories. . . . . . 106

8.3 CRF performance scores aggregated over 10 runs of 10-fold cross validation, with randomly shuffled input sets. . . . . . 109

8.4 Performance for identifying relapse events (top) and whether a user’s final state is RECOVERING (bottom). Combined scores across classes are shown in bold. . . . . . 113

8.5 Comparison of activity features for users who are and are not RECOVERING in their last initiating post. Per-user values are aggregated over USING and WITHDRAWING posts. Statistical significance is determined using Kruskal-Wallis tests (*** p < 0.001) after Bonferroni corrections. . . . . . 115

A.1 The following features are specified when training our CRF. Other features retain their default values as described at http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ie/NERFeatureFactory.html . . . . . 128

B.1 Features used to train our purpose classifiers, which distinguish emotional from informational support seeking, as well as update from non-update posts. . . . . . 129


C.1 Drug term resolution map, manually compiled from classifier output. The i column indicates whether the drug category is included in our analysis in Chapter 7. . . . . . 130

C.2 The default feature list for Stanford’s NER classifier is at nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ie/NERFeatureFactory.html. Here, we list all features whose default values were changed to train our DOC classifier. . . . . . 134

C.3 Gazette of common substances used as a feature in the DOC classifier. This gazette was compiled from a range of online resources. . . . . . 135

D.1 LIWC features for the three classes in the labeled dataset over initiating posts. Only statistically significant variables are shown. Statistical significance is determined using Kruskal-Wallis tests (* p < 0.05; ** p < 0.005; *** p < 0.001) after Bonferroni corrections to adjust for family-wise error rate across all 184 variables (includes activity features). Column c denotes (◦) if the feature is used in our CRF classifier. . . . . . 136

D.2 LIWC features for the three classes in the labeled dataset. Only statistically significant variables are shown. Statistical significance is determined using Kruskal-Wallis tests (* p < 0.05; ** p < 0.005; *** p < 0.001) after Bonferroni corrections to adjust for family-wise error rate across all 184 variables (includes activity features). Column c denotes (◦) if the feature is used in our CRF classifier. . . . . . 137

D.3 Activity and content-based features for the three classes in the labeled dataset. Statistical significance is determined using Kruskal-Wallis tests (* p < 0.05; ** p < 0.005; *** p < 0.001) after Bonferroni corrections to adjust for family-wise error rate across all 184 variables (includes 160 LIWC variables). Column c denotes (◦) if the feature is used in our CRF classifier. . . . . . 138


List of Figures

1.1 Our general methodological process. Nodes in grey show avenues for future work supported by our contributions. . . . . . 5

4.1 Illustrative example of MedHelp and Forum77 content and structure. . . . . . 37

4.2 Summary statistics of Forum77 variables: post volume by month (A), user volume by month (B), thread length distribution (C), user tenure distribution (D), user initiating post count distribution (E), and user response post count distribution (F). . . . . . 38

5.1 Final PAT medical term identification task instructions and interface. Turkers were informed that their answers would be checked against other Turkers’ in the HIT description on the MTurk interface. . . . . . 48

5.2 Sample sentences labeled by ADEPT, the dictionary, MetaMap, OBA and TerMINE. . . . . . 54

5.3 Term classification accuracy plotted against logged term frequency in test corpora. Purple (darker) circles represent terms that are always classified correctly; blue (lighter) circles represent terms that are misclassified at least once. A LOWESS fit line to the entire data set (black) shows that most terms are always classified correctly. A LOWESS fit line to the misclassified points (blue/lighter) shows that classification accuracy increases with term frequency. . . . . . 55

5.4 Top 50 terms, ranked by frequency, derived from MedHelp’s Arthritis forum as determined by ADEPT (left) and OBA (right). Terms unique to their respective portion of the list are shown in bold. Terms occurring in both lists are linked with a line. The gradient of these lines shows that all co-occurring terms, bar three, are more highly ranked by ADEPT. . . . . . 58

5.5 A graph showing important terms in Forum77 (nodes), and significant co-occurrence relationships between them (edges). Node size is proportional to degree, while colors indicate clusters. Node labels are omitted for legibility; instead, we examine main clusters in-depth in subsequent figures. . . . . . 59


5.6 The largest cluster in Figure 5.5 suggests that discussions frequently involve detoxification from prescription drugs. . . . . . 60

5.7 The second-largest cluster in Figure 5.5 suggests that discussions frequently pair specific drugs and the withdrawal symptoms that they cause. . . . . . 60

5.8 The third-largest cluster in Figure 5.5 contains medically relevant terms from Thomas’

Recipe: a user-developed schedule for medication-assisted opioid withdrawal. . . . . . . 61

6.1 Thematic analysis process. Orange edges indicate the iterative component of the analysis. 70

6.2 Normalized transition probabilities and average transition times between consecutive up-

date and non-update posts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

7.1 Drug of choice distributions (% of population using) across the Forum77, TEDS, NA and

DAWN data sets. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91

7.2 Prevalence of major opioids in the Forum77 population over time. . . . . . . . . . . . . . 92

8.1 Illustration of how sequence analysis can (1) reduce NA labels by leveraging context from

surrounding posts, and (2) capture relapse events in regressive sequences without requir-

ing the user to explicitly state that she relapsed. . . . . . . . . . . . . . . . . . . . . . . . 104

8.2 Confusion matrix for our CRF classifier aggregated across 10 randomized runs of 10-fold

cross validation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

8.3 (a) Normalized transition frequencies between addiction phases (e.g., USING → RECOV-

ERING edges comprise 1.12% of the total transitions in the CRF-labeled data) and (b)

conditional transition probabilities (e.g., the probability of a user moving from USING to

RECOVERING is 4.57%.) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111

8.4 Distributions of phase lengths. A red bar indicates the median value, while the dark blue

region indicates the middle spread. The light blue region indicates values that fall within

1.5 ∗ the interquartile range of the middle spread. . . . . . . . . . . . . . . . . . . . . . . 112

8.5 Aggregated user transitions from start to end state. Bar widths denote population propor-

tion. For example, 48% of users in our sample relapsed during their tenure on Forum77. 114

9.1 Our general methodological process. Nodes in grey show avenues for future work sup-

ported by our contributions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124


Chapter 1

Introduction

Just keep in mind that whether you recommend online support groups or not, your patients

will use them. There’s no getting around the fact that certain patients in your practice will be-

come as knowledgeable about their conditions as they can. They will also begin to develop

clinical judgment on their own.

– Deborah Grandinetti: Doctors and the Web. Help your patients surf the Net safely [104].

1.1 Overview & Focus

The Internet has revolutionized the way in which people interact with medical knowledge, transforming

its availability, leveling the playing field in terms of who can contribute such knowledge, and facilitating

connections between people with shared health concerns. Accessing and sharing medical knowledge via

traditional resources (e.g., medical practitioners, textbooks, pamphlets) still requires overcoming financial,

scheduling and geographic barriers; online resources are largely free of such frictions.

Indeed, the use of the Internet as a health resource is one of its earliest functions: with the

commercialization of the Internet in 1995, patients readily took advantage of the ability to collaborate

with others who shared their health concerns, and the first online health communities (OHCs), in the

form of listservs, came into existence.

Demand for such groups remains high today. Pew’s 2013 Health Online survey [91] reported that

59% of U.S. adults looked online for health information in the last year, and that of these, 16-18% specif-

ically sought to find others who shared their health concerns. Based on the U.S. Census Bureau’s

population estimate for 2013 [9], this comprises some 50-57 million people. Today, thousands of OHCs


exist, and while their interfaces have become slightly more sophisticated, their underlying functionality of

connecting patients with mutual health interests remains unchanged.

Through participation in online mutual help groups, patients can spend a sizable number of hours

performing complex, health-related tasks. These include differential diagnoses (of either their own or

someone else’s condition), treatment comparison and evaluation, symptom measurement and docu-

mentation, and seeking and providing emotional support. To perform these tasks, patients draw on a

variety of resources: their own experiential knowledge, observations of other community members’ ex-

periences, information sourced from healthcare providers, and the fruits of self-directed research efforts.

The culmination of this effort is a massive, and growing, corpus of data contributed by patients who have

gained no small degree of clinical expertise in their own condition. Although in some cases these data

are structured (e.g., PatientsLikeMe¹ and CureTogether² collect symptom severity measurements on

numerical scales), for the most part OHCs have barely deviated from the original listserv format, meaning

that a large portion of these data exist as free-form text.

We term any medical text authored by patients patient authored text (PAT). PAT contains inherently

valuable content. Foremost, PAT uniquely documents patients’ behavior outside of the clinical environ-

ment. As such, it can host insight into topics that remain obscure in traditional medical data sets, such

as why patients follow only certain parts of a treatment protocol, or how people self-manage conditions

that carry a stigma in the medical profession, like addiction [176,187]. Answers to such questions could

have high-level policy impacts on healthcare systems, potentially affecting both their efficiency and effi-

cacy. PAT may also contain data of immediate medical value. Prior work has leveraged PAT to identify

disease trends [33, 41] and adverse drug events [257]. Through active collaboration, OHC participants

have uncovered novel insights into disease co-morbidities (such as a correlation between asthma and

infertility [40]) and drug-treatment effects (such as the questionable efficacy of lithium as a treatment for

ALS [260]) which have been replicated in subsequent medical trials [97,260]. Finally, medically-relevant

data derived from PAT could be used both to enhance community design and to support members in

tasks that they already perform, such as polling treatment popularity or sourcing drug reviews.

In spite of the inherent value in PAT and the enormous number of human-intelligence hours invested

in its creation, attempts to leverage PAT have been limited for three main reasons. First, PAT is notoriously

noisy and often incomplete, making it challenging to work with. For example, the fact that authors may

have only partial mastery over medical terminology casts the accuracy of their symptom descriptions

into doubt. Moreover, they may omit important information and their contributions may be infrequent and

irregular.

¹ http://www.patientslikeme.com

² http://www.curetogether.com

Second, and closely related, is the dearth of methods, approaches and toolkits for extracting medically-

meaningful data from PAT. Take, for example, the basic problem of identifying medically-relevant terms

in PAT. While well-established toolkits for extracting medical terms from text authored by medical experts

exist, as we show in Chapter 5 their performance on PAT is sufficiently poor that the resulting output is

of dubious analytic value.

Third, the question of whether PAT contains data of medical relevance is contentious. As we discuss

in detail in Chapter 2, medical professionals especially take issue with such claims. Even taking an open-

minded perspective, however, the medical relevance of PAT in relation to a specific research question is

usually unclear, and must be determined empirically. This relevance tends to depend on how well the

research question aligns with users’ motivations for authoring PAT. For example, because people mention

their influenza-like symptoms on social media platforms, Twitter is a viable data source for monitoring

influenza outbreaks [10, 15, 62, 213]. However, Twitter would be a poor data source for comparing drug

dosage efficacy, because people do not consistently tweet drug dosages, schedules, and self-reported

wellness metrics. Determining what medically relevant signals are present in PAT is a challenge separate

from extracting them.

Our goals in this work are twofold: first, to develop methods for extracting a variety of medically-relevant

data from PAT; second, to uncover medically-meaningful insights through the application of these

methods. To this end, we focus specifically on the topic of addiction, studying Forum77: MedHelp’s³

online health community for Addiction & Substance Abuse. Addiction is both highly prevalent,

affecting 16% of Americans ages 12 or older (about 40 million people), which far exceeds the num-

ber of people afflicted with heart disease (27 million), diabetes (26 million), or cancer (19 million) [4],

and highly stigmatized, even within the medical profession [176, 187]. These facts conspire to make

addiction-related PAT a rich source for novel and impactful insights.

Our work draws from and contributes to several fields in Computer Science. From the Human Com-

puter Interaction perspective, we investigate crowdsourcing as a method for large-scale data annotation,

and leverage methodological work on thematic analyses to develop taxonomies of medically relevant

information contained in our PAT data sources. From the Computer Supported Cooperative Work per-

spective, we investigate the types of support that users give and receive, and analyze on-site behavioral

and content features that correlate with successful and unsuccessful participatory outcomes in Forum77.

On the Natural Language Processing side, we evaluate the application and extension of existing statistical

classification methods to a variety of PAT information extraction tasks. Finally, to guide the validity

of our work from a medical perspective, we collaborated closely with an addiction specialist: a practicing

psychiatrist who specializes in the topic of substance use disorders.

³ http://www.medhelp.org

1.2 Contributions

In concert, this thesis contributes a viable, multi-stage approach for finding and extracting data of medi-

cal relevance from PAT. The specific contributions of this thesis are:

Targeted literature reviews that serve both to illuminate the landscape of related work and to contextualize

our own work. In particular, we review:

Online health seeking behavior: via a cross-disciplinary literature review, we first synthesize an

overview of the demographics, methods and motives of people who seek health information online.

Next, we narrow our focus to the specific topic of OHC participation, exploring users’ reasons for

participation as well as whether and how such participation is beneficial (Chapter 2).

Prior work analyzing patient authored text: we conduct an extensive review of literature utilizing

PAT as a primary data source, including work on pharmacovigilance, syndromic surveillance, entity

extraction and thematic analyses (Chapter 3). To our knowledge, this review is the first compre-

hensive synthesis and summary of data sources, methods, goals and outcomes of prior work that

utilizes PAT as a primary data source.

Methods for extracting medically-relevant data from PAT. Our characteristic methodology, illustrated

in Figure 1.1, moves through human categorization and labeling of data to automatic extraction and

analysis. Accordingly, our methods comprise multiple stages, including inductive content analysis, data

annotation, feature engineering, classifier training and result analysis. Our specific contributions are:

A method for crowdsourcing medically-relevant term annotation in PAT. Having medical ex-

perts annotate data is both costly and slow. We show that for the task of identifying medically-

relevant terms in PAT, a crowd of non-experts yields annotations comparable in quality to those

submitted by medical professionals (Chapter 5).


Data-driven annotation rubrics describing what users seek when they initiate posts on Forum77

(Chapter 6), as well as the phases of addiction that users exhibit on Forum77 (Chapter 8). These

rubrics, educed via thematic analyses of Forum77 content, serve as novel contributions in their

own right as well as reusable guides for data annotation.

A novel analysis of behavioral and linguistic features that correlate with each phase of ad-

diction. The results of this feature space analysis (Chapter 8) give novel insight into how the

psychologically and physiologically distinct phases of addiction correspond with Forum77 users’

behavior and linguistic usage. They are also a valuable resource for feature design and engineer-

ing.

Trained classifiers that accurately extract medically-relevant data from PAT. We train classi-

fiers that accurately extract medically-relevant terms (Chapter 5), addictive drugs of choice (Chap-

ter 7), phase of addiction at the time of writing (Chapter 8) and the type of support that a user is

seeking when she initiates a thread (Chapter 6) from PAT. These classifiers are novel in function.

We make them freely available to support future work and comparisons in this area.

Figure 1.1: Our general methodological process. Nodes in grey show avenues for future work supported by our contributions.

Medically-relevant insights on Addiction. Our classification methods allow us to scale our analyses

to the entire Forum77 population. Some of the resulting insights are, to the best of our knowledge, novel

to both the Computer Science and the Addiction literature. These insights include the discovery that:


Users actively collaborate on developing highly effective medication-assisted withdrawal

treatment protocols. The most prevalent example of this is Thomas’ Recipe, a detailed protocol

for medication-assisted opiate withdrawal that has evolved on Forum77 over the course of several

years (§ 6.8.1).

The Forum77 population consists almost entirely of people struggling with prescription

opioid abuse, making it strongly distinct from traditionally surveyed drug-using populations. Our

results evidence that such populations are not well covered by existing medical research methods.

While relapse is common, chances of a user leaving Forum77 in the state of RECOVERING

are favorable. Although different methodological approaches make comparison with real-world

treatments difficult, our results suggest that Forum77 is an effective self-detoxification resource.

Active participants are more likely to leave Forum77 in a state of RECOVERING. Such users

participate significantly more frequently than those who leave in a state of ¬ RECOVERING, even

when they are USING and WITHDRAWING. This resonates with prior research that shows that

increased participation in the traditional mutual help group Alcoholics Anonymous correlates with

sustained sobriety [190,223].

1.3 Outline of Thesis

Chapters 2-4 serve to contextualize our work and give the reader a framework for reference and eval-

uation. Chapter 2 presents a targeted literature review of online health information seeking. We begin

with a broad overview of online health information seeking (§ 2.1) before focusing on the question of who

participates in OHCs, their motivations for doing so, and the associated benefits and pitfalls of partici-

pation (§ 2.2). Chapter 3 begins with a definition of PAT accompanied by a discussion of its values and

the challenges that it presents (§ 3.1). Next, we synthesize prior work that utilizes PAT as a primary data

source, including syndromic surveillance (§ 3.2), pharmacovigilance (§ 3.3), Named Entity Recognition

(§ 3.4) and Thematic Analyses (§ 3.5). Chapter 4 describes the data sets that we use in our work: the

MedHelp corpus (§ 4.1), which includes the Forum77 data set, and the CureTogether corpus (§ 4.2).

While PAT contains a wealth of information, it is inherently noisy, and requires text mining techniques

to extract data of value. In Chapter 5, we address one of the most basic problems of this sort: identifying

medically-relevant terms in PAT. After discussing related work (§ 5.2) and data preparation (§ 5.3), we


explore the feasibility of replacing experts with non-expert crowds in medical term annotation tasks

(§ 5.4). Next, we show that a conditional random field (CRF) model trained on crowd-labeled data

dramatically outperforms state-of-the-art medical term annotation tools (§ 5.5). Finally, we demonstrate

the effectiveness of our approach through applying our classifier to large PAT corpora (§ 5.6). While

our results demonstrate the efficacy of our approach, we find that the extracted data are too broad for

deriving insights on specific medical conditions. We narrow our focus to the topic of addiction, one of the

most urgent public health issues of the day.

Understanding why people participate in Forum77 is a precursor to more targeted analyses. Chap-

ter 6 poses the question, “what do people seek on Forum77?”. We first motivate studying the topic of

addiction (§ 6.1), before discussing related work (§ 6.2) and data preparation (§ 6.3). Next, we present the

process and result of a thematic analysis of users’ motivations for initiating Forum77 discussions (§ 6.5).

Congruent with prior work, the driving motivations are seeking informational and emotional support.

In terms of informational support, we find that users primarily seek explicit medical advice on prescrip-

tion opioids. In the emotional support category, the update post, in which users log their progress but

request no feedback, is highly prevalent. We train machine learning classifiers to distinguish emotional

from informational support-seeking (§ 6.6), as well as update from non-update posts (§ 6.7). Finally, we

present and discuss the results of applying our classifiers to the entire Forum77 data set (§ 6.8 & § 6.9).

Chapter 7 establishes whether the Forum77 population is similar to traditionally surveyed drug-using

populations in terms of drugs of choice (DOCs). We first discuss related work (§ 7.1) as well as our

data preparation and sampling (§ 7.2). Next, we present our method for automatically extracting users’

DOCs from Forum77 initiating posts (§ 7.3), which comprises data annotation, classifier training and term

resolution. We then detail how we compare our classifier-derived Forum77 DOC distribution with those

from three traditionally-surveyed drug-using populations (§ 7.4). Among other things, our results (§ 7.5)

indicate that Forum77 is used primarily by people struggling with prescription opioid use disorders, rather

than by people using traditionally-abused substances such as alcohol, cocaine and marijuana.

Finally, we discuss the implications and opportunities revealed by these results (§ 7.6).

Chapter 8 focuses on the topic of the cycle of abuse, a well-known concept whose stages and

transitions, to the best of our knowledge, have never been quantified. Drawing on the addiction literature,

we first describe the phases of drug abuse and define key terminology (§ 8.2), and then describe our data

preparation and sampling (§ 8.3). Next, building on the well-known Transtheoretical Model of Behavior

Change [203], we develop a taxonomy describing the phases of addiction as they are expressed on


Forum77 (§ 8.4). We then analyze a variety of behavioral and content-based features in order to identify

features that discriminate between the phases USING, WITHDRAWING and RECOVERING (§ 8.5). Next,

we present our statistical classifier for identifying addiction phase (§ 8.6), and discuss how this enables

us to identify important sequences in the process of addiction, such as relapse and recovery (§ 8.7).

Aggregating these events across the entire Forum77 membership base indicates, amongst other results,

that although relapse is common, reaching a state of RECOVERING prior to leaving the forum is likely

(§ 8.7.3).
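The caption of Figure 8.3 distinguishes two quantities derived from labeled phase sequences: normalized transition frequencies (each edge as a share of all transitions) and conditional transition probabilities (each edge as a share of the transitions leaving its source phase). The following is a minimal sketch of that distinction, not the implementation used in Chapter 8. The function name `transition_stats`, the toy sequences, and the choice to count every consecutive pair of posts (including self-transitions) are assumptions made for illustration.

```python
from collections import Counter


def transition_stats(sequences):
    """Return (joint, conditional) transition tables.

    sequences: one list of phase labels per user, in posting order
    (hypothetical input format).
    joint[(a, b)]       = share of ALL transitions that go a -> b
    conditional[(a, b)] = share of transitions LEAVING a that go to b
    """
    counts = Counter()
    for seq in sequences:
        # Count each consecutive pair of labels as one transition.
        counts.update(zip(seq, seq[1:]))

    total = sum(counts.values())
    out_totals = Counter()
    for (src, _dst), n in counts.items():
        out_totals[src] += n

    joint = {pair: n / total for pair, n in counts.items()}
    conditional = {pair: n / out_totals[pair[0]] for pair, n in counts.items()}
    return joint, conditional


# Toy data only; these are not Forum77 labels.
users = [
    ["USING", "USING", "WITHDRAWING", "RECOVERING"],
    ["USING", "WITHDRAWING", "USING", "RECOVERING"],
]
joint, conditional = transition_stats(users)
print(joint[("USING", "RECOVERING")])        # share of all six transitions
print(conditional[("USING", "RECOVERING")])  # share of the four leaving USING
```

In the toy data, the USING → RECOVERING edge accounts for one of six transitions overall but one of four transitions out of USING, which is why the two panels of such a figure can report different percentages for the same edge.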

In Chapter 9, we reiterate the main contributions of this thesis (§ 9.1), outline challenges for

future work (§ 9.2), and offer our concluding thoughts (§ 9.3).

Chapter 2

The Internet and Health

Millions of people around the world seek health information online, and have been doing so since the

earliest days of the Internet [166]. But who are these people, and what do they seek? Our goal in this

chapter is to provide readers with a contextual backdrop against which to interpret our work. Drawing

on prior work from Computer Science, Medical Informatics and Medicine, we first describe online health

information seeking in general (§ 2.1), beginning with an historical overview before investigating what

kinds of information people seek, who seeks this information, and where. Next, we focus on a specific

subset of online health information seeking: online health community (OHC) participation (§ 2.2). We pay

particular attention to who participates, their motivations for doing so, and potential benefits associated

with participation. Finally, we summarize our findings (§ 2.3) before moving on to a literature review of

prior work utilizing PAT as a primary data source (Chapter 3).

2.1 Online Health Information Seeking

2.1.1 Historical Overview & Current Landscape

When the Internet was commercialized in 1995 [120], widespread consumer adoption brought with it

widespread supply and demand for health information [49]. The Internet made health information more

accessible. For example, between 1997 and 1998 the National Library of Medicine (NLM) made

Medline¹, a repository of journal citations and abstracts from the biomedical literature previously only

available to medical professionals, publicly accessible online. The number of queries to Medline increased

roughly seventeenfold, from 7 million to 120 million, with more than 30% of new queries stemming

from consumers [49]. In response, the NLM launched MedlinePlus², a site hosting information targeted

specifically at patients and their families [49]. The move was a roaring success: in the first quarter of

1999, MedlinePlus had 62,638 unique visitors. Since then, this statistic has only increased: in the third

quarter of 2013, the site had ∼81,000,000 unique visitors [172].

¹ https://www.nlm.nih.gov/bsd/pmresources.html

² http://www.nlm.nih.gov/medlineplus

In addition to making health information more accessible to consumers, the Internet also broadened

the scope of potential contributors: for the first time, health information could be easily sourced from and

exchanged between patients. Widespread, patient-driven mutual help efforts unfolded simultaneously

with the commercial web. As early as 1997, Salem et al. [215] published an analysis of an online mutual

help group for depression; their study covered 2 weeks’ worth of data and comprised 533 participants.

Even earlier, in 1996 Mayer and Till [166] published a short, interview-based study of a breast cancer

listserv allegedly utilized by thousands of patients. Today, a full 8% of Internet users in the U.S. report

either sharing a personal health experience or posting a related question online [91].

The revolution in how health information was created and shared was received primarily positively by

consumers and sociologists, who celebrated its potential for “democratizing” healthcare and rebalancing

the power dynamic in doctor-patient relationships [182]. The reaction from the medical community was

substantially more turbulent. Early research on online health information seeking raised concerns about

the quality of the information available, as well as patients’ ability to evaluate it critically [49, 156, 181,

182, 199, 210]; some even described the phenomenon as an “epidemic of misinformation” [51]. Indeed,

discussion in the medical literature at the time communicates a strong resistance to the idea of patients

pursuing medical knowledge outside the purview of a medical professional [104, 182]. For example, in

2000 the Journal of Medical Economics initiated a series of articles aimed to educate doctors about

online resources so that they, in turn, could guide their patients through the plethora of available online

health information resources. The first article in the series is titled, “Doctors and the Web: Help your

patients surf the Net safely” [104].

Despite these concerns, analyses of online health seeking behavior indicate that patients are, in

fact, highly skeptical of information presented online and take care to evaluate it critically [21, 105, 156,

178, 182, 205, 209, 225]. Patients tend to mistrust information from websites that appear to be primarily

commercial [92, 182], have unclear sources of information [92], or that seem unprofessional or highly

opinionated [182]. Moreover, rather than taking a single source at face value, patients typically evaluate

information quality by aggregating information from multiple sources [82, 92, 182, 205], and even posing

and testing hypotheses from one information source to the next [225]. That said, online health seekers

are not infallible: cyberchondria – the escalation of a user’s perception of the severity of her medical


state as a result of researching it online – has been documented, and results in increased

stress levels and potentially unnecessary use of available medical resources [254,255].

Measuring the quality of online health information is challenging. Prior work finds that information

accuracy tends to be high [25, 80]. For example, in an independent evaluation of 4,600 posts on The

Breast Cancer Mailing List³, Esquivel et al. [80] found only 10 (0.22%) posts containing misleading or

incorrect information. Of these, 7 were identified as such by participants and corrected within 4.5 hours.

However, the majority of studies from the medical domain conclude that online health information is of

subpar quality [21,83]. A common point of failure cited is whether the information is “complete” (covers all

medically-relevant details). However, the value of the completeness metric has been called into question:

first, including all relevant medical information might overload readers [83].

Second, as patients typically synthesize medical information from a variety of sources [82,92,182,205],

they are likely robust to gaps in any single source. Patients themselves report that in general they have no trouble finding the

information that they need online [92,105].

Despite this, strong resistance, and even condescension, from medical professionals is a common

response to the idea of patients pursuing medical knowledge online. “Many of the participants reported

symptoms that they attributed to using a computer keyboard, so it appeared incongruous that they turned

for help to an activity that required more typing”, quip Culver et al. [64] in an evaluation of an online

health community on Carpal Tunnel Syndrome. Yet even amongst surveyed physicians, there is general

agreement that the result of patients pursuing medical information online is rarely harmful, and in fact can

be moderately beneficial [181, 199]. One explanation may be that the public dissemination of medical

knowledge, which was previously exclusive and difficult to access, challenges medical professionals’

dominance as medical experts [116]. Indeed, many physicians who feel that online health information

seeking negatively impacts the doctor-patient relationship also feel that their patients are challenging

their authority [11, 181]. Today, almost 20 years later, most research agrees that the nature of the

patient-doctor-internet relationship remains in flux, with resistance from the medical field barring potential

synergies from reaching fruition [14,121].

³ http://www.bclist.org


2.1.2 What Health Information Do Users Seek Online?

Despite the concerns echoed in the medical literature, patients seem disinclined to stage a cyber coup

d’etat against the medical profession. In fact, with the exception of teens [105,209], patients rarely con-

sider the Internet their primary or most important source of medical information [82, 165, 200]. Rather,

information acquired online tends to supplement or complement that acquired through traditional chan-

nels [82, 149, 209], and is often sought for the express purpose of discussing it with a medical practi-

tioner [49, 92, 181, 205, 225]. Moreover, patients have preferences over which types of information they

would prefer to acquire online: respondents to Pew’s 2010 Peer-to-peer Healthcare Survey [90] said

that they would prefer to communicate with medical professionals for information regarding prescription

drugs and alternative treatments, an accurate diagnosis, and recommendations for other medical profes-

sionals and medical facilities. Peers and professionals were rated as equally helpful for practical advice

for day-to-day coping, and peers were rated most helpful for emotional support and quick remedies for

non-urgent, everyday health issues.

Major categories of online information sought by patients include finding disease-specific information

[49, 91], finding information about particular medical treatments or procedures [91], and attempting

to diagnose or treat a new condition [49, 91]. In fact, Pew’s 2013 Health Online survey found that 35%

of American adults tried to diagnose a condition using information found online; of these, roughly half

followed up with a medical professional [91]. Cartright et al. [42], who analyzed user search logs sur-

rounding self-diagnosis attempts, observed two patterns: evidence-based searching, in which users

searched for a condition that matched a set of symptoms and risk factors, and hypothesis-based search-

ing, in which given a specific condition, users searched for symptoms and risk factors associated with that

condition. Minor categories of health information sought online include finding information about health

insurance, food and drug safety recalls, interpreting medical test results, information on weight loss [91],

and finding reviews on medical professionals or medical facilities [49]. Finally, an estimated 16-18% of

online health seekers go online specifically to find others who share their health concerns [90,91].

2.1.3 Who Seeks Health Information Online?

Early proponents of the Internet as a health information resource touted its potential as a liberating

technology for those with limited access to traditional health resources [182]. In some ways this is

true: online health information seeking seems to be need-driven, with those suffering from chronic or


stigmatized conditions more likely to seek health information online. However, survey-based research

also points to a strong “digital divide” between those who have access to, and are comfortable using,

the Internet and those who do not; this divide is a determinant of who searches for health information online. We discuss discriminating

features in detail below.

Gender

Women are more likely to seek health information in general [57], and this trend is mirrored online [37,57,

90–92, 165] despite the fact that men and women have equal access to the Internet [91]. Pew’s Health

Online [91] survey in 2013 estimated that while 53% of all U.S. male adults look for health information

online, the corresponding statistic for U.S. female adults is 64%. Extrapolating from the 2013 U.S.

Census results [9], approximately 55% of online health seekers are female.
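
The extrapolation above can be reproduced with a short calculation. The sketch below is illustrative only: it assumes that women make up about 51% of U.S. adults, which is a placeholder value rather than the census figure taken from [9], and the function name is hypothetical.

```python
def female_share_of_seekers(rate_female, rate_male, female_population_share):
    """Fraction of online health seekers who are female.

    rate_female / rate_male: fraction of each group that looks for health
    information online (64% and 53% in the Pew 2013 survey cited above).
    female_population_share: fraction of adults who are female (assumed).
    """
    female_seekers = rate_female * female_population_share
    male_seekers = rate_male * (1.0 - female_population_share)
    return female_seekers / (female_seekers + male_seekers)


# Assumption: roughly 51% of U.S. adults are female (placeholder value).
share = female_share_of_seekers(0.64, 0.53, 0.51)
print(f"{share:.1%}")
```

Under this assumption the share falls between 55% and 56%, in line with the approximate figure quoted above. The exact value depends on the population split used.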

In a survey exploring online health information seeking in 2000, Fox & Rainie [205] describe several

differences between men’s and women’s health seeking behavior. First, while men and women are

equally likely to search for information in relation to a parent or older relative, women are twice as likely to

search for information on behalf of a child. This likely reflects the fact that women spend more time

on child care [192]. Finally, women are more likely to search for information related to specific conditions

(either physical or mental), while men are more likely to search for information related to sensitive topics

and for information on treatment timelines and administration [205].

Age

Studies measuring the age distribution across online health information seekers report that it is relatively

uniform among adults until the age of 65, at which point it declines [37, 91, 165]. This is contrary to the

fact that health needs generally increase with age, and stands in contrast to the age distribution over

offline health information seekers, who tend to be older (mean age 52 vs. 40 for online seekers) [57]. Both Cotten et

al. [57] and Bundorf et al. [37] hypothesize that this discrepancy is due to the fact that younger people

have more access to and experience with using the Internet. In fact, health information seeking is one

of the most common and important online activities for young people [105, 209]. A random-dial survey

of 1,209 respondents aged 15-24 initiated in 2002 by healthcare provider Kaiser [209] found that 75% of

respondents had looked for health information online: more than had downloaded music (72%), played

games (72%), shopped online (50%) and participated in chat rooms (67%). In fact, many young people

consider the Internet to be their primary source of health information [105].


Health

People suffering from chronic conditions (e.g., asthma, diabetes) [37,90] and people suffering from

stigmatized conditions (e.g., anxiety, herpes, addiction) [24, 67] are highly likely to seek health informa-

tion online. A casual inspection of our own MedHelp data set (described in Chapter 4) corroborates

this: 8 of the top 20 forums focus on stigmatized or otherwise embarrassing conditions including addic-

tion, Hepatitis C, STDs and HIV (see Table 4.1). Other health characteristics that correlate with online

health information seeking include experiencing a medical crisis within the past year [90], experiencing

a significant change in physical health (e.g., weight loss/gain, smoking cessation) [90], having a rare

condition [90], and having significant barriers to health care (e.g., expense, travel distance) [37].

This suggests that online health seeking behavior is need-driven; however, other evidence also points

to a digital divide: people are more likely to seek health information online if they have health insur-

ance [91] and a regular healthcare provider [165]. Finally, online health seekers self-report as being

healthier than their offline counterparts [57].

Race

Pew’s 2013 Health Online survey [91] reports that 83% of Caucasian adults go online: significantly more

than adult African Americans (74%) and Latinos (73%). Therefore, at a population level, significantly

more Caucasians search for health information online. In a study of online health information seeking

in youth, Rideout et al. [209] observe the same phenomenon, noting that fewer African American and

Hispanic youth in their survey had Internet access at home.

Controlling for adults who use the Internet shows no significant differences in ethnicity between those

who search for health information online and those who do not. In addition, Cotten et al. [57] find

no significant differences in ethnicity between online and offline health seekers. However, Pew’s 2013

Health Online survey [91] highlights some statistically significant, ethnicity-based differences in what

kind of information people seek. For example, Caucasians are more likely than African Americans and

Latinos to look online for a diagnosis and for information pertaining to a specific disease/condition, and

are less likely to search for information on weight loss. African Americans are more likely to conduct

online research on a drug seen in advertising, while Latinos are more likely to search for information on

pregnancy.


Socio-Economic Status & Education

Online health seekers tend to have higher income levels than those who do not seek health information

online [57,74,165]. In addition, higher levels of education correlate with online health seeking [57,74,91,

165]. This again suggests a digital divide, with those who have ready Internet access being more likely

to use it as a health information resource. However, other work points out that literacy and language barriers

can prevent people from engaging fully with online health resources [25,49].

Role (Patient vs. Caregiver)

Queries conducted on behalf of someone else (e.g., a child, a parent or other older relative, or a

friend) comprise roughly 50% of all online health inquiries [90, 91]. Usually such “caregivers” are ei-

ther women [205] or parents [91] (or both).

2.1.4 Where Do People Find Health Information Online?

There are myriad ways of accessing health information online. We highlight those most often discussed

in related work.

Search Engines. The majority of online health information quests start at a search engine such as

Google⁴, Yahoo⁵ or Bing⁶ [82, 91, 114, 178]. Users iteratively refine their queries based on search

results [82,114], and in the majority of cases are successful in finding the information that they are looking

for [92,114].

Medical Information Portals. Sites such as WebMD⁷ and MedlinePlus⁸ serve as medical information

portals and are heavily utilized [172]. However, it is rare for online health seekers to have a favorite or

“go-to” information portal [92], and they are rarely the starting point of a user’s search [82].

Online Health Communities. Online health communities (OHCs) provide an interactive environment

in which users can seek others familiar with their health concerns and acquire tailored information.

These groups provide social support, information and shared experiences, and can be empowering

for patients [49]. Prior work indicates that a significant proportion of online health seekers ultimately

participate in an OHC, with estimates ranging from 8% [91] to 16% [90] to 25% [49]. We discuss OHC

participation in depth in the next section.

⁴ http://www.google.com
⁵ http://www.yahoo.com
⁶ http://www.bing.com
⁷ http://www.webmd.com
⁸ http://www.nlm.nih.gov/medlineplus

2.2 Online Health Community Participation

Having outlined the landscape of online health information seeking in general, we now turn to the spe-

cific topic of online health community participation. Where possible, we expand on any relevant details

introduced in § 2.1. We briefly discuss modes of participation (§ 2.2.1), before addressing the question

of who participates in OHCs (§ 2.2.2), why (§ 2.2.3), and what measurable benefits may result from their

participation (§ 2.2.4).

2.2.1 Modes of Participation

OHCs typically comprise environments in which users communicate via posted messages. There are

three primary forms of participation on an OHC: users start new discussions by contributing initiating

posts, and respond to existing discussions with response posts. The third, much overlooked, mode of

participation is lurking, in which users read community-generated content, but never contribute or make

their presence known in any way. Lurking is prevalent in all kinds of online communities [185, 202],

although possibly less so in health-oriented communities [186]. Prior work suggests that lurkers’ demographics

and motivations for participating align closely with those of active OHC participants [202]. Moreover,

lurkers and active members derive the same benefits from OHC participation [246]. As defining and

measuring lurking behavior is challenging, we do not discuss it further in our own work, but note here

that capturing lurking behavior is an important avenue for future work.

2.2.2 Who Participates in OHCs?

Demographic analyses of OHC participants similar to those offered in § 2.1.3 are scarce. Unlike

general health information seeking, OHC participation centers on specific medical conditions, many of

which correlate with particular demographic factors. For example, people suffering from breast cancer

tend to be female, and people suffering from Alzheimer’s tend to be older.

However, in concert with research on online health seeking behavior [24], Davison et al. [67] find

that social factors that predict face-to-face support group seeking correlate with those that predict


online support group seeking. Specifically, conditions that are embarrassing, stigmatized, or disfiguring,

as well as conditions in which a patient’s attitude towards the condition is important in treatment outcome,

lead people to seek the support of others with similar conditions online.

2.2.3 Reasons for Participation

A user’s overarching goal in joining an OHC is to align herself with other people who share her health

concerns [90, 96, 259]. A great deal of literature examines patients’ perceived benefits to OHC partici-

pation. Results tend to fall into one of three categories: (1) medium-based affordances, in which users

cite practical advantages related to the fact that OHCs are online, digital resources; (2) informational

support; and (3) emotional support. We discuss each of these in detail below.

Medium-Based Affordances

By nature of being online and digital, OHCs have several unique characteristics that users view as

advantageous, such as the convenience of having the community be available around the clock [49,60,

162, 205, 275]. Other factors cited include providing access to a wide range of people, information and

experiences [162, 205]; the fact that such information is personalized or tailored [49]; the ability to store

and edit personal narratives [117,162]; and the perception of privacy and anonymity on OHCs [49,105,

205, 270, 275]. Users’ ability to conceal their true identities has also been credited with increasing their

propensity to discuss issues that they would not discuss face-to-face [21,105,149]. Finally, OHC content

is easily searchable, making it easier for patients to browse and filter for suitable people to approach for

help. In an analysis of PatientsLikeMe, Frost et al. [96] conclude that searching for similar users is the

primary motivation behind patients’ sharing their data with each other.

Informational Support

The two most cited benefits of OHC participation are the informational and emotional (sometimes called

“social”) support given by the community [36, 47, 86, 122, 131, 148, 149, 162, 211, 243, 250, 258]. Infor-

mational support constitutes the exchange of clinical as well as experiential knowledge relevant to a

particular condition. Typical topics of discussion include treatments and treatment options [47, 96, 258],

symptoms [96,258], preventive care [47] and condition outcomes [47,96]. Patients seek this information

for several reasons, including learning what to expect in the future and how to plan for it [47], informing

decision making (especially related to treatment options) [47, 122], informing day-to-day care/everyday


illness management (coping strategies) [60, 90, 122, 131], obtaining advice on managing interactions with others

(from healthcare professionals to colleagues to family) [122], and often simply acquiring a better

understanding of their condition [47, 122, 149, 258]. As such, OHCs are often a source of information

distinct from and complementary to that typically acquired via medical practitioners.

Emotional Support

In addition to being valuable sources of personalized informational support, OHCs provide users with

an accepting and safe space to vent emotions or discuss uncomfortable topics [149, 243]. Participation

provides users with a means of articulating and making sense of their experience, which they find em-

powering [131,173]. Patients also receive positive affect, encouragement and sympathy from their fellow

community members [60, 131]. Continued participation over time may result in patients taking on new,

supportive roles [164] as well as developing increased optimism towards their situation [211]. OHCs also

provide patients managing serious conditions with unique types of emotional support that are difficult to

acquire elsewhere. For example, patients find that sharing with people like them partially relieves the

burden of care placed on family members who, despite their best intentions, cannot empathize with the

patient’s experience [162, 243]. In addition, patients find that while family and friends tend to try to nor-

malize their (the patient’s) emotions – even when they are inappropriate – online communities challenge

users on inappropriate emotional behavior [243].

2.2.4 Efficacy of Online Health Forums

While patients perceive many benefits to participating in OHCs, measuring the effect of participation on

their health outcomes is difficult, and raises the question of what metrics really matter in health manage-

ment. Would we consider OHC participation effective if it altered disease outcome, or shortened time to

recovery? What if it gave patients a sense of control and wellbeing, improving quality of life,

even if it had no effect on prognosis? Although OHC efficacy is difficult to define, participation has been

shown to promote effective disease management strategies [93,131,148,211], and impart psychosocial

benefits, such as improved ability to cope [148,150,179], improved mood/decreased distress [158,211],

and improved stress management [211]. Moreover, some studies report measurable beneficial effects

on symptoms. Houston et al. [130] found that increased participation in a depression-oriented OHC

correlated with the likelihood of users experiencing a resolution of their condition. Lieberman et al. [158] found

that cancer patients who participated in OHCs reported a decrease in physical pain. However, they note


that it is impossible to tell whether this was due to emotional suppression on the part of their subjects: a

conundrum afflicting the measurement of any subjective symptom.

In general, then, research points to OHC participation having beneficial effects for patients. However,

the jury is still out when it comes to conditions in which negative behaviors are enabled through social in-

teraction with similar patients [21]. While some research finds that OHC participation provides increased

protection and motivation for continuing these behaviors, others conclude that the overall experience may

be a more positive way of dealing with the condition than traditional methods [21,89,179]. For example,

Wilson et al. [261] found that patients learned new binging and purging techniques on both pro-eating

disorder sites⁹ and pro-recovery sites. However, while they found no significant difference in final health

outcomes between the two groups, users of pro-eating disorder sites experienced a significantly longer

illness duration [261]. On the other hand, group bonds forged through shared secret identity may render

participants less likely to reveal their condition to others, potentially increasing the likelihood that they

will not seek appropriate help [98].

2.3 Summary

Our goal in this chapter was to provide a general overview of the landscape of online health informa-

tion seeking. Beginning with an historical overview (§ 2.1.1), we noted that the advent of the Internet

both made health information more accessible, and made it possible for anybody to contribute health

information online. From the patient’s perspective, this was a largely positive development, and a great deal

of research supports the notion that little harm, other than cyberchondria, arises from online health in-

formation seeking. The medical community, however, remains somewhat opposed to people pursuing

health information outside of the purview of medical professionals.

In general, online health seekers search for information on specific diseases and diagnoses (§ 2.1.2).

This behavior appears to be partially need-driven, with people suffering from chronic or stigmatized

conditions more likely to seek help online. It is also partially driven by a digital divide, in which those

with ready Internet access and technical skills (i.e., younger, wealthier, and more educated people) are

more likely to seek health information online. One exception to the digital divide pattern is gender: 55%

of online health seekers are female (§ 2.1.3).

⁹ Sites that promote eating disorders.


While medical information portals such as WebMD and MedlinePlus are heavily utilized, most health

information quests begin with search engines. A significant proportion (8-25%) of online health informa-

tion seekers eventually participate in an OHC (§ 2.1.4).

The primary reason for participating in an OHC is to find others who share the same health concerns.

While we know that people with stigmatized, or otherwise embarrassing, medical conditions are more

likely to participate in OHCs, we know little else about participant demographics, which are rarely studied.

Given the demographic specificity of many medical conditions (e.g., breast cancer predominantly affects women),

it is likely that such demographics vary widely across conditions (§ 2.2.2).

Users perceive several benefits to participating in OHCs, which we can categorize into: medium-

based affordances – unique and valuable characteristics that OHCs have by nature of being an online,

digital resource; informational support benefits; and emotional support benefits (§ 2.2.3). While acquir-

ing an objective assessment of an OHC’s efficacy is challenging, participation does appear to impart

psychosocial benefits to users, and may play a role in measurably reducing certain symptoms.

However, the answer to whether OHC participation benefits those afflicted with conditions that are stimulated

by social contact with similar patients, such as eating disorders, is less clear (§ 2.2.4).

Chapter 3

Prior Work on Patient Authored Text

A great deal of prior work utilizes patient authored text (PAT) as a primary data source. Despite this,

to our knowledge no organized review of data sources, methods, goals and outcomes of such work

exists. Our goal in this chapter is to motivate the utility of PAT as a data source and provide a structured

framework over relevant prior work. We first scope our definition of PAT, and discuss its latent value as a

data source as well as the challenges it poses for analysis (§ 3.1). We then review prior work that uses

PAT as a primary data source. This work tends to fall into one of four categories: syndromic surveillance

(§ 3.2), pharmacovigilance (§ 3.3), entity extraction (§ 3.4), and thematic analysis (§ 3.5). Finally, we

summarize our findings (§ 3.6).

3.1 Patient Authored Text (PAT): Introduction & Overview

We define patient authored text (PAT) as any online, medical text authored by someone who is not a

medical professional. A main source of PAT is online health communities (OHCs): online discussion

forums dedicated to specific health topics where people converse in the form of posted messages.

MedHelp¹, PatientsLikeMe² and CureTogether³ are all examples of OHCs. Other sources of PAT include

search logs, social media data (e.g. Twitter⁴ and Facebook⁵), personal blogs (e.g. Lady of Lyme⁶), and

email.

¹ http://www.medhelp.org
² http://www.patientslikeme.com
³ http://www.curetogether.com
⁴ http://www.twitter.com
⁵ http://www.facebook.com
⁶ http://www.ladyoflyme.com


3.1.1 Value of PAT

In the process of creating PAT, users are documenting medical data, making sense of it, prioritizing it, and

synthesizing it in order to solve problems that are relevant to them. This is time intensive work, performed

by agents who may well make up in motivation for what they lack in medical expertise. The resulting text

is rich in medical information, with users recording medical histories, comparing treatments, detailing

symptoms and reasoning about differential diagnoses. At a minimum, this culminates in a unique record

of patient behavior outside of the clinical environment. In the case of stigmatized or otherwise

embarrassing conditions⁷, PAT may well contain medical data that is rarely captured elsewhere. For example,

someone struggling with substance abuse might detail her self-prescribed treatment schedule for with-

drawal. In concert, then, PAT comprises a valuable and, in many cases, unique medical data set that is

abundant and readily available. However, PAT is also challenging to work with.

3.1.2 Challenges of Working with PAT

PAT is notoriously difficult to work with. We attribute this to three main reasons: its inherent noisiness;

the lack of existing tools for exploring and analyzing it; and the fact that it is often difficult to discern

whether PAT supports any given research question. As we will show in § 3.2-§ 3.5, prior work tends

to compensate for these challenges by either fixing some variables in a quantitative analysis, or by

conducting small-scale, qualitative analyses.

Noisiness

On the text level, PAT is riddled with spelling and grammatical errors. Compared with expert-authored

text, differences include lexical and semantic mismatches [167,272], mismatches in consumers’ and ex-

perts’ understanding of medical concepts [99,272] and mismatches in descriptive richness and length [99,

167,272]. Consider, for example, the text snippets below, both discussing the predictive value of a family

history of breast cancer. The first snippet is from a medical study by De Bock et al. [68]:

In our study, at least 2 cases of female breast cancer in first-degree relatives, or having at

least 1 case of breast cancer in a woman younger than 40 years in a first or second-degree

relative were associated with early onset of breast cancer.

⁷ In Chapter 2 we note that people suffering from stigmatized conditions are more likely to seek help online and to participate in OHCs.


The second (unedited) snippet is from the MedHelp breast cancer community:

im 40yrs old and my mother is a breast cancer surivor. i have had a hard knot about an inch

long . the knot is a little movable. the knot has grew a little over the past year and on the

edge closest to my underarm. i am scared and dnt want to worry my mom ..

Moreover, PAT contributors vary widely in their level of medical expertise, command of medical jar-

gon, and the frequency with which they document their experiences online. Most PAT would be consid-

ered unusable from a medical perspective: symptom descriptions, treatments and medical histories are

incomplete, and basic demographic data is absent.

Lack of Analysis Tools

The dearth of tools and methods for mining PAT is likely exacerbated by its noisiness and inconsistencies.

As we discuss in § 3.4, the handful of medical annotation toolkits that do exist are tailored to process

well formatted, expert-authored text (e.g. clinical text, journal publications), and perform poorly on PAT.

As a result, exploring PAT corpora is costly, often requiring researchers to build ad hoc tools for large

scale annotation and extraction. Moreover, as there is no systematic method for exploring the space of

possible approaches to extracting medically useful information from PAT, these ad hoc tools are often

not recyclable.

Applicability to Research Questions

The question of whether or not a PAT corpus supports a given research question is not always obvious,

and depends very much on users’ reasons for authoring the PAT in the first place. Finding a tight

match between a research question and users’ motivations for authoring PAT is crucial for success. For

example, search logs are an appropriate data source for monitoring influenza trends, because users are

motivated to search for their symptoms when they get sick. However, Twitter would be an inappropriate

data source for mining optimal drug dosages, as users tend not to tweet this information en masse.

Determining what data PAT encodes, and how it is encoded, is a costly investment and a separate

challenge from extracting these data.


3.2 Syndromic Surveillance

Syndromic surveillance – also known as early warning, outbreak detection, or biosurveillance – is the

utilization of health-related data for the purpose of detecting, analyzing and monitoring potential disease

outbreaks [128]. Syndromic surveillance systems do not necessarily utilize online data: the first such

systems were developed to give advance notice of bioterrorism attacks – in particular, those related to

anthrax – after 9/11, and utilized data such as pharmacy purchases and emergency room visits [35,127,

128,163,207].

However, building syndromic surveillance systems based on PAT is appealing for a number of rea-

sons. The first is users’ proclivity for seeking health information online. For example, it is fairly common

for users to search online for symptoms that they are experiencing, or for conditions that they believe

they might have [156, 254, 256]. As such, data useful for syndromic surveillance tends to accrue natu-

rally, which is preferable to resource-intensive, manual data collection [128, 262]. In addition, collecting

and analyzing online data is fast, enabling early (or even real-time) detection of outbreaks, which is

not possible using traditional syndromic surveillance systems [41,100,128].

The best known example of a PAT-based syndromic surveillance system is likely Google Flu Trends⁸,

which estimates regional flu activity from aggregated search queries [41]. Google Flu Trends can often

identify flu outbreaks a full 1-2 weeks ahead of the CDC, which bases its reports on laboratory and

clinical data [41]. However, the system is vulnerable to anomalous situations, such as outbreaks of new

influenza strains, or particularly bad influenza seasons [38]. Other challenges to syndromic surveillance

systems based on PAT include their vulnerability to changes in users’ online health seeking behavior [38,

262], which makes it difficult to estimate false positive and false negative rates [262]. Finally, a successful

syndromic surveillance system requires that a sufficient portion of the population of interest is seeking

health information online, which is not always the case. Below, we outline the chief components of

syndromic surveillance projects.

3.2.1 Condition

Typically, syndromic surveillance systems focus on a single medical condition of interest. To date, the

majority of work on syndromic surveillance focuses on influenza [10,15,55,56,62,63,81,100,132,137,

152,198]. Exceptions include investigating general infectious disease outbreaks [33,52,109,262], Lyme

disease [221], and potential foodborne illness outbreaks at restaurants [213]. Syndromic surveillance

techniques have also been used to monitor “non-outbreak” conditions or behaviors. For example, Cooper

et al. [53] use syndromic surveillance techniques to monitor cancer prevalence, while Ayers et al. [18]

use them to track the popularity of electronic nicotine delivery systems (e-cigarettes).

⁸ http://www.google.org/flutrends/us

3.2.2 Data Source

People searching for their own symptoms online is a well documented phenomenon [254,257]. Accord-

ingly, search logs are a natural choice for a syndromic surveillance data source, and are successfully

utilized in several instances of prior work [18, 53, 100, 132, 198, 221]. More recently, Twitter has come

to light as another suitable source [10, 15, 62, 63, 152, 213], suggesting that users are prone to

mentioning when they, or someone around them, fall ill. Rarer data sources include blogs [55, 56], website

access logs [137], and aggregated web data (a combination of search logs, news articles, RSS feeds

etc.) [33, 52]. The latter may be particularly appropriate when trying to survey regions in which the

population of interest has limited education and/or Internet access, such as developing countries.

3.2.3 Filtering

As syndromic surveillance aims to correlate online frequency data with real-world epidemiological trends,

separating signal from noise in the data stream is important. Mentions of a condition do not necessarily

correlate with real-world instances of it [152].

On the simple end of the spectrum is keyword filtering. While common [10, 18, 53, 55, 56, 198, 221],

this approach has several shortcomings. First, relying on a static set of keywords makes the system

susceptible to over-fitting [62], as well as to fluctuations in the use of those keywords that are unrelated

to the disease in question [15, 38, 100]. For example, a news story on flu could galvanize a “burst” of

online activity around the topic of flu, even while infection levels in the population remain unchanged.

Finally, although keywords are occasionally picked in a principled and consistent manner (e.g. Ginsberg

et al. [100] pick keywords based on how their frequency fluctuations correlate with regional influenza ac-

tivity), in general selection is arbitrary and prone to human misjudgment. For example, spelling variations

of keywords may be ignored [56].
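
To make the keyword filtering step concrete, the following is a minimal sketch rather than a reconstruction of any cited system. The keyword list, the example posts and the similarity cutoff are all hypothetical. To avoid ignoring spelling variations, the sketch matches tokens approximately using `difflib.get_close_matches` from the Python standard library instead of requiring exact matches.

```python
import difflib
import re

# Hypothetical keyword list; a real system would choose these empirically.
FLU_KEYWORDS = ["flu", "influenza", "fever", "cough"]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def matches_keywords(text, keywords=FLU_KEYWORDS, cutoff=0.8):
    """Return True if any token equals or closely resembles a keyword.

    difflib.get_close_matches tolerates small spelling variations
    (e.g. "influensa"), which exact keyword matching would miss.
    """
    for token in tokenize(text):
        if difflib.get_close_matches(token, keywords, n=1, cutoff=cutoff):
            return True
    return False


# Invented example posts.
posts = [
    "home sick with influensa again",
    "new phone arrived today",
    "bad cough and a fever since monday",
]
relevant = [p for p in posts if matches_keywords(p)]
```

Approximate matching can also admit unrelated words that happen to resemble a keyword, so the cutoff trades missed variants against spurious matches. It does nothing to address the other shortcomings noted above, such as bursts of keyword use that are unrelated to infection levels.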

Other work indicates that more nuanced filtering yields higher quality results [62, 152]. One such

approach is to train statistical classifiers to automatically identify whether a datum is relevant or not. Both

Support Vector Machines (SVMs) [15,193] and other simple bag-of-words models [52,62,63] have been


successfully leveraged to identify data that correspond to actual influenza infections. Moreover, Lamb

et al. [152] show that using binary classifiers to acquire even more detailed information (specifically,

whether a tweet is about the author or about someone else; whether a tweet represents an awareness

vs. an instance of flu; and whether a tweet is flu-related or not) greatly improves prediction.
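
As an illustration of classifier-based filtering, the sketch below trains a small bag-of-words model to separate messages that report an infection from messages that merely show awareness of flu. It uses multinomial naive Bayes as a stand-in because it fits in a few lines; it is not the SVM or the models used in the cited work. The class name, the labels and the training sentences are all invented for this example.

```python
import math
from collections import Counter


def tokens(text):
    """Split text into lowercase whitespace-delimited tokens."""
    return text.lower().split()


class NaiveBayes:
    """Multinomial naive Bayes over bag-of-words counts (add-one smoothing)."""

    def fit(self, texts, labels):
        self.word_counts = {label: Counter() for label in set(labels)}
        self.doc_counts = Counter(labels)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokens(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(self.doc_counts[label] / total_docs)
            denom = sum(counts.values()) + len(self.vocab)
            for word in tokens(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented training data: "infection" vs. "awareness" messages.
train = [
    ("i have the flu and a high fever", "infection"),
    ("my son caught the flu and is home sick", "infection"),
    ("sick in bed with flu all week", "infection"),
    ("got my flu shot at the pharmacy today", "awareness"),
    ("news report says flu season starts early", "awareness"),
    ("reminder to get your flu shot this season", "awareness"),
]
model = NaiveBayes().fit([t for t, _ in train], [l for _, l in train])
label = model.predict("home sick with a fever")
```

Every training message here contains the word "flu", so a keyword filter would accept all six. The classifier instead relies on the surrounding words to tell the two classes apart. A realistic system would need far more labeled data than this toy example.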

3.2.4 Modeling and Prediction

In the case of syndromic surveillance systems that focus on a specific condition (e.g. influenza), linear

models are commonly used to predict trends from the filtered data [10, 62, 63, 100, 152, 198]. Simpler

approaches do not model the filtered data, deeming frequency counts sufficient for reflecting real-world

trends [15,53,55,56,137,221].
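
The linear modeling step can be sketched as an ordinary least squares fit of a reported illness rate against a filtered online signal. The weekly values below are invented for illustration, and the function name is hypothetical; the cited systems use their own features and, in some cases, transformed variables.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Invented data: weekly share of flu-related messages (%) and the
# corresponding reported influenza-like illness rate (%).
signal = [1.0, 2.0, 3.0, 4.0, 5.0]
reported = [1.1, 1.9, 3.2, 3.9, 5.1]

intercept, slope = fit_line(signal, reported)

# Estimate the illness rate for a new week with a 6% message share.
forecast = intercept + slope * 6.0
```

Once fitted on weeks for which reference data exist, the model can estimate the current rate from the online signal alone, before official figures are released.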

The few syndromic surveillance systems attempting to monitor a range of diseases require the ad-

ditional step of identifying specific diseases and geographic locations [33, 52]. Of note is the approach

used by Paul et al. [193], who use topic modeling over their filtered data to acquire distributions of ail-

ments over time. One key advantage of this approach is its ability to surface new diseases without

manual intervention [193].

3.2.5 Real-World Evaluation Dataset

In order to prove the utility of a syndromic surveillance system, a corresponding real-world metric of the

same phenomenon that the system is trying to measure is required for comparison. In the case of in-

fluenza, the CDC frequently releases timely data on cases of influenza-like illnesses detected through its

traditional surveillance systems⁹. It is likely that the availability of this data set is the driving force behind

the fact that almost all PAT-based syndromic surveillance research focuses on the topic of influenza.
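
One simple way to compare a system's estimates with a real-world reference series is the Pearson correlation coefficient. The sketch below assumes this metric purely for illustration, since the evaluation measures used in the cited work vary. Both series are invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Invented weekly values: system estimates vs. reference surveillance data.
estimated = [2.0, 2.5, 4.0, 6.5, 5.0, 3.0]
reference = [1.8, 2.9, 4.2, 6.0, 5.5, 2.6]

r = pearson(estimated, reference)
```

A coefficient close to 1 indicates that the estimates rise and fall with the reference data. It does not show that the estimates are well calibrated in absolute terms, nor that the relationship will hold in an anomalous season.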

3.3 Pharmacovigilance

Pharmacovigilance is concerned with detecting, monitoring and preventing adverse effects related to

pharmaceutical products. As with syndromic surveillance, traditional pharmacovigilance systems are

offline, typically comprising adverse drug event reports contributed by patients, physicians and

pharmacists, which are collected by the United States Food and Drug Administration¹⁰. Many of the

advantages of online syndromic surveillance systems also apply to pharmacovigilance. However, by

construction pharmacovigilance is a more complex problem: whereas syndromic surveillance systems

typically monitor only a single variable (e.g. how many people have the flu), an adverse event involves at

least two elements: a drug and an adverse effect (e.g. unexpected side effects). Extracting such entities

can be challenging. Unlike syndromic surveillance, prior work on pharmacovigilance addresses a wide

array of topics and conditions. Below, we discuss important components of pharmacovigilance systems.

⁹ http://www.cdc.gov/flu/weekly/fluactivitysurv.htm
¹⁰ http://www.fda.gov

3.3.1 Data Source

In order to leverage the advantages of both scale and relevant content, researchers must find a large

source of PAT where patients typically disclose both which drugs they use as well as adverse events they

experience. Online health communities (OHCs) are rich with discussions disclosing users’ medications,

symptoms and current health states (see Chapter 2). Accordingly, almost all work on PAT-based Phar-

macovigilance utilizes OHC communications as a primary data source [23, 45, 154, 171, 183, 265, 266,

268,269]. To our knowledge, the only exception is also arguably the most successful and impactful

work on Pharmacovigilance: White et al. [257] utilize search query logs to discover a novel

adverse drug-drug interaction, which was later confirmed in medical trials.

3.3.2 Identifying Drugs in PAT

Identifying drugs in PAT is challenging: in addition to the many spelling variations of a drug that might

be present in a PAT data set, users may mention several drugs at once, making it difficult to tell which

one is responsible for the adverse event [119]. Accordingly, only a handful of prior Pharmacovigilance

studies attempt to explicitly identify drugs related to adverse events in a data set. Yang et al. [265, 266]

extract drug entities using a lexicon, and Yates et al. [269] train a conditional random field (CRF) model

for this purpose. A more common approach is to pre-select a small number of drugs of interest, filter

the original data set for mentions of these drugs, and then attempt to extract adverse events from these

filtered data [154,171,183,257,268].
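The pre-selection strategy can be sketched as a simple filter. The example below assumes a small, hypothetical mapping from each drug of interest to surface forms that might appear in PAT (brand names, slang, misspellings); the mapping, the example posts and the function names are ours, chosen for illustration only.

```python
import re

# Hypothetical watch list: drug of interest -> surface forms seen in posts.
DRUG_VARIANTS = {
    "oxycodone": ["oxycodone", "oxycodon", "percocet", "percs"],
    "buprenorphine": ["buprenorphine", "suboxone"],
}

def compile_patterns(variants):
    """Build one case-insensitive, whole-word regex per drug of interest."""
    return {
        drug: re.compile(
            r"\b(?:" + "|".join(re.escape(name) for name in names) + r")\b",
            re.IGNORECASE,
        )
        for drug, names in variants.items()
    }

def filter_posts(posts, patterns):
    """Keep posts that mention at least one drug; report which drugs."""
    kept = []
    for post in posts:
        found = sorted(d for d, pat in patterns.items() if pat.search(post))
        if found:
            kept.append((post, found))
    return kept

posts = [
    "I want to come off 10 percs per day.",
    "Day 5 off Suboxone and in a lot of pain.",
    "Had a great walk today.",
]
filtered = filter_posts(posts, compile_patterns(DRUG_VARIANTS))
```

Adverse event extraction would then run only over `filtered`, which is why this strategy cannot surface drugs outside the pre-selected list.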

Chee et al. [45] take a different approach that is worth noting. Rather than attempting to extract {drug,

adverse event} pairs, they use an ensemble classifier over OHC text to identify drugs that are similar

to “watch list” drugs: drugs that already have adverse effects reported by the FDA11. Unfortunately this

method gives no insight into why a drug might be worthy of inclusion on such a list.

11 http://www.fda.gov/Safety/MedWatch


3.3.3 Identifying Adverse Events in PAT

Unlike the drug involved in an adverse event, the adverse events themselves are rarely fixed: typically

a Pharmacovigilance system will attempt to identify any adverse event related to a particular drug. The

list of extracted events is then ranked by some criterion and given to a human reviewer for analysis. Yang et

al. [265,266] and Lehman et al. [154] identify adverse events in PAT by first compiling lexicons describing

adverse events, and then scoring matches against sliding n-gram windows over PAT sentences.
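A minimal sketch of lexicon matching over sliding n-gram windows follows. The lexicon entries, the Jaccard token-overlap score and the threshold are our own assumptions for illustration; the cited systems use their own lexicons and scoring functions.

```python
import re

# Hypothetical adverse-event lexicon; real systems compile far larger ones.
LEXICON = ["night sweats", "restless legs", "loss of appetite", "nausea"]

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())

def jaccard(a, b):
    """Token-set overlap between a window and a lexicon entry."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b)

def match_adverse_events(sentence, lexicon=LEXICON, max_n=3, threshold=0.6):
    """Score every window of 1..max_n tokens against every lexicon entry
    and keep the best score per entry that reaches the threshold."""
    tokens = tokenize(sentence)
    entries = [(term, tokenize(term)) for term in lexicon]
    best = {}
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            window = tokens[i:i + n]
            for term, term_tokens in entries:
                score = jaccard(window, term_tokens)
                if score >= threshold and score > best.get(term, 0.0):
                    best[term] = score
    return best

matches = match_adverse_events("Last night I had night sweats and restless legs.")
```

Lowering `threshold` admits partial matches, which raises recall at the cost of more spurious hits for a human reviewer to discard.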

Yates et al. [269] train a CRF to identify adverse events in PAT. Nikfarjam et al. [183] learn patterns

from text about known adverse drugs; they then apply these patterns to identify new adverse events.

White et al. [257] are the sole exception to extracting an open set of adverse events: rather, they limit

their extraction to a pre-specified set of symptoms related to hyperglycemia. The fact that theirs is

arguably the most successful Pharmacovigilance system to date suggests that this may be a promising

approach.

3.3.4 Evaluation

In general, evaluating the efficacy of Pharmacovigilance systems is difficult: results typically contain

several known indications; the remaining result elements are either false positives, or true positives that

have yet to be detected via traditional reporting mechanisms. Overall, most work serves as a proof of

concept that some adverse drug events manifest in PAT, but there is little quantification of how many and

how strongly different events are represented. Most importantly, determining how to surface the most

relevant true positives remains an area for future work. The work by White et al. [257], which rigorously

demonstrates the existence of the connection between paroxetine, pravastatin and hyperglycemia in

PAT (predating the FDA’s discovery of this), comes closest to proposing a methodology for doing this.

However, their approach lacks flexibility in that both their drugs and adverse events of interest were

predefined.

3.4 Named Entity Recognition

Named entity recognition (NER) is an information extraction task in which the goal is to develop methods

that automatically identify entities of a specific type from text. For example, extracting drugs, adverse

events or symptoms from medical records are all NER tasks. In general, there are two ways to go about

medical NER in PAT: the first is to use state of the art ontology-based tools, which work “straight out of


the box”, but have poor performance on PAT. The second is to use custom statistical classifiers, which

tend to have high accuracy, but require large volumes of labeled data for training and testing. We discuss

each in detail below.

3.4.1 Ontology-Based Tools

Historically, the go-to tools for medical text annotation are MetaMap12 [17] and, more recently, the Open

Biomedical Annotator (OBA)13 [138]. These tools are ontology-based, meaning that they search through

text for matches against underlying ontologies (curated vocabularies of medical terms and the rela-

tionships between them) [17, 138]. While these tools are capable of fine-grained entity resolution, a

previous study [201] comparing OBA and MetaMap against human annotator performance underscores

two sources of performance error on PAT. The first is ontology incompleteness, which results in low re-

call, and the second is inclusion of contextually irrelevant terms. For example, when restricted to the

RxNORM ontology and semantic-type Antibiotic (T195), OBA will extract both “Today” and “Penicillin”

from the sentence “Today I filled my Penicillin rx”. We observe the same limitations in Chapter 5 and in

later collaborative work with Gupta et al. [112].

Despite recent efforts to develop an ontology suitable for PAT (the open access and collaborative

Consumer Health Vocabulary, OAC CHV [77, 273, 274]), we suspect that tools like MetaMap and OBA will remain

ill-suited to the task of medical term identification in PAT due to structural differences between PAT and

text authored by medical experts that we discuss in § 3.1.2. Finally, in addition to including misspellings

and slang, consumer medical jargon may evolve over time as patients acquire expertise.

3.4.2 Statistical Classifiers

A natural alternative to ontology-based tools is the use of statistical classifiers, which can be trained
to extract biomedical entities of interest with high accuracy. However, such methods require sizable
corpora of labeled data for training and evaluation. This is problematic in the medical domain, as having
medical experts annotate text is both expensive and time consuming. Only a handful of publicly available
annotated medical corpora exist, all of them composed of annotated biomedical journal publication
abstracts (i.e. expert authored text) [145, 146, 204, 271]. This has had the dual effect of generating
a plethora of prior work demonstrating the efficacy of statistical approaches to biomedical
NER [76, 87, 95, 124, 125, 214, 238, 239, 267], but little work that explicitly examines PAT as a potential
data source.

12 http://metamap.nlm.nih.gov
13 http://bioportal.bioontology.org/annotator

Our work on ADEPT (Chapter 5) is an exception to this. By proving that crowdsourcing medical term

annotations yields labels comparable in quality to experts’, we were able to use crowd-labeled PAT to

train a conditional random field (CRF) classifier to identify medically-relevant terms in PAT. However, we

also find that crowdsourcing is not always a ready solution to PAT annotation tasks (§ 5.7). In Chapter 7

we show that a CRF similarly extracts users’ drugs of choice (preferred substances of abuse) from PAT,

using a manually-labeled data set. Later work in collaboration with Gupta et al. [112] shows that

the unsupervised method of lexico-syntactic pattern induction is a promising approach for extracting

specific types of biomedical entities (including symptoms & conditions, as well as drugs & treatments)

from PAT. This approach is also employed by Xu et al. [264], although our method achieves higher

scores. Finally, other work demonstrating entity extraction on PAT includes some of the work discussed

in Pharmacovigilance (§ 3.3), which utilizes CRFs [269] and pattern learning [183] to extract drugs and

adverse events from PAT.
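Sequence classifiers such as CRFs are typically trained on per-token feature vectors paired with per-token labels. The sketch below shows one common way to prepare such input, using B/I/O tags to mark entity spans. The feature names, the `MED` label and the example annotations are our own illustrative assumptions and are not the feature set used in ADEPT or in the cited systems.

```python
def token_features(tokens, i):
    """Features for token i, including its immediate neighbours."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "suffix3": tok.lower()[-3:],
        "has_digit": any(c.isdigit() for c in tok),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }

def bio_labels(tokens, spans):
    """Convert (start, end) token spans, end exclusive, into B/I/O tags."""
    labels = ["O"] * len(tokens)
    for start, end in spans:
        labels[start] = "B-MED"
        for j in range(start + 1, end):
            labels[j] = "I-MED"
    return labels

tokens = "I had night sweats and took B6".split()
spans = [(2, 4), (6, 7)]  # hypothetical annotations: "night sweats", "B6"

X = [token_features(tokens, i) for i in range(len(tokens))]
y = bio_labels(tokens, spans)
```

Each (`X`, `y`) pair forms one training sequence; in a crowdsourced setting the spans would come from aggregated worker annotations rather than from experts.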

3.5 Thematic Analysis

Thematic analyses (sometimes called content analyses) involve the systematic reading of text with the

goal of eliciting a taxonomy (i.e., an organized collection of significant patterns and themes) that de-

scribes the source data. While some literature outlines standard practice for thematic analyses [30,

110, 236], it is infrequently referenced, and methods utilized in applied research tend to be somewhat

ad hoc. Thematic analysis is the most extensively used qualitative analysis technique [110], and in our

experience, the most common type of analysis applied to PAT, easily outnumbering work on syndromic

surveillance, pharmacovigilance, and Named Entity Recognition. This is likely due to the fact that (1)

thematic analyses are easy to apply: any kind of text is a suitable candidate for thematic analysis, which

is not true for quantitative analyses requiring automated extraction, (2) they are interesting: the results

of a thematic analysis over PAT almost always satisfy our latent curiosity about what people actually do

online in relation to their own health, and (3) they are useful: illuminating corpus content via thematic

analysis is a sensible precursor to higher-investment, quantitative research with automated components.

Below, we compare and contrast prior work that conducts thematic analyses on PAT.


3.5.1 Condition

There is a great deal of diversity in the conditions studied via thematic analysis. Stigmatized, or oth-

erwise embarrassing, conditions receive notably more coverage than they do in syndromic surveil-

lance, pharmacovigilance or NER. Examples include smoking cessation [180, 197], infertility [160, 161],

HIV/AIDS [61, 177], Huntington’s disease [59], irritable bowel syndrome [58], and post-partum depres-

sion [69]. Underlying the interest in these topics is likely the fact that PAT comprises a unique data

source, especially for stigmatized conditions. Also commonly studied are conditions that have a

behavioral component through which the user can directly influence health outcomes. These include di-

abetes [107,206], smoking cessation [180,197], weight loss and fitness [134,142,217,240], and general

wellness [108].

3.5.2 Data Source

The majority of thematic analyses focus on online health communities (OHCs) [29, 34, 58–61, 101, 134,

160, 161, 177, 206, 220, 233], a natural choice given the volume and richness of OHC text. However,

contemporary PAT thematic analyses also turn to Twitter [69, 135, 142, 170, 180, 218, 234, 235, 240] and

Facebook [22, 71, 107, 197]. Other data sources include search logs [224, 255, 256], email [13], and

personal blogs [217].

3.5.3 Analysis Question

Thematic analyses are, by nature, exploratory, and researchers leverage them to answer a wide ar-

ray of questions. A frequent focus is unearthing users’ reasons for participating in a particular OHC,

which alludes to the question of what role the community plays in helping users meet their health

goals [13, 34, 60, 134, 142, 161, 206, 217, 224, 235]. Results usually contain some interesting insights.

For example, Hwang et al. [134] find that online support groups for weight loss are an important source

of encouragement as well as friendly competition. Relatedly, Kendall et al. [142] find that people use

Twitter to realize their fitness goals in two ways: the first is to publish evidence of having worked out, the

second is to publish a commitment to work out in the future.

The assumed role of many OHCs is to provide users with support. In such cases, a natural question

to ask is what types of support users receive. Results are practically unanimous in noting that users

seek primarily informational and emotional support [58,59,61,153,177,197,224].


In larger communities that are not necessarily specifically health-oriented (e.g. Twitter and Face-

book), the research question often takes the angle of, “When people mention X on interface Y, what

do they talk about?”. A wide range of health topics have been analyzed on Twitter along these lines,

including insomnia [135], epileptic seizures [170], and concussions [234], often with interesting insights.

For example, Scanfeld et al. [218] find that Tweets mentioning antibiotics often indicate misuse. McNeil

et al. [170] note that most tweets about concussions are in reference to professional sports injuries,

and Bender et al. [22] find that a great deal of breast cancer related discussion on Facebook involves

fundraising.

Finally, a handful of thematic analyses investigate how the experience of an illness can differ by

gender. Makil et al. [160, 161] investigate infertility, paying special attention to the experience of men

whose partners are infertile. Another topic that has received some attention is how coping and self-help

mechanisms differ between people with breast cancer and prostate cancer. In general, these studies find

that men seek more informational support and less emotional support than women do [101,220,233].

3.5.4 Scaling Thematic Analyses

Only a handful of prior studies use thematic analysis results as the foundation for a larger-scale analysis

of PAT. Most notable is the work of De Choudhury et al. [69–71], who analyze how postpartum depression

(PPD) is characterized on both Twitter and Facebook. Using their findings, they leverage activity and

linguistic features to build models that can predict the onset of PPD from Facebook data [71]. Also of

note is the work on cyberchondria by White & Horvitz [255, 256], who analyze health-related search

logs and leverage the results of their analysis to model anxiety escalation and predict the transition from

self-diagnosis to seeking medical assistance. Our work on identifying users’ reasons for participating in

Forum77 (Chapter 6) and their transitions through addiction (Chapter 8) also implements scaled thematic

analyses.

Results of scaled thematic analyses are especially powerful, as they provide both a novel, insightful

contextualization of PAT acquired via close reading of a small sample, as well as population-level insights

acquired via extending these results through automated annotation and large-scale analysis. As such,

their rarity is puzzling: it is possible that many researchers who conduct thematic analyses do not have

experience with machine learning. Alternatively, categories derived in thematic analyses may be too

fine-grained for classifier training. A final explanation may be that there is sufficient reward for publishing

the results of a thematic analysis without investing the resources required to scale it.


3.6 Summary

Our goal in this chapter was to motivate PAT as a data source and present a comprehensive overview of

relevant prior work. We define PAT as any medical text authored by someone who is not a medical pro-

fessional (§ 3.1). PAT, which is often the product of many human hours spent on complex health-related

problem solving, provides a unique window into patient behavior outside of the clinical environment

(§ 3.1.1). However, it is also challenging to work with: PAT is noisy, few tools support mining and exploring

it, and what medical data PAT encodes, and how, is often unclear upon casual inspection

(§ 3.1.2). This underscores the importance of matching research questions with users’ motivations for

authoring PAT in the first place.

Work utilizing PAT as a primary data source tends to fall into one of four categories. Syndromic

Surveillance (3.2) and Pharmacovigilance (3.3) both involve processing large quantities of data in order

to monitor health-related variables. Entity extraction (3.4), which lies under the purview of Natural Lan-

guage Processing and Machine Learning, concerns the identification of specific entities in PAT. Finally,

on the qualitative side, thematic analyses (3.5) involve close readings of text in order to gain insight into

its structure and content.

PAT-based syndromic surveillance systems have great potential in the toolbox of techniques for the

real-time monitoring of medical conditions. To date, the majority of such systems focus on the topic

of influenza, relying either upon search query logs or Twitter as a primary data source. Filtering the

PAT data stream for relevant entities is crucial for a cleaner signal: although keyword-based filtering is

popular due to its simplicity, training classifiers to discriminate relevant from irrelevant data produces

superior results. Often, frequency counts of these filtered data are compared as-is to real-world gold

standards (most commonly, the CDC ILI data set14), but prior work shows that linear models built on

these data have promising predictive value.

Pharmacovigilance (§ 3.3) is concerned with detecting adverse effects related to pharmaceutical
products in real time. PAT comprises a potentially valuable, but difficult to work with, data source for
Pharmacovigilance [119]. Most prior work focuses on online health communities (OHCs), although search
logs have also been shown to be a viable data source for web-scale pharmacovigilance [257]. While
many systems demonstrate the ability to identify {drug, adverse event} pairs, automatically identifying
which of these pairs (amongst thousands) are important is an unsolved problem. To date, no work has
presented a viable predictive model for adverse drug events.

14 http://www.cdc.gov/flu/weekly/fluactivitysurv.htm

A great deal of work on biomedical named entity recognition (NER) exists. While ontology-based

MetaMap and Open Biomedical Annotator are the go-to tools for medical term annotation, they per-

form poorly on PAT for two reasons: first, ontologies have insufficient coverage of consumer medical

terminology. Second, their lack of context sensitivity leads to over-inclusion of irrelevant terminology in

results.

Statistical classifiers have been shown to achieve high accuracy in biomedical NER tasks. However,

these approaches are limited by their requirement for a sizable corpus of annotated data for training and

testing. Most research on biomedical NER utilizes existing publicly available data sets, which are based

on abstracts from biomedical journal publications. Consequently, little prior work on biomedical NER

in PAT exists. Exceptions to this include some of the work on Pharmacovigilance [183, 269], and our

work on ADEPT (Chapter 5), identifying drugs of choice (Chapter 7) and using patterns to extract entity

types [112,264].

Thematic analyses over PAT cover a wide array of conditions. However, notably present are stigma-

tized conditions and conditions that have a behavioral component through which the user can influence

health outcomes. Online health communities, Twitter and Facebook are the most commonly utilized PAT

sources for thematic analyses. As thematic analyses are exploratory by nature, they are used to answer

a wide array of questions. Common topics include elucidating users’ reasons for participating in an online

community as well as what kinds of support such a community provides. The results of a thematic

analysis can be used to train automatic classifiers, thereby extending the research from a small PAT

sample to large PAT corpora. While prior work demonstrates the power and value in this approach, it is

rare.

In sum, PAT is a valuable data source that has been proven to have clinical value. However, PAT is

challenging to work with. To date, prior work on PAT tends to be either structured in such a way as to

reduce the number of variables being analyzed, making analysis and evaluation easier (e.g. syndromic

surveillance, pharmacovigilance, NER), or focuses on qualitative analyses of PAT (e.g. thematic anal-

yses). Although little work builds automated extraction and analysis on top of the results of a thematic

analysis, prior work, as well as our findings in Chapters 6-8, indicate that this approach yields novel and

valuable insights.

Chapter 4

Data

In this chapter we describe our PAT data sets and define terminology relevant to our work. We first

present our full MedHelp data set (§ 4.1), which we use in our work on medical term identification

(Chapter 5), and define key terminology (§ 4.1.1). We then describe Forum77 (§ 4.1.2), a subset of the

MedHelp data set, which we use for our work on addiction (Chapters 6, 7 and 8). Finally, we present our

CureTogether data set (§ 4.2), which we use as an independent test set in Chapter 5. We acquired our

data sets through research agreements with MedHelp and CureTogether, respectively, who anonymized

the data prior to sharing them.

4.1 MedHelp Corpus

MedHelp1 is an online health community designed to aid users in the diagnosis, exploration, and man-

agement of personal health conditions. The site boasts a variety of tools and services, including over

200 condition-specific user online health communities (OHCs). Our data set comprises all discussions

on all of MedHelp’s forums from 2006 through mid-2011: a total of ∼1,250,000 threads. Table 4.1 lists

the top 40 MedHelp forums by post volume, along with unique contributor counts.

4.1.1 Terminology

Figure 4.1 provides an illustrative example of the composition and content of our MedHelp data. A forum
comprises several threads (or discussions) centered around a specific medical condition (e.g. addiction,
breast cancer, etc.). A thread is composed of an initiating post, in which the initiator posts new content
for the community’s consideration, and a series of response posts, in which respondents contribute to
the discussion galvanized by the initiating post. When an initiator posts a response to a thread that she
started, this post is called a self-response.

1 http://www.medhelp.com

Table 4.1: Top 40 MedHelp forums ranked by total post count. A ◦ in the Stigmatized column denotes our
conservative estimate of whether the condition represented by the forum carries a stigma or is otherwise
embarrassing.

Stigmatized  Forum                       Post count  Unique users
◦            Addiction: Substance Abuse     486,972        32,542
             Maternal & Child               402,065        45,821
             Pregnancy 18-34                364,475        28,321
◦            Hepatitis C                    343,433        14,330
◦            HIV Prevention                 274,072        27,528
◦            Fertility                      243,919        17,391
             Women’s Health                 208,683        76,221
             Thyroid Disorders              169,713        21,939
             Multiple Sclerosis             156,500         5,545
◦            STDs                           117,462        29,455
             Neurology                      111,671        47,968
             Dermatology                    107,134        47,612
             Ovarian Cancer                  99,954        10,425
◦            Anxiety                         98,971        17,373
◦            Herpes                          89,792        17,061
             Undiagnosed Symptoms            82,301        30,741
             Gastroenterology                79,659        32,694
             Heart Disease                   74,671        22,294
◦            Hepatitis Social                74,412         2,122
             Pregnancy 35+                   72,414         5,923
             Eye Care                        70,744        18,666
◦            Addiction: Social               68,831         3,253
             Heart Rhythm                    57,001         9,496
             Child Behavior                  45,660        14,961
             Relationships                   42,891         4,724
             Pain Management                 42,099         7,990
             Breast Cancer                   41,197        10,869
             Urology                         37,121        17,351
             Weight Loss Alternatives        36,925        15,003
◦            Depression                      35,614         9,035
             Chiari Malformation             32,493         1,892
             Sexual Health                   32,269        11,344
             MedHelp Social                  31,800           778
             Men’s Health                    31,712        14,832
◦            Bipolar Disorder                29,057         3,775
             Back & Neck                     28,926        13,082
             Hepatitis B                     28,664         4,621
             Ear, Nose & Throat              28,439        14,244
◦            Miscarriages                    26,043         3,703

While features for sub-discussions (nested responses) as well as picking a “best response” in a

thread do exist, they are used infrequently and we do not consider them in our analyses. Moreover, we

have neither demographic data (age, geographic location etc.) describing MedHelp users nor page view

data describing lurking (reading without posting – see § 2.2.1) behavior.
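The terminology above maps naturally onto a small data model. The sketch below is one possible representation, not the schema of the MedHelp data set: the class and field names are hypothetical, and we assume that thread length counts the initiating post together with all responses.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Post:
    author: str
    text: str

@dataclass
class Thread:
    initiating_post: Post
    responses: List[Post] = field(default_factory=list)

    @property
    def initiator(self) -> str:
        """The user who started the thread."""
        return self.initiating_post.author

    def self_responses(self) -> List[Post]:
        """Responses written by the user who started the thread."""
        return [p for p in self.responses if p.author == self.initiator]

    def length(self) -> int:
        """Thread length in posts: the initiating post plus all responses."""
        return 1 + len(self.responses)

@dataclass
class Forum:
    name: str
    threads: List[Thread] = field(default_factory=list)

# Hypothetical thread: user_a initiates, two others respond, user_a replies.
thread = Thread(
    initiating_post=Post("user_a", "How long do withdrawals last?"),
    responses=[
        Post("user_b", "Are you on any other medications?"),
        Post("user_c", "Keep posting for support."),
        Post("user_a", "No other medications."),
    ],
)
forum = Forum("Addiction: Substance Abuse", [thread])
```

Nested responses and "best response" flags are deliberately left out of the model, mirroring their exclusion from our analyses.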

[Figure 4.1 (screenshot). Labeled regions: MedHelp communities; Forum77 discussion thread; initiating post; responses; self-response.]

Figure 4.1: Illustrative example of MedHelp and Forum77 content and structure.

4.1.2 Forum77

MedHelp’s largest forum is dedicated to the topic of Addiction: Substance Abuse2. We dub this community
Forum77 (see footnote 3).

2 http://www.medhelp.org/forums/Addiction-Substance-Abuse/show/77
3 All of MedHelp’s forums have a unique identifier, and the Addiction: Substance Abuse community’s is 77. We
settled on Forum77 as a convenient way to refer to this community. To our knowledge nobody within the
community refers to it as Forum77.

Our data set covers all Forum77 content from 2007 to mid-2014 (7.5 years), and comprises 80,529
discussions (740,046 total posts) authored by 51,153 unique users. Figure 4.2 illustrates summary
statistics describing content and activity on Forum77. As expected, the volume of response posts correlates
strongly with the volume of initiating posts; moreover, both experience a slight decline from 2009 to 2014
(Figure 4.2 (A)). While the number of new users to Forum77 varies widely each month, the number of
return users, which comprise the core community base, is more consistent: in any given month there
are between 200 and 300 return users participating in the forum (Figure 4.2 (B)). This is consistent with
user tenure distribution on Forum77: while most users have a tenure of ≤ 1 month, a long tail indicates
several thousand users who have tenure > 1 year (Figure 4.2 (D)). Finally, while some initiating posts

get no responses, most get at least one, and modal thread length is 4 posts (Figure 4.2 (C)).

[Figure 4.2 panels. A: post count by year (initiating vs. responding posts). B: user count by year (return vs. new users). C: thread count by thread length (# posts). D: user count by tenure (months). E: user count by initiating posts per user. F: user count by responses per user.]

Figure 4.2: Summary statistics of Forum77 variables: post volume by month (A), user volume by month (B), thread length distribution (C), user tenure distribution (D), user initiating post count distribution (E), and user response post count distribution (F).
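Two of the summary statistics discussed above, user tenure and modal thread length, can be computed along the following lines. This is a sketch under our own assumptions: tenure is taken to be the calendar-month difference between a user's first and last post, and the function names are hypothetical.

```python
from collections import Counter
from datetime import date

def tenure_months(first_post: date, last_post: date) -> int:
    """Calendar-month difference between a user's first and last post."""
    return (last_post.year - first_post.year) * 12 + (
        last_post.month - first_post.month
    )

def modal_thread_length(thread_lengths):
    """Most frequent thread length, in posts."""
    return Counter(thread_lengths).most_common(1)[0][0]

# Hypothetical values for illustration.
example_tenure = tenure_months(date(2012, 6, 12), date(2013, 6, 11))
example_mode = modal_thread_length([1, 4, 4, 2, 4, 2])
```

Under this definition a user whose posts all fall within one calendar month has a tenure of 0 months.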

CHAPTER 4. DATA 39

4.2 CureTogether Corpus

CureTogether⁴ is an online health community that focuses on collecting structured health information

from its members via surveys. The site covers a wide array of medical conditions (589 in our data set),

each associated with a curated collection of symptom, treatment, side effect and cause/trigger terms. By

focusing on collecting structured data, CureTogether circumvents the problem of extracting medically-

relevant information from PAT. However, discussion levels on the site are low: our data set contains

∼3,000 free-text posts on a variety of CureTogether’s medical topics. Nevertheless, these posts are detailed and thoughtful, and in Chapter 5 they serve as a PAT source independent of MedHelp.

4 http://www.curetogether.com

Chapter 5

Identifying Medically Relevant Terms in

PAT

5.1 Introduction

When we began exploring our MedHelp corpus, we realized that our efforts were severely hampered

by the absence of a good solution to a seemingly simple problem: identifying the medically relevant

terms in PAT. How, for example, might one automatically extract the terms that we have flagged as

medically relevant in the following excerpt from MedHelp’s Addiction: Substance Abuse forum?

So, I’m 62 hours without pills, and its definitely getting worse, I ache all over, the anxiety is

the worst, along with restless legs but I’m here now, and I’m not sure it can get much worse

so hopefully soon I’ll be out the other side. Last night was horrible. I had around 3 hours

broken sleep, night sweats and the most awful haunting nightmares when I was sleeping.

I’ve taken the l-tyrosine and B6 this morning, I’ll try and force some food down me shortly

and then take the rest of the vitamins.

The ability to distill medically relevant terms from PAT is useful for exploration: it filters out irrelevant

content, allowing for high-level insights into the corpus and facilitating hypothesis generation. More

sophisticated analyses can also be implemented on the extracted terms. The results of co-occurrence

analyses, for example, can improve query expansion and information retrieval over a corpus [194, 219,

245], or can be used to impose additional structure, such as clustering [39] or hierarchical concept

summaries [216], over the source data. In a PAT corpus, significant term co-occurrences could be used

to build a “map” of important links between symptoms and treatments.
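As a rough illustration of the co-occurrence idea, the following Python sketch counts how often pairs of extracted terms appear in the same post. The function name and the sample term sets are hypothetical, and raw counts stand in for whatever significance test an actual analysis would apply.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(term_sets):
    """Count unordered pairs of terms that appear together in one post."""
    counts = Counter()
    for terms in term_sets:
        # Sort so that each pair has a single canonical (a, b) ordering.
        for a, b in combinations(sorted(set(terms)), 2):
            counts[(a, b)] += 1
    return counts


# Hypothetical extracted terms, one set per post.
posts = [
    {"withdrawal", "anxiety", "restless legs"},
    {"withdrawal", "anxiety"},
    {"withdrawal", "l-tyrosine"},
]
pair_counts = cooccurrence_counts(posts)
print(pair_counts.most_common(3))
```

Pairs whose counts are high relative to the individual term frequencies would be candidate edges in such a map.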


Identifying medical concepts in text is a long-standing research challenge that has spurred the development of several software toolkits [17]. Toolkits such as MetaMap¹ and the Open Biomedical Annotator (OBA)² focus primarily on mapping words from text authored by medical experts to concepts in biomedical ontologies. A biomedical ontology is essentially a controlled collection of terms and the hierarchical

relationships between them. Usually, ontological terms are also categorized or typed (e.g., drug, sign or

symptom, medical device, etc.).

Thousands of biomedical ontologies exist, and differ according to the topic or level of specificity

covered by their terms. For example, the MFOEM³ (Emotion Ontology) covers concepts specifically related to affective phenomena, while SNOMED-CT⁴ (Systematized Nomenclature of Medicine - Clinical Terms) covers a broad array of clinical terms. Curating ontologies is a labor-intensive process, in which

people must agree on which terms should be included, removed, combined or split, must categorize said

terms, and must define their hierarchical relationships.

Despite recent efforts to develop an ontology suitable for PAT, namely the open access and collaborative Consumer Health Vocabulary, or (OAC) CHV [77, 273, 274], we suspect that tools like MetaMap and OBA will remain

ill-suited to the task of medical term identification in PAT due to structural differences between PAT and

text authored by medical experts. As we note in § 3.1.2, such differences include lexical and semantic

mismatches [167,272], mismatches in consumers’ and experts’ understanding of medical concepts [99,

272] and mismatches in descriptive richness and length [99,167,272]. Finally, consumer medical jargon may evolve over time as a patient acquires expertise. This would be a challenge for ontologies, which are, by design, inflexible and brittle.

Our goal is to automatically and accurately identify medically relevant terms in PAT. (Note that we do not attempt to map terms to ontological concepts; we view this as a separate and complementary task.) As acquiring annotated data sets is a major obstacle to classifier training, we investigate crowdsourcing as an alternative to having medical professionals label PAT (§ 5.4). First, we discuss the process of designing the crowdsourcing task (§ 5.4.1). Next, we compare crowdsourced annotations from non-experts (Amazon’s Mechanical Turk⁵ workers, or Turkers) and medical experts (Registered Nurses hired via ODesk⁶) (§ 5.4.2). We find that crowdsourcing PAT medical term identification tasks to non-experts achieves results comparable in quality to those given by medical experts (§ 5.4.3). While this result opens a new avenue for rapid and affordable PAT annotation, not all PAT annotation tasks are amenable to crowd labeling (§ 5.4.4).

1 http://metamap.nlm.nih.gov

2 http://bioportal.bioontology.org/annotator

3 http://bioportal.bioontology.org/ontologies/MFOEM

4 http://www.ihtsdo.org/snomed-ct

5 http://www.mturk.com

6 http://www.odesk.com

Next, we train a conditional random field (CRF) classifier to automatically identify medically relevant

terms in PAT (§ 5.5). Our classifier, trained on 10,000 crowd-labeled PAT sentences, dramatically out-

performs state-of-the-art annotation tools MetaMap, OBA and TerMINE (§ 5.5.3). We call our classifier

ADEPT (Automatic Detection of Patient Terminology). In an error analysis, we observe that ADEPT

has the most trouble correctly classifying “generic” medical terms (e.g., pills, medicine, doctor) (§ 5.5.3).

We attribute ADEPT’s success to the suitability of sentence-level context-sensitive learning models, like

CRFs, to PAT medical term identification tasks (§ 5.7).

Finally, we demonstrate ADEPT’s efficacy by applying it to text from our MedHelp corpus (§ 5.6).

First, we compare the top-50 terms extracted from MedHelp’s Arthritis forum by both ADEPT and the

OBA (§ 5.6.1), noting that those recovered by ADEPT are both diverse and richly descriptive of arthritic

conditions, while the majority of those recovered by OBA are spurious. Next, we construct a graph of

co-occurring terms extracted by ADEPT from MedHelp’s Addiction: Substance Abuse forum, Forum77

(§ 5.6.2). The resulting graph suggests that a primary topic of discussion on the forum is withdrawal, and

moreover, that users discuss explicit drugs, especially prescription opioids, on the forum. Our work in

Chapters 6, 7 and 8 further explores Forum77 and confirms that these high-level insights are accurate.

5.2 Related Work

5.2.1 Medical Term Identification

MetaMap, arguably the best-known medical entity extractor, is a highly configurable program that relates

words in free text to concepts in the UMLS Metathesaurus [16,17]. MetaMap sports an array of analytic

components, including word sense disambiguation, lexical and syntactical analysis, variant generation,

and POS tagging. MetaMap has been widely used to process data sets ranging from email to MEDLINE⁷

abstracts to clinical records [17,31,43].

The Open Biomedical Annotator (OBA) is a more recent biomedical concept extraction tool under development at Stanford University. OBA is based on MGREP, a concept recognizer developed at the University of Michigan [138]. Like MetaMap, OBA maps words in free text to ontological concepts; its workflow, however, is simpler, comprising a dictionary-based concept recognition tool and a semantic expansion component that finds concepts related to those present in the exact text [138].

7 A collection of biomedical publication abstracts. For more information see: http://www.nlm.nih.gov/pubs/factsheets/medline.html

A handful of studies compare MetaMap and/or OBA to human annotators, and tend to find the

tools wanting. Ruau et al. [212] evaluated automated MeSH annotations on PRoteomics IDEntification

(PRIDE) experiment descriptions against manually assigned MeSH annotations. MetaMap achieved

precision and recall scores of 15.66% and 79.44%, while OBA achieved 20.97% and 79.48%. Pratt and

Yetisgen-Yildiz [201] compared MetaMap’s annotations to human annotations on 60 MEDLINE titles: they

found that MetaMap achieved exact precision and recall scores of 27.7% and 52.8%, and partial preci-

sion and recall scores of 55.2% and 93.3%. They note that several failures result from missing concepts

in the UMLS. This is corroborated in an analysis of 376 patient-defined symptoms from PatientsLikeMe

by Smith and Wicks [226], who found that only 43% of unique terms had either exact or synonymous

matches in the UMLS; of the exact matches, 93% were contributed by SNOMED CT.

In addition to ontological approaches, there are several statistical approaches to medical term identification. NaCTeM’s TerMINE⁸ is a domain-independent tool that uses statistical scoring to identify

technical terms in text corpora [94]. Given a corpus, TerMINE produces a ranked list of candidate terms.

In a test on eye-pathology medical records, precision was highest (∼75%) for the top 40 terms ranked by C-value and decreased steadily down the list (∼30% overall). Absolute recall was not calculated,

due to the time-consuming nature of having experts verify true negative classifications in the test corpus.

Recall relative to the extracted term list, however, was ∼97% [94].

As we discuss in Chapter 3, a great deal of prior work has focused on training statistical classifiers

for biomedical named entity recognition (NER) tasks [76,87,95,111,124,125,214,222,238,239,267]. In

general, this work demonstrates good results, indicating that statistical classification methods are more

appropriate for biomedical NER tasks than MetaMap and OBA. However, none of this work utilizes PAT

as a primary data source: statistical classifiers require sizable quantities of labeled data for training and

testing, and to date all such available data sets are based on biomedical publication abstracts [145,146,

204,271].

5.2.2 Consumer Health Vocabularies

A branch of research complementary and closely related to ours concerns Consumer Health Vocabularies (CHVs): ontologies that link lay and UMLS medical terminology [85,273]. Motivations for developing CHVs include narrowing knowledge gaps between consumers and providers [273,274], coding data for retrieval and analysis [77], improving the “readability” of health texts for lay consumers [144], and coding new concepts that are missing from the UMLS [143, 226]. We are currently aware of two CHVs: the MedlinePlus Consumer Health Vocabulary⁹, and the open access and collaborative Consumer Health Vocabulary¹⁰, or (OAC) CHV, which was included in UMLS as of May 2011.

8 http://www.nactem.ac.uk/software/termine

To date, most research on CHVs has focused on discovering new terms to add to the (OAC) CHV. In

2007, Zeng et al. [274] compared several automated approaches for discovering new “consumer medical

terms” from MedlinePlus query logs. Using a logistic regression classifier, they achieved an AUC of

95.5% on all n-grams not present in the UMLS. More recently, Doing-Harris & Zeng [77] proposed a

computer-assisted update (CAU) system to crawl PatientsLikeMe, suggesting candidate terms for the

(OAC) CHV to human reviewers. By filtering CAU terms by C-value [94] and termhood [274] scores, they

were able to achieve a 4:1 ratio of valid to invalid terms; however, this also resulted in discarding over

50% of the original valid terms. Given the goals of the CHV movement, our CRF model for PAT medical term identification may prove to be an effective method for generating new candidate terms for CHVs.

5.3 Data

In this section we describe our data preparation and sampling methods. We use samples from our

MedHelp (§ 4.1) data set for comparing crowdsourced vs. expert sourced labels, and for training and

cross-validation of our CRF classifier. We use a sample from our CureTogether (§ 4.2) data set as a

hold-out gold standard for comparing our CRF classifier to state-of-the-art medical term annotation tools.

5.3.1 Preparation

We analyze our data at the sentence level. This promotes a fairer comparison between machine taggers,

which break text into independent sentences or phrases before annotating, and human taggers, who may

otherwise transfer context across sentences. We use Lucene¹¹ to tokenize our corpora into sentences.

For consistency, we excluded sentences from MedHelp forums that we agreed were tangentially medical (e.g., “Relationships”) or over-general (e.g., “General Health”), or that contained fewer than 1,000 sentences. The raw MedHelp data set contains approximately 1,250,000 discussions. After preparation, the data set comprises approximately 950,000 discussions from 138 forums: a total of 27,230,721 sentences.

9 http://www.nlm.nih.gov/medlineplus/xml.html

10 http://consumerhealthvocab.org

11 http://lucene.apache.org

5.3.2 Samples

We use the following samples:

MH1K: 1,000 MedHelp sentences sampled uniformly at random; labeled by crowd and experts. We use this sample to compare expert and crowd labels. We also use the expert labels as a gold standard for comparing our CRF classifier’s performance against state-of-the-art tools.

MH10K: 10,000 MedHelp sentences sampled uniformly at random; labeled by crowd. We use this sample to train our CRF classifier to identify medically relevant terms in PAT. We also use it for 10-fold cross validation of this classifier.

CT1K: 1,000 CureTogether sentences sampled uniformly at random; labeled by experts. We use this as an independent gold standard for comparing our CRF classifier’s performance against that of state-of-the-art tools.

5.4 Labeling Medically Relevant Terms with the Crowd

A common barrier to both training and evaluating medical text annotators is the lack of sufficiently large,

labeled data sets [17,201]. The challenge in building such data sets lies in sourcing medical experts with

enough time to annotate text at a reasonably low cost [201].

Crowdsourcing is the allocation of a series of small tasks (often called micro-tasks) to a “crowd”

of online workers, typically via a web-based marketplace. Crowdsourcing is particularly attractive for

obtaining results faster and at lower cost than other participant recruitment schemes. When the workflow

is properly managed (e.g., via quality control measures such as aggregate voting, or by breaking up tasks

into suitable sub-components such as the “find-fix-verify” method proposed by Bernstein et al. [26]), the

combined results are often comparable in quality to those obtained via more traditional task completion

methods [126, 147]. Snow et al. [228] find that non-expert crowds can effectively execute linguistic

annotation tasks (affect recognition, word similarity, textual entailment, temporal ordering, and word


sense disambiguation) that are typically performed by experts. However, designing a crowdsourcing

task such that quality results are obtained is challenging and requires careful thought [26,147].

Replacing medical experts with non-expert crowds would address concerns of time and cost, allowing

us to build labeled PAT data sets quickly and cheaply. To test the viability of this idea, we first design

a crowdsourcing task for medical term identification in PAT (§ 5.4.1). Next, we deploy this task to both

experts (in our case, Registered Nurses, or RNs) and non-experts (Amazon Mechanical Turk workers,

or Turkers), and compare their annotations over a sample of 1,000 sentences (MH1K ) (§ 5.4.2).

5.4.1 Task Design and Pilot Study

Amazon’s Mechanical Turk¹² is an online crowdsourcing platform where workers (Turkers) can browse

“human intelligence tasks” (HITs) posted by requesters and complete them for a small payment. We de-

signed a simple interface in which a HIT comprised 100 sentences, each of which was accompanied by

a text box into which Turkers could copy medically relevant terms. Our original prompt simply asked Turk-

ers to copy/paste any terms that seemed medically relevant from each sentence into the accompanying

text box. The resulting data contained several inconsistencies, including:

terms taken out of context: users selected terms that had no medical relevance in the context of

the given sentence, but might have medical connotations in other contexts, e.g., “anxiety” in the sentence “I apologize if my post created any undue anxiety”.

omission: users would often leave an empty response for a sentence that contained a term that

was clearly medically relevant.

numerical measurement inclusion: some users felt that numbers corresponding to medication

dosages, units of measurement, etc. were relevant, while others did not.

concept granularity and scope: in a sentence such as “I have low blood sugar”, users would not

know whether to select “low blood sugar” or just “blood sugar”.

repetition: if a medically relevant term appeared twice in the same sentence (e.g., “pain” in “I am

in a lot of pain and the meds don’t seem to help, they just take the edge off the pain if anything”),

some users would extract it only once, and others would extract it each time it appeared.

12 http://www.mturk.com


Prior work shows that the design of a crowdsourcing task and prompt strongly impacts response

quality [147]. In order to arrive at a suitable prompt that produced consistent results, we iterated on our

original version several times, basing our changes on the design principles outlined by Kittur et al. [147].

We discuss pivotal changes below; Figure 5.1 shows our final prompt and interface.

The most problematic inconsistency was terms taken out of context, which amounts to unnecessary false positives. Subjective tasks are especially difficult for crowd workers [147], and the medical term identification task is inherently subjective. We discovered, however, that making the task seem less subjective, by asking users to tag words or phrases that they thought doctors would find interesting, all but eliminated this effect.

The next problematic issue was omission, or unnecessary false negatives. We suspected that one reason Turkers omitted terms was that doing so let them complete the HIT faster. Kittur et al. [147] note that to acquire accurate results from Turkers, malicious completion and good-faith completion should require comparable levels of effort. We changed our interface such that each text box had to contain some value prior to completion of the HIT, and instructed Turkers to type “NA” into text boxes corresponding to sentences containing no medically relevant concepts. This helped somewhat, but it is still easier to type “NA” than to copy/paste several terms into a text box. Kittur et al. [147] also note that signaling to Turkers, in a believable manner, that their responses will be verified is thought to reduce invalid responses as well as increase time spent on task. Before Turkers accepted the HIT, we informed them that four other Turkers would be completing the same HIT, and that their response would be rejected if it differed substantially from the others. We enforced this policy. Implementing these changes resulted in a drastic reduction of omissions.

Explicitly asking users to ignore numerical measurements and providing illustrative examples of multi-word concepts reduced conflicting instances of numerical measurement inclusion and concept granularity to the point where aggregating over Turker responses produced a good result. However, similar interventions related to issues of repetition had no effect. Ultimately we propagated the “medically relevant” label to all unlabeled terms in the sentence that matched an extracted term. It is reasonable to assume that two identical terms should carry the same label in a sentence, and we observed no instances in which this assumption was violated.
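The propagation step can be sketched in a few lines of Python. This is a minimal illustration under assumptions of our own: matching is done per word after lower-casing and stripping punctuation, and the function name is hypothetical rather than taken from the original pipeline.

```python
import re

PUNCT = ".,!?;:\"'()"


def propagate_labels(tokens, extracted_terms):
    """Return one boolean per token: True if it matches a word of any extracted term."""
    # Collect the lower-cased words that make up the annotator's extracted terms.
    vocab = {
        word
        for term in extracted_terms
        for word in re.findall(r"[a-z0-9'-]+", term.lower())
    }
    # Every occurrence of a matching word receives the label, not just the first.
    return [tok.lower().strip(PUNCT) in vocab for tok in tokens]


sentence = ("I am in a lot of pain and the meds don't seem to help, "
            "they just take the edge off the pain if anything")
tokens = sentence.split()
# The annotator extracted "pain" only once, although it appears twice.
labels = propagate_labels(tokens, ["pain", "meds"])
tagged = [tok for tok, is_medical in zip(tokens, labels) if is_medical]
print(tagged)
```

Because matching is per word, a multi-word term such as "blood sugar" would also label a stray "blood" elsewhere in the same sentence; whether that is desirable depends on the annotation policy.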


Instructions (please read to get full credit for this task)

For this HIT, we would like you to extract all words/phrases that are medical concepts from the sentences below. There are 100 sentences; this should take ~15-25 minutes.

To find medical concepts, ask yourself the question: "If I was telling this to my doctor, which words would the doctor find interesting?" To simplify things, do not extract numerical values such as age, weight, gender, medication dosage, symptom duration etc. Do extract concepts describing body parts, conditions (and causes and effects of conditions), symptoms, treatments, etc. Remember that some medically relevant terms are abbreviated (e.g. BS for "blood sugar").

For each sentence, please COPY/PASTE the relevant text EXACTLY (do not re-type it, or correct misspellings), and SEPARATE each concept with a COMMA. For example:

I gave up smoking 2 weeks ago, and my blood pressure is under control with verapamil (0.5mg twice a day).
smoking, blood pressure, verapamil

For multi-word concepts, include as many words as you can, but make sure that they refer to just ONE concept. Do not extract overlapping concepts. For example, in the sentence below, the term "blood sugar" is preferred to "blood".

Shakes in the hands can be symptomatic of low blood sugar.
shakes, hand, blood sugar

Finally, many of the sentences will contain no medically relevant concepts. Just enter NA in the boxes in these cases. For example:

You need to take care of yourself before you can take care of someone else.
NA

NOTE: you will be able to complete ONLY ONE of these HITs. Please do not attempt to accept another hit after completing this one. Have fun!

Submit

Figure 5.1: Final PAT medical term identification task instructions and interface. Turkers were informed that their answers would be checked against other Turkers’ in the HIT description on the MTurk interface.

5.4.2 Experiment

We use our MH1K data set for this experiment: a uniform sample of 1,000 sentences from the general MedHelp data set. We deemed 1,000 sufficiently large for an informative comparison between RN and Turker responses, but small enough to make expert annotation affordable. We split the sample into 10 groups of 100 sentences.

Table 5.1: Majority vote at the token level over RN responses. Terms identified by RNs as medically relevant are shown in bold. Stopwords (e.g., “and”, “of”) are excluded from the vote.

RN 1: shakes in the hands can be symptomatic of low blood sugar
RN 2: shakes in the hands can be symptomatic of low blood sugar
RN 3: shakes in the hands can be symptomatic of low blood sugar
Result: shakes hands symptomatic blood sugar

Our experts comprised 30 RNs from ODesk¹³, an online professional contracting service. In addition

to the RN qualification, we required that each expert have perfectly rated English language proficiency.

Each expert did one PAT medical term identification task (100 sentences), and each group of 100 sen-

tences was tagged by three experts, who were reimbursed $5.00 for completing the task. All tasks were

completed within two weeks at a cost of $150.00.

Our non-expert crowd comprised 50 Turkers recruited from Amazon’s Mechanical Turk (AMT). We

required that the Turkers have high English language proficiency, reside in the United States, and be

certified to work on potentially explicit content. Each Turker performed a single PAT medical term iden-

tification task (100 sentences), and each sentence group was tagged by five Turkers. The Turkers were

reimbursed $1.20 upon faithful completion of the task. All tasks were completed within 17 hours at a cost

of $60.00.

Determining a Gold Standard

We determine a gold standard for each sentence by taking a majority vote over the RNs’ responses.

Voting is performed at the word level, despite the prompt to extract words or phrases from the sentences.

Table 5.1 illustrates how this simplifies term identification by eliminating partial matching considerations

over multi-word concepts. N-gram terms can be recovered by heuristically combining adjacent words.
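The word-level vote can be sketched as follows. The stopword list and the three individual annotations are hypothetical (the per-annotator markings from Table 5.1 are not reproduced here); they are chosen only so that the outcome matches the result row of that table.

```python
from collections import Counter

# Illustrative stopword list; an assumption, not the list used in the study.
STOPWORDS = {"in", "the", "can", "be", "of", "and"}


def majority_vote(tokens, annotations, stopwords=STOPWORDS):
    """Keep tokens that a strict majority of annotators marked as medical."""
    needed = len(annotations) // 2 + 1
    votes = Counter()
    for marked in annotations:
        # Each annotator contributes at most one vote per distinct word.
        votes.update({word.lower() for word in marked})
    return [
        tok for tok in tokens
        if tok.lower() not in stopwords and votes[tok.lower()] >= needed
    ]


tokens = "shakes in the hands can be symptomatic of low blood sugar".split()
# Hypothetical word-level annotations from three annotators.
rn_annotations = [
    {"shakes", "hands", "symptomatic", "low", "blood", "sugar"},
    {"shakes", "hands", "blood", "sugar"},
    {"shakes", "symptomatic", "blood", "sugar"},
]
gold = majority_vote(tokens, rn_annotations)
print(gold)
```

Voting on single words sidesteps the question of whether "low blood sugar" and "blood sugar" count as the same answer; adjacent surviving words can be rejoined into phrases afterwards.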

Comparing Turkers Against a Gold Standard

To test the feasibility of using non-expert crowds in place of experts, we compare Turker to RN responses directly, aggregating across all 5 possible Turker voting thresholds. This allows us both to evaluate the quality of aggregated Turker responses against the gold standard and to select the optimal voting threshold.

Table 5.2: Turker performance against the RN gold standard. Voting threshold indicates the minimum number of Turkers who have to annotate a term as medically relevant for it to be included in the result. Maximum column values are indicated in bold. A corroborative policy of 2+ votes yields high scores across the board, and maximizes F1-score.

Vote Threshold   F1      Precision   Recall   Accuracy   MCC
1                78.45   67.15       94.31    93.96      0.77
2                84.43   82.53       86.41    96.29      0.82
3                83.80   91.67       77.18    96.52      0.82
4                76.61   95.70       63.87    95.46      0.76
5                59.81   97.99       43.04    93.26      0.62

13 http://www.odesk.com

5.4.3 Results

Both the RN and the Turker group achieve high inter-rater reliability scores: κ = 0.709 and κ = 0.707

respectively using Fleiss’ Kappa [88], which measures agreement across two or more voters. Table 5.2

compares aggregated Turker responses against the RNs’ gold standard; voting thresholds dictate the

number of Turker votes required for a word to be tagged as “medically relevant”.
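For readers unfamiliar with Fleiss' Kappa, a minimal sketch of the standard formula is given below. The input layout (one row per item, one column per category, each row summing to the number of raters) and the toy ratings are assumptions for illustration; they are not the study's data.

```python
def fleiss_kappa(ratings):
    """Fleiss' kappa for a table of per-item category counts.

    ratings[i][j] is the number of raters who assigned item i to category j.
    Every row must sum to the same number of raters n (n >= 2).
    The value is undefined when all ratings fall into a single category.
    """
    n_items = len(ratings)
    n_raters = sum(ratings[0])
    n_categories = len(ratings[0])

    # Share of all assignments that went to each category.
    p = [
        sum(row[j] for row in ratings) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement for each item.
    per_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in ratings
    ]
    observed = sum(per_item) / n_items
    expected = sum(pj * pj for pj in p)
    return (observed - expected) / (1 - expected)


# Toy example: 4 words, 3 raters, columns = [not medical, medical].
toy = [[3, 0], [0, 3], [2, 1], [1, 2]]
kappa = fleiss_kappa(toy)
print(round(kappa, 3))
```

In the word-level setting used here, each item would be one non-stopword token and the two categories would be "medically relevant" and "not medically relevant".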

F1-score is maximized at a voting threshold of 2. We call this a corroborated vote, and select 2 as

the appropriate threshold for our remaining experiments. Overall, Turker scores are sufficiently high that

we regard corroborated Turker responses as an acceptable approximation for expert judgment.
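The threshold sweep behind Table 5.2 can be sketched as below. The vote counts and gold labels are invented toy values, and the helper names are hypothetical; only the metric definitions (precision, recall, F1, accuracy and the Matthews correlation coefficient) are standard.

```python
import math


def scores(pred, gold):
    """Compute word-level metrics for boolean predictions against boolean gold labels."""
    tp = sum(p and g for p, g in zip(pred, gold))
    fp = sum(p and not g for p, g in zip(pred, gold))
    fn = sum(g and not p for p, g in zip(pred, gold))
    tn = sum(not p and not g for p, g in zip(pred, gold))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"f1": f1, "precision": precision, "recall": recall,
            "accuracy": (tp + tn) / len(gold), "mcc": mcc}


def apply_threshold(vote_counts, threshold):
    """A word is tagged medical if at least `threshold` crowd workers marked it."""
    return [votes >= threshold for votes in vote_counts]


# Toy data: crowd votes (out of 5) per word, and expert gold labels.
vote_counts = [5, 2, 1, 0, 3, 1, 1, 4]
gold = [True, True, False, False, True, False, True, True]

results = {k: scores(apply_threshold(vote_counts, k), gold) for k in range(1, 6)}
best = max(results, key=lambda k: results[k]["f1"])
print(best, results[best])
```

Raising the threshold trades recall for precision, which is the pattern visible across the rows of Table 5.2.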

5.4.4 Limitations of the Crowd

Crowdsourcing medical term identification in PAT allows us to build large, annotated data sets both

cheaply and quickly. Exploring the crowd’s efficacy at other medical entity annotation tasks is an impor-

tant avenue for future work. Here, we offer some anecdotal insights based on our own attempts to get

the crowd to label specific types of medical terms in PAT. We attempted to implement two tasks similar

to that described in § 5.4.1: in the first, we asked Turkers to identify terms referring to symptoms and/or

conditions (e.g., “cough”, “asthma”, “headache”). In the second task, we asked them to identify terms referring to drugs and/or treatments (e.g., “acupuncture”, “Tylenol”, “cough medicine”).


Although Turkers seemed to approach the task earnestly (they spent a reasonable amount of time

on it), the results were surprisingly inconsistent. In fact, some workers defaulted to labeling any terms

that were medically relevant, even though it is unlikely that they had been exposed to the original task

described in § 5.4.1, as more than 6 months had since elapsed. Ultimately, we hypothesized that there

were three factors explaining Turkers’ poor performance:

The first is subjectivity. The task of identifying symptoms or treatments is ambiguous and, in our

experience, more subjective than that of identifying terms that are simply medically relevant. For ex-

ample, do wheelchairs, relaxation classes, birth control or drinking water constitute treatments? Do

sensations, flare-ups, pregnant and being worried constitute symptoms or conditions? The answers to

these questions tend to be “it depends”.

The second is concept scatteredness, which primarily affects the symptom/condition category. Symp-

tom descriptions are often spread across an entire sentence, and Turkers are unsure of how to scope

such concepts. Consider, for example, the phrase “after I took the meds I felt like I’d been hit by a truck”.

Is “felt like I’d been hit by a truck” a symptom? This challenge is also cited by Leaman et al. [154] in work

on mining adverse drug events from user comments on DailyStrength¹⁴.

The final factor that likely affected Turker performance was task overlap. The postings of the symptom

and/or condition task and the drug and/or treatment tasks were staggered by a couple of days. However,

we noticed that some people tried to pick out just drugs and/or treatments in a symptoms and/or con-

ditions task, and vice versa. We attribute such mixups to the fact that the same Turkers who had done

the earlier task were also attempting the staggered task, but had habituated to the first task. Allowing

more time to elapse before posting the second task, or preventing Turkers from doing both tasks, should

ameliorate this effect.

We believe that with additional design and iteration, it would be possible to get Turkers to identify

specific types of medical terminology in PAT. For example, a multi-tiered approach such as find-fix-

verify [26] might reduce the level of task subjectivity. Enhancing the interface such that Turkers could

select “core” concepts and then related supporting terms might facilitate accuracy. Refining the task to

make it more specific would likely reap rewards. For example, instead of asking Turkers to “find terms

referring to symptoms or conditions”, they might be asked to “find terms that refer to symptoms related

to the condition Asthma”.

14 http://www.dailystrength.com


In sum, however, designing a crowdsourcing task can be a resource-intensive process, and this must be traded off against alternative annotation methods. In our later work on Forum77, our data set was sufficiently small that we elected to annotate it ourselves. However, systematically exploring the design space of crowdsourcing PAT annotation tasks would likely yield high returns in the long term.

5.5 Training a Classifier on Crowd-Labeled Data

We now turn to the question of training a statistical classifier to identify medical terms in PAT automat-

ically. We describe the models that we both use and compare against (§ 5.5.1), before describing our

experiment design (§ 5.5.2). Next, we present our results (§ 5.5.3), along with a failure analysis of our

classifier, ADEPT. Finally, we discuss our results and the limitations of our approach (§ 5.7).

5.5.1 Models

MetaMap, OBA and TerMINE. We use the Java API for MetaMap 2012¹⁵, running it under three conditions: default; restricting the target ontology to SNOMED CT (a high percentage of “consumer health vocabulary” is reputedly contained in SNOMED CT [226]); and restricting the target ontology to the (OAC) CHV. We used the Java client for OBA [138], running it under two conditions: default; and restricting the target ontology to SNOMED CT, as the (OAC) CHV was not available to the OBA at the time of writing. For TerMINE, we used the online web service¹⁶.

Dictionary. A dictionary (or gazetteer) is one of the simplest classifiers that we can build using labeled

training data. Our dictionary compiles a vocabulary of all words tagged as “medical” in the training data

according to the corroborative voting policy; it then scans the test data and tags any words that match a

vocabulary element. Our dictionary implements case-insensitive, space-normalized matching.
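A dictionary tagger of this kind can be sketched as follows. The function names and the tiny training set are hypothetical; the sketch only assumes that training data arrives as token lists with one boolean label per token.

```python
import re


def normalize(word):
    """Case-insensitive, space-normalized form of a word."""
    return re.sub(r"\s+", " ", word.strip().lower())


def build_vocabulary(labeled_sentences):
    """Collect every word tagged as medical in the training data."""
    return {
        normalize(tok)
        for tokens, labels in labeled_sentences
        for tok, is_medical in zip(tokens, labels)
        if is_medical
    }


def tag(tokens, vocab):
    """Tag a test word as medical if it appears in the vocabulary."""
    return [normalize(tok) in vocab for tok in tokens]


# Hypothetical training sentences with word-level labels.
train = [
    ("I'm feeling so tired".split(), [False, False, False, True]),
    ("I have low blood sugar".split(), [False, False, False, True, True]),
]
vocab = build_vocabulary(train)
predicted = tag("I'm TIRED of waiting".split(), vocab)
print(predicted)
```

The example also shows the main weakness of the approach: "tired" is tagged as medical even in "tired of waiting", because a dictionary has no access to sentence context.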

ADEPT: A Conditional Random Field Model. Conditional random fields (CRFs) are probabilistic graphical models particularly suited to labeling sequence data [151]. Their suitability stems from the fact that they relax several independence assumptions made by Hidden Markov Models; moreover, they can encode arbitrarily related feature sets without having to represent the joint dependency distribution over features [151]. As such, CRFs can incorporate sentence-level context into their inference procedure. For example, a CRF can discern that the word “tired” represents a medical term in the sentence, “I’m feeling so tired, as though I am oxygen deprived.”, but not in the sentence, “I’m tired of feeling as though I am oxygen deprived.” The term “oxygen deprived” is medically relevant in both sentences¹⁷.

15 http://metamap.nlm.nih.gov

16 http://www.nactem.ac.uk/software/termine

Our CRF training procedure takes, as input, labeled training data coupled with a set of feature defi-

nitions, and determines model feature weights that maximize the likelihood of the observed annotations.

We use the Stanford Named Entity Recognizer package¹⁸, a trainable Java implementation of a CRF

classifier, and its default feature set. Examples of default features include word substrings (e.g., “ology”

from “biology”) and windows (previous and trailing words); the full list is detailed in Appendix A. We refer

to our trained CRF model as ADEPT (Automatic Detection of Patient Terminology).
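To make the two example feature families concrete, the Python sketch below extracts word-substring and window features for a single token. It is illustrative only: `token_features` is a hypothetical helper, the feature-name prefixes and the n-gram length limit are assumptions, and the code does not reproduce the Stanford NER default feature set.

```python
def token_features(tokens, i, max_ngram=5):
    """Return a set of string features for tokens[i].

    Two feature families are sketched:
      * word substrings: all character n-grams of length 2..max_ngram
      * windows: the previous and the trailing word
    Sentence boundaries are marked with the assumed symbols <s> and </s>.
    """
    word = tokens[i].lower()
    feats = {"word=" + word}

    # Character substrings, e.g. "ology" from "biology".
    for n in range(2, max_ngram + 1):
        for start in range(len(word) - n + 1):
            feats.add("sub=" + word[start:start + n])

    # Window features: neighbouring words supply sentence-level context.
    prev_word = tokens[i - 1].lower() if i > 0 else "<s>"
    next_word = tokens[i + 1].lower() if i + 1 < len(tokens) else "</s>"
    feats.add("prev=" + prev_word)
    feats.add("next=" + next_word)
    return feats


medical_use = "i'm feeling so tired ,".split()
other_use = "i'm tired of feeling".split()

f1 = token_features(medical_use, 3)
f2 = token_features(other_use, 1)

# The word itself is identical, but the window features differ,
# which is what lets a sequence model separate the two uses.
print(sorted(f1 - f2))
print(sorted(f2 - f1))
```

A CRF learns one weight per feature and label, so the differing window features give the model a basis for labeling the same word differently in different sentences.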

5.5.2 Design

To test our second hypothesis, we create a crowd-labeled data set comprising 10,000 MedHelp sentences (MH10K), and an RN-labeled data set comprising 1,000 CureTogether sentences (CT1K). Using the procedures described in § 5.4, this cost approximately $600 and $150, respectively. We train two models – a dictionary and a CRF – on the MedHelp data set (MH10K), and evaluate performance via 5-fold cross validation; we compare MetaMap, OBA and TerMINE’s output directly. Finally, we compare the performance of all 5 models against the CureTogether gold standard (CT1K).
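The cross-validation protocol can be sketched as follows. This is a minimal illustration that assumes folds are formed at the sentence level and shuffled with a fixed seed; neither detail is stated above, and `k_fold_splits` is a hypothetical helper.

```python
import random


def k_fold_splits(items, k=5, seed=0):
    """Yield (train, test) lists for k-fold cross validation.

    Each item lands in exactly one test fold; the remaining k-1 folds
    form the training set for that round.
    """
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal folds
    for held_out in range(k):
        test_idx = set(folds[held_out])
        train = [items[i] for i in indices if i not in test_idx]
        test = [items[i] for i in folds[held_out]]
        yield train, test


# Stand-in for 10,000 labeled sentences.
sentences = list(range(10000))
for round_no, (train, test) in enumerate(k_fold_splits(sentences), start=1):
    print(round_no, len(train), len(test))
```

With 10,000 sentences and k = 5, each round trains on 8,000 sentences and evaluates on the remaining 2,000.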

5.5.3 Results

Table 5.3 shows the performance of MetaMap, OBA, TerMINE, the dictionary model and ADEPT on MH10K, MH1K and CT1K. ADEPT achieves the maximum score in every metric, bar recall. Moreover, its high performance carries over to the CureTogether test corpus (an F1 of 77.74% on CT1K, against 78.41% on MH10K), indicating adequate generalization from the training data. Figure 5.2 provides illustrative examples of the models’ performance on sample sentences from MH1K.
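For reference, the metrics reported in Table 5.3 can be computed from token-level confusion counts as sketched below. The function name `binary_metrics` and the example counts are illustrative assumptions, not values taken from the experiments.

```python
import math


def binary_metrics(tp, fp, fn, tn):
    """Precision, recall, F1, accuracy and Matthews correlation (MCC)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": accuracy, "mcc": mcc}


# Hypothetical counts for 100 tokens, 10 of which are truly medical.
m = binary_metrics(tp=8, fp=2, fn=2, tn=88)
print({name: round(value, 4) for name, value in m.items()})
```

Because most tokens in a sentence are non-medical, accuracy is dominated by true negatives; F1 and MCC are therefore the more informative columns when comparing annotators.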

Failure Analysis

While ADEPT’s results are promising, assessing points of failure is useful for future improvements and

implementations. Figure 5.3 plots term classification accuracy against logged term frequency in both test

corpora. We observe that while most terms are always correctly classified, a number of terms (∼650) are

never classified correctly. Of these, almost all (>90%) appear only once in the test corpora. A LOWESS

¹⁷ Note: this is actual output from our final classifier.
¹⁸ http://nlp.stanford.edu/software/CRF-NER.shtml


[Figure 5.2 repeats each of the following sample sentences once per annotator (ADEPT, Dictionary, MetaMap, OBA, TerMINE), with that annotator's tagged terms marked.]

it says proliferative ductal hyperplasia without atypia and non-proliferative duct ecstasia without carcinoma

last summer i was at home with my daughter who is now 2

in my case the woman my husband had an affair with reassured him twice she had no stds

i had a chest xray done and they said there was something in my lung

mgmt retail sales not overweight good almost great posture

Figure 5.2: Sample sentences labeled by ADEPT, the dictionary, MetaMap, OBA and TerMINE.

Table 5.3: Annotator performance against the crowd-labeled data set and the gold standards. Maximum column values are indicated in bold.

Validation data set              Annotator   F1     Precision  Recall  Accuracy  MCC   Parameters
Crowd-labeled MH10K              MetaMap     32.64  21.88      64.20   70.44     0.24  Default
                                 MetaMap     34.97  25.45      55.85   76.83     0.26  SNOMED CT
                                 MetaMap     34.88  24.48      60.63   74.75     0.26  CHV
                                 OBA         43.77  30.20      79.53   77.21     0.39  Default
                                 OBA         43.23  36.15      53.76   84.25     0.35  SNOMED CT
                                 Dictionary  46.18  32.34      80.75   79.02     0.42
                                 ADEPT       78.41  82.66      74.59   95.42     0.76
MedHelp gold standard MH1K       MetaMap     37.73  28.03      57.67   77.82     0.29  SNOMED CT
                                 OBA         45.78  32.10      79.31   78.04     0.41  SNOMED CT
                                 TerMINE     42.35  52.67      35.41   88.77     0.37
                                 Dictionary  37.30  26.34      63.89   74.98     0.29
                                 ADEPT       78.33  82.55      74.53   95.20     0.76
CureTogether gold standard CT1K  MetaMap     39.12  29.33      58.57   74.13     0.27  SNOMED CT
                                 OBA         47.28  33.56      79.91   74.74     0.40  SNOMED CT
                                 TerMINE     43.09  53.11      36.25   86.43     0.37
                                 Dictionary  38.74  27.53      65.35   70.65     0.27
                                 ADEPT       77.74  78.82      76.69   93.78     0.74


[Figure 5.3: scatter plot. Horizontal axis: ln(frequency) of term in test corpora, 0 to 7. Vertical axis: classification accuracy (%), 0 to 100. Size legend: 1 term, 10 terms, 100 terms, 500 terms.]

Figure 5.3: Term classification accuracy plotted against logged term frequency in test corpora. Purple (darker) circles represent terms that are always classified correctly; blue (lighter) circles represent terms that are misclassified at least once. A LOWESS fit line to the entire data set (black) shows that most terms are always classified correctly. A LOWESS fit line to the misclassified points (blue/lighter) shows that classification accuracy increases with term frequency.

fit to the points representing terms that were misclassified at least once shows that classification accu-

racy increases with term frequency in the test corpora (and by logical extension, term frequency in the

training corpus). As we might expect, over half (∼51%) of the misclassified terms occur with frequency

one in the test corpora. A review of these terms reveals no obvious term type (or set of term types)


Table 5.4: Examples of ADEPT’s misclassifications in the test corpora.

Frequently Misclassified (FP > 1, FN > 1): baby, bc, condition, doctor, doctors, drs, health, ice, natural, relief, short, strain, weight

Mostly False Positive (FP > 1, FN ≤ 1): accident, decreased, drinks, drunk, exertion, external, healthy, heavy, higher, lie, lying, milk, million, pants, periods, prevention, solution, suicidal . . . [37 more terms]

Mostly False Negative (FP ≤ 1, FN > 1): appointment, clear, copd, hiccups, lack, ldn, massage, maxalt, missed, nurse, physician, pubic, rebound, silver, sleeping, smell, tea, treat, tree, tx . . . [41 more terms]

Infrequently Misclassified (FP ≤ 1, FN ≤ 1): cravings, generic, growing, hereditary, increasing, lab, limit, lunch, panel, pituitary, position, possibilities, precursor, taste, version, weakness . . . [118 more terms]

likely to be incorrectly classified. Indeed, many are typical words with conceivable medical relevance

(e.g., gout, aggravates, irritated). Such misclassifications would likely decrease with more training data,

which would allow ADEPT to learn new terms and patterns.
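The per-term quantities plotted in Figure 5.3 can be derived from token-level predictions as sketched below. The helper name `term_accuracy` and the toy observations are hypothetical; the sketch assumes one (term, was_correct) record per term occurrence in the test corpora.

```python
import math
from collections import defaultdict


def term_accuracy(observations):
    """Map each term to (ln(frequency), classification accuracy in %).

    observations: iterable of (term, was_correct) pairs, one per
    occurrence of the term in the test data.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for term, was_correct in observations:
        totals[term] += 1
        correct[term] += bool(was_correct)
    return {
        term: (math.log(totals[term]), 100.0 * correct[term] / totals[term])
        for term in totals
    }


# Hypothetical records: "pain" is seen four times, "gout" once.
obs = [("pain", True)] * 3 + [("pain", False), ("gout", False)]
points = term_accuracy(obs)
for term, (log_freq, acc) in sorted(points.items()):
    print(f"{term}: ln(freq)={log_freq:.2f}, accuracy={acc:.0f}%")
```

A term seen only once can score only 0% or 100%, which is why singleton terms account for most of the never-correct points.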

It remains to investigate terms that are both frequent and frequently misclassified. Table 5.4 shows

terms from the test corpora that ADEPT misclassifies at least once. Immediately obvious is the pres-

ence of terms that are medical but generic, such as doctor, doctors, drs, physician, nurse, appointment,

condition, and health. These misclassifications likely stem from ambivalence in the training data; indeed,

Yetisgen-Yildiz and Pratt [201] find that human annotators have low certainty over whether to include

general terms such as these in medical term annotation tasks. In any case, specific instructions to

human annotators on how to handle generic terms, or rule-based post-processing of annotations, could

ameliorate such errors.
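The four groups in Table 5.4 follow directly from per-term false positive (FP) and false negative (FN) counts. The sketch below applies the thresholds given in the table; the function name `bucket_terms` and the example error list are hypothetical.

```python
from collections import Counter


def bucket_terms(errors):
    """Group misclassified terms by their FP and FN counts.

    errors: iterable of (term, kind) pairs, where kind is "FP" or "FN".
    Thresholds follow Table 5.4: a count greater than 1 is "frequent".
    """
    fp, fn = Counter(), Counter()
    for term, kind in errors:
        (fp if kind == "FP" else fn)[term] += 1

    buckets = {}
    for term in set(fp) | set(fn):
        many_fp, many_fn = fp[term] > 1, fn[term] > 1
        if many_fp and many_fn:
            name = "frequently misclassified"
        elif many_fp:
            name = "mostly false positive"
        elif many_fn:
            name = "mostly false negative"
        else:
            name = "infrequently misclassified"
        buckets.setdefault(name, []).append(term)
    return {name: sorted(terms) for name, terms in buckets.items()}


# Hypothetical error log.
errors = [("doctor", "FP"), ("doctor", "FP"), ("doctor", "FN"), ("doctor", "FN"),
          ("milk", "FP"), ("milk", "FP"),
          ("copd", "FN"), ("copd", "FN"), ("copd", "FP"),
          ("lab", "FN")]
buckets = bucket_terms(errors)
for name, terms in sorted(buckets.items()):
    print(name, terms)
```

A rule-based post-processing step of the kind suggested above could then target the first group, for example by fixing the label of generic terms such as "doctor".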

5.6 Example Applications of ADEPT to PAT

To illustrate ADEPT’s efficacy, we present two applications to PAT corpora. The first is to MedHelp’s

Arthritis forum, with an eye to summarizing its important medical concepts. In this application, we com-

pare ADEPT’s output with OBA’s. Our second application is to Forum77, MedHelp’s Addiction: Sub-

stance Abuse forum, in which our goal is to generate a high-level concept map of its medically relevant

content.


5.6.1 Summarizing Important Medical Content in MedHelp’s Arthritis Forum

A simple way of summarizing the medical content in a PAT corpus is to rank all relevant terms by

frequency, and select the top N . Figure 5.4 compares the top 50 medical terms in MedHelp’s Arthritis

forum as determined by ADEPT and the OBA. (We picked OBA instead of MetaMap due to its superior

performance – see Table 5.3). The terms recovered by ADEPT are both diverse and richly descriptive of

arthritic conditions; in contrast, the majority of terms recovered by the OBA are spurious, and serve only

to demote the rankings of the few relevant terms that it does find.
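The frequency-ranking summary amounts to a counting step over the terms an annotator extracts. The sketch below assumes the annotator's output is available as one list of extracted terms per sentence; `top_terms` and the sample data are hypothetical.

```python
from collections import Counter


def top_terms(tagged_sentences, n=50):
    """Return the n most frequent extracted terms with their counts.

    tagged_sentences: iterable of lists, each holding the terms that an
    annotator (e.g. ADEPT or OBA) extracted from one sentence.
    Terms are lowercased so that case variants are counted together.
    """
    counts = Counter(
        term.lower() for sentence in tagged_sentences for term in sentence
    )
    return counts.most_common(n)


# Hypothetical extractions from three sentences.
extracted = [["pain", "arthritis"], ["Pain"], ["knees", "pain"]]
print(top_terms(extracted, n=2))
```

Because the ranking depends only on counts, any spurious term that an annotator tags often will displace relevant terms, which is the effect visible in the OBA column of Figure 5.4.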

5.6.2 Navigating MedHelp’s Substance Abuse Forum (Forum77)

A natural way of acquiring a casual overview of a corpus’ content is to visualize both the important

medical terms, as well as significant relationships between them. Including term relationships imparts an

extra layer of insight to the underlying content. For example, if drug terms tend to co-occur in sentences,

then it is likely that users compare drugs in their discussions. On the other hand, if drug terms tend to

co-occur with symptom terms, then discussions likely document which drugs treat specific symptoms.

To acquire a high-level topography of Forum77’s medical content, we first apply ADEPT to the Forum77 corpus. Filtering out infrequent terms (terms that appear < 10 times in the corpus), we score connections between remaining co-occurring terms with the G² metric, which rewards significant (or interesting) co-occurrence relationships over common ones [78]. We then use Gephi¹⁹, a tool for graph analysis and visualization, to explore the results interactively.
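The scoring step can be sketched as follows, assuming that the metric is the log-likelihood ratio statistic computed over a 2×2 sentence-level contingency table for each term pair. Two further assumptions are made for illustration: term frequency is counted as the number of sentences containing the term, and the helper names `g2` and `score_pairs` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def g2(k11, k12, k21, k22):
    """Log-likelihood ratio statistic for a 2x2 contingency table.

    k11: sentences containing both terms   k12: first term only
    k21: second term only                  k22: neither term
    """
    n = k11 + k12 + k21 + k22
    row1, row2 = k11 + k12, k21 + k22
    col1, col2 = k11 + k21, k12 + k22
    cells = ((k11, row1, col1), (k12, row1, col2),
             (k21, row2, col1), (k22, row2, col2))
    # Sum observed * ln(observed / expected); empty cells contribute 0.
    total = sum(obs * math.log(obs * n / (r * c))
                for obs, r, c in cells if obs > 0)
    return 2.0 * total


def score_pairs(sentences, min_freq=10):
    """Score every co-occurring pair of sufficiently frequent terms."""
    n = len(sentences)
    term_freq = Counter(t for s in sentences for t in set(s))
    keep = {t for t, f in term_freq.items() if f >= min_freq}
    pair_freq = Counter()
    for s in sentences:
        for a, b in combinations(sorted(set(s) & keep), 2):
            pair_freq[(a, b)] += 1
    scores = {}
    for (a, b), both in pair_freq.items():
        only_a = term_freq[a] - both
        only_b = term_freq[b] - both
        neither = n - both - only_a - only_b
        scores[(a, b)] = g2(both, only_a, only_b, neither)
    return scores


# Hypothetical corpus: "detox" and "taper" always appear together.
corpus = [{"detox", "taper"}] * 10 + [{"flu"}] * 10
print(score_pairs(corpus))
```

The resulting pair scores can be exported as a weighted edge list for a graph tool. Note that the statistic measures departure from independence in either direction, so an implementation may also want to check that the observed co-occurrence exceeds the expected count.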

Note that what follows is a casual analysis in which we utilize Gephi’s internal filtering and clustering

features to facilitate rapid exploration. Our goal is to illustrate a typical point of departure in exploring

a novel corpus of ADEPT-extracted PAT terms. Figure 5.5 shows a co-occurrence graph over ADEPT-

extracted Forum77 terms, with node labels omitted to illustrate the underlying graph structure. Imme-

diately obvious is the presence of two, large, interlinked clusters (dark and light blue). A third cluster

(dark green) is more independent. We examine each of these clusters in greater detail by filtering out

non-member nodes, and recalculating the graph layout.

Figure 5.6 shows the largest (light blue) cluster with node labels. This cluster appears to detail gen-

eral aspects of addiction related to detoxification: suboxone and methadone are synthetic opioids used

in opioid-replacement therapy; detox and taper are direct detoxification references; many other nodes

¹⁹ http://gephi.github.io



ADEPT            OBA
pain             have
arthritis        pain
symptoms         doctor
joints           arthritis
knees            like
feet             help
hands            time
swelling         years
neck             symptoms
knee             right
fingers          did
ankles           work
legs             blood
tests            joint
joint            good
rheumatologist   does
diagnosed        need
swollen          months
meds             joints
disease          test
surgery          knee
treatment        day
leg              started
shoulder         ago
spine            try
doctor           is a
inflammation     tests
wrists           better
test             left
stiffness        hope
painful          long
diagnosis        year
arms             disease
toes             bad
fatigue          rheumatologist
shoulders        diagnosed
joint pain       here
wrist            days
bone             hands
muscles          old
arm              sure
osteoarthritis   weeks
foot             knees
hip              doctors
medication       normal
negative         cause
positive         lot
skin             got
cold             make

Figure 5.4: Top 50 terms, ranked by frequency, derived from MedHelp’s Arthritis forum as determined by ADEPT (left) and OBA (right). Terms unique to their respective portion of the list are shown in bold. Terms occurring in both lists are linked with a line. The gradient of these lines shows that all co-occurring terms, bar three, are more highly ranked by ADEPT.


Figure 5.5: A graph showing important terms in Forum77 (nodes), and significant co-occurrence relationships between them (edges). Node size is proportional to degree, while colors indicate clusters. Node labels are omitted for legibility; instead, we examine main clusters in-depth in subsequent figures.

detail withdrawal symptoms (anxiety, cramps, body aches, muscle-tremors, muscles-restlessness, etc.).

Overall, this cluster suggests that Forum77 hosts detailed discussions on the process and mechanisms

of opiate withdrawal.

Figure 5.7 illustrates the second-largest (dark blue) cluster. This cluster is almost clique-like, and its

core comprises primarily addictive prescription drugs: oxy (oxycodone), hydro (hydrocodone), xanax,

vicodin, benzo (benzodiazepine), etc. This cluster also details several withdrawal symptoms (tired, chills,


Figure 5.6: The largest cluster in Figure 5.5 suggests that discussions frequently involve detoxification from prescription drugs.

Figure 5.7: The second-largest cluster in Figure 5.5 suggests that discussions frequently pair specific drugs and the withdrawal symptoms that they cause.


Figure 5.8: The third-largest cluster in Figure 5.5 contains medically relevant terms from Thomas’ Recipe: a user-developed schedule for medication-assisted opioid withdrawal.

flu, etc.) as well as body parts (head, legs, skin, etc.), suggesting a great deal of discussion around

specific prescription opioids and their associated withdrawal symptoms.

Finally, Figure 5.8 shows the third-largest cluster (dark green). Like Figure 5.7, the structure is clique-

like. Its nodes constitute a combination of withdrawal symptoms (runny nose, general aches, leg cramps

etc.), terms representing wellness activities or supplements (mild exercise, cycling, vitamin b6, zinc,

l-tyrosine etc.), and non-opiate drugs (ativan, imodium, benzodiazepine). In hindsight, it is clear that

this cluster represents medically relevant terms from Thomas’ Recipe: a user-developed schedule for

medication-assisted opioid withdrawal that is popular on Forum77. We discuss Thomas’ Recipe in depth

in § 6.8.1.

These casual explorations of co-occurring ADEPT-extracted Forum77 terms suggest that withdrawal

is a primary topic of discussion on the Forum (Figures 5.6, 5.7). Moreover, users discuss specific drugs,

primarily prescription drugs (Figure 5.7). Without prior knowledge of Thomas’ Recipe (§ 6.8.1), guessing

that Figure 5.8 partially represented a detoxification protocol would be difficult, although the nodes opiate


detox and at-home self-detox might have provided a clue. Overall, our later work in this thesis shows

that these explorations yield accurate, although incomplete, insights into Forum77’s primary content.

5.7 Conclusion

Our work on ADEPT was prompted by the observation that despite the abundance of PAT, tools for

extracting medically relevant content from it are lacking. This, in turn, restricts general exploration and

hypothesis generation over PAT corpora. One major limitation to building such tools is a lack of large,

annotated corpora for training and testing statistical models.

Our first result addresses this by showing that a crowd of non-experts is a sufficient replacement for

medical experts in the PAT medical term identification task (§ 5.4). Through paying careful attention to

existing crowdsourcing design principles, we were able to design a prompt and task that resulted in labels

of comparable quality to those produced by experts (§ 5.4.1). Combined and aggregated according to

a corroborative vote, Turker responses achieve an F1-Score of 84% against our RNs’ gold standard

(§ 5.4.2). As crowds of non-experts are much easier to coordinate than medical experts, this opens

up the option of building large, labeled PAT corpora of high quality both quickly and cheaply. We note,

however, that not all tasks may be suitable to crowd labeling; those that are more subjective or require

specialized knowledge may involve particularly challenging task design (§ 5.4.4).

Next, we addressed the issue of automating the PAT medically relevant term identification task

(§ 5.5). ADEPT, our CRF classifier trained on crowd-labeled data, dramatically outperforms existing

tools MetaMap, OBA and TerMINE (§ 5.5.3). Moreover, ADEPT’s performance carries over to an in-

dependently sourced PAT gold standard from CureTogether. While one limitation of ADEPT is that it

does not identify specific term types (e.g., drugs, symptoms), it identifies terms of medical relevance with an F1 of roughly 78% on all three evaluation sets (Table 5.3). This makes it a useful and novel tool for summarizing and exploring PAT corpora (§ 5.6.2).

We attribute ADEPT’s success to the suitability of sentence-level, context-sensitive learning models

like CRFs to PAT medical term identification tasks. Our dictionary, trained on the same data as ADEPT,

achieves high recall because it collects many medical terms from training data, but it achieves low pre-

cision because it cannot discriminate between relevant and irrelevant invocations of these terms. Unlike

ADEPT, for example, the dictionary cannot learn that the word “sugar” is of particular medical relevance

when it co-occurs with the word “diabetes”. The third sentence in Figure 5.2 suggests that context-based

relevance detection may be problematic for MetaMap and OBA, too. In this sentence, the term “case” is


annotated because of its membership in SNOMED CT as a medically relevant term pertaining either to

a “situation” or a “unit of product usage”.

In concert, our contributions in this chapter constitute an alternative approach to medical term anno-

tation and identification. In Chapter 7 we leverage the lessons learned in this chapter to extract a specific

type of medical term from Forum77 discussions: users’ drugs of choice. First, however, in Chapter 6 we

investigate users’ motivations for participating in Forum77.

Chapter 6

What do People Seek on Forum77?

Forum77 is the largest community on MedHelp, which indicates that it provides something that users

need and find useful. But what do people seek through participation on Forum77? Insight into how and

why users engage with Forum77 is instructional in its own right, but also provides a valuable template

for planning future, targeted explorations of the corpus. Our goal in this chapter is to elucidate users’

motivations for initiating discussions on Forum77.

We first motivate our focus on the topic of addiction (§ 6.1) before covering related work (§ 6.2) and

summarizing the data sets used in this chapter (§ 6.3). Next, we conduct a thematic analysis, developing

a taxonomy of users’ reasons for participation (§ 6.5). In congruence with prior work, the two driving

motivations are seeking emotional support and seeking informational support. Within these categories

are sub-categories specific to the topic of substance abuse, such as seeking information on withdrawal

and expressing concern about relapse. The most prevalent label, accounting for over 30% of all initiating

posts, is the update: a status log devoid of requests for feedback.

Next, we discuss the training and evaluation of two binary statistical classifiers that can distinguish

emotional from informational posts (§ 6.6), and update from non-update posts (§ 6.7). Our classifiers

perform well, achieving F1-scores of 80.12% and 76.54% for emotional vs. informational and update vs.

non-update, respectively.

Finally, we present the results of applying these classifiers to the entire Forum77 corpus (§ 6.8). We

compare and contrast features such as thread longevity and response rates across thread categories.

We also present and discuss Thomas’ Recipe: a highly prevalent informational support artifact on Fo-

rum77 that we came across in the course of our analyses. We conclude that Forum77 serves both as

a user-generated and tested repository of medically-explicit knowledge on managing substance abuse


withdrawal, as well as a public platform where people broadcast their progress as a mechanism for seek-

ing emotional support and encouragement from others. In this latter capacity, Forum77 is similar to the

offline mutual help groups Alcoholics Anonymous (AA) and Narcotics Anonymous (NA); in its information

providing capacity, however, Forum77 is quite distinct, as AA and NA explicitly eschew the sharing of

medical information [133].

6.1 Why Study Addiction?

We focus on the topic of addiction for three primary reasons, which we expand on below. The first is that

addiction is highly prevalent. As such, any insights or results that arise from studying addiction could be

useful and impactful to a large number of people. Second, addiction is highly stigmatized. As a result,

people suffering from addiction are likely to turn online for help, and addiction-related PAT is likely to

contain information that is difficult to acquire through traditional medical channels. Finally, people are

turning online en masse for help with addiction. Forum77 is MedHelp’s largest forum, but, as we show

in Table 6.1, only one of several online forums dedicated to the topic of substance abuse recovery.

6.1.1 Addiction is Highly Prevalent

Drug and alcohol use disorders, in particular the escalating misuse of prescription drugs, present one

of the most pressing public health issues of the day. Addiction affects 16% of Americans ages 12 or

older (about 40 million people), far exceeding the number of people afflicted with heart disease (27

million), diabetes (26 million), or cancer (19 million) [4]. Deaths due to accidental drug overdose now

exceed deaths due to motor vehicle accidents [251]. In 2008, more than 36,000 deaths were due to drug

overdoses; of these, opioid pain reliever (OPR) overdoses accounted for more than heroin and cocaine

combined [3, 249]. Taking into account workplace, criminal justice, and health care costs, the burden of

prescription drug abuse on the U.S. economy was $56-$57 billion in 2006-2007 [27,115].

6.1.2 Addiction is Highly Stigmatized

Recent medical research argues that drug dependence is a chronic, relapsing and remitting disorder

that behaves just like other chronic illnesses with a behavioral component, such as Type II Diabetes

Mellitus [169]. Despite this, prescription opioid abuse is a highly stigmatized condition: the opinion


that opioid misuse is a flaw of a person’s moral character, rather than a legitimate medical condition, is

common [187].

This stigma carries into the medical profession. In general, medical professionals feel that addiction

lacks parity with other medical conditions in terms of prestige and importance [176]. In addition, there

is a mutual mistrust between addiction patients, who feel that they are mistreated and stigmatized and

receive poor medical care as a result, and their providers, who find it difficult to evaluate whether patients’

requests for opioids stem from genuine “medically indicated” needs or from addictive behavior [174].

The stigma is compounded by the fact that the most effective treatments for opioid use disorders are

methadone or buprenorphine-assisted replacement therapies, which require patients to continue taking

prescription opioids under the supervision of a medical professional [187]. Finally, as pain treatment

is often the starting point of a longer addiction to prescription opioids, it is common for people with

prescription drug use disorders to acquire their drug of choice via a doctor’s prescription [229,249].

6.1.3 People are Turning Online for Help with Addiction

People with substance use disorders are no exception to the trend of online health forum participation.

Myriad discussion forums focus on addiction recovery and are widely utilized. Table 6.1 describes a

representative sample of these that we curated during a brief search. The result of this is a massive,

growing and (until now) unexamined corpus of text in which users document their experiences with

addiction and their attempts at overcoming it.

6.2 Related Work

Emotional and informational support consistently emerge as the primary reasons for user engagement

in online health communities [36,47,86,122,131,148,149,162,211,243,250,258]. However, little work

attempts to extend analyses of users’ support giving, seeking, or reasons for participation to data sets

that are too large for manual annotation. We discuss this work here, referring the reader to § 2.2.3 for a

thorough discussion of users’ reasons for participation in online health communities, and to § 3.5 for a

summary of prior work on thematic analyses of PAT.


Table 6.1: Summary statistics of a representative sample of online health communities focused on addiction recovery. We identified sites through Google searches and gathered statistics (if available) from site pages. Data current as of 3/1/2014.

Name                      URL                                                   Members   Posts     Threads   Join to post  Join to read  Description
Forum77                   medhelp.org/forums/Addiction-Substance-Abuse/show/77  ∼51,153   ∼740,046  ∼80,529   Y             N             Single forum dedicated to recovery in general.
The Suboxone Talk Zone    suboxforum.com                                        ∼11,000   ∼77,000   ∼8,900    Y             N             Multiple forums focused on issues related to Suboxone.
Addiction Recovery Guide  addictionrecoveryguide.org                            N/A       700,000   N/A       Y             N             Collection of resources for assisting recovery; includes online forum.
Addiction Survivors       addictionsurvivors.org                                ∼15,870   ∼270,000  ∼17,500   Y             N             Forums focus on opiate, alcohol, benzodiazepine, and stimulant addiction.
Cyber Recovery            cyberrecovery.net                                     5,078     154,975   23,000    Y             Y             Multiple forums dedicated to recovery in general.
Sober Recovery            soberrecovery.com/forums                              132,964   >3.5 M    234,311   Y             N             Multiple forums dedicated to alcoholism and drug abuse recovery.

Wang et al. [250] successfully use workers on Amazon’s Mechanical Turk¹ (Turkers) to quantify the amount of emotional and informational support contained in both initiating and response posts on Breastcancer.org². They then use this data to train regression models that achieve correlation scores of 0.76 and 0.80 for emotional and informational content, respectively. Investigating whether certain types of support are important for member retention, they find that receiving high levels of emotional support predicts lower dropout risk.

Biyani et al. [28] manually labeled ∼1,000 sentences from the Cancer Survivor’s Network forum³ as

either emotional or informational. An ensemble classifier trained on this data achieved an F1-score of

84% (88% for emotional support, 77% for informational support). Their goal was to determine whether

influential and regular community members differed in terms of the types of support they provided on

the forum. They found that influential members offer significantly more emotional support than regular

community members.

¹ http://www.mturk.com
² http://breastcancer.org
³ http://csn.cancer.org


To our knowledge, no other prior work attempts to automatically classify informational and emotional

support in PAT. However, some work does investigate methods for labeling or featurizing these data at

scale. Vlahovic et al. [248] found that Turkers produced good labels for emotional and informational

support on posts from a breast cancer support forum. Finally, both Owen et al. [188] and Alpers et

al. [12] evaluate the efficacy of using LIWC⁴ to automatically identify emotions expressed in posts on

breast cancer support forums. While both find the tool reasonably accurate, they do not attempt to

analyze users’ motives for posting.

Unlike Wang et al. [250] and Biyani et al. [28], we investigate and discuss a more detailed taxonomy

of users’ reasons for participation. In addition to automatically classifying informational and emotional

support, we are also able to train a classifier to identify a specific sub-category of emotional support

posts: the update. While we leave the analysis of response post content to future work, we do investigate

response levels to different categories of initiating posts.

6.3 Data

For clarity, we briefly summarize the data sets used in this chapter.

6.3.1 Thematic Analysis Development Dataset

We use our Forum77 data set (§ 4.1.2) for this work. For our thematic analysis (§ 6.5), we used ∼1,000

initiating posts sampled uniformly at random for each iteration of the analysis, and evaluated inter-

annotator agreement on a 200-post subsample. With a total of 3 iterations, we used ∼3,000 initiating

posts sampled uniformly at random to conduct the thematic analysis.

6.3.2 Labeled Training & Testing Dataset

We created a data set for labeling and classifier training as follows: first, we curated a sample of initiating

posts from recurring Forum77 users by randomly sampling 200 users who had initiated 5 or more posts.

(We restricted the sample to recurring users in order to ensure a more balanced representation of tax-

onomy labels, as we observed in our thematic analysis (§ 6.5) that certain labels (e.g., support giving)

tend to appear only later in a user’s tenure.) Our 200 sampled users authored ∼32,000 initiating posts,

⁴ http://www.liwc.net


of which we took a random sample of 1,000 for subsequent coding. To prevent any user from dominating

the sample, we admitted no more than 30 posts per user.
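The sampling procedure can be sketched as below. The text does not specify how the per-user cap interacts with the random draw, so this illustration assumes a simple scheme: shuffle the pooled posts and accept each one unless its author has already reached the cap. The function name `sample_posts`, its parameters and the synthetic data are hypothetical.

```python
import random
from collections import Counter, defaultdict


def sample_posts(posts, n_users=200, min_posts=5, n_posts=1000, cap=30, seed=0):
    """Sample initiating posts from recurring users with a per-user cap.

    posts: list of (user_id, post_id) pairs for initiating posts.
    """
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for user, post in posts:
        by_user[user].append(post)

    # Step 1: restrict to recurring users, then sample users at random.
    eligible = sorted(u for u, ps in by_user.items() if len(ps) >= min_posts)
    users = rng.sample(eligible, min(n_users, len(eligible)))

    # Step 2: shuffle the pooled posts and accept them in order,
    # skipping any post whose author has already contributed `cap` posts.
    pool = [(u, p) for u in users for p in by_user[u]]
    rng.shuffle(pool)
    taken, sample = Counter(), []
    for user, post in pool:
        if taken[user] < cap:
            sample.append((user, post))
            taken[user] += 1
            if len(sample) == n_posts:
                break
    return sample


# Synthetic data: 300 recurring users (40 posts each), 50 one-off users.
posts = [(u, (u, i)) for u in range(300) for i in range(40)]
posts += [(1000 + u, (1000 + u, i)) for u in range(50) for i in range(3)]
sample = sample_posts(posts)
print(len(sample), max(Counter(u for u, _ in sample).values()))
```

Under this scheme a prolific user cannot contribute more than `cap` posts, while less active users are sampled in proportion to their posting volume.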

6.4 Who Posts?

Traditional demographic information such as gender, age, race and socioeconomic status is rarely dis-

cernible from Forum77 posts. However, we were able to determine other aspects of identity, namely

whether a user was posting on their own behalf or on behalf of someone else. We noted that most

users initiate posts in which they are the subject; occasionally, however, users initiate posts in which

someone else is the subject. These proxies range from concerned parents, to members congratulating

each other on clean time, to loved ones posting on behalf of an incapacitated member.

We defined the subject of the post to be self if the author is writing about her own addiction, associate

if the author is writing about someone else’s addiction, or n/a if this information is absent or indeterminate.

Two authors labeled our 1,000 initiating post training data sample with the subject label. Inter-annotator

agreement was 92%, with a Cohen’s kappa of 0.77.

The distribution of subject labels over the sample data set is: 85% self, 8% associate, and 7% n/a.

While most users post on their own behalf, a significant minority post on behalf of another. Moreover,

the number of posts in which the subject was indeterminate was higher than we expected. Such posts

typically consist of social chatter (e.g., talking about sports). As these results do not suggest anything

interesting or novel, we do not pursue this analysis at scale.

6.5 Users’ Objectives in Initiating Discussions

Thematic analyses are frequently used on PAT to identify structure and patterns in user behavior and

user-generated content (§3.5). To develop a taxonomy describing users’ objectives in initiating discus-

sions on Forum77, we use an adapted General Inductive Approach [236]: over the course of read-

ing ∼3,000 posts, two authors iteratively co-developed a taxonomy describing recurrent and emergent

themes in the posts. On each iteration, the authors used the taxonomy to independently label 1,000

randomly sampled posts. They then revised the rubric based on subsequent error analysis and inter-

annotator agreement scores calculated on a 200-post subsample. The authors executed a total of three

iteration cycles. Figure 6.1 illustrates our thematic analysis process.


[Figure 6.1 (diagram): Thematic Analysis. Node labels: Schema; Sample n=1,000; Label Set #1 n=600; Label Set #2 n=600; Error Analysis; Consult Addiction Specialist; Final Schema.]

Figure 6.1: Thematic analysis process. Orange edges indicate the iterative component of the analysis.

Table 6.2 presents our final taxonomy, which was reviewed and approved by an addiction specialist,

along with label prevalence in our labeled training data set. Table 6.3 presents sample text from posts in

each category in the taxonomy.

6.6 Classifying Informational vs. Emotional Support

6.6.1 Training Dataset Annotation and Agreement

Having finalized our taxonomy, two annotators used it to each label 600 of our 1,000 initiating post train-

ing data sample (§6.3.2). We annotated each post with its primary purpose using the most specific label

available. Inter-annotator agreement for specific purpose labels (Label in Table 6.2) was medium, with

agreement of 67% and Cohen’s kappa [50] of 0.62. Inter-annotator agreement on the three broader cat-

egories informational, emotional and neither (Category in Table 6.2), however, was high with agreement

of 87% and a Cohen’s kappa [50] of 0.78.


Table 6.2: Annotator-derived taxonomy for users’ objectives in initiating a post, with % prevalence in the 1,000-post labeled sample on the right. Note that 1) labels are mutually exclusive, and 2) “w/d” stands for “withdrawal”.

| Category | Label | Description | % |
|---|---|---|---|
| informational | w/d expectations | Questions on what to expect when going through withdrawal, especially regarding symptom severity and duration. | 11.8 |
| informational | w/d management | Questions about how to manage withdrawal and relieve symptoms. | 8.7 |
| informational | w/d method | Soliciting advice on how best to quit drug(s) of choice. Topics include method of quitting (cold turkey vs. tapering) and scheduling a time to detox. | 7.8 |
| informational | general information | Subject posts medical questions unrelated to withdrawal. | 8.5 |
| emotional | seek support | Specific requests for support (like keeping in thoughts, prayers, getting in touch). | 4.6 |
| emotional | give support | Primary purpose of the post is to offer encouragement to others, often via relating a personal story of overcoming addiction. | 9.9 |
| emotional | update | Posts that comprise a log-like report of the user’s current status. These are often highly detailed and contain no requests for feedback or support. | 35.5 |
| emotional | general guidance | Subject posts non-medical questions to the community. These often comprise advice for personal relationships and scenarios requiring moral judgement. | 5.0 |
| neither | relapse concern | Subject is worried that she is going to relapse. While rare, these posts typically forecast relapse due to a required medical procedure that will require prescription pain medication. These posts varied in their information vs. support leanings, so we excluded them from either category. | 2.8 |
| neither | n/a | Impossible to speculate on the purpose of the post. | 5.4 |

6.6.2 Classifier Training

To identify posts as either primarily informational or primarily emotional, we built a logistic regression

classifier (which outperformed Support Vector Machine and Naive Bayes classifiers) using the Stanford

CoreNLP toolkit5. For each post, we used the following features: the number of question sentences,

content unigrams and bigrams, positive and negative word counts with polarity score ≥ 0.8 in Senti-

WordNet [19], and number of days clean, if stated. The last feature was determined by applying the

patterns “X days/weeks/months clean” and “on day X” to the post text. A full feature list is documented in

Appendix B.
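Two of the features above lend themselves to a short illustration. The sketch below approximates the “X days/weeks/months clean” and “on day X” patterns with regular expressions and counts question sentences by their question marks. The exact expressions, the 30-day month, and the function names are assumptions made for this example; the authoritative feature definitions are those in Appendix B.

```python
import re

# Hypothetical regexes approximating the two patterns named in the text.
CLEAN_RE = re.compile(r"\b(\d+)\s+(day|week|month)s?\s+clean\b", re.IGNORECASE)
ON_DAY_RE = re.compile(r"\bon\s+day\s+(\d+)\b", re.IGNORECASE)
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30}  # month length is an assumption

def days_clean(text):
    """Return the stated number of days clean, or None if none is stated."""
    m = CLEAN_RE.search(text)
    if m:
        return int(m.group(1)) * DAYS_PER_UNIT[m.group(2).lower()]
    m = ON_DAY_RE.search(text)
    if m:
        return int(m.group(1))
    return None

def count_question_sentences(text):
    """Crude proxy: count runs of question marks as question sentences."""
    return len(re.findall(r"\?+", text))
```

A sentence splitter would give a more faithful question count; the question-mark proxy is used here only to keep the sketch dependency-free.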

5 http://nlp.stanford.edu/software/corenlp.shtml


Table 6.3: Descriptions and samples of taxonomy labels. Samples are synthesized in order to preserve user privacy.

| Label | Description (+ Additional Notes) | Synthesized Sample |
|---|---|---|
| w/d expectations | What to expect while going through w/d. (Typically users will ask how long symptoms will last, whether the symptoms are normal etc.) | I stopped long term methadone 12 days ago. I was wondering if anyone knows how long the anxiety RLS and hot/cold last? The other symptoms rnt too bad... |
| w/d management | How to handle w/d symptoms. Implies w/d e. (Typically users will try to source ideas for alleviating pain, RLS etc.) | I’m wondering about the Amino Acid protocol and Thomas recipe. What would be the most important to take from day 1 to 4 during the worst W/D symptoms? I know I suffer the most with RLS and chemical chills [...] |
| w/d method | User seeks information on how to quit a substance. (Include questions like whether to go c/t vs. taper, requests for tapering schedules or advice etc.) | I am taking 5000mg of vicodin currently daily can anyone help me with this? |
| general information | User seeks informational advice that is not related to quitting/withdrawal. (Several possibilities, including questions about how much would it take to overdose etc.) | I’m curious as to how long people were addicted/dependent to their DOC. I know using for longer makes it harder to quit, and each time you quit WDs are harder than before. As for me, I had a 12 year run with vics/oxies. |
| seek support | User explicitly requests emotional support from the community. (Request for emotional support should be explicit. Typically users will ask for help or prayers or thoughts.) | For those of you who are prayer warriors, please could you pray for my friend, for recovery and protection. Could you also please pray for his family - they are in a very hard place right now. Thank you! |
| give support | User imparts a strong message of encouragement to the community. (Look for terms like “so I just wanted everyone to know that it’s possible and you can do it”) | Hey y’all! Well today the depression paid me visit but I kept it caged! Anxiety about 20% Did a 2.5 mile run and that helped tons. I can’t say it enough: exercise really helps withdrawals. If you can then DO IT! When the wds hit don’t crawl into bed - get up and move! |
| update | Update the community on the user’s status | The only reason I’m not getting more is the stress involved in getting them and setting up a supply because you can’t have just one. WD today are ok not too bad. It’s my neck that’s killing me and my body laughing at the Advil I took. |
| general guidance | Non-medical advice that doesn’t fall into any of the above categories. (Typical examples include questions of how to deal with telling spouses about addiction, whether to cut off a family member etc.) | Do any of you guys have experience with giving a husband an ultimatum? It seems simple: Get treated or you’re out. But with 3 young children it’s actually quite complicated. Help. |
| relapse concern | Often patients claiming to be clean but need a medical procedure that will require pain meds. | i had an accident yesterday that got me stuck in the emergency room. today i’m 21 days off my roxies [...]. i’m scared of going back because I know i’ll be given pain meds [...] |
| n/a | Impossible to determine | I’ve been away for few days and everything seems different. Anyway I hope everyone is doing great. |


6.6.3 Classifier Performance

The final classifier performs well, achieving an accuracy of 80.98% in 10-fold cross validation versus

a baseline of 59.7% in which every post is labeled with the majority class. Table 6.4 shows precision,

recall, and F1 scores averaged over all 10 folds.

Table 6.4: Classifier performance for labeling initiating posts as seeking informational support or emotional support. Performance scores are averaged over 10 folds.

| Label | Precision | Recall | F1 |
|---|---|---|---|
| support | 84.57 | 83.40 | 83.84 |
| information | 76.18 | 77.12 | 76.41 |
| Average | 80.37 | 80.26 | 80.12 |
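As a reminder of how the reported metrics relate to one another, the sketch below computes precision, recall and F1 from confusion counts and takes an unweighted (macro) mean across labels. The function names and the counts in the test are illustrative assumptions; only the table values quoted afterwards come from this chapter.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 for one label from its confusion counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

def macro_average(per_label_scores):
    """Unweighted mean of (precision, recall, F1) triples across labels."""
    n = len(per_label_scores)
    return tuple(sum(s[i] for s in per_label_scores) / n for i in range(3))
```

The Average row of Table 6.4 appears consistent with such an unweighted mean of the two label rows: for precision, (84.57 + 76.18) / 2 = 80.375. Note that each per-label F1 in the table is itself averaged over folds, so it need not equal the harmonic mean of the averaged precision and recall.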

6.7 Classifying Updates vs. Non-updates

6.7.1 Classifier Training

To automatically label all posts with update or non-update labels, we again built a logistic regression

classifier, using the same training and testing dataset from § 6.6.1. The non-update posts contain all

posts that are not an update or n/a. We added two features to those used in our informational vs. emo-

tional classifier (§ 6.6): whether the post mentions a number of days (using the pattern: “day” or “days”

followed by a number), and time elapsed (days) since the user’s last initiating post. Table 6.2 shows that

the ratio of update to non-update posts is roughly 3:5 (35.5% update vs. 59.1% non-update). To compensate for this class imbalance, during

classifier training we randomly sub-sample such that non-update post quantity is at most 1.5x that of

update posts. We do not change the test set.
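The sub-sampling step can be sketched as below. The function name, the (features, label) pair layout, and the fixed seed are assumptions for illustration; the only detail taken from the text is the cap of 1.5 non-update posts per update post, applied to training data only.

```python
import random

def cap_majority(examples, minority_label, max_ratio=1.5, seed=0):
    """Sub-sample training examples so that examples with any other label
    number at most max_ratio times the minority-label examples.

    examples: list of (features, label) pairs (hypothetical layout).
    """
    rng = random.Random(seed)
    minority = [ex for ex in examples if ex[1] == minority_label]
    majority = [ex for ex in examples if ex[1] != minority_label]
    limit = int(max_ratio * len(minority))
    if len(majority) > limit:
        majority = rng.sample(majority, limit)  # sample without replacement
    combined = minority + majority
    rng.shuffle(combined)
    return combined
```

In a cross-validation setting this would be applied to each training fold separately, leaving the held-out fold at its natural class balance so that reported accuracy remains comparable to the majority-class baseline.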

6.7.2 Classifier Performance

Our classifier achieves an accuracy of 78.40% compared to the majority-class baseline accuracy of

62.55% in 10-fold cross validation. Table 6.5 shows precision, recall and F1 scores.


Table 6.5: Classifier performance labeling posts as either update or non-update. Performance scores are averaged over 10 folds.

| Label | Precision | Recall | F1 |
|---|---|---|---|
| update | 72.15 | 69.29 | 70.09 |
| non-update | 82.36 | 84.16 | 82.99 |
| Average | 77.25 | 76.72 | 76.54 |

6.8 Results

Users post primarily on their own behalf: In our sample, ∼85% of initiating posts were written by

the author on her own behalf, while only ∼8% were written on behalf of someone else. This differs from

reports by the Pew Research Center that find that ∼50% of online health inquiries are made on behalf

of another [90, 91]. It is possible that the stigmatized nature of addiction prevents users from disclosing

their situation to loved ones, who might otherwise ask questions on their behalf. Another possibility is

that the act of posting on Forum77 during the physically uncomfortable and painful process of withdrawal

is cathartic in and of itself: a benefit unavailable to proxy participants.

Informational and emotional support are the driving motivations for initiating discussion: In

congruence with prior work, our thematic analysis revealed that seeking informational and emotional

support drives user participation on Forum77. Applying our classifier to the entirety of the Forum77 data

set, we find that users seek both types of support in roughly equal proportion: 47% of all initiating posts

seek primarily informational support, while 53% of all initiating posts seek primarily emotional support.

This stands in contrast to our manually-annotated sample (Table 6.2) in which only 36.8% of initiating

posts are informational. Given that our machine-labeled sample comprises recurring Forum77 users,

one potential explanation for this is that longer-tenure or more involved users seek emotional support

more than users who post only a couple of times on the forum.

Informational posts seek explicit medical advice about withdrawal: Users primarily seek knowl-

edge on withdrawal methods, management and expectations in informational posts. Table 6.2 shows

that in our sample, roughly 77% of informational posts (28.3 of 36.8 percentage points) specifically discuss the topic of withdrawal. A

casual analysis of informational posts also reveals that the type of information requested by users is

often explicitly medical in nature, such as the pharmacological management of withdrawal. A prevalent


example of this is Thomas’ Recipe, an opioid withdrawal tapering schedule that has evolved on Forum77

over time (§ 6.8.1).

Informational threads receive fewer responses, but have a longer lifespan: Approximately 95%

of both informational and emotional initiating posts receive a response. Of these, initiating posts that pri-

marily seek emotional support receive more responses than those seeking informational support (mean

8.7 vs. 7.4, median 6 vs. 5). The distributions are significantly different (Mann-Whitney-U test, n1 =

39,553, n2 = 38,954, U = 758,376,673, p < 0.001).

The “lifespan” of a discussion is the number of days between its initiating post and the last response

on record. On average, initiating posts that seek primarily informational support have a lifespan 2.5

times as long as those that seek primarily emotional support (mean 74.4 days vs. 27.6 days, median 0

(< 24 hours) vs. 0). The difference between the two distributions is statistically significant (Mann-Whitney U test, n1 =

37,112, n2 = 41,395, U = 817,010,310, p < 0.001). Most (56% of informational and 59% of emotional)

discussions have a lifespan of 0 days (<24 hours). Excluding these, informational discussions remain

dominant in terms of lifespan (mean 170.3 days vs. 68.8 days, median 2 days vs. 1 day).
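The lifespan measure and the effect of excluding zero-day discussions can be illustrated with a small sketch. The function names and the toy values in the usage note are assumptions; no Forum77 data is reproduced here.

```python
from datetime import date
from statistics import mean, median

def lifespan_days(initiated_on, last_response_on):
    """Whole days between a thread's initiating post and its last response."""
    return (last_response_on - initiated_on).days

def summarize(lifespans, exclude_zero=False):
    """Return (mean, median) of lifespans, optionally dropping 0-day threads.

    Returns None when no values remain after filtering.
    """
    values = [d for d in lifespans if d > 0] if exclude_zero else list(lifespans)
    if not values:
        return None
    return mean(values), median(values)
```

For a toy list such as `[0, 0, 0, 2, 10]`, the median is 0 while the mean is 2.4, which illustrates how a majority of same-day threads can coexist with a mean lifespan of many days when a few threads are revived much later.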

Update posts are the most prevalent type of emotional post: Our classifier identifies some 15,000

out of ∼55,000 (30%) initiating posts as updates. Update posts comprise a log-like status update of

the user’s current condition, and rarely explicitly request any sort of response from the community. For

example:

I was used to taking 8-10 5/325 oxycodones a day. Havent taken any of them since Friday

but I took one Oxy 40mg Sat and one on Sunday morning. Its been almost 24 hrs and not to

bad so far but im sure there is more to come.

Despite the lack of specific requests, update posts do indeed trigger a community response, as we

discuss in the next paragraph.

Update posts have more responses & more unique contributors, but shorter lifespans: To further

assess the role that update posts play, we compared several features of threads that were initiated by

update vs. non-update posts. Update threads have a shorter average lifespan than other threads (mean

= 10.8 days vs. 30.0, sd = 88.8 vs. 151.1; t435332 = -18.2, p < 0.001). It is possible that the personal

nature of update posts makes them difficult to repurpose. Other differences are small: on average,


[Figure 6.2 (diagram): nodes Update and Non-update; edge labels 55% / 9.7 days, 45% / 4.4 days, 71% / 22 days, 29% / 8.2 days.]

Figure 6.2: Normalized transition probabilities and average transition times between consecutive update and non-update posts.

threads initiated by update posts net slightly more responses (mean 7.19 vs. 6.65; t27230 = 7.2, p <

0.001) and slightly more unique contributors (mean 4.91 vs. 4.35; t27126 = 10.6, p < 0.001).

Time elapsed between consecutive update posts is short: Figure 6.2 shows users’ transition fre-

quencies between initiating update and non-update posts, along with the average number of days be-

tween transitions. Users posting consecutive updates do so in comparatively quick succession, averag-

ing 4.4 days between each update.
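Normalized transition probabilities and mean transition times of the kind shown in Figure 6.2 can be computed along the following lines. This is a sketch under assumed inputs: each user's initiating posts are represented as a chronologically sorted list of (day_number, label) pairs, and the function name is hypothetical.

```python
from collections import defaultdict

def transition_stats(user_sequences):
    """Pool consecutive-post transitions over all users.

    user_sequences: list of per-user lists of (day_number, label) pairs,
    each sorted by day (hypothetical layout).
    Returns {(from_label, to_label): (probability, mean_gap_days)}, where
    probabilities are normalized over transitions leaving from_label.
    """
    counts = defaultdict(int)
    gap_sums = defaultdict(float)
    for posts in user_sequences:
        for (day_a, label_a), (day_b, label_b) in zip(posts, posts[1:]):
            pair = (label_a, label_b)
            counts[pair] += 1
            gap_sums[pair] += day_b - day_a
    outgoing = defaultdict(int)
    for (src, _dst), c in counts.items():
        outgoing[src] += c
    return {
        pair: (c / outgoing[pair[0]], gap_sums[pair] / c)
        for pair, c in counts.items()
    }
```

Normalizing by the source label means the probabilities on the edges leaving each node sum to one, matching the "normalized" wording of the figure caption.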

6.8.1 Thomas’ Recipe: An Informal Collaboration

During our analysis, we noticed that not only do users share explicit medical advice with one another:

they test, evaluate, modify and re-share it. In other words, users informally collaborate on developing

treatment protocols that are effective at assisting withdrawal. A prevalent example of this on Forum77 is

Thomas’ Recipe.

Thomas’ Recipe6 is a detailed treatment protocol for medication-assisted opioid withdrawal manage-

ment. It was written in the early 2000’s7 by a Forum77 user who had years of experience detoxing from

opioids, but no medical qualifications. Over the years, the original Thomas’ Recipe has evolved. Ta-

ble 6.6 shows a version of Thomas’ Recipe from circa 2000, while Table 6.7 shows a version from circa

2006. While the core content remains, the newer version has a great deal more structure and formal-

ization. Details of the recipe have also changed. For example, the older recipe recommends a 4000mg

6 http://www.medhelp.org/tags/health_page/45/Addiction/Thomas-Recipe-Re-Posted?hp_id=16

7 While our data set officially starts in 2007, it also contains some posts from as far back as 1999. We believe that this was either a pilot program or another forum that was acquired by MedHelp.


dose of L-Tyrosine, while the newer recipe suggests beginning with a 2000mg dose and scaling up as

necessary.

An informal assessment of iterations of Thomas’ Recipe on Forum77 suggests that these changes are

a result of user testing and feedback. Users’ comments, too, suggest that over time, they have modified

Thomas’ Recipe to make it more generally applicable and effective:

“I’m actually doing pretty good I’ve taken the Thomas recipe from day 1 but I’ve also added

Vitamin D, and niacin.”

“I have a modified Thomas Recipe that seems to have done wonders on my withdrawals if

anyone is interested. (No Xanax or Valium etc) Added Potassium pills, Ensure protein drinks

(since I cant eat anything solid yet).”

“If it helps any, I did a modified Thomas’ Recipe. I didn’t use any pharmaceuticals and added

some additional supplements (Magnesium, Potassium and Calcium for RLS and Melatonin

for sleep).”

Thomas’ Recipe is wildly popular on Forum77. Approximately 1.72% of all posts in our data set

mention it directly. Moreover, it is not constrained just to MedHelp: these days Thomas’ Recipe is hosted

on a number of addiction recovery sites8 9 10 11, and a Google search for “Thomas’ Recipe” brings up

sponsored advertisements for opiate withdrawal remedies.

The recipe’s prevalence is likely testimony to the fact that it does genuinely assist the process of

opiate withdrawal. Forum77 users swear by its efficacy, calling it a “life saver”, a “god send”, and some-

thing that “works wonders”. To evaluate the efficacy of Thomas’ Recipe, we showed it to a psychiatrist

specializing in addiction. She noted that not only was the recipe very similar to a treatment she might

have recommended professionally, but also that it contained novel elements that would facilitate the

withdrawal process.

6.9 Discussion

8 http://www.drugs.com/forum/featured-conditions/thomas-recipe-opiate-withdrawal-35169.html

9 http://www-personal.umich.edu/~timaster/biopsych/home.html

10 http://opiatewithdrawaltips.com/thomas-recipe

11 https://www.drugs-forum.com/forum/showthread.php?t=12568

Forum77 serves as a valuable, user-generated repository of medical information pertaining to the process of addiction recovery. Moreover, this information is not static: it is curated, tested and modified. As


we saw in the example of Thomas’ Recipe (§ 6.8.1), users actively collaborate on developing effective

treatment protocols. The continual evolution of informational artifacts on Forum77 is likely a contributing

factor to the fact that informational discussions have significantly longer lifespans than emotional dis-

cussions. Another factor that we have observed lengthening the lifespan of informational discussions is

that some users repurpose them, sometimes years after the initial post, to describe their own situation.

In doing so, users may feel that they are not starting from scratch, that they have a ready made descrip-

tion of their condition, or that they are leveraging work that the previous initiator put into finding other

Forum77 members who could address their specific issue.

While users do explicitly seek emotional support on Forum77, most emotional posts are not explicit

requests, but rather, update posts. The prevalence of the update post suggests that users place value

in having a community bear witness to their struggle with addiction. The fact that update posts garner

slightly more responses on average than non-update posts shows, too, that responses are expected. It

is possible that users publicize update posts (rather than writing them, for example, in a private journal)

as a self-enforcement mechanism to help them progress with cessation. Qualitative evidence shows that

users feel a great deal of embarrassment and shame when a withdrawal attempt fails, and that failing

may even delay their return to the community.

In addition to having a community of witnesses, users derive utility from the process of documentation

itself. Authors find it valuable to reflect upon their past posts, which serve as reminders and evidence

of both accomplishments and regressions. For example, one user reflects on something that she was

scared to do:

I just found some old post about no desire for sex. Whew! I was so scared to ask the

question.

Another laments a relapse:

I cna’t believe I’m at 25 days when I was in the hundereds before. I’m so angry at myself for

relapsing and still keep beating myslef up!!

Readers, too, find others’ chronicles both informative and illustrative. This user mentions reading

through hundreds of old posts to glean insight into what his withdrawal will be like:

This is my first d/x and pray that it will be my last. I’ve read through tons of old posts and

they definitely help.


Another poster used narratives on Forum77 to help her husband prepare for the process of her

recovery:

i have showed him this site and let him read some of your stories, so he knows its not all

going to be plane sailing

6.9.1 Limitations and Future Work

The primary limitation to our work is our requirement that a post be labeled as either informational or

emotional. In our experience, while only one of these labels tends to be dominant in an initiating post,

Wang et al. [250] and Biyani et al. [28] do show that finer-grained labeling is possible at scale. Although

picking the dominant label was sufficient for examining our analysis questions, a more nuanced analysis

might benefit from more detail.

A natural avenue for future work is to analyze response posts in addition to initiating posts. While

Wang et al. [250] utilize the same scales of emotional and information support in scoring both initiating

and response posts, our informal analyses of Forum77 response posts suggest that response categories

would require an entirely new descriptive taxonomy. (For example, a fairly common response tactic

that we observed that does not manifest in Table 6.2 is the hijack : when a user attempts to shift the

focal attention of active thread participants away from the initiator and onto herself, usually by claiming

identical circumstances to the initiator. This tactic often kills the thread.) Having derived this taxonomy,

however, one could start to ask questions such as, “What is the most effective way of getting informational

support?”, or “What types of initiating threads draw a diverse crowd of respondents?”.

6.10 Summary

We set out in this chapter to answer the question: “What do users seek on Forum77?”. We first motivated

our focus on the topic of addiction, noting that both its prevalence and stigma make it a potentially

rewarding focus of study (§ 6.1). We then presented related work on identifying types of support seeking

on online health forums (§ 6.2), and described the data samples used in this chapter (§ 6.3).

Through conducting a thematic analysis over a sample of initiating posts, we found that, in congru-

ence with prior work, users seek both informational and emotional support on Forum77. Moreover, we

discovered that the most prevalent form of emotional support seeking was to issue update posts: es-

sentially status logs containing no explicit request for a community response (§ 6.5). With some feature


engineering, we were able to train two binary statistical classifiers to distinguish emotional from informa-

tional posts (§ 6.6), and update from non-update posts (§ 6.7), with F1 scores of 80.12% and 76.54%,

respectively. Applying our classifiers to the entire Forum77 data set, we then analyze differences between

these post categories (§ 6.8). We find, for example, that informational posts have a longer lifespan than

emotional posts, and that while update posts make no explicit request for feedback, they garner more

responses on average than non-update posts. We also analyze Thomas’ Recipe (§ 6.8.1), an informa-

tional artifact of Forum77 that provides users with instructions for medication-assisted detoxification from

opioids.

In conclusion, Forum77 provides two main services to users: first, it serves as a repository of in-

formation on opioid abuse that is generated, tested, and modified for improved efficacy by community

members. Second, it offers a space where the disclosure of personal progress (whether forward or

backward) can be witnessed by others and recorded for posterity. In Chapter 7 we turn our attention to

identifying which drugs Forum77 users abuse.


Table 6.6: Thomas’ Recipe (circa 2001)

THOMAS RECIPE

Here’s my tried-and-true do-it-yourself “cold turkey” detox protocol.

Supplies you’ll need first:

As many Valium, Xanax, Librium or Klonopin as you can get your hands on.

– first day off the opiate, use enough Valium or whatever, to, if possible, sleep through most of the first couple days. Then start decreasing the dose until you’re down to nothing in about 5 or 6 days. You’ll have to do the math. The Valium or one of its sister drugs will help tremendously with the anxiety and, somewhat, with the body aches. Valium may make you eat like a pig and, when withdrawing from narcotics, one usually craves sweets, so I’d be ready to indulge myself, along with some good escapist movies. That always worked for me.

Around-the-clock access to either hot baths or a Jacuzzi.

– speaking of those goddamn mostly thigh cramps that seem to love to show up in the middle of the night, have that hot bath or Jacuzzi at the ready. Don’t hesitate to spend the majority of the week in that hot water if that’s what it takes to get you through it. You may be wrinkled, but you’ll have your sanity. Don’t underestimate what the hot baths can do to relieve the withdrawal discomfort. They really work. Heating pads between the thighs can help with those cramps, too, but not as much as the hot baths.

Brand-name-only Imodium (over the counter at the supermarket)

– if you’re a normal hydro addict, you’ll be getting the runs by no later than the second or third day off the lorcet. In my experience, it’s an especially unpleasant variety. At the first impulse, take two or three and respond to returning urges with two tabs. It’s important that you do it immediately.

L-Tyrosine (qty 50 of the 500mg caps) - an amino acid available at the health food store.

– chronic use of narcotics depletes the brain of several critical neurotransmitters responsible for well-being and mental performance and attitude.

Plus: Bottle of 100 mg B6 caps

My experience detoxing with this stuff says take 4000 (four thousand) mg. (8x500mg caps of L-Tyrosine) with two 100mg B6 caps every day for your “detox week” to provide your brain with the raw material it needs to replenish its stores of these neurotransmitters. Many feel the difference on the very first dose. ***Take it on an empty stomach, either first thing in the morning or at bedtime. You can continue this regimen after the first week if it continues to make you feel good. I continue to use it every other day with very few exceptions. After a few weeks, I cut down on the dosage, though, as it can cause the runs at high doses.

Multi-vitamins (most junkies don’t eat too well, so this one’s just for good sense).

Take a look at this link. According to this doc, you also need to add copper, phosphorus and Vitamin C to replenish the dopamine, and the norepinephrine. You might have to do some hunting at the health food store to find the right vitamin or vitamins to supply all this stuff. I got a pretty good result from just the L-Tyrosine and B6, however.

I also understand from another contributor that zinc and magnesium help replenish and restore vital substances depleted by narcotics use.

WARNING: This same site says to avoid L-Tyrosine if you’re on an SSRI (serotonin reuptake inhibitor) such as Prozac, etc.

Good luck.

Thomas

Sourced on 9-02-2014 from: http://www.medhelp.org/posts/Addiction-Substance-Abuse/How-Long-Untill-You-Are-Normal/show/43582


Table 6.7: Thomas’ Recipe (circa 2006)

THOMAS RECIPE

PLEASE NOTE: I am not a doctor, simply a long-time Rx opiate junkie who has had many opportunities to develop a way to detox. This is a recipe for at-home self-detox from opiates based on my experience as well as that of many other addicts. It is not intended as professional medical advice. It is always wise to make sure none of the recipe ingredients or procedures conflict with medications you may be taking. Likewise, if you have any medical condition, disease, allergy or any other health issue, consult your doctor before using the recipe.

Thanks, Thomas

If you can’t take time off to detox, I recommend you follow a taper regimen using your drug of choice or suitable alternate – the slower the taper, the better.

For the Recipe, You’ll need:

1. Valium (or another benzodiazepine such as Klonopin, Librium, Ativan or Xanax). Of these, Valium and Klonopin are best suited for tapering since they come in tablet form. Librium is also an excellent detox benzo, but comes in capsules, making it hard to taper the dose. Ativan or Xanax should only be used if you can’t get one of the others.

2. Imodium (over the counter, any drug or grocery store).

3. L-Tyrosine (500 mg caps) from the health food store.

4. Strong wide-spectrum mineral supplement with at least 100% RDA of Zinc, Phosphorus, Copper, Magnesium and Potassium (you may not find the potassium in the same supplement).

5. Vitamin B6 caps.

6. Access to hot baths or a Jacuzzi (or hot showers if that’s all that’s available).

How to use the recipe:

• Start the vitamin/mineral supplement right away (or the first day you can keep it down), preferably with food. Potassium early in the detox is important to help relieve RLS (Restless Leg Syndrome). Bananas are a good source of potassium if you can’t find a supplement for it.

• Begin your detox with regular doses of Valium (or alternate benzo). Start with a dose high enough to produce sleep. Before you use any benzo, make sure you’re aware of how often it can be safely taken. Different benzos have different dosing schedules. Taper your Valium dosage down after each day. The goal is to get through day 4, after which the worst WD symptoms will subside. You shouldn’t need the Valium after day 4 or 5.

• During detox, hit the hot bath or Jacuzzi as often as you need to for muscle aches. Don’t underestimate the effectiveness of hot soaks. Spend the entire time, if necessary, in a hot bath. This simple method will alleviate what is for many the worst opiate WD symptom.

• Use the Imodium aggressively to stop the runs. Take as much as you need, as often as you need it. Don’t take it, however, if you don’t need it.

• At the end of the fourth day, you should be waking up from the Valium and experiencing the beginnings of the opiate WD malaise. Upon rising (empty stomach), take the L-Tyrosine. Try 2000mgs, and scale up or down, depending on how you feel. You can take up to 4,000mgs. Take the L-Tyrosine with B6 to help absorption. Wait about one hour before eating breakfast. The L-Tyrosine will give you a surge of physical and mental energy that will help counteract the malaise. You may continue to take it each morning for as long as it helps. If you find it gives you the “coffee jitters,” consider lowering the dosage or discontinuing it altogether. Occasionally, L-Tyrosine can cause the runs. Unlike the runs from opiate WD, however, this effect of L-Tyrosine is mild and normally does not return after the first hour. Lowering the dosage may help.

• Continue to take the vitamin/mineral supplement with breakfast.

• As soon as you can force yourself to, get some mild exercise such as walking, cycling, swimming, etc. This will be hard at first, but will make you feel considerably better.

—Thomas

Sourced on 9-02-2014 from: http://www.drugs.com/forum/featured-conditions/thomas-recipe-opiate-withdrawal-35169.html

Chapter 7

Identifying Drugs of Choice

Monitoring drug use at a population level is crucial for observing, managing, and responding to substance

abuse-related issues, such as the emergence of new “designer drugs”, or the existence of particularly

vulnerable populations. Drug use trends could also be useful for exploring more theoretical aspects of

addiction, such as the Gateway Hypothesis [139], which proposes that drug use follows a progressive

and hierarchical sequence in which the user begins with legal addictive substances (e.g. alcohol and

cigarettes), before progressing onto marijuana and, finally, illicit substances.

The stigmatized [174,176,187] and often illegal nature of substance abuse, however, can make such

data collection difficult. Existing substance abuse surveillance efforts are restricted to convenient popula-

tions: schools (Monitoring the Future1), hospital emergency room visits (Drug Abuse Warning Network2),

state-run treatment facilities (Treatment Episode Dataset3), and in-person mutual help groups (Narcotics

Anonymous4). However, as membership in each of these populations can be compelled, these surveys,

while large-scale and thorough, fail to capture a more representative sample of drug users.

Despite the fact that millions of people voluntarily participate in online health communities for sub-

stance use disorders, almost no prior work attempts to derive drug usage data from PAT. Our goal in this

chapter is to profile substance use in the Forum77 population, and to compare this against traditionally

surveyed drug-using populations. We begin by developing a method for automatically identifying Fo-

rum77 users’ drugs of choice (DOCs) from their initiating posts (§ 7.3). As this task is context-sensitive,

we build on lessons learned in Chapter 5 and train a conditional random field (CRF) classifier that identi-

fies DOCs with F1, Precision and Recall scores of 84.65%, 91.12% and 79.46%, respectively. Next, we

1 http://www.monitoringthefuture.org
2 http://www.samhsa.gov/data/dawn.aspx
3 http://wwwdasis.samhsa.gov/webt/information.htm
4 http://www.na.org/?ID=PR-index


CHAPTER 7. IDENTIFYING DRUGS OF CHOICE 84

manually develop a map for resolving equivalent entities (e.g. Vicodin and Hydrocodone) extracted by our

classifier, and mapping these to classes.

Applying our classifier to the entire Forum77 data set, we develop a profile of substance use in

the Forum77 population. We contrast this with survey data on the face-to-face peer recovery group

Narcotics Anonymous (NA), as well as survey data on individuals who present to addiction treatment

centers (TEDS) and emergency rooms (DAWN) (§ 7.4). After normalizing each data set for comparison

(§ 7.4), we present both comparative results as well as substance use trends on Forum77 over time

(§ 7.5). Compared to other measured drug-using populations, prescription opioid use is highly prevalent

in Forum77, while use of more traditionally-abused substances (e.g. alcohol, marijuana and cocaine)

is notably scarce. Over time, opioid replacement therapy drugs have become increasingly prevalent on

Forum77, while use of other prescription opioids has declined. We discuss possible explanations for and

implications of these results (§ 7.6) before concluding (§ 7.7).

7.1 Related Work

Two branches of prior work apply to this chapter: the primary one is syndromic surveillance, which

is concerned with the use of health-related data for the purpose of detecting, analyzing and

monitoring potential disease outbreaks [128]. We discuss syndromic surveillance in depth in § 3.2. The

second is work related specifically to observing substance abuse trends in online data, which we discuss

below.

Surprisingly little work attempts to survey substance use via online data, although the potential for

doing so has been recognized [44, 113]. In August 2014, the National Institute on Drug Abuse (NIDA)5

announced the funding of a 5-year initiative to build a substance abuse surveillance system using web

data [113]. A related system, the “Psychonaut Web Mapping Project”, already exists in Europe,

and has demonstrated an ability to give timely and accurate information related to the outbreak of novel

drugs [73]. The project aggregates data scraped from myriad sites, including discussion forums, online

stores, and Google search queries, the latter of which have also been shown to correlate with demand for

specific substances [65]. This is unsurprising, given that the Internet plays host to a highly competitive

market for illicit substances [54, 244]. Dasgupta et al. [66] were even able to show that black market

prices for prescription opioids can be accurately assessed via crowdsourcing. Although sparse, this

5 http://www.drugabuse.gov


prior work supports the supposition that PAT is a promising data source for extracting substance use

data.

7.2 Datasets

Users typically offer information about the substance(s) they are using in initiating posts, in which they set

the tone and topic of discussion, and disclose the issue for which they are seeking help. As respondents

may or may not offer similar information about themselves, we restrict our analysis to Forum77’s initiating

posts, of which there are 78,507 authored by a total of 28,005 unique users.

Training & Testing Dataset Our classifiers require labeled data for training. As we felt that our fa-

miliarity with the data set would expedite labeling and reduce errors, we use 500 posts from the 1,000

initiating-post sample described in § 6.3.2. For completeness, we re-specify our sampling methodology

from § 6.3.2 here: first, we curated a sample of initiating posts from recurring Forum77 users by ran-

domly sampling 200 users who had initiated 5 or more posts. Our 200 sampled users authored ∼32,000

initiating posts, of which we took a random sample of 1,000 for subsequent coding. To prevent any user

from dominating the sample, we admitted no more than 30 posts per user.
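
The sampling procedure above can be sketched as follows. This is a minimal illustration, not the code used for the dissertation: the function name, the `(user_id, post_id)` input format and the choice to apply the 30-post cap before drawing the final sample are all assumptions, since the text does not say how the cap was enforced.

```python
import random
from collections import defaultdict


def sample_initiating_posts(posts, n_users=200, min_posts=5,
                            n_posts=1000, per_user_cap=30, seed=0):
    """Sample initiating posts from recurring users.

    posts: list of (user_id, post_id) pairs, one per initiating post
    (a hypothetical input format).
    """
    rng = random.Random(seed)
    by_user = defaultdict(list)
    for user_id, post_id in posts:
        by_user[user_id].append(post_id)

    # Recurring users: those who initiated at least `min_posts` posts.
    eligible = sorted(u for u, p in by_user.items() if len(p) >= min_posts)
    users = rng.sample(eligible, min(n_users, len(eligible)))

    # Assumption: cap each user's contribution first, then sample the pool,
    # so that no user can exceed `per_user_cap` posts in the result.
    pool = []
    for user in users:
        own = by_user[user][:]
        rng.shuffle(own)
        pool.extend((user, post_id) for post_id in own[:per_user_cap])
    return rng.sample(pool, min(n_posts, len(pool)))
```

Fixing the seed makes the sample reproducible, which matters when the same posts are later split between annotators.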

Analysis Dataset We conduct our final analysis on all of Forum77’s initiating posts (78,507 posts

authored by some 28,005 unique users).

7.3 Automatically Identifying Drugs of Choice

In this section, we describe how we automatically identify DOCs from Forum77 initiating posts. After

defining the term drug of choice, we manually annotate our training & testing data set. Next, we train

a CRF classifier to automatically identify drugs of choice in Forum77 initiating posts. Finally, we resolve

the extracted DOC entities to specific categories to facilitate analysis and comparison.

7.3.1 Definition of Drug of Choice

In the context of Forum77 data, we define a drug of choice (DOC) as any substance that the user

indicates that she is, or was, addicted to. Such indications can be direct (e.g. “I am addicted to

percs/patches”) or implied (e.g. “I need to get off 32mgs subox”). We also include as DOCs phrases that


unequivocally imply a misused substance (e.g. “chasing the dragon” implies opium, “blazing” implies

marijuana), although we found such occurrences to be rare.

Identifying DOCs in Forum77 text is a context-sensitive task: whether a substance plays the role

of treatment or addiction depends on the user. Methadone and buprenorphine, opioids used in opioid

replacement therapy, are common examples. Valium, which is both an addictive benzodiazepine and an

ingredient in Thomas’ Recipe for aiding opioid withdrawal (§ 6.8.1), is another.

7.3.2 Data Annotation

Using the definition above, two authors each labeled DOCs in 300 of the 500 posts in our sample. Inter-

rater agreement calculated on the 100 overlapping posts was high, with a Cohen’s kappa [50] of 0.84.

Of the total sample, 276 (∼ 55%) of posts contained DOC mentions.
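
For reference, Cohen's kappa compares observed agreement with the agreement expected by chance. The sketch below assumes one label per annotated item from each rater (for example, whether a post contains a DOC mention); the text does not state the unit at which agreement was measured, so that choice and the function name are illustrative only.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two raters who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of the raters' marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c]
                   for c in freq_a.keys() | freq_b.keys()) / (n * n)

    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)
```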

7.3.3 Classifier Training & Evaluation

As discussed in § 5.5.1, conditional random field (CRF) models are particularly well suited to identifying

specific entities in text [151]. CRFs are also context sensitive. For example, a CRF could leverage other

words in a sentence to determine whether a term like methadone refers to a substance being abused

vs. a substance being used as a treatment. This, in addition to the fact that prior work has successfully

utilized CRF models to identify a variety of medical terms [159, 222], makes a CRF an appropriate choice for

the challenge of identifying DOCs in text.

We trained a CRF to automatically identify DOCs mentioned in initiating posts on our labeled training

and testing data set. For training, we exclude annotations of general drug terms such as pills, meds and

drugs. As we observed in our work on ADEPT in Chapter 5, generic terms are uninformative as well as a

significant source of classifier error [201]. For full documentation of classifier features, see Appendix C.
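
To illustrate what context sensitivity means at the feature level, the sketch below builds a feature dictionary for each token from the token itself and its neighbours. The features shown are hypothetical and chosen only for illustration; the features actually used by our classifier are those documented in Appendix C.

```python
def token_features(tokens, i, window=2):
    """Illustrative features for tokens[i]: the token plus nearby context."""
    tok = tokens[i]
    feats = {
        "lower": tok.lower(),
        "suffix3": tok.lower()[-3:],
        "has_digit": any(ch.isdigit() for ch in tok),
    }
    # Neighbouring words let a sequence model tell "get off subox" (abuse)
    # from "prescribed suboxone" (treatment).
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        feats[f"ctx[{offset:+d}]"] = (
            tokens[j].lower() if 0 <= j < len(tokens) else "<PAD>"
        )
    return feats


def sentence_features(tokens, window=2):
    """One feature dictionary per token, in sentence order."""
    return [token_features(tokens, i, window) for i in range(len(tokens))]
```

A sequence of such dictionaries, paired with per-token labels, is the general shape of input that CRF toolkits expect.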

Results

Our CRF performs well at identifying DOCs from initiating posts. On 10-fold cross validation it achieves

an F1-score of 84.65%, and Precision and Recall scores of 91.12% and 79.46%, respectively. Ta-

ble 7.1 shows a breakdown of performance across different types of terms. The CRF performs best

on drug terms that are both specific and correctly spelled (e.g. marijuana, oxycodone) and infor-

mal/morphological variations thereof (e.g. pot, oxides), and performs worst on generic drug terms (e.g.

stuff, pain pills). Table 7.2 illustrates the results of applying our DOC classifier to sample sentences,


Table 7.1: DOC classifier performance across term categories. The classifier performs best on correctly spelled, specific drug terms; worst on general drug terms.

| Category | Examples | F1 score (%) | Precision (%) | Recall (%) |
|---|---|---|---|---|
| All terms | | 84.7 | 91.1 | 79.5 |
| Specific drug terms, spelled correctly (53.1% of all terms) | marijuana, ultram, phenobarbital, hydrocodone | 87.0 | 90.3 | 83.9 |
| Informal & morphological variations of drug terms (34.5% of all terms) | roxies, oxyz, subs, pot, vics, blues, hydros, smokes | 84.6 | 93.4 | 77.2 |
| General drug terms (12.8% of all terms) | pain pills, painkillers, powder, stuff, substances | 79.7 | 94.0 | 69.2 |

Table 7.2: Examples of DOCs extracted by our CRF classifier. Identified SOA terms are shown in bold in the context of their originating sentence, and the resolved drug name, generic name and category are shown on the right.

| Sentence | Resolved Drug | Resolved Generic | Resolved Category |
|---|---|---|---|
| My doc prescribed suboxone on Sunday to help quitting from vicodin. | Vicodin | hydrocodone | opioid |
| I need help. I am on vic for the last 20 years. | Vicodin | hydrocodone | opioid |
| She began with meth months ago and now is using coke. | cocaine; methamphetamine | cocaine; methamphetamine | cocaine; stimulant |
| As for myself, it was a 7 year run with percs/patches. | Percocet | oxycodone | opioid |

and resolving these to drug categories as per § 7.3.4. Note the model’s sensitivity to context: in the first

sentence, suboxone is not extracted because it is being used as a treatment for the author’s addiction to

Vicodin.
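
The precision, recall and F1 figures reported in this section follow the standard definitions. As a minimal sketch, the function below scores one set of predicted DOC mentions against the gold annotations; representing a mention as a `(post_id, term)` pair is an assumption made for illustration, not a description of our evaluation code.

```python
def precision_recall_f1(gold, predicted):
    """Score predicted mentions against gold mentions (exact match)."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)

    # Precision: share of extracted mentions that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of annotated mentions that were extracted.
    recall = true_positives / len(gold) if gold else 0.0

    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1
```

In a 10-fold cross validation, such scores would be computed per fold and then averaged.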

7.3.4 Drug Term Resolution

The DOC terms extracted by our classifier vary widely in terms of spelling (we saw 58 variations on

Vicodin alone) and specificity (users refer to drugs with brand, generic and even class names). For

example, somebody might refer to Suboxone as buprenorphine, or even just as an opiate. Resolving

related drug terms to common entities is necessary for analysis and comparison.


Table 7.3: Summary of similarities and differences between our Forum77, NA, TEDS and DAWN datasets. Forum77 is unique in that participation is always voluntary and that users report only substances that they deem relevant.

| | Forum77 | NA | TEDS | DAWN |
|---|---|---|---|---|
| Population size | 19,634 | 8,837 | 1,844,720 | 131,698 |
| Time in which data were generated | 2007-2011 | 2011 | 2011 | 2011 |
| Data self-reported? | Yes | Yes | Yes | Yes |
| Duplicate users in dataset possible? | Yes | Yes | Yes | Yes |
| Survey population membership voluntary? | Yes | Not always | Not always | Not always |
| Users can report multiple substances | Yes | Yes | Yes | Yes |
| Substances reported | Only those which user perceives as relevant | All | All | All |

To resolve drug names, we compiled a list mapping misspellings in our data set to a single drug

name (either brand or generic). We then mapped all brand names to their respective generic names,

and finally, categorized each substance into a general class (Table C.1). We ultimately resolved ∼1,200

terms to 90 entities in 10 drug classes (see Appendix C).
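
The three resolution steps (spelling variant to drug name, brand to generic, generic to class) amount to chained lookups. The sketch below uses a handful of entries taken from Table 7.2; the dictionary names and the `resolve` function are hypothetical, and the full manually compiled map is the one described in Appendix C.

```python
# Step 1: spelling and slang variants -> a single drug name (brand or generic).
SPELLING_TO_DRUG = {
    "vicodin": "Vicodin", "vic": "Vicodin", "vics": "Vicodin",
    "percs": "Percocet", "coke": "cocaine", "meth": "methamphetamine",
}
# Step 2: brand names -> generic names (generics map to themselves).
BRAND_TO_GENERIC = {"Vicodin": "hydrocodone", "Percocet": "oxycodone"}
# Step 3: generic names -> general drug class.
GENERIC_TO_CLASS = {
    "hydrocodone": "opioid", "oxycodone": "opioid",
    "cocaine": "cocaine", "methamphetamine": "stimulant",
}


def resolve(term):
    """Return (drug, generic, drug class) for an extracted term, or None."""
    drug = SPELLING_TO_DRUG.get(term.strip().lower())
    if drug is None:
        return None  # term not covered by the manual map
    generic = BRAND_TO_GENERIC.get(drug, drug)
    return drug, generic, GENERIC_TO_CLASS.get(generic)
```

Keeping the three maps separate means a newly observed misspelling only needs one new entry in the first map.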

7.4 Comparing Real-World DOC Distributions

We compare our results to survey data on the face-to-face peer recovery group Narcotics Anonymous

(NA), as well as survey data on individuals who present to addiction treatment centers (TEDS) and

emergency rooms (DAWN). We use the 2011 (most recently available) reports for each of these surveys, and

compare results to the Forum77 data set spanning 2007-2011. We include multiple years of Forum77

data as we find that the DOC distributions in the Forum77 population vary only slightly over time. Below,

we describe how we process each data set, and summarize key similarities and differences between

them (Table 7.3). Final categorical alignment for cross-survey comparison is described

in Table 7.4.


Table 7.4: Alignment of categories across the Forum77, NA, TEDS and DAWN datasets for comparative purposes. Exact category terms from each survey have been preserved in this table for replicability.

| Forum77 | NA | TEDS | DAWN |
|---|---|---|---|
| Alcohol | Alcohol | Alcohol | Alcohol |
| Cocaine | Cocaine, Crack | Cocaine/Crack | Cocaine |
| Hallucinogens | Hallucinogens (LSD, PCP) | PCP; Other Hallucinogens | LSD; PCP; Misc. hallucinogens |
| Heroin | Opiates (heroin, morphine) | Heroin | Heroin |
| Inhalants | Inhalants (glue, Nitrous Oxide) | Inhalants | Inhalants |
| Marijuana | Cannabis (pot, hashish) | Marijuana/Hashish | Marijuana; Synthetic cannabinoids |
| Methadone and Suboxone | Methadone/Buprenorphine | Methadone (non-RX) | Methadone/Buprenorphine |
| Opioids | Opioids (Oxycodone, Vicodin, Fentanyl) | Opiates/Synthetics | Opiates/Opioids |
| Stimulants | Ecstasy; Stimulants (speed, crystal meth) | Methamphetamine; Other Amphetamines; Other Stimulants | Amphetamines; Amphetamine-dextroamphetamine; GHB; MDMA; Methamphetamine; Methyphenidate |
| Sedatives | Tranquilizers (Klonopin, Valium, Xanax) | Barbituates; Benzodiazepines; Non-Barbituate sedatives; Other non-benzodiazepine tranquilizers | Barbiturates; Benzodiazepines; Ketamine; Misc. anxiolytics, sedatives and hypnotics |

7.4.1 Forum77

Our classifier identifies DOCs for 19,634 (70%) of the 28,005 users who initiated discussions on Fo-

rum77, corresponding to ∼50% of the 78,507 initiating posts analyzed. This corroborates our observa-

tion that ∼55% of the posts in our 500-post training and testing sample contained DOC mentions. To

acquire a distribution of DOCs in the Forum77 population, we count, for each drug category (see Ta-

ble 7.4) the number of unique users who abused a drug in that category. We then normalize the counts

by the DOC-identifiable population size.
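
The counting and normalization step can be sketched as below. The input format, a stream of `(user_id, category)` pairs for users with at least one identified DOC, is an assumption for illustration. Because a user may report several substances, the resulting proportions need not sum to one.

```python
from collections import defaultdict


def doc_prevalence(user_category_pairs):
    """Fraction of DOC-identifiable users reporting each drug category."""
    users_by_category = defaultdict(set)
    all_users = set()
    for user, category in user_category_pairs:
        # Sets ensure a user is counted once per category,
        # however many posts mention it.
        users_by_category[category].add(user)
        all_users.add(user)

    population = len(all_users)
    if population == 0:
        return {}
    return {category: len(users) / population
            for category, users in users_by_category.items()}
```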


7.4.2 Narcotics Anonymous

Narcotics Anonymous (NA) conducts an annual membership survey in which respondents identify both

main drugs used as well as any other drugs used on a regular basis [2]. Responses are identified using a

checklist of drug categories (Table 7.4). As the results are published only in aggregate form, we acquired

the raw data from NA for the online component of the survey for analysis. After omitting entries that had a

0-second response time or in which the respondent declined to answer the drug-related questions, 8,837

respondents remained.

Categorizing heroin in the NA survey data: While both DAWN and TEDS have a separate category

for heroin, NA groups heroin into the category “Opiates (heroin, morphine, etc.)”. To align the NA data

set with DAWN and TEDS, we classify “Opiates (heroin, morphine, etc.)” with “Heroin”, based on the

assumption that most users in this category are using heroin rather than morphine or other opiates.

7.4.3 TEDS

The Treatment Episode Dataset is an annual survey detailing people’s self-reported drug use upon

admission to state and national rehabilitation facilities [241]. There is no need to process this data set

further, and we report results directly from the TEDS 2011 survey (1,844,720 respondents).

7.4.4 DAWN

The Drug Abuse Warning Network (DAWN) is a nationally representative public health surveillance sys-

tem that monitors drug-related emergency department visits to hospitals. The survey records up to 22

drugs related to an emergency room visit [231]. We considered only DAWN data set instances cor-

responding to drug misuse (131,698 instances). As 95.5% of the users in this population mention at

most three drugs, we consider only the first three substances mentioned. From these, we filter out sub-

stances that are common but not typically abused, such as insulin. Finally, we map the remaining drugs

to categories using the DAWN Drug Reference Vocabulary6.

6 Available at http://www.samhsa.gov/data/dawn.aspx
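
The DAWN processing steps of § 7.4.4 (truncate to the first three substances, drop substances that are not typically abused, map the rest to categories) can be sketched as follows. The function and its arguments are hypothetical; in particular, `category_of` stands in for a lookup built from the DAWN Drug Reference Vocabulary, whose actual format is not reproduced here.

```python
def dawn_visit_categories(drugs, category_of, not_abused, max_drugs=3):
    """Drug categories for one emergency department visit.

    drugs: substances in the order recorded for the visit.
    category_of: dict mapping a substance to its category (toy stand-in).
    not_abused: substances to exclude, such as insulin.
    """
    # Truncate first, then filter, following the order described in the text.
    considered = drugs[:max_drugs]
    return {category_of[d] for d in considered
            if d not in not_abused and d in category_of}
```

Substances missing from the lookup are silently skipped in this sketch; a real pipeline would more likely log them for review.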


[Figure 7.1: four panels (Forum77, TEDS (2011), NA (2011), DAWN (2011)), each with an axis running from 0% to 75%, covering the categories Opioids, Suboxone, Sedatives, Alcohol, Cocaine, Heroin, Marijuana, Stimulants, Hallucinogens and Inhalants.]

Figure 7.1: Drug of choice distributions (% of population using) across the Forum77, TEDS, NA and DAWN data sets.

7.5 Results

Forum77 users struggle with opioid addiction at much higher rates than other surveyed

populations of drug users: Figure 7.1 shows substance usage distributions across the Forum77, TEDS,

NA and DAWN surveys. Prescription opioids, utilized by ∼70% of the Forum77 population, are by far the most

prevalent DOC, followed by opioid replacement therapy opioids Methadone and Suboxone (25%). This

is more than double the population prevalence reported in any of the other three surveys.

Relatively few Forum77 users mention struggling with traditionally abused drugs: Alcohol, mar-

ijuana and cocaine are the three most prevalent DOCs in the NA, TEDS and DAWN populations (Fig-

ure 7.1). However, these three substances are conspicuously scarce in the Forum77 population. For

example, alcohol is reportedly abused by approximately 80%, 55% and 37% of the NA, TEDS and DAWN

populations, respectively, but only by 10% of Forum77 users.

After peaking in 2008, the Forum77 population slowly declines: Figure 7.2(a) shows the number of

active monthly users by DOC on Forum77. In February 2008, ∼180 unique hydrocodone users initiated

a discussion on Forum77. In contrast, the corresponding number of users for February 2014 is ∼60.

The decline in population of hydrocodone and oxycodone users is steeper than that of other DOCs. To

analyze DOC prevalence over time accounting for population decline, we normalize by population size

(Figures 7.2(b) and 7.2(c)).
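
A minimal sketch of this normalization is given below. It assumes that the denominator for each month is the number of unique DOC-identifiable users who initiated a post in that month, which matches the axis label of Figure 7.2(b) but is not spelled out in the text; the record format and function name are likewise illustrative.

```python
from collections import defaultdict


def monthly_prevalence(records):
    """Percentage of each month's active users who mention each drug.

    records: iterable of (month, user_id, drug) tuples, one per
    identified DOC mention in an initiating post.
    """
    users_by_month = defaultdict(set)
    users_by_month_drug = defaultdict(set)
    for month, user, drug in records:
        users_by_month[month].add(user)
        users_by_month_drug[(month, drug)].add(user)

    # Dividing by that month's own population removes the effect of
    # overall forum growth or decline from each drug's trend line.
    return {
        (month, drug): 100.0 * len(users) / len(users_by_month[month])
        for (month, drug), users in users_by_month_drug.items()
    }
```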

Hydrocodone and oxycodone are the most prevalent DOCs on Forum77, but this prevalence de-

clines over time: Figure 7.2(b) shows the prevalence of the six most common opioids in the Forum77

[Figure 7.2: three time-series panels with an x-axis running from 2007 to 2014. Series in panels (a) and (b): hydrocodone, oxycodone, suboxone, methadone, tramadol, heroin. Series in panel (c): Rx opioids, ORT opioids, heroin.]

(a) Number of unique monthly users for the 5 most prevalent opioids in Forum77 from 2007-2014. (y-axis: number of monthly users by SOA.)

(b) Unique monthly users for the 5 most prevalent opioids from Forum77 as a percentage of the population. LOESS [48] fit lines with 95% confidence intervals indicate trends. (y-axis: percentage of drug-identifiable population using.)

(c) Unique monthly users of opioid replacement therapy (ORT) opioids, other prescription opioids and heroin as a proportion of the Forum77 population. LOESS fit lines with 95% confidence intervals indicate trends. (y-axis: percentage of drug-identifiable population using.)

Figure 7.2: Prevalence of major opioids in the Forum77 population over time.


population over time. Locally weighted smoothing (LOESS [48]) is used to fit lines to each series, and

95% confidence intervals for each fit are shown. In 2007, hydrocodone and oxycodone are utilized by

approximately 45% and 33% of the population, respectively. By 2011, they each have a prevalence of

approximately 30%, which declines to about 27% (hydrocodone) and 26% (oxycodone) by 2014.

Opioid replacement therapy (ORT) opioids methadone and buprenorphine increase in prevalence

over time: Figure 7.2(c) aggregates the data shown in Figure 7.2(b), showing the prevalence of ORT

opioids (methadone and buprenorphine), other prescription opioids (e.g. oxycodone, hydrocodone),

and heroin in the Forum77 population over time. While prescription opioids remain the most preva-

lent DOCs, this prevalence declines from about 70% to 56% over time, while ORT opioid prevalence

increases from approximately 19% to 28%.

Heroin prevalence increases slightly in 2013: On average, about 5% of Forum77 participants abuse

or misuse heroin until 2013, when the proportion of heroin users starts to increase noticeably, reaching

10% and still rising at the end of our data set (Figures 7.2(b) and 7.2(c)). Moreover,

Figure 7.2(a) shows a small absolute increase in heroin users from mid-2013 onwards, suggesting that

the increase illustrated in Figures 7.2(b) and 7.2(c) is not purely an artifact of normalizing by a population

in which hydrocodone and oxycodone users are declining.

7.6 Discussion

Prescription opioids are the dominant DOC on Forum77, with their prevalence far exceeding

that measured in other drug-using populations. We suspect that this is the result of several factors.

First, users may be more receptive to seeking help anonymously online than discussing the issue with a

health care provider, since that provider may be the unwitting source of the opioids in the first

place [249]. Second, despite a robust evidence base for the medical treatment of opioid addiction [230],

few physicians have training in such treatment [263] and the condition remains highly stigmatized within

the medical community [176, 187]. Third, the more traditional self-help venues for addiction support,

namely Alcoholics Anonymous and Narcotics Anonymous, demand overcoming the stigma associated

with attending such meetings. The fact that opioid use disorders tend not to stem from recreational drug

use, which such venues are historically associated with, likely enhances this stigma. Finally, prescription

painkiller overdoses are growing at a significantly faster rate in the female population [8]. This, combined


with the fact that women are more likely than men to seek help online for health issues [37, 57, 90–92,

165], could partially account for the high prevalence of prescription opioid users on Forum77.

The scarcity of alcohol, marijuana and cocaine, the three most prevalent drugs present in the NA,

TEDS and DAWN surveys, could suggest a low number of recreational drug users in the Forum77

population. Alternatively, it is possible that Forum77 users are using alcohol and marijuana, but do not

see this use as problematic and so do not mention it. As we note in Table 7.3, the Forum77 data set

is unique in that users mention DOCs at their own discretion, and are not encouraged to disclose all

substances that they might be abusing. It is also possible that users approach different communities for

these issues: MedHelp, for example, has a separate, albeit very small, forum dedicated to alcoholism7.

Temporal trends indicate an increase in prevalence of opioid replacement therapy (ORT) opioids and

heroin, and a corresponding decline in other prescription opioids. It is possible, perhaps even likely, that

these trends reflect real-world drug usage: Cicero et al. [46] report a recent increase in heroin usage due

to oxycodone being more difficult to acquire and tamper with. In addition, survey data report a steady

increase in national buprenorphine usage [232] over time, and a slight decrease in non-medical use of

prescription opioids in the younger population [242]. While non-medical use of prescription opioids has

increased in the population of users 50 and older [242], this demographic is less prevalent online [7].

However, drawing epidemiological conclusions from these data without further study into what other

factors might be influencing these trends is ill advised.

7.6.1 Limitations & Future Work

While our work is the first to analyze drug usage trends in an online population, several challenges

remain. Foremost is extending similar analyses to a variety of online forums. Analyzing multiple data

sources would yield more comprehensive insights, and would also help to triangulate features in PAT

that are universally useful for monitoring substance abuse trends.

Finally, a difficult but necessary challenge is to investigate whether and how drug usage trends re-

flected in PAT align with those observed in the real world. As we discussed in Chapter 2, online health

seeking populations are not necessarily representative of real-world populations. As such, understand-

ing the relationship between PAT-observed and real-world drug usage trends would be necessary prior

7 http://www.medhelp.org/forums/Alcoholism/show/158


to utilizing such data for monitoring and surveillance. In sum, however, our contributions in this chap-

ter both propose a viable methodology for automatically identifying DOCs from PAT, and lend the first

data-driven insights into drug usage in an online community.

7.7 Summary

Our goal in this chapter was to profile substance use in Forum77, and compare this to substance use

reported in traditionally surveyed drug-using populations. The ability to monitor population-level drug use

trends is valuable. Despite the popularity and uniqueness of OHCs focused on the topic of substance

abuse, however, no work to date focuses on automatically identifying users’ drugs of choice (DOCs) from

PAT. As such, our contributions – a method for automatically extracting and resolving DOCs, as well as

insights on the Forum77 population acquired through the application of this method – are both novel and

useful.

To automatically extract a user’s DOCs from her Forum77 initiating posts, we used manually-labeled

data to train a CRF classifier (§ 7.3.2 and 7.3.3). We use a CRF classifier as the problem of identifying

DOCs is context sensitive: many commonly abused drugs are also used as legitimate treatments for

withdrawal. Our CRF classifier is highly accurate, achieving F1, Precision and Recall scores of 84.65%,

91.12% and 79.46%, respectively (§ 7.3.3). Finally, to facilitate analysis and comparison, we resolve

extracted entities (e.g. vics, benzos) to drugs (e.g. Vicodin, benzodiazepines), and drugs to categories

(e.g. opiates, sedatives) (§ 7.3.4).

To profile substance use on Forum77, we applied our method to the entire set of initiating posts

on Forum77 (78,507 posts authored by some 28,005 users), and compared our results to those from

three surveys: the Narcotics Anonymous annual membership survey, the Treatment Episode Dataset,

which surveys users in state-funded rehabilitation facilities, and the Drug Abuse Warning Network, which

collects data on substance abuse related admissions to emergency departments (§ 7.4). Our results

(§ 7.5) show that Forum77 users are disproportionately addicted to prescription opioids, while more

traditionally-abused substances, such as alcohol, marijuana and cocaine, are infrequently reported. Our

analyses of drug usage trends on Forum77 over time suggest that Forum77 may reflect real-world trends

in substance use.

Chapter 8

Quantifying Recovery and Relapse

8.1 Introduction

Despite the prevalence of online health forums for substance use disorders, we have little understanding

of the role that they play in the process of cessation. For example, when in the cycle of abuse are they

most helpful to users? As we noted in Chapter 7, most substance abuse data are collected at point-

of-care facilities. As such, online health communities (OHCs) are uniquely poised to offer quantified

answers to questions that have previously been answered only anecdotally. For example, in a cohort

of people with substance use disorders attempting recovery, what percentage relapse? Of those who

recover, how long do these recovery periods tend to last?

Our goal in this chapter is to educe patterns of relapse and recovery as they manifest on Forum77.

We begin by describing the process of prescription drug abuse cessation and related prior work (§ 8.2),

and describing the data samples used in this chapter (§ 8.3). We then make the following contributions:

A quantified taxonomy of phases of addiction as expressed by users on Forum77 (§ 8.4). Our taxon-

omy, developed in concert with an addiction specialist, is based on Prochaska’s Transtheoretical Model

(TTM) of behavior change [203], and serves both as a labeling rubric for mapping text to phases of

addiction, as well as a quantified summary of phase-based activity on Forum77. We use the taxonomy

to manually label initiating post sequences from 191 Forum77 users (2,266 posts total) with the labels

USING, WITHDRAWING or RECOVERING. We find that Forum77 is most heavily utilized when users are

WITHDRAWING.

An analysis of activity and linguistic features across the phases of addiction (§ 8.5). We identify

features that are characteristic of each phase, and leverage them to train a conditional random field

(CRF) model to automatically label users’ phases of addiction over their tenure on Forum77. Our CRF


CHAPTER 8. QUANTIFYING RECOVERY AND RELAPSE 97

achieves an F1-score of 67.6% against a baseline F1-score of 20%. Using CRF-labeled sequences, we

are able to identify (1) whether a user relapsed at some point during their tenure, and (2) whether a user

was RECOVERING at the time of her final initiating post, with F1-scores of 78% and 82%, respectively.

An analysis of transition, relapse and recovery based on the CRF-labeled phase sequences of 2,848

Forum77 users (32,345 posts) (§ 8.6 and § 8.7). We find that overall, progressive transitions are more

prevalent than regressive transitions. Moreover, despite the fact that relapse is common (almost half

of users relapse at some point during their tenure), the chances of a user RECOVERING by her final

post are favorable. Finally, we observe a significant correlation between high forum engagement (both

frequency of participation and volume of response posts authored) during a user’s phases of USING and

WITHDRAWING and the probability that she is RECOVERING when she leaves Forum77.

We discuss our results in the context of Forum77’s efficacy as a withdrawal aid, implications for

future forum design, and implications for addiction research (§ 8.8) before concluding (§ 8.9).

8.2 Background

To our knowledge, our work is the first to investigate the topic of prescription drug abuse cessation in

social media. Given the secretive and stigmatized nature of this condition [174, 176, 187], our contri-

bution provides a unique and often overlooked perspective on prescription drug abuse: that of patients

themselves. In this section, we provide an overview of prescription drug abuse as well as the traditional,

in-person mutual help groups Alcoholics Anonymous (AA) and Narcotics Anonymous (NA). Next, we

present work that, like ours, attempts to infer a person’s health state from her social media contributions.

For a review of literature analyzing the efficacy of OHC participation, we refer the reader to § 2.2.4.

8.2.1 The Prescription Drug Abuse Cycle

Prescription drug abuse (or “nonmedical use”) is defined as “the use of a medication without a prescrip-

tion, in a way other than prescribed, or for the experience or feelings elicited” [249]. Opioid pain relievers,

such as hydrocodone, oxycodone, morphine and codeine, are the most frequently abused prescription

medications [5]. In 2010, some 5.1 million Americans reported misusing prescription pain relievers in

the last month, followed by sedatives (2.6 million) and stimulants (1.1 million) [5].


Withdrawal

Withdrawal (or detoxification) is a painful process that is frequently compared to having a bad case of

influenza [6, 84]. Common withdrawal symptoms include agitation, anxiety, muscle aches, insomnia,

sweating, abdominal cramping, diarrhea, goose bumps, nausea and vomiting [6]. Typically, symptom

onset aligns with the first missed dose in the case of a “cold turkey” approach, or within a few days of dose

reduction in the case of a taper [84]. Symptom severity peaks within a few days of final exposure, and

gradually reduces as the user’s physical dependence on the drug weakens [84]. Withdrawal duration,

dependent on biological factors, drug and dosage levels, and withdrawal method, ranges broadly from

7-10 days (cold turkey) [102] to 20-35 days (methadone-assisted taper) [84].

Self-Detoxification

Research on easing the withdrawal process focuses primarily on medication-assisted detoxification over-

seen by a medical professional, with almost no work on the subject of self-detoxification. We found two

studies in which attendees of the same London methadone treatment facility were interviewed about

prior self-detoxification attempts. In both studies, most patients had attempted self-detoxification, and

many had made multiple attempts [102, 184]. The short-term success rate of achieving 24 hours of

abstinence per episode was 41% [184], while the medium-term success rate of achieving 10 days of

abstinence per episode was 24% [102]. The design of these studies naturally excludes patients who

successfully maintain long-term abstinence. When asked why their attempts had failed, subjects pointed to

lack of support during detoxification [102], as well as easy access to drugs and severity of withdrawal

symptoms [102,184]. Patient-reported strategies for effectively completing withdrawal include distraction

and avoidance, especially in the form of physical activity [102]. In addition, Green et al. [106] showed that

informing patients in full as to the type and severity of withdrawal symptoms that they were likely to expe-

rience resulted both in lower self-reported symptom severity scores as well as an increased probability

of completing the detoxification process.

Relapse & Recovery

Relapse rates for opioid use are high. Reported reuse statistics for individuals having gone through

detoxification programs range from 81-91% [103, 227]. However, long-term prognoses are more favor-

able, with evidence suggesting that 45-51% of patients may achieve sustained abstinence, and that

sustained abstinence is a gradual process [103].


“Recovery” is a hotly contested term in drug use disorder communities. Many align with the Alcoholics

Anonymous viewpoint that addiction is an incurable disease and, as such, an individual never fully

“recovers” from addiction [1]. Rather, users who reach sustained sobriety are referred to as being “in

recovery”. In this work, we refer to users who have overcome physical withdrawal as RECOVERING.

8.2.2 In-Person Mutual Help Groups

Alcoholics Anonymous (AA), founded in the 1930s, is one of the most utilized services for substance

use disorders in the world, with over 4 million members across 100 different societies [133]. It has also

given rise to other peer recovery groups for addiction, like Narcotics Anonymous (NA) and Gamblers

Anonymous (GA). AA and NA are almost entirely based on mutual support, even condemning the giving

of medical advice as outside the expertise of the group, instead encouraging members to see a doctor if

medical or psychiatric problems arise [133].

Three decades of accumulated evidence demonstrates that active participation in such groups for

addiction improves outcomes [155], although success rates are ill-defined and vary across studies [20].

A high participation level in AA is reported to be one of the strongest predictors for abstinence [190,

223]. For example, Pagano et al. [190] found that users who actively helped other AA members had

a relapse rate of 55%, while those who did not relapsed at a rate of 75%. Correspondingly, many of

the benefits of AA are thought to stem from the social network that it provides its members, who afford

each other support, role modeling and experiential advice [140]. Kelly et al. [141] find that through their

interactions with other AA members, users experience increased abstinence self-efficacy, increased

spirituality/religiosity and reduced negative affect. Having a sponsor is also thought to help newcomers

avoid relapse [237].

8.2.3 Inferring Health State from Social Media

The idea that social media users’ health states will be somehow reflected in the content that they con-

tribute, and that it may be possible to predict health state from these data, has captured the interest of

several researchers. De Choudhury et al. [69–71] analyze how postpartum depression (PPD) might be

reflected on both Twitter and Facebook. Using their findings, they leverage activity and linguistic fea-

tures to build models that can predict the onset of PPD from Facebook data [71]. In other social media

studies, both activity features, such as social engagement and connectivity, and linguistic features, such


as affect and writing style, have been shown to be useful indicators of depression [72, 129, 191, 208],

neuroticism [208] and post-traumatic stress disorder [118].

A related challenge is to identify a user’s current phase within a specific medical condition. Jha and

Elhadad [136] found that a combination of linguistic and activity features is helpful for identifying

cancer stages I–IV. Murnane and Counts [180] conducted an analysis of smoking cessation as reflected on

Twitter. They find that linguistic features of positive and negative sentiment, as well as social interac-

tion variables, were significant differentiators between users who relapsed and users who ceased their

smoking behavior during the time of the study. Finally, Wen and Rose use logistic regression and flex-

ible pattern matching over posts from an online cancer community to extract pre-defined events onto a

timeline [252].

8.3 Data

Typically, users present their own current substance use situation (e.g., drugs used and number of days

clean) in initiating posts. In contrast, users are liable to discuss a wide range of substance abuse

situations in response posts, including their own and the initiator’s. Accordingly, we restrict our analysis

to Forum77’s initiating posts, of which there are 78,507 authored by a total of 28,005 unique users.

Below, we describe the data sets that we use for taxonomy development, classifier training and testing,

and analysis.

Taxonomy Development: Our taxonomy development (§ 8.4) is an iterative process; for each iteration

we randomly sampled 1,000 of Forum77’s initiating posts.

Training & Testing Dataset: In § 8.4.4 we describe the importance of labeling sequences of initiating

posts rather than randomly sampled individual posts (as we did for taxonomy development). For our

labeled data set (§ 8.5.1) we randomly sample 200 users who had authored > 5 initiating posts on

Forum77, and all of their 2,266 initiating posts.

Analysis Dataset: We analyze all initiating post sequences of users who authored > 5 initiating posts

on Forum77. This totals 41,387 initiating posts authored by 2,848 users.


8.4 Exploring & Modeling Phases of Addiction

To systematically analyze phases of substance abuse in Forum77, we require both a valid taxonomy of

phases and a rubric mapping post text to these phases. Towards this aim, we derive a rubric based on

labels from the Transtheoretical Model (TTM) of behavior change, which we describe below.

8.4.1 Transtheoretical Model for Behavior Change

The Transtheoretical Model (TTM) is a framework that describes six stages of change that a per-

son traverses in order to manifest permanent behavior change. Established in 1997 by Prochaska &

Velicer [203], the TTM has been applied to a range of behaviors, from smoking cessation [75, 180, 247]

and substance abuse [175], to sustainable energy usage [123]. The intuitiveness and universal appli-

cability of the TTM make it a useful descriptive tool; however, care should be taken before utilizing it to

inform treatment or intervention [175,253].

According to the TTM, a person begins in the stage of pre-contemplation, in which she is not thinking

about initiating a behavior change. After contemplation, she moves on to preparation, in which she

makes preparations necessary to initiate a behavior change. The person then moves on to action, a

concerted and deliberate attempt to effect short-term behavior change. If successful, the person enters

a period of maintenance, in which she tries to sustain the behavior change in the long term. If successful,

the person eventually enters the stage of termination [203]. As there is considerable debate over whether

addiction is a terminable condition [1], we omit this stage for our purposes.

8.4.2 Rubric Development

In order to match Forum77 posts to TTM stages, we randomly sampled 1,000 initiating posts. Two au-

thors mapped these posts to stages in the TTM, assigning descriptive labels to emergent sub-categories

specific to the topic of addiction (e.g., tapering and cold turkey are both part of the TTM stage Action) in

the style of a General Inductive Approach [236]. We repeated this process several times, reviewing the

rubric with an addiction specialist prior to finalization. (Note: this is the same thematic analysis process

as that described in Figure 6.1 in § 6.5.)


8.4.3 A Taxonomy of the Phases of Addiction

Table 8.1 describes our resulting phase taxonomy, along with example posts (synthesized from genuine

posts to preserve user privacy) and the prevalence of each label in our final 1,000 initiating post sample.

Although descriptively interesting, several of the labels in the taxonomy (e.g., intent to quit and about to

quit) are rare. For parsimony, and to aid subsequent classification accuracy, we collapse labels into three

categories: USING, WITHDRAWING and RECOVERING. This improves inter-annotator agreement (over a

100-post, independently labeled sample) from a Cohen’s Kappa of 0.73 to 0.78.
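The agreement statistic reported above can be computed directly from two annotators' label lists. The following is a minimal sketch of Cohen's Kappa; the function name and the six example labels are hypothetical and are not drawn from the labeled sample.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: derived from each annotator's marginal label counts.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical labels for six posts (illustration only).
annotator_1 = ["USING", "WITHDRAWING", "WITHDRAWING", "RECOVERING", "USING", "RECOVERING"]
annotator_2 = ["USING", "WITHDRAWING", "RECOVERING", "RECOVERING", "USING", "WITHDRAWING"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Collapsing rare labels into fewer categories tends to raise both observed and chance agreement, so the net effect on Kappa has to be measured rather than assumed.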

8.4.4 Labeling People, not Posts

Moving forward, we want to analyze addiction phases at the level of individual people. Two factors that

emerged in our taxonomy development (see Table 8.1) convinced us that labeling randomly sampled

posts would be insufficient for such analyses, and that we should instead label users’ entire post se-

quences. The first was the high prevalence (9.8%) of n/a labels. These posts are often social in nature

and, taken independently, impossible to assign to a class. However, when read in the context of the

author’s previous and subsequent posts, the label is usually obvious (see Figure 8.1). The second factor

was the low prevalence of relapse labels. We noticed that while many users relapse, few announce the

fact directly. Rather, most users will mention a relapse when they are already committed to another ces-

sation attempt (e.g., about to quit or even quitting again). However, a relapse can still be observed in a

regressive sequence, such as WITHDRAWING → USING (see Figure 8.1). Based on these observations,

in the rest of this chapter we label sequences of posts.

8.5 Characterizing the Phases of Addiction

Phases of addiction coincide with distinct physiological and psychological states. In this section, we

analyze activity and linguistic features that might characterize an author’s phase on an initiating-day. We

define an initiating-day to be any day on which the user initiated a thread on Forum77. If the author

initiated multiple posts, we combine them for analysis. Our goal is two-fold: (1) to characterize phases

of addiction as they are expressed on Forum77, and (2) to identify discriminative features that might be

used for classification.


Table 8.1: Addiction Phase Taxonomy derived via a thematic analysis.

| Final Category | TTM Phase | Label | Description | Synthesized Example | % |
|---|---|---|---|---|---|
| USING | Pre-contemplation | Using | Subject is using substances and demonstrates no intention to quit. | it has been forever since I’ve been here and not much has changed. I am still using the prescribed amount of oxycodone for neck pain. | 3.1 |
| | | Addicted | Subject is using substances and indicates that she is addicted, but demonstrates no intent to quit. | my girlfriend and i r both addicted to percs but she is taking way more than me and keeps getting chest pain once every other week. | 7.4 |
| | | Relapse | Subject has used substances again after an attempt to quit. | I just messed up majorly. I was 6 days clean, doing OK-ish, when my mother stopped by with 10 Vics “incase I needed them”. Of course, being the WEAK person I am, I took them all right there. | 1.3 |
| | Contemplation | Intent to quit | Subject expresses desire to stop abusing a substance in the future. | I want off roxies. is methadone the answer. I need to work daily. I cannot do withdrawls. PLEASE HELP! | 9.3 |
| | Preparation | About to quit | Subject notes time and/or plan (e.g., tapering schedule) to quit. | i was planning to quit the first week of March. True to form addict fashion I’m out of both money and pills. So I’m about to go ct now instead of next week when I’d planned. | 2.5 |
| WITHDRAWING | Action | Quitting | Subject is in withdrawal; method unspecified. | Today is my 5th day of FREEDOM! I havent experienced any w/ds yet. So much energy. | 39.1 |
| | | Tapering | Subject is in withdrawal; detoxification method is a taper. | Have some Vics I am taking. I am down to 6 a day. I plan to go down to 3 a day then 1 a day until I am done! | 6.4 |
| | | Cold Turkey | Subject is in withdrawal; detoxification method is cold turkey. | I am on day 6 of CT from 150mg+ a day of ocycodone. I’m doing fine just some overall anxiousness | 3.3 |
| RECOVERING | Maintenance | In recovery | Subject has finished detoxing; no physical withdrawal symptoms expressed. | Just an update to tell you that I have 67 clean days today. I feel amazing. I sleep well now and feel good! I’ve had a lot of discussions about aftercare. | 17.8 |
| - | - | n/a | Impossible to determine status based on post. | I’ve been away for few days and everything seems different. Anyway I hope everyone is doing great. | 9.8 |

8.5.1 Sample & Labeling

To study how addiction phase sequences change over time, we restrict our analysis to users who have

initiated at least 5 threads on Forum77 (n=2,848 out of 29,196 users who initiated at least one post).

Of these, we randomly sampled 200 users (∼7% of the full 2,848) and all of their initiating posts. We

discarded 9 users from the sample: two who had authored more than 100 posts, one account that

belonged to MedHelp, and six accounts for which there was no clear ownership (several different people

appeared to be using the same MedHelp account). The resulting sample contains 2,266 initiating posts

(average 11.9 posts per user) and comprises ∼5.5% of the full 41,387 initiating posts authored by the

2,848 users who have authored ≥ 5 posts on the forum.

Figure 8.1: Illustration of how sequence analysis can (1) reduce NA labels by leveraging context from surrounding posts, and (2) capture relapse events in regressive sequences without requiring the user to explicitly state that she relapsed.

Two authors categorized each initiating post in the sample using the taxonomy presented in Table 8.1.

We labeled each user’s data in chronological order so as to transfer context learned from surrounding

labels. Disagreements (which were rare) were relabeled based on a consensus reached after discussion.

8.5.2 Activity Features

We identify 15 activity characteristics that describe an initiator’s global activity over time, her local activity

5 days prior to the initiating-day in question, and both the initiator’s and respondents’ activity on the

initiating-day. The features capture user activity volume (e.g., number of posts initiated in the last 5

days), engagement (e.g., days elapsed since last response to another user) and attention (e.g., number

of unique respondents to a user’s initiating post on the initiating-day). For a full description of all features,

as well as summary statistics of their distributions across each class, we refer the reader to Table D.3.


8.5.3 Linguistic & Content Features

LIWC Features

Differences in word use and linguistic style are believed to reveal a range of information about people,

from psychological state to social identity [196]. The Linguistic Inquiry and Word Count (LIWC) [195]

software calculates 80 linguistic variables over text. In prior work, LIWC has been used to characterize

and distinguish women suffering from Post-Partum Depression (PPD) [71], individuals at risk for depres-

sion [72] and smokers on Twitter who are at risk for relapse [180]. We calculate all 80 LIWC variables

over initiating post text as well as over all responses received on the initiating-day. We then examine

differences in these variables across the USING, WITHDRAWING and RECOVERING phases (Tables D.1

& D.2).

Days Mentioned and Question Features

In addition to the LIWC features, we calculate three variables over initiating post text. Users frequently

mention how long they have been clean at the time of posting. We extract days clean automatically by

using hand written patterns, such as “clean X days” and “X weeks off”, where X represents a number.

We convert X to days if necessary. We also use a more relaxed version of this feature, called days

mentioned, in which we do not require the user to explicitly mention terms like “clean” or “off”. Finally,

we count the number of questions asked by identifying sentences that start with a question word and/or

end with a question mark. This feature has proved helpful in prior work [71]. We find that including these

three extra features improves classifier performance by ∼2.2%.
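The three text-derived variables described above can be sketched with regular expressions. The patterns below are illustrative stand-ins, since the actual hand-written patterns are not listed in the text; the unit conversions (30-day months, 365-day years) and the question-word list are likewise assumptions.

```python
import re

# Assumed conversions to days; month and year lengths are rough approximations.
UNIT_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}
NUM_UNIT = r"(\d+)\s*(day|week|month|year)s?"

# Hypothetical patterns in the spirit of "clean X days" and "X weeks off".
CLEAN_PATTERNS = [
    re.compile(r"\bclean\s+(?:for\s+)?" + NUM_UNIT, re.IGNORECASE),
    re.compile(NUM_UNIT + r"\s+(?:clean|off)\b", re.IGNORECASE),
]
# Relaxed variant: any "<number> <time unit>" mention.
MENTION_PATTERN = re.compile(r"\b" + NUM_UNIT, re.IGNORECASE)
# Assumed list of sentence-initial question words.
QUESTION_WORDS = {"who", "what", "when", "where", "why", "how", "is", "are",
                  "can", "do", "does", "should", "will"}


def _to_days(match):
    """Convert a matched (number, unit) pair to days."""
    return int(match.group(1)) * UNIT_DAYS[match.group(2).lower()]


def days_clean(text):
    """Days clean, requiring an explicit cue such as 'clean' or 'off'."""
    for pattern in CLEAN_PATTERNS:
        match = pattern.search(text)
        if match:
            return _to_days(match)
    return None


def days_mentioned(text):
    """Relaxed feature: first duration mentioned, with or without a cue word."""
    match = MENTION_PATTERN.search(text)
    return _to_days(match) if match else None


def count_questions(text):
    """Count sentences that start with a question word and/or end with '?'."""
    count = 0
    for sentence in re.findall(r"[^.!?]+[.!?]*", text):
        sentence = sentence.strip()
        if not sentence:
            continue
        first_word = re.match(r"[A-Za-z']+", sentence)
        starts_like_question = (first_word is not None
                                and first_word.group(0).lower() in QUESTION_WORDS)
        if starts_like_question or sentence.endswith("?"):
            count += 1
    return count
```

As the later discussion of USING posts shows, such patterns also fire on retrospective statements ("I was clean for 4 months before..."), so the extracted value is a noisy signal rather than a ground-truth count.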

Phase-Specific Term Features

Finally, we count how many phase-specific words occur in both initiating post text as well as response

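text. To determine whether a term t is particularly descriptive of a phase p, we calculate its frequency-

based odds ratio. If f_p(t) is the number of posts of phase p that contain t, with \bar{t} denoting posts that do not contain t and \bar{p} denoting posts from phases other than p, then:

\[ OR(t, p) = \frac{f_p(t) \cdot f_{\bar{p}}(\bar{t})}{f_p(\bar{t}) \cdot f_{\bar{p}}(t)} \]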


The odds ratio is a measure of strength of association. We calculate the odds ratio for each term

across each phase, and retain terms with an odds ratio >2. Table 8.2 shows sample terms for both

initiating and response posts.
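A term's odds ratio for a phase can be sketched from a 2×2 contingency table of post counts. This sketch assumes the standard odds-ratio form (posts in the phase with and without the term, against posts outside the phase with and without the term); the optional additive smoothing is an assumption added here to avoid division by zero and is not described in the text. The toy posts are hypothetical.

```python
def odds_ratio(posts, term, phase, smoothing=0.5):
    """Frequency-based odds ratio of `term` for `phase`.

    posts: iterable of (phase_label, set_of_terms) pairs.
    smoothing: value added to each cell (an assumption, not from the source);
               pass 0 for the unsmoothed ratio, which fails on empty cells.
    """
    in_with = in_without = out_with = out_without = 0
    for label, terms in posts:
        has_term = term in terms
        if label == phase and has_term:
            in_with += 1        # posts of this phase that contain the term
        elif label == phase:
            in_without += 1     # posts of this phase that lack the term
        elif has_term:
            out_with += 1       # posts of other phases that contain the term
        else:
            out_without += 1    # posts of other phases that lack the term
    a, b, c, d = (x + smoothing for x in (in_with, in_without, out_with, out_without))
    return (a * d) / (b * c)


# Hypothetical posts, reduced to sets of terms.
toy_posts = [
    ("WITHDRAWING", {"restless", "legs"}),
    ("WITHDRAWING", {"restless"}),
    ("WITHDRAWING", {"restless", "aches"}),
    ("WITHDRAWING", {"slept"}),
    ("USING", {"restless"}),
    ("USING", {"addicted"}),
    ("RECOVERING", {"clean"}),
    ("RECOVERING", {"sober"}),
]
restless_or = odds_ratio(toy_posts, "restless", "WITHDRAWING", smoothing=0)
keep_term = restless_or > 2  # retention threshold stated in the text
```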

Table 8.2: Sample phase-specific terms for the USING, WITHDRAWING and RECOVERING categories.

| Phase | Initiating Posts | Response Posts |
|---|---|---|
| USING | withdrawls, wants, hate, addicted, scared, tried, stop | situation, willing, treatment, withdrawl, option, advise, rehab, counseling |
| WITHDRAWING | rls, hot, restless, aches, slept, arms, legs, headache, wd, worst, stomach, tramadol | potassium, heating, fluids, baths, pad, showers, legs, melatonin, hot, slept, bananas |
| RECOVERING | craving, recovery, lately, sober, fight, truly, clean, cravings, true, worth | inspiration, accomplishment, congratulations, sharing, thank, miss, proud, paws |

8.5.4 Results: Activity and Linguistic Features

We present linguistic features over initiating posts in Table D.1, linguistic features over response posts in

Table D.2, and activity features in Table D.3. Unless otherwise mentioned, we use Kruskal-Wallis tests

to assess statistical significance. A non-parametric test is appropriate for data that are not expected to

follow a normal distribution (such as ours), and a Kruskal-Wallis test determines whether at least one of

the three phase distributions differs significantly from the others.

Our feature analysis indicates that both users’ activity and users’ content and linguistic characteristics

differ measurably across addiction phases. We discuss particularly descriptive features of each phase

below.

USING: This phase is characterized by long absences from the forum and, correspondingly, low levels

of recent activity. Users who are USING have, on average, been absent from forum participation in all

capacities for more than twice as long as users who are WITHDRAWING or RECOVERING (40 vs. ∼18

days since last activity). A longer absence from the forum may partially explain why USING posts are, on

average, longer (208 vs. ∼180 words): users must account for lost time and bring their audience back

up to speed.

Both days clean and days mentioned vary widely in USING posts, and have surprisingly high median

values. Examining the underlying data provides an explanation: users who are USING often mention how


long they had been clean prior to relapse in statements such as, “I was clean for 4 months before...” or

“I would have had 717 days clean today”.

Finally, USING posts offer the lowest levels of positive affect (16% less than WITHDRAWING and 32%

less than RECOVERING), and the highest levels of discussion around the topic of health (16% more

than WITHDRAWING and 36% more than RECOVERING); characteristics that are mirrored in responses to

USING posts. The lack of positivity resonates with the fact that users who are USING have either relapsed

or failed to progress towards recovery.

WITHDRAWING: In recent activity, users who are WITHDRAWING issue more initiating posts and self

responses than those who are USING or RECOVERING. In addition, they have the smallest average

number of days since last initiating post (21 vs. 31 RECOVERING and 50 USING) and days since last

self-response (29 vs. 42 RECOVERING and 66 USING).

As we might expect, WITHDRAWING users express the lowest numbers of days clean and days men-

tioned. In addition there is a great deal more language about feeling, biological processes and the body.

These observations align with the nature of detoxification as an uncomfortable physical process from

which people constantly seek relief [84].

Responses to WITHDRAWING posts are not particularly distinctive. Aside from expressing slightly

more anxiety, and writing slightly more about feeling and the body, other linguistic variables tend to take

on a value somewhere in between those of responses to USING and RECOVERING. It is possible that

respondents try to influence users from one side of the spectrum to the other, modifying their language

according to the user’s progress.

RECOVERING: These users are highly active, especially in the area of responding to other peoples’

posts. In recent activity they issue, on average, 15.2 responses to other peoples’ threads, compared to

5.5 by users who are WITHDRAWING and 1.9 by users who are USING. Moreover, unlike WITHDRAWING

and USING users, their ratio of initiating posts to responses authored tends to be <1.

Linguistic features also suggest that RECOVERING users tend to focus on others. The pronoun you

is used almost 100% more while the I pronoun is used less, and language is more social. Moreover,

users express significantly more positive affect (25% more than WITHDRAWING, 48% more than USING)

and less anxiety (18% less than WITHDRAWING, 16% less than USING). The evident outward focus of

initiating posts from RECOVERING users resonates with the 12th step in traditional twelve-step programs


such as AA, which encourage people to strengthen their sobriety by using their experiences to help

others achieve it [1].

Responses to RECOVERING posts are distinct in that they express substantially more positive affect

(27% more than responses to WITHDRAWING, 57% more than responses to USING). They also tend to

host a notable quantity of exclamation marks (100% more than WITHDRAWING, 350% more than USING).

Inspection reveals that this is an expression of excitement and encouragement in response to good

news, for example, “hoooooorrrraaaahhhhh!!!!!!!!!” and “I am so PROUD of YOU!!!!!”.

8.6 Automatically Classifying Addiction Phase

Informed by our feature analysis, we next train a statistical classifier to automatically label Forum77 posts

as USING, WITHDRAWING or RECOVERING. Analyses of phase sequences can give insight into events

such as relapse and recovery. Our classifier allows us to scale such analyses to the entire Forum77 data

set. Below, we describe our classifier and report its performance. We discuss relapse and recovery in

§ 8.7.

8.6.1 Model & Features

A user’s path through the different phases of addiction forms a natural sequence. A conditional random

field (CRF) [151] is a probabilistic graphical model that performs inference over sequences, rather than

individual data points. By taking into account prior and subsequent data items in a sequence, CRFs

are context sensitive. For example, unlike a CRF, a non-sequence-based classifier might have difficulty

classifying a post like, “I’ve been away for a few days and everything seems different. Anyway I hope

everyone is doing great...”, even if it was sandwiched between two posts that were obviously USING, as

the post itself contains no clues as to the user’s phase.

Accordingly, we train a 3-class CRF to annotate a user’s sequence of initiating-days with the labels

USING, WITHDRAWING or RECOVERING. We use an adapted version of the Stanford Named Entity

Recognizer package, a trainable Java implementation of a CRF classifier,¹ that analyzes sequences of

documents (the default unit of analysis is a token). Tables D.1, D.2 and D.3 indicate the subset of features

that we used for classifier training. We selected features based on apparent discriminability and

iterative evaluation through 10-fold cross validation. In order to improve robustness and model potentially

non-linear responses, we binned numeric features into octiles: ranks that divide the data evenly into 8

groups. While using quartiles is arguably more common in standard practice, we found that using octiles

improved classifier performance.

¹ http://nlp.stanford.edu/software/CRF-NER.shtml

Table 8.3: CRF performance scores aggregated over 10 runs of 10-fold cross validation, with randomly shuffled input sets.

| Label | Precision | Recall | F1 score | Accuracy |
|---|---|---|---|---|
| Combined | 68.3 | 68.0 | 67.6 | 69.8 |
| USING | 62.4 | 61.7 | 61.4 | |
| WITHDRAWING | 70.6 | 71.9 | 70.9 | |
| RECOVERING | 72.1 | 71.2 | 70.9 | |
| Baseline | 14.0 | 33.0 | 20.0 | 43.0 |
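The octile binning described above can be sketched as follows. This is a minimal illustration of equal-frequency binning, not the procedure actually used for the classifier; the function names and the feature name in the final comment are hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def quantile_cut_points(values, n_bins=8):
    """Return n_bins - 1 cut points that split the values into roughly equal-sized bins."""
    ordered = sorted(values)
    n = len(ordered)
    return [ordered[min(n - 1, (k * n) // n_bins)] for k in range(1, n_bins)]


def to_bin(value, cut_points):
    """Map a numeric value to a bin index between 0 and len(cut_points)."""
    return bisect_right(cut_points, value)


# Toy feature values: 16 distinct observations give 2 per octile.
observations = list(range(1, 17))
cuts = quantile_cut_points(observations)
bins = [to_bin(v, cuts) for v in observations]
bin_sizes = Counter(bins)
# A binned value could then be emitted as a categorical CRF feature,
# e.g. "days_since_last_post_bin=3" (hypothetical feature name).
```

Binning turns each numeric feature into a small set of indicator features, which lets a log-linear model such as a CRF assign a separate weight to each range instead of a single slope.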

8.6.2 Performance

Table 8.3 shows precision, recall and F1 scores for the CRF classifier. Our classifier achieves an F1

score of 67.6% against a baseline F1 score of 20.0%, acquired by labeling each instance with the

majority class, WITHDRAWING.
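The majority-class baseline can be approximately reproduced with a short calculation. The sketch below assumes macro-averaging across the three classes (the averaging scheme is not stated in the text) and takes the WITHDRAWING share to be 43%, matching the reported baseline accuracy; the split of the remaining 57% between the other two classes is a placeholder and does not affect the result.

```python
def macro_scores_for_constant_prediction(class_shares, predicted):
    """Macro precision, recall and F1 when every instance receives one label.

    class_shares: dict mapping each class to its share of instances (sums to 1).
    predicted: the single label assigned to every instance.
    """
    precisions, recalls, f1s = [], [], []
    for label, share in class_shares.items():
        if label == predicted:
            precision, recall = share, 1.0   # all instances are predicted as this class
        else:
            precision, recall = 0.0, 0.0     # never predicted; precision taken as 0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom > 0 else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    k = len(class_shares)
    return sum(precisions) / k, sum(recalls) / k, sum(f1s) / k


# 0.43 follows the reported baseline accuracy; the other two shares are placeholders.
shares = {"USING": 0.23, "WITHDRAWING": 0.43, "RECOVERING": 0.34}
base_p, base_r, base_f1 = macro_scores_for_constant_prediction(shares, "WITHDRAWING")
```

Under these assumptions the baseline comes out at roughly 14% precision, 33% recall and 20% F1, in line with the values reported in Table 8.3.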

It is useful to know which labels the CRF is likely to confuse. Figure 8.2 shows the CRF classifier’s

confusion matrix. Diagonal entries indicate counts of correctly-classified instances. The strong diagonal

indicates a relatively high level of accuracy. Most classification errors occur between adjacent phases:

confusing USING and WITHDRAWING, and confusing WITHDRAWING and RECOVERING is common, but

confusing USING and RECOVERING less so. This resonates with a point prevalent in the addiction litera-

ture: stages of recovery are not black and white but rather fall on a spectrum [79,168].
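Accuracy and per-class precision and recall follow directly from a confusion matrix. The sketch below uses the aggregated counts shown in Figure 8.2 and assumes that rows are CRF labels and columns are gold labels, as the figure's axis names suggest. Because Table 8.3 averages scores over runs, values derived from the aggregated matrix need not match it exactly.

```python
LABELS = ["USING", "WITHDRAWING", "RECOVERING"]

# Aggregated counts from Figure 8.2.
# Assumed orientation: rows are CRF labels, columns are gold labels.
MATRIX = [
    [327.2, 131.8, 62.2],
    [150.2, 686.9, 142.7],
    [52.2, 139.8, 560.2],
]


def summarize(matrix, labels):
    """Return overall accuracy and {label: (precision, recall)}."""
    total = sum(sum(row) for row in matrix)
    accuracy = sum(matrix[i][i] for i in range(len(labels))) / total
    per_class = {}
    for i, label in enumerate(labels):
        predicted = sum(matrix[i])              # row sum: instances the CRF assigned to this class
        actual = sum(row[i] for row in matrix)  # column sum: gold instances of this class
        per_class[label] = (matrix[i][i] / predicted, matrix[i][i] / actual)
    return accuracy, per_class


accuracy, per_class = summarize(MATRIX, LABELS)
```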

8.6.3 Results

We analyze the result of applying our CRF classifier to the entirety of the Forum77 membership base

who have initiated > 5 posts (2,848 users, 32,345 initiating posts). Our results give us insight into

common transitions between addiction phases, enabling us to answer questions such as, “If a user is

WITHDRAWING today, how likely is it that she will be RECOVERING on her next initiating-day?” and “what

is the most frequent phase change observed on Forum77?”


Figure 8.2: Confusion matrix for our CRF classifier aggregated across 10 randomized runs of 10-fold cross validation. Rows are CRF labels; columns are gold labels.

| CRF label | Gold: Using | Gold: Withd. | Gold: Recov. |
|---|---|---|---|
| Using | 327.2 | 131.8 | 62.2 |
| Withd. | 150.2 | 686.9 | 142.7 |
| Recov. | 52.2 | 139.8 | 560.2 |

Figure 8.3(a) shows the normalized transition frequency matrix for USING, WITHDRAWING and RE-

COVERING. The most common transitions lie along the diagonal, indicating that users typically initiate

consecutive posts in any one phase. Self-transitions aside, the progressive edges between consecutive

stages (USING→ WITHDRAWING and WITHDRAWING→ RECOVERING) are the most common, accounting

for approximately 6% and 5.2% of total transitions, respectively. In contrast, regressive edges between

consecutive stages (WITHDRAWING → USING and RECOVERING → WITHDRAWING) are less common,

accounting for 2.6% and 1.1% of total transitions, respectively.

Figure 8.3(b) shows conditional transition probabilities across states. The likelihood of a same-

state transition increases with the progressiveness of the state. For example, there is a 71% chance

that a USING user will be USING in her next post, an 81% chance that a WITHDRAWING user will be

WITHDRAWING in her next post, and a 91% chance that a RECOVERING user will be RECOVERING in her

next post.
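Both views of the transition data, normalized frequencies and conditional probabilities, derive from the same transition counts. The following sketch shows the two normalizations over toy label sequences; the sequences and function names are hypothetical.

```python
from collections import Counter


def transition_counts(sequences):
    """Count (source, target) pairs over consecutive labels in each user's sequence."""
    counts = Counter()
    for seq in sequences:
        for source, target in zip(seq, seq[1:]):
            counts[(source, target)] += 1
    return counts


def normalized_frequencies(counts):
    """Each transition's share of all transitions (all cells sum to 1)."""
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}


def conditional_probabilities(counts):
    """P(target | source): each cell divided by its source-state row total."""
    row_totals = Counter()
    for (source, _), c in counts.items():
        row_totals[source] += c
    return {(source, target): c / row_totals[source]
            for (source, target), c in counts.items()}


# Two hypothetical users' phase sequences.
toy_sequences = [
    ["USING", "WITHDRAWING", "WITHDRAWING", "RECOVERING"],
    ["WITHDRAWING", "USING", "WITHDRAWING"],
]
counts = transition_counts(toy_sequences)
freqs = normalized_frequencies(counts)
probs = conditional_probabilities(counts)
```

The same row normalization links the two panels of Figure 8.3: for example, the USING row of panel (a) sums to 24.51, and 17.35 / 24.51 ≈ 70.79%, the USING self-transition probability in panel (b).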

Figure 8.4 shows the distributions of phase length in days for each phase. We calculate phase

length as the number of days between the first and last post in a contiguous sequence. The typical

WITHDRAWING phase lengths align well with those reported in the literature on addiction, which suggests

a 7–35 day duration depending on the detoxification method used, as well as other factors [84,102].
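The phase-length calculation can be sketched by grouping a user's chronologically ordered initiating-days into contiguous runs of the same label. This sketch takes the length to be the difference in days between the first and last post of a run, so a single-post run has length 0; whether the original analysis counts days inclusively is not stated. The dates are hypothetical.

```python
from datetime import date
from itertools import groupby


def phase_lengths(labeled_days):
    """Return (phase, length_in_days) for each contiguous run of one phase.

    labeled_days: chronologically ordered list of (date, phase) pairs.
    """
    runs = []
    for phase, group in groupby(labeled_days, key=lambda item: item[1]):
        days = [day for day, _ in group]
        # Days between the first and last post of the run (0 for a single post).
        runs.append((phase, (days[-1] - days[0]).days))
    return runs


# Hypothetical user timeline.
timeline = [
    (date(2014, 1, 1), "USING"),
    (date(2014, 1, 3), "WITHDRAWING"),
    (date(2014, 1, 10), "WITHDRAWING"),
    (date(2014, 2, 1), "RECOVERING"),
    (date(2014, 2, 20), "RECOVERING"),
]
lengths = phase_lengths(timeline)
```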




Figure 8.3: (a) Normalized transition frequencies between addiction phases (e.g., USING → RECOVERING edges comprise 1.12% of the total transitions in the CRF-labeled data) and (b) conditional transition probabilities (e.g., the probability of a user moving from USING to RECOVERING is 4.57%). Rows are source states; columns are target states.

(a) Normalized transition frequencies (%)

| Source state | Target: Using | Target: Withd. | Target: Recov. |
|---|---|---|---|
| Using | 17.35 | 6.04 | 1.12 |
| Withd. | 2.56 | 33.85 | 5.23 |
| Recov. | 1.78 | 1.11 | 30.96 |

(b) Conditional transition probabilities (%)

| Source state | Target: Using | Target: Withd. | Target: Recov. |
|---|---|---|---|
| Using | 70.79 | 24.64 | 4.57 |
| Withd. | 6.15 | 81.29 | 12.56 |
| Recov. | 5.26 | 3.28 | 91.46 |

8.7 Automatically Classifying Relapse and Recovery

Relapse and recovery are critical events in the process of addiction that are often viewed as “failure”

or “success”. Prior work in the addiction literature suggests that recovery is a long, iterative process

of which relapse is a part [103]. Leveraging our CRF classifier, we present methods for identifying (1)

if a user has relapsed during her tenure on the forum, and (2) if a user is RECOVERING on her last

initiating-day on Forum77. We then investigate if relapse adversely correlates with a user’s chance of

RECOVERING. Finally, we identify activity features during USING and WITHDRAWING phases that discrim-

inate between users who wrote their final post on Forum77 in a state of RECOVERING, and those who

did not.

8.7.1 Identifying Relapse

To identify a relapse incident, we codify three regressive transition patterns:

RECOVERING→ { WITHDRAWING, USING }

WITHDRAWING→ USING

WITHDRAWING→ (45+ days absent)→ WITHDRAWING

CHAPTER 8. QUANTIFYING RECOVERY AND RELAPSE

Figure 8.4: Distributions of phase lengths. A red bar indicates the median value, while the dark blue region indicates the middle spread. The light blue region indicates values that fall within 1.5 ∗ the interquartile range of the middle spread.

Values annotated on the plots (phase length in days):

| Phase | Low | Q1 | Median | Q3 | High |
|---|---|---|---|---|---|
| USING | 1 | 3 | 8 | 24 | 55 |
| WITHDRAWING | 1 | 3 | 7 | 16 | 35 |
| RECOVERING | 1 | 7 | 17 | 36 | 79 |

This last pattern is based on the observation that a general window for withdrawal duration is 7-35

days [84, 103]. As such, if a user was absent for more than 45 days, and then returned in a state of

WITHDRAWING, it is likely that she failed in her initial attempt and has restarted. While it is possible that

this pattern will capture individuals on a slow taper, in our experience it is unlikely that such users would

be inactive for a full 45 days.

We identify whether a user relapsed during her tenure on Forum77 by testing whether any of the above patterns exist in her sequence of phase transitions. To evaluate the efficacy of this approach, we apply it to both the gold label sequences and the CRF-labeled sequences in our labeled sample data set. Using this technique, we achieve an F1-score of 78% and accuracy of 78% in identifying Relapse and No relapse, compared to baseline scores of 33.9% and 51.3% if we labeled each user with the majority class, No relapse (Table 8.4).

Table 8.4: Performance for identifying relapse events (top) and whether a user’s final state is RECOVERING (bottom). Combined scores across classes are shown in bold.

Identifying a relapse event

              Precision   Recall   F1      Accuracy
Combined      79.92       78.18    78.04   78.42
Relapse       86.11       66.67    75.15
No relapse    73.73       89.69    80.93
Baseline      25.65       50.00    33.91   51.30

Identifying final initiating post phase

              Precision   Recall   F1      Accuracy
Combined      81.47       81.52    81.49   81.57
RECOVERING    79.78       80.68    80.23
¬RECOVERING   83.17       82.35    82.76
Baseline      26.84       50.00    34.93   53.40
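The three regressive patterns of § 8.7.1 can be expressed as a single pass over a user's phase sequence. The sketch below is one possible reading, not the original implementation: it assumes each initiating post is represented as a (day, phase) pair in time order, and that absence is measured between consecutive initiating posts. The names `RANK`, `ABSENCE_DAYS` and `has_relapse` are hypothetical.

```python
from typing import List, Tuple

# Phase ranks: a lower rank is further from recovery.
RANK = {"USING": 0, "WITHDRAWING": 1, "RECOVERING": 2}
ABSENCE_DAYS = 45  # absence threshold used by the third pattern


def has_relapse(posts: List[Tuple[int, str]]) -> bool:
    """Return True if any regressive pattern occurs.

    posts: (day, phase) pairs for one user's initiating posts, in time
    order (an assumed representation).
    """
    for (day_a, phase_a), (day_b, phase_b) in zip(posts, posts[1:]):
        # Patterns 1 and 2: RECOVERING -> {WITHDRAWING, USING} and
        # WITHDRAWING -> USING are exactly the moves to a lower rank.
        if RANK[phase_b] < RANK[phase_a]:
            return True
        # Pattern 3: WITHDRAWING again after more than 45 days absent.
        if (phase_a == phase_b == "WITHDRAWING"
                and day_b - day_a > ABSENCE_DAYS):
            return True
    return False
```

For example, `has_relapse([(0, "WITHDRAWING"), (50, "WITHDRAWING")])` returns `True` under the third pattern, while a sequence that moves from USING through WITHDRAWING to RECOVERING without long gaps returns `False`.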

8.7.2 Identifying Recovery

To identify whether a user was RECOVERING when she last initiated a post on Forum77, we simply

examine the final phase label in her transition sequence. Using the CRF-labeled sequences, we classify

a user’s last post as RECOVERING or ¬RECOVERING with an F1-score of 81.5% and accuracy of 81.6%;

the comparative baselines are 34.9% and 53.4%, in which all last posts are labeled as ¬RECOVERING

(Table 8.4).
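The text does not say how the combined and baseline scores in Table 8.4 are aggregated. The relapse rows are consistent with macro-averaging (an unweighted mean over the two classes), and the sketch below checks this against the reported figures. Treating the precision and F1 of a never-predicted class as zero is an assumption, as are all function names.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro(values):
    """Unweighted mean across classes."""
    return sum(values) / len(values)


# Per-class precision and recall (percent) from the relapse rows of Table 8.4.
per_class = {"Relapse": (86.11, 66.67), "No relapse": (73.73, 89.69)}
class_f1 = {name: f1(p, r) for name, (p, r) in per_class.items()}
combined_f1 = macro(list(class_f1.values()))


def majority_baseline(majority_share):
    """Macro precision, recall, F1 and accuracy (percent) when every user
    is assigned the majority class; majority_share is a fraction."""
    share = 100.0 * majority_share
    precision = macro([share, 0.0])   # minority class is never predicted
    recall = macro([100.0, 0.0])
    f1_macro = macro([f1(share, 100.0), 0.0])
    return precision, recall, f1_macro, share


baseline = majority_baseline(0.513)  # 51.3% of labeled users did not relapse
```

Under these assumptions the per-class F1 values round to 75.15 and 80.93, the combined F1 to 78.04, and the baseline to 25.65, 50.00, 33.91 and 51.30, in line with the relapse half of the table.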

8.7.3 Results

Using the methods described above, we identify users who are RECOVERING at the time of their last initiating post on Forum77, as well as users who have relapsed at least once during their tenure on Forum77. We apply this analysis to all Forum77 members who have initiated more than 5 posts (2,848 users, 32,345 initiating posts).


[Figure 8.5: Sankey diagram with three columns. First post: Using 48%, Withd. 44%. Relapse 48%, No relapse 52%. Last post: Using 17%, Withd. 37%, Recov. 46%.]

Figure 8.5: Aggregated user transitions from start to end state. Bar widths denote population proportion. For example, 48% of users in our sample relapsed during their tenure on Forum77.

Do users tend to recover on Forum77? Overall, users progress towards recovery during their tenure.

Figure 8.5 shows the distribution over start state, relapse, and end state for the 2,848 users described

above. Most users first initiate contact on the forum when they are USING (48%), followed by

WITHDRAWING (44%). In contrast, only 17% of users are USING by the time of their last post, while 37% are

WITHDRAWING and 46% are RECOVERING.

Does relapsing hurt recovery likelihood? Roughly half of users experience a relapse during their

tenure. Users who experience no relapse are significantly more likely to end in RECOVERING than users

who relapse (53.4% vs. 44.4% end in RECOVERING, χ²(1) = 55.1, p < 0.001). Despite this, RECOVERING

is still the most likely end state for Forum77 users who relapse.

Are relapses associated with longer tenure? Given the documented prevalence of relapse [103,

227], the observation that more than half of the users in our data set experience no relapse is surprising.

Analyzing tenure values reveals that the average tenure of no relapse users is 128 days, compared to

418 days for users who relapse. One hypothesis is that users with no observed relapse do relapse, but only

after leaving the forum, and do not return.

What differentiates users who are ultimately RECOVERING? We define a user as active if she initiated a post on the forum in the last 45 days of our data set, and we exclude active users from this analysis. We then analyze the remaining users’ global activity characteristics (Table D.3) aggregated over their USING and WITHDRAWING posts (RECOVERING posts are omitted as this is the phenomenon that we are studying). Table 8.5 shows the results.


Table 8.5: Comparison of activity features for users who are and are not RECOVERING in their last initiating post. Per-user values are aggregated over USING and WITHDRAWING posts. Statistical significance is determined using Kruskal-Wallis tests (*** p < 0.001) after Bonferroni corrections.

                                                 RECOVERING                      not RECOVERING
Activity Characteristic                    p     Mean    Med.    IQR     MAD     Mean    Med.    IQR     MAD
# initiating posts authored                ***   8.99    5       8       4.44    9.89    6       6       2.96
# self responses authored                  ***   19.56   8       16      10.37   17.04   9       16      8.89
# responses authored                       ***   45.56   9       31      13.34   33.81   8       24      10.37
# initiating posts / # responses authored  ***   0.73    0.50    0.76    0.44    1.04    0.67    0.83    0.49
Days since last init.                      ***   16.39   3.33    12.41   3.95    27.05   8.30    28.36   10.53
Days since last self-response              ***   17.47   3.00    13.38   3.95    29.53   8.29    31.45   10.81
Days since last response                   ***   15.92   1.66    7.32    2.47    25.30   4.37    21.75   5.99
Days since last activity                   ***   14.11   1.80    6.09    1.90    20.94   4.80    20.09   5.79
# self responses                           ***   1.93    1.50    1.64    1.19    1.83    1.50    1.50    1.11
# replies received                         ***   5.63    5.00    3.40    2.37    5.56    4.83    3.30    2.29
# respondents                              ***   4.09    3.83    2.00    1.60    4.01    3.70    2.03    1.42
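The summary columns of Table 8.5 can be computed per feature as sketched below. Two points are assumptions rather than statements from the text: the quartile convention (here the "inclusive" method of Python's `statistics.quantiles`), and the scaling of the MAD by the normal-consistency constant 1.4826. The latter is inferred from the count-valued rows, whose MAD entries are close to integer multiples of that constant (e.g., 2.96 ≈ 2 × 1.4826 and 4.44 ≈ 3 × 1.4826). The input data in the example are invented.

```python
import statistics

# Normal-consistency constant; an assumption inferred from the table values.
MAD_SCALE = 1.4826


def summarize(values):
    """Return mean, median, IQR and scaled MAD for one activity feature."""
    med = statistics.median(values)
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    mad = MAD_SCALE * statistics.median(abs(v - med) for v in values)
    return {
        "mean": statistics.mean(values),
        "median": med,
        "iqr": q3 - q1,
        "mad": mad,
    }


# Hypothetical per-user counts of initiating posts (illustrative only).
example = summarize([1, 2, 3, 5, 6, 8, 13, 21, 34])
print(example)
```

Because the median and MAD are robust to the heavy right tail typical of activity counts, they are a useful complement to the mean, which explains why the table reports both.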

Users who leave the forum in a state of RECOVERING are significantly more engaged in forum activity, even when they are USING and WITHDRAWING. The average time lapse between any form of activity (initiation, self-response and response) is about 30% shorter for those who are RECOVERING when they leave. Moreover, their activity is focused outwardly on other community members: users who are RECOVERING author, on average, about 35% more responses than those who are ¬RECOVERING (average 45.6 vs. 33.8), but author slightly fewer initiating posts (average 9.0 vs. 9.9). These results resonate strongly with prior work on AA that finds that both active participation in AA and explicitly focusing on helping other members correlate with sustained abstinence [190,223].
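The significance markers in Table 8.5 come from Kruskal-Wallis tests with Bonferroni corrections. As a rough illustration of that procedure for two groups, the sketch below computes the H statistic with average ranks and the usual tie correction, obtains a p-value from the chi-square approximation with one degree of freedom, and applies a Bonferroni adjustment. It is a self-contained approximation written for this illustration, not the code used in the study, and the sample data are invented.

```python
import math


def kruskal_wallis_2(a, b):
    """Kruskal-Wallis H and p-value for two independent samples (lists)."""
    pooled = sorted(a + b)
    n = len(pooled)
    rank_of, tie_term, i = {}, 0, 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    rank_part = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in (a, b))
    h = 12 / (n * (n + 1)) * rank_part - 3 * (n + 1)
    h /= 1 - tie_term / (n ** 3 - n)  # undefined if all values are identical
    h = max(h, 0.0)  # guard against tiny negative values from rounding
    p = math.erfc(math.sqrt(h / 2))  # chi-square survival function, 1 d.o.f.
    return h, p


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Invented example: two clearly separated groups of five values each.
h, p = kruskal_wallis_2([1, 2, 3, 4, 5], [6, 7, 8, 9, 10])
adjusted = bonferroni([p, 0.2, 0.0001])
```

With eleven features in Table 8.5, a Bonferroni correction of this kind would require a raw p-value below roughly 0.001 / 11 for a feature to retain the *** marker, assuming one test per feature.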

8.8 Discussion

Our motivating goals were to study phases of addiction as they manifest on Forum77 and to analyze

the forum’s effectiveness in promoting recovery. In this section, we discuss Forum77’s efficacy as a tool

for supporting users through withdrawal, relapse and sustained recovery, drawing on post excerpts to

contextualize our findings. We then discuss how our results might inform future interface design, before

touching on potential implications for addiction treatment.


8.8.1 Use and Efficacy of Forum77

Supporting Withdrawal: Our results suggest that Forum77 is an effective tool for helping users through

opioid withdrawals and physical detoxification. In general, users progress more often than they regress

(Figure 8.3), and these local progressions translate into a global trend of many users reaching a state of

RECOVERING during their tenure. When first initiating a post, 48% of users are USING, 44% WITHDRAWING

and 8% RECOVERING; in their most recent initiating post, however, only 17% of users are USING,

37% are WITHDRAWING and 46% are RECOVERING, despite the fact that almost half of the population

experiences a relapse (Figure 8.5). If we interpret our results as a 46% success rate on users’ final

detoxification attempt before leaving the forum, this is an improvement over self-detoxification success

rates reported in the addiction literature [102, 184]. We must be cautious here, however, as we are

comparing across different study designs.

Forum77’s efficacy at supporting detoxification may be attributable, in part, to both the strong social

support and the detailed information on withdrawal that members receive from each other. Both of these

factors have been shown to improve withdrawal outcomes [102, 106, 184], and qualitative remarks from

users suggest that Forum77 meets the mark on both. “I have tried to cope by myself for too long. Its

so hard to deal with something like addiction by your self”, wrote one user. “[T]here is so much support

and advice on getting through this and addiction I am living proof it works!!!!!!”, and “i was on here once

before and was able to achieve 9 months of sobriety due to the support i had here and from meetings.”

remarked others. In other cases, simply discovering a supportive community might galvanize a cessation

attempt: “up until 3 weeks ago, I had no intentions of quitting, i was just looking to find some stuff on

addiction...and i just happened to run across this forum...”.

Relapse and Shame: Although RECOVERING is the most likely state for users to have reached by the

end of their tenure (Figure 8.5), we do not know whether they maintain this state upon

leaving. It is possible that the same strong support network that helps users through detoxification

deters them from wanting to admit a relapse. Quantitatively, although almost half of our sample relapsed

(Figure 8.5), we rarely observed posts in which users reported a relapse immediately after the fact

(Table 8.1).

The hypothesis that users are too ashamed to admit relapse until they implement a renewed attempt

to quit is qualitatively well supported. Statements such as “I suck!! I am so sorry, I’ve been too

embarrased too admit I fell off the proverbial wagon around Christmas.” are common. Others, such as


“haven’t posted in a few weeks because, of course, i slipped up and am ashamed. but now i am back

on track with the sub” and “Im in day 3 of detox, i was too embarassed to post the first 3 days...” echo

these sentiments, and suggest that some users feel that a new detoxification effort is required as proof

of commitment before returning to the community.

Supporting Sustained Recovery: Without observing users’ behavior outside the forum, we cannot

quantify Forum77’s effectiveness at supporting long term recovery. Qualitatively, however, some users

feel that this is something that Forum77 could improve upon. One user summarizes: “I wonder if there

is not a need for a forum community for long-term support. This community is great, but is skewed

towards the short-term wd symptoms and getting through the initial physical pain of wd.”. Also prevalent

are observations that the forum does not sufficiently prepare users to handle post-acute withdrawal

syndrome (PAWS): “I wish people would warn others about this PAWS thing”, wrote one user. “i was

doing so good i made it to about 100 days sober ... the PAWS really got me”, expressed another.

Moreover, users who return to Forum77 after some time may find that their support network has moved

on. One user who was struggling not to relapse asked “Where are all of the friends i made here that I no

longer see?!?”.

Other users, however, give qualitative evidence in support of Forum77’s efficacy at aiding sustained

recovery. “I have not posted much lately but continue to log on and read ppls posts and I believe that

is a key aspect in my recovery”, states one user. Another wrote “when I get a craving I come here

and read, even if I read it before, it helps me think of what I went through what I’m going through and

how others cope”. We found that higher engagement, in the form of activity levels and volumes of

responses contributed, correlate with the chances of a user being in a phase of RECOVERING by the end

of her tenure. Extending this idea, one possibility is that remaining engaged with the forum (even in the

form of “lurking”) after reaching a state of RECOVERING helps to prevent relapses in a similar way that

continued participation in AA correlates with longer periods of sobriety [190,223]. A deeper analysis into

the mechanisms through which Forum77 does and does not support long-term recovery is an important

topic for future work.

8.8.2 Implications for Forum Design

Computational tools for automatically identifying addiction phases, relapses, and whether a user’s tenure

ends in RECOVERING could prove valuable to communities like Forum77. One question commonly asked


by users is what to expect when they quit their drug of choice, and having access to this information has

been shown to improve the chances of a successful cessation attempt [106]. Using phase sequence

data labeled by our CRF classifier, users could set realistic expectations by exploring patterns based

on thousands of others’ prior experiences. Having a realistic perspective of the process of relapse and

recovery may also reduce the number of instances in which users feel too embarrassed or ashamed to

return to Forum77 after relapsing. Finally, exposing such data could help people find others who exhibit

similar patterns to their own. Finding “people like me” is one of the primary stated reasons for user

participation in online health communities [90].

While Forum77 appears to promote detoxification effectively, we observed that users have mixed feelings about how well it supports sustained recovery. It is possible that this could be addressed by altering community dynamics. For example, as we suggested above, continued participation in Forum77 after reaching RECOVERING might help users achieve sustained recovery. Efforts focused on decreasing user churn and increasing member retention could support this. Alternatively, in a similar vein to AA’s sponsorship program, which is thought to promote sustained recovery [237], we might consider automatically matching newcomers with long-term members who would act as formal mentors (or sponsors). Finally, it is possible that the community dynamics that support detoxification are different from those that would support sustained recovery. In this case, a forward reference to a different community might help RECOVERING Forum77 users plan what to do next.

8.8.3 Implications for Addiction Treatment

Forum77 accrues, at scale, information that is difficult to acquire through formal medical channels. First,

abusing prescription drugs usually entails deceiving one’s doctor. Second, addiction research data are

typically acquired at point-of-care facilities (e.g., emergency rooms) or surveys at high schools or colleges.

Although the ethics and privacy of such analyses must be carefully considered, it is possible that

data extracted from sites like Forum77 (e.g., CRF-based transition frequencies, recovery trends, etc.)

could help medical professionals and policy makers better understand patients’ experiences with drug

abuse. For example, insight into the day to day difficulties of opioid-assisted withdrawal might inform

policy for improving the management of this popular treatment down the road. It is also possible that

research like ours could illuminate poorly understood aspects of addiction: to our knowledge, ours is the

first attempt to quantify the cycle of addiction.


8.8.4 Limitations

One limitation of this work is the selection bias of our subjects: users who come to Forum77 are likely

already open to (or at least, considering) the possibility of quitting. This problem is well known to those

hoping to analyze the efficacy of Alcoholics Anonymous [20]. As such, care should be taken in applying

our results to a more general population who misuse prescription medication. We cannot assume, for

example, that a random sample of people who misuse prescription medication would similarly progress

towards recovery if they were asked to participate in Forum77. We also cannot draw epidemiological

conclusions that apply to the population as a whole from these data. However, the size of Forum77,

the prevalence of the opioid epidemic, and the increasing popularity of online health communities alone

make the forum worth studying.

Another limitation is the acceptable, but still improvable, accuracy of our CRF classifier. While we were able to use CRF-based sequences to identify, with high accuracy, both relapse and whether a user’s final post was written when she was RECOVERING, improving our underlying classifier performance would open up more nuanced analyses. Finally, having page view data would allow us to incorporate measures of passive participation (“lurking”), which would add a new dimension to our study. We hope to address such opportunities in future work.

8.9 Summary

Our goal in this chapter was to analyze the process of opioid withdrawal, recovery and relapse on Forum77, MedHelp’s Addiction: Substance Abuse community. Drawing on the addiction literature, we first give an overview of prescription drug abuse and introduce key concepts and terminology (§ 8.2). Next, using Prochaska’s Transtheoretical Model for behavior change, we develop a taxonomy of phases of addiction that comprises three main categories: USING, WITHDRAWING and RECOVERING (§ 8.4). The majority of initiating posts are authored when users are WITHDRAWING. We then analyze linguistic and behavioral features across the USING, WITHDRAWING and RECOVERING phases. Several significant differences characterize each phase (§ 8.5), and we leverage these results to train a CRF model to automatically annotate users’ phase sequences (§ 8.6). We can identify relapse events, and whether a user was RECOVERING when she authored her final post, with high accuracy from our CRF-annotated sequences (§ 8.7).


Applying our classifiers to 2,848 users (§ 8.6.3 and § 8.7.3) reveals that progressive transitions towards

RECOVERING are much more prevalent than regressive transitions. Moreover, despite the fact that

almost 50% of users relapse during their tenure, leaving Forum77 in a state of RECOVERING is the most

probable outcome for all users. Finally, we find that increased participation in the community correlates

with a user RECOVERING by the end of her tenure: users who are RECOVERING by their final initiating

post are significantly more engaged with the community when they are USING and WITHDRAWING than

users who are ¬RECOVERING by their final initiating post.

To our knowledge, ours is the first work to investigate the efficacy of online mutual help groups for

prescription drug abuse. Our results, which help to illuminate a previously poorly understood resource,

suggest that Forum77 is an effective detoxification aid. Based on our findings, we also highlight several

ways in which Forum77 might be enhanced to better support its users (§ 8.8), such as exposing aggregate

user data describing the cycle of addiction, or matching newcomers with sponsors. Finally, as the

type of information shared on Forum77 is difficult to acquire at scale through traditional channels, we

note that the tools and insights presented here may be of use to the addiction research community.

Chapter 9

Conclusion

This dissertation presents both methods for automatically extracting medically-relevant data from patient authored text (PAT) and insights derived through the application of these methods. In concert, our contributions underscore PAT’s latent potential for illuminating poorly understood or clandestine medical topics that may be invisible to traditional medical data collection, and offer viable methods that dramatically improve our ability to realize this potential. In this final chapter, we reiterate the contributions of this thesis (§ 9.1) and present principal opportunities for future research (§ 9.2) before offering concluding thoughts (§ 9.3).

9.1 Contribution Summary

Our work is predicated on the observation that despite being both abundant and uniquely valuable,

patient authored text (PAT) is a heavily underutilized health data resource. In Chapter 2 we present

an overview of prior work describing online health seeking behavior and, more specifically, online health

community (OHC) participation. Synthesized via a cross-disciplinary literature review, this chapter serves

to illuminate how people use the Internet as a health resource. In Chapter 3 we present a novel review

of prior work that utilizes PAT as a primary data source. We discuss goals, data sources, methodological

approaches and outcomes, providing a contextual background against which to interpret and evaluate

the rest of our work. To our knowledge, this review is the first such synthesis of prior work focused on

extracting value from PAT.

The development of ADEPT (Chapter 5) – our CRF classifier that automatically identifies medically-relevant terms in PAT – was prompted by our observation that existing biomedical term annotation toolkits perform poorly on PAT. While statistical classifiers present an attractive alternative, acquiring large, expert-annotated PAT corpora on which to train and test them is a major challenge. To this end, we show that a crowd of non-experts yields annotations comparable in quality to experts’ for the PAT medical term identification task. Our result offers an alternative method for acquiring large annotated PAT corpora both quickly and cheaply. However, our task design failed to yield similar quality results for more specific PAT annotation tasks (e.g., identifying all symptom terms). This underscores the tradeoff between designing crowdsourcing tasks and annotating the data oneself. Applying ADEPT to large PAT corpora yields high-level insights useful for summarization and hypothesis generation; however, the tool is too broad for fine-grained analysis. For higher-resolution insights, we narrow our focus to the topic of addiction: a highly prevalent but stigmatized medical condition.

Understanding why people author PAT is crucial for matching it with appropriate research questions.

In Chapter 6, we investigate users’ motivations for participating in Forum77: MedHelp’s Addiction: Substance

Abuse community. Our thematic analysis over initiating posts concurs with prior work stating that,

in general, people seek both informational and emotional support from OHCs. However, our analysis

also reveals distinct sub-categories of these two kinds of support. Of particular interest is the update:

a prevalent emotional support seeking post in which the user does not explicitly request a community

response. We train two logistic regression classifiers: the first distinguishes emotional from informational

support-seeking posts; the second, update from non-update posts. Applying these to the entire Forum77

data set reveals that update posts garner slightly more responses on average than non-update posts.

The prevalence of update posts suggests that users value the forum as a place where their personal

progress can be witnessed by others and recorded for posterity. Forum77 also serves as a repository

for information on opioid withdrawal. In fact, Thomas’ Recipe, a protocol for medication-assisted opioid

withdrawal that evolved on Forum77, suggests that Forum77 users actively collaborate on developing

effective treatment protocols.

In Chapter 7 we investigate the distribution of drugs of choice (DOCs) in the Forum77 population. A close reading indicates that identifying DOCs is a context-sensitive problem, as a variety of substances can serve as either the object of addiction or a treatment. A CRF classifier trained on manually annotated data is able to identify DOCs with high accuracy. Our resulting analysis, which compares the Forum77 DOC distribution to those of other drug-using populations, reveals that the Forum77 population struggles disproportionately more with prescription opioids, and disproportionately less with traditionally abused substances such as alcohol, marijuana and cocaine. While it is difficult to ascertain whether Forum77 reflects real-world drug use trends, our results do suggest that Forum77 represents a population of drug users that is not well covered by existing monitoring systems.


Finally, in Chapter 8, we analyze the process of opioid withdrawal, recovery and relapse on Forum77. Through a thematic analysis, we develop a taxonomy describing phases of addiction based on Prochaska’s Transtheoretical Model for behavior change. Phases of addiction are accompanied by distinct physiological and psychological changes, and this is mirrored in users’ usage of the site: exploring activity and linguistic features from posts across the phases USING, WITHDRAWING and RECOVERING reveals several significant differences. We leverage these differences to train a sequence-based CRF model to annotate users’ phase sequences automatically. We can also identify relapse events from these sequences, as well as whether a user’s final post was made in a state of RECOVERING, with high accuracy. Our resulting analysis of all Forum77 users’ transition sequences indicates that despite the fact that relapse is common, leaving the forum in a state of RECOVERING remains the most probable outcome. Moreover, we show that high engagement with the community correlates with the probability of a user RECOVERING by her last initiating post on the forum. Overall, these results suggest that Forum77 is an effective detoxification aid. To our knowledge, this work is the first that attempts to quantify the phases of addiction and the transitions between them.

9.2 Future Work

Given the considerable enthusiasm currently surrounding health-related technology, our

contributions present a timely foundation and reference. However, many limitations to realizing the full

value of PAT remain. In this section, we articulate key opportunities for future research.

9.2.1 Supporting the Methodological Process

Figure 9.1 (replicated from Chapter 1) illustrates the stages of our methodological process for extracting insights from PAT. At present, most of the stages in the main process (top row) must be cobbled together in an ad-hoc fashion by the researcher. This hurts efficiency and replicability, and makes comparison between studies difficult. Developing this process into more of a standardized pipeline would enable closer synergy between disparate research efforts, and make it easier to identify quality results. We suggest several areas for improvement below.


[Figure 9.1: flow diagram. Node labels: PAT, Content Schema, Labeled Data (human), Features, Classifier, Labeled Data (auto), Processed Data, Insights, Medical Discovery, PAT interface design. Edge labels: close reading, annotation, training, application, processing & analysis, schema revision, tuning. Legend: Future Work.]

Figure 9.1: Our general methodological process. Nodes in grey show avenues for future work supported by our contributions.

Interface Support for Thematic Analysis

Thematic analyses are frequently used to develop deep insights into text-based corpora and to inform future analyses. Moreover, as we note in Chapters 6 and 8, not only do the results of thematic analyses stand as their own qualitative contribution, they also indicate junctions at which we may shift from a close reading to a large-scale, automated analysis. In spite of their complexity and importance, there is no interface support for thematic analyses: provenance of this iterative process is never recorded; reasons (and supporting examples) for making particular decisions about categories are lost; and the clustering, combining, and splitting of categories is done primarily in the researchers’ working memories. Based on our own experience, a starting point for interface support would provide visual “sand boxes” for comparing and organizing data elements into categories; support flagging items that either especially support, or especially challenge, the proposed taxonomy; and facilitate the easy expression of categorization rules. Aside from making thematic analyses more efficient and consistent, externalizing the process in this fashion would make resulting taxonomies easier for a third party to verify, compare against and reuse.

Improved Tools for Annotation

Related to the matter of interface support for thematic analyses is interface support for data annotation.

In our work, we conducted this process primarily through the use of shared spreadsheets. While this

makes data output easy, it hinders comparison between non-adjacent data elements; does not support


the capture of spontaneous updates to annotation rules that arise from encountering novel examples;

and only weakly supports collaboration between annotators. Examples of features that an annotation interface

might provide include visual support for clustering and comparing data elements; automatic label

suggestions based on underlying text analytics; iterative updating of annotation rules in response to new

data elements; and automatic evaluation of inter-annotator agreement that facilitates rapid exploration of

agreements and errors. Not only would such an interface make the annotation process faster and more

consistent, but it may also encourage standardization in annotation and reporting practices.
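As an illustration of the "automatic evaluation of inter-annotator agreement" feature mentioned above, the sketch below computes Cohen's kappa for two annotators and lists the items on which they disagree. Cohen's kappa is one common agreement statistic; the text does not say which measure is intended, and the function names and labels here are hypothetical.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators' label lists."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Agreement expected by chance, from each annotator's label frequencies.
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label
    return (observed - expected) / (1 - expected)


def disagreements(labels_a, labels_b):
    """Indices of items that the two annotators labeled differently."""
    return [i for i, (a, b) in enumerate(zip(labels_a, labels_b)) if a != b]


# Invented phase labels (U = USING, W = WITHDRAWING, R = RECOVERING).
annotator_1 = ["U", "W", "R", "W", "U", "R"]
annotator_2 = ["U", "W", "R", "R", "U", "W"]
kappa = cohens_kappa(annotator_1, annotator_2)
to_review = disagreements(annotator_1, annotator_2)
```

An interface could recompute such a statistic after every annotation batch and present the disagreeing items first, which is one way to support the rapid exploration of agreements and errors described above.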

Mapping the Limits of the Crowd in PAT Annotation Tasks

In Chapter 5 we showed that the crowd can replace medical experts for some PAT annotation tasks. However, correctly designing crowdsourcing tasks is sufficiently time consuming that in subsequent chapters, we elected to annotate our data manually. Exploring the crowd’s ability to perform a variety of PAT annotation tasks nevertheless remains a crucial avenue for future work. Without it, it would be difficult to scale analyses such as ours to larger forums or to multiple data sets. More importantly, reliable crowd annotation would make it easier to create and share large, labeled corpora within the research community. Due to our data sharing agreement with MedHelp, we were unable to share any of our labeled data sets. However, making a large, labeled PAT corpus available to the public would be the most direct way to stimulate research on these topics.

9.2.2 PAT Interface Design & Support

Despite the popularity of online health communities (OHCs), their general structure has barely changed since the late 1990s. However, both insights and classifiers derived through the PAT analysis pipeline could prove valuable if incorporated into OHCs. As we show in Figure 9.1, closing this loop may create a virtuous cycle, in which interface improvements yield higher volumes and quality of PAT. This, in turn, would lead to more fine-grained insights and improved classifiers. While we do not implement any interface changes in this work, we have several suggestions.

Expose Aggregate Data to Users

OHC participants spend hours doing tasks that often amount to simple aggregation, such as calculating

treatment popularity, establishing what Forum77’s most popular DOC is, and estimating the probability

of a successful detoxification attempt conditional on a specific withdrawal method. This is inefficient: not


only are OHCs difficult to navigate for these sorts of tasks, but often many users will conduct identical

analyses at different points in time. In the best case, exposing such data to users could alleviate users’

need to reinvent the wheel for each analysis, freeing their time for alternative tasks.

Support Data Entry

One critique of PAT is that it is often incomplete in terms of containing all relevant medical information.

Nudging users towards providing more complete accounts of their conditions would enrich our analyses

and enhance PAT’s credibility as a data source. One example is “symptom autocomplete”: rather than

relying on users to remember and list all of their symptoms (some of which may not even be severe

enough to notice), it would be relatively straightforward to automatically suggest (or “autocomplete”)

symptoms based on the ones already entered.
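
One simple way to realize symptom autocomplete is to rank candidate symptoms by how often they co-occur with the symptoms a user has already entered. The sketch below assumes symptoms have already been extracted from earlier posts; `past_reports`, `build_cooccurrence`, and `suggest` are hypothetical names used only for illustration.

```python
from collections import Counter
from itertools import permutations

# Hypothetical symptom sets, one per previously written post.
past_reports = [
    {"nausea", "insomnia", "sweating"},
    {"nausea", "insomnia", "anxiety"},
    {"insomnia", "anxiety"},
    {"nausea", "sweating"},
]

def build_cooccurrence(reports):
    """Count, for each symptom, how often every other symptom appears with it."""
    cooc = {}
    for symptoms in reports:
        for a, b in permutations(symptoms, 2):
            cooc.setdefault(a, Counter())[b] += 1
    return cooc

def suggest(entered, cooc, k=3):
    """Rank symptoms not yet entered by total co-occurrence with the entered ones."""
    scores = Counter()
    for symptom in entered:
        for other, count in cooc.get(symptom, {}).items():
            if other not in entered:
                scores[other] += count
    # Sort by descending score, then alphabetically so ties are deterministic.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [symptom for symptom, _ in ranked][:k]

cooc = build_cooccurrence(past_reports)
suggestions = suggest({"nausea"}, cooc)
```

A deployed version would more plausibly draw its counts from classifier output over the whole forum, and could weight suggestions by condition or drug.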

Automatically Construct User Timelines

Personal timelines are commonplace in social media and the quantified self movement. Our work on

users’ reasons for participating in Forum77 (Chapter 6) indicates that they value its archival features.

Making it easier for users to browse their histories, especially histories enhanced with structured data

provided by classifiers, could facilitate an array of tasks, from discovering behavioral patterns to finding

other “people like them”. Quantitative uses aside, a timeline comprises a narrative of important life

events, failures, and accomplishments that would have strong emotional significance to users. Given the

chance, it is likely that users would take it upon themselves to curate their own timelines: a situation that

could be leveraged to have users label their own data.

9.2.3 Making the Leap to Medical Discoveries

Our work adds to a growing body of evidence that medically relevant insights can be extracted automatically

from PAT. However, the holy grail is to move from medical insights to actionable medical discoveries. In

our own work, efforts along these lines might include extending our work on identifying drugs of choice

(Chapter 7) to support real-time identification of new drugs, or extending our work on phases of addiction

(Chapter 8) to prove that participation in Forum77 measurably reduces the number of relapses that

someone experiences. However, making such leaps is nontrivial. Challenges include understanding how

signals in PAT correspond to real-world trends, in spite of the fact that PAT rarely contains demographic

data; clinically verifying results, which is both slow and expensive; and developing new experimental


designs that are compatible with online health seeking behavior. Such challenges could only be met

through a close-knit collaboration with medical professionals who agree that PAT is a valuable data

source.

9.3 Concluding Remarks

Patient authored text is the abundant byproduct of hours of human intelligence spent on complex, health-

related problem solving tasks. As long as this valuable resource is underutilized, researchers, patients

and medical professionals alike will be deprived of the unique insights and benefits that it has to offer.

Although this dissertation takes a step towards leveraging some of the considerable work that patients do

in managing their own health, this is only the tip of the iceberg: we anticipate a future in which technology

creates, supports and encourages synergy between patients, providers and data.

Appendix A

ADEPT Supplementary Material

Table A.1: The following features are specified when training our CRF. Other features retain their default values as described at http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ie/NERFeatureFactory.html

Property Name Type Value Description

useClassFeature boolean TRUE Include a feature for the class (as a class marginal). Puts a prior on the classes which is equivalent to how often the feature appeared in the training data.

useWord boolean TRUE Gives you feature for w

useNGrams boolean TRUE Make features from letter n-grams, i.e., substrings of the word

noMidNGrams boolean TRUE Do not include character n-gram features for n-grams that contain neither the beginning nor the end of the word

useDisjunctive boolean TRUE Include in features giving disjunctions of words anywhere in the left or right disjunctionWidth words (preserving direction but not position)

maxNGramLeng int 7 If this number is positive, n-grams above this size will not be used in the model

usePrev boolean TRUE Gives you feature for (pw,c), and together with other options enables other previous features, such as (pt,c) (with useTags)

useNext boolean TRUE Gives you feature for (nw,c), and together with other options enables other next features, such as (nt,c) (with useTags)

useSequences boolean TRUE

usePrev boolean TRUE

useNext boolean TRUE

maxLeft int 1 The number of things to the left that have to be cached to run the Viterbi algorithm: the maximum context of class features used.

useTypeSeqs boolean TRUE Use basic zeroeth order word shape features.

useTypeSeqs2 boolean TRUE Add additional first and second order word shape features

useTypeySequences boolean TRUE Some first order word shape patterns.

wordShape String chris2useLC Either none for no wordShape use, or the name of a word shape function recognized by WordShapeClassifier.lookupShaper(String)


Appendix B

F77 Purpose Supplementary Material

Table B.1: Features used to train our purpose classifiers, which distinguish emotional from informational support seeking, as well as update from non-update posts.

Feature Name Description

containsQuestion whether the post contains a question (binary)

numQuestions number of questions the post contains

unigrams all words present in the post

bigrams all bigrams (two consecutive words) present in the post

timeMentioned number of days clean time (if mentioned). Extracted using the following two patterns:

X:NUM (day|days|week|weeks|months|month|year|years) (clean|off)

on? "day|days" X:NUM

where NUM is any number and "|" represents the OR operator. We then convert weeks/months/years to days and use the number of days as the feature value. The default value is 0.

numPosWords number of words with a positive sentiment score in SentiWordNet of > 0.8

numNegWords number of words with a negative sentiment score in SentiWordNet of > 0.8

daysMentioned whether the user mentions a number followed by the term “day” or “days”

days since last initiating post the number of days since the user’s last initiating post
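
The two timeMentioned patterns in Table B.1 can be approximated with regular expressions. The sketch below is not our original extraction code: it assumes that NUM is a whole number written in digits, and the conversion factors in `DAYS_PER_UNIT` (7, 30 and 365 days) are assumptions, since the table does not state how months and years are converted.

```python
import re

# Assumed conversion factors; the table does not specify them.
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30, "year": 365}

# Pattern 1: NUM (day|days|week|weeks|month|months|year|years) (clean|off)
CLEAN_TIME = re.compile(
    r"\b(\d+)\s+(day|week|month|year)s?\s+(?:clean|off)\b", re.IGNORECASE
)
# Pattern 2: optional "on", then "day" or "days", then NUM
DAY_COUNT = re.compile(r"\b(?:on\s+)?days?\s+(\d+)\b", re.IGNORECASE)

def time_mentioned(post):
    """Return the clean time in days, or 0 (the default) if neither pattern matches."""
    match = CLEAN_TIME.search(post)
    if match:
        return int(match.group(1)) * DAYS_PER_UNIT[match.group(2).lower()]
    match = DAY_COUNT.search(post)
    if match:
        return int(match.group(1))
    return 0
```

For example, a post containing "3 weeks clean" would yield 21 under these assumptions, and a post with no matching phrase falls back to the default of 0.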


Appendix C

F77 Drug of Choice Supplementary Material

Table C.1: Drug term resolution map, manually compiled from classifier output. The i column indicates whether the drug category is included in our analysis in Chapter 7.

Category Drug name Resolved drug terms i

alcohol alcohol acholic, acoholic, alcahol, alchohol, alchol, alcholo, alcohol, alcoholic, alcoholoc, alcolhol, alcololic, alocholic, alocohol, champagne, beer, beers, vodka, wine, drink alcohol, drink beer, drinking alcohol, drink wine, drinking beer, drinking wine, drinks, beer bottles, beer drinking, alcohol drinking, alcohol drinks, alcoholic drink, drink, drinking ◦

cigarettes cigarettes cigarettes, cigaretts, cigarrettes, cigars, cigerattes, ciggaretes, ciggarettes, ciggaretts, ciggerettes, ciggies, ciggs, cigrattes, cigs, smoke, smoke cigarets, smoke cigarettes, smoke cigs, smoked, smoker, smokes, smokes cigarettes, smokes ciggaretts, smokes cigs, smokin, smokin cigs, nicotine, smoking cigarettes, smoking, smoking cigs ◦

cocaine cocaine cocaine, cocain, cocaine, cocane, coccaine, cociane, coke, coaine, powder, smoke cocaine, smoke coke, smokin coke, smoking coke, smoking crack, smoking crack cocaine ◦

hallucinogens hallucinogens hallucinogens, mescaline ◦

psilocybin mushroom psilocybin mushroom, mushrooms, shrooms, psychedelics ◦

heroin heroin heroin, herioin, herion, heroin, heroin cocaine, heroine, smoking heroin, smack, smoke heroin, heroin heroin, heroin smoking ◦

marijuana marijuana marijuana, marajuana, marihuana, marijanna, marijauna, marijuan, marijuana, marijuana smoker, marijuanna, marijuanna smoker, marjuana, marjuana smoke, pot, pot brownies, pot smoke, pot smoker, pot smokers, pot smokin, weed, weed smoker, smoke marijuana, smoke marijuanna, smoke pot, smoke weed, smoked pot, smokes marijuana, smokes pot, smokes weed, smokin pot, smoking marijuana, smoking pot, smoking weed, dope, pot smoking, smoking weed, smoking dope, smoke dope, hash, hashish, smoked weed, smokin dope, marijuana smoke, marijuana smoked, marijuana smoking ◦

methadone methadone methadone, mehadone, mehtadone, mehtadone pain killers, metadone, methadoen, methadome, methadon, methadone, methadone pain killers, methadones, methadont, methadose, methandone, methaodne, methatdone, methdaone, methdone, methedome, methedone, methodone, methodone pain pills, methondone, methone, mdone ◦

suboxone sub, suoxone, subbies, subboxin, subboxine, subboxone, subetex, subitext, subloxone, subo, subone, subonoxe, subooxone, subotex, subox, suboxan, suboxe, suboxen, suboxene, suboxens, suboxin, suboxine, suboxins, suboxne, suboxom, suboxome, suboxon, suboxone, suboxones, suboxtone, suboxyn, suboxzone, subozone, subroxone, subs, soboxan, soboxen, soboxene, soboxin, soboxine, soboxion, soboxon, soboxone, soboxones, sabonxon, saboxan, saboxen, saboxin, saboxins, saboxon, saboxone, subtex, subutec, subutek, subutex, subutext, subutox, subuxone, subx, subxone, syboxin, syboxone, symboxin, buprenorphine, buprenorphine, bupenorphine, bupenorphrine, bupernepherine, bupernorphine, bupremorphine, buprenex, buprenophine, buprenorphene, buprenorphine, bupreorphine ◦



opioid codeine codeine, codeiene, codein, codeine, codeine otc pills, codeine painkillers, codeine sulphate, codene, codene pain pills, codien, codiene, codiene painkillers, codiens, codine, codone, coedine, tylenol 3, tylenol3 ◦

dextropropoxyphene dextropropoxyphene, darovcet, darv, darvacet, darvacets, darvaset, darvecet, darvecette, darvicet, darviset, darvo, darvocet, darvocets, darvocett, darvocette, darvon, darvoncet, darvos, darvoset, darvs, darvys, davocet, davort, dextropropoxyphene ◦

dialudid diladid, diladin, diladud, dilantin, dilatin, dilaudad, dilauded, dilaudeds, dilaudid, dilaudin, dilauid, dillauded, dilodid, dilodids, diloted, dilotid, dilotted, diloudid, diluadid, diluadids, diludid, diluidid, hydromorphone, hydromophone, hydromorophone, hydromorphcontin, hydromorphine, hydromorphone ◦

fentanyl actiq, fenatyl, fentaly, fentanol, fentanyl, fentanyl pain patch, fentanyl pain patches, fentayl, fentenal, fentenyl, fentinol, fentnyl, fentora, fentyl, fentynal, fentynal pain patches, fentynl, fentynol, fentynyl, fetynal ◦

hydrocodone hydrocodone, hrdrocodone, hudro, hycodan, hycodne, hydo, hydocodone, hydorcodone, hydors, hydos, hydos-75, hydr, hydracodone, hydrco, hydrcodene, hydrcodone, hydro, hydro codeine, hydro-codone, hydroc, hydrocdone, hydrochodone, hydroco, hydrocod, hydrocodan, hydrocode, hydrocodeine, hydrocoden, hydrocodene, hydrocodien, hydrocodiene, hydrocodin, hydrocodine, hydrocodine pills, hydrocodne, hydrocodon, hydrocodone, hydrocodones, hydrocodons, hydrocondone, hydrocondone pain medication, hydrocone, hydrocordon, hydrocordone, hydrodcodone, hydrododone, hydrodone, hydromet, hydromorphone hydrochloride, hydros, hydrycodone, hyrdo, hyrdocodone, hyrdos, hyrdro, hyrdrocodone, hyrdros, hyro, hyrocodone, hyros, smoke hydro ◦

lortab lortab, loratab, loratabs, loratb, lorcet, lorcets, lorcett, lorecet, lorecets, lorects, loretab, loricet, loritab, loritabs, lorocet, lorocets, lorotabs, lorset, lortab, lortab◦, lortab◦-5, lortabs, lotab, lotabs, loracet, loracets ◦

meperidine meperidine, demerol, demeral, demerol, demoral, demorol ◦

morphine morphine, mophine, moraphine, morhine, morhphine, morhpine, moriphine, morophine, morp, morphane, morpheine, morphen, morphene, morphin, morphine, morphines, morphone, morpine, mscontin, morphine mscontin, morphine sulf, morphine sulphate, ms-contin, avinza, ms contin, oramorph, kadian ◦

norco norco, noco, noraco, norc, norce, norco, norco vicodin, norcos, norcs, nordco, noreco, norko, noroco, norocs, narco, narcos ◦

opiates opiates, opates, opiade, opiants, opiat, opiate, opiates, opiats, opiete, opiets, opiot, opiote, opiotes, opitaes, opitate, opitates, opites, oopiate, opaite, opaite pain meds, opaites, opiate meds, opiate narcotic pain pills, opiate narcotics, opiate pain killer, opiate pain killers, opiate pain medication, opiate pain medications, opiate pain medicines, opiate pain meds, opiate pain pill, opiate pain pills, opiate painkillers, opium, opiads-heroine/percs/hydro, opiate drug, opiate narcotic pain, opiate pain, opiate pain med, opiates percs, opiates vicodin, opiates xanax, oppiates, smoking opium ◦

opioids opioids, opiod, opiods, opioid, opioids, opoid, opoids, opiod drug, opiod narcotic, opioid meds, opioid pain med, opioid pain medications, opioid pain meds ◦

oxycodone oxycodone, roxcodone, roxi, roxicdone, roxicet, roxicets, roxicodne, roxicodone, roxicodones, roxicontin, roxicontins, roxicotin, roxies, roxiodone, roxis, roxocodone, roxy, roxy codone, roxy3, roxy4, roxycet, roxycodine, roxycodone, roxycodones, roxycontin, roxycontins, roxycotin, roxycottin, roxys, oxcodone, oxcontin, oxcotin, oxcy, oxcycodone, oxcycontin, oxcycotin, oxcyontin, oxcys, oxen, oxey, oxeys, oxi, oxicoden, oxicodon, oxicodone, oxicontin, oxicotin, oxicotines, oxicoton, oxie, oxie codine, oxies, oxocodone, oxtcontin, oxxy, oxy, oxy codone, oxy contin, oxy-contin, oxy4, oxy8, oxy8s, oxyc, oxyco, oxycocet, oxycocets, oxycod, oxycode, oxycodeine, oxycoden, oxycodene, oxycodin, oxycodine, oxycodne, oxycodon, oxycodone, oxycodones, oxycodpne, oxycoidone, oxycondin, oxycondone, oxyconin, oxycontiin, oxycontin, oxycontine, oxycontins, oxyconton, oxycontontin, oxycoontin, oxycotdin, oxycoten, oxycotin, oxycotine, oxycotins, oxycotion, oxycoton, oxycotten, oxycottin, oxycottins, oxycotton, oxydocone, oxydodone, oxydone, oxyicodone, oxyies, oxyir, oxynorm, oxys, oxytocin, oxyxodones, oxyz, oycodone, blues, blue pills, ocs, ocycodone, oxy hydro, oxyocs, oxy vics, oxy-norm, oxy/percs/tabs, oxycodone oxycontin, oxycodone pain meds, oxycontin, oxycotontin, smoking oxy, smoking oxycontin ◦

oxymorphone oxymorphone, opana, opanas ◦

percocet percocet, perc, percacet, percacets, percaset, percasets, perccet, percecet, percecets, percet, percets, percicet, percks, perco, percocect, percocet, percocete, percocets, percocett, percocette, percocetts, percocite, percocoet, percoct, percodan, percodone, percoet, percoets, percoset, percosets, percot, percote, percots, percs, perkacet, perkeset, perkocet, perkocets, perks, perocaet, perocet, perocets, persocet, pecocet, pecocets, pers, perts ◦

tramadol tramadol, tradol, tram, tramacet, tramadal, tramadaol, tramado, tramadol, tramadole, tramadols, tramadon, tramal, tramdol, tramedol, tramidol, trammadol, tramodal, tramodol, tramol, trams, tranadol, ulram, ultam, ultracet, ultram, ultrams, ultrm, ultrum ◦

vicodin vics, vicks, vic, vicadan, vicaden, vicadin, vicadine, vicadon, viccodin, viccoding, vicdin, vicdon, vicdone, viciden, vicidin, vicidine, vicidon, vicidons, viciodin, vico, vicodan, vicodein, vicodeine, vicoden, vicodene, vicodens, vicodent, vicodien, vicodine, vicodines, vicoding, vicodins, vicodion, vicodn, vicodon, vicodone, vicodyn, vicoin, vicondin, vicos, vicotin, vidodin, vik, vikcs, vike, vikes, vikoden, vikodin, viks, viocdin, viocidin, viocoden, viodin, vivodin, vivodins, vocidin, vocodin, vicodin ◦

vicoprofen vicaprofen, vicobrofin, vicoprofen, vicoprofin, vicoprohen, vicoprophen, vicoprophin, vicroprofen, vicuprofen ◦

OTC acetaminophen acetaminophen, acetamenophin, acetamenophine, acetaminaphen, acetaminaphin, acetaminophen, acetem, aceteminophen, acetomenophine, acetominophen, acetominophin ◦

benadryl benadryl, benadril, benadryl, benadryll, bendryl, benedryl, benodryl, benydryl ◦

dextromethorphan dextromethorphan, dxm ◦

ibuprofen advil, ibeprofen, ibogaine, ibp, ibprofen, ibprofin, ibprohin, ibprophin, ibu, ibupofen, ibupro, ibuprofen, ibuprofin, ibuprophen, ibuprophin, ibuprophren, ibupropin, mortin, mortrin, motrin, neurofen, neurophen, nurofen ◦

melatonin melantonin, melatonin, meletonin, melitonin, melotonin ◦

naproxen naproxen, aleeve, aleve, aleive, alieve, alleve ◦

nyquil nyquil, nyquill ◦

paracetamol paracetamol, paracetemol, paracetomal, paracetomol, parecetamol ◦

tylenol tyelonol, tyenol, tyl, tylanol, tylenal, tylenol, tylenol oc, tyleonol, tylinol, tylonal, tylonel, tylonol, tylox, tyloxes, tynenol, tyneol, tylenol ◦

sedative alprazolam alpralozam, alprazalam, alprazolam, alprozalam, alprozolam ◦

ativan ativan, adavan, adavant, adavin, adivan, advan ◦

barbiturates barbiturates, barbituates, butalbital, phenobarbital, barbs ◦

benzodiazepine benzodiazepine, benzo, benzocaine, benzodiazapenes, benzodiazapines, benzodiazepams, benzodiazepines, benzodiazpines, benzoes, benzoids, benzos, oxazepam ◦

buspirone buspirone, buspar ◦

chlordiazepoxide chlordiazepoxide, librium ◦

clonazepam clonazepam, klnopin, klodopin, klon, klonapin, klonapins, klonepin, kloni, klonidine, klonipin, klonipin oxycontins, klono, klonoin, klonoipn, klonopan, klonopin, klonopine, klonopines, klonopins, klonpin, klonpion, klonzapam, klopin, kloponin, kolonapin, kolonipin, kolonopin, kolonopins, kpins, clonazepam, clonazepams, clonazepham, clonozepam, clonapin, clonapine, clonopin, clonopine, clonipin, clonipine, clonipins, colonopin ◦

diazepam diazapam, diazapams, diazepam, diazipam ◦

eszopiclone eszopiclone, lunesta ◦

fioricet fioricet, fioricet, fierocet, fioracet, fiorcet, fiorecet, fiorecett, fioricet, fiorocet, fiurecet, floricet, fiorinal, fiorinals, fiorinol, fiornal, fiorinal, fiorinals, fiorinol, fiornal ◦

flunitrazepam flunitrazepam, rohypnol ◦

gabapentin gabapentin, nerontin, neuotin, neuratin, neurontin, neuronton, neurontrin, neurotin, neuroton, neurontin ◦

ghb ◦

lorazepam lorazapam, lorazapan, lorazepam, lorazepan, lorazopam, lorazpam, lorezapam, lorezepam ◦

soma soma, soma pills, somas ◦

valium valium, valiums, vallium, vallum, valuem, valuim, valuims, valum, valume ◦

xanax xaanx, xana, xanac, xanacs, xananx, xanax, xanex, xanix, xannax, xannies, xantax, xantex, xanx, xanxa, xanxax, xnax, xznax, zanax, zanaz, zanex, zanix, zannax, zantac, zanx ◦

zolpidem zolpidem, ambein, ambian, ambiem, ambien ◦

sedatives sedatives, ketamine ◦

stimulant adderall adderal, adderal, adderall, adderalll, adderol, adderral, adderrall, adderrol, addreall, aderol, aderoll, aderrall, dexedrine, dextroamphetamine ◦

amphetamine amphetamine, amphetamines ◦

amphetamine ◦

LSD lsd, acid ◦

mdma mdma, ecstacy, ecstasty, ecstasy, exstacy, extacy ◦





methamphetamine methamphetamine, meth, meth smoker, methamphedamines, methamphetamine, methamphetamines, methamphetimines, methanphetamine, smoking meth ◦

methylphenidate methylphenidate, ritalin, ritilan, ritilin, concerta ◦

modafinil alertec ◦

general general meds, drugs, drug, med

narcotics narcotics, narc, narc meds, narc pain meds, narc painkillers, narcan, narcanon, narcatic, narcatics, narcodics, narcotic, narcotic meds, narcotic pain killers, narcotic pain medication, narcotic pain medications, narcotic pain medicine, narcotic pain medicines, narcotic pain meds, narcotic pain pill, narcotic pain pills, narcotic pain reliever, narcotic pain relievers, narcotic pain-killers, narcotic painkillers, narcotic pills, narcotics, narcotis, narcs, narctoics

painkillers pain pill, analgesic, analgesics, pain meds, pain pills, pain killers, painkillers, pain medication, pain medicine, pain medications, pain relievers, pain kiilers, pain killer, pain killer pills, pain killlers, pain kills, pain kller, painpills, pain med, pain reliever, painkillers hydros, painkliiers, painmeds, pains meds, pill, pills, narcotic pain, pils, ls, pilss, pharmaceuticals, pain

antidepressant amitriptyline amitriptyline, amiltriptyline, amitriptaline, amitripthyline, amitriptyline, amitripyline, amitryptaline, amitryptilline

aripiprazole aripiprazole, abilfy, abilify

citalopram citalopram, celexa, celexia, celxa

duloxetine duloxetine, cymbalta, cybalta, cymalta, cymbalata, cymbalta, cymbalts, cymbata, cymbolta, cynbalta

fluoxetine prozac, fluoxetine, fluoxtine

lexapro

paroxetine paroxetine, paroxetine, paroxotine, paxatine, paxial, paxil, paxill

trazodone trazadone, trazodone

venlafaxine venlafaxine, effexor, efforex, effxor, eflexor

wellbutrin bupropion, buproprio, wellbutrin, welbrutrin, welbutrin, wellbrutrin, wellbutrin

zoloft zoloft, zoloff

NA albuterol albuterol sulphate, albuterol

amoxicillin amoxicillin, amoxcillin, amoxicillin, amoxxillin

antibiotics antibiotics, anitbiotics, anitdepressants, anphetamines, antabuse, antibiotic, antibiotics, antibitics, antibotics

carisoprodol carisoprodol

clonidine clonidine, cloadine, clondine, clonidine, clonine, clonodin, clonodine, colodine, colondine, colonidine, colonodine

cyclobenzaprine cyclobenzaprine, flexaril, flexarill, flexeral, flexerall, flexeril, flexerill, flexerils, flexerol, flexiril, flexirils, flexirl, flexril, flexrill

naloxone naloxone, nalorex, naloxone

naltrexone naltrexone, naltexone, naltraxone, naltrex, naltrexone, naltrexone hydrochloride, naltexone, naltraxone, naltrex, naltrexone, naltrexone hydrochloride

prednisone prednisone, predensone, predinsone, predisolone, predisone, prednisolone, prednison, prednisone, prednizone

pregabalin pregabalin, lyrica

quetiapine quetiapine, seraquel, seraquil, sereoqol, serequel, serequil, serezone, seriquil, seroqel, seroquel, seroquell, seroquels, seroquil, serqual, serquil

steroids steroids, roids

vitamins vitamins, vitaimns, vitamans, vitamians, vitamines, vitamins, vitamns, vitams, vitc, vite, vitiamins, vitiams, vitimans, vitimins, vits, supplements

zaleplon zaleplon, sonata
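
A resolution map such as Table C.1 is typically applied by inverting it into a lookup from each variant spelling to its canonical drug name. The sketch below uses a small excerpt of the table; the `RESOLUTION` dictionary layout and the helper names are illustrative assumptions rather than our actual implementation.

```python
# Excerpt of Table C.1: canonical drug name -> some of its resolved terms.
RESOLUTION = {
    "oxycodone": ["oxycodone", "oxy", "oxycontin", "roxies", "oxycotin"],
    "suboxone": ["suboxone", "subs", "subutex", "soboxone"],
    "marijuana": ["marijuana", "pot", "weed", "smoking pot"],
}

def build_lookup(resolution):
    """Invert the map so that every variant points to its canonical drug name."""
    lookup = {}
    for name, variants in resolution.items():
        for variant in variants:
            lookup[variant.lower()] = name
    return lookup

def resolve(term, lookup):
    """Return the canonical drug name for a classifier output term, or None."""
    return lookup.get(term.strip().lower())

LOOKUP = build_lookup(RESOLUTION)
```

Terms that do not appear in the map resolve to `None`, which makes unseen spellings easy to collect for manual review.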


Table C.2: The default feature list for Stanford’s NER classifier is at nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ie/NERFeatureFactory.html. Here, we list all features whose default values were changed to train our DOC classifier.

Feature Name Feature Value

useTag true
useClassFeature true
useWord true
maxNGramLeng 3
useNGrams true
usePrev true
useNext true
useSequences true
usePrevSequences true
maxLeft 1
useTypeSeqs false
useTypeSeqs2 false
useTypeySequences false
wordShape chris2useLC
useLemmas true
useDistSim true
distSimLexicon We used Twitter word clusters [189] and word clusters generated using the Brown hierarchical word clustering algorithm [32,157] on all MedHelp posts.
useDisjunctive true
disjunctionWidth 3
cleanGazette true
gazette We utilized a dictionary composed from several online lists of commonly misused substances. Table C.3 shows all dictionary terms.


Table C.3: Gazette of common substances used as a feature in the DOC classifier. This gazette was compiled from a range of online resources.

Acamprosate, acid, actiq, adderall, aerosol propellants, alcohol, alprazolam, ambien, amidone, amobarbital, amphetamine, amphetamines, amytal, anadrol, anexsia, angel dust, antabuse, apache, ativan, avinza

Barbs, beer, bennies, bidis, big o, biocodone, biocondone, biphetamine, biscuits, black beauties, black stuff, blue heaven, blues, blunt, buprenorphine, butalbital, butane propane, butorphanol

Cactus, campral, captain cody, carisoprodol, cat valium, chalk, charlie, china girl, china white, chlordiazepoxide, cigarettes, cigars, clarity, clonazepam, clonidine, cocaine, cocaine hydrochloride, codeine, cody, coke, concerta, crack, crack cocaine, crank, crosses, crystal, crystal meth, cubes, cyclohexyl

Damason-p, dance fever, darvocet, darvon, demerol, demmies, depade, depo-testosterone, desoxyn, dexedrine, dextroamphetamine, dextromethorphan, dextropropoxyphene, dextrostat, di-gesic, diacetylmorphine, diazepam, dicodid, dilaudid, dillies, disulfiram, dolophine, dope, downers, duodin, durabolin, duragesic, duramorph, dxm

Ecstasy, empirin, empirin with codeine, equipoise, eszopiclone

Fentanyl, fioricet, fiorinal, fiorinal with codeine, fizzies, flake, flunitrazepam, forget-me pill

Gamma-hydroxybutyrate, ganja, gasoline, georgia home boy, ghb, glues, goodfella, goop, grievous bodily harm, gym candy

Halcion, hash, hash oil, hearts, hemp, heroin, hillbilly, hycodan, hydrococet, hydrocodone, hydromorphone, hydros

Inhalant, isoamyl isobutyl

Jackpot, jif, joint

Kadian, kapanol, ketalar sv, ketamine, klonopin

La turnaround, laam, laudanum, laughing gas, levacetylmethadol, librium, liquid ecstasy, liquid x, liquor, little smoke, lorazepam, lorcet, lortab, love boat, lover’s speed, lsd, luminal, lunesta, lysergic acid diethylamide

Magic mint, magic mushrooms, maria pastora, marijuana, mary jane, meperidine, meperidine hydrochloride, mesc, mescaline, meth, methadone, methadose, methadrine, methamphetamine, methaqualone, methylphenidate, mexican valium, microdot yellow sunshine, miss emma, monkey, morphine, mrs. o, ms contin, msir, murder 8, mushrooms

Naltrexone, nembutal, nitrites, nitrous oxide, norco, numorphone, numporphan

O bomb, o.c., octagons, opana, opium, oramorph, orlaam, oxandrin, oxy, oxycet, oxycodone, oxycontin, oxycotton

Paint thinners, palladone, panacet, paregoric, pcp, peace, peace pill, pentobarbital, percocet, percocet:oxy, percodan, percs, peyote, phencyclidine, phennies, phenobarbital, poppers, pot, pumpers, purple passion

Quaalude

R-ball, red birds, reds, reefer, revia, ritalin, roach, robitussin, robitussin a, robitussin a-c, robitussin b, robitussin c, robo, robotripping, roche, rohypnol, roids, roofies, roofinol, rophies, roxanol, roxicodone, roxicondone, ryzolt

Sally-d, salvia, schoolboy, secobarbital, seconal, shepherdess’s herb, shrooms, sinsemilla, skag, skippy, sleeping pills, smack, smoke, snappers, solvents, soma, sonata, special k, speed, steroids, stilnox, stop signs, sublimaze, suboxone, subutex, symtan

Tango and cash, temesta, the smart drug, tnt, tooies, toot, tramadol, tramal, tranks, triazolam, triple c, truck drivers, tussionex, tylenol, tylenol with codeine, tylox

Ultram, uppers

Valium, vicodin, vicoprofen, vike, vitamin k, vitamin r, vivitrol

Watson-387, weed, white horse, white stuff, wine

Xanax, xodol

Yellow jackets, yellows

Zaleplon, zolpidem, zydone

Appendix D

F77 Phase Supplementary Material

Table D.1: LIWC features for the three classes in the labeled dataset over initiating posts. Only statistically significant variables are shown. Statistical significance is determined using Kruskal-Wallis tests (* p < 0.05; ** p < 0.005; *** p < 0.001) after Bonferroni corrections to adjust for family-wise error rate across all 184 variables (includes activity features). Column c denotes (◦) if the feature is used in our CRF classifier.

Initiating Post Linguistic Features

Feature c p USING (Mean Median SD) WITHDRAWAL (Mean Median SD) POST-WITHDRAWAL (Mean Median SD)

Word count ◦ * 208.20 151 211.06 178.92 127.00 168.81 183.23 124.50 209.24
Dic ◦ *** 89.26 90.17 4.89 88.10 88.89 6.26 89.38 90.54 6.59
Numerals ◦ *** 1.28 0.89 1.51 1.75 1.33 1.97 1.32 0.83 2.04
Function words ◦ *** 60.50 60.92 5.31 58.40 59.28 6.45 59.74 60.48 7.07
Pronoun *** 18.51 18.68 4.32 16.99 17.17 4.70 17.97 18.16 5.28
Personal pronoun ◦ *** 12.83 13.05 3.83 11.49 11.54 4.19 11.88 11.86 4.60
Pronoun: I ◦ *** 9.72 9.97 3.60 9.02 9.14 3.76 7.89 8.18 4.31
Pronoun: you *** 0.98 0.41 1.70 1.02 0.13 2.04 2.05 0.99 2.89
Pronoun: he/she ◦ *** 1.14 0 2.08 0.74 0 1.82 1 0 2.46
Pronoun: they ◦ *** 0.65 0.20 1.05 0.47 0 1.12 0.54 0 1.03
Pronoun: impers. * 5.68 5.33 2.82 5.49 5.26 2.81 6.09 5.76 3.35
Verb ◦ ** 18.54 18.69 3.76 17.64 17.59 4.20 18.13 17.96 4.91
Present tense ◦ *** 12.56 12.55 3.90 11.53 11.24 4.09 11.95 11.63 4.45
Numbers ◦ ** 0.71 0.48 0.93 0.75 0.37 1.12 0.54 0 0.89
Social ◦ *** 7.60 6.59 4.79 6.38 5.26 5.18 8.85 7.89 5.90
Humans ◦ * 0.49 0 0.76 0.40 0 0.79 0.57 0 1.04
Affect ◦ *** 5.30 5.00 2.76 5.76 5.54 3.09 6.41 6.11 3.52
Affect: positive ◦ *** 2.80 2.45 1.99 3.33 2.86 2.85 4.14 3.50 3.16
Affect: anxiety ◦ ** 0.61 0.25 0.88 0.55 0 0.98 0.45 0 0.90
Cognitive Mech. ◦ * 17.27 16.98 4.50 17.14 17.09 4.95 17.93 17.96 5.11
Certain ◦ * 1.21 1.03 1.22 1.41 1.21 1.41 1.57 1.36 1.53
Inhibition ◦ * 0.50 0.23 0.70 0.41 0 0.74 0.43 0 0.76
See ◦ * 0.34 0 0.65 0.30 0 0.80 0.50 0 1.14
Feel ◦ *** 0.73 0.45 1.10 1.18 0.83 1.50 0.85 0.50 1.23
Biological ◦ *** 3.87 3.46 2.63 4.01 3.70 2.90 3.31 2.89 2.72
Body ◦ *** 0.58 0 1 1.13 0.63 1.53 0.68 0 1.12
Health ◦ *** 3.00 2.63 2.29 2.58 2.13 2.36 2.20 1.72 2.25
Relative ◦ *** 13.46 13.39 4.65 15.04 14.75 5.25 13.72 13.61 5.23
Time ◦ *** 7.24 6.86 3.46 8.51 7.87 4.21 7.33 7.02 4.23
Home ◦ *** 0.30 0 0.54 0.40 0 0.77 0.68 0.14 1.18
Comma ◦ ** 3.01 2.17 3.36 2.75 1.94 3.27 2.19 1.63 2.43
QMark ◦ * 1.35 0.52 2.87 1.34 0.40 2.58 1.50 0 4.92
Other Punctuation ◦ *** 0.81 0 1.77 0.89 0 1.91 0.62 0 2.05
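
The significance markers in these tables combine a Kruskal-Wallis test with a Bonferroni correction. The following sketch shows one way to compute them without external libraries. It assumes exactly three groups, so the chi-square approximation has two degrees of freedom and its tail probability is exp(-H/2), and it assumes the correction multiplies each p-value by the number of tests (184). All function names are illustrative, and this is not necessarily the procedure used to produce the tables.

```python
import math
from collections import Counter

def average_ranks(values):
    """Rank values from 1 to N, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def kruskal_wallis_three_groups(a, b, c):
    """Return (H, p) for three samples, using the chi-square approximation."""
    groups = [a, b, c]
    pooled = [x for g in groups for x in g]
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Standard tie correction: divide H by 1 - sum(t^3 - t) / (N^3 - N).
    ties = sum(t ** 3 - t for t in Counter(pooled).values())
    correction = 1 - ties / (n ** 3 - n)
    if correction > 0:
        h /= correction
    p = math.exp(-h / 2)  # chi-square survival function with 2 degrees of freedom
    return h, p

def bonferroni(p, num_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * num_tests)

def stars(p_adjusted):
    """Map an adjusted p-value to the markers used in the table captions."""
    if p_adjusted < 0.001:
        return "***"
    if p_adjusted < 0.005:
        return "**"
    if p_adjusted < 0.05:
        return "*"
    return ""
```

With 184 variables, a raw p-value must fall below roughly 0.05 / 184 ≈ 0.00027 to earn even a single star, which is why only a subset of the LIWC variables appears in each table.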


Table D.2: LIWC features for the three classes in the labeled dataset. Only statistically significant variables are shown. Statistical significance is determined using Kruskal-Wallis tests (* p < 0.05; ** p < 0.005; *** p < 0.001) after Bonferroni corrections to adjust for family-wise error rate across all 184 variables (includes activity features). Column c denotes (◦) if the feature is used in our CRF classifier.

Response Post Linguistic Features

Feature c p USING (Mean Median SD) WITHDRAWAL (Mean Median SD) POST-WITHDRAWAL (Mean Median SD)

Word count *** 494.69 347.00 506.67 427.38 284.00 487.46 356.29 210.50 439.75
Words per sentence *** 19.21 15.40 18.60 17.04 14.09 14.73 14.98 12.99 14.25
Numerals ◦ * 0.75 0.43 1.02 0.95 0.68 1.12 0.95 0.56 1.49
Function words *** 59.01 59.85 4.56 56.95 57.69 5.41 55.82 57.06 7.17
Personal Pronouns *** 10.86 11.36 3.99 10.21 10.53 3.81 10.86 11.58 4.71
Pronoun: she/he ** 0.68 0 1.35 0.44 0 1.16 0.64 0 1.63
Pronoun: they *** 0.66 0.41 0.91 0.49 0.27 0.66 0.49 0.13 0.90
Pronoun: impers. ** 5.48 5.67 2.20 5.57 5.78 2.36 5.10 5.32 2.75
Article *** 4.91 4.98 2.06 4.75 4.96 2.02 4.20 4.41 2.23
Verb ** 17.26 18.15 4.94 17.13 17.82 4.88 16.09 17.23 5.78
Aux. verb *** 10.67 11.11 3.51 10.37 10.68 3.44 9.66 10.33 3.96
Future *** 1.50 1.44 1.07 1.50 1.43 1.13 1.10 1.01 1.03
Preposition *** 11.63 12.27 3.57 11.19 11.66 3.38 10.61 11.51 4.14
Conjunction *** 6.39 6.76 2.33 6.18 6.58 2.46 5.72 6.13 2.69
Quantitative *** 3.00 2.99 1.52 2.94 2.88 1.64 2.50 2.58 1.67
Social ◦ *** 10.26 10.11 4.77 8.83 8.75 4.23 9.78 9.81 5.45
Affect ◦ *** 5.73 5.76 2.68 6.55 6.34 3.25 7.54 7.33 4.31
Affect: positive ◦ *** 3.72 3.53 2.43 4.61 4.10 3.17 5.84 5.13 4.36
Affect: negative ◦ *** 1.96 1.92 1.34 1.90 1.87 1.33 1.67 1.50 1.51
Affect: anxiety ◦ *** 0.36 0.24 0.47 0.40 0.23 0.55 0.32 0 0.61
Cognitive Proc. *** 19.37 17.81 7.77 18.71 17.43 7.83 18.77 16.80 10
Discrepancy *** 2.32 2.32 1.31 1.92 1.88 1.33 1.63 1.60 1.30
Tentative *** 3.35 3.25 1.79 3.12 3.09 1.77 2.55 2.45 1.96
Exclusive *** 3.35 3.40 1.62 3.07 3.18 1.66 2.56 2.60 1.83
Perceptual proc. *** 1.52 1.48 1.07 1.90 1.81 1.34 1.87 1.68 1.55
Feel *** 0.64 0.53 0.70 0.91 0.76 0.85 0.65 0.45 0.76
Biological *** 3.46 3.20 2.17 3.42 3.22 2.46 2.71 2.41 2.39
Body *** 0.52 0.28 0.78 0.78 0.45 1.08 0.52 0.19 0.90
Health ◦ *** 2.68 2.45 1.85 2.24 1.95 1.90 1.70 1.32 1.76
Sexual *** 0.15 0 0.35 0.14 0 0.36 0.30 0 0.89
Ingestion * 0.17 0 0.39 0.30 0 0.66 0.25 0 0.71
Relativity ** 11.46 11.82 4.39 12.36 12.68 4.70 11.90 12.50 5.37
Time ** 5.29 5.10 2.90 5.88 6.06 3.12 5.66 5.69 3.33
Money * 0.32 0.13 0.55 0.28 0 0.56 0.23 0 0.42
Assent ◦ *** 0.27 0.07 0.50 0.40 0.18 0.81 0.62 0.27 2.01
Colon ** 0.09 0 0.20 0.15 0 0.42 0.27 0 0.84
Exclamation ◦ *** 1.02 0.34 1.79 2.25 0.82 5.08 4.52 1.68 8.40
Dash ** 0.79 0.28 2.08 0.82 0 2.20 0.62 0 1.64
Other punct. ◦ *** 3.41 2.84 2.55 4.29 3.53 3.22 5.64 4.29 6.35
All punct. ◦ *** 22.07 21.51 9.71 25.75 23.69 14.52 29.69 26.82 19.27


Table D.3: Activity and content-based features for the three classes in the labeled dataset. Statistical significance is determined using Kruskal-Wallis tests (* p<0.05; ** p<0.005; *** p<0.001) after Bonferroni corrections to adjust for family-wise error rate across all 184 variables (includes 160 LIWC variables). Column c denotes (◦) if the feature is used in our CRF classifier.

                                                   USING                         WITHDRAWING                   RECOVERING
Feature                                   c p      Mean    Med    IQR    MAD     Mean    Med    IQR    MAD     Mean    Med    IQR    MAD

Activity Characteristics

All time
# initiating posts authored                 ***    8.84   5.00  10.00   5.93     8.78   5.00   8.00   4.45    20.73  14.00  22.00  13.34
# self responses authored                   ***   13.93   5.00  18.00   7.41    13.80   8.00  15.00   8.90    33.26  23.00  36.25  23.72
# responses authored                        ***   26.90   6.00  21.00   8.90    23.61   8.00  21.00  10.38   178.69  67.00 159.25  83.77
# initiating posts / # responses authored ◦ ***    1.36   1.00   1.31   1.02     1.28   0.82   1.26   0.85     0.53   0.22   0.35   0.21
Days since last init. post                ◦ ***   50.94   5.00  24.00   5.93    21.04   2.00   5.00   1.48    31.04   4.00  12.00   4.45
Days since last self resp.                ◦ ***   66.34   9.00  43.50  11.86    29.94   2.00   8.00   1.48    42.05   6.00  17.00   7.41
Days since last response                  ◦ ***   73.37   5.00  27.00   5.93    33.51   2.00   6.00   1.48    28.68   2.00   5.00   1.48
Days since last activity                    ***   39.56   3.00  13.00   2.97    16.66   1.00   2.00   0.00    17.76   1.00   4.00   0.00

Last 5 days
# initiating posts authored               ◦ ***    0.93   0.00   1.00   0.00     2.01   1.00   3.00   1.48     1.81   1.00   3.00   1.48
# self responses authored                 ◦ ***    1.37   0.00   2.00   0.00     3.32   1.00   4.00   1.48     2.89   0.00   4.00   0.00
# responses authored                      ◦ ***    1.87   0.00   2.00   0.00     5.48   1.00   6.00   1.48    15.20   5.00  16.00   7.41
# initiating posts / # responses authored ◦ ***    1.02   1.00   0.00   0.00     1.06   1.00   0.58   0.64     0.58   0.33   0.87   0.42

Today
# replies received                          .      5.15   4.00   5.00   2.97     5.52   4.00   5.00   2.97     6.09   4.00   6.00   4.45
# respondents                               .      3.82   3.00   3.00   2.97     4.05   3.00   3.00   2.97     4.68   3.00   4.00   2.97
# self responses                            **     1.57   1.00   2.00   1.48     1.89   1.00   3.00   1.48     1.53   1.00   2.00   1.48

Post and Response Content Characteristics

Initiating
Days clean                                ◦ ***  421.15  14.00 175.00  17.79    47.50   5.00   7.00   4.45   125.97  45.00  74.00  43.00
Days mentioned                            ◦ ***   52.10  10.00  38.25  11.86    19.08   5.00   7.00   4.45    57.03  27.00  48.00  28.17
# questions                               ◦ **     2.94   2.00   3.00   1.48     2.35   2.00   2.00   1.48     2.60   2.00   2.00   1.48
# USING terms                             ◦ ***    0.73   0.00   1.00   0.00     0.35   0.00   1.00   0.00     0.25   0.00   0.00   0.00
# WITHDRAWING terms                       ◦ ***    0.50   0.00   1.00   0.00     1.11   1.00   2.00   1.48     0.44   0.00   1.00   0.00
# RECOVERING terms                        ◦ ***    0.38   0.00   1.00   0.00     0.39   0.00   1.00   0.00     0.94   1.00   1.00   1.48

Responses
# USING terms                             ◦ **     0.31   0.00   0.00   0.00     0.19   0.00   0.00   0.00     0.18   0.00   0.00   0.00
# WITHDRAWING terms                       ◦ ***    0.86   0.00   1.00   0.00     1.18   1.00   2.00   1.48     0.76   0.00   1.00   0.00
# RECOVERING terms                        ◦ ***    0.53   0.00   1.00   0.00     0.53   0.00   1.00   0.00     0.78   0.00   1.00   0.00
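
To illustrate how the significance markers in Table D.3 could be derived, the following Python sketch runs a Kruskal-Wallis H test over three groups and applies a Bonferroni correction for 184 comparisons. It is not the code used in this work: the function names (kruskal_wallis_3, bonferroni, stars) and the toy data are assumptions made for illustration. The p-value uses the chi-square approximation, which for three groups (two degrees of freedom) reduces to exp(-H/2); a statistics package may be preferable for small samples or a different number of groups.

```python
import math


def kruskal_wallis_3(a, b, c):
    """Kruskal-Wallis H test for exactly three non-empty groups.

    Returns (H, p). With three groups the chi-square approximation has
    two degrees of freedom, whose survival function is exp(-H / 2).
    """
    groups = [list(a), list(b), list(c)]
    pooled = sorted(x for g in groups for x in g)
    n = len(pooled)

    # Assign each distinct value the average rank of its tied block.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        t = j - i                      # size of the tied block
        rank_of[pooled[i]] = (i + 1 + j) / 2
        tie_term += t**3 - t
        i = j

    rank_sums = [sum(rank_of[x] for x in g) for g in groups]
    h = 12 / (n * (n + 1)) * sum(
        r * r / len(g) for r, g in zip(rank_sums, groups)
    ) - 3 * (n + 1)

    correction = 1 - tie_term / (n**3 - n)
    if correction == 0:                # every observation is identical
        return 0.0, 1.0
    h /= correction
    return h, math.exp(-h / 2)


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)


def stars(p):
    """Map a p-value to the markers used in the table caption."""
    if p < 0.001:
        return "***"
    if p < 0.005:
        return "**"
    if p < 0.05:
        return "*"
    return ""


# Toy data (hypothetical, not taken from the labeled dataset).
using, withdrawing, recovering = [1, 2, 3], [4, 5, 6], [7, 8, 9]
h, p_raw = kruskal_wallis_3(using, withdrawing, recovering)
p_adj = bonferroni(p_raw, 184)         # 184 variables, as in the caption
print(round(h, 2), round(p_raw, 4), p_adj, repr(stars(p_adj)))
# H is about 7.2 and the raw p about 0.0273; after correction p is 1.0,
# so this toy feature would receive no marker.
```

The toy example shows why the correction matters: a difference that looks significant at p < 0.05 in isolation no longer earns a marker once the threshold is shared across 184 tests.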

Bibliography

[1] Alcoholics Anonymous (“Big Book,” 4th ed.). AA World Services, Inc. (2001). [Online: http:

//www.aa.org/bigbookonline, accessed 20-May-2014].

[2] Narcotics Anonymous Annual Membership Survey. Narcotics Anonymous (2011). [Online:

http://www.na.org/admin/include/spaw2/uploads/pdf/PR/NA_Membership_Survey.pdf,

accessed 12-August-2013].

[3] Vital signs: Overdoses of prescription opioid pain relievers United States, 1999-2008. Center for

Disease Control. Morbidity and Mortality Weekly Report. (2011). [Online: http://www.cdc.gov/

mmwr/preview/mmwrhtml/mm6043a4.htm, accessed 93/4/2014.].

[4] Addiction medicine, closing the gap between science and practice. CASAColumbia (2012). [On-

line: http://www.casacolumbia.org/download/file/fid/1177, accessed 4/5/2014.].

[5] Commonly abused prescription drugs. National Institute on Drug Abuse (2012). [Online: http://

www.drugabuse.gov/sites/default/files/rx_drugs_placemat_508c_10052011.pdf, ac-

cessed 28-May-2014].

[6] Opiate withdrawal. MedlinePlus - U.S. National Library of Medicine (2012). [Online: http://www.

nlm.nih.gov/medlineplus/ency/article/000949.htm, accessed 28-May-2014].

[7] Internet user demographics. [Online: http://www.pewinternet.org/data-trend/

internet-use/latest-stats/, accessed 7/1/2014].

[8] Prescription painkiller overdoses: A growing epidemic, especially among women. Vital Signs.

CS238899B. Center for Disease Control. (2013). [Online: http://www.cdc.gov/vitalsigns/

pdf/2013-07-vitalsigns.pdf, accessed 9/4/2014].

[9] State and County QuickFacts. U.S. Census Bureau (2013). [Online: http://quickfacts.

census.gov/qfd/states/00000.html, accessed 28-August-2014].

139

BIBLIOGRAPHY 140

[10] Achrekar, H., Gandhe, A., Lazarus, R., Yu, S.-H., and Liu, B. Predicting flu trends using Twitter

data. In Computer Communications Workshops, IEEE (2011), 702–707.

[11] Ahmad, F., Hudak, P. L., Bercovitz, K., Hollenberg, E., and Levinson, W. Are physicians ready for

patients with Internet-based health information? Journal of Medical Internet Research 8, 3 (2006),

e22.

[12] Alpers, G. W., Winzelberg, A. J., Classen, C., Roberts, H., Dev, P., Koopman, C., and Barr Taylor,

C. Evaluation of computerized text analysis in an Internet breast cancer support group. Computers

in Human Behavior 21, 2 (2005), 361–376.

[13] Anand, S. G., Feldman, M. J., Geller, D. S., Bisbee, A., and Bauchner, H. A content analysis

of e-mail communication between primary care providers and parents. Pediatrics 115, 5 (2005),

1283–1288.

[14] Anderson, J. G., Rainey, M. R., and Eysenbach, G. The impact of cyberhealthcare on the

physician–patient relationship. Journal of Medical Systems 27, 1 (2003), 67–84.

[15] Aramaki, E., Maskawa, S., and Morita, M. Twitter catches the flu: detecting influenza epidemics

using Twitter. In Empirical Methods in Natural Language Processing, ACL (2011), 1568–1576.

[16] Aronson, A. R. Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap

program. In American Medical Informatics Association Annual Symposium, AMIA (2001), 17.

[17] Aronson, A. R., and Lang, F.-M. An overview of MetaMap: historical perspective and recent

advances. Journal of the American Medical Informatics Association 17, 3 (2010), 229–236.

[18] Ayers, J. W., Ribisl, K. M., and Brownstein, J. S. Tracking the rise in popularity of electronic nicotine

delivery systems (electronic cigarettes) using search query surveillance. American Journal of

Preventive Medicine 40, 4 (2011), 448–453.

[19] Baccianella, S., Esuli, A., and Sebastiani, F. SentiWordNet 3.0: An enhanced lexical resource for

sentiment analysis and opinion mining. In Language Resources and Evaluation (2010).

[20] Bebbington, P. E. The efficacy of Alcoholics Anonymous: the elusiveness of hard data. The British

Journal of Psychiatry 128, 6 (1976), 572–580.

BIBLIOGRAPHY 141

[21] Bell, V. Online information, extreme communities and Internet therapy: Is the Internet good for our

mental health? Journal of Mental Health 16, 4 (2007), 445–457.

[22] Bender, J. L., Jimenez-Marroquin, M.-C., and Jadad, A. R. Seeking support on Facebook: a

content analysis of breast cancer groups. Journal of Medical Internet Research 13, 1 (2011), e16.

[23] Benton, A., Ungar, L., Hill, S., Hennessy, S., Mao, J., Chung, A., Leonard, C. E., and Holmes,

J. H. Identifying potential adverse effects using the web: A new approach to medical hypothesis

generation. Journal of Biomedical Informatics 44, 6 (2011), 989–996.

[24] Berger, M., Wagner, T. H., and Baker, L. C. Internet use and stigmatized illness. Social Science &

Medicine 61, 8 (2005), 1821–1827.

[25] Berland, G. K., Elliott, M. N., Morales, L. S., Algazy, J. I., Kravitz, R. L., Broder, M. S., Kanouse,

D. E., Munoz, J. A., Puyol, J.-A., Lara, M., et al. Health information on the Internet: accessibility,

quality, and readability in English and Spanish. Journal of the American Medical Association 285,

20 (2001), 2612–2621.

[26] Bernstein, M. S., Little, G., Miller, R. C., Hartmann, B., Ackerman, M. S., Karger, D. R., Crowell,

D., and Panovich, K. Soylent: a word processor with a crowd inside. In User Interface Software

and Technology, ACM (2010), 313–322.

[27] Birnbaum, H. G., White, A. G., Schiller, M., Waldman, T., Cleveland, J. M., and Roland, C. L.

Societal costs of prescription opioid abuse, dependence, and misuse in the United States. Pain

Medicine 12, 4 (2011), 657–667.

[28] Biyani, P., Caragea, C., Mitra, P., and Yen, J. Identifying emotional and informational support in

online health communities. In Computational Linguistics, ICCL (2014), 827–836.

[29] Braithwaite, D. O., Waldron, V. R., and Finn, J. Communication of social support in computer-

mediated groups for people with disabilities. Health Communication 11, 2 (1999), 123–151.

[30] Braun, V., and Clarke, V. Using thematic analysis in psychology. Qualitative Research in Psychol-

ogy 3, 2 (2006), 77–101.

[31] Brennan, P. F., and Aronson, A. R. Towards linking patients and clinical information: detecting

UMLS concepts in e-mail. Journal of Biomedical Informatics 36, 4 (2003), 334–341.

BIBLIOGRAPHY 142

[32] Brown, P. F., Desouza, P. V., Mercer, R. L., Pietra, V. J. D., and Lai, J. C. Class-based n-gram

models of natural language. In Computational Linguistics, vol. 18, ICCL (1992), 467–479.

[33] Brownstein, J. S., Freifeld, C. C., Reis, B. Y., and Mandl, K. D. Surveillance Sans Fron-

tieres: Internet-based emerging infectious disease intelligence and the HealthMap project. PLoS

Medicine 5, 7 (2008), e151.

[34] Buchanan, H., and Coulson, N. S. Accessing dental anxiety online support groups: An exploratory

qualitative study of motives and experiences. Patient Education and Counseling 66, 3 (2007),

263–269.

[35] Buehler, J. W., Berkelman, R. L., Hartley, D. M., and Peters, C. J. Syndromic surveillance and

bioterrorism-related epidemics. Emerging Infectious Diseases 9, 10 (2003), 1197.

[36] Buis, L. R. Emotional and informational support messages in an online hospice support commu-

nity. Computers Informatics Nursing 26, 6 (2008), 358–367.

[37] Bundorf, M. K., Wagner, T. H., Singer, S. J., and Baker, L. C. Who searches the Internet for health

information? Health Services Research 41, 3p1 (2006), 819–836.

[38] Butler, D. When google got flu wrong. Nature 494, 7436 (2013), 155.

[39] Card, S. K., Mackinlay, J. D., Pirolli, P. L., and Pitkow, J. E. Method and apparatus for clustering a

collection of linked documents using co-citation analysis, 2000. US Patent 6,038,574.

[40] Carmichael, A. Infertility-Asthma Link Confirmed. Cure Together Blog. [Online:

www.curetogether.com/blog/2011/03/07/infertility-asthma-link-confirmed, ac-

cessed 15-Sept-2013].

[41] Carneiro, H. A., and Mylonakis, E. Google trends: a web-based tool for real-time surveillance of

disease outbreaks. Clinical Infectious Diseases 49, 10 (2009), 1557–1564.

[42] Cartright, M.-A., White, R. W., and Horvitz, E. Intentions and attention in exploratory health search.

In Research and Development in Information Retrieval, ACM SIGIR (2011), 65–74.

[43] Chapman, W. W., Fiszman, M., Dowling, J. N., Chapman, B. E., and Rindflesch, T. C. Identifying

respiratory findings in emergency department reports for biosurveillance using MetaMap. Medinfo

11, Pt 1 (2004), 487–91.

BIBLIOGRAPHY 143

[44] Chary, M., Genes, N., McKenzie, A., and Manini, A. F. Leveraging social networks for toxicovigi-

lance. Journal of Medical Toxicology 9, 2 (2013), 184–191.

[45] Chee, B. W., Berlin, R., and Schatz, B. Predicting adverse drug events from personal health

messages. In American Medical Informatics Association Annual Symposium, AMIA (2011), 217.

[46] Cicero, T. J., Ellis, M. S., and Surratt, H. L. Effect of abuse-deterrent formulation of oxycontin. New

England Journal of Medicine 367, 2 (2012), 187–189.

[47] Civan, A., and Pratt, W. Threading together patient expertise. In American Medical Informatics

Association Annual Symposium, AMIA (2007), 140.

[48] Cleveland, W. S., and Devlin, S. J. Locally weighted regression: an approach to regression analy-

sis by local fitting. Journal of the American Statistical Association 83, 403 (1988), 596–610.

[49] Cline, R. J., and Haynes, K. M. Consumer health information seeking on the Internet: the state of

the art. Health Education Research 16, 6 (2001), 671–692.

[50] Cohen, J. Weighted kappa: Nominal scale agreement provision for scaled disagreement or partial

credit. Psychological Bulletin 70, 4 (1968), 213.

[51] Coiera, E. Information epidemics, economics, and immunity on the Internet: We still know so little

about the effect of information on public health. British Medical Journal 317, 7171 (1998), 1469.

[52] Collier, N., Doan, S., Kawazoe, A., Goodwin, R. M., Conway, M., Tateno, Y., Ngo, Q.-H., Dien, D.,

Kawtrakul, A., Takeuchi, K., et al. Biocaster: detecting public health rumors with a web-based text

mining system. Bioinformatics 24, 24 (2008), 2940–2941.

[53] Cooper, C. P., Mallon, K. P., Leadbetter, S., Pollack, L. A., and Peipins, L. A. Cancer Internet

search activity on a major search engine, United States 2001-2003. Journal of Medical Internet

Research 7, 3 (2005), e36.

[54] Corazza, O., Valeriani, G., Bersani, F. S., Corkery, J., Martinotti, G., Bersani, G., and Schifano,

F. “Spice”, “Kryptonite”, “Black Mamba”: An Overview of Brand Names and Marketing Strategies

of Novel Psychoactive Substances on the Web. Journal of Psychoactive Drugs 46, 4 (2014),

287–294.

BIBLIOGRAPHY 144

[55] Corley, C., Mikler, A. R., Singh, K. P., and Cook, D. J. Monitoring influenza trends through mining

social media. In Bioinformatics and Computational Biology (2009), 340–346.

[56] Corley, C. D., Cook, D. J., Mikler, A. R., and Singh, K. P. Text and structural data mining of influenza

mentions in web and social media. International Journal of Environmental Research and Public

Health 7, 2 (2010), 596–615.

[57] Cotten, S. R., and Gupta, S. S. Characteristics of online and offline health information seekers

and factors that discriminate between them. Social Science & Medicine 59, 9 (2004), 1795–1806.

[58] Coulson, N. S. Receiving social support online: an analysis of a computer-mediated support group

for individuals living with irritable bowel syndrome. CyberPsychology & Behavior 8, 6 (2005), 580–

584.

[59] Coulson, N. S., Buchanan, H., and Aubeeluck, A. Social support in cyberspace: a content analysis

of communication within a Huntington’s disease online support group. Patient Education and

Counseling 68, 2 (2007), 173–178.

[60] Coulson, N. S., and Knibb, R. C. Coping with food allergy: exploring the role of the online support

group. CyberPsychology & Behavior 10, 1 (2007), 145–148.

[61] Coursaris, C. K., and Liu, M. An analysis of social support exchanges in online HIV/AIDS self-help

groups. Computers in Human Behavior 25, 4 (2009), 911–918.

[62] Culotta, A. Towards detecting influenza epidemics by analyzing Twitter messages. In workshop

on Social Media Analytics, ACM (2010), 115–122.

[63] Culotta, A. Lightweight methods to estimate influenza rates and alcohol sales volume from Twitter

messages. Language Resources and Evaluation 47, 1 (2013), 217–238.

[64] Culver, J. D., Gerr, F., Frumkin, H., et al. Medical information on the Internet. Journal of General

Internal Medicine 12, 8 (1997), 466–470.

[65] Curtis, B., Alanis-Hirsch, K., Kaynak, O., Cacciola, J., Meyers, K., and McLellan, A. T. Using

web searches to track interest in synthetic cannabinoids (aka “herbal incense”). Drug and Alcohol

Review 34, 1 (2014), 105–108.

BIBLIOGRAPHY 145

[66] Dasgupta, N., Freifeld, C., Brownstein, J. S., Menone, C. M., Surratt, H. L., Poppish, L., Green,

J. L., Lavonas, E. J., and Dart, R. C. Crowdsourcing black market prices for prescription opioids.

Journal of Medical Internet Research 15, 8 (2013), e178.

[67] Davison, K. P., Pennebaker, J. W., and Dickerson, S. S. Who talks? The social psychology of

illness support groups. American Psychologist 55, 2 (2000), 205.

[68] De Bock, G. H., Jacobi, C. E., Seynaeve, C., Krol-Warmerdam, E. M., Blom, J., Van Asperen, C. J.,

Cornelisse, C. J., Klijn, J. G., Devilee, P., Tollenaar, R. A., et al. A family history of breast cancer

will not predict female early onset breast cancer in a population-based setting. BMC Cancer 8, 1

(2008), 203.

[69] De Choudhury, M., Counts, S., and Horvitz, E. Major life changes and behavioral markers in social

media: case of childbirth. In Computer Supported Cooperative Work, ACM (2013), 1431–1442.

[70] De Choudhury, M., Counts, S., and Horvitz, E. Predicting postpartum changes in emotion and

behavior via social media. In Human Factors in Computing Systems, ACM (2013), 3267–3276.

[71] De Choudhury, M., Counts, S., Horvitz, E. J., and Hoff, A. Characterizing and predicting post-

partum depression from shared Facebook data. In Computer Supported Cooperative Work, ACM

(2014), 626–638.

[72] De Choudhury, M., Gamon, M., Counts, S., and Horvitz, E. Predicting depression via social media.

In International Conference on Weblogs and Social Media, AAAI (2013).

[73] Deluca, P., Davey, Z., Corazza, O., Di Furia, L., Farre, M., Flesland, L. H., Mannonen, M., Majava,

A., Peltoniemi, T., Pasinetti, M., et al. Identifying emerging trends in recreational drug use; out-

comes from the Psychonaut Web Mapping Project. Progress in Neuro-Psychopharmacology and

Biological Psychiatry 39, 2 (2012), 221–226.

[74] Diaz, J. A., Griffith, R. A., Ng, J. J., Reinert, S. E., Friedmann, P. D., and Moulton, A. W. Patients’

use of the Internet for medical information. Journal of General Internal Medicine 17, 3 (2002),

180–185.

[75] DiClemente, C. C., Prochaska, J. O., Fairhurst, S. K., Velicer, W. F., Velasquez, M. M., and Rossi,

J. S. The process of smoking cessation: an analysis of precontemplation, contemplation, and

preparation stages of change. Journal of Consulting and Clinical Psychology 59, 2 (1991), 295.

BIBLIOGRAPHY 146

[76] Dingare, S., Nissim, M., Finkel, J., Manning, C., and Grover, C. A system for identifying named

entities in biomedical text: How results from two evaluations reflect on both the system and the

evaluations. Comparative and Functional Genomics 6, 1-2 (2005), 77–85.

[77] Doing-Harris, K. M., and Zeng-Treitler, Q. Computer-assisted update of a consumer health vocab-

ulary through mining of social network data. Journal of Medical Internet Research 13, 2 (2011),

e37.

[78] Dunning, T. Accurate methods for the statistics of surprise and coincidence. Computational Lin-

guistics 19, 1 (1993), 61–74.

[79] DuPont, R. L., McLellan, A. T., White, W. L., Merlo, L. J., and Gold, M. S. Setting the standard

for recovery: Physicians’ health programs. Journal of Substance Abuse Treatment 36, 2 (2009),

159–171.

[80] Esquivel, A., Meric-Bernstam, F., and Bernstam, E. V. Accuracy and self correction of information

received from an Internet breast cancer list: content analysis. British Medical Journal 332, 7547

(2006), 939–942.

[81] Eysenbach, G. Infodemiology: tracking flu-related searches on the web for syndromic surveillance.

In American Medical Informatics Association Annual Symposium, AMIA (2006), 244–248.

[82] Eysenbach, G., and Kohler, C. How do consumers search for and appraise health information on

the world wide web? Qualitative study using focus groups, usability tests, and in-depth interviews.

British Medical Journal 324, 7337 (2002), 573.

[83] Eysenbach, G., Powell, J., Kuss, O., and Sa, E.-R. Empirical studies assessing the quality of

health information for consumers on the world wide web: a systematic review. Journal of the

American Medical Association 287, 20 (2002), 2691–2700.

[84] Farrell, M. Opiate withdrawal. Addiction 89, 11 (1994), 1471–1475.

[85] Fernandez-Luque, L., Karlsen, R., and Bonander, J. Review of extracting information from the

social web for health personalization. Journal of Medical Internet Research 13, 1 (2011), e15.

[86] Finfgeld, D. L. Therapeutic groups online: the good, the bad, and the unknown. Issues in Mental

Health Nursing 21, 3 (2000), 241–255.

BIBLIOGRAPHY 147

[87] Finkel, J., Dingare, S., Nguyen, H., Nissim, M., Manning, C., and Sinclair, G. Exploiting context

for biomedical entity recognition: from syntax to the web. In joint workshop on Natural Language

Processing in Biomedicine and its Applications, ACL (2004), 88–91.

[88] Fleiss, J. L. Measuring nominal scale agreement among many raters. Psychological Bulletin 76,

5 (1971), 378.

[89] Fox, N., Ward, K., and O’Rourke, A. Pro-anorexia, weight-loss drugs and the Internet: an “anti-

recovery” explanatory model of anorexia. Sociology of Health & Illness 27, 7 (2005), 944–971.

[90] Fox, S. Peer-to-Peer Health Care. Pew Internet & American Life Project, 2011. [Online:

http://pewinternet.org/Reports/2011/P2PHealthcare/Summary-of-Findings.aspx, ac-

cessed 6-January-2014].

[91] Fox, S., and Duggan, M. Health Online. Pew Internet & American Life Project, 2013.

[Online: http://pewinternet.org/Reports/2013/Health-online/Summary-of-Findings.

aspx, accessed 2-April-2013].

[92] Fox, S., and Rainie, L. Vital Decisions: How Internet Users Decide what In-

formation to Trust when They Or Their Loved Ones are Sick. Pew Internet &

American Life Project, 2002. [Online: http://www.pewinternet.org/2002/05/22/

vital-decisions-a-pew-internet-health-report/, accessed 2-April-2013].

[93] Franklin, V. L., Waller, A., Pagliari, C., and Greene, S. A. A randomized controlled trial of Sweet

Talk, a text-messaging system to support young people with diabetes. Diabetic Medicine 23, 12

(2006), 1332–1338.

[94] Frantzi, K., Ananiadou, S., and Mima, H. Automatic recognition of multi-word terms: the c-

value/nc-value method. International Journal on Digital Libraries 3, 2 (2000), 115–130.

[95] Friedrich, C. M., Revillion, T., Hofmann, M., and Fluck, J. Biomedical and chemical named entity

recognition with conditional random fields: the advantage of dictionary features. In Semantic

Mining in Biomedicine, vol. 7 (2006), 85–89.

[96] Frost, J. H., and Massagli, M. P. Social uses of personal health information within PatientsLikeMe,

an online community: what can happen when patients have access to one anothers data. Journal

of Medical Internet Research 10, 3 (2008), e15.

BIBLIOGRAPHY 148

[97] Gade, E. J., Thomsen, S. F., Lindenberg, S., Kyvik, K. O., Lieberoth, S., and Backer, V. Asthma

affects time to pregnancy and fertility: a register-based twin study. European Respiratory Journal

43, 4 (2014), 1077–1085.

[98] Gavin, J., Rodham, K., and Poyer, H. The presentation of “pro-anorexia” in online group interac-

tions. Qualitative Health Research 18, 3 (2008), 325–333.

[99] Gibbs, R. D., Gibbs, P. H., and Henrich, J. Patient understanding of commonly used medical

vocabulary. The Journal of Family Practice 25, 2 (1987), 176–178.

[100] Ginsberg, J., Mohebbi, M. H., Patel, R. S., Brammer, L., Smolinski, M. S., and Brilliant, L. Detecting

influenza epidemics using search engine query data. Nature 457, 7232 (2008), 1012–1014.

[101] Gooden, R. J., and Winefield, H. R. Breast and prostate cancer online discussion boards a the-

matic analysis of gender differences and similarities. Journal of Health Psychology 12, 1 (2007),

103–114.

[102] Gossop, M., Battersby, M., and Strang, J. Self-detoxification by opiate addicts. a preliminary

investigation. The British Journal of Psychiatry 159, 2 (1991), 208–212.

[103] Gossop, M., Green, L., Phillips, G., and Bradley, B. Lapse, relapse and survival among opiate

addicts after treatment. A prospective follow-up study. The British Journal of Psychiatry 154, 3

(1989), 348–353.

[104] Grandinetti, D. A. Doctors and the web. Help your patients surf the Net safely. Medical Economics

77, 5 (2000), 186.

[105] Gray, N. J., Klein, J. D., Noyce, P. R., Sesselberg, T. S., and Cantrill, J. A. Health information-

seeking behaviour in adolescence: the place of the Internet. Social Science & Medicine 60, 7

(2005), 1467–1478.

[106] Green, L., and Gossop, M. Effects of information on the opiate withdrawal syndrome. British

Journal of Addiction 83, 3 (1988), 305–309.

[107] Greene, J. A., Choudhry, N. K., Kilabuk, E., and Shrank, W. H. Online social networking by pa-

tients with diabetes: a qualitative evaluation of communication with Facebook. Journal of General

Internal Medicine 26, 3 (2011), 287–292.

BIBLIOGRAPHY 149

[108] Grimes, A., Landry, B. M., and Grinter, R. E. Characteristics of shared health reflections in a local

community. In Computer Supported Cooperative Work, ACM (2010), 435–444.

[109] Grishman, R., Huttunen, S., and Yangarber, R. Information extraction for enhanced access to

disease outbreak reports. Journal of Biomedical Informatics 35, 4 (2002), 236–246.

[110] Guest, G., MacQueen, K. M., and Namey, E. E. Applied Thematic Analysis. Sage, 2011.

[111] GuoDong, Z., and Jian, S. Exploring deep knowledge resources in biomedical name recognition.

In workshop on Natural Language Processing in Biomedicine and its Applications, ACL (2004),

96–99.

[112] Gupta, S., MacLean, D. L., Heer, J., and Manning, C. D. Induced lexico-syntactic patterns improve

information extraction from online medical forums. Journal of the American Medical Informatics

Association 21, 5 (2014), 902–909.

[113] Hampton, T. Warning system aims to detect emerging trends in illegal drug use. Journal of the

American Medical Association 312, 8 (2014), 779–779.

[114] Hansen, D. L., Derry, H. A., Resnick, P. J., and Richardson, C. R. Adolescents searching for

health information on the Internet: an observational study. Journal of Medical Internet Research

5, 4 (2003), e25.

[115] Hansen, R. N., Oster, G., Edelsberg, J., Woody, G. E., and Sullivan, S. D. Economic costs of

nonmedical use of prescription opioids. The Clinical Journal of Pain 27, 3 (2011), 194–202.

[116] Hardey, M. Doctor in the house: the Internet as a source of lay health knowledge and the challenge

to expertise. Sociology of Health & Illness 21, 6 (1999), 820–835.

[117] Hardey, M. the story of my illness: Personal accounts of illness on the Internet. Health: 6, 1 (2002),

31–46.

[118] Harman, G. A., Coppersmith, C. T., and Dredze, M. H. Measuring post traumatic stress disorder

in Twitter. In International Conference on Weblogs and Social Media, AAAI (2014), 579–582.

[119] Harpaz, R., DuMouchel, W., Shah, N. H., Madigan, D., Ryan, P., and Friedman, C. Novel data-

mining methodologies for adverse drug event discovery and analysis. Clinical Pharmacology &

Therapeutics 91, 6 (2012), 1010–1021.

BIBLIOGRAPHY 150

[120] Harris, S., and Gerich, E. Retiring the NSFNET Backbone Service: Chronicling the end of an era.

Connexions 10, 4 (1996).

[121] Hartzband, P., and Groopman, J. Untangling the Web: patients, doctors, and the Internet. New

England Journal of Medicine 362, 12 (2010), 1063–1066.

[122] Hartzler, A., and Pratt, W. Managing the personal side of health: How patient expertise differs

from the expertise of clinicians. Journal of Medical Internet Research 13, 3 (2011), e62.

[123] He, H. A., Greenberg, S., and Huang, E. M. One size does not fit all: applying the transtheoretical

model to energy feedback technology design. In Human Factors in Computing Systems, ACM

(2010), 927–936.

[124] He, Y., and Kayaalp, M. Biological entity recognition with conditional random fields. In American

Medical Informatics Association Annual Symposium, AMIA (2008), 293.

[125] Hearst, M. S. A simple algorithm for identifying abbreviation definitions in biomedical text. In

Pacific Symposium on Biocomputing (2003), 451–462.

[126] Heer, J., and Bostock, M. Crowdsourcing graphical perception: using Mechanical Turk to assess

visualization design. In Human Factors in Computing Systems, ACM (2010), 203–212.

[127] Heffernan, R., Mostashari, F., Das, D., Karpati, A., Kulldorff, M., Weiss, D., et al. Syndromic

surveillance in public health practice, New York City. Emerging Infectious Diseases 10, 5 (2004),

858–864.

[128] Henning, K. J. What is syndromic surveillance? Morbidity and Mortality Weekly Report (2004),

7–11.

[129] Homan, C. M., Lu, N., Tu, X., Lytle, M. C., and Silenzio, V. Social structure and depression in

TrevorSpace. In Computer supported Cooperative Work, ACM (2014), 615–625.

[130] Houston, T. K., Cooper, L. A., and Ford, D. E. Internet support groups for depression: a 1-year

prospective cohort study. American Journal of Psychiatry 159, 12 (2002), 2062–2068.

[131] Høybye, M. T., Johansen, C., and Tjørnhøj-Thomsen, T. Online interaction. Effects of storytelling

in an Internet breast cancer support group. Psycho-Oncology 14, 3 (2005), 211–220.

BIBLIOGRAPHY 151

[132] Hulth, A., and Rydevik, G. Web query-based surveillance in Sweden during the influenza A (H1N1)

2009 pandemic, April 2009 to February 2010. Euro Surveillance 16, 18 (2011).

[133] Humphreys, K. Circles of recovery: Self-help organizations for addictions. Cambridge Univ. Press,

2004.

[134] Hwang, K. O., Ottenbacher, A. J., Green, A. P., Cannon-Diehl, M. R., Richardson, O., Bernstam,

E. V., and Thomas, E. J. Social support in an Internet weight loss community. International Journal

of Medical Informatics 79, 1 (2010), 5–13.

[135] Jamison-Powell, S., Linehan, C., Daley, L., Garbett, A., and Lawson, S. I can’t get no sleep:

discussing #insomnia on Twitter. In Human Factors in Computing Systems, ACM (2012), 1501–

1510.

[136] Jha, M., and Elhadad, N. Cancer stage prediction based on patient online discourse. In workshop

on Biomedical Natural Language Processing, ACL (2010), 64–71.

[137] Johnson, H. A., Wagner, M. M., Hogan, W. R., Chapman, W., Olszewski, R. T., Dowling, J., Barnas,

G., et al. Analysis of web access logs for surveillance of influenza. Studies in Health Technology

and Informatics 107, Pt 2 (2004), 1202–1206.

[138] Jonquet, C., Shah, N. H., and Musen, M. A. The Open Biomedical Annotator. In summit on

Translational Bioinformatics, AMIA (2009), 56.

[139] Kandel, D. B. Stages and pathways of drug involvement: Examining the gateway hypothesis.

Cambridge University Press, 2002.

[140] Kaskutas, L. A., Bond, J., and Humphreys, K. Social networks as mediators of the effect of

Alcoholics Anonymous. Addiction 97, 7 (2002), 891–900.

[141] Kelly, J. F., Hoeppner, B., Stout, R. L., and Pagano, M. Determining the relative importance of

the mechanisms of behavior change within Alcoholics Anonymous: a multiple mediator analysis.

Addiction 107, 2 (2012), 289–299.

[142] Kendall, L., Hartzler, A., Klasnja, P., and Pratt, W. Descriptive analysis of physical activity conver-

sations on Twitter. In extended abstracts on Human Factors in Computing Systems, ACM (2011),

1555–1560.

BIBLIOGRAPHY 152

[143] Keselman, A., Smith, C. A., Divita, G., Kim, H., Browne, A. C., Leroy, G., and Zeng-Treitler, Q.

Consumer health concepts that do not map to the UMLS: where do they fit? Journal of the

American Medical Informatics Association 15, 4 (2008), 496–505.

[144] Keselman, A., Tse, T., Crowell, J., Browne, A., Ngo, L., and Zeng, Q. Assessing consumer health

vocabulary familiarity: an exploratory study. Journal of Medical Internet Research 9, 1 (2007), e5.

[145] Kim, J.-D., Ohta, T., Tateisi, Y., and Tsujii, J. GENIA corpus – a semantically annotated corpus for

bio-textmining. Bioinformatics 19, suppl 1 (2003), i180–i182.

[146] Kim, J.-D., Ohta, T., Tsuruoka, Y., Tateisi, Y., and Collier, N. Introduction to the bio-entity recog-

nition task at JNLPBA. In joint workshop on Natural Language Processing in Biomedicine and its

Applications, ACL (2004), 70–75.

[147] Kittur, A., Chi, E. H., and Suh, B. Crowdsourcing user studies with Mechanical Turk. In Human

Factors in Computing Systems, ACM (2008), 453–456.

[148] Klemm, P., Bunnell, D., Cullen, M., Soneji, R., Gibbons, P., and Holecek, A. Online cancer support

groups: a review of the research literature. Computers Informatics Nursing (2003).

[149] Kummervold, P. E., Gammon, D., Bergvik, S., Johnsen, J.-A. K., Hasvold, T., and Rosenvinge, J. H.

Social support in a wired world: use of online mental health forums in Norway. Nordic Journal of

Psychiatry 56, 1 (2002), 59–65.

[150] LaCoursiere, S. P., Knobf, M. T., and McCorkle, R. Cancer patients’ self-reported attitudes about

the Internet. Journal of Medical Internet Research 7, 3 (2005), e22.

[151] Lafferty, J., McCallum, A., and Pereira, F. C. Conditional random fields: Probabilistic models for

segmenting and labeling sequence data. In International Conference on Machine Learning, ACM

(2001), 282–289.

[152] Lamb, A., Paul, M. J., and Dredze, M. Separating fact from fear: Tracking flu infections on Twitter.

In North American Chapter of the ACL: Human Language Technologies, ACL (2013), 789–795.

[153] Lasker, J. N., Sogolow, E. D., and Sharim, R. R. The role of an online community for people with

a rare disease: content analysis of messages posted on a primary biliary cirrhosis mailing list.

Journal of Medical Internet Research 7, 1 (2005), e10.

[154] Leaman, R., Wojtulewicz, L., Sullivan, R., Skariah, A., Yang, J., and Gonzalez, G. Towards

Internet-age pharmacovigilance: extracting adverse drug reactions from user posts to health-related

social networks. In workshop on Biomedical Natural Language Processing, ACL (2010),

117–125.

[155] Lembke, A., and Humphreys, K. Self-Help Organizations for Substance Use Disorders. Oxford Univ.

Press, 2009.

[156] Lewis, T. Seeking health information on the Internet: lifestyle choice or bad attack of cyberchondria?

Media, Culture & Society 28, 4 (2006), 521–539.

[157] Liang, P. Semi-supervised learning for natural language. PhD thesis, Massachusetts Institute of

Technology, 2005.

[158] Lieberman, M. A., Golant, M., Giese-Davis, J., Winzlenberg, A., Benjamin, H., Humphreys, K.,

Kronenwetter, C., Russo, S., and Spiegel, D. Electronic support groups for breast carcinoma.

Cancer 97, 4 (2003), 920–925.

[159] MacLean, D. L., and Heer, J. Identifying medical terms in patient-authored text: a crowdsourcing-based

approach. Journal of the American Medical Informatics Association 20, 6 (2013), 1120–1127.

[160] Malik, S. H., and Coulson, N. The male experience of infertility: a thematic analysis of an online

infertility support group bulletin board. Journal of Reproductive and Infant Psychology 26, 1 (2008),

18–30.

[161] Malik, S. H., and Coulson, N. S. Coping with infertility online: An examination of self-help mechanisms

in an online infertility support group. Patient Education and Counseling 81, 2 (2010),

315–318.

[162] Maloney-Krichmar, D., and Preece, J. A multilevel analysis of sociability, usability, and community

dynamics in an online health community. ACM Transactions on Computer-Human Interaction 12,

2 (2005), 201–232.

[163] Mandl, K. D., Overhage, J. M., Wagner, M. M., Lober, W. B., Sebastiani, P., Mostashari, F., Pavlin,

J. A., Gesteland, P. H., Treadwell, T., Koski, E., et al. Implementing syndromic surveillance: a

practical guide informed by the early experience. Journal of the American Medical Informatics

Association 11, 2 (2004), 141–150.

[164] Mankoff, J., Kuksenok, K., Kiesler, S., Rode, J. A., and Waldman, K. Competing online viewpoints

and models of chronic illness. In Human Factors in Computing Systems, ACM (2011), 589–598.

[165] Mayer, D. K., Terrin, N. C., Kreps, G. L., Menon, U., McCance, K., Parsons, S. K., and Mooney,

K. H. Cancer survivors information seeking behaviors: a comparison of survivors who do and do

not seek information about cancer. Patient Education and Counseling 65, 3 (2007), 342–350.

[166] Mayer, M., and Till, J. The Internet: a modern Pandora’s box? Quality of Life Research 5, 6

(1996), 568–571.

[167] McCray, A. T., Loane, R. F., Browne, A. C., and Bangalore, A. K. Terminology issues in user

access to web-based medical information. In American Medical Informatics Association Annual

Symposium, AMIA (1999), 107.

[168] McLellan, A. T. What is recovery? Revisiting the Betty Ford Institute consensus panel definition.

Journal of Substance Abuse Treatment (2010), 109–113.

[169] McLellan, A. T., Lewis, D. C., O’Brien, C. P., and Kleber, H. D. Drug dependence, a chronic medical

illness: implications for treatment, insurance, and outcomes evaluation. Journal of the American

Medical Association 284, 13 (2000), 1689–1695.

[170] McNeil, K., Brna, P., and Gordon, K. Epilepsy in the Twitter era: a need to re-tweet the way we

think about seizures. Epilepsy & Behavior 23, 2 (2012), 127–130.

[171] Medawar, C., Herxheimer, A., Bell, A., and Jofre, S. Paroxetine, Panorama and user reporting of

ADRs: Consumer intelligence matters in clinical practice and post-marketing drug surveillance. The

International Journal of Risk and Safety in Medicine 15, 3 (2002), 161–169.

[172] MedlinePlus use by quarter. National Library of Medicine (2013).

[Online: http://www.nlm.nih.gov/medlineplus/usestatistics.html, accessed 25-August-2014].

[173] Meier, A., Lyons, E. J., Frydman, G., Forlenza, M., and Rimer, B. K. How cancer survivors provide

support on cancer-related Internet mailing lists. Journal of Medical Internet Research 9, 2 (2007),

e12.

[174] Merrill, J. O., Rhodes, L. A., Deyo, R. A., Marlatt, G. A., and Bradley, K. A. Mutual mistrust in the

medical care of drug users. Journal of General Internal Medicine 17, 5 (2002), 327–333.

[175] Migneault, J. P., Adams, T. B., and Read, J. P. Application of the transtheoretical model to substance

abuse: historical development and future directions. Drug and Alcohol Review 24, 5 (2005),

437–448.

[176] Miller, N. S., Sheppard, L. M., Colenda, C. C., and Magen, J. Why physicians are unprepared

to treat patients who have alcohol- and drug-related disorders. Academic Medicine 76, 5 (2001),

410–418.

[177] Mo, P. K., and Coulson, N. S. Exploring the communication of social support within virtual communities:

A content analysis of messages posted to an online HIV/AIDS support group. CyberPsychology

& Behavior 11, 3 (2008), 371–374.

[178] Morahan-Martin, J. M. How Internet users find, evaluate, and use online health information: a

cross-cultural review. CyberPsychology & Behavior 7, 5 (2004), 497–510.

[179] Mulveen, R., and Hepworth, J. An interpretative phenomenological analysis of participation in a

pro-anorexia Internet site and its relationship with disordered eating. Journal of Health Psychology

11, 2 (2006), 283–296.

[180] Murnane, E. L., and Counts, S. Unraveling abstinence and relapse: smoking cessation reflected

in social media. In Human Factors in Computing Systems, ACM (2014), 1345–1354.

[181] Murray, E., Lo, B., Pollack, L., Donelan, K., Catania, J., Lee, K., Zapert, K., and Turner, R. The

impact of health information on the Internet on health care and the physician-patient relationship:

national U.S. survey among 1.050 U.S. physicians. Journal of Medical Internet Research 5, 3

(2003).

[182] Nettleton, S., Burrows, R., and O’Malley, L. The mundane realities of the everyday lay use of the

Internet for health, and their consequences for media convergence. Sociology of Health & Illness

27, 7 (2005), 972–992.

[183] Nikfarjam, A., and Gonzalez, G. H. Pattern mining for extraction of mentions of adverse drug

reactions from user comments. In American Medical Informatics Association Annual Symposium,

AMIA (2011), 1019.

[184] Noble, A., Best, D., Man, L.-H., Gossop, M., and Strang, J. Self-detoxification attempts among

methadone maintenance patients: what methods and what success? Addictive Behaviors 27, 4

(2002), 575–584.

[185] Nonnecke, B., and Preece, J. Shedding light on lurkers in online communities. Ethnographic Studies

in Real and Virtual Environments: Inhabited Information Spaces and Connected Communities

(1999), 123–128.

[186] Nonnecke, B., and Preece, J. Lurker demographics: Counting the silent. In Human Factors in

Computing Systems, ACM (2000), 73–80.

[187] Olsen, Y., and Sharfstein, J. M. Confronting the stigma of opioid use disorder – and its treatment.

Journal of the American Medical Association 311, 14 (2014), 1393–1394.

[188] Owen, J. E., Giese-Davis, J., Cordova, M., Kronenwetter, C., Golant, M., and Spiegel, D. Self-report

and linguistic indicators of emotional expression in narratives as predictors of adjustment to

cancer. Journal of Behavioral Medicine 29, 4 (2006), 335–345.

[189] Owoputi, O., O’Connor, B., Dyer, C., Gimpel, K., Schneider, N., and Smith, N. A. Improved part-of-speech

tagging for online conversational text with word clusters. In North American Chapter of

the ACL: Human Language Technologies, ACL (2013), 380–390.

[190] Pagano, M. E., Friend, K. B., Tonigan, J. S., and Stout, R. L. Helping other alcoholics in Alcoholics

Anonymous and drinking outcomes: Findings from Project MATCH. Journal of Studies on Alcohol

65, 6 (2004), 766.

[191] Park, S., Lee, S. W., Kwak, J., Cha, M., and Jeong, B. Activities on Facebook reveal the depressive

state of users. Journal of Medical Internet Research 15, 10 (2013), e217.

[192] Parker, K., and Wang, W. Modern Parenthood. Pew Internet & American Life Project, 2013.

[Online: http://www.pewsocialtrends.org/2013/03/14/modern-parenthood-roles-of-moms-and-dads-converge-as-they-balance-work,

accessed 2-April-2013].

[193] Paul, M. J., and Dredze, M. A model for mining public health topics from Twitter. In Health, vol. 11

(2012), 16–6.

[194] Peat, H. J., and Willett, P. The limitations of term co-occurrence data for query expansion in

document retrieval systems. JASIS 42, 5 (1991), 378–383.

[195] Pennebaker, J. W., Francis, M. E., and Booth, R. J. Linguistic inquiry and word count: LIWC 2001.

Mahwah: Lawrence Erlbaum Associates 71 (2001).

[196] Pennebaker, J. W., Mehl, M. R., and Niederhoffer, K. G. Psychological aspects of natural language

use: Our words, our selves. Annual Review of Psychology 54, 1 (2003), 547–577.

[197] Ploderer, B., Smith, W., Howard, S., Pearce, J., and Borland, R. Patterns of support in an online

community for smoking cessation. In International Conference on Communities and Technologies,

ACM (2013), 26–35.

[198] Polgreen, P. M., Chen, Y., Pennock, D. M., Nelson, F. D., and Weinstein, R. A. Using Internet

searches for influenza surveillance. Clinical Infectious Diseases 47, 11 (2008), 1443–1448.

[199] Potts, H. W., and Wyatt, J. C. Survey of doctors’ experience of patients using the Internet. Journal

of Medical Internet Research 4, 1 (2002), e5.

[200] Powell, J., and Clarke, A. Internet information-seeking in mental health population survey. The

British Journal of Psychiatry 189, 3 (2006), 273–277.

[201] Pratt, W., and Yetisgen-Yildiz, M. A study of biomedical concept identification: MetaMap vs. people.

In American Medical Informatics Association Annual Symposium, AMIA (2003), 529.

[202] Preece, J., Nonnecke, B., and Andrews, D. The top five reasons for lurking: improving community

experiences for everyone. Computers in Human Behavior 20, 2 (2004), 201–223.

[203] Prochaska, J. O., and Velicer, W. F. The transtheoretical model of health behavior change.

American Journal of Health Promotion 12, 1 (1997), 38–48.

[204] Pyysalo, S., Ginter, F., Heimonen, J., Björne, J., Boberg, J., Järvinen, J., and Salakoski, T. BioInfer:

a corpus for information extraction in the biomedical domain. BMC Bioinformatics 8, 1 (2007), 50.

[205] Rainie, L., and Fox, S. The Online Health Care Revolution. Pew Internet & American Life Project, 2000.

[Online: http://www.pewinternet.org/2000/11/26/the-online-health-care-revolution/, accessed 2-April-2013].

[206] Ravert, R. D., Hancock, M. D., and Ingersoll, G. M. Online forum messages posted by adolescents

with type 1 diabetes. The Diabetes Educator 30, 5 (2004), 827–834.

[207] Reis, B. Y., and Mandl, K. D. Time series modeling for syndromic surveillance. BMC Medical

Informatics and Decision Making 3, 1 (2003), 2.

[208] Resnik, P., Garron, A., and Resnik, R. Using topic modeling to improve prediction of neuroticism

and depression. In Conference on Empirical Methods in Natural Language Processing, ACL

(2013), 1348–1353.

[209] Rideout, V. Generation Rx.com. What are young people really doing online? Marketing Health

Services 22, 1 (2002), 26.

[210] Risk, A., and Petersen, C. Health information on the Internet: quality issues and international

initiatives. Journal of the American Medical Association 287, 20 (2002), 2713–2715.

[211] Rodgers, S., and Chen, Q. Internet community group participation: Psychosocial benefits for

women with breast cancer. Journal of Computer-Mediated Communication 10, 4 (2005).

[212] Ruau, D., Mbagwu, M., Dudley, J. T., Krishnan, V., and Butte, A. J. Comparison of automated and

human assignment of MeSH terms on publicly-available molecular datasets. Journal of Biomedical

Informatics 44 (2011), S39–S43.

[213] Sadilek, A., Brennan, S., Kautz, H., and Silenzio, V. nEmesis: Which restaurants should you avoid

today? In Human Computation and Crowdsourcing, AAAI (2013).

[214] Saha, S. K., Sarkar, S., and Mitra, P. Feature selection techniques for maximum entropy based

biomedical named entity recognition. Journal of Biomedical Informatics 42, 5 (2009), 905–911.

[215] Salem, D. A., Bogat, G. A., and Reid, C. Mutual help goes on-line. Journal of Community Psychology

25, 2 (1997), 189–207.

[216] Sanderson, M., and Croft, B. Deriving concept hierarchies from text. In Research and Development

in Information Retrieval, ACM SIGIR (1999), 206–213.

[217] Sanford, A. A. “I can air my feelings instead of eating them”: Blogging as social support for the

morbidly obese. Communication Studies 61, 5 (2010), 567–584.

[218] Scanfeld, D., Scanfeld, V., and Larson, E. L. Dissemination of health information through social

networks: Twitter and antibiotics. American Journal of Infection Control 38, 3 (2010), 182–188.

[219] Schatz, B. R., Johnson, E. H., Cochrane, P. A., and Chen, H. Interactive term suggestion for

users of digital libraries: Using subject thesauri and co-occurrence lists for information retrieval. In

International Conference on Digital libraries, ACM (1996), 126–133.

[220] Seale, C., Ziebland, S., and Charteris-Black, J. Gender, cancer experience and Internet use: a

comparative keyword analysis of interviews and online cancer support groups. Social Science &

Medicine 62, 10 (2006), 2577–2590.

[221] Seifter, A., Schwarzwalder, A., Geis, K., and Aucott, J. The utility of Google Trends for epidemiological

research: Lyme disease as an example. Geospatial Health 4, 2 (2010), 135–137.

[222] Settles, B. Biomedical named entity recognition using conditional random fields and rich feature

sets. In joint workshop on Natural Language Processing in Biomedicine and its Applications, ACL

(2004), 104–107.

[223] Sheeren, M. The relationship between relapse and involvement in Alcoholics Anonymous. Journal

of Studies on Alcohol and Drugs 49, 1 (1988), 104.

[224] Shuyler, K. S., and Knight, K. M. What are patients seeking when they turn to the Internet?

Qualitative content analysis of questions asked by visitors to an orthopaedics web site. Journal of

Medical Internet Research 5, 4 (2003), e24.

[225] Sillence, E., Briggs, P., Harris, P. R., and Fishwick, L. How do patients evaluate and make use of

online health information? Social Science & Medicine 64, 9 (2007), 1853–1862.

[226] Smith, C. A., and Wicks, P. J. PatientsLikeMe: Consumer health vocabulary as a folksonomy. In

American Medical Informatics Association Annual Symposium, AMIA (2008), 682.

[227] Smyth, B., Barry, J., Keenan, E., and Ducray, K. Lapse and relapse following inpatient treatment

of opiate dependence. Irish Medical Journal 103, 6 (2010), 176–179.

[228] Snow, R., O’Connor, B., Jurafsky, D., and Ng, A. Y. Cheap and fast—but is it good? Evaluating

non-expert annotations for natural language tasks. In Empirical Methods in Natural Language

Processing, ACL (2008), 254–263.

[229] Sproule, B., Brands, B., Li, S., and Catz-Biro, L. Changing patterns in opioid addiction – characterizing

users of oxycodone and other opioids. Canadian Family Physician 55, 1 (2009), 68–69.

[230] Strang, J., Babor, T., Caulkins, J., Fischer, B., Foxcroft, D., and Humphreys, K. Drug policy and

the public good: evidence for effective interventions. The Lancet 379, 9810 (2012), 71–83.

[231] Substance Abuse and Mental Health Services Administration. Drug Abuse Warning Network,

2011: National Estimates of Drug-Related Emergency Department Visits. HHS Publication No.

(SMA) 13-4760, DAWN Series D-39. Rockville, MD: Substance Abuse and Mental Health Services

Administration, 2013.

[232] Substance Abuse and Mental Health Services Administration, Center for Behavioral Health Statistics

and Quality. The N-SSATS report: Trends in the use of methadone and buprenorphine at

substance abuse treatment facilities: 2003 to 2011. Rockville, MD. 2013.

[233] Sullivan, C. F. Gendered cybersupport: A thematic analysis of two online cancer support groups.

Journal of Health Psychology 8, 1 (2003), 83–104.

[234] Sullivan, S. J., Schneiders, A. G., Cheang, C.-W., Kitto, E., Lee, H., Redhead, J., Ward, S., Ahmed,

O. H., and McCrory, P. R. What’s happening? A content analysis of concussion-related traffic on

Twitter. British Journal of Sports Medicine 46, 4 (2012), 258–263.

[235] Teodoro, R., and Naaman, M. Fitter with Twitter: Understanding personal health and fitness

activity in social media. In International Conference on Weblogs and Social Media (2013).

[236] Thomas, D. R. A general inductive approach for analyzing qualitative evaluation data. American

Journal of Evaluation 27, 2 (2006), 237–246.

[237] Tonigan, J. S., and Rice, S. L. Is it beneficial to have an Alcoholics Anonymous sponsor?

Psychology of Addictive Behaviors 24, 3 (2010), 397.

[238] Tsai, R. T.-H., Wu, S.-H., Chou, W.-C., Lin, Y.-C., He, D., Hsiang, J., Sung, T.-Y., and Hsu, W.-L.

Various criteria in the evaluation of biomedical named entity recognition. BMC Bioinformatics 7, 1

(2006), 92.

[239] Tsai, T.-H., Chou, W.-C., Wu, S.-H., Sung, T.-Y., Hsiang, J., and Hsu, W.-L. Integrating linguistic

knowledge into a conditional random field framework to identify biomedical named entities. Expert

Systems with Applications 30, 1 (2006), 117–128.

[240] Turner-McGrievy, G. M., and Tate, D. F. Weight loss social support in 140 characters or less: use of

an online social network in a remotely delivered weight loss intervention. Translational Behavioral

Medicine 3, 3 (2013), 287–294.

[241] United States Department of Health and Human Services. Substance Abuse and Mental

Health Services Administration. Center for Behavioral Health Statistics and Quality.

Treatment Episode Data Set – Admissions (TEDS-A), 2011. ICPSR34876-v3. Ann Arbor,

MI: Inter-university Consortium for Political and Social Research [distributor], 2014-09-11.

http://doi.org/10.3886/ICPSR34876.v3.

[242] U.S. Department of Health and Human Services. Substance Abuse and Mental Health Services

Administration. Results from the 2010 National Survey on Drug Use and Health: Summary of

National Findings. [Online: http://www.samhsa.gov/data/nsduh/2k10nsduh/2k10results.htm,

accessed 15-Sept-2013].

[243] Ussher, J., Kirsten, L., Butow, P., and Sandoval, M. What do cancer support groups provide which

other supportive relationships do not? The experience of peer support groups for people with

cancer. Social Science & Medicine 62, 10 (2006), 2565–2576.

[244] Van Hout, M. C., and Bingham, T. Silk Road, the virtual drug marketplace: a single case study of

user experiences. International Journal of Drug Policy 24, 5 (2013), 385–391.

[245] van Rijsbergen, C. J. A theoretical basis for the use of co-occurrence data in information retrieval.

Journal of Documentation 33, 2 (1977), 106–119.

[246] van Uden-Kraan, C. F., Drossaert, C. H., Taal, E., Seydel, E. R., and van de Laar, M. A. Self-reported

differences in empowerment between lurkers and posters in online patient support

groups. Journal of Medical Internet Research 10, 2 (2008), e18.

[247] Velicer, W. F., Prochaska, J. O., Fava, J. L., Norman, G. J., and Redding, C. A. Smoking cessation

and stress management: Applications of the transtheoretical model of behavior change.

Homeostasis in Health and Disease 38 (1998), 216–233.

[248] Vlahovic, T. A., Wang, Y.-C., Kraut, R. E., and Levine, J. M. Support matching and satisfaction

in an online breast cancer support community. In Human Factors in Computing Systems, ACM

(2014), 1625–1634.

[249] Volkow, N. D. Prescription drugs: Abuse and addiction, 2005.

[Online: http://www.drugabuse.gov/sites/default/files/rxreportfinalprint.pdf, accessed 9/4/2014].

[250] Wang, Y.-C., Kraut, R., and Levine, J. M. To stay or leave? The relationship of emotional and

informational support to commitment in online health support groups. In Computer Supported

Cooperative Work, ACM (2012), 833–842.

[251] Warner, M., Chen, L. H., Makuc, D. M., Anderson, R. N., and Minino, A. M. Drug poisoning deaths

in the United States, 1980–2008. NCHS Data Brief, 81 (2011), 1–8.

[252] Wen, M., and Rose, C. P. Understanding participant behavior trajectories in online health support

groups using automatic extraction methods. In International Conference on Supporting Group

Work, ACM (2012), 179–188.

[253] West, R. Time for a change: putting the transtheoretical (stages of change) model to rest.

Addiction 100, 8 (2005), 1036–1039.

[254] White, R. W., and Horvitz, E. Cyberchondria: studies of the escalation of medical concerns in web

search. ACM Transactions on Information Systems 27, 4 (2009), 23.

[255] White, R. W., and Horvitz, E. Web to world: Predicting transitions from self-diagnosis to the pursuit

of local medical assistance in web search. In American Medical Informatics Association Annual

Symposium, AMIA (2010), 882.

[256] White, R. W., and Horvitz, E. Studies of the onset and persistence of medical concerns in search

logs. In Research and Development in Information Retrieval, ACM SIGIR (2012), 265–274.

[257] White, R. W., Tatonetti, N. P., Shah, N. H., Altman, R. B., and Horvitz, E. Web-scale pharmacovigilance:

listening to signals from the crowd. Journal of the American Medical Informatics Association

20, 1 (2013), 404–408.

[258] Wicks, P., Keininger, D. L., Massagli, M. P., de la Loge, C., Brownstein, C., Isojarvi, J., and Heywood,

J. Perceived benefits of sharing health data between people with epilepsy on an online

platform. Epilepsy & Behavior 23, 1 (2012), 16–23.

[259] Wicks, P., Massagli, M., Frost, J., Brownstein, C., Okun, S., Vaughan, T., Bradley, R., and Heywood,

J. Sharing health data for better outcomes on PatientsLikeMe. Journal of Medical Internet

Research 12, 2 (2010), e19.

[260] Wicks, P., Vaughan, T. E., Massagli, M. P., and Heywood, J. Accelerated clinical discovery using

self-reported patient data collected online and a patient-matching algorithm. Nature Biotechnology

29, 5 (2011), 411–414.

[261] Wilson, J. L., Peebles, R., Hardy, K. K., and Litt, I. F. Surfing for thinness: a pilot study of pro-eating

disorder web site usage in adolescents with eating disorders. Pediatrics 118, 6 (2006),

e1635–e1643.

[262] Wilson, K., and Brownstein, J. S. Early detection of disease outbreaks using the Internet.

Canadian Medical Association Journal 180, 8 (2009), 829–831.

[263] Wood, E., Samet, J. H., and Volkow, N. D. Physician education in addiction medicine. Journal of

the American Medical Association 310, 16 (2013), 1673–1674.

[264] Xu, R., Supekar, K., Morgan, A., Das, A., and Garber, A. Unsupervised method for automatic construction

of a disease dictionary from a large free text collection. In American Medical Informatics

Association Annual Symposium, AMIA (2008), 820.

[265] Yang, C. C., Jiang, L., Yang, H., and Tang, X. Detecting signals of adverse drug reactions from

health consumer contributed content in social media. In workshop on Health Informatics, ACM

SIGKDD (2012).

[266] Yang, C. C., Yang, H., Jiang, L., and Zhang, M. Social media mining for drug safety signal detection.

In workshop on Smart Health and Wellbeing, ACM (2012), 33–40.

[267] Yang, Z., Lin, H., and Li, Y. Exploiting the contextual cues for bio-entity name recognition in

biomedical literature. Journal of Biomedical Informatics 41, 4 (2008), 580–587.

[268] Yates, A., and Goharian, N. ADRTrace: detecting expected and unexpected adverse drug reactions

from user reviews on social media sites. In Advances in Information Retrieval. Springer,

2013, 816–819.

[269] Yates, A., Goharian, N., and Frieder, O. Extracting adverse drug reactions from forum posts and

linking them to drugs. In workshop on Health Search and Discovery, ACM SIGIR (2013).

[270] Ybarra, M. L., and Eaton, W. W. Internet-based mental health interventions. Mental Health Services

Research 7, 2 (2005), 75–87.

[271] Yeh, A., Morgan, A., Colosimo, M., and Hirschman, L. BioCreAtIvE task 1A: gene mention finding

evaluation. BMC Bioinformatics 6, Suppl 1 (2005), S2.

[272] Zeng, Q., Kogan, S., Ash, N., Greenes, R., and Boxwala, A. Characteristics of consumer terminology

for health information retrieval. Methods of Information in Medicine 41, 4 (2002), 289–298.

[273] Zeng, Q. T., and Tse, T. Exploring and developing consumer health vocabularies. Journal of the

American Medical Informatics Association 13, 1 (2006), 24–29.

[274] Zeng, Q. T., Tse, T., Divita, G., Keselman, A., Crowell, J., Browne, A. C., Goryachev, S., and Ngo,

L. Term identification methods for consumer health vocabulary development. Journal of Medical

Internet Research 9, 1 (2007), e4.

[275] Ziebland, S., Chapple, A., Dumelow, C., Evans, J., Prinjha, S., and Rozmovits, L. How the Internet

affects patients’ experience of cancer: a qualitative study. British Medical Journal 328, 7439

(2004), 564.
