INSIGHTS FROM PATIENT AUTHORED TEXT:
FROM CLOSE READING TO AUTOMATED EXTRACTION
A DISSERTATION
SUBMITTED TO THE DEPARTMENT OF COMPUTER SCIENCE
AND THE COMMITTEE ON GRADUATE STUDIES
OF STANFORD UNIVERSITY
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY
DIANA LYNN MACLEAN
MARCH 2015
http://creativecommons.org/licenses/by-nc/3.0/us/
This dissertation is online at: http://purl.stanford.edu/nh030tg4542
© 2015 by Diana Lynn MacLean. All Rights Reserved.
Re-distributed by Stanford University under license with the author.
This work is licensed under a Creative Commons Attribution-Noncommercial 3.0 United States License.
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Jeffrey Heer, Primary Adviser
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Michael Bernstein
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Christopher Manning
I certify that I have read this dissertation and that, in my opinion, it is fully adequate in scope and quality as a dissertation for the degree of Doctor of Philosophy.
Stuart Card
Approved for the Stanford University Committee on Graduate Studies.
Patricia J. Gumport, Vice Provost for Graduate Education
This signature page was generated electronically upon submission of this dissertation in electronic format. An original signed hard copy of the signature page is on file in University Archives.
Abstract
Millions of people collaborate online with others who share their health concerns. In the process, these users perform complex health-related tasks, such as differential diagnosis and treatment comparison. The result is a massive, growing and readily accessible corpus of patient authored text (PAT) that documents patients’ behavior outside of the clinical environment. As a result, PAT can provide insights into otherwise obscure topics, such as why patients follow only certain parts of a treatment protocol, or how people self-treat stigmatized conditions such as prescription drug addiction.
Despite the potential value of PAT, attempts to extract medically-relevant insights from it have been limited. PAT is notoriously noisy and challenging to work with, and there is a dearth of methods and tools for processing and analyzing it. Moreover, the specific research questions that PAT can support are not obvious: determining what data PAT encodes, and how, is a challenge in and of itself.
In this thesis, I develop methods for automatically extracting medically-relevant data from PAT. I focus specifically on the topic of addiction: a stigmatized and prevalent medical condition. Building on close readings of source text to inform schema induction, data annotation, and feature engineering, I train classifiers that accurately identify (1) medically-relevant terms in PAT; (2) users’ motivations for participating in an addiction-related online health community; (3) users’ drugs of choice; and (4) users’ transitions through relapse and recovery. Using these classifiers to scale analyses to large PAT corpora, I derive novel insights into the process of addiction, as well as the role that online health communities play in giving users informational and emotional support and, ultimately, in enabling recovery.
In concert, these contributions both underscore PAT’s latent value for illuminating poorly understood or clandestine medical topics, and offer viable methods that dramatically improve our ability to realize this value.
For Angus and June
Acknowledgements
My first and foremost thanks go to my advisor, Jeffrey Heer. Jeff has been a wonderful source of support, knowledge and inspiration during my time at Stanford, and I am deeply indebted to him for not only supporting my curiosity as my research ventured into uncharted territory, but for doing so with enthusiasm and confidence. Most importantly, however, Jeff has been an exemplary role model. I am lucky, grateful, and unquestionably better for having had the opportunity to learn from him, and am proud to be taking that with me as I start my next great adventure.
There are several people without whom this dissertation would not have been possible: Anna Lembke, who brought with her invaluable medical perspective, and whose enthusiasm, thoughtful insight and patience were instrumental in making this cross-disciplinary work a reality; Stuart Card, whose ingeniousness I aspire to, and whose advice I have had the fortune to benefit from on several occasions; Sonal Gupta, a close friend and collaborator from whom I have learned a great many things, and hope to learn many more; and Michael Bernstein and Christopher Manning, who have given generously of their time and advice, helping to steer this work from its inception through its completion.
I am fortunate to have had many wonderful co-conspirators while at Stanford. Sudheendra Hangal, whose patient support and advice were instrumental in my early graduate school years, has been a fantastic collaborator and a dear friend. Monica Lam, with whom I worked closely during my first year, remains an uplifting source of inspiration. The UW IDL group, the Stanford HCI group, and the fantastic people in the 3B wing have been a fun, dynamic and reliable source of new ideas, feedback and camaraderie, and will be greatly missed. Finally, Jillian Lentz and Monica Niemiec deserve special thanks for not only providing efficient administrative support, but also for answering even the most frantic of questions with a smile.
Finally, there are some people without whom I would not be where I am today. The inimitable Margo Seltzer who, suffice it to say, started this whole business in the first place; David Holland, whose patient and thorough technical tutelage stands me in good stead to this day; Will Phan, who helped me to see the real joy in coding; my mother, Heather, who is the embodiment of never giving up; and, of course, my husband, Isa, who inspires and challenges me to be a little better every day. It makes all the difference.
Table of Contents
1 Introduction
  1.1 Overview & Focus
  1.2 Contributions
  1.3 Outline of Thesis
2 The Internet and Health
  2.1 Online Health Information Seeking
    2.1.1 Historical Overview & Current Landscape
    2.1.2 What Health Information Do Users Seek Online?
    2.1.3 Who Seeks Health Information Online?
      Gender
      Age
      Health
      Race
      Socio-Economic Status & Education
      Role (Patient vs. Caregiver)
    2.1.4 Where Do People Find Health Information Online?
  2.2 Online Health Community Participation
    2.2.1 Modes of Participation
    2.2.2 Who Participates in OHCs?
    2.2.3 Reasons for Participation
      Medium-Based Affordances
      Informational Support
      Emotional Support
    2.2.4 Efficacy of Online Health Forums
  2.3 Summary
3 Prior Work on Patient Authored Text
  3.1 Patient Authored Text (PAT): Introduction & Overview
    3.1.1 Value of PAT
    3.1.2 Challenges of Working with PAT
      Noisiness
      Lack of Analysis Tools
      Applicability to Research Questions
  3.2 Syndromic Surveillance
    3.2.1 Condition
    3.2.2 Data Source
    3.2.3 Filtering
    3.2.4 Modeling and Prediction
    3.2.5 Real-World Evaluation Dataset
  3.3 Pharmacovigilance
    3.3.1 Data Source
    3.3.2 Identifying Drugs in PAT
    3.3.3 Identifying Adverse Events in PAT
    3.3.4 Evaluation
  3.4 Named Entity Recognition
    3.4.1 Ontology-Based Tools
    3.4.2 Statistical Classifiers
  3.5 Thematic Analysis
    3.5.1 Condition
    3.5.2 Data Source
    3.5.3 Analysis Question
    3.5.4 Scaling Thematic Analyses
  3.6 Summary
4 Data
  4.1 MedHelp Corpus
    4.1.1 Terminology
    4.1.2 Forum77
  4.2 CureTogether Corpus
5 Identifying Medically Relevant Terms in PAT
  5.1 Introduction
  5.2 Related Work
    5.2.1 Medical Term Identification
    5.2.2 Consumer Health Vocabularies
  5.3 Data
    5.3.1 Preparation
    5.3.2 Samples
  5.4 Labeling Medically Relevant Terms with the Crowd
    5.4.1 Task Design and Pilot Study
    5.4.2 Experiment
      Determining a Gold Standard
      Comparing Turkers Against a Gold Standard
    5.4.3 Results
    5.4.4 Limitations of the Crowd
  5.5 Training a Classifier on Crowd-Labeled Data
    5.5.1 Models
    5.5.2 Design
    5.5.3 Results
      Failure Analysis
  5.6 Example Applications of ADEPT to PAT
    5.6.1 Summarizing Important Medical Content in MedHelp’s Arthritis Forum
    5.6.2 Navigating MedHelp’s Substance Abuse Forum (Forum77)
  5.7 Conclusion
6 What do People Seek on Forum77?
  6.1 Why Study Addiction?
    6.1.1 Addiction is Highly Prevalent
    6.1.2 Addiction is Highly Stigmatized
    6.1.3 People are Turning Online for Help with Addiction
  6.2 Related Work
  6.3 Data
    6.3.1 Thematic Analysis Development Dataset
    6.3.2 Labeled Training & Testing Dataset
  6.4 Who Posts?
  6.5 Users’ Objectives in Initiating Discussions
  6.6 Classifying Informational vs. Emotional Support
    6.6.1 Training Dataset Annotation and Agreement
    6.6.2 Classifier Training
    6.6.3 Classifier Performance
  6.7 Classifying Updates vs. Non-updates
    6.7.1 Classifier Training
    6.7.2 Classifier Performance
  6.8 Results
    6.8.1 Thomas’ Recipe: An Informal Collaboration
  6.9 Discussion
    6.9.1 Limitations and Future Work
  6.10 Summary
7 Identifying Drugs of Choice
  7.1 Related Work
  7.2 Datasets
  7.3 Automatically Identifying Drugs of Choice
    7.3.1 Definition of Drug of Choice
    7.3.2 Data Annotation
    7.3.3 Classifier Training & Evaluation
      Results
    7.3.4 Drug Term Resolution
  7.4 Comparing Real-World DOC Distributions
    7.4.1 Forum77
    7.4.2 Narcotics Anonymous
    7.4.3 TEDS
    7.4.4 DAWN
  7.5 Results
  7.6 Discussion
    7.6.1 Limitations & Future Work
  7.7 Summary
8 Quantifying Recovery and Relapse
  8.1 Introduction
  8.2 Background
    8.2.1 The Prescription Drug Abuse Cycle
      Withdrawal
      Self-Detoxification
      Relapse & Recovery
    8.2.2 In-Person Mutual Help Groups
    8.2.3 Inferring Health State from Social Media
  8.3 Data
  8.4 Exploring & Modeling Phases of Addiction
    8.4.1 Transtheoretical Model for Behavior Change
    8.4.2 Rubric Development
    8.4.3 A Taxonomy of the Phases of Addiction
    8.4.4 Labeling People, not Posts
  8.5 Characterizing the Phases of Addiction
    8.5.1 Sample & Labeling
    8.5.2 Activity Features
    8.5.3 Linguistic & Content Features
      LIWC Features
      Days Mentioned and Question Features
      Phase-Specific Term Features
    8.5.4 Results: Activity and Linguistic Features
  8.6 Automatically Classifying Addiction Phase
    8.6.1 Model & Features
    8.6.2 Performance
    8.6.3 Results
  8.7 Automatically Classifying Relapse and Recovery
    8.7.1 Identifying Relapse
    8.7.2 Identifying Recovery
    8.7.3 Results
  8.8 Discussion
    8.8.1 Use and Efficacy of Forum77
    8.8.2 Implications for Forum Design
    8.8.3 Implications for Addiction Treatment
    8.8.4 Limitations
  8.9 Summary
9 Conclusion
  9.1 Contribution Summary
  9.2 Future Work
    9.2.1 Supporting the Methodological Process
      Interface Support for Thematic Analysis
      Improved Tools for Annotation
      Mapping the Limits of the Crowd in PAT Annotation Tasks
    9.2.2 PAT Interface Design & Support
      Expose Aggregate Data to Users
      Support Data Entry
      Automatically Construct User Timelines
    9.2.3 Making the Leap to Medical Discoveries
  9.3 Concluding Remarks
A ADEPT Supplementary Material
B F77 Purpose Supplementary Material
C F77 Drug of Choice Supplementary Material
D F77 Phase Supplementary Material
List of Tables
4.1 Top 40 MedHelp forums ranked by total post count. A ◦ in the Stigmatized column denotes our conservative estimate of whether the condition represented by the forum carries a stigma or is otherwise embarrassing.
5.1 Majority vote at the token level over RN responses. Terms identified by RNs as medically relevant are shown in bold. Stopwords (e.g., “and”, “of”) are excluded from the vote.
5.2 Turker performance against the RN gold standard. Voting threshold indicates the minimum number of Turkers who have to annotate a term as medically relevant for it to be included in the result. Maximum column values are indicated in bold. A corroborative policy of 2+ votes yields high scores across the board, and maximizes F1-score.
5.3 Annotator performance against the crowd-labeled data set and the gold standards. Maximum column values are indicated in bold.
5.4 Examples of ADEPT’s misclassifications in the test corpora.
6.1 Summary statistics of a representative sample of online health communities focused on addiction recovery. We identified sites through Google searches and gathered statistics (if available) from site pages. Data current as of 3/1/2014.
6.2 Annotator-derived taxonomy for users’ objectives in initiating a post, with % prevalence in the 1,000 post labeled sample on the right. Note that (1) labels are mutually exclusive, and (2) “w/d” stands for “withdrawal”.
6.3 Descriptions and samples of taxonomy labels. Samples are synthesized in order to preserve user privacy.
6.4 Classifier performance for labeling initiating posts as seeking informational support or emotional support. Performance scores are averaged over 10 folds.
6.5 Classifier performance labeling posts as either update or non-update. Performance scores are averaged over 10 folds.
6.6 Thomas’ Recipe (circa 2001)
6.7 Thomas’ Recipe (circa 2006)
7.1 DOC classifier performance across term categories. The classifier performs best on correctly spelled, specific drug terms; worst on general drug terms.
7.2 Examples of DOCs extracted by our CRF classifier. Identified SOA terms are shown in bold in the context of their originating sentence, and the resolved drug name, generic name and category are shown on the right.
7.3 Summary of similarities and differences between our Forum77, NA, TEDS and DAWN datasets. Forum77 is unique in that participation is always voluntary and that users report only substances that they deem relevant.
7.4 Alignment of categories across the Forum77, NA, TEDS and DAWN datasets for comparative purposes. Exact category terms from each survey have been preserved in this table for replicability.
8.1 Addiction Phase Taxonomy derived via a thematic analysis.
8.2 Sample phase specific terms for the USING, WITHDRAWING and RECOVERING categories.
8.3 CRF performance scores aggregated over 10 runs of 10-fold cross validation, with randomly shuffled input sets.
8.4 Performance for identifying relapse events (top) and whether a user’s final state is RECOVERING (bottom). Combined scores across classes are shown in bold.
8.5 Comparison of activity features for users who are and are not RECOVERING in their last initiating post. Per-user values are aggregated over USING and WITHDRAWING posts. Statistical significance is determined using Kruskal-Wallis tests (*** p < 0.001) after Bonferroni corrections.
A.1 The following features are specified when training our CRF. Other features retain their default values as described at http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ie/NERFeatureFactory.html
B.1 Features used to train our purpose classifiers, which distinguish emotional from informational support seeking, as well as update from non-update posts.
C.1 Drug term resolution map, manually compiled from classifier output. The i column indicates whether the drug category is included in our analysis in Chapter 7.
C.2 The default feature list for Stanford’s NER classifier is at nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ie/NERFeatureFactory.html. Here, we list all features whose default values were changed to train our DOC classifier.
C.3 Gazette of common substances used as a feature in the DOC classifier. This gazette was compiled from a range of online resources.
D.1 LIWC features for the three classes in the labeled dataset over initiating posts. Only statistically significant variables are shown. Statistical significance is determined using Kruskal-Wallis tests (* p < 0.05; ** p < 0.005; *** p < 0.001) after Bonferroni corrections to adjust for family-wise error rate across all 184 variables (includes activity features). Column c denotes (◦) if the feature is used in our CRF classifier.
D.2 LIWC features for the three classes in the labeled dataset. Only statistically significant variables are shown. Statistical significance is determined using Kruskal-Wallis tests (* p < 0.05; ** p < 0.005; *** p < 0.001) after Bonferroni corrections to adjust for family-wise error rate across all 184 variables (includes activity features). Column c denotes (◦) if the feature is used in our CRF classifier.
D.3 Activity and content-based features for the three classes in the labeled dataset. Statistical significance is determined using Kruskal-Wallis tests (* p < 0.05; ** p < 0.005; *** p < 0.001) after Bonferroni corrections to adjust for family-wise error rate across all 184 variables (includes 160 LIWC variables). Column c denotes (◦) if the feature is used in our CRF classifier.
List of Figures
1.1 Our general methodological process. Nodes in grey show avenues for future work supported by our contributions.
4.1 Illustrative example of MedHelp and Forum77 content and structure.
4.2 Summary statistics of Forum77 variables: post volume by month (A), user volume by month (B), thread length distribution (C), user tenure distribution (D), user initiating post count distribution (E), and user response post count distribution (F).
5.1 Final PAT medical term identification task instructions and interface. Turkers were informed that their answers would be checked against other Turkers’ in the HIT description on the MTurk interface.
5.2 Sample sentences labeled by ADEPT, the dictionary, MetaMap, OBA and TerMINE.
5.3 Term classification accuracy plotted against logged term frequency in test corpora. Purple (darker) circles represent terms that are always classified correctly; blue (lighter) circles represent terms that are misclassified at least once. A LOWESS fit line to the entire data set (black) shows that most terms are always classified correctly. A LOWESS fit line to the misclassified points (blue/lighter) shows that classification accuracy increases with term frequency.
5.4 Top 50 terms, ranked by frequency, derived from MedHelp’s Arthritis forum as determined by ADEPT (left) and OBA (right). Terms unique to their respective portion of the list are shown in bold. Terms occurring in both lists are linked with a line. The gradient of these lines show that all co-occurring terms, bar three, are more highly ranked by ADEPT.
5.5 A graph showing important terms in Forum77 (nodes), and significant co-occurrence relationships between them (edges). Node size is proportional to degree, while colors indicate clusters. Node labels are omitted for legibility; instead, we examine main clusters in-depth in subsequent figures.
5.6 The largest cluster in Figure 5.5 suggests that discussions frequently involve detoxification
from prescription drugs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
5.7 The second-largest cluster in Figure 5.5 suggests that discussions frequently pair specific
drugs and the withdrawal symptoms that they cause. . . . . . . . . . . . . . . . . . . . . 60
5.8 The third-largest cluster in Figure 5.5 contains medically relevant terms from Thomas’
Recipe: a user-developed schedule for medication-assisted opioid withdrawal.
6.1 Thematic analysis process. Orange edges indicate the iterative component of the analysis.
6.2 Normalized transition probabilities and average transition times between consecutive update
and non-update posts.
7.1 Drug of choice distributions (% of population using) across the Forum77, TEDS, NA and
DAWN data sets.
7.2 Prevalence of major opioids in the Forum77 population over time.
8.1 Illustration of how sequence analysis can (1) reduce NA labels by leveraging context from
surrounding posts, and (2) capture relapse events in regressive sequences without requiring
the user to explicitly state that she relapsed.
8.2 Confusion matrix for our CRF classifier aggregated across 10 randomized runs of 10-fold
cross validation.
8.3 (a) Normalized transition frequencies between addiction phases (e.g., USING → RECOVERING
edges comprise 1.12% of the total transitions in the CRF-labeled data) and (b)
conditional transition probabilities (e.g., the probability of a user moving from USING to
RECOVERING is 4.57%).
8.4 Distributions of phase lengths. A red bar indicates the median value, while the dark blue
region indicates the middle spread. The light blue region indicates values that fall within
1.5 × the interquartile range of the middle spread.
8.5 Aggregated user transitions from start to end state. Bar widths denote population proportion.
For example, 48% of users in our sample relapsed during their tenure on Forum77.
9.1 Our general methodological process. Nodes in grey show avenues for future work supported
by our contributions.
Chapter 1
Introduction
Just keep in mind that whether you recommend online support groups or not, your patients
will use them. There’s no getting around the fact that certain patients in your practice will be-
come as knowledgeable about their conditions as they can. They will also begin to develop
clinical judgment on their own.
– Deborah Grandinetti: Doctors and the Web. Help your patients surf the Net safely [104].
1.1 Overview & Focus
The Internet has revolutionized the way in which people interact with medical knowledge, transforming
its availability, leveling the playing field in terms of who can contribute such knowledge, and facilitating
connections between people with shared health concerns. While accessing and sharing medical knowledge
via traditional resources (e.g., medical practitioners, textbooks and pamphlets) still requires overcoming
financial, scheduling and geographic barriers, online resources are largely free of such frictions.
Indeed, the use of the Internet as a health resource is one of its earliest functions: with the
commercialization of the Internet in 1995, patients readily took advantage of the ability to collaborate
with others who shared their health concerns, and the first online health communities (OHCs), in the
form of listservs, came into existence.
Demand for such groups remains high today. Pew’s 2013 Health Online survey [91] reported that
59% of U.S. adults looked online for health information in the last year, and that of these, 16-18% specif-
ically sought to find others who shared their health concerns. Based on the U.S. Census Bureau’s
population estimate for 2013 [9], this comprises some 50-57 million people. Today, thousands of OHCs
exist, and while their interfaces have become slightly more sophisticated, their underlying functionality of
connecting patients with mutual health interests remains unchanged.
Through participation in online mutual help groups, patients can spend a sizable number of hours
performing complex, health-related tasks. These include differential diagnoses (of either their own or
someone else’s condition), treatment comparison and evaluation, symptom measurement and docu-
mentation, and seeking and providing emotional support. To perform these tasks, patients draw on a
variety of resources: their own experiential knowledge, observations of other community members’ ex-
periences, information sourced from healthcare providers, and the fruits of self-directed research efforts.
The culmination of this effort is a massive, and growing, corpus of data contributed by patients who have
gained not a small degree of clinical expertise in their own condition. Although in some cases these data
are structured (e.g., PatientsLikeMe1 and CureTogether2 collect symptom severity measurements on nu-
merical scales), for the most part OHCs have barely deviated from the original listserv format, meaning
that a large portion of these data exist as free-form text.
We term any medical text authored by patients patient authored text (PAT). PAT contains inherently
valuable content. Foremost, PAT uniquely documents patients’ behavior outside of the clinical environ-
ment. As such, it can host insight into topics that remain obscure in traditional medical data sets, such
as why patients follow only certain parts of a treatment protocol, or how people self-manage conditions
that carry a stigma in the medical profession, like addiction [176,187]. Answers to such questions could
have high-level policy impacts on healthcare systems, potentially affecting both their efficiency and effi-
cacy. PAT may also contain data of immediate medical value. Prior work has leveraged PAT to identify
disease trends [33, 41] and adverse drug events [257]. Through active collaboration, OHC participants
have uncovered novel insights into disease co-morbidities (such as a correlation between asthma and
infertility [40]) and drug-treatment effects (such as the questionable efficacy of lithium as a treatment for
ALS [260]) which have been replicated in subsequent medical trials [97,260]. Finally, medically-relevant
data derived from PAT could be used both to enhance community design and to support members in
tasks that they already perform, such as polling treatment popularity or sourcing drug reviews.
In spite of the inherent value in PAT and the enormous number of human-intelligence hours invested
in its creation, attempts to leverage PAT have been limited for three main reasons. First, PAT is notoriously
noisy and often incomplete, making it challenging to work with. For example, the fact that authors may
have only partial mastery over medical terminology casts the accuracy of their symptom descriptions
into doubt. Moreover, they may omit important information and their contributions may be infrequent and
irregular.
1 http://www.patientslikeme.com
2 http://www.curetogether.com
Second, and closely related, is the dearth of methods, approaches and toolkits for extracting medically-
meaningful data from PAT. Take, for example, the basic problem of identifying medically-relevant terms
in PAT. While well-established toolkits for extracting medical terms from text authored by medical experts
exist, as we show in Chapter 5 their performance on PAT is sufficiently poor that the resulting output is
of dubious analytic value.
Third, the question of whether PAT contains data of medical relevance is contentious. As we discuss
in detail in Chapter 2, medical professionals especially take issue with such claims. Even taking an open-
minded perspective, however, the medical relevance of PAT in relation to a specific research question is
usually unclear, and must be determined empirically. This relevance tends to depend on how well the
research question aligns with users’ motivations for authoring PAT. For example, because people mention
their influenza-like symptoms on social media platforms, Twitter is a viable data source for monitoring
influenza outbreaks [10, 15, 62, 213]. However, Twitter would be a poor data source for comparing drug
dosage efficacy, because people do not consistently tweet drug dosages, schedules, and self-reported
wellness metrics. Determining what medically relevant signals are present in PAT is a challenge separate
from extracting them.
Our goals in this work are twofold: first, to develop methods for extracting a variety of medically-relevant
data from PAT; second, to uncover medically-meaningful insights through the application of
these methods. To this end, we focus specifically on the topic of addiction, studying Forum77:
MedHelp’s 3 online health community for Addiction & Substance Abuse. Addiction is both highly prevalent,
affecting 16% of Americans ages 12 or older (about 40 million people), which far exceeds the num-
ber of people afflicted with heart disease (27 million), diabetes (26 million), or cancer (19 million) [4],
and highly stigmatized, even within the medical profession [176, 187]. These facts conspire to make
addiction-related PAT a rich source for novel and impactful insights.
Our work draws from and contributes to several fields in Computer Science. From the Human-Computer
Interaction perspective, we investigate crowdsourcing as a method for large-scale data annotation,
and leverage methodological work on thematic analyses to develop taxonomies of medically relevant
information contained in our PAT data sources. From the Computer-Supported Cooperative Work
perspective, we investigate the types of support that users give and receive, and analyze on-site behavioral
and content features that correlate with successful and unsuccessful participatory outcomes in Forum77.
On the Natural Language Processing side, we evaluate the application and extension of existing statistical
classification methods to a variety of PAT information extraction tasks. Finally, to help ensure the validity
of our work from a medical perspective, we collaborated closely with an addiction specialist: a practicing
psychiatrist who specializes in substance use disorders.
3 http://www.medhelp.org
1.2 Contributions
In concert, this thesis contributes a viable, multi-stage approach for finding and extracting data of medi-
cal relevance from PAT. The specific contributions of this thesis are:
Targeted literature reviews that serve both to illuminate the landscape of related work and to
contextualize our own work. In particular, we review:
Online health seeking behavior: via a cross-disciplinary literature review, we first synthesize an
overview of the demographics, methods and motives of people who seek health information online.
Next, we narrow our focus to the specific topic of OHC participation, exploring users’ reasons for
participation as well as whether and how such participation is beneficial (Chapter 2).
Prior work analyzing patient authored text: we conduct an extensive review of literature utilizing
PAT as a primary data source, including work on pharmacovigilance, syndromic surveillance, entity
extraction and thematic analyses (Chapter 3). To our knowledge, this review is the first compre-
hensive synthesis and summary of data sources, methods, goals and outcomes of prior work that
utilizes PAT as a primary data source.
Methods for extracting medically-relevant data from PAT. Our characteristic methodology, illustrated
in Figure 1.1, moves through human categorization and labeling of data to automatic extraction and
analysis. Accordingly, our methods comprise multiple stages, including inductive content analysis, data
annotation, feature engineering, classifier training and result analysis. Our specific contributions are:
A method for crowdsourcing medically-relevant term annotation in PAT. Having medical ex-
perts annotate data is both costly and slow. We show that for the task of identifying medically-
relevant terms in PAT, a crowd of non-experts yields annotations comparable in quality to those
submitted by medical professionals (Chapter 5).
Data-driven annotation rubrics describing what users seek when they initiate posts on Forum77
(Chapter 6), as well as the phases of addiction that users exhibit on Forum77 (Chapter 8). These
rubrics, educed via thematic analyses of Forum77 content, serve as novel contributions in their
own right as well as reusable guides for data annotation.
A novel analysis of behavioral and linguistic features that correlate with each phase of ad-
diction. The results of this feature space analysis (Chapter 8) give novel insight into how the
psychologically and physiologically distinct phases of addiction correspond with Forum77 users’
behavior and linguistic usage. They are also a valuable resource for feature design and engineer-
ing.
Trained classifiers that accurately extract medically-relevant data from PAT. We train classi-
fiers that accurately extract medically-relevant terms (Chapter 5), addictive drugs of choice (Chap-
ter 7), phase of addiction at the time of writing (Chapter 8) and the type of support that a user is
seeking when she initiates a thread (Chapter 6) from PAT. These classifiers are novel in function.
We make them freely available to support future work and comparisons in this area.
[Figure 1.1: process diagram. Labels recoverable from the figure: PAT, close reading, Content Schema, annotation, Labeled Data (human), Features, training, Classifier, tuning, Labeled Data (auto), processing & analysis, Processed Data, Insights, schema revision, application, Medical Discovery, PAT interface design, Future Work.]
Figure 1.1: Our general methodological process. Nodes in grey show avenues for future work supported by our contributions.
Medically-relevant insights on Addiction. Our classification methods allow us to scale our analyses
to the entire Forum77 population. Some of the resulting insights are, to the best of our knowledge, novel
to both the Computer Science and the Addiction literature. These insights include the discovery that:
Users actively collaborate on developing highly effective medication-assisted withdrawal
treatment protocols. The most prevalent example of this is Thomas’ Recipe, a detailed protocol
for medication-assisted opiate withdrawal that has evolved on Forum77 over the course of several
years (§ 6.8.1).
The Forum77 population consists almost entirely of people struggling with prescription
opioid abuse, making it strongly distinct from traditionally surveyed drug-using populations. Our
results suggest that such populations are not well covered by existing medical research methods.
While relapse is common, chances of a user leaving Forum77 in the state of RECOVERING
are favorable. Although different methodological approaches make comparison with real-world
treatments difficult, our results suggest that Forum77 is an effective self-detoxification resource.
Active participants are more likely to leave Forum77 in a state of RECOVERING. Such users
participate significantly more frequently than those who leave in a state of ¬ RECOVERING, even
when they are USING and WITHDRAWING. This resonates with prior research that shows that
increased participation in the traditional mutual help group Alcoholics Anonymous correlates with
sustained sobriety [190,223].
1.3 Outline of Thesis
Chapters 2-4 serve to contextualize our work and give the reader a framework for reference and eval-
uation. Chapter 2 presents a targeted literature review of online health information seeking. We begin
with a broad overview of online health information seeking (§ 2.1) before focusing on the question of who
participates in OHCs, their motivations for doing so, and the associated benefits and pitfalls of partici-
pation (§ 2.2). Chapter 3 begins with a definition of PAT accompanied by a discussion of its values and
the challenges that it presents (§ 3.1). Next, we synthesize prior work that utilizes PAT as a primary data
source, including syndromic surveillance (§ 3.2), pharmacovigilance (§ 3.3), Named Entity Recognition
(§ 3.4) and Thematic Analyses (§ 3.5). Chapter 4 describes the data sets that we use in our work: the
MedHelp corpus (§ 4.1), which includes the Forum77 data set, and the CureTogether corpus (§ 4.2).
While PAT contains a wealth of information, it is inherently noisy, and requires text mining techniques
to extract data of value. In Chapter 5, we address one of the most basic problems of this sort: identifying
medically-relevant terms in PAT. After discussing related work (§ 5.2) and data preparation (§ 5.3), we
explore the feasibility of replacing experts with non-expert crowds in medical term annotation tasks
(§ 5.4). Next, we show that a conditional random field (CRF) model trained on crowd-labeled data
dramatically outperforms state of the art medical term annotation tools (§ 5.5). Finally, we demonstrate
the effectiveness of our approach through applying our classifier to large PAT corpora (§ 5.6). While
our results demonstrate the efficacy of our approach, we find that the extracted data are too broad for
deriving insights on specific medical conditions. We therefore narrow our focus to the topic of addiction, one of the
most urgent public health issues of the day.
Understanding why people participate in Forum77 is a precursor to more targeted analyses. Chapter 6
poses the question, “What do people seek on Forum77?” We first motivate studying the topic of
addiction (§ 6.1), before discussing related work (§ 6.2) and data preparation (§ 6.3). Next, we present the
process and result of a thematic analysis of users’ motivations for initiating Forum77 discussions (§ 6.5).
Congruent with prior work, driving motivations are the seeking of informational and emotional support.
In terms of informational support, we find that users primarily seek explicit medical advice on prescrip-
tion opioids. In the emotional support category, the update post, in which users log their progress but
request no feedback, is highly prevalent. We train machine learning classifiers to distinguish emotional
from informational support-seeking (§ 6.6), as well as update from non-update posts (§ 6.7). Finally, we
present and discuss the results of applying our classifiers to the entire Forum77 data set (§ 6.8 & § 6.9).
Chapter 7 establishes whether the Forum77 population is similar to traditionally surveyed drug-using
populations in terms of drugs of choice (DOCs). We first discuss related work (§ 7.1) as well as our
data preparation and sampling (§ 7.2). Next, we present our method for automatically extracting users’
DOCs from Forum77 initiating posts (§ 7.3), which comprises data annotation, classifier training and term
resolution. We then detail how we compare our classifier-derived Forum77 DOC distribution with those
from three traditionally-surveyed drug-using populations (§ 7.4). Among other things, our results (§ 7.5)
indicate that Forum77 is used primarily by people struggling with prescription opioid use disorders, rather
than by people using traditionally-abused substances such as alcohol, cocaine and marijuana.
Finally, we discuss the implications and opportunities revealed by these results (§ 7.6).
Chapter 8 focuses on the topic of the cycle of abuse, a well-known concept whose stages and
transitions, to the best of our knowledge, have never been quantified. Drawing on the addiction literature,
we first describe the phases of drug abuse and define key terminology (§ 8.2), and then describe our data
preparation and sampling (§ 8.3). Next, building on the well-known Transtheoretical Model of Behavior
Change [203], we develop a taxonomy describing the phases of addiction as they are expressed on
Forum77 (§ 8.4). We then analyze a variety of behavioral and content-based features in order to identify
features that discriminate between the phases USING, WITHDRAWING and RECOVERING (§ 8.5). Next,
we present our statistical classifier for identifying addiction phase (§ 8.6), and discuss how this enables
us to identify important sequences in the process of addiction, such as relapse and recovery (§ 8.7).
Aggregating these events across the entire Forum77 membership base indicates, amongst other results,
that although relapse is common, reaching a state of RECOVERING prior to leaving the forum is likely
(§ 8.7.3).
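To make the idea of aggregating phase sequences concrete, the sketch below computes normalized transition frequencies and conditional transition probabilities from per-user sequences of phase labels, and flags relapse. It is an illustration only: the function names, the toy sequences, and the definition of relapse as any WITHDRAWING or RECOVERING post immediately followed by a USING post are assumptions, not the procedure described in Chapter 8.

```python
from collections import Counter

def transition_stats(sequences):
    """Return (normalized, conditional) transition tables.

    normalized[(a, b)]  = share of all observed transitions that are a -> b
    conditional[(a, b)] = share of transitions leaving a that go to b
    """
    counts = Counter()
    for seq in sequences:
        # Count each pair of consecutive phase labels within one user's sequence.
        counts.update(zip(seq, seq[1:]))
    total = sum(counts.values())
    outgoing = Counter()
    for (src, _dst), n in counts.items():
        outgoing[src] += n
    normalized = {pair: n / total for pair, n in counts.items()}
    conditional = {pair: n / outgoing[pair[0]] for pair, n in counts.items()}
    return normalized, conditional

def has_relapse(seq):
    """Assumed definition: a non-USING phase immediately followed by USING."""
    return any(
        prev in ("WITHDRAWING", "RECOVERING") and cur == "USING"
        for prev, cur in zip(seq, seq[1:])
    )

# Hypothetical label sequences for two users, one label per post.
example = [
    ["USING", "WITHDRAWING", "RECOVERING"],
    ["USING", "USING", "WITHDRAWING", "USING"],
]
normalized, conditional = transition_stats(example)
relapse_rate = sum(has_relapse(s) for s in example) / len(example)
```

The two tables answer different questions: the normalized table describes how common a transition is overall, while the conditional table describes how likely it is given the phase a user is currently in.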
In Chapter 9, we reiterate the main contributions of this thesis (§ 9.1), outline challenges for
future work (§ 9.2), and offer our concluding thoughts (§ 9.3).
Chapter 2
The Internet and Health
Millions of people around the world seek health information online, and have been doing so since the
earliest days of the Internet [166]. But who are these people, and what do they seek? Our goal in this
chapter is to provide readers with a contextual backdrop against which to interpret our work. Drawing
on prior work from Computer Science, Medical Informatics and Medicine, we first describe online health
information seeking in general (§ 2.1), beginning with an historical overview before investigating what
kinds of information people seek, who seeks this information, and where. Next, we focus on a specific
subset of online health information seeking: online health community (OHC) participation (§ 2.2). We pay
particular attention to who participates, their motivations for doing so, and potential benefits associated
with participation. Finally, we summarize our findings (§ 2.3) before moving on to a literature review of
prior work utilizing PAT as a primary data source (Chapter 3).
2.1 Online Health Information Seeking
2.1.1 Historical Overview & Current Landscape
When the Internet was commercialized in 1995 [120], widespread consumer adoption brought with it
widespread supply of and demand for health information [49]. The Internet made health information more
accessible. An example illustrates: between 1997 and 1998 the National Library of Medicine (NLM) made
Medline1, a repository of journal citations and abstracts from the biomedical literature previously only
available to medical professionals, publicly accessible online. The number of queries to Medline
increased roughly seventeenfold, from 7 million to 120 million, with more than 30% of new queries stemming
from consumers [49]. In response, the NLM launched MedlinePlus2, a site hosting information targeted
specifically at patients and their families [49]. The move was a roaring success: in the first quarter of
1999, MedlinePlus had 62,638 unique visitors. Since then, this statistic has only increased: in the third
quarter of 2013, the site had ∼81,000,000 unique visitors [172].
1 https://www.nlm.nih.gov/bsd/pmresources.html
2 http://www.nlm.nih.gov/medlineplus
In addition to making health information more accessible to consumers, the Internet also broadened
the scope of potential contributors: for the first time, health information could be easily sourced from and
exchanged between patients. Widespread, patient-driven mutual help efforts unfolded simultaneously
with the commercial web. As early as 1997, Salem et al. [215] published an analysis of an online mutual
help group for depression; their study covered 2 weeks’ worth of data and comprised 533 participants.
Even earlier, in 1996 Mayer and Till [166] published a short, interview-based study of a breast cancer
listserv allegedly utilized by thousands of patients. Today, a full 8% of Internet users in the U.S. report
either sharing a personal health experience or posting a related question online [91].
The revolution in how health information was created and shared was received primarily positively by
consumers and sociologists, who celebrated its potential for “democratizing” healthcare and rebalancing
the power dynamic in doctor-patient relationships [182]. The reaction from the medical community was
substantially more turbulent. Early research on online health information seeking raised concerns about
the quality of the information available, as well as patients’ ability to evaluate it critically [49, 156, 181,
182, 199, 210]; some even described the phenomenon as an “epidemic of misinformation” [51]. Indeed,
discussion in the medical literature at the time communicates a strong resistance to the idea of patients
pursuing medical knowledge outside the purview of a medical professional [104, 182]. For example, in
2000 the Journal of Medical Economics initiated a series of articles aimed at educating doctors about
online resources so that they, in turn, could guide their patients through the plethora of available online
health information resources. The first article in the series is titled, “Doctors and the Web: Help your
patients surf the Net safely” [104].
Despite these concerns, analyses of online health seeking behavior indicate that patients are, in
fact, highly skeptical of information presented online and take care to evaluate it critically [21, 105, 156,
178, 182, 205, 209, 225]. Patients tend to mistrust information from websites that appear to be primarily
commercial [92, 182], have unclear sources of information [92], or that seem unprofessional or highly
opinionated [182]. Moreover, rather than taking a single source at face value, patients typically evaluate
information quality by aggregating information from multiple sources [82, 92, 182, 205], and even posing
and testing hypotheses from one information source to the next [225]. That said, online health seekers
are not infallible: cyberchondria – the escalation of a user’s perception of the severity of her medical
state as a result of researching it online – has been documented, and results in increased
stress levels and potentially unnecessary use of available medical resources [254,255].
Measuring the quality of online health information is challenging. Prior work finds that information
accuracy tends to be high [25, 80]. For example, in an independent evaluation of 4,600 posts on The
Breast Cancer Mailing List3, Esquivel et al. [80] found only 10 (0.22%) posts containing misleading or
incorrect information. Of these, 7 were identified as such by participants and corrected within 4.5 hours.
However, the majority of studies from the medical domain conclude that online health information is of
subpar quality [21,83]. A common point of failure cited is whether the information is “complete” (covers all
medically-relevant details). However, the value of the completeness metric has been called into question:
first, including all relevant medical information might comprise information overload for readers [83].
Second, as patients typically synthesize medical information from a variety of sources [82,92,182,205],
they are likely robust to gaps in any single source. Patients themselves report that in general they have no trouble finding the
information that they need online [92,105].
Despite this, strong resistance, and even condescension, from medical professionals is a common
response to the idea of patients pursuing medical knowledge online. “Many of the participants reported
symptoms that they attributed to using a computer keyboard, so it appeared incongruous that they turned
for help to an activity that required more typing”, quip Culver et al. [64] in an evaluation of an online
health community on Carpal Tunnel Syndrome. Yet even amongst surveyed physicians, there is general
agreement that the result of patients pursuing medical information online is rarely harmful, and in fact can
be moderately beneficial [181, 199]. One explanation for the resistance may be that the public dissemination of medical
knowledge, which was previously exclusive and difficult to access, challenges medical professionals’
dominance as medical experts [116]. Indeed, many physicians who feel that online health information
seeking negatively impacts the doctor-patient relationship also feel that their patients are challenging
their authority [11, 181]. Today, almost 20 years later, most research agrees that the nature of the
patient-doctor-internet relationship remains in flux, with resistance from the medical field barring potential
synergies from reaching fruition [14,121].
3 http://www.bclist.org
2.1.2 What Health Information Do Users Seek Online?
Despite the concerns echoed in the medical literature, patients seem disinclined to stage a cyber coup
d’état against the medical profession. In fact, with the exception of teens [105,209], patients rarely consider
the Internet their primary or most important source of medical information [82, 165, 200]. Rather,
information acquired online tends to supplement or complement that acquired through traditional channels
[82, 149, 209], and is often sought for the express purpose of discussing it with a medical practitioner
[49, 92, 181, 205, 225]. Moreover, patients have clear preferences about which types of information they
obtain from professionals and which from peers: respondents to Pew’s 2010 Peer-to-peer Healthcare Survey [90] said
that they would prefer to communicate with medical professionals for information regarding prescription
drugs and alternative treatments, an accurate diagnosis, and recommendations for other medical profes-
sionals and medical facilities. Peers and professionals were rated as equally helpful for practical advice
for day-to-day coping, and peers were rated most helpful for emotional support and quick remedies for
non-urgent, everyday health issues.
Major categories of online information sought by patients include finding disease-specific information
[49, 91], finding information about particular medical treatments or procedures [91], and attempting
to diagnose or treat a new condition [49, 91]. In fact, Pew’s 2013 Health Online survey found that 35%
of American adults tried to diagnose a condition using information found online; of these, roughly half
followed up with a medical professional [91]. Cartright et al. [42], who analyzed user search logs sur-
rounding self-diagnosis attempts, observed two patterns: evidence-based searching, in which users
searched for a condition that matched a set of symptoms and risk factors, and hypothesis-based searching,
in which, given a specific condition, users searched for symptoms and risk factors associated with that
condition. Minor categories of health information sought online include information about health
insurance, food and drug safety recalls, and weight loss [91]; help interpreting medical test results [91];
and reviews of medical professionals or medical facilities [49]. Finally, an estimated 16-18% of
online health seekers go online specifically to find others who share their health concerns [90,91].
2.1.3 Who Seeks Health Information Online?
Early proponents of the Internet as a health information resource touted its potential as a liberating
technology for those with limited access to traditional health resources [182]. In some ways this is
true: online health information seeking seems to be need-driven, with those suffering from chronic or
stigmatized conditions more likely to seek health information online. However, survey-based research
also points to a strong “digital divide”: whether a person has access to, and is comfortable using,
the Internet is a major determinant of whether they search for health information online. We discuss
discriminating features in detail below.
Gender
Women are more likely to seek health information in general [57], and this trend is mirrored online [37,57,
90–92, 165] despite the fact that men and women have equal access to the Internet [91]. Pew’s Health
Online [91] survey in 2013 estimated that while 53% of all U.S. male adults look for health information
online, the corresponding statistic for U.S. female adults is 64%. Extrapolating from the 2013 U.S.
Census results [9], approximately 55% of online health seekers are female.
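The extrapolation above can be reproduced with a short calculation. The following is a minimal sketch rather than the exact computation used here: the 53% and 64% rates come from the Pew survey cited above, while the female share of the adult population (`female_pop_share`, set to 0.51) is an assumed round figure, not the Census value itself.

```python
def female_share_of_seekers(male_rate, female_rate, female_pop_share):
    """Return the fraction of online health seekers who are female.

    male_rate, female_rate: fraction of each gender that seeks health
        information online (from the survey).
    female_pop_share: fraction of the adult population that is female
        (an assumption in this sketch).
    """
    female_seekers = female_rate * female_pop_share
    male_seekers = male_rate * (1.0 - female_pop_share)
    return female_seekers / (female_seekers + male_seekers)


# Survey rates from the text; 0.51 is an assumed population share.
share = female_share_of_seekers(male_rate=0.53, female_rate=0.64,
                                female_pop_share=0.51)
print(f"Estimated female share of online health seekers: {share:.1%}")
```

Under these assumptions the estimate lands in the mid-50% range, in line with the figure quoted above; the result is only mildly sensitive to the assumed population share.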
In a survey exploring online health information seeking in 2000, Fox & Rainie [205] describe several
differences between men and women’s health seeking behavior. First, while both men and women are
equally likely to search for information in relation to a parent or older relative, women are twice as likely to
search for information on behalf of a child. This is likely a residual of the fact that women spend more time
on child care [192]. Finally, women are more likely to search for information related to specific conditions
(either physical or mental), while men are more likely to search for information related to sensitive topics
and for information on treatment timelines and administration [205].
Age
Studies measuring the age distribution across online health information seekers report that it is relatively
uniform among adults until the age of 65, at which point it declines [37, 91, 165]. This is contrary to the
fact that health needs generally increase with age, and stands in contrast to the age distribution over
offline health information seekers, who tend to be older (mean age 40 vs. 52) [57]. Both Cotten et
al. [57] and Bundorf et al. [37] hypothesize that this discrepancy is due to the fact that younger people
have more access to and experience with using the Internet. In fact, health information seeking is one
of the most common and important online activities for young people [105, 209]. A random-dial survey
of 1,209 respondents aged 15-24 initiated in 2002 by healthcare provider Kaiser [209] found that 75% of
respondents had looked for health information online: more than had downloaded music (72%), played
games (72%), shopped online (50%) and participated in chat rooms (67%). In fact, many young people
consider the Internet to be their primary source of health information [105].
Health
People suffering from chronic conditions (e.g., asthma, diabetes) [37,90] and people suffering from
stigmatized conditions (e.g., anxiety, herpes, addiction) [24, 67] are highly likely to seek health informa-
tion online. A casual inspection of our own MedHelp data set (described in Chapter 4) corroborates
this: 8 of the top 20 forums focus on stigmatized or otherwise embarrassing conditions including addic-
tion, Hepatitis C, STDs and HIV (see Table 4.1). Other health characteristics that correlate with online
health information seeking include experiencing a medical crisis within the past year [90], experiencing
a significant change in physical health (e.g., weight loss/gain, smoking cessation) [90], having a rare
condition [90], and having significant barriers to health care (e.g., expense, travel distance) [37].
This suggests that online health seeking behavior is need-driven; however, other evidence also points
to a digital divide: people are more likely to seek health information online if they have health insur-
ance [91] and a regular healthcare provider [165]. Finally, online health seekers self-report as being
healthier than their offline counterparts [57].
Race
Pew’s 2013 Health Online survey [91] reports that 83% of Caucasian adults go online: significantly more
than adult African Americans (74%) and Latinos (73%). Therefore, at a population level, significantly
more Caucasians search for health information online. In a study of online health information seeking
in youth, Rideout et al. [209] observe the same phenomenon, noting that fewer African American and
Hispanic youth in their survey had Internet access at home.
Controlling for adults who use the Internet shows no significant differences in ethnicity between those
who search for health information online and those who do not. In addition, Cotten et al. [57] find
no significant differences in ethnicity between online and offline health seekers. However, Pew’s 2013
Health Online survey [91] highlights some statistically significant, ethnicity-based differences in what
kind of information people seek. For example, Caucasians are more likely than African Americans and
Latinos to look online for a diagnosis and for information pertaining to a specific disease/condition, and
are less likely to search for information on weight loss. African Americans are more likely to conduct
online research on a drug seen in advertising, while Latinos are more likely to search for information on
pregnancy.
Socio-Economic Status & Education
Online health seekers tend to have higher income levels than those who do not seek health information
online [57,74,165]. In addition, higher levels of education correlate with online health seeking [57,74,91,
165]. This again suggests a digital divide, with those who have ready Internet access being more likely
to use it as a health information resource. However, other work points out that literacy and language barriers
can prevent people from engaging fully with online health resources [25,49].
Role (Patient vs. Caregiver)
Queries conducted on behalf of someone else (e.g., a child, a parent or other older relative, or a
friend) comprise roughly 50% of all online health inquiries [90, 91]. Usually such “caregivers” are ei-
ther women [205] or parents [91] (or both).
2.1.4 Where Do People Find Health Information Online?
There are myriad ways of accessing health information online. We highlight those most often discussed
in related work.
Search Engines. The majority of online health information quests start at a search engine such as
Google⁴, Yahoo⁵ or Bing⁶ [82, 91, 114, 178]. Users iteratively refine their queries based on search
results [82,114], and in the majority of cases are successful in finding the information that they are looking
for [92,114].
Medical Information Portals. Sites such as WebMD⁷ and MedlinePlus⁸ serve as medical information
portals and are heavily utilized [172]. However, it is rare for online health seekers to have a favorite or
“go-to” information portal [92], and they are rarely the starting point of a user’s search [82].
Online Health Communities. Online health communities (OHCs) provide an interactive environment
in which users can seek others familiar with their health concerns and acquire tailored information.
These groups provide social support, information and shared experiences, and can be empowering
⁴ http://www.google.com
⁵ http://www.yahoo.com
⁶ http://www.bing.com
⁷ http://www.webmd.com
⁸ http://www.nlm.nih.gov/medlineplus
for patients [49]. Prior work indicates that a significant proportion of online health seekers ultimately
participate in an OHC, with estimates ranging from 8% [91] to 16% [90] to 25% [49]. We discuss OHC
participation in depth in the next section.
2.2 Online Health Community Participation
Having outlined the landscape of online health information seeking in general, we now turn to the spe-
cific topic of online health community participation. Where possible, we expand on any relevant details
introduced in § 2.1. We briefly discuss modes of participation (§ 2.2.1), before addressing the question
of who participates in OHCs (§ 2.2.2), why (§ 2.2.3), and what measurable benefits may result from their
participation (§ 2.2.4).
2.2.1 Modes of Participation
OHCs typically comprise environments in which users communicate via posted messages. There are
three primary forms of participation on an OHC: users start new discussions by contributing initiating
posts, and respond to existing discussions with response posts. The third, much overlooked, mode of
participation is lurking, in which users read community-generated content, but never contribute or make
their presence known in any way. Lurking is prevalent in all kinds of online communities [185, 202],
although possibly less so in health-oriented OHCs [186]. Prior work suggests that lurkers’ demographics
and motivations for participating align closely with those of active OHC participants [202]. Moreover,
lurkers and active members derive the same benefits from OHC participation [246]. As defining and
measuring lurking behavior is challenging, we do not discuss it further in our own work, but note here
that capturing lurking behavior is an important avenue for future work.
2.2.2 Who Participates in OHCs?
Demographic analyses of OHC participants similar to those offered in § 2.1.3 are scarce. Unlike the
problem of general health information seeking, OHCs focus on specific medical conditions, many of
which correlate with particular demographic factors. For example, people suffering from breast cancer
tend to be female, and people suffering from Alzheimer’s tend to be older.
However, in concert with research on online health seeking behavior [24], Davison et al. [67] find
that the social factors that predict face-to-face support group seeking correlate with those that predict
online support group seeking. Specifically, conditions that are embarrassing, stigmatized, or disfiguring,
as well as conditions in which a patient’s attitude towards the condition is important in treatment outcome,
lead people to seek the support of others with similar conditions online.
2.2.3 Reasons for Participation
A user’s overarching goal in joining an OHC is to align herself with other people who share her health
concerns [90, 96, 259]. A great deal of literature examines patients’ perceived benefits to OHC partici-
pation. Results tend to fall into one of three categories: (1) medium-based affordances, in which users
cite practical advantages related to the fact that OHCs are online, digital resources; (2) informational
support; and (3) emotional support. We discuss each of these in detail below.
Medium-Based Affordances
By nature of being online and digital, OHCs have several unique characteristics that users view as
advantageous, such as the convenience of having the community be available around the clock [49,60,
162, 205, 275]. Other factors cited include providing access to a wide range of people, information and
experiences [162, 205]; the fact that such information is personalized or tailored [49]; the ability to store
and edit personal narratives [117,162]; and the perception of privacy and anonymity on OHCs [49,105,
205, 270, 275]. Users’ ability to conceal their true identities has also been credited with increasing their
propensity to discuss issues that they would not discuss face-to-face [21,105,149]. Finally, OHC content
is easily searchable, making it easier for patients to browse and filter for suitable people to approach for
help. In an analysis of PatientsLikeMe, Frost et al. [96] conclude that searching for similar users is the
primary motivation behind patients’ sharing their data with each other.
Informational Support
The two most cited benefits of OHC participation are the informational and emotional (sometimes called
“social”) support given by the community [36, 47, 86, 122, 131, 148, 149, 162, 211, 243, 250, 258]. Infor-
mational support constitutes the exchange of clinical as well as experiential knowledge relevant to a
particular condition. Typical topics of discussion include treatments and treatment options [47, 96, 258],
symptoms [96,258], preventive care [47] and condition outcomes [47,96]. Patients seek this information
for several reasons, including learning what to expect in the future and how to plan for it [47], informing
decision making (especially related to treatment options) [47, 122], informing day-to-day care/everyday
illness management (coping strategies) [60, 90, 122, 131], advice on managing interactions with others
(e.g., from healthcare professionals to colleagues to family) [122], and often for simply acquiring a better
understanding of their condition [47, 122, 149, 258]. As such, OHCs are often a source of information
distinct from and complementary to that typically acquired via medical practitioners.
Emotional Support
In addition to being valuable sources of personalized informational support, OHCs provide users with
an accepting and safe space to vent emotions or discuss uncomfortable topics [149, 243]. Participation
provides users with a means of articulating and making sense of their experience, which they find em-
powering [131,173]. Patients also receive positive affect, encouragement and sympathy from their fellow
community members [60, 131]. Continued participation over time may result in patients taking on new,
supportive roles [164] as well as developing increased optimism towards their situation [211]. OHCs also
provide patients managing serious conditions with unique types of emotional support that are difficult to
acquire elsewhere. For example, patients find that sharing with people like them partially relieves the
burden of care placed on family members who, despite their best intentions, cannot empathize with the
patient’s experience [162, 243]. In addition, patients find that while family and friends tend to try to nor-
malize their (the patient’s) emotions – even when they are inappropriate – online communities challenge
users on inappropriate emotional behavior [243].
2.2.4 Efficacy of Online Health Forums
While patients perceive many benefits to participating in OHCs, measuring the effect of participation on
their health outcomes is difficult, and raises the question of what metrics really matter in health manage-
ment. Would we consider OHC participation effective if it altered disease outcome, or shortened time to
recovery? What if it imparted a sense of control and wellbeing to patients, improving quality of life,
even if it had no effect on prognosis? Although OHC efficacy is difficult to define, participation has been
shown to promote effective disease management strategies [93,131,148,211], and impart psychosocial
benefits, such as improved ability to cope [148,150,179], improved mood/decreased distress [158,211],
and improved stress management [211]. Moreover, some studies report measurable beneficial effects
on symptoms. Houston et al. [130] found that increased participation in a depression-oriented OHC
correlated with the likelihood of users experiencing a resolution in their condition. Lieberman et al. [158] found
that cancer patients who participated in OHCs reported a decrease in physical pain. However, they note
that it is impossible to tell whether this was due to emotional suppression on the part of their subjects: a
conundrum afflicting the measurement of any subjective symptom.
In general, then, research points to OHC participation having beneficial effects for patients. However,
the jury is still out when it comes to conditions in which negative behaviors are enabled through social in-
teraction with similar patients [21]. While some research finds that OHC participation provides increased
protection and motivation for continuing these behaviors, others conclude that the overall experience may
be a more positive way of dealing with the condition than traditional methods [21,89,179]. For example,
Wilson et al. [261] found that patients learned new binging and purging techniques on both pro-eating
disorder sites⁹ and pro-recovery sites. However, while they found no significant difference in final health
outcomes between the two groups, users of pro-eating disorder sites experienced a significantly longer
illness duration [261]. On the other hand, group bonds forged through shared secret identity may render
participants less likely to reveal their condition to others, potentially increasing the likelihood that they
will not seek appropriate help [98].
2.3 Summary
Our goal in this chapter was to provide a general overview of the landscape of online health informa-
tion seeking. Beginning with an historical overview (§ 2.1.1), we noted that the advent of the Internet
both made health information more accessible, and made it possible for anybody to contribute health
information online. From the patient’s perspective, this was a largely positive development, and a great deal
of research supports the notion that little harm, other than cyberchondria, arises from online health in-
formation seeking. The medical community, however, remains somewhat opposed to people pursuing
health information outside of the purview of medical professionals.
In general, online health seekers search for information on specific diseases and diagnoses (§ 2.1.2).
This behavior appears to be partially need-driven, with people suffering from chronic or stigmatized
conditions more likely to seek help online. It is also partially driven by a digital divide, in which those
with ready Internet access and technical skills (i.e., younger, wealthier, and more educated people) are
more likely to seek health information online. One exception to the digital divide pattern is gender: 55%
of online health seekers are female (§ 2.1.3).
⁹ Sites that promote eating disorders.
While medical information portals such as WebMD and MedlinePlus are heavily utilized, most health
information quests begin with search engines. A significant proportion (8-25%) of online health informa-
tion seekers eventually participate in an OHC (§ 2.1.4).
The primary reason for participating in an OHC is to find others who share the same health concerns.
While we know that people with stigmatized, or otherwise embarrassing, medical conditions are more
likely to participate in OHCs, we know little else about participant demographics, which are rarely studied.
Given the demographic specificity of many medical conditions (e.g., breast cancer patients are predominantly female),
it is likely that such demographics vary widely across conditions (§ 2.2.2).
Users perceive several benefits to participating in OHCs, which we can categorize into: medium-
based affordances – unique and valuable characteristics that OHCs have by nature of being an online,
digital resource; informational support benefits; and emotional support benefits (§ 2.2.3). While acquir-
ing an objective assessment of an OHC’s efficacy is challenging, participation does appear to impart
psychosocial benefits to users, and may play a role in measurably reducing certain symptoms. However,
the answer to whether OHC participation benefits those afflicted with conditions that are stimulated
by social contact with similar patients, such as eating disorders, is less clear (§ 2.2.4).
Chapter 3
Prior Work on Patient Authored Text
A great deal of prior work utilizes patient authored text (PAT) as a primary data source. Despite this,
to our knowledge no organized review of data sources, methods, goals and outcomes of such work
exists. Our goal in this chapter is to motivate the utility of PAT as a data source and provide a structured
framework over relevant prior work. We first scope our definition of PAT, and discuss its latent value as a
data source as well as the challenges it poses for analysis (§ 3.1). We then review prior work that uses
PAT as a primary data source. This work tends to fall into one of four categories: syndromic surveillance
(§ 3.2), pharmacovigilance (§ 3.3), entity extraction (§ 3.4), and thematic analysis (§ 3.5). Finally, we
summarize our findings (§ 3.6).
3.1 Patient Authored Text (PAT): Introduction & Overview
We define patient authored text (PAT) as any online, medical text authored by someone who is not a
medical professional. A main source of PAT is online health communities (OHCs): online discussion
forums dedicated to specific health topics where people converse in the form of posted messages.
MedHelp¹, PatientsLikeMe² and CureTogether³ are all examples of OHCs. Other sources of PAT include
search logs, social media data (e.g. Twitter⁴ and Facebook⁵), personal blogs (e.g. Lady of Lyme⁶), and
email.
¹ http://www.medhelp.org
² http://www.patientslikeme.com
³ http://www.curetogether.com
⁴ http://www.twitter.com
⁵ http://www.facebook.com
⁶ http://www.ladyoflyme.com
CHAPTER 3. PRIOR WORK ON PATIENT AUTHORED TEXT 22
3.1.1 Value of PAT
In the process of creating PAT, users are documenting medical data, making sense of it, prioritizing it, and
synthesizing it in order to solve problems that are relevant to them. This is time intensive work, performed
by agents who may well make up in motivation for what they lack in medical expertise. The resulting text
is rich in medical information, with users recording medical histories, comparing treatments, detailing
symptoms and reasoning about differential diagnoses. At a minimum, this culminates in a unique record
of patient behavior outside of the clinical environment. In the case of stigmatized or otherwise
embarrassing conditions⁷, PAT may well contain medical data that is rarely captured elsewhere. For example,
someone struggling with substance abuse might detail her self-prescribed treatment schedule for with-
drawal. In concert, then, PAT comprises a valuable and, in many cases, unique medical data set that is
abundant and readily available. However, PAT is also challenging to work with.
3.1.2 Challenges of Working with PAT
PAT is notoriously difficult to work with. We attribute this to three main reasons: its inherent noisiness;
the lack of existing tools for exploring and analyzing it; and the fact that it is often difficult to discern
whether PAT supports any given research question. As we will show in § 3.2–§ 3.5, prior work tends
to compensate for these challenges by either fixing some variables in a quantitative analysis, or by
conducting small-scale, qualitative analyses.
Noisiness
On the text level, PAT is riddled with spelling and grammatical errors. Compared with expert-authored
text, differences include lexical and semantic mismatches [167,272], mismatches in consumers’ and ex-
perts’ understanding of medical concepts [99,272] and mismatches in descriptive richness and length [99,
167,272]. Consider, for example, the text snippets below, both discussing the predictive value of a family
history of breast cancer. The first snippet is from a medical study by De Bock et al. [68]:
In our study, at least 2 cases of female breast cancer in first-degree relatives, or having at
least 1 case of breast cancer in a woman younger than 40 years in a first or second-degree
relative were associated with early onset of breast cancer.
⁷ In Chapter 2 we note that people suffering from stigmatized conditions are more likely to seek help online and to participate in OHCs.
The second (unedited) snippet is from the MedHelp breast cancer community:
im 40yrs old and my mother is a breast cancer surivor. i have had a hard knot about an inch
long . the knot is a little movable. the knot has grew a little over the past year and on the
edge closest to my underarm. i am scared and dnt want to worry my mom ..
Moreover, PAT contributors vary widely in their level of medical expertise, command of medical jar-
gon, and the frequency with which they document their experiences online. Most PAT would be consid-
ered unusable from a medical perspective: symptom descriptions, treatments and medical histories are
incomplete, and basic demographic data is absent.
Lack of Analysis Tools
The dearth of tools and methods for mining PAT is likely exacerbated by its noisiness and inconsistencies.
As we discuss in § 3.4, the handful of medical annotation toolkits that do exist are tailored to process
well-formatted, expert-authored text (e.g. clinical text, journal publications), and perform poorly on PAT.
As a result, exploring PAT corpora is costly, often requiring researchers to build ad hoc tools for
large-scale annotation and extraction. Moreover, as there is no systematic method for exploring the space of
possible approaches to extracting medically useful information from PAT, these ad hoc tools are often
not recyclable.
Applicability to Research Questions
The question of whether or not a PAT corpus supports a given research question is not always obvious,
and depends very much on users’ reasons for authoring the PAT in the first place. Finding a tight
match between a research question and users’ motivations for authoring PAT is crucial for success. For
example, search logs are an appropriate data source for monitoring influenza trends, because users are
motivated to search for their symptoms when they get sick. However, Twitter would be an inappropriate
data source for mining optimal drug dosages, as users tend not to tweet this information en masse.
Determining what data PAT encodes, and how it is encoded, is a costly investment and a separate
challenge from extracting these data.
3.2 Syndromic Surveillance
Syndromic surveillance – also known as early warning, outbreak detection, or biosurveillance – is the
utilization of health-related data for the purpose of detecting, analyzing and monitoring potential disease
outbreaks [128]. Syndromic surveillance systems do not necessarily utilize online data: the first such
systems were developed to give advance notice of bioterrorism attacks – in particular, those related to
anthrax – after 9/11, and utilized data such as pharmacy purchases and emergency room visits [35,127,
128,163,207].
However, building syndromic surveillance systems based on PAT is appealing for a number of rea-
sons. The first is users’ proclivity for seeking health information online. For example, it is fairly common
for users to search online for symptoms that they are experiencing, or for conditions that they believe
they might have [156, 254, 256]. As such, data useful for syndromic surveillance tends to accrue natu-
rally, which is preferable to resource-intensive, manual data collection [128, 262]. In addition, collecting
and analyzing online data is fast, enabling early (or even real-time) detection of outbreaks, which is
not possible using traditional syndromic surveillance systems [41,100,128].
The best known example of a PAT-based syndromic surveillance system is likely Google Flu Trends⁸,
which estimates regional flu activity from aggregated search queries [41]. Google Flu Trends can often
identify flu outbreaks a full 1-2 weeks ahead of the CDC, which bases its reports on laboratory and
clinical data [41]. However, the system is vulnerable to anomalous situations, such as outbreaks of new
influenza strains, or particularly bad influenza seasons [38]. Other challenges to syndromic surveillance
systems based on PAT include their vulnerability to changes in users’ online health seeking behavior [38,
262], making it difficult to estimate false positive and false negative rates [262]. Finally, a successful
syndromic surveillance system requires that a sufficient portion of the population of interest is seeking
health information online, which is not always the case. Below, we outline the chief components of
syndromic surveillance projects.
3.2.1 Condition
Typically, syndromic surveillance systems focus on a single medical condition of interest. To date, the
majority of work on syndromic surveillance focuses on influenza [10,15,55,56,62,63,81,100,132,137,
152,198]. Exceptions include investigating general infectious disease outbreaks [33,52,109,262], Lyme
⁸ http://www.google.org/flutrends/us
Disease [221], and potential foodborne illness outbreaks at restaurants [213]. Syndromic surveillance
techniques have also been used to monitor “non-outbreak” conditions or behaviors. For example, Cooper
et al. [53] use syndromic surveillance techniques to monitor cancer prevalence, while Ayers et al. [18]
use them to track the popularity of electronic nicotine delivery systems (e-cigarettes).
3.2.2 Data Source
People searching for their own symptoms online is a well documented phenomenon [254,257]. Accord-
ingly, search logs are a natural choice for a syndromic surveillance data source, and are successfully
utilized in several instances of prior work [18, 53, 100, 132, 198, 221]. More recently, Twitter has come
to light as another suitable source [10, 15, 62, 63, 152, 213], suggesting that users are prone to men-
tioning when they, or someone around them, falls ill. Rarer data sources include blogs [55, 56], website
access logs [137], and aggregated web data (a combination of search logs, news articles, RSS feeds
etc.) [33, 52]. The latter may be particularly appropriate when trying to survey regions in which the
population of interest has limited education and/or Internet access, such as developing countries.
3.2.3 Filtering
As syndromic surveillance aims to correlate online frequency data with real-world epidemiological trends,
separating signal from noise in the data stream is important. Mentions of a condition do not necessarily
correlate with real-world instances of it [152].
On the simple end of the spectrum is keyword filtering. While common [10, 18, 53, 55, 56, 198, 221],
this approach has several shortcomings. First, relying on a static set of keywords makes the system
susceptible to over-fitting [62], as well as fluctuations in the use of those keywords that are unrelated
to the disease in question [15, 38, 100]. For example, a news story on flu could galvanize a “burst” of
online activity around the topic of flu, even while infection levels in the population remain unchanged.
Finally, although keywords are occasionally picked in a principled and consistent manner (e.g. Ginsberg
et al. [100] pick keywords based on how their frequency fluctuations correlate with regional influenza ac-
tivity), in general selection is arbitrary and prone to human misjudgment. For example, spelling variations
of keywords may be ignored [56].
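To make these shortcomings concrete, consider a minimal sketch of static keyword filtering. The keyword list and the messages are invented for illustration and are not drawn from any of the cited systems.

```python
import re

# Hypothetical static keyword list (an assumption for this sketch).
FLU_KEYWORDS = ["flu", "influenza", "fever"]


def build_pattern(keywords):
    """Compile one case-insensitive, whole-word pattern covering all keywords."""
    alternatives = "|".join(re.escape(k) for k in keywords)
    return re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)


def keyword_filter(messages, keywords):
    """Keep only the messages that mention at least one keyword."""
    pattern = build_pattern(keywords)
    return [m for m in messages if pattern.search(m)]


messages = [
    "Home sick with the flu again",
    "I think I have influensa",         # misspelling: missed by the static list
    "Big news story about flu season",  # matched, yet not an actual infection
    "Great weather today",
]
matched = keyword_filter(messages, FLU_KEYWORDS)
```

In this toy example the filter misses the misspelled message and admits the news-related one, mirroring the two failure modes described above.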
Other work indicates that more nuanced filtering yields higher quality results [62, 152]. One such
approach is to train statistical classifiers to automatically identify whether a datum is relevant or not. Both
Support Vector Machines (SVMs) [15,193] and other simple bag-of-words models [52,62,63] have been
successfully leveraged to identify data that correspond to actual influenza infections. Moreover, Lamb
et al. [152] show that using binary classifiers to acquire even more detailed information (specifically,
whether a tweet is about the author or about someone else; whether a tweet represents an awareness
vs. an instance of flu; and whether a tweet is flu-related or not) greatly improves prediction.
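As an illustration of classifier-based filtering, the sketch below trains a small bag-of-words model to separate reports of infection from mere awareness of flu. It is not the method of any cited work: for brevity it substitutes a multinomial naive Bayes classifier (with Laplace smoothing) for the SVMs mentioned above, and the training messages and label names are hypothetical.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes over bag-of-words features."""

    def fit(self, texts, labels):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter(labels)      # label -> number of documents
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.label_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for tok in tokens:
                # Laplace (add-one) smoothed log likelihood.
                score += math.log((counts[tok] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical labeled messages (assumptions for this sketch).
train_texts = [
    "i have the flu and a fever",
    "stuck in bed with flu and chills",
    "my fever and cough are getting worse",
    "news report on flu vaccine shortage",
    "article about flu season statistics",
    "worried the flu vaccine news is hype",
]
train_labels = ["infection"] * 3 + ["awareness"] * 3
model = NaiveBayes().fit(train_texts, train_labels)
```

A cascade of such binary classifiers, one per distinction (flu-related or not, infection or awareness, self or other), would approximate the layered filtering that Lamb et al. describe.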
3.2.4 Modeling and Prediction
In the case of syndromic surveillance systems that focus on a specific condition (e.g. influenza), linear
models are commonly used to predict trends from the filtered data [10, 62, 63, 100, 152, 198]. Simpler
approaches do not model the filtered data, deeming frequency counts sufficient for reflecting real-world
trends [15,53,55,56,137,221].
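In its simplest form, such a linear model regresses an officially reported rate on the weekly fraction of filtered messages. The sketch below fits an ordinary least-squares line in pure Python; the weekly values are invented for illustration, and the cited systems may use transformed variables or additional predictors.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical weekly data: fraction of messages kept by the filter, and the
# reported percentage of doctor visits for influenza-like illness.
message_fraction = [0.010, 0.015, 0.022, 0.030]
reported_ili = [1.1, 1.6, 2.3, 3.2]

slope, intercept = fit_line(message_fraction, reported_ili)

# Estimate the rate for a new week from its message fraction alone.
estimate = slope * 0.025 + intercept
```

Systems that skip this step in effect treat the raw message fraction itself as the estimate.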
The few syndromic surveillance systems attempting to monitor a range of diseases require the ad-
ditional step of identifying specific diseases and geographic locations [33, 52]. Of note is the approach
used by Paul et al. [193], who use topic modeling over their filtered data to acquire distributions of ail-
ments over time. One key advantage of this approach is its ability to surface new diseases without
manual intervention [193].
3.2.5 Real-World Evaluation Dataset
In order to prove the utility of a syndromic surveillance system, a corresponding real-world metric of the
same phenomenon that the system is trying to measure is required for comparison. In the case of in-
fluenza, the CDC frequently releases timely data on cases of influenza-like illnesses detected through its
traditional surveillance systems⁹. It is likely that the availability of this data set is the driving force behind
the fact that almost all PAT-based syndromic surveillance research focuses on the topic of influenza.
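One common way to carry out such a comparison is to correlate the system's weekly estimates with the reference series. The sketch below computes a Pearson correlation coefficient in pure Python; both series are invented for illustration, and correlation is only one of several evaluation measures that could be used.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical weekly series: system estimates vs. an official reference.
system_estimates = [1.0, 1.4, 2.1, 3.0, 2.2]
reference_rates = [1.1, 1.6, 2.3, 3.2, 2.0]

r = pearson(system_estimates, reference_rates)
```

A coefficient near 1 indicates that the estimates track the reference series closely, although it says nothing about how early the system detects a change.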
3.3 Pharmacovigilance
Pharmacovigilance is concerned with detecting, monitoring and preventing adverse effects related to
pharmaceutical products. Like syndromic surveillance, traditional Pharmacovigilance systems are of-
fline, typically comprising adverse drug event reports contributed by patients, physicians and pharma-
cists, which are collected by the United States Food and Drug Administration10. Many of the appeals
9http://www.cdc.gov/flu/weekly/fluactivitysurv.htm
10http://www.fda.gov
of online syndromic surveillance systems also apply to Pharmacovigilance. However, by construction,
Pharmacovigilance is a more complex problem: whereas syndromic surveillance systems typically
monitor only a single variable (e.g. how many people have the flu), an adverse event involves at
least two elements: a drug and an adverse effect (e.g. unexpected side effects). Extracting such entities
can be challenging. Unlike syndromic surveillance, prior work on Pharmacovigilance addresses a wide
array of topics and conditions. Below, we discuss important components of Pharmacovigilance systems.
3.3.1 Data Source
In order to leverage the advantages of both scale and relevant content, researchers must find a large
source of PAT where patients typically disclose both which drugs they use as well as adverse events they
experience. Online health communities (OHCs) are rich with discussions disclosing users’ medications,
symptoms and current health states (see Chapter 2). Accordingly, almost all work on PAT-based Phar-
macovigilance utilizes OHC communications as a primary data source [23, 45, 154, 171, 183, 265, 266,
268,269]. To our knowledge, the only exception is also arguably the most successful and impactful
work on Pharmacovigilance: White et al. [257] utilize search query logs to discover a novel
adverse drug-drug interaction, which was later confirmed in medical trials.
3.3.2 Identifying Drugs in PAT
Identifying drugs in PAT is challenging: in addition to the many spelling variations of a drug that might
be present in a PAT data set, users may mention several drugs at once, making it difficult to tell which
one is responsible for the adverse event [119]. Accordingly, only a handful of prior Pharmacovigilance
studies attempt to explicitly identify drugs related to adverse events in a data set. Yang et al. [265, 266]
extract drug entities using a lexicon, and Yates et al. [269] train a conditional random field (CRF) model
for this purpose. A more common approach is to pre-select a small number of drugs of interest, filter
the original data set for mentions of these drugs, and then attempt to extract adverse events from these
filtered data [154,171,183,257,268].
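The spelling-variation problem can be illustrated with a small lexicon lookup that tolerates near-misses. This is only a sketch under our own assumptions: the four-entry lexicon, the similarity cutoff of 0.8 and the function name find_drugs are hypothetical, and the fuzzy matching relies on Python's standard difflib module rather than on any technique reported in the cited work.

```python
import difflib
import re

# Toy lexicon; a real one would hold thousands of brand and generic names.
DRUG_LEXICON = ["oxycodone", "percocet", "suboxone", "lyrica"]


def find_drugs(text, lexicon=DRUG_LEXICON, cutoff=0.8):
    """Return (surface form, lexicon entry) pairs for tokens close to a lexicon entry."""
    mentions = []
    for token in re.findall(r"[a-z]+", text.lower()):
        # get_close_matches ranks lexicon entries by similarity ratio
        # and keeps those at or above the cutoff.
        match = difflib.get_close_matches(token, lexicon, n=1, cutoff=cutoff)
        if match:
            mentions.append((token, match[0]))
    return mentions


print(find_drugs("I quit suboxon and percoset last week"))
```

Note that the example post mentions two drugs at once, so even a perfect lookup leaves open the question of which drug is responsible for any adverse event described in the same post.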
Chee et al. [45] take a different approach that is worth noting. Rather than attempting to extract {drug,
adverse event} pairs, they use an ensemble classifier over OHC text to identify drugs that are similar
to “watch list” drugs: drugs that already have adverse effects reported by the FDA11. Unfortunately, this
method gives no insight into why a drug might be worthy of inclusion on such a list.
11http://www.fda.gov/Safety/MedWatch
3.3.3 Identifying Adverse Events in PAT
Unlike the drug involved in an adverse event, the adverse events themselves are rarely fixed: typically
a Pharmacovigilance system will attempt to identify any adverse event related to a particular drug. The
list of extracted events is then ranked in some fashion and given to a human reviewer for analysis. Yang et
al. [265,266] and Lehman et al. [154] identify adverse events in PAT by first compiling lexicons describing
adverse events, and then scoring matches against sliding n-gram windows over PAT sentences.
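The lexicon-and-window idea can be sketched as follows. We assume a Jaccard overlap between the tokens in each window and the tokens of each lexicon entry as the score; the cited systems use their own scoring functions, and the lexicon, threshold and function names here are hypothetical.

```python
import re

# Toy adverse-event lexicon, invented for illustration.
ADVERSE_EVENT_LEXICON = {"cold sweats", "restless legs", "weight gain"}


def score_window(window_tokens, term_tokens):
    """Jaccard overlap between a window of tokens and a lexicon term."""
    a, b = set(window_tokens), set(term_tokens)
    return len(a & b) / len(a | b)


def match_adverse_events(sentence, lexicon, max_window=3, threshold=0.6):
    """Slide windows of 1..max_window tokens over the sentence and keep,
    for each lexicon term, the best-scoring window at or above threshold."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    best = {}
    for n in range(1, max_window + 1):
        for i in range(len(tokens) - n + 1):
            window = tokens[i:i + n]
            for term in lexicon:
                score = score_window(window, term.split())
                if score >= threshold and score > best.get(term, (0.0, ""))[0]:
                    best[term] = (score, " ".join(window))
    return best


hits = match_adverse_events(
    "I had cold sweats and really restless aching legs", ADVERSE_EVENT_LEXICON
)
for term, (score, window) in sorted(hits.items()):
    print(term, round(score, 2), window)
```

A window-based score lets "restless aching legs" match the lexicon entry "restless legs" even though the two words are not adjacent, at the cost of admitting looser matches as the threshold is lowered.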
Yates et al. [269] train a CRF to identify adverse events in PAT. Nikfarjam et al. [183] learn patterns
from text about known adverse drugs; they then apply these patterns to identify new adverse events.
White et al. [257] are the sole exception to extracting an open set of adverse events: rather, they limit
their extraction to a pre-specified set of symptoms related to hyperglycemia. The fact that theirs is
arguably the most successful Pharmacovigilance system to date suggests that this may be a promising
approach.
3.3.4 Evaluation
In general, evaluating the efficacy of Pharmacovigilance systems is difficult: results typically contain
several known indications; the remaining result elements are either false positives, or true positives that
have yet to be detected via traditional reporting mechanisms. Most work serves as a proof of
concept that some adverse drug events manifest in PAT, but there is little quantification of how many and
how strongly different events are represented. Most importantly, determining how to surface the most
relevant true positives remains an area for future work. The work by White et al. [257], which rigorously
demonstrates the existence of the connection between paroxetine, pravastatin and hyperglycemia in
PAT (predating the FDA’s discovery of this), comes closest to proposing a methodology for doing this.
However, their approach lacks flexibility in that both their drugs and adverse events of interest were
predefined.
3.4 Named Entity Recognition
Named entity recognition (NER) is an information extraction task in which the goal is to develop methods
that automatically identify entities of a specific type from text. For example, extracting drugs, adverse
events or symptoms from medical records are all NER tasks. In general, there are two ways to go about
medical NER in PAT: the first is to use state-of-the-art ontology-based tools, which work “straight out of
the box”, but have poor performance on PAT. The second is to use custom statistical classifiers, which
tend to have high accuracy, but require large volumes of labeled data for training and testing. We discuss
each in detail below.
3.4.1 Ontology-Based Tools
Historically, the go-to tools for medical text annotation are MetaMap12 [17] and, more recently, the Open
Biomedical Annotator (OBA)13 [138]. These tools are ontology-based, meaning that they search through
text for matches against underlying ontologies (curated vocabularies of medical terms and the rela-
tionships between them) [17, 138]. While these tools are capable of fine-grained entity resolution, a
previous study [201] comparing OBA and MetaMap against human annotator performance underscores
two sources of performance error on PAT. The first is ontology incompleteness, which results in low re-
call, and the second is inclusion of contextually irrelevant terms. For example, when restricted to the
RxNORM ontology and semantic-type Antibiotic (T195), OBA will extract both “Today” and “Penicillin”
from the sentence “Today I filled my Penicillin rx”. We observe the same limitations in Chapter 5 and in
later collaborative work with Gupta et al. [112].
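The two error sources can be made concrete with a toy dictionary annotator scored against a human annotation. The sketch below does not reproduce how MetaMap or OBA work internally: the three-term vocabulary, the example sentence, the gold labels and the function names are all assumptions chosen to show one missed slang term (lowering recall) and one contextually irrelevant match (lowering precision).

```python
def annotate(tokens, ontology_terms):
    """Return every token that appears verbatim in the vocabulary."""
    return {tok for tok in tokens if tok in ontology_terms}


def precision_recall(predicted, gold):
    """Precision and recall of a predicted term set against a gold term set."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical vocabulary: "cold" is a legitimate medical term in isolation.
TOY_ONTOLOGY = {"penicillin", "cold", "anxiety"}

tokens = "quit cold turkey off subs and the anxiety is awful".split()

# Hypothetical human annotation: "subs" (slang for a drug) is relevant,
# while "cold" in "cold turkey" is not.
gold_terms = {"subs", "anxiety"}

predicted_terms = annotate(tokens, TOY_ONTOLOGY)
precision, recall = precision_recall(predicted_terms, gold_terms)
print(sorted(predicted_terms), precision, recall)
```

Here the missing slang term "subs" corresponds to ontology incompleteness, and the spurious "cold" corresponds to the inclusion of contextually irrelevant terms.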
Despite recent efforts to develop an ontology suitable for PAT (the open-access and collaborative Consumer
Health Vocabulary, OAC CHV [77, 273, 274]), we suspect that tools like MetaMap and OBA will remain
ill-suited to the task of medical term identification in PAT due to structural differences between PAT and
text authored by medical experts that we discuss in § 3.1.2. Finally, in addition to including misspellings
and slang, consumer medical jargon may evolve over time as patients acquire expertise.
3.4.2 Statistical Classifiers
A natural alternative to ontology-based tools is to use statistical classifiers, which can be trained to extract
biomedical entities of interest with high accuracy. However, such methods require sizable corpora of
labeled data for training and evaluation. This is problematic in the medical domain, as having medi-
cal experts annotate text is both expensive and time consuming. Only a handful of publicly available
annotated medical corpora exist, all of them comprising annotated biomedical journal publication
abstracts (i.e. expert authored text) [145, 146, 204, 271]. This has had the dual effect of generat-
ing a plethora of prior work demonstrating the efficacy of statistical-based approaches to biomedical
12http://metamap.nlm.nih.gov
13http://bioportal.bioontology.org/annotator
NER [76, 87, 95, 124, 125, 214, 238, 239, 267], but little work that explicitly examines PAT as a potential
data source.
Our work on ADEPT (Chapter 5) is an exception to this. By showing that crowdsourcing medical term
annotations yields labels comparable in quality to experts’, we were able to use crowd-labeled PAT to
train a conditional random field (CRF) classifier to identify medically-relevant terms in PAT. However, we
also find that crowdsourcing is not always a ready solution to PAT annotation tasks (§ 5.7). In Chapter 7
we show that a CRF trained on a manually-labeled data set similarly extracts users’ drugs of choice (preferred
substances of abuse) from PAT. Later work in collaboration with Gupta et al. [112] shows that
the unsupervised method of lexico-syntactic pattern induction is a promising approach for extracting
specific types of biomedical entities (including symptoms & conditions, as well as drugs & treatments)
from PAT. This approach is also employed by Xu et al. [264], although our method achieves higher
scores. Finally, other work demonstrating entity extraction on PAT includes some of the work discussed
in Pharmacovigilance (§ 3.3), which utilizes CRFs [269] and pattern learning [183] to extract drugs and
adverse events from PAT.
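For readers unfamiliar with sequence labeling, the sketch below shows the kind of input a CRF toolkit typically consumes: one feature dictionary per token, paired with one BIO label per token. The feature set, the label name MED and the helper names are illustrative assumptions, not the features used in ADEPT or in the cited systems, and no CRF is actually trained here.

```python
def token_features(tokens, i):
    """Describe token i using its own form and its immediate neighbors."""
    tok = tokens[i]
    return {
        "lower": tok.lower(),
        "suffix3": tok.lower()[-3:],
        "is_digit": tok.isdigit(),
        "is_title": tok.istitle(),
        "prev": tokens[i - 1].lower() if i > 0 else "<BOS>",
        "next": tokens[i + 1].lower() if i < len(tokens) - 1 else "<EOS>",
    }


def bio_labels(tokens, spans, entity="MED"):
    """Convert (start, end) token spans (end exclusive) into BIO labels."""
    labels = ["O"] * len(tokens)
    for start, end in spans:
        labels[start] = "B-" + entity
        for j in range(start + 1, end):
            labels[j] = "I-" + entity
    return labels


tokens = "the anxiety is the worst along with restless legs".split()
# Hypothetical annotation: "anxiety" and "restless legs" are medically relevant.
annotated_spans = [(1, 2), (7, 9)]

features = [token_features(tokens, i) for i in range(len(tokens))]
labels = bio_labels(tokens, annotated_spans)
for tok, label in zip(tokens, labels):
    print(tok, label)
```

Because each token's features include its neighbors, a sequence model can learn that "legs" is more likely to be part of a medical term when preceded by "restless", which a context-free dictionary lookup cannot express.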
3.5 Thematic Analysis
Thematic analyses (sometimes called content analyses) involve the systematic reading of text with the
goal of eliciting a taxonomy (i.e., an organized collection of significant patterns and themes) that de-
scribes the source data. While some literature outlines standard practice for thematic analyses [30,
110, 236], it is infrequently referenced, and methods utilized in applied research tend to be somewhat
ad hoc. Thematic analysis is the most extensively used qualitative analysis technique [110], and in our
experience, the most common type of analysis applied to PAT, easily outnumbering work on syndromic
surveillance, pharmacovigilance, and Named Entity Recognition. This is likely due to the fact that (1)
thematic analyses are easy to apply: any kind of text is a suitable candidate for thematic analysis, which
is not true for quantitative analyses requiring automated extraction, (2) they are interesting: the results
of a thematic analysis over PAT almost always satisfy our latent curiosity about what people actually do
online in relation to their own health, and (3) they are useful: illuminating corpus content via thematic
analysis is a sensible precursor to higher-investment, quantitative research with automated components.
Below, we compare and contrast prior work that conducts thematic analyses on PAT.
3.5.1 Condition
There is a great deal of diversity in the conditions studied via thematic analysis. Stigmatized, or oth-
erwise embarrassing, conditions receive notably more coverage than they do in syndromic surveil-
lance, pharmacovigilance or NER. Examples include smoking cessation [180, 197], infertility [160, 161],
HIV/AIDS [61, 177], Huntington’s disease [59], irritable bowel syndrome [58], and post-partum depres-
sion [69]. Underlying the interest in these topics is likely the fact that PAT comprises a unique data
source, especially for stigmatized conditions. Also commonly studied are conditions that have a
behavioral component through which the user can directly influence health outcomes. These include di-
abetes [107,206], smoking cessation [180,197], weight loss and fitness [134,142,217,240], and general
wellness [108].
3.5.2 Data Source
The majority of thematic analyses focus on online health communities (OHCs) [29, 34, 58–61, 101, 134,
160, 161, 177, 206, 220, 233], a natural choice given the volume and richness of OHC text. However,
contemporary PAT thematic analyses also turn to Twitter [69, 135, 142, 170, 180, 218, 234, 235, 240] and
Facebook [22, 71, 107, 197]. Other data sources include search logs [224, 255, 256], email [13], and
personal blogs [217].
3.5.3 Analysis Question
Thematic analyses are, by nature, exploratory, and researchers leverage them to answer a wide ar-
ray of questions. A frequent focus is unearthing users’ reasons for participating in a particular OHC,
which speaks to the question of what role the community plays in helping users meet their health
goals [13, 34, 60, 134, 142, 161, 206, 217, 224, 235]. Results usually contain some interesting insights.
For example, Hwang et al. [134] find that online support groups for weight loss are an important source
of encouragement as well as friendly competition. Relatedly, Kendall et al. [142] find that people use
Twitter to realize their fitness goals in two ways: the first is to publish evidence of having worked out, the
second is to publish a commitment to work out in the future.
The assumed role of many OHCs is to provide users with support. In such cases, a natural question
to ask is what types of support users receive. Results are practically unanimous in noting that users
seek primarily informational and emotional support [58,59,61,153,177,197,224].
In larger communities that are not necessarily specifically health-oriented (e.g. Twitter and Face-
book), the research question often takes the angle of, “When people mention X on interface Y, what
do they talk about?”. A wide range of health topics have been analyzed on Twitter along these lines,
including insomnia [135], epileptic seizures [170], and concussions [234], often with interesting insights.
For example, Scanfeld et al. [218] find that tweets mentioning antibiotics often indicate misuse. McNeil
et al. [170] note that most tweets about concussions are in reference to professional sports injuries,
and Bender et al. [22] find that a great deal of breast cancer related discussion on Facebook involves
fundraising.
Finally, a handful of thematic analyses investigate how the experience of an illness can differ by
gender. Malik et al. [160, 161] investigate infertility, paying special attention to the experience of men
whose partners are infertile. Another topic that has received some attention is how coping and self-help
mechanisms differ between people with breast cancer and prostate cancer. In general, these studies find
that men seek more informational support and less emotional support than women do [101,220,233].
3.5.4 Scaling Thematic Analyses
Only a handful of prior studies use thematic analysis results as the foundation for a larger-scale analysis
of PAT. Most notable is the work of De Choudhury et al. [69–71], who analyze how postpartum depression
(PPD) is characterized on both Twitter and Facebook. Using their findings, they leverage activity and
linguistic features to build models that can predict the onset of PPD from Facebook data [71]. Also of
note is the work on cyberchondria by White & Horvitz [255, 256], who analyze health-related search
logs and leverage the results of their analysis to model anxiety escalation and predict the transition from
self-diagnosis to seeking medical assistance. Our work on identifying users’ reasons for participating in
Forum77 (Chapter 6) and their transitions through addiction (Chapter 8) also implements scaled thematic
analyses.
Results of scaled thematic analyses are especially powerful, as they provide both a novel, insightful
contextualization of PAT acquired via close reading of a small sample, as well as population-level insights
acquired via extending these results through automated annotation and large-scale analysis. As such,
their rarity is puzzling: it is possible that many researchers who conduct thematic analyses do not have
experience with machine learning. Alternatively, categories derived in thematic analyses may be too
fine-grained for classifier training. A final explanation may be that there is sufficient reward for publishing
the results of a thematic analysis without investing the resources required to scale it.
3.6 Summary
Our goal in this chapter was to motivate PAT as a data source and present a comprehensive overview of
relevant prior work. We define PAT as any medical text authored by someone who is not a medical pro-
fessional (§ 3.1). PAT, which is often the product of many human hours spent on complex health-related
problem solving, provides a unique window into patient behavior outside of the clinical environment
(§ 3.1.1). However, it is also challenging to work with: PAT is noisy, few tools support mining and exploring
it, and what medical data PAT encodes, and how, is often unclear upon casual inspection
(§ 3.1.2). This underscores the importance of matching research questions with users’ motivations for
authoring PAT in the first place.
Work utilizing PAT as a primary data source tends to fall into one of four categories. Syndromic
Surveillance (§ 3.2) and Pharmacovigilance (§ 3.3) both involve processing large quantities of data in order
to monitor health-related variables. Entity extraction (§ 3.4), which lies under the purview of Natural Language
Processing and Machine Learning, concerns the identification of specific entities in PAT. Finally,
on the qualitative side, thematic analyses (§ 3.5) involve close readings of text in order to gain insight into
its structure and content.
PAT-based syndromic surveillance systems have great potential in the toolbox of techniques for the
real-time monitoring of medical conditions. To date, the majority of such systems focus on the topic
of influenza, relying either upon search query logs or Twitter as a primary data source. Filtering the
PAT data stream for relevant entities is crucial for a cleaner signal: although keyword-based filtering is
popular due to its simplicity, training classifiers to discriminate relevant from irrelevant data produces
superior results. Often, frequency counts of these filtered data are compared as-is to real-world gold
standards (most commonly, the CDC ILI data set14), but prior work shows that linear models built on
these data have promising predictive value.
Pharmacovigilance (§ 3.3) is concerned with detecting adverse effects related to pharmaceutical
products in real time. PAT comprises a potentially valuable, but difficult to work with, data source for Pharmacovigilance
[119]. Most prior work focuses on online health communities (OHCs), although search
logs have also been shown to be a viable data source for web-scale pharmacovigilance [257]. While
many systems demonstrate the ability to identify {drug, adverse event} pairs, automatically identifying
14http://www.cdc.gov/flu/weekly/fluactivitysurv.htm
which of these pairs (amongst thousands) are important is an unsolved problem. To date, no work has
presented a viable predictive model for adverse drug events.
A great deal of work on biomedical named entity recognition (NER) exists. While ontology-based
MetaMap and Open Biomedical Annotator are the go-to tools for medical term annotation, they per-
form poorly on PAT for two reasons: first, ontologies have insufficient coverage of consumer medical
terminology. Second, their lack of context sensitivity leads to over-inclusion of irrelevant terminology in
results.
Statistical classifiers have been shown to achieve high accuracy in biomedical NER tasks. However,
these approaches are limited by their requirement for a sizable corpus of annotated data for training and
testing. Most research on biomedical NER utilizes existing publicly available data sets, which are based
on abstracts from biomedical journal publications. Consequently, little prior work on biomedical NER
in PAT exists. Exceptions to this include some of the work on Pharmacovigilance [183, 269], and our
work on ADEPT (Chapter 5), identifying drugs of choice (Chapter 7) and using patterns to extract entity
types [112,264].
Thematic analyses over PAT cover a wide array of conditions. However, notably present are stigma-
tized conditions and conditions that have a behavioral component through which the user can influence
health outcomes. Online health communities, Twitter and Facebook are the most commonly utilized PAT
sources for thematic analyses. As thematic analyses are exploratory by nature, they are used to answer
a wide array of questions. Common topics include elucidating users’ reasons for participating in an on-
line community as well as what kinds of support such a community provides. The results of a thematic
analysis can be used to train automatic classifiers, thereby extending the research from a small PAT
sample to large PAT corpora. While prior work demonstrates the power and value in this approach, it is
rare.
In sum, PAT is a valuable data source that has been proven to have clinical value. However, PAT is
challenging to work with. To date, prior work on PAT tends either to be structured in such a way as to
reduce the number of variables being analyzed, making analysis and evaluation easier (e.g. syndromic
surveillance, pharmacovigilance, NER), or to focus on qualitative analyses of PAT (e.g. thematic analyses).
Although little work builds automated extraction and analysis on top of the results of a thematic
analysis, both prior work and our findings in Chapters 6-8 indicate that this approach yields novel and
valuable insights.
Chapter 4
Data
In this chapter we describe our PAT data sets and define terminology relevant to our work. We first
present our full MedHelp data set (§ 4.1), which we use in our work on medical term identification
(Chapter 5), and define key terminology (§ 4.1.1). We then describe Forum77 (§ 4.1.2), a subset of the
MedHelp data set, which we use for our work on addiction (Chapters 6, 7 and 8). Finally, we present our
CureTogether data set (§ 4.2), which we use as an independent test set in Chapter 5. We acquired our
data sets through research agreements with MedHelp and CureTogether, respectively, who anonymized
the data prior to sharing them.
4.1 MedHelp Corpus
MedHelp1 is an online health community designed to aid users in the diagnosis, exploration, and man-
agement of personal health conditions. The site boasts a variety of tools and services, including over
200 condition-specific user online health communities (OHCs). Our data set comprises all discussions
on all of MedHelp’s forums from 2006 through mid-2011: a total of ∼1,250,000 threads. Table 4.1 lists
the top 40 MedHelp forums by post volume, along with unique contributor counts.
4.1.1 Terminology
Figure 4.1 provides an illustrative example of the composition and content of our MedHelp data. A forum
comprises several threads (or discussions) centered around a specific medical condition (e.g. addiction,
breast cancer, etc.). A thread is composed of an initiating post, in which the initiator posts new content
for the community’s consideration, and a series of response posts, in which respondents contribute to
1http://www.medhelp.com
CHAPTER 4. DATA 36
Table 4.1: Top 40 MedHelp forums ranked by total post count. A ◦ in the Stigmatized column denotes our conservative estimate of whether the condition represented by the forum carries a stigma or is otherwise embarrassing.
Stigmatized  Forum                       Post count  Unique users
◦            Addiction: Substance Abuse  486,972     32,542
             Maternal & Child            402,065     45,821
             Pregnancy 18-34             364,475     28,321
◦            Hepatitis C                 343,433     14,330
◦            HIV Prevention              274,072     27,528
◦            Fertility                   243,919     17,391
             Women’s Health              208,683     76,221
             Thyroid Disorders           169,713     21,939
             Multiple Sclerosis          156,500     5,545
◦            STDs                        117,462     29,455
             Neurology                   111,671     47,968
             Dermatology                 107,134     47,612
             Ovarian Cancer              99,954      10,425
◦            Anxiety                     98,971      17,373
◦            Herpes                      89,792      17,061
             Undiagnosed Symptoms        82,301      30,741
             Gastroenterology            79,659      32,694
             Heart Disease               74,671      22,294
◦            Hepatitis Social            74,412      2,122
             Pregnancy 35+               72,414      5,923
             Eye Care                    70,744      18,666
◦            Addiction: Social           68,831      3,253
             Heart Rhythm                57,001      9,496
             Child Behavior              45,660      14,961
             Relationships               42,891      4,724
             Pain Management             42,099      7,990
             Breast Cancer               41,197      10,869
             Urology                     37,121      17,351
             Weight Loss Alternatives    36,925      15,003
◦            Depression                  35,614      9,035
             Chiari Malformation         32,493      1,892
             Sexual Health               32,269      11,344
             MedHelp Social              31,800      778
             Men’s Health                31,712      14,832
◦            Bipolar Disorder            29,057      3,775
             Back & Neck                 28,926      13,082
             Hepatitis B                 28,664      4,621
             Ear, Nose & Throat          28,439      14,244
◦            Miscarriages                26,043      3,703
the discussion galvanized by the initiating post. When an initiator posts a response to a thread that she
started, this post is called a self-response.
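The terminology above maps naturally onto a small data structure. The following sketch is one possible representation, not the schema of our actual data set: the class and field names are hypothetical, and the example thread is abbreviated from the one shown in Figure 4.1.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Post:
    author: str
    text: str


@dataclass
class Thread:
    title: str
    initiating_post: Post
    responses: List[Post] = field(default_factory=list)

    @property
    def initiator(self) -> str:
        """The user who started the thread."""
        return self.initiating_post.author

    def self_responses(self) -> List[Post]:
        """Responses written by the initiator to her own thread."""
        return [p for p in self.responses if p.author == self.initiator]

    def length(self) -> int:
        """Thread length in posts: the initiating post plus all responses."""
        return 1 + len(self.responses)


thread = Thread(
    title="Suboxone withdrawal",
    initiating_post=Post("liquid_daisy", "I quit cold turkey off 32mgs of suboxone."),
    responses=[
        Post("Boo28", "Congrats on the 5 days clean!"),
        Post("yellowPop", "hi congrats and keep posting for support."),
        Post("liquid_daisy", "No diarrhea, just cold sweats."),
    ],
)
print(thread.length(), [p.author for p in thread.self_responses()])
```

Nested responses and "best response" markers are deliberately omitted, mirroring the fact that we do not consider them in our analyses.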
While features for sub-discussions (nested responses) as well as picking a “best response” in a
thread do exist, they are used infrequently and we do not consider them in our analyses. Moreover, we
have neither demographic data (age, geographic location etc.) describing MedHelp users nor page view
data describing lurking (reading without posting – see § 2.2.1) behavior.
[Figure 4.1 content, panel by panel]
MEDHELP COMMUNITIES: ADD/ADHD; Addiction (Forum77); Allergies – Food Allergy; Arthritis; Asthma; Autism; Back & Neck; Bipolar Disorder; Bone Cancer; Breast Cancer; Breastfeeding; Cancer; Carpal Tunnel Syndr.; Celiac Disease; Cerebral Palsy; Cervical Cancer; Chemotherapy
Thread list:
the best way? By sparklystars 23 minutes ago
I want to come off 10 percs per day. Is it better to taper, or to go gold turkey???
oxycodone By oxyuser 5 hours ago
I have been taking vics for about 5 years now. At times I have taken as much as 40 a day. I’m sorta on day 3. I took 1 pill y…
Suboxone withdrawal By liquid_daisy 10 hours ago
I quit cold turkey off 32mgs of suboxone. Today is day 5 and I’m in a lot of pain. I just want to know how long these withd…
300 DAYS for LEX!!! By happystar 6/12/2013
Guess who had 10 months clean today!?? LEX, you go girl!!! Great job we are all sooooooo proud of you!
Can you withdraw from Lyrica? By fl12abs 6/11/2013
My doctor prescribed Lyrica for chronic
FORUM77 DISCUSSION THREAD:
INITIATING POST: Suboxone withdrawal By liquid_daisy 6/12/2012
I quit cold turkey off 32mgs of suboxone. Today is day 5 and I’m in a lot of pain. I just want to know how long these withdrawals will last…? Is there anything I can get OTC that will help??? Thanks.
RESPONSES (10 responses):
Boo28 on 6/12/2012
Congrats on the 5 days clean! 32mgs is a high dose to CT, but doable. First, some questions: are you on any other medications? What other w/d symptom…
yellowPop on 6/12/2012
hi congrats and keep posting for support. I myself jumped from 44mgs although it wasn’t pretty. Physical w/ds tend to last 10 days to 2 weeks but everyone is diff…
SELF RESPONSE:
liquid_daisy on 6/12/2012
No diarrhea, just cold sweats. I stay busy so that I don’t let my mind wander. Don’t have much of an appetite, but redbulls seem to help… chugged 4 today alrea…
Figure 4.1: Illustrative example of MedHelp and Forum77 content and structure.
4.1.2 Forum77
MedHelp’s largest forum is dedicated to the topic of Addiction: Substance Abuse2. We dub this community
Forum773.
Our data set covers all Forum77 content from 2007 to mid-2014 (7.5 years), and comprises 80,529
discussions (740,046 total posts) authored by 51,153 unique users. Figure 4.2 illustrates summary statis-
tics describing content and activity on Forum77. As expected, the volume of response posts correlates
strongly with the volume of initiating posts; moreover, both experience a slight decline from 2009 - 2014
(Figure 4.2 (A)). While the number of new users to Forum77 varies widely each month, the number of
2http://www.medhelp.org/forums/Addiction-Substance-Abuse/show/77
3All of MedHelp’s forums have a unique identifier, and the Addiction: Substance Abuse community’s is 77. We settled on
Forum77 as a convenient way to refer to this community. To our knowledge nobody within the community refers to it as Forum77.
return users, who comprise the core community base, is more consistent: in any given month there
are between 200 - 300 return users participating in the forum (Figure 4.2 (B)). This is consistent with
user tenure distribution on Forum77: while most users have a tenure of ≤ 1 month, a long tail indicates
several thousand users who have tenure > 1 year (Figure 4.2 (D)). Finally, while some initiating posts
get no responses, most get at least one, and modal thread length is 4 posts (Figure 4.2 (C)).7/26/2014 localhost:8081/index_hist.html
http://localhost:8081/index_hist.html 1/2
2008 2009 2010 2011 2012 2013200
1,000
10,00020,000
12 24 36 48 60 72
100200300400500600700800
10 20 30 40 50 60 70 8012345678910
2030405060708090100
2003004005006007008009001,0002,0003,0004,0005,0006,0007,0008,0009,00010,000
20,00030,00040,000
12345678910
2030405060708090100
2003004005006007008009001,0002,0003,0004,0005,0006,0007,0008,0009,00010,000
20,00030,00040,000
0 5 10 15 20 25 30 35 40 45 50 55 60
02,0004,0006,0008,000
10,00012,00014,00016,00018,00020,000
0 1 2 3 4 5 6 7 8 9 10
01,0002,0003,0004,0005,0006,0007,0008,000
0 5 10 15 20 25 30 35 40
Year
Pos
t cou
nt
Initiating
Responding
7/26/2014 localhost:8081/index_hist.html
http://localhost:8081/index_hist.html 1/2
2008 2009 2010 2011 2012 2013200
1,000
10,00020,000
2007 2008 2009 2010 2011 2012
100200300400500600700800
10 20 30 40 50 60 70 8012345678910
2030405060708090100
2003004005006007008009001,0002,0003,0004,0005,0006,0007,0008,0009,00010,000
20,00030,00040,000
12345678910
2030405060708090100
2003004005006007008009001,0002,0003,0004,0005,0006,0007,0008,0009,00010,000
20,00030,00040,000
0 5 10 15 20 25 30 35 40 45 50 55 60
02,0004,0006,0008,000
10,00012,00014,00016,00018,00020,000
0 1 2 3 4 5 6 7 8 9 10
01,0002,0003,0004,0005,0006,0007,0008,000
0 5 10 15 20 25 30 35 40
Year
Use
[Figure 4.2, panels A-F: plots not reproduced. Recoverable axis labels and legends: User count; Return users; New users; Initiating posts per user; Tenure (months); Responses per user; Thread count; Thread length (# posts).]
Figure 4.2: Summary statistics of Forum77 variables: post volume by month (A), user volume by month (B), thread length distribution (C), user tenure distribution (D), user initiating post count distribution (E), and user response post count distribution (F).
4.2 CureTogether Corpus
CureTogether4 is an online health community that focuses on collecting structured health information
from its members via surveys. The site covers a wide array of medical conditions (589 in our data set),
each associated with a curated collection of symptom, treatment, side effect and cause/trigger terms. By
focusing on collecting structured data, CureTogether circumvents the problem of extracting medically-
relevant information from PAT. However, discussion levels on the site are low: our data set contains
∼3,000 free-text posts on a variety of CureTogether’s medical topics. Even so, these posts are
detailed and thoughtful, and in Chapter 5 they serve as a PAT source independent of MedHelp.
4 http://www.curetogether.com
Chapter 5
Identifying Medically Relevant Terms in PAT
5.1 Introduction
When we began exploring our MedHelp corpus, we realized that our efforts were severely hampered
by the absence of a good solution to a seemingly simple problem: identifying the medically relevant
terms in PAT. How, for example, might one automatically extract the terms that we have flagged as
medically relevant in the following excerpt from MedHelp’s Addiction: Substance Abuse forum?
So, I’m 62 hours without pills, and its definitely getting worse, I ache all over, the anxiety is
the worst, along with restless legs but I ’m here now, and I’m not sure it can get much worse
so hopefully soon I’ll be out the other side. Last night was horrible. I had around 3 hours
broken sleep, night sweats and the most awful haunting nightmares when I was sleeping.
I’ve taken the l-tyrosine and B6 this morning, I’ll try and force some food down me shortly
and then take the rest of the vitamins.
The ability to distill medically relevant terms from PAT is useful for exploration: it filters out irrelevant
content, allowing for high-level insights into the corpus and facilitating hypothesis generation. More
sophisticated analyses can also be implemented on the extracted terms. The results of co-occurrence
analyses, for example, can improve query expansion and information retrieval over a corpus [194, 219,
245], or can be used to impose additional structure, such as clustering [39] or hierarchical concept
summaries [216], over the source data. In a PAT corpus, significant term co-occurrences could be used
to build a “map” of important links between symptoms and treatments.
Identifying medical concepts in text is a long-standing research challenge that has spurred the devel-
opment of several software toolkits [17]. Those such as MetaMap1 and the Open Biomedical Annotator
(OBA)2 focus primarily on mapping words from text authored by medical experts to concepts in biomed-
ical ontologies. A biomedical ontology is essentially a controlled collection of terms and the hierarchical
relationships between them. Usually, ontological terms are also categorized or typed (e.g., drug, sign or
symptom, medical device, etc.).
Thousands of biomedical ontologies exist, and differ according to the topic or level of specificity
covered by their terms. For example, the MFOEM3 (Emotion Ontology) covers concepts specifically
related to affective phenomena, while SNOMED-CT4 (Systematized Nomenclature of Medicine - Clinical
Terms) covers a broad array of clinical terms. Curating ontologies is a labor-intensive process, in which
people must agree on which terms should be included, removed, combined or split, must categorize said
terms, and must define their hierarchical relationships.
Despite recent efforts to develop an ontology suitable for PAT - the open and collaborative Consumer
Health Vocabulary (OAC) CHV [77, 273, 274] - we suspect that tools like MetaMap and OBA will remain
ill-suited to the task of medical term identification in PAT due to structural differences between PAT and
text authored by medical experts. As we note in § 3.1.2, such differences include lexical and semantic
mismatches [167,272], mismatches in consumers’ and experts’ understanding of medical concepts [99,
272] and mismatches in descriptive richness and length [99,167,272]. Finally, consumer medical jargon
may evolve over time as a patient acquires expertise. This would be a challenge for ontologies which
are, by design, inflexible and brittle.
Our goal is to automatically and accurately identify medically relevant terms in PAT. (Note that we do
not attempt to map terms to ontological concepts; we view this as a separate and complementary task.)
As acquiring annotated data sets is a major obstacle to classifier training, we investigate crowdsourcing
as an alternative option to having medical professionals label PAT (§ 5.4). First, we discuss the process
of designing the crowdsourcing task (§ 5.4.1). Next, we compare crowdsourced annotations from non-
experts (Amazon’s Mechanical Turk5 workers (Turkers)) and medical experts (Registered Nurses hired
via ODesk6) (§ 5.4.2). We find that crowdsourcing PAT medical term identification tasks to non-experts
achieves results comparable in quality to those given by medical experts (§ 5.4.3). While this result
opens a new avenue for rapid and affordable PAT annotation, not all PAT annotation tasks are amenable
to crowd labeling (§ 5.4.4).
1 http://metamap.nlm.nih.gov
2 http://bioportal.bioontology.org/annotator
3 http://bioportal.bioontology.org/ontologies/MFOEM
4 http://www.ihtsdo.org/snomed-ct
5 http://www.mturk.com
6 http://www.odesk.com
Next, we train a conditional random field (CRF) classifier to automatically identify medically relevant
terms in PAT (§ 5.5). Our classifier, trained on 10,000 crowd-labeled PAT sentences, dramatically out-
performs state-of-the-art annotation tools MetaMap, OBA and TerMINE (§ 5.5.3). We call our classifier
ADEPT (Automatic Detection of Patient Terminology). In an error analysis, we observe that ADEPT
has the most trouble correctly classifying “generic” medical terms (e.g., pills, medicine, doctor) (§ 5.5.3).
We attribute ADEPT’s success to the suitability of sentence-level context-sensitive learning models, like
CRFs, to PAT medical term identification tasks (§ 5.7).
Finally, we demonstrate ADEPT’s efficacy through applying it to text from our MedHelp corpus (§ 5.6).
First, we compare the top-50 terms extracted from MedHelp’s Arthritis forum by both ADEPT and the
OBA (§ 5.6.1), noting that those recovered by ADEPT are both diverse and richly descriptive of arthritic
conditions, while the majority of those recovered by OBA are spurious. Next, we construct a graph of
co-occurring terms extracted by ADEPT from MedHelp’s Addiction: Substance Abuse forum, Forum77
(§ 5.6.2). The resulting graph suggests that a primary topic of discussion on the forum is withdrawal, and
moreover, that users discuss explicit drugs, especially prescription opioids, on the forum. Our work in
Chapters 6, 7 and 8 further explores Forum77 and confirms that these high-level insights are accurate.
5.2 Related Work
5.2.1 Medical Term Identification
MetaMap, arguably the best-known medical entity extractor, is a highly configurable program that relates
words in free text to concepts in the UMLS Metathesaurus [16,17]. MetaMap sports an array of analytic
components, including word sense disambiguation, lexical and syntactical analysis, variant generation,
and POS tagging. MetaMap has been widely used to process data sets ranging from email to MEDLINE7
abstracts to clinical records [17,31,43].
The Open Biomedical Annotator (OBA) is a more recent biomedical concept extraction tool under
development at Stanford University. OBA is based on MGREP: a concept recognizer developed at the
University of Michigan [138]. Like MetaMap, OBA maps words in free text to ontological concepts; its
workflow, however, is simpler, comprising a dictionary-based concept recognition tool and a semantic
expansion component that finds concepts related to those present in the exact text [138].
7 A collection of biomedical publication abstracts. For more information see: http://www.nlm.nih.gov/pubs/factsheets/medline.html
A handful of studies compare MetaMap and/or OBA to human annotators, and tend to find the
tools wanting. Ruau et al. [212] evaluated automated MeSH annotations on PRoteomics IDEntification
(PRIDE) experiment descriptions against manually assigned MeSH annotations. MetaMap achieved
precision and recall scores of 15.66% and 79.44%, while OBA achieved 20.97% and 79.48%. Pratt and
Yetisgen-Yildiz [201] compared MetaMap’s annotations to human annotations on 60 MEDLINE titles: they
found that MetaMap achieved exact precision and recall scores of 27.7% and 52.8%, and partial preci-
sion and recall scores of 55.2% and 93.3%. They note that several failures result from missing concepts
in the UMLS. This is corroborated in an analysis of 376 patient-defined symptoms from PatientsLikeMe
by Smith and Wicks [226], who found that only 43% of unique terms had either exact or synonymous
matches in the UMLS; of the exact matches, 93% were contributed by SNOMED CT.
In addition to ontological approaches, there are several statistical approaches to medical term iden-
tification. NaCTeM’s TerMINE8 is a domain-independent tool that uses statistical scoring to identify
technical terms in text corpora [94]. Given a corpus, TerMINE produces a ranked list of candidate terms.
In a test on eye-pathology medical records, precision was highest for the top 40 terms ranked by
C-value (∼75%) and decreased steadily down the list (∼30% overall). Absolute recall was not calculated,
due to the time-consuming nature of having experts verify true negative classifications in the test corpus.
Recall relative to the extracted term list, however, was ∼97% [94].
As we discuss in Chapter 3, a great deal of prior work has focused on training statistical classifiers
for biomedical named entity recognition (NER) tasks [76,87,95,111,124,125,214,222,238,239,267]. In
general, this work demonstrates good results, indicating that statistical classification methods are more
appropriate for biomedical NER tasks than MetaMap and OBA. However, none of this work utilizes PAT
as a primary data source: statistical classifiers require sizable quantities of labeled data for training and
testing, and to date all available such data sets are based on biomedical publication abstracts [145,146,
204,271].
5.2.2 Consumer Health Vocabularies
A complementary and closely related branch of research to ours is Consumer Health Vocabularies
(CHVs): ontologies that link layman and UMLS medical terminology [85,273]. Supporting motivations for
developing CHVs include: narrowing knowledge gaps between consumers and providers [273,274], coding
data for retrieval and analysis [77], improving the “readability” of health texts for lay consumers [144]
and coding new concepts that are missing from the UMLS [143, 226]. We are currently aware of two
CHVs: the MedlinePlus Consumer Health Vocabulary9, and the open and collaborative Consumer Health
Vocabulary10 – (OAC) CHV – which was included in UMLS as of May 2011.
8 http://www.nactem.ac.uk/software/termine
To date, most research on CHVs has focused on discovering new terms to add to the (OAC) CHV. In
2007, Zeng et al. [274] compared several automated approaches for discovering new “consumer medical
terms” from MedlinePlus query logs. Using a logistic regression classifier, they achieved an AUC of
95.5% on all n-grams not present in the UMLS. More recently, Doing-Harris & Zeng [77] proposed a
computer-assisted update (CAU) system to crawl PatientsLikeMe, suggesting candidate terms for the
(OAC) CHV to human reviewers. By filtering CAU terms by C-value [94] and termhood [274] scores, they
were able to achieve a 4:1 ratio of valid to invalid terms; however, this also resulted in discarding over
50% of the original valid terms. Given the goals of the CHV movement, our CRF model for PAT medical
term identification may prove to be an effective method for generating new candidate terms for CHVs.
5.3 Data
In this section we describe our data preparation and sampling methods. We use samples from our
MedHelp (§ 4.1) data set for comparing crowdsourced vs. expert-sourced labels, and for training and
cross-validation of our CRF classifier. We use a sample from our CureTogether (§ 4.2) data set as a
hold-out gold standard for comparing our CRF classifier to state-of-the-art medical term annotation tools.
5.3.1 Preparation
We analyze our data at the sentence level. This promotes a fairer comparison between machine taggers,
which break text into independent sentences or phrases before annotating, and human taggers, who may
otherwise transfer context across sentences. We use Lucene11 to tokenize our corpora into sentences.
For consistency, we excluded sentences from MedHelp forums that we agreed were tangentially
medical (e.g., “Relationships”), over-general (e.g., “General Health”), or that contained fewer than 1,000
sentences. The raw MedHelp data set contains approximately 1,250,000 discussions. After preparation,
the data set comprises approximately 950,000 discussions from 138 forums: a total of 27,230,721
sentences.
9 http://www.nlm.nih.gov/medlineplus/xml.html
10 http://consumerhealthvocab.org
11 http://lucene.apache.org
5.3.2 Samples
We use the following samples:
MH1K : 1,000 MedHelp sentences sampled uniformly at random; labeled by crowd and experts. We
use this sample to compare expert and crowd labels. We also use the expert labels as a gold standard
for comparing our CRF classifier’s performance against state-of-the-art tools.
MH10K : 10,000 MedHelp sentences sampled uniformly at random; labeled by crowd. We use this
sample to train our CRF classifier to identify medically relevant terms in PAT. We also use it for 10-fold
cross validation of this classifier.
CT1K : 1,000 CureTogether sentences sampled uniformly at random; labeled by experts. We use this
as an independent gold standard for comparing our CRF classifier performance against those of state-
of-the-art tools.
5.4 Labeling Medically Relevant Terms with the Crowd
A common barrier to both training and evaluating medical text annotators is the lack of sufficiently large,
labeled data sets [17,201]. The challenge in building such data sets lies in sourcing medical experts with
enough time to annotate text at a reasonably low cost [201].
Crowdsourcing is the allocation of a series of small tasks (often called micro-tasks) to a “crowd”
of online workers, typically via a web-based marketplace. Crowdsourcing is particularly attractive for
obtaining results faster and at lower cost than other participant recruitment schemes. When the workflow
is properly managed (e.g., via quality control measures such as aggregate voting, or by breaking up tasks
into suitable sub-components such as the “find-fix-verify” method proposed by Bernstein et al. [26]) the
combined results are often comparable in quality to those obtained via more traditional task completion
methods [126, 147]. Snow et al. [228] find that non-expert crowds can effectively execute linguistic
annotation tasks (affect recognition, word similarity, textual entailment, temporal ordering, and word
sense disambiguation) that are typically performed by experts. However, designing a crowdsourcing
task such that quality results are obtained is challenging and requires careful thought [26,147].
Replacing medical experts with non-expert crowds would address concerns of time and cost, allowing
us to build labeled PAT data sets quickly and cheaply. To test the viability of this idea, we first design
a crowdsourcing task for medical term identification in PAT (§ 5.4.1). Next, we deploy this task to both
experts (in our case, Registered Nurses, or RNs) and non-experts (Amazon Mechanical Turk workers,
or Turkers), and compare their annotations over a sample of 1,000 sentences (MH1K ) (§ 5.4.2).
5.4.1 Task Design and Pilot Study
Amazon’s Mechanical Turk12 is an online crowdsourcing platform where workers (Turkers) can browse
“human intelligence tasks” (HITs) posted by requesters and complete them for a small payment. We de-
signed a simple interface in which a HIT comprised 100 sentences, each of which was accompanied by
a text box into which Turkers could copy medically relevant terms. Our original prompt simply asked Turk-
ers to copy/paste any terms that seemed medically relevant from each sentence into the accompanying
text box. The resulting data contained several inconsistencies, including:
terms taken out of context: users selected terms that had no medical relevance in the context of
the given sentence, but might have medical connotations in other contexts. E.g., “anxiety” in the
sentence “I apologize if my post created any undue anxiety”.
omission: users would often leave an empty response for a sentence that contained a term that
was clearly medically relevant.
numerical measurement inclusion: some users felt that numbers corresponding to medication
dosages, units of measurement, etc. were relevant, while others did not.
concept granularity and scope: in a sentence such as “I have low blood sugar”, users would not
know whether to select “low blood sugar” or just “blood sugar”.
repetition: if a medically relevant term appeared twice in the same sentence (e.g., “pain” in “I am
in a lot of pain and the meds don’t seem to help, they just take the edge off the pain if anything”),
some users would extract it only once, and others would extract it each time it appeared.
12 http://www.mturk.com
Prior work shows that the design of a crowdsourcing task and prompt strongly impacts response
quality [147]. In order to arrive at a suitable prompt that produced consistent results, we iterated on our
original version several times, basing our changes on the design principles outlined by Kittur et al. [147].
We discuss pivotal changes below; Figure 5.1 shows our final prompt and interface.
The most problematic inconsistency was terms taken out of context, which amount to unnecessary
false positives. Subjective tasks are especially difficult for crowd workers [147], and the medical term
identification task is inherently subjective. We discovered, however, that making the task seem less
subjective, by asking users to tag words/phrases that they thought doctors would find interesting, all but
eliminated this effect.
The next problematic issue was omission, or unnecessary false negatives. We suspected that one
reason Turkers were cheating was that doing so let them complete the HIT faster. Kittur et
al. [147] note that to acquire accurate results from Turkers, malicious completion and good-faith comple-
tion should require comparable levels of effort. We changed our interface such that each text box had
to contain some value prior to completion of the HIT, and instructed Turkers to type “NA” into text boxes
corresponding to sentences containing no medically relevant concepts. This helped somewhat, but it
is still easier to type “NA” than to copy/paste several terms into a text box. Kittur et al. [147] also note
that signaling to Turkers that their responses will be verified in a believable manner is thought to reduce
invalid responses as well as increase time spent on task. Before accepting the HIT, we informed Turkers
that four other Turkers would be completing the same HIT, and that their response would be rejected if it
disagreed substantially with the others. We enforced this policy. Implementing these changes resulted
in a drastic reduction of omissions.
Explicitly asking users to ignore numerical measurements and providing illustrative examples on
multi-word concepts reduced conflicting incidences of numerical measurement inclusion and concept
granularity to the point where aggregating over Turker responses produced a good result. However,
similar interventions related to issues of repetition had no effect. Ultimately we propagated the “medically
relevant” label to all unlabeled terms in the sentence that matched an extracted term. It is reasonable
to assume that two identical terms should carry the same label in a sentence, and we observed no
instances in which this assumption was violated.
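The propagation step described above can be sketched as follows. This is a minimal illustration rather than the code used in the study: the function name `propagate_labels`, whitespace tokenization, and case-insensitive matching are all assumptions made for the example.

```python
def propagate_labels(tokens, extracted_terms):
    """Mark every occurrence of each extracted term in a tokenized sentence.

    tokens: list of word tokens for one sentence (assumed whitespace-split).
    extracted_terms: terms a worker copied out; each may span several words.
    Returns a list of booleans, True where a token is labeled medically relevant.
    """
    lowered = [t.lower() for t in tokens]
    labels = [False] * len(tokens)
    for term in extracted_terms:
        term_tokens = term.lower().split()
        n = len(term_tokens)
        if n == 0:
            continue
        # Slide a window over the sentence and label every match, not only the first.
        for start in range(len(lowered) - n + 1):
            if lowered[start:start + n] == term_tokens:
                for i in range(start, start + n):
                    labels[i] = True
    return labels


sentence = "I am in a lot of pain and the meds just take the edge off the pain".split()
labels = propagate_labels(sentence, ["pain", "meds"])
relevant = [tok for tok, is_medical in zip(sentence, labels) if is_medical]
```

Here a worker who extracted “pain” only once still yields a label on both occurrences, which is the behavior the propagation rule is meant to guarantee.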
Instructions (please read to get full credit for this task)
For this HIT, we would like you to extract all words/phrases that are medical concepts from the sentences below. There are 100 sentences; this should take ~15-25 minutes.
To find medical concepts, ask yourself the question: "If I was telling this to my doctor, which words would the doctor find interesting?" To simplify things, do not extract numerical values such as age, weight, gender, medication dosage, symptom duration etc. Do extract concepts describing body parts, conditions (and causes and effects of conditions), symptoms, treatments, etc. Remember that some medically relevant terms are abbreviated (e.g. BS for "blood sugar").
For each sentence, please COPY/PASTE the relevant text EXACTLY (do not re-type it, or correct misspellings), and SEPARATE each concept with a COMMA. For example:
I gave up smoking 2 weeks ago, and my blood pressure is under control with verapamil (0.5mg twice a day)..
smoking, blood pressure, verapamil
For multi-word concepts, include as many words as you can, but make sure that they refer to just ONE concept. Do not extract overlapping concepts. For example, in the sentence below, the term "blood sugar" is preferred to "blood".
Shakes in the hands can be symptomatic of low blood sugar.
shakes, hand, blood sugar
Finally, many of the sentences will contain no medically relevant concepts. Just enter NA in the boxes in these cases. For example:
You need to take care of yourself before you can take care of someone else.
NA
NOTE: you will be able to complete ONLY ONE of these HITs. Please do not attempt to accept another hit after completing this one. Have fun!
Submit
Figure 5.1: Final PAT medical term identification task instructions and interface. Turkers were informed that their answers would be checked against other Turkers’ in the HIT description on the MTurk interface.
5.4.2 Experiment
We use our MH1K data set for this experiment: a uniform sample of 1,000 sentences from the general
MedHelp data set. We deemed 1,000 sufficiently large for an informative comparison between RN and
Turker responses, but small enough to make expert annotation affordable. We split the sample into 10
groups of 100 sentences.
Table 5.1: Majority vote at the token level over RN responses. Terms identified by RNs as medically relevant are shown in bold. Stopwords (e.g., “and”, “of”) are excluded from the vote.
RN 1: shakes in the hands can be symptomatic of low blood sugar
RN 2: shakes in the hands can be symptomatic of low blood sugar
RN 3: shakes in the hands can be symptomatic of low blood sugar
Result: shakes hands symptomatic blood sugar
Our experts comprised 30 RNs from ODesk13, an online professional contracting service. In addition
to the RN qualification, we required that each expert have perfectly rated English language proficiency.
Each expert did one PAT medical term identification task (100 sentences), and each group of 100 sen-
tences was tagged by three experts, who were reimbursed $5.00 for completing the task. All tasks were
completed within two weeks at a cost of $150.00.
Our non-expert crowd comprised 50 Turkers recruited from Amazon’s Mechanical Turk (AMT). We
required that the Turkers have high English language proficiency, reside in the United States, and be
certified to work on potentially explicit content. Each Turker performed a single PAT medical term iden-
tification task (100 sentences), and each sentence group was tagged by five Turkers. The Turkers were
reimbursed $1.20 upon faithful completion of the task. All tasks were completed within 17 hours at a cost
of $60.00.
Determining a Gold Standard
We determine a gold standard for each sentence by taking a majority vote over the RNs’ responses.
Voting is performed at the word level, despite the prompt to extract words or phrases from the sentences.
Table 5.1 illustrates how this simplifies term identification by eliminating partial matching considerations
over multi-word concepts. N-gram terms can be recovered by heuristically combining adjacent words.
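A word-level majority vote of this kind can be sketched in a few lines. The sketch below is illustrative only: the stopword list is a small assumed subset, and the per-RN selections are invented for the example (only the final result is chosen to agree with the Result row of Table 5.1).

```python
from collections import Counter

# Illustrative stopword subset; the stopword list actually used is not specified here.
STOPWORDS = {"in", "the", "can", "be", "of", "and"}


def majority_vote(tokens, annotations, stopwords=STOPWORDS):
    """Return the tokens marked as medical by a strict majority of annotators.

    tokens: list of word tokens for one sentence.
    annotations: one collection of selected token indices per annotator.
    """
    votes = Counter(i for selected in annotations for i in set(selected))
    needed = len(annotations) // 2 + 1  # strict majority, e.g. 2 of 3
    return [
        tok
        for i, tok in enumerate(tokens)
        if tok.lower() not in stopwords and votes[i] >= needed
    ]


tokens = "shakes in the hands can be symptomatic of low blood sugar".split()
# Hypothetical selections by three annotators, given as token indices.
rn_selections = [
    {0, 3, 6, 9, 10},   # shakes, hands, symptomatic, blood, sugar
    {0, 3, 8, 9, 10},   # shakes, hands, low, blood, sugar
    {3, 6, 9, 10},      # hands, symptomatic, blood, sugar
]
gold_terms = majority_vote(tokens, rn_selections)
```

Because each word is voted on independently, a phrase that one annotator scoped as “low blood sugar” and another as “blood sugar” contributes agreement on the shared words instead of counting as a mismatch.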
Comparing Turkers Against a Gold Standard
To test the feasibility of using non-expert crowds in place of experts, we compare Turker to RN responses
directly, aggregating across all 5 possible Turker voting thresholds. This allows us both to evaluate
the quality of aggregated Turker responses against the gold standard and to select the optimal voting
threshold.
13 http://www.odesk.com
Table 5.2: Turker performance against the RN gold standard. Voting threshold indicates the minimum number of Turkers who have to annotate a term as medically relevant for it to be included in the result. Maximum column values are indicated in bold. A corroborative policy of 2+ votes yields high scores across the board, and maximizes F1-score.
Vote Threshold   F1      Precision   Recall   Accuracy   MCC
1                78.45   67.15       94.31    93.96      0.77
2                84.43   82.53       86.41    96.29      0.82
3                83.80   91.67       77.18    96.52      0.82
4                76.61   95.70       63.87    95.46      0.76
5                59.81   97.99       43.04    93.26      0.62
5.4.3 Results
Both the RN and the Turker group achieve high inter-rater reliability scores: κ = 0.709 and κ = 0.707
respectively using Fleiss’ Kappa [88], which measures agreement across two or more voters. Table 5.2
compares aggregated Turker responses against the RNs’ gold standard; voting thresholds dictate the
number of Turker votes required for a word to be tagged as “medically relevant”.
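For reference, Fleiss’ Kappa can be computed from a table of per-item category counts as sketched below. This is a generic implementation of the standard formula, not the code used to obtain the κ values above; the function name and the toy inputs are assumptions.

```python
def fleiss_kappa(counts):
    """Fleiss' kappa for a table of rating counts.

    counts[i][j] = number of raters who assigned item i to category j.
    Every item must have the same total number of raters (the raters
    themselves may differ from item to item).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item needs the same number of ratings")
    n_categories = len(counts[0])

    # Share of all assignments that fall into each category.
    p_j = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]
    # Observed agreement on each item.
    p_i = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_bar = sum(p_i) / n_items          # mean observed agreement
    p_e = sum(p * p for p in p_j)       # agreement expected by chance
    # Undefined (division by zero) when all ratings fall in one category.
    return (p_bar - p_e) / (1 - p_e)


# Toy example: 4 words, 3 raters, categories (medical, not medical).
perfect = fleiss_kappa([[3, 0], [0, 3], [3, 0], [0, 3]])
split = fleiss_kappa([[2, 1], [1, 2]])
```

In the word-level setting used here, each item would be one non-stopword token and the two categories would be “medically relevant” and “not medically relevant”.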
F1-score is maximized at a voting threshold of 2. We call this a corroborated vote, and select 2 as
the appropriate threshold for our remaining experiments. Overall, Turker scores are sufficiently high that
we regard corroborated Turker responses as an acceptable approximation for expert judgment.
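The threshold sweep behind Table 5.2 can be sketched as follows. The code is an illustration under assumptions: the vote counts and gold labels are toy data, and the function names are hypothetical. The metric definitions (precision, recall, F1, accuracy, and Matthews correlation coefficient) are the standard ones.

```python
import math


def apply_threshold(vote_counts, threshold):
    """A word is tagged medical if at least `threshold` Turkers selected it."""
    return [votes >= threshold for votes in vote_counts]


def f1_score(precision, recall):
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def evaluate(pred, gold):
    """Word-level scores of predicted labels against gold labels."""
    tp = sum(1 for p, g in zip(pred, gold) if p and g)
    fp = sum(1 for p, g in zip(pred, gold) if p and not g)
    fn = sum(1 for p, g in zip(pred, gold) if not p and g)
    tn = sum(1 for p, g in zip(pred, gold) if not p and not g)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "precision": precision,
        "recall": recall,
        "f1": f1_score(precision, recall),
        "accuracy": (tp + tn) / len(gold),
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Toy data: number of Turkers (out of 5) selecting each word, and RN gold labels.
vote_counts = [5, 3, 2, 1, 0, 0]
gold = [True, True, False, True, False, False]
by_threshold = {
    t: evaluate(apply_threshold(vote_counts, t), gold) for t in range(1, 6)
}
best_threshold = max(by_threshold, key=lambda t: by_threshold[t]["f1"])

# Sanity check on Table 5.2: F1 is the harmonic mean of precision and recall.
table_f1_at_2 = round(f1_score(82.53, 86.41), 2)
```

Raising the threshold trades recall for precision, which is the pattern visible in Table 5.2; selecting the threshold by F1 balances the two.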
5.4.4 Limitations of the Crowd
Crowdsourcing medical term identification in PAT allows us to build large, annotated data sets both
cheaply and quickly. Exploring the crowd’s efficacy at other medical entity annotation tasks is an impor-
tant avenue for future work. Here, we offer some anecdotal insights based on our own attempts to get
the crowd to label specific types of medical terms in PAT. We attempted to implement two tasks similar
to that described in § 5.4.1: in the first, we asked Turkers to identify terms referring to symptoms and/or
conditions (e.g., “cough”, “asthma”, “headache”). In the second task, we asked them to identify terms
referring to drugs and/or treatments (e.g., “acupuncture”, “Tylenol”, “cough medicine”).
Although Turkers seemed to approach the task earnestly (they spent a reasonable amount of time
on it), the results were surprisingly inconsistent. In fact, some workers defaulted to labeling any terms
that were medically relevant, even though it is unlikely that they had been exposed to the original task
described in § 5.4.1, as more than 6 months had since elapsed. Ultimately, we hypothesized that there
were three factors explaining Turkers’ poor performance:
The first is subjectivity. The task of identifying symptoms or treatments is ambiguous and, in our
experience, more subjective than that of identifying terms that are simply medically relevant. For ex-
ample, do wheelchairs, relaxation classes, birth control or drinking water constitute treatments? Do
sensations, flare-ups, pregnant and being worried constitute symptoms or conditions? The answers to
these questions tend to be “it depends”.
The second is concept scatteredness, which primarily affects the symptom/condition category. Symp-
tom descriptions are often spread across an entire sentence, and Turkers are unsure of how to scope
such concepts. Consider, for example, the phrase “after I took the meds I felt like I’d been hit by a truck”.
Is “felt like I’d been hit by a truck” a symptom? This challenge is also cited by Leaman et al. [154] in work
on mining adverse drug events from user comments on DailyStrength14.
The final factor that likely affected Turker performance was task overlap. The postings of the symptom
and/or condition task and the drug and/or treatment tasks were staggered by a couple of days. However,
we noticed that some people tried to pick out just drugs and/or treatments in a symptoms and/or con-
ditions task, and vice versa. We attribute such mixups to the fact that the same Turkers who had done
the earlier task were also attempting the staggered task, but had habituated to the first task. Allowing
more time to elapse before posting the second task, or preventing Turkers from doing both tasks, should
ameliorate this effect.
We believe that with additional design and iteration, it would be possible to get Turkers to identify
specific types of medical terminology in PAT. For example, a multi-tiered approach such as find-fix-
verify [26] might reduce the level of task subjectivity. Enhancing the interface such that Turkers could
select “core” concepts and then related supporting terms might facilitate accuracy. Refining the task to
make it more specific would likely reap rewards. For example, instead of asking Turkers to “find terms
referring to symptoms or conditions”, they might be asked to “find terms that refer to symptoms related
to the condition Asthma”.
14 http://www.dailystrength.com
In sum, however, designing a crowdsourcing task can be a resource-intensive process, and this
must be traded off against alternative annotation methods. In our later work on Forum77, our data were
sufficiently small that we elected to annotate them ourselves. However, systematically exploring the design
space of crowdsourcing PAT annotation tasks would likely yield high returns in the long term.
5.5 Training a Classifier on Crowd-Labeled Data
We now turn to the question of training a statistical classifier to identify medical terms in PAT automat-
ically. We describe the models that we both use and compare against (§ 5.5.1), before describing our
experiment design (§ 5.5.2). Next, we present our results (§ 5.5.3), along with a failure analysis of our
classifier, ADEPT. Finally, we discuss our results and the limitations of our approach (§ 5.7).
5.5.1 Models
MetaMap, OBA and TerMINE We use the Java API for MetaMap 2012¹⁵, running it under three conditions:
default; restricting the target ontology to SNOMED CT (a high percentage of “consumer health
vocabulary” is reputedly contained in SNOMED CT [226]); and restricting the target ontology to the
(OAC) CHV. We use the Java client for OBA [138], running it under two conditions: default; and restricting
the target ontology to SNOMED CT, as the (OAC) CHV was not available to the OBA at the time of
writing. For TerMINE, we use the online web service¹⁶.
Dictionary A dictionary (or gazetteer) is one of the simplest classifiers that we can build using labeled
training data. Our dictionary compiles a vocabulary of all words tagged as “medical” in the training data
according to the corroborative voting policy; it then scans the test data and tags any words that match a
vocabulary element. Our dictionary implements case-insensitive, space-normalized matching.
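The dictionary model described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the implementation used in this work: the function names, the token-level input format and the reading of “space-normalized” as collapsing runs of whitespace are all assumptions.

```python
import re

def normalize(token: str) -> str:
    """Lowercase and collapse runs of whitespace (assumed normalization)."""
    return re.sub(r"\s+", " ", token.strip()).lower()

def build_vocabulary(training_sentences):
    """Collect every token labeled medical in the training data.

    training_sentences: iterable of sentences, each a list of
    (token, is_medical) pairs (hypothetical input format).
    """
    vocab = set()
    for sentence in training_sentences:
        for token, is_medical in sentence:
            if is_medical:
                vocab.add(normalize(token))
    return vocab

def tag_sentence(tokens, vocab):
    """Tag a test token as medical iff its normalized form is in the vocabulary."""
    return [(tok, normalize(tok) in vocab) for tok in tokens]

# Toy training data (invented for illustration).
train = [
    [("my", False), ("Asthma", True), ("is", False), ("worse", False)],
    [("tired", True), ("all", False), ("day", False)],
]
vocab = build_vocabulary(train)

# The dictionary tags "tired" regardless of context, which is the
# precision problem that a context-sensitive model is meant to address.
tagged = tag_sentence(["I'm", "tired", "of", "asthma", "ads"], vocab)
```

Because matching ignores context, every occurrence of a vocabulary word is tagged, whether or not it is used in a medical sense.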
ADEPT: A Conditional Random Field Model. Conditional random fields (CRFs) are probabilistic
graphical models particularly suited to labeling sequence data [151]. Their suitability stems from the
fact that they relax several independence assumptions made by Hidden Markov Models; moreover, they
can encode arbitrarily related feature sets without having to represent the joint dependency distribution
over features [151]. As such, CRFs can incorporate sentence-level context into their inference procedure.
For example, a CRF can discern that the word “tired” represents a medical term in the sentence
“I’m feeling so tired, as though I am oxygen deprived.”, but not in the sentence “I’m tired of feeling as
though I am oxygen deprived.” The term “oxygen deprived” is medically relevant in both sentences.¹⁷
15 http://metamap.nlm.nih.gov
16 http://www.nactem.ac.uk/software/termine
Our CRF training procedure takes, as input, labeled training data coupled with a set of feature defi-
nitions, and determines model feature weights that maximize the likelihood of the observed annotations.
We use the Stanford Named Entity Recognizer package¹⁸, a trainable Java implementation of a CRF
classifier, and its default feature set. Examples of default features include word substrings (e.g., “ology”
from “biology”) and windows (previous and trailing words); the full list is detailed in Appendix A. We refer
to our trained CRF model as ADEPT (Automatic Detection of Patient Terminology).
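To make the two example feature families concrete, the following Python sketch computes character substrings and a one-word window for a single token. It is a hypothetical illustration only: the function name, the feature-string format and the n-gram length limit are assumptions, and it does not reproduce the Stanford package's default feature set (Appendix A).

```python
def token_features(tokens, i, max_ngram=6):
    """Illustrative features for tokens[i]: substrings plus a one-word window."""
    word = tokens[i].lower()
    feats = {f"word={word}"}
    # Character substrings, e.g. "ology" from "biology".
    for n in range(2, max_ngram + 1):
        for start in range(len(word) - n + 1):
            feats.add(f"sub={word[start:start + n]}")
    # Window features: previous and following word, with boundary markers.
    prev_word = tokens[i - 1].lower() if i > 0 else "<s>"
    next_word = tokens[i + 1].lower() if i < len(tokens) - 1 else "</s>"
    feats.add(f"prev={prev_word}")
    feats.add(f"next={next_word}")
    return feats

# The same word receives different window features in different contexts.
a = "i'm feeling so tired , as though".split()
b = "i'm tired of feeling".split()
fa = token_features(a, a.index("tired"))
fb = token_features(b, b.index("tired"))
```

The two feature sets share the word and substring features for “tired” but differ in their window features, which is the kind of contextual evidence a CRF can weight during training.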
5.5.2 Design
To test our second hypothesis, we create a crowd-labeled data set comprising 10,000 MedHelp sentences
(MH10K), and an RN-labeled data set comprising 1,000 CureTogether sentences (CT1K). Using
the procedures described in § 5.4, these cost approximately $600 and $150, respectively. We train two
models, a dictionary and a CRF, on the MedHelp data set (MH10K), and evaluate performance via
5-fold cross validation; we compare the output of MetaMap, OBA and TerMINE directly. Finally, we compare
the performance of all 5 models against the CureTogether gold standard (CT1K).
5.5.3 Results
Table 5.3 shows the performance of MetaMap, OBA, TerMINE, the dictionary model and ADEPT on
MH10K, MH1K and CT1K. ADEPT achieves the maximum score in every metric, bar recall. Moreover,
its high performance carries over to the CureTogether test corpus, indicating adequate generalization
from the training data. Figure 5.2 provides illustrative examples of the models’ performance on sample
sentences from MH1K.
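For reference, the five metrics reported in Table 5.3 can all be derived from a binary confusion matrix. The sketch below uses the standard textbook definitions; the function names and the toy counts are ours, and the code returns fractions, whereas Table 5.3 reports F1, precision, recall and accuracy as percentages.

```python
import math

def binary_metrics(tp, fp, fn, tn):
    """Standard binary classification metrics from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Matthews correlation coefficient, in [-1, 1].
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1,
            "accuracy": accuracy, "mcc": mcc}

def f1_from_pr(precision, recall):
    """Harmonic mean of precision and recall (any consistent scale)."""
    return 2 * precision * recall / (precision + recall)

# Toy counts, invented for illustration.
m = binary_metrics(tp=8, fp=2, fn=4, tn=86)

# Sanity check against ADEPT's MH10K row: precision 82.66 and recall 74.59
# give an F1 of about 78.42, matching the reported 78.41 up to rounding.
adept_f1 = f1_from_pr(82.66, 74.59)
```

MCC is useful here because non-medical tokens heavily outnumber medical ones, so accuracy alone can look high even for a weak annotator.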
Failure Analysis
While ADEPT’s results are promising, assessing points of failure is useful for future improvements and
implementations. Figure 5.3 plots term classification accuracy against logged term frequency in both test
corpora. We observe that while most terms are always correctly classified, a number of terms (∼650) are
never classified correctly. Of these, almost all (>90%) appear only once in the test corpora. A LOWESS
17 Note: this is actual output from our final classifier.
18 http://nlp.stanford.edu/software/CRF-NER.shtml
ADEPT: it says proliferative ductal hyperplasia without atypia and non-proliferative duct ecstasia without carcinoma
Dictionary: it says proliferative ductal hyperplasia without atypia and non-proliferative duct ecstasia without carcinoma
MetaMap: it says proliferative ductal hyperplasia without atypia and non-proliferative duct ecstasia without carcinoma
OBA: it says proliferative ductal hyperplasia without atypia and non-proliferative duct ecstasia without carcinoma
TerMINE: it says proliferative ductal hyperplasia without atypia and non-proliferative duct ecstasia without carcinoma

ADEPT: last summer i was at home with my daughter who is now 2
Dictionary: last summer i was at home with my daughter who is now 2
MetaMap: last summer i was at home with my daughter who is now 2
OBA: last summer i was at home with my daughter who is now 2
TerMINE: last summer i was at home with my daughter who is now 2

ADEPT: in my case the woman my husband had an affair with reassured him twice she had no stds
Dictionary: in my case the woman my husband had an affair with reassured him twice she had no stds
MetaMap: in my case the woman my husband had an affair with reassured him twice she had no stds
OBA: in my case the woman my husband had an affair with reassured him twice she had no stds
TerMINE: in my case the woman my husband had an affair with reassured him twice she had no stds

ADEPT: i had a chest xray done and they said there was something in my lung
Dictionary: i had a chest xray done and they said there was something in my lung
MetaMap: i had a chest xray done and they said there was something in my lung
OBA: i had a chest xray done and they said there was something in my lung
TerMINE: i had a chest xray done and they said there was something in my lung

ADEPT: mgmt retail sales not overweight good almost great posture
Dictionary: mgmt retail sales not overweight good almost great posture
MetaMap: mgmt retail sales not overweight good almost great posture
OBA: mgmt retail sales not overweight good almost great posture
TerMINE: mgmt retail sales not overweight good almost great posture
Figure 5.2: Sample sentences labeled by ADEPT, the dictionary, MetaMap, OBA and TerMINE.
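Table 5.3: Annotator performance against the crowd-labeled data set and the gold standards. Maximum column values are indicated in bold.
Validation data set | Annotator | F1 | Precision | Recall | Accuracy | MCC | Parameters
Crowd-labeled MH10K | MetaMap | 32.64 | 21.88 | 64.20 | 70.44 | 0.24 | Default
Crowd-labeled MH10K | MetaMap | 34.97 | 25.45 | 55.85 | 76.83 | 0.26 | SNOMED CT
Crowd-labeled MH10K | MetaMap | 34.88 | 24.48 | 60.63 | 74.75 | 0.26 | CHV
Crowd-labeled MH10K | OBA | 43.77 | 30.20 | 79.53 | 77.21 | 0.39 | Default
Crowd-labeled MH10K | OBA | 43.23 | 36.15 | 53.76 | 84.25 | 0.35 | SNOMED CT
Crowd-labeled MH10K | Dictionary | 46.18 | 32.34 | 80.75 | 79.02 | 0.42 |
Crowd-labeled MH10K | ADEPT | 78.41 | 82.66 | 74.59 | 95.42 | 0.76 |
MedHelp gold standard MH1K | MetaMap | 37.73 | 28.03 | 57.67 | 77.82 | 0.29 | SNOMED CT
MedHelp gold standard MH1K | OBA | 45.78 | 32.10 | 79.31 | 78.04 | 0.41 | SNOMED CT
MedHelp gold standard MH1K | TerMine | 42.35 | 52.67 | 35.41 | 88.77 | 0.37 |
MedHelp gold standard MH1K | Dictionary | 37.30 | 26.34 | 63.89 | 74.98 | 0.29 |
MedHelp gold standard MH1K | ADEPT | 78.33 | 82.55 | 74.53 | 95.20 | 0.76 |
CureTogether gold standard CT1K | MetaMap | 39.12 | 29.33 | 58.57 | 74.13 | 0.27 | SNOMED CT
CureTogether gold standard CT1K | OBA | 47.28 | 33.56 | 79.91 | 74.74 | 0.40 | SNOMED CT
CureTogether gold standard CT1K | TerMine | 43.09 | 53.11 | 36.25 | 86.43 | 0.37 |
CureTogether gold standard CT1K | Dictionary | 38.74 | 27.53 | 65.35 | 70.65 | 0.27 |
CureTogether gold standard CT1K | ADEPT | 77.74 | 78.82 | 76.69 | 93.78 | 0.74 |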
[Figure 5.3 plot. x-axis: ln(frequency) of term in test corpora (0 to 7); y-axis: classification accuracy (%) (0 to 100); legend: 1 term, 10 terms, 100 terms, 500 terms.]
Figure 5.3: Term classification accuracy plotted against logged term frequency in test corpora. Purple (darker) circles represent terms that are always classified correctly; blue (lighter) circles represent terms that are misclassified at least once. A LOWESS fit line to the entire data set (black) shows that most terms are always classified correctly. A LOWESS fit line to the misclassified points (blue/lighter) shows that classification accuracy increases with term frequency.
fit to the points representing terms that were misclassified at least once shows that classification accu-
racy increases with term frequency in the test corpora (and by logical extension, term frequency in the
training corpus). As we might expect, over half (∼51%) of the misclassified terms occur with frequency
one in the test corpora. A review of these terms reveals no obvious term type (or set of term types)
Table 5.4: Examples of ADEPT’s misclassifications in the test corpora.
Frequently Misclassified (FP > 1, FN > 1): baby, bc, condition, doctor, doctors, drs, health, ice, natural, relief, short, strain, weight
Mostly False Positive (FP > 1, FN ≤ 1): accident, decreased, drinks, drunk, exertion, external, healthy, heavy, higher, lie, lying, milk, million, pants, periods, prevention, solution, suicidal . . . [37 more terms]
Mostly False Negative (FP ≤ 1, FN > 1): appointment, clear, copd, hiccups, lack, ldn, massage, maxalt, missed, nurse, physician, pubic, rebound, silver, sleeping, smell, tea, treat, tree, tx . . . [41 more terms]
Infrequently Misclassified (FP ≤ 1, FN ≤ 1): cravings, generic, growing, hereditary, increasing, lab, limit, lunch, panel, pituitary, position, possibilities, precursor, taste, version, weakness . . . [118 more terms]
likely to be incorrectly classified. Indeed, many are typical words with conceivable medical relevance
(e.g., gout, aggravates, irritated). Such misclassifications would likely become less frequent with more training data,
which would allow ADEPT to learn new terms and patterns.
It remains to investigate terms that are both frequent and frequently misclassified. Table 5.4 shows
terms from the test corpora that ADEPT misclassifies at least once. Immediately obvious is the pres-
ence of terms that are medical but generic, such as doctor, doctors, drs, physician, nurse, appointment,
condition, and health. These misclassifications likely stem from ambivalence in the training data; indeed,
Yetisgen-Yildiz and Pratt [201] find that human annotators have low certainty over whether to include
general terms such as these in medical term annotation tasks. Specific instructions to
human annotators on how to handle generic terms, or rule-based post-processing of annotations, could
ameliorate such errors.
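The four row categories of Table 5.4 follow directly from per-term false positive (FP) and false negative (FN) counts. A minimal sketch of that bucketing is given below; the record format, the function names and the toy records are assumptions made for illustration, not the analysis code used in this work.

```python
from collections import Counter

def error_counts(records):
    """Count FPs and FNs per term.

    records: iterable of (term, gold_is_medical, predicted_is_medical).
    """
    fp, fn = Counter(), Counter()
    for term, gold, pred in records:
        if pred and not gold:
            fp[term] += 1
        elif gold and not pred:
            fn[term] += 1
    return fp, fn

def bucket(fp, fn):
    """Map FP/FN counts to the row labels used in Table 5.4."""
    if fp > 1 and fn > 1:
        return "Frequently Misclassified"
    if fp > 1:
        return "Mostly False Positive"
    if fn > 1:
        return "Mostly False Negative"
    return "Infrequently Misclassified"

def bucket_terms(records):
    """Bucket every term that is misclassified at least once."""
    fp, fn = error_counts(records)
    return {term: bucket(fp[term], fn[term]) for term in set(fp) | set(fn)}

# Toy records, invented for illustration.
records = (
    [("doctor", True, False)] * 2 + [("doctor", False, True)] * 2
    + [("milk", False, True)] * 2
    + [("copd", True, False)] * 2
    + [("lab", True, False)]
    + [("pain", True, True)]
)
buckets = bucket_terms(records)
```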
5.6 Example Applications of ADEPT to PAT
To illustrate ADEPT’s efficacy, we present two applications to PAT corpora. The first is to MedHelp’s
Arthritis forum, with an eye to summarizing its important medical concepts. In this application, we com-
pare ADEPT’s output with OBA’s. Our second application is to Forum77, MedHelp’s Addiction: Sub-
stance Abuse forum, in which our goal is to generate a high-level concept map of its medically relevant
content.
5.6.1 Summarizing Important Medical Content in MedHelp’s Arthritis Forum
A simple way of summarizing the medical content in a PAT corpus is to rank all relevant terms by
frequency, and select the top N. Figure 5.4 compares the top 50 medical terms in MedHelp’s Arthritis
forum as determined by ADEPT and the OBA. (We picked OBA instead of MetaMap due to its superior
performance; see Table 5.2.) The terms recovered by ADEPT are both diverse and richly descriptive of
arthritic conditions; in contrast, the majority of terms recovered by the OBA are spurious, and serve only
to demote the rankings of the few relevant terms that it does find.
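The frequency ranking described above amounts to counting extracted terms and keeping the N most common. A minimal sketch follows, assuming the annotator's output is available as a flat list of term strings (the function name and the toy list are ours):

```python
from collections import Counter

def top_terms(extracted_terms, n=50):
    """Rank extracted terms by frequency and return the top n as (term, count)."""
    counts = Counter(term.lower() for term in extracted_terms)
    return counts.most_common(n)

# Toy annotator output, invented for illustration.
terms = ["pain", "arthritis", "Pain", "joints", "pain", "arthritis"]
ranked = top_terms(terms, n=2)
```

Because the ranking depends only on counts, any spurious term that an annotator tags frequently will displace relevant terms from the top of the list.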
5.6.2 Navigating MedHelp’s Substance Abuse Forum (Forum77)
A natural way of acquiring a casual overview of a corpus’ content is to visualize both the important
medical terms, as well as significant relationships between them. Including term relationships imparts an
extra layer of insight to the underlying content. For example, if drug terms tend to co-occur in sentences,
then it is likely that users compare drugs in their discussions. On the other hand, if drug terms tend to
co-occur with symptom terms, then discussions likely document which drugs treat specific symptoms.
To acquire a high-level topography of Forum77’s medical content, we first apply ADEPT to the Forum77
corpus. Filtering out infrequent terms (terms that appear fewer than 10 times in the corpus), we score
connections between the remaining co-occurring terms with the G² metric, which rewards significant (or
interesting) co-occurrence relationships over common ones [78]. We then use Gephi¹⁹, a tool for graph
analysis and visualization, to explore the results interactively.
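As an illustration of the scoring step, the sketch below computes the log-likelihood ratio statistic G2 (G²) for a pair of terms from a 2x2 table of sentence counts. We assume here that the metric cited above is the standard log-likelihood ratio, that co-occurrence is measured at the sentence level, and that term frequency is counted as the number of sentences containing the term; the function names and toy data are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def g2(k11, k12, k21, k22):
    """G2 = 2 * sum(O * ln(O / E)) over a 2x2 contingency table.

    k11: sentences with both terms; k12: first term only;
    k21: second term only; k22: neither term.
    """
    n = k11 + k12 + k21 + k22
    rows = (k11 + k12, k21 + k22)
    cols = (k11 + k21, k12 + k22)
    observed = ((k11, k12), (k21, k22))
    total = 0.0
    for i in range(2):
        for j in range(2):
            o = observed[i][j]
            if o > 0:  # a zero cell contributes nothing
                expected = rows[i] * cols[j] / n
                total += o * math.log(o / expected)
    return 2 * total

def score_cooccurrences(sentences, min_count=10):
    """Score co-occurring term pairs; sentences is a list of sets of terms."""
    n = len(sentences)
    term_freq = Counter(t for s in sentences for t in s)
    keep = {t for t, c in term_freq.items() if c >= min_count}
    pair_freq = Counter()
    for s in sentences:
        for a, b in combinations(sorted(s & keep), 2):
            pair_freq[(a, b)] += 1
    scores = {}
    for (a, b), k11 in pair_freq.items():
        k12 = term_freq[a] - k11
        k21 = term_freq[b] - k11
        k22 = n - k11 - k12 - k21
        scores[(a, b)] = g2(k11, k12, k21, k22)
    return scores

# Toy corpus, invented for illustration (threshold lowered to fit its size).
sentences = [{"suboxone", "taper"}] * 3 + [{"suboxone"}] + [{"xanax"}] * 3
scores = score_cooccurrences(sentences, min_count=2)
```

The resulting pair scores can serve as edge weights when exporting a term graph for interactive exploration.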
Note that what follows is a casual analysis in which we utilize Gephi’s internal filtering and clustering
features to facilitate rapid exploration. Our goal is to illustrate a typical point of departure in exploring
a novel corpus of ADEPT-extracted PAT terms. Figure 5.5 shows a co-occurrence graph over ADEPT-
extracted Forum77 terms, with node labels omitted to illustrate the underlying graph structure. Immediately
obvious is the presence of two large, interlinked clusters (dark and light blue). A third cluster
(dark green) is more independent. We examine each of these clusters in greater detail by filtering out
non-member nodes, and recalculating the graph layout.
Figure 5.6 shows the largest (light blue) cluster with node labels. This cluster appears to detail gen-
eral aspects of addiction related to detoxification: suboxone and methadone are synthetic opioids used
in opioid-replacement therapy; detox and taper are direct detoxification references; many other nodes
19 http://gephi.github.io
ADEPT OBA pain have
arthritis pain symptoms doctor
joints arthritis knees like
feet help hands time
swelling years neck symptoms knee right
fingers did ankles work
legs blood tests joint joint good
rheumatologist does diagnosed need
swollen months meds joints
disease test surgery knee
treatment day leg started
shoulder ago spine try
doctor is a inflammation tests
wrists better test left
stiffness hope painful long
diagnosis year arms disease
toes bad fatigue rheumatologist
shoulders diagnosed joint pain here
wrist days bone hands
muscles old arm sure
osteoarthritis weeks foot knees hip doctors
medication normal negative cause positive lot
skin got cold make
Figure 5.4: Top 50 terms, ranked by frequency, derived from MedHelp’s Arthritis forum as determined by ADEPT (left) and OBA (right). Terms unique to their respective portion of the list are shown in bold. Terms occurring in both lists are linked with a line. The gradient of these lines shows that all co-occurring terms, bar three, are more highly ranked by ADEPT.
Figure 5.5: A graph showing important terms in Forum77 (nodes), and significant co-occurrence relationships between them (edges). Node size is proportional to degree, while colors indicate clusters. Node labels are omitted for legibility; instead, we examine main clusters in depth in subsequent figures.
detail withdrawal symptoms (anxiety, cramps, body aches, muscle-tremors, muscles-restlessness, etc.).
Overall, this cluster suggests that Forum77 hosts detailed discussions on the process and mechanisms
of opiate withdrawal.
Figure 5.7 illustrates the second-largest (dark blue) cluster. This cluster is almost clique-like, and its
core comprises primarily addictive prescription drugs: oxy (oxycodone), hydro (hydrocodone), xanax,
vicodin, benzo (benzodiazepine), etc. This cluster also details several withdrawal symptoms (tired, chills,
Figure 5.6: The largest cluster in Figure 5.5 suggests that discussions frequently involve detoxification from prescription drugs.
Figure 5.7: The second-largest cluster in Figure 5.5 suggests that discussions frequently pair specific drugs and the withdrawal symptoms that they cause.
Figure 5.8: The third-largest cluster in Figure 5.5 contains medically relevant terms from Thomas’ Recipe: a user-developed schedule for medication-assisted opioid withdrawal.
flu, etc.) as well as body parts (head, legs, skin, etc.), suggesting a great deal of discussion around
specific prescription opioids and their associated withdrawal symptoms.
Finally, Figure 5.8 shows the third-largest cluster (dark green). Like Figure 5.7, the structure is clique-
like. Its nodes constitute a combination of withdrawal symptoms (runny nose, general aches, leg cramps
etc.), terms representing wellness activities or supplements (mild exercise, cycling, vitamin b6, zinc,
l-tyrosine etc.), and non-opiate drugs (ativan, imodium, benzodiazepine). In hindsight, it is clear that
this cluster represents medically relevant terms from Thomas’ Recipe: a user-developed schedule for
medication-assisted opioid withdrawal that is popular on Forum77. We discuss Thomas’ Recipe in depth
in § 6.8.1.
These casual explorations of co-occurring ADEPT-extracted Forum77 terms suggest that withdrawal
is a primary topic of discussion on the forum (Figures 5.6, 5.7). Moreover, users discuss specific drugs,
primarily prescription drugs (Figure 5.7). Without prior knowledge of Thomas’ Recipe (§ 6.8.1), guessing
that Figure 5.8 partially represented a detoxification protocol would be difficult, although the nodes opiate
detox and at-home self-detox might have provided a clue. Overall, our later work in this thesis shows
that these explorations yield accurate, although incomplete, insights into Forum77’s primary content.
5.7 Conclusion
Our work on ADEPT was prompted by the observation that despite the abundance of PAT, tools for
extracting medically relevant content from it are lacking. This, in turn, restricts general exploration and
hypothesis generation over PAT corpora. One major limitation to building such tools is a lack of large,
annotated corpora for training and testing statistical models.
Our first result addresses this by showing that a crowd of non-experts is an adequate substitute for
medical experts in the PAT medical term identification task (§ 5.4). Through paying careful attention to
existing crowdsourcing design principles, we were able to design a prompt and task that resulted in labels
of comparable quality to those produced by experts (§ 5.4.1). Combined and aggregated according to
a corroborative vote, Turker responses achieve an F1-Score of 84% against our RNs’ gold standard
(§ 5.4.2). As crowds of non-experts are much easier to coordinate than medical experts, this opens
up the option of building large, labeled PAT corpora of high quality both quickly and cheaply. We note,
however, that not all tasks may be suitable to crowd labeling; those that are more subjective or require
specialized knowledge may involve particularly challenging task design (§ 5.4.4).
Next, we addressed the issue of automating the PAT medically relevant term identification task
(§ 5.5). ADEPT, our CRF classifier trained on crowd-labeled data, dramatically outperforms existing
tools MetaMap, OBA and TerMINE (§ 5.5.3). Moreover, ADEPT’s performance carries over to an in-
dependently sourced PAT gold standard from CureTogether. While one limitation of ADEPT is that it
does not identify specific term types (e.g., drugs, symptoms), it finds terms of medical relevance
accurately (F1 of roughly 78% on both gold standards; Table 5.3). This makes it a useful and novel tool for summarizing and exploring PAT corpora (§ 5.6).
We attribute ADEPT’s success to the suitability of sentence-level, context-sensitive learning models
like CRFs to PAT medical term identification tasks. Our dictionary, trained on the same data as ADEPT,
achieves high recall because it collects many medical terms from training data, but it achieves low pre-
cision because it cannot discriminate between relevant and irrelevant invocations of these terms. Unlike
ADEPT, for example, the dictionary cannot learn that the word “sugar” is of particular medical relevance
when it co-occurs with the word “diabetes”. The third sentence in Figure 5.2 suggests that context-based
relevance detection may be problematic for MetaMap and OBA, too. In this sentence, the term “case” is
annotated because of its membership in SNOMED CT as a medically relevant term pertaining either to
a “situation” or a “unit of product usage”.
In concert, our contributions in this chapter constitute an alternative approach to medical term anno-
tation and identification. In Chapter 7 we leverage the lessons learned in this chapter to extract a specific
type of medical term from Forum77 discussions: users’ drugs of choice. First, however, in Chapter 6 we
investigate users’ motivations for participating in Forum77.
Chapter 6
What do People Seek on Forum77?
Forum77 is the largest community on MedHelp, which indicates that it provides something that users
need and find useful. But what do people seek through participation on Forum77? Insight into how and
why users engage with Forum77 is instructive in its own right, and also provides a valuable template
for planning future, targeted explorations of the corpus. Our goal in this chapter is to elucidate users’
motivations for initiating discussions on Forum77.
We first motivate our focus on the topic of addiction (§ 6.1) before covering related work (§ 6.2) and
summarizing the data sets used in this chapter (§ 6.3). Next, we conduct a thematic analysis, developing
a taxonomy of users’ reasons for participation (§ 6.5). In congruence with prior work, the two driving
motivations are seeking emotional support and seeking informational support. Within these categories
are sub-categories specific to the topic of substance abuse, such as seeking information on withdrawal
and expressing concern about relapse. The most prevalent label, accounting for over 30% of all initiating
posts, is the update: a status log devoid of requests for feedback.
Next, we discuss the training and evaluation of two binary statistical classifiers that can distinguish
emotional from informational posts (§ 6.6), and update from non-update posts (§ 6.7). Our classifiers
perform well, achieving F1-scores of 80.12% and 76.54% for emotional vs. informational and update vs.
non-update, respectively.
Finally, we present the results of applying these classifiers to the entire Forum77 corpus (§ 6.8). We
compare and contrast features such as thread longevity and response rates across thread categories.
We also present and discuss Thomas’ Recipe: a highly prevalent informational support artifact on Fo-
rum77 that we came across in the course of our analyses. We conclude that Forum77 serves both as
a user-generated and tested repository of medically-explicit knowledge on managing substance abuse
withdrawal, as well as a public platform where people broadcast their progress as a mechanism for seek-
ing emotional support and encouragement from others. In this latter capacity, Forum77 is similar to the
offline mutual help groups Alcoholics Anonymous (AA) and Narcotics Anonymous (NA); in its information
providing capacity, however, Forum77 is quite distinct, as AA and NA explicitly eschew the sharing of
medical information [133].
6.1 Why Study Addiction?
We focus on the topic of addiction for three primary reasons, which we expand on below. The first is that
addiction is highly prevalent. As such, any insights or results that arise from studying addiction could be
useful and impactful to a large number of people. Second, addiction is highly stigmatized. As a result,
people suffering from addiction are likely to turn online for help, and addiction-related PAT is likely to
contain information that is difficult to acquire through traditional medical channels. Finally, people are
turning online en masse for help with addiction. Forum77 is MedHelp’s largest forum, but, as we show
in Table 6.1, only one of several online forums dedicated to the topic of substance abuse recovery.
6.1.1 Addiction is Highly Prevalent
Drug and alcohol use disorders, in particular the escalating misuse of prescription drugs, present one
of the most pressing public health issues of the day. Addiction affects 16% of Americans aged 12 or
older (about 40 million people), far exceeding the number of people afflicted with heart disease (27
million), diabetes (26 million), or cancer (19 million) [4]. Deaths due to accidental drug overdose now
exceed deaths due to motor vehicle accidents [251]. In 2008, more than 36,000 deaths were due to drug
overdoses; of these, opioid pain reliever (OPR) overdoses accounted for more than heroin and cocaine
combined [3, 249]. Taking into account workplace, criminal justice, and health care costs, the burden of
prescription drug abuse on the U.S. economy was $56-$57 billion in 2006-2007 [27,115].
6.1.2 Addiction is Highly Stigmatized
Recent medical research argues that drug dependence is a chronic, relapsing and remitting disorder
that behaves just like other chronic illnesses with a behavioral component, such as Type II Diabetes
Mellitus [169]. Despite this, prescription opioid abuse is a highly stigmatized condition: the opinion
that opioid misuse is a flaw of a person’s moral character, rather than a legitimate medical condition, is
common [187].
This stigma carries into the medical profession. In general, medical professionals feel that addiction
lacks parity with other medical conditions in terms of prestige and importance [176]. In addition, there
is a mutual mistrust between addiction patients, who feel that they are mistreated and stigmatized and
receive poor medical care as a result, and their providers, who find it difficult to evaluate whether patients’
requests for opioids stem from genuine “medically indicated” needs or from addictive behavior [174].
The stigma is compounded by the fact that the most effective treatments for opioid use disorders are
methadone or buprenorphine-assisted replacement therapies, which require patients to continue taking
prescription opioids under the supervision of a medical professional [187]. Finally, as pain treatment
is often the starting point of a longer addiction to prescription opioids, it is common for people with
prescription drug use disorders to acquire their drug of choice via a doctor’s prescription [229,249].
6.1.3 People are Turning Online for Help with Addiction
People with substance use disorders are no exception to the trend of online health forum participation.
Myriad discussion forums focus on addiction recovery and are widely utilized. Table 6.1 describes a
representative sample of these that we curated during a brief search. The result of this is a massive,
growing and (until now) unexamined corpus of text in which users document their experiences with
addiction and their attempts at overcoming it.
6.2 Related Work
Emotional and informational support consistently emerge as the primary reasons for user engagement
in online health communities [36,47,86,122,131,148,149,162,211,243,250,258]. However, little work
attempts to extend analyses of users’ support giving, seeking, or reasons for participation to data sets
that are too large for manual annotation. We discuss this work here, referring the reader to § 2.2.3 for a
thorough discussion of users’ reasons for participation in online health communities, and to § 3.5 for a
summary of prior work on thematic analyses of PAT.
Table 6.1: Summary statistics of a representative sample of online health communities focused on addiction recovery. We identified sites through Google searches and gathered statistics (if available) from site pages. Data current as of 3/1/2014.
Name (URL) | Description | Members | Posts | Threads | Join to post | Join to read
Forum77 (medhelp.org/forums/Addiction-Substance-Abuse/show/77) | Single forum dedicated to recovery in general. | ∼51,153 | ∼740,046 | ∼80,529 | Y | N
The Suboxone Talk Zone (suboxforum.com) | Multiple forums focused on issues related to Suboxone. | ∼11,000 | ∼77,000 | ∼8,900 | Y | N
Addiction Recovery Guide (addictionrecoveryguide.org) | Collection of resources for assisting recovery; includes online forum. | N/A | 700,000 | N/A | Y | N
Addiction Survivors (addictionsurvivors.org) | Forums focus on opiate, alcohol, benzodiazepine, and stimulant addiction. | ∼15,870 | ∼270,000 | ∼17,500 | Y | N
Cyber Recovery (cyberrecovery.net) | Multiple forums dedicated to recovery in general. | 5,078 | 154,975 | 23,000 | Y | Y
Sober Recovery (soberrecovery.com/forums) | Multiple forums dedicated to alcoholism and drug abuse recovery. | 132,964 | >3.5 M | 234,311 | Y | N
Wang et al. [250] successfully use workers on Amazon’s Mechanical Turk¹ (Turkers) to quantify the
amount of emotional and informational support contained in both initiating and response posts on
Breastcancer.org². They then use these data to train regression models that achieve correlation scores of 0.76 and
0.80 for emotional and informational content, respectively. Investigating whether certain types of support
are important for member retention, they find that receiving high levels of emotional support predicts
lower dropout risk.
Biyani et al. [28] manually labeled ∼1,000 sentences from the Cancer Survivor’s Network forum³ as
either emotional or informational. An ensemble classifier trained on this data achieved an F1-score of
84% (88% for emotional support, 77% for informational support). Their goal was to determine whether
influential and regular community members differed in terms of the types of support they provided on
the forum. They found that influential members offer significantly more emotional support than regular
community members.
1 http://www.mturk.com
2 http://breastcancer.org
3 http://csn.cancer.org
To our knowledge, no other prior work attempts to automatically classify informational and emotional
support in PAT. However, some work does investigate methods for labeling or featurizing these data at
scale. Vlahovic et al. [248] found that Turkers produced good labels for emotional and informational
support on posts from a breast cancer support forum. Finally, both Owen et al. [188] and Alpers et
al. [12] evaluate the efficacy of using LIWC⁴ to automatically identify emotions expressed in posts on
breast cancer support forums. While both find the tool reasonably accurate, they do not attempt to
analyze users’ motives for posting.
Unlike Wang et al. [250] and Biyani et al. [28], we investigate and discuss a more detailed taxonomy
of users’ reasons for participation. In addition to automatically classifying informational and emotional
support, we are also able to train a classifier to identify a specific sub-category of emotional support
posts: the update. While we leave the analysis of response post content to future work, we do investigate
response levels to different categories of initiating posts.
6.3 Data
For clarity, we briefly summarize the data sets used in this chapter.
6.3.1 Thematic Analysis Development Dataset
We use our Forum77 data set (§ 4.1.2) for this work. For our thematic analysis (§ 6.5), we used ∼1,000
initiating posts sampled uniformly at random for each iteration of the analysis, and evaluated inter-
annotator agreement on a 200-post subsample. With a total of 3 iterations, we used ∼3,000 initiating
posts sampled uniformly at random to conduct the thematic analysis.
6.3.2 Labeled Training & Testing Dataset
We created a data set for labeling and classifier training as follows: first, we curated a sample of initiating
posts from recurring Forum77 users by randomly sampling 200 users who had initiated 5 or more posts.
(We restricted the sample to recurring users in order to ensure a more balanced representation of tax-
onomy labels, as we observed in our thematic analysis (§ 6.5) that certain labels (e.g., support giving)
tend to appear only later in a user’s tenure.) Our 200 sampled users authored ∼32,000 initiating posts,
4 http://www.liwc.net
of which we took a random sample of 1,000 for subsequent coding. To prevent any user from dominating
the sample, we admitted no more than 30 posts per user.
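One way to realize a random sample with a per-user cap is to shuffle the candidate posts and admit each post only while its author is under the cap. The sketch below is an assumption about how such a cap could be enforced, not a record of the exact procedure used here; the function name, input format and toy data are hypothetical.

```python
import random
from collections import defaultdict

def sample_with_user_cap(posts, sample_size=1000, cap=30, seed=0):
    """Sample up to sample_size posts, admitting at most cap posts per user.

    posts: list of (user_id, post_id) pairs.
    """
    rng = random.Random(seed)
    shuffled = posts[:]
    rng.shuffle(shuffled)  # random order, so admission is unbiased up to the cap
    per_user = defaultdict(int)
    sample = []
    for user, post in shuffled:
        if per_user[user] < cap:
            sample.append((user, post))
            per_user[user] += 1
            if len(sample) == sample_size:
                break
    return sample

# Toy data, invented for illustration: one prolific user and four others.
posts = [(0, i) for i in range(100)] + [(u, i) for u in range(1, 5) for i in range(10)]
sample = sample_with_user_cap(posts, sample_size=20, cap=5, seed=1)
```

Without the cap, the prolific user in the toy data would be expected to contribute most of the sample, since that user authors 100 of the 140 posts.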
6.4 Who Posts?
Traditional demographic information such as gender, age, race and socioeconomic status is rarely dis-
cernible from Forum77 posts. However, we were able to determine other aspects of identity, namely
whether a user was posting on their own behalf or on behalf of someone else. We noted that most
users initiate posts in which they are the subject; occasionally, however, users initiate posts in which
someone else is the subject. These proxies range from concerned parents, to members congratulating
each other on clean time, to loved ones posting on behalf of an incapacitated member.
We defined the subject of the post to be self if the author is writing about her own addiction, associate
if the author is writing about someone else’s addiction, or n/a if this information is absent or indeterminate.
Two authors labeled our 1,000 initiating post training data sample with the subject label. Inter-annotator
agreement was 92%, with a Cohen’s Kappa of 0.77.
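For readers unfamiliar with the statistic, Cohen's kappa corrects raw agreement for the agreement expected by chance given each annotator's label frequencies. The sketch below computes it on a small invented example; the label lists are illustrative only and are not our annotation data.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the annotators' marginal label rates.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0                     # both annotators used a single label
    return (observed - expected) / (1 - expected)

# Invented example: 10 posts, the annotators disagree on one of them.
annotator_a = ["self"] * 8 + ["associate"] * 2
annotator_b = ["self"] * 7 + ["associate"] * 3
kappa = cohens_kappa(annotator_a, annotator_b)
```

In this toy case raw agreement is 90% but kappa is lower, because two annotators who mostly assign the majority label would agree often by chance alone.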
The distribution of subject labels over the sample data set is: 85% self, 8% associate, and 7% n/a.
While most users post on their own behalf, a significant minority post on behalf of another. Moreover,
the number of posts in which the subject was indeterminate was higher than we expected. Such posts
typically consist of social chatter (e.g., talking about sports). As these results do not suggest anything
interesting or novel, we do not pursue this analysis at scale.
6.5 Users’ Objectives in Initiating Discussions
Thematic analyses are frequently used on PAT to identify structure and patterns in user behavior and
user-generated content (§3.5). To develop a taxonomy describing users’ objectives in initiating discus-
sions on Forum77, we use an adapted General Inductive Approach [236]: over the course of read-
ing ∼3,000 posts, two authors iteratively co-developed a taxonomy describing recurrent and emergent
themes in the posts. On each iteration, the authors used the taxonomy to independently label 1,000
randomly sampled posts. They then revised the rubric based on subsequent error analysis and inter-
annotator agreement scores calculated on a 200-post subsample. The authors executed a total of three
iteration cycles. Figure 6.1 illustrates our thematic analysis process.
[Figure 6.1 (flowchart). Recoverable node labels: Thematic Analysis; Schema; Sample (n=1,000); Label Set #1 (n=600); Label Set #2 (n=600); Error Analysis; Consult Addiction Specialist; Final Schema.]
Figure 6.1: Thematic analysis process. Orange edges indicate the iterative component of the analysis.
Table 6.2 presents our final taxonomy, which was reviewed and approved by an addiction specialist,
along with label prevalence in our labeled training data set. Table 6.3 presents sample text from posts in
each category in the taxonomy.
6.6 Classifying Informational vs. Emotional Support
6.6.1 Training Dataset Annotation and Agreement
Having finalized our taxonomy, two annotators used it to each label 600 of our 1,000 initiating post train-
ing data sample (§6.3.2). We annotated each post with its primary purpose using the most specific label
available. Inter-annotator agreement for specific purpose labels (Label in Table 6.2) was moderate, with
agreement of 67% and Cohen’s kappa [50] of 0.62. Inter-annotator agreement on the three broader cat-
egories informational, emotional and neither (Category in Table 6.2), however, was high with agreement
of 87% and a Cohen’s kappa [50] of 0.78.
Table 6.2: Annotator-derived taxonomy for users’ objectives in initiating a post, with % prevalence in the 1,000-post labeled sample on the right. Note that (1) labels are mutually exclusive, and (2) “w/d” stands for “withdrawal”.
Category | Label | Description | %
informational | w/d expectations | Questions on what to expect when going through withdrawal, especially regarding symptom severity and duration. | 11.8
informational | w/d management | Questions about how to manage withdrawal and relieve symptoms. | 8.7
informational | w/d method | Soliciting advice on how best to quit drug(s) of choice. Topics include method of quitting (cold turkey vs. tapering) and scheduling a time to detox. | 7.8
informational | general information | Subject posts medical questions unrelated to withdrawal. | 8.5
emotional | seek support | Specific requests for support (like keeping in thoughts, prayers, getting in touch). | 4.6
emotional | give support | Primary purpose of the post is to offer encouragement to others, often via relating a personal story of overcoming addiction. | 9.9
emotional | update | Posts that comprise a log-like report of the user’s current status. These are often highly detailed and contain no requests for feedback or support. | 35.5
emotional | general guidance | Subject posts non-medical questions to the community. These often comprise advice for personal relationships and scenarios requiring moral judgement. | 5.0
neither | relapse concern | Subject is worried that she is going to relapse. While rare, these posts typically forecast relapse due to a required medical procedure that will require prescription pain medication. These posts varied in their information vs. support leanings, so we excluded them from either category. | 2.8
neither | n/a | Impossible to speculate on the purpose of the post. | 5.4
6.6.2 Classifier Training
To identify posts as either primarily informational or primarily emotional, we built a logistic regression
classifier (which outperformed Support Vector Machine and Naive Bayes classifiers) using the Stanford
CoreNLP toolkit5. For each post, we used the following features: the number of question sentences,
content unigrams and bigrams, positive and negative word counts with polarity score ≥ 0.8 in Senti-
WordNet [19], and number of days clean, if stated. The last feature was determined by applying the
patterns “X days/weeks/months clean” and “on day X” to the post text. A full feature list is documented in
Appendix B.
5http://nlp.stanford.edu/software/corenlp.shtml
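The days-clean feature lends itself to a short illustration. The sketch below is an assumption-laden approximation of the two patterns named above, not the extractor used in this work: the regular expressions, the function name `days_clean`, and the conversion of weeks and months to days (7 and 30 days respectively) are our own choices for the example.

```python
import re

# Conversion factors are an assumption; a 30-day month is a simplification.
_UNIT_DAYS = {"day": 1, "week": 7, "month": 30}

# "X days/weeks/months clean"
_CLEAN_RE = re.compile(r"\b(\d+)\s*(day|week|month)s?\s+clean\b", re.IGNORECASE)
# "on day X"
_ON_DAY_RE = re.compile(r"\bon\s+day\s+(\d+)\b", re.IGNORECASE)

def days_clean(text):
    """Return the stated number of days clean, or None if the post states none."""
    match = _CLEAN_RE.search(text)
    if match:
        return int(match.group(1)) * _UNIT_DAYS[match.group(2).lower()]
    match = _ON_DAY_RE.search(text)
    if match:
        return int(match.group(1))
    return None
```

For instance, `days_clean("I am 3 weeks clean today")` yields 21 under the 7-day conversion, and a post with neither pattern yields `None`, matching the "if stated" qualifier on the feature.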
Table 6.3: Descriptions and samples of taxonomy labels. Samples are synthesized in order to preserve user privacy.
Label | Description (+ Additional Notes) | Synthesized Sample
w/d expectations | What to expect while going through w/d. (Typically users will ask how long symptoms will last, whether the symptoms are normal etc.) | I stopped long term methadone 12 days ago. I was wondering if anyone knows how long the anxiety RLS and hot/cold last? The other symptoms rnt too bad...
w/d management | How to handle w/d symptoms. Implies w/d e. (Typically users will try to source ideas for alleviating pain, RLS etc.) | I’m wondering about the Amino Acid protocol and Thomas recipe. What would be the most important to take from day 1 to 4 during the worst W/D symptoms? I know I suffer the most with RLS and chemical chills [...]
w/d method | User seeks information on how to quit a substance. (Include questions like whether to go c/t vs. taper, requests for tapering schedules or advice etc.) | I am taking 5000mg of vicodin currently daily can anyone help me with this?
general information | User seeks informational advice that is not related to quitting/withdrawal. (Several possibilities, including questions about how much would it take to overdose etc.) | I’m curious as to how long people were addicted/dependent to their DOC. I know using for longer makes it harder to quit, and each time you quit WDs are harder than before. As for me, I had a 12 year run with vics/oxies.
seek support | User explicitly requests emotional support from the community. (Request for emotional support should be explicit. Typically users will ask for help or prayers or thoughts.) | For those of you who are prayer warriors, please could you pray for my friend, for recovery and protection. Could you also please pray for his family - they are in a very hard place right now. Thank you!
give support | User imparts a strong message of encouragement to the community. (Look for terms like “so I just wanted everyone to know that it’s possible and you can do it”) | Hey y’all! Well today the depression paid me visit but I kept it caged! Anxiety about 20% Did a 2.5 mile run and that helped tons. I can’t say it enough: exercise really helps withdrawals. If you can then DO IT! When the wds hit don’t crawl into bed - get up and move!
update | Update the community on the user’s status | The only reason I’m not getting more is the stress involved in getting them and setting up a supply because you can’t have just one. WD today are ok not too bad. It’s my neck that’s killing me and my body laughing at the Advil I took.
general guidance | Non-medical advice that doesn’t fall into any of the above categories. (Typical examples include questions of how to deal with telling spouses about addiction, whether to cut off a family member etc.) | Do any of you guys have experience with giving a husband an ultimatum? It seems simple: Get treated or you’re out. But with 3 young children it’s actually quite complicated. Help.
relapse concern | Often patients claiming to be clean but need a medical procedure that will require pain meds. | i had an accident yesterday that got me stuck in the emergency room. today i’m 21 days off my roxies [...]. i’m scared of going back because I know i’ll be given pain meds [...]
n/a | Impossible to determine | I’ve been away for few days and everything seems different. Anyway I hope everyone is doing great.
6.6.3 Classifier Performance
The final classifier performs well, achieving an accuracy of 80.98% in 10-fold cross validation versus
a baseline of 59.7% in which every post is labeled with the majority class. Table 6.4 shows precision,
recall, and F1 scores averaged over all 10 folds.
Table 6.4: Classifier performance for labeling initiating posts as seeking informational support or emotional support. Performance scores are averaged over 10 folds.
Label | Precision | Recall | F1
support | 84.57 | 83.40 | 83.84
information | 76.18 | 77.12 | 76.41
Average | 80.37 | 80.26 | 80.12
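As a reading aid for Table 6.4, the sketch below shows how per-class precision, recall and F1 follow from confusion counts, and checks that the Average row agrees with an unweighted (macro) mean of the two class rows to within 0.01. That the Average row is a macro mean is our inference from the numbers, not something stated in the text. The function names are ours, and per-fold confusion counts are not available here, so only the published per-class scores are used.

```python
def precision_recall_f1(tp, fp, fn):
    """Per-class scores from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def macro_average(rows):
    """Unweighted mean of (precision, recall, F1) rows, one row per class."""
    n = len(rows)
    return tuple(sum(row[i] for row in rows) / n for i in range(3))

# Per-class rows copied from Table 6.4 (support, information).
table_6_4 = [(84.57, 83.40, 83.84), (76.18, 77.12, 76.41)]
avg_precision, avg_recall, avg_f1 = macro_average(table_6_4)
```

Because the table reports scores averaged over folds, the F1 column need not equal the harmonic mean of the averaged precision and recall in the same row.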
6.7 Classifying Updates vs. Non-updates
6.7.1 Classifier Training
To automatically label all posts with update or non-update labels, we again built a logistic regression
classifier, using the same training and testing dataset from § 6.6.1. The non-update class contains all
posts labeled neither update nor n/a. We added two features to those used in our informational vs.
emotional classifier (§ 6.6): whether the post mentions a number of days (using the pattern: “day” or “days”
followed by a number), and time elapsed (days) since the user’s last initiating post. Table 6.2 shows that
update posts make up roughly a third (35.5%) of the labeled sample, so non-update posts outnumber
them. To compensate for this class imbalance, during classifier training we randomly sub-sample such
that non-update post quantity is at most 1.5x that of update posts. We do not change the test set.
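The sub-sampling step can be sketched as below. This is an illustrative version under stated assumptions: training examples are represented as `(post_id, label)` pairs, and the function name `subsample_majority` and its parameters are hypothetical. Only the 1.5x cap comes from the text.

```python
import random

def subsample_majority(examples, minority_label="update", max_ratio=1.5, seed=0):
    """Randomly drop majority-class examples until they number at most
    `max_ratio` times the minority class. Apply to training folds only."""
    rng = random.Random(seed)
    minority = [ex for ex in examples if ex[1] == minority_label]
    majority = [ex for ex in examples if ex[1] != minority_label]
    limit = int(max_ratio * len(minority))
    if len(majority) > limit:
        majority = rng.sample(majority, limit)   # sample without replacement
    combined = minority + majority
    rng.shuffle(combined)
    return combined

# Synthetic toy training fold: 20 update posts and 60 non-update posts.
toy_train = [(f"post{i}", "update") for i in range(20)]
toy_train += [(f"post{i}", "non-update") for i in range(20, 80)]
balanced = subsample_majority(toy_train)
```

On the toy fold this keeps all 20 update posts and 30 of the 60 non-update posts. Leaving the test fold untouched keeps the reported accuracy comparable to the majority-class baseline.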
6.7.2 Classifier Performance
Our classifier achieves an accuracy of 78.40% compared to the majority-class baseline accuracy of
62.55% in 10-fold cross validation. Table 6.5 shows precision, recall and F1 scores.
Table 6.5: Classifier performance labeling posts as either update or non-update. Performance scores are averaged over 10 folds.
Label | Precision | Recall | F1
update | 72.15 | 69.29 | 70.09
non-update | 82.36 | 84.16 | 82.99
Average | 77.25 | 76.72 | 76.54
6.8 Results
Users post primarily on their own behalf: In our sample, ∼85% of initiating posts were written by
the author on her own behalf, while only ∼8% were written on behalf of someone else. This differs from
reports by the Pew Research Center that find that ∼50% of online health inquiries are made on behalf
of another [90, 91]. It is possible that the stigmatized nature of addiction prevents users from disclosing
their situation to loved ones, who might otherwise ask questions on their behalf. Another possibility is
that the act of posting on Forum77 during the physically uncomfortable and painful process of withdrawal
is cathartic in and of itself: a benefit unavailable to proxy participants.
Informational and emotional support are the driving motivations for initiating discussion: In
congruence with prior work, our thematic analysis revealed that seeking informational and emotional
support drives user participation on Forum77. Applying our classifier to the entirety of the Forum77 data
set, we find that users seek both types of support in roughly equal proportion: 47% of all initiating posts
seek primarily informational support, while 53% of all initiating posts seek primarily emotional support.
This stands in contrast to our manually-annotated sample (Table 6.2) in which only 36.8% of initiating
posts are informational. Given that our manually-annotated sample comprises recurring Forum77 users,
one potential explanation for this is that longer-tenure or more involved users seek emotional support
more than users who post only a couple of times on the forum.
Informational posts seek explicit medical advice about withdrawal: Users primarily seek knowl-
edge on withdrawal methods, management and expectations in informational posts. Table 6.2 shows
that in our sample, roughly 77% of informational posts specifically discuss the topic of withdrawal. A
casual analysis of informational posts also reveals that the type of information requested by users is
often explicitly medical in nature, such as the pharmacological management of withdrawal. A prevalent
example of this is Thomas’ Recipe, an opioid withdrawal tapering schedule that has evolved on Forum77
over time (§ 6.8.1).
Informational threads receive fewer responses, but have a longer lifespan: Approximately 95%
of both informational and emotional initiating posts receive a response. Of these, initiating posts that pri-
marily seek emotional support receive more responses than those seeking informational support (mean
8.7 vs. 7.4, median 6 vs. 5). The distributions are significantly different (Mann-Whitney-U test, n1 =
39,553, n2 = 38,954, U = 758,376,673, p < 0.001).
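The Mann-Whitney U test compares two samples by rank rather than by mean, which suits heavily skewed quantities such as response counts. The sketch below computes the U statistic with average ranks for ties and a two-sided p-value from the tie-corrected normal approximation, without a continuity correction. It illustrates the test itself and is not necessarily the implementation behind the figures reported above; the function name is ours.

```python
import math

def mann_whitney_u(xs, ys):
    """Return (U statistic for xs, two-sided p-value via normal approximation)."""
    n1, n2 = len(xs), len(ys)
    n = n1 + n2
    pooled = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1                               # pooled[i:j] share one value
        avg_rank = (i + 1 + j) / 2.0             # mean of ranks i+1 .. j
        t = j - i
        tie_term += t ** 3 - t
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0                          # all values identical
    z = (u_x - mean_u) / math.sqrt(var_u)
    return u_x, math.erfc(abs(z) / math.sqrt(2))
```

U ranges from 0 to n1 × n2, and a value near half that product indicates heavily overlapping distributions. With samples in the tens of thousands, even a modest shift can produce a very small p-value.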
The “lifespan” of a discussion is the number of days between its initiating post and the last response
on record. On average, initiating posts that seek primarily informational support have a lifespan 2.5
times as long as those that seek primarily emotional support (mean 74.4 days vs. 27.6 days, median 0
(< 24 hours) vs. 0). The distributions differ significantly (Mann-Whitney-U test, n1 =
37,112, n2 = 41,395, U = 817,010,310, p < 0.001). Most (56% of informational and 59% emotional)
discussions have a lifespan of 0 days (<24 hours). Excluding these, informational discussions remain
dominant in terms of lifespan (mean 170.3 days vs. 68.8 days, median 2 days vs. 1 day).
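The lifespan definition above translates directly into code. In this sketch the function names and the timestamp representation are our own, and returning `None` for a thread with no responses is an assumption, since the text does not say how such threads are treated.

```python
from datetime import datetime

def lifespan_days(initiated_at, response_times):
    """Whole days between the initiating post and the last response on record.

    A thread whose last response arrives within 24 hours has lifespan 0.
    """
    if not response_times:
        return None                    # assumption: undefined without responses
    return (max(response_times) - initiated_at).days

def mean_lifespan(lifespans, exclude_zero=False):
    """Mean over defined lifespans, optionally excluding 0-day discussions."""
    values = [d for d in lifespans
              if d is not None and not (exclude_zero and d == 0)]
    return sum(values) / len(values) if values else None

# Synthetic toy threads: (initiating timestamp, response timestamps).
start = datetime(2010, 1, 1, 10)
toy_threads = [
    (start, [datetime(2010, 1, 1, 15)]),
    (start, [datetime(2010, 1, 2, 9), datetime(2010, 1, 4, 9)]),
    (start, [datetime(2010, 1, 11, 10)]),
    (start, []),
]
toy_lifespans = [lifespan_days(s, r) for s, r in toy_threads]
```

The toy data also shows why the mean and median can diverge as sharply as they do above: a handful of long-lived threads pulls the mean up while most threads stay at 0 days.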
Update posts are the most prevalent type of emotional post: Our classifier identifies some 15,000
out of ∼55,000 (30%) initiating posts as updates. Update posts comprise a log-like status update of
the user’s current condition, and rarely explicitly request any sort of response from the community. For
example:
I was used to taking 8-10 5/325 oxycodones a day. Havent taken any of them since Friday
but I took one Oxy 40mg Sat and one on Sunday morning. Its been almost 24 hrs and not to
bad so far but im sure there is more to come.
Despite the lack of specific requests, update posts do indeed trigger a community response, as we
discuss in the next paragraph.
Update posts have more responses & more unique contributors, but shorter lifespans: To further
assess the role that update posts play, we compared several features of threads that were initiated by
update vs. non-update posts. Update threads have a shorter average lifespan than other threads (mean
= 10.8 days vs. 30.0, sd = 88.8 vs. 151.1; t(435332) = -18.2, p < 0.001). It is possible that the personal
nature of update posts makes them difficult to repurpose. Other differences are small: on average,
threads initiated by update posts net slightly more responses (mean 7.19 vs. 6.65; t(27230) = 7.2, p <
0.001) and slightly more unique contributors (mean 4.91 vs. 4.35; t(27126) = 10.6, p < 0.001).
[Figure 6.2 (state diagram). Nodes: Update, Non-update. Edge labels as extracted: 55%, 9.7 days; 45%, 4.4 days; 71%, 22 days; 29%, 8.2 days.]
Figure 6.2: Normalized transition probabilities and average transition times between consecutive update and non-update posts.
Time elapsed between consecutive update posts is short: Figure 6.2 shows users’ transition fre-
quencies between initiating update and non-update posts, along with the average number of days be-
tween transitions. Users posting consecutive updates do so in comparatively quick succession, averag-
ing 4.4 days between each update.
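Transition statistics of the kind shown in Figure 6.2 can be computed as sketched below. The sketch assumes that probabilities are normalized per source label (so the outgoing edges of each node sum to 1) and that each user's posts are given as `(day_number, label)` pairs. Both are our assumptions for illustration, as are the function name and the toy data.

```python
from collections import defaultdict

def transition_stats(user_posts):
    """Map (from_label, to_label) -> (transition probability, mean gap in days).

    `user_posts` maps a user id to a list of (day_number, label) pairs.
    """
    counts = defaultdict(int)
    gap_sums = defaultdict(float)
    for posts in user_posts.values():
        ordered = sorted(posts)                      # chronological order
        for (d0, l0), (d1, l1) in zip(ordered, ordered[1:]):
            counts[(l0, l1)] += 1
            gap_sums[(l0, l1)] += d1 - d0
    outgoing = defaultdict(int)
    for (src, _), c in counts.items():
        outgoing[src] += c                           # total transitions leaving src
    return {edge: (c / outgoing[edge[0]], gap_sums[edge] / c)
            for edge, c in counts.items()}

# Synthetic toy histories for two users.
toy_histories = {
    "a": [(0, "update"), (2, "update"), (6, "update"), (10, "non-update")],
    "b": [(0, "non-update"), (10, "non-update"), (15, "update")],
}
toy_stats = transition_stats(toy_histories)
```

In the toy data, two of the three transitions leaving an update post lead to another update, with a mean gap of 3 days.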
6.8.1 Thomas’ Recipe: An Informal Collaboration
During our analysis, we noticed that not only do users share explicit medical advice with one another:
they test, evaluate, modify and re-share it. In other words, users informally collaborate on developing
treatment protocols that are effective at assisting withdrawal. A prevalent example of this on Forum77 is
Thomas’ Recipe.
Thomas’ Recipe6 is a detailed treatment protocol for medication-assisted opioid withdrawal management.
It was written in the early 2000’s7 by a Forum77 user who had years of experience detoxing from
opioids, but no medical qualifications. Over the years, the original Thomas’ Recipe has evolved.
Table 6.6 shows a version of Thomas’ Recipe from circa 2000, while Table 6.7 shows a version from circa
2006. While the core content remains, the newer version has a great deal more structure and
formalization. Details of the recipe have also changed. For example, the older recipe recommends a 4000mg
dose of L-Tyrosine, while the newer recipe suggests beginning with a 2000mg dose and scaling up as
necessary.
6 http://www.medhelp.org/tags/health_page/45/Addiction/Thomas-Recipe-Re-Posted?hp_id=16
7 While our data set officially starts in 2007, it also contains some posts from as far back as 1999. We believe that this was either a pilot program or another forum that was acquired by MedHelp.
An informal assessment of iterations of Thomas’ Recipe on Forum77 suggests that these changes are
a result of user testing and feedback. Users’ comments, too, suggest that over time, they have modified
Thomas’ Recipe to make it more generally applicable and effective:
“I’m actually doing pretty good I’ve taken the Thomas recipe from day 1 but I’ve also added
Vitamin D, and niacin.”
“I have a modified Thomas Recipe that seems to have done wonders on my withdrawals if
anyone is interested. (No Xanax or Valium etc) Added Potassium pills, Ensure protein drinks
(since I cant eat anything solid yet).”
“If it helps any, I did a modified Thomas’ Recipe. I didn’t use any pharmaceuticals and added
some additional supplements (Magnesium, Potassium and Calcium for RLS and Melatonin
for sleep).”
Thomas’ Recipe is wildly popular on Forum77. Approximately 1.72% of all posts in our data set
mention it directly. Moreover, it is not constrained just to MedHelp: these days Thomas’ Recipe is hosted
on a number of addiction recovery sites8 9 10 11, and a Google search for “Thomas’ Recipe” brings up
sponsored advertisements for opiate withdrawal remedies.
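A mention rate like the one above can be estimated with a simple pattern match. The regular expression below is our guess at reasonable spelling variants ("Thomas recipe", "Thomas’ Recipe", "Thomas's recipe") and is not the matching rule used in this work; the function name is likewise hypothetical. The toy inputs reuse fragments quoted or synthesized elsewhere in this chapter.

```python
import re

# Assumed spelling variants: optional apostrophe (straight or curly) and/or "s".
_MENTION_RE = re.compile(r"\bthomas(?:['’]s?|s)?\s+recipe\b", re.IGNORECASE)

def mention_rate(posts):
    """Fraction of posts whose text mentions Thomas' Recipe at least once."""
    if not posts:
        return 0.0
    hits = sum(1 for text in posts if _MENTION_RE.search(text))
    return hits / len(posts)

toy_texts = [
    "I’ve taken the Thomas recipe from day 1",
    "If it helps any, I did a modified Thomas’ Recipe.",
    "WD today are ok not too bad.",
    "Anyway I hope everyone is doing great.",
]
toy_rate = mention_rate(toy_texts)
```

A match of this kind undercounts posts that refer to the protocol indirectly (for example, "the recipe"), so it is best read as a lower bound.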
The recipe’s prevalence is likely testimony to the fact that it does genuinely assist the process of
opiate withdrawal. Forum77 users swear by its efficacy, calling it a “life saver”, a “god send”, and some-
thing that “works wonders”. To evaluate the efficacy of Thomas’ Recipe, we showed it to a psychiatrist
specializing in addiction. She noted that not only was the recipe very similar to a treatment she might
have recommended professionally, but also that it contained novel elements that would facilitate the
withdrawal process.
8 http://www.drugs.com/forum/featured-conditions/thomas-recipe-opiate-withdrawal-35169.html
9 http://www-personal.umich.edu/~timaster/biopsych/home.html
10 http://opiatewithdrawaltips.com/thomas-recipe
11 https://www.drugs-forum.com/forum/showthread.php?t=12568
6.9 Discussion
Forum77 serves as a valuable, user-generated repository of medical information pertaining to the process
of addiction recovery. Moreover, this information is not static: it is curated, tested and modified. As
we saw in the example of Thomas’ Recipe (§ 6.8.1), users actively collaborate on developing effective
treatment protocols. The continual evolution of informational artifacts on Forum77 likely contributes
to informational discussions having significantly longer lifespans than emotional discussions.
Another factor that we have observed lengthening the lifespan of informational discussions is
that some users repurpose them, sometimes years after the initial post, to describe their own situation.
In doing so, users may feel that they are not starting from scratch, that they have a ready made descrip-
tion of their condition, or that they are leveraging work that the previous initiator put into finding other
Forum77 members who could address their specific issue.
While users do explicitly seek emotional support on Forum77, most emotional posts are not explicit
requests, but rather, update posts. The prevalence of the update post suggests that users place value
in having a community bear witness to their struggle with addiction. The fact that update posts garner
slightly more responses on average than non-update posts shows, too, that responses are expected. It
is possible that users publicize update posts (rather than writing them, for example, in a private journal)
as a self-enforcement mechanism to help them progress with cessation. Qualitative evidence shows that
users feel a great deal of embarrassment and shame when a withdrawal attempt fails, and that failing
may even delay their return to the community.
In addition to having a community of witnesses, users derive utility from the process of documentation
itself. Authors find it valuable to reflect upon their past posts, which serve as reminders and evidence
of both accomplishments and regressions. For example, one user reflects on something that she was
scared to do:
I just found some old post about no desire for sex. Whew! I was so scared to ask the
question.
Another laments a relapse:
I cna’t believe I’m at 25 days when I was in the hundereds before. I’m so angry at myself for
relapsing and still keep beating myslef up!!
Readers, too, find others’ chronicles both informative and illustrative. This user mentions reading
through hundreds of old posts to glean insight into what his withdrawal will be like:
This is my first d/x and pray that it will be my last. I’ve read through tons of old posts and
they definitely help.
Another poster used narratives on Forum77 to help her husband prepare for the process of her
recovery:
i have showed him this site and let him read some of your stories, so he knows its not all
going to be plane sailing
6.9.1 Limitations and Future Work
The primary limitation to our work is our requirement that a post be labeled as either informational or
emotional. In our experience, while only one of these labels tends to be dominant in an initiating post,
Wang et al. [250] and Biyani et al. [28] do show that finer-grained labeling is possible at scale. Although
picking the dominant label was sufficient for examining our analysis questions, a more nuanced analysis
might benefit from more detail.
A natural avenue for future work is to analyze response posts in addition to initiating posts. While
Wang et al. [250] utilize the same scales of emotional and information support in scoring both initiating
and response posts, our informal analyses of Forum77 response posts suggest that response categories
would require an entirely new descriptive taxonomy. (For example, a fairly common response tactic
that we observed that does not manifest in Table 6.2 is the hijack: when a user attempts to shift the
focal attention of active thread participants away from the initiator and onto herself, usually by claiming
identical circumstances to the initiator. This tactic often kills the thread.) Having derived this taxonomy,
however, one could start to ask questions such as, “What is the most effective way of getting informational
support?”, or “What types of initiating threads draw a diverse crowd of respondents?”.
6.10 Summary
We set out in this chapter to answer the question: “What do users seek on Forum77?”. We first motivated
our focus on the topic of addiction, noting that both its prevalence and stigma make it a potentially
rewarding focus of study (§ 6.1). We then presented related work on identifying types of support seeking
on online health forums (§ 6.2), and described the data samples used in this chapter (§ 6.3).
Through conducting a thematic analysis over a sample of initiating posts, we found that, in congru-
ence with prior work, users seek both informational and emotional support on Forum77. Moreover, we
discovered that the most prevalent form of emotional support seeking was to issue update posts: es-
sentially status logs containing no explicit request for a community response (§ 6.5). With some feature
engineering, we were able to train two binary statistical classifiers to distinguish emotional from informa-
tional posts (§ 6.6), and update from non-update posts (§ 6.7), with F1 scores of 80.12% and 76.54%,
respectively. Applying our classifiers to the entire Forum77 data set, we then analyze differences between
these post categories (§ 6.8). We find, for example, that informational posts have a longer lifespan than
emotional posts, and that while update posts make no explicit request for feedback, they garner more
responses on average than non-update posts. We also analyze Thomas’ Recipe (§ 6.8.1), an informa-
tional artifact of Forum77 that provides users with instructions for medication-assisted detoxification from
opioids.
In conclusion, Forum77 provides two main services to users: first, it serves as a repository of in-
formation on opioid abuse that is generated, tested, and modified for improved efficacy by community
members. Second, it offers a space where the disclosure of personal progress (whether forward or
backward) can be witnessed by others and recorded for posterity. In Chapter 7 we turn our attention to
identifying which drugs Forum77 users abuse.
Table 6.6: Thomas’ Recipe (circa 2001)
THOMAS RECIPE
Here’s my tried-and-true do-it-yourself “cold turkey” detox protocol.
Supplies you’ll need first:
As many Valium, Xanax, Librium or Klonopin as you can get your hands on.
-- first day off the opiate, use enough Valium or whatever, to, if possible, sleep through most of the first couple days. Then start decreasing the dose until you’re down to nothing in about 5 or 6 days. You’ll have to do the math. The Valium or one of its sister drugs will help tremendously with the anxiety and, somewhat, with the body aches. Valium may make you eat like a pig and, when withdrawing from narcotics, one usually craves sweets, so I’d be ready to indulge myself, along with some good escapist movies. That always worked for me.
Around-the-clock access to either hot baths or a Jacuzzi.
-- speaking of those goddamn mostly thigh cramps that seem to love to show up in the middle of the night, have that hot bath or Jacuzzi at the ready. Don’t hesitate to spend the majority of the week in that hot water if that’s what it takes to get you through it. You may be wrinkled, but you’ll have your sanity. Don’t underestimate what the hot baths can do to relieve the withdrawal discomfort. They really work. Heating pads between the thighs can help with those cramps, too, but not as much as the hot baths.
Brand-name-only Imodium (over the counter at the supermarket)
-- if you’re a normal hydro addict, you’ll be getting the runs by no later than the second or third day off the lorcet. In my experience, it’s an especially unpleasant variety. At the first impulse, take two or three and respond to returning urges with two tabs. It’s important that you do it immediately.
L-Tyrosine (qty 50 of the 500mg caps) - an amino acid available at the health food store.
-- chronic use of narcotics depletes the brain of several critical neurotransmitters responsible for well-being and mental performance and attitude.
Plus: Bottle of 100 mg B6 caps
My experience detoxing with this stuff says take 4000 (four thousand) mg. (8x500mg caps of L-Tyrosine) with two 100mg B6 caps every day for your “detox week” to provide your brain with the raw material it needs to replenish its stores of these neurotransmitters. Many feel the difference on the very first dose. ***Take it on an empty stomach, either first thing in the morning or at bedtime. You can continue this regimen after the first week if it continues to make you feel good. I continue to use it every other day with very few exceptions. After a few weeks, I cut down on the dosage, though, as it can cause the runs at high doses.
Multi-vitamins (most junkies don’t eat too well, so this one’s just for good sense).
Take a look at this link. According to this doc, you also need to add copper, phosphorus and Vitamin C to replenish the dopamine, and the norepinephrine. You might have to do some hunting at the health food store to find the right vitamin or vitamins to supply all this stuff. I got a pretty good result from just the L-Tyrosine and B6, however.
I also understand from another contributor that zinc and magnesium help replenish and restore vital substances depleted by narcotics use.
WARNING: This same site says to avoid L-Tyrosine if you’re on an SSRI (serotonin reuptake inhibitor) such as Prozac, etc.
Good luck.
Thomas
Sourced on 9-02-2014 from: http://www.medhelp.org/posts/Addiction-Substance-Abuse/How-Long-Untill-You-Are-Normal/show/43582
Table 6.7: Thomas’ Recipe (circa 2006)
THOMAS RECIPE
PLEASE NOTE: I am not a doctor, simply a long-time Rx opiate junkie who has had many opportunities to develop a way todetox. This is a recipe for at-home self-detox from opiates based on my experience as well as that of many other addicts. It is notintended as professional medical advice. It is always wise to make sure none of the recipe ingredients or procedures conflict withmedications you may be taking. Likewise, if you have any medical condition, disease, allergy or any other health issue, consultyour doctor before using the recipe.
Thanks, Thomas
If you can’t take time off to detox, I recommend you follow a taper regimen using your drug of choice or suitable alternate – theslower the taper, the better.
For the Recipe, You’ll need:
1. Valium (or another benzodiazepine such as Klonopin, Librium, Ativan or Xanax). Of these, Valium and Klonopin are bestsuited for tapering since they come in tablet form. Librium is also an excellent detox benzo, but comes in capsules, makingit hard to taper the dose. Ativan or Xanax should only be used if you can’t get one of the others.
2. Imodium (over the counter, any drug or grocery store).
3. L-Tyrosine (500 mg caps) from the health food store.
4. Strong wide-spectrum mineral supplement with at least 100% RDA of Zinc, Phosphorus, Copper, Magnesium and Potas-sium (you may not find the potassium in the same supplement).
5. Vitamin B6 caps.
6. Access to hot baths or a Jacuzzi (or hot showers if that’s all that’s available).
How to use the recipe:
• Start the vitamin/mineral supplement right away (or the first day you can keep it down), preferably with food. Potassium early in the detox is important to help relieve RLS (Restless Leg Syndrome). Bananas are a good source of potassium if you can’t find a supplement for it.
• Begin your detox with regular doses of Valium (or alternate benzo). Start with a dose high enough to produce sleep. Before you use any benzo, make sure you’re aware of how often it can be safely taken. Different benzos have different dosing schedules. Taper your Valium dosage down after each day. The goal is to get through day 4, after which the worst WD symptoms will subside. You shouldn’t need the Valium after day 4 or 5.
• During detox, hit the hot bath or Jacuzzi as often as you need to for muscle aches. Don’t underestimate the effectiveness of hot soaks. Spend the entire time, if necessary, in a hot bath. This simple method will alleviate what is for many the worst opiate WD symptom.
• Use the Imodium aggressively to stop the runs. Take as much as you need, as often as you need it. Don’t take it, however, if you don’t need it.
• At the end of the fourth day, you should be waking up from the Valium and experiencing the beginnings of the opiate WD malaise. Upon rising (empty stomach), take the L-Tyrosine. Try 2000mgs, and scale up or down, depending on how you feel. You can take up to 4,000mgs. Take the L-Tyrosine with B6 to help absorption. Wait about one hour before eating breakfast. The L-Tyrosine will give you a surge of physical and mental energy that will help counteract the malaise. You may continue to take it each morning for as long as it helps. If you find it gives you the “coffee jitters,” consider lowering the dosage or discontinuing it altogether. Occasionally, L-Tyrosine can cause the runs. Unlike the runs from opiate WD, however, this effect of L-Tyrosine is mild and normally does not return after the first hour. Lowering the dosage may help.
• Continue to take the vitamin/mineral supplement with breakfast.
• As soon as you can force yourself to, get some mild exercise such as walking, cycling, swimming, etc. This will be hard at first, but will make you feel considerably better.
—Thomas
Sourced on 9-02-2014 from: http://www.drugs.com/forum/featured-conditions/thomas-recipe-opiate-withdrawal-35169.html
Chapter 7
Identifying Drugs of Choice
Monitoring drug use at a population level is crucial for observing, managing, and responding to substance
abuse-related issues, such as the emergence of new “designer drugs”, or the existence of particularly
vulnerable populations. Drug use trends could also be useful for exploring more theoretical aspects of
addiction, such as the Gateway Hypothesis [139], which proposes that drug use follows a progressive
and hierarchical sequence in which the user begins with legal addictive substances (e.g. alcohol and
cigarettes), before progressing onto marijuana and, finally, illicit substances.
The stigmatized [174,176,187] and often illegal nature of substance abuse, however, can make such
data collection difficult. Existing substance abuse surveillance efforts are restricted to convenient
populations: schools (Monitoring the Future1), hospital emergency room visits (Drug Abuse Warning Network2),
state-run treatment facilities (Treatment Episode Dataset3), and in-person mutual help groups (Narcotics
Anonymous4). Because membership in each of these populations can be compelled, these surveys,
while large-scale and thorough, fail to capture a more representative sample of drug users.
Despite the fact that millions of people voluntarily participate in online health communities for sub-
stance use disorders, almost no prior work attempts to derive drug usage data from PAT. Our goal in this
chapter is to profile substance use in the Forum77 population, and to compare this against traditionally
surveyed drug-using populations. We begin by developing a method for automatically identifying Fo-
rum77 users’ drugs of choice (DOCs) from their initiating posts (§ 7.3). As this task is context-sensitive,
we build on lessons learned in Chapter 5 and train a conditional random field (CRF) classifier that identi-
fies DOCs with F1, Precision and Recall scores of 84.65%, 91.12% and 79.46%, respectively. Next, we
1 http://www.monitoringthefuture.org
2 http://www.samhsa.gov/data/dawn.aspx
3 http://wwwdasis.samhsa.gov/webt/information.htm
4 http://www.na.org/?ID=PR-index
manually develop a map for resolving identical entities (e.g. Vicodin and Hydrocodone) extracted by our
classifier, and mapping these to classes.
Applying our classifier to the entire Forum77 data set, we develop a profile of substance use in
the Forum77 population. We contrast this with survey data on the face-to-face peer recovery group
Narcotics Anonymous (NA), as well as survey data on individuals who present to addiction treatment
centers (TEDS) and emergency rooms (DAWN) (§ 7.4). After normalizing each data set for comparison
(§ 7.4), we present both comparative results as well as substance use trends on Forum77 over time
(§ 7.5). Compared to other measured drug-using populations, prescription opioid use is highly prevalent
in Forum77, while use of more traditionally-abused substances (e.g. alcohol, marijuana and cocaine)
is notably scarce. Over time, opioid replacement therapy drugs have become increasingly prevalent on
Forum77, while use of other prescription opioids has declined. We discuss possible explanations for and
implications of these results (§ 7.6) before concluding (§ 7.7).
7.1 Related Work
Two branches of prior work apply to this chapter: the primary one is syndromic surveillance, which
is concerned with the utilization of health-related data for the purpose of detecting, analyzing and
monitoring potential disease outbreaks [128]. We discuss syndromic surveillance in depth in § 3.2. The
second is work related specifically to observing substance abuse trends in online data, which we discuss
below.
Surprisingly little work attempts to survey substance use via online data, although the potential for
doing so has been recognized [44, 113]. In August 2014, the National Institute on Drug Abuse (NIDA)5
announced the funding of a 5-year initiative to build a substance abuse surveillance system using web
data [113]. A related system, the “Psychonaut Web Mapping Project”, already exists in Europe,
and has demonstrated an ability to give timely and accurate information related to the outbreak of novel
drugs [73]. The project aggregates data scraped from myriad sites, including discussion forums, online
stores, and Google search queries, the latter of which have also been shown to correlate with demand for
specific substances [65]. This is unsurprising, given that the Internet plays host to a highly competitive
market for illicit substances [54, 244]. Dasgupta et al. [66] were even able to show that black market
prices for prescription opioids can be accurately assessed via crowdsourcing. Although sparse, this
5 http://www.drugabuse.gov
prior work supports the supposition that PAT is a promising data source for extracting substance use
data.
7.2 Datasets
Users typically offer information about the substance(s) they are using in initiating posts, in which they set
the tone and topic of discussion, and disclose the issue for which they are seeking help. As respondents
may or may not offer similar information about themselves, we restrict our analysis to Forum77’s initiating
posts, of which there are 78,507 authored by a total of 28,005 unique users.
Training & Testing Dataset Our classifiers require labeled data for training. As we felt that our fa-
miliarity with the data set would expedite labeling and reduce errors, we use 500 posts from the 1,000
initiating-post sample described in § 6.3.2. For completeness, we re-specify our sampling methodology
from § 6.3.2 here: first, we curated a sample of initiating posts from recurring Forum77 users by ran-
domly sampling 200 users who had initiated 5 or more posts. Our 200 sampled users authored ∼32,000
initiating posts, of which we took a random sample of 1,000 for subsequent coding. To prevent any user
from dominating the sample, we admitted no more than 30 posts per user.
Analysis Dataset We conduct our final analysis on all of Forum77’s initiating posts (78,507 posts
authored by some 28,005 unique users).
7.3 Automatically Identifying Drugs of Choice
In this section, we describe how we automatically identify DOCs from Forum77 initiating posts. After
defining the term drug of choice, we manually annotate our training & testing data set. Next, we train
a CRF classifier to automatically identify drugs of choice in Forum77 initiating posts. Finally, we resolve
the extracted DOC entities to specific categories to facilitate analysis and comparison.
7.3.1 Definition of Drug of Choice
In the context of Forum77 data, we define a drug of choice (DOC) as any substance that the user
indicates that she is, or was, addicted to. Such indications can be direct (e.g. “I am addicted to
percs/patches”) or implied (e.g. “I need to get off 32mgs subox”). We also include as DOCs phrases that
unequivocally imply a misused substance (e.g. “chasing the dragon” implies opium, “blazing” implies
marijuana), although we found such occurrences to be rare.
Identifying DOCs in Forum77 text is a context-sensitive task: whether a substance plays the role
of treatment or addiction depends on the user. Methadone and buprenorphine, opioids used in opioid
replacement therapy, are common examples. Valium, which is both an addictive benzodiazepine and an
ingredient in Thomas’ Recipe for aiding opioid withdrawal (§ 6.8.1), is another.
7.3.2 Data Annotation
Using the definition above, two authors each labeled DOCs in 300 of the 500 posts in our sample. Inter-
rater agreement calculated on the 100 overlapping posts was high, with a Cohen’s kappa [50] of 0.84.
Of the total sample, 276 posts (∼55%) contained DOC mentions.
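The agreement statistic reported above can be illustrated with a short sketch. It assumes that each annotator's decisions are reduced to one binary label per post (DOC mention present or absent); the unit of agreement actually used is not stated here, and the labels below are made-up toy values, not the study's annotations.

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the annotators' marginal label rates.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_chance = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_observed - p_chance) / (1.0 - p_chance)

# Toy labels (hypothetical): 1 = post contains a DOC mention, 0 = it does not.
annotator_1 = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
annotator_2 = [1, 1, 0, 1, 0, 1, 1, 1, 0, 0]
kappa = cohens_kappa(annotator_1, annotator_2)
```

In this toy case the annotators agree on 8 of 10 posts, yet kappa is noticeably lower than 0.8 because a good share of that agreement is expected by chance.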
7.3.3 Classifier Training & Evaluation
As discussed in § 5.5.1, conditional random field (CRF) models are particularly well suited to identifying
specific entities in text [151]. CRFs are also context sensitive. For example, a CRF could leverage other
words in a sentence to determine whether a term like methadone refers to a substance being abused
vs. a substance being used as a treatment. This, in addition to the fact that prior work has successfully
utilized CRF models to identify a variety of medical terms [159, 222], makes CRFs an appropriate choice for
the challenge of identifying DOCs in text.
Using our labeled training and testing data set, we trained a CRF to automatically identify DOCs
mentioned in initiating posts. For training, we exclude annotations of general drug terms such as pills, meds and
drugs. As we observed in our work on ADEPT in Chapter 5, generic terms are uninformative as well as a
significant source of classifier error [201]. For full documentation of classifier features, see Appendix C.
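To make the role of context concrete, the sketch below shows the kind of per-token feature dictionary and BIO label sequence that linear-chain CRF toolkits typically consume. The feature names, the window size and the example labels are illustrative assumptions; the features actually used by our classifier are those documented in Appendix C.

```python
def token_features(tokens, i, window=2):
    """Features for tokens[i], including neighbouring words as context."""
    word = tokens[i]
    feats = {
        "word.lower": word.lower(),
        "suffix3": word[-3:].lower(),
        "is_digit": word.isdigit(),
    }
    # Context words let a model separate "addicted to methadone"
    # from "prescribed methadone to help me quit".
    for offset in range(-window, window + 1):
        if offset == 0:
            continue
        j = i + offset
        key = f"word[{offset:+d}]"
        feats[key] = tokens[j].lower() if 0 <= j < len(tokens) else "<PAD>"
    return feats

def sentence_features(tokens):
    """One feature dictionary per token, in sentence order."""
    return [token_features(tokens, i) for i in range(len(tokens))]

# Sentence taken from Table 7.2. The BIO labels are hypothetical: suboxone
# is a treatment here, so only the final token is tagged as a DOC.
tokens = "My doc prescribed suboxone on Sunday to help quitting from vicodin".split()
labels = ["O"] * 10 + ["B-DOC"]
features = sentence_features(tokens)
```

The feature dictionary for suboxone carries the preceding word prescribed, which is the sort of cue a sequence model can learn to associate with treatment rather than abuse.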
Results
Our CRF performs well at identifying DOCs from initiating posts. On 10-fold cross validation it achieves
an F1-score of 84.65%, and Precision and Recall scores of 91.12% and 79.46%, respectively. Ta-
ble 7.1 shows a breakdown of performance across different types of terms. The CRF performs best
on drug terms that are both specific and correctly spelled (e.g. marijuana, oxycodone) and infor-
mal/morphological variations thereof (e.g. pot, oxides), and performs worst on generic drug terms (e.g.
stuff, pain pills). Table 7.2 illustrates the results of applying our DOC classifier to sample sentences,
Table 7.1: DOC classifier performance across term categories. The classifier performs best on correctly spelled, specific drug terms; worst on general drug terms.
Category | Examples | F1 score (%) | Precision (%) | Recall (%)
All terms | | 84.7 | 91.1 | 79.5
Specific drug terms, spelled correctly (53.1% of all terms) | marijuana, ultram, phenobarbital, hydrocodone | 87.0 | 90.3 | 83.9
Informal & morphological variations of drug terms (34.5% of all terms) | roxies, oxyz, subs, pot, vics, blues, hydros, smokes | 84.6 | 93.4 | 77.2
General drug terms (12.8% of all terms) | pain pills, painkillers, powder, stuff, substances | 79.7 | 94.0 | 69.2
Table 7.2: Examples of DOCs extracted by our CRF classifier. Identified SOA terms are shown in bold in the context of their originating sentence, and the resolved drug name, generic name and category are shown on the right.
Sentence | Resolved Drug | Resolved Generic | Resolved Category
My doc prescribed suboxone on Sunday to help quitting from vicodin. | Vicodin | hydrocodone | opioid
I need help. I am on vic for the last 20 years. | Vicodin | hydrocodone | opioid
She began with meth months ago and now is using coke. | cocaine, methamphetamine | cocaine, methamphetamine | cocaine, stimulant
As for myself, it was a 7 year run with percs/patches. | Percocet | oxycodone | opioid
and resolving these to drug categories as per § 7.3.4. Note the model’s sensitivity to context: in the first
sentence, suboxone is not extracted because it is being used as a treatment for the author’s addiction to
Vicodin.
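F1 is the harmonic mean of precision and recall, and the short check below recomputes it from the per-category precision and recall values in Table 7.1. The recomputed values agree with the table to within 0.1 percentage points. For the overall scores, the F1 implied by the reported precision and recall is about 84.9% rather than the reported 84.65%; one possible explanation, which is an assumption on our part since the aggregation is not stated, is that each score was averaged over the 10 folds separately.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (same units in and out)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision %, recall %) per term category, copied from Table 7.1.
table_7_1 = {
    "specific": (90.3, 83.9),
    "informal": (93.4, 77.2),
    "general": (94.0, 69.2),
}
recomputed = {name: round(f1_score(p, r), 1) for name, (p, r) in table_7_1.items()}

# F1 implied by the overall precision and recall reported in the text.
overall_implied = f1_score(91.12, 79.46)
```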
7.3.4 Drug Term Resolution
The DOC terms extracted by our classifier vary widely in terms of spelling (we saw 58 variations on
Vicodin alone) and specificity (users refer to drugs with brand, generic and even class names). For
example, somebody might refer to Suboxone as buprenorphine, or even just as an opiate. Resolving
related drug terms to common entities is necessary for analysis and comparison.
Table 7.3: Summary of similarities and differences between our Forum77, NA, TEDS and DAWN datasets. Forum77 is unique in that participation is always voluntary and that users report only substances that they deem relevant.
Property | Forum77 | NA | TEDS | DAWN
Population size | 19,634 | 8,837 | 1,844,720 | 131,698
Time in which data were generated | 2007-2011 | 2011 | 2011 | 2011
Data self-reported? | Yes | Yes | Yes | Yes
Duplicate users in dataset possible? | Yes | Yes | Yes | Yes
Survey population membership voluntary? | Yes | Not always | Not always | Not always
Users can report multiple substances | Yes | Yes | Yes | Yes
Substances reported | only those which user perceives as relevant | All | All | All
To resolve drug names, we compiled a list mapping misspellings in our data set to a single drug
name (either brand or generic). We then mapped all brand names to their respective generic names,
and finally, categorized each substance into a general class (Table C.1). We ultimately resolved ∼1,200
terms to 90 entities in 10 drug classes (see Appendix C).
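The three resolution stages described above (variant to drug name, brand to generic, generic to class) can be sketched as chained dictionary lookups. The entries below are a small hypothetical excerpt built from examples in this chapter, not the actual mapping in Appendix C, and the function name is our own.

```python
# Stage 1: misspellings and slang -> a single drug name (hypothetical excerpt).
VARIANT_TO_DRUG = {
    "vic": "vicodin", "vics": "vicodin", "percs": "percocet",
    "subs": "suboxone", "subox": "suboxone",
    "coke": "cocaine", "meth": "methamphetamine",
}
# Stage 2: brand names -> generic names.
BRAND_TO_GENERIC = {
    "vicodin": "hydrocodone", "percocet": "oxycodone", "suboxone": "buprenorphine",
}
# Stage 3: generic names -> drug class (class labels are assumptions).
GENERIC_TO_CLASS = {
    "hydrocodone": "opioid", "oxycodone": "opioid",
    "buprenorphine": "methadone and suboxone",
    "cocaine": "cocaine", "methamphetamine": "stimulant",
}

def resolve(term):
    """Return (drug, generic, drug class) for an extracted DOC term."""
    key = term.strip().lower()
    drug = VARIANT_TO_DRUG.get(key, key)         # unknown terms pass through
    generic = BRAND_TO_GENERIC.get(drug, drug)   # generics map to themselves
    return drug, generic, GENERIC_TO_CLASS.get(generic)  # None if unmapped

resolved = [resolve(t) for t in ("vic", "Percs", "coke", "hydrocodone")]
```

Keeping the stages separate means that a new misspelling only needs one entry in the first table, since the brand, generic and class lookups are shared.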
7.4 Comparing Real-World DOC Distributions
We compare our results to survey data on the face-to-face peer recovery group Narcotics Anonymous
(NA), as well as survey data on individuals who present to addiction treatment centers (TEDS) and
emergency rooms (DAWN). We use the 2011 (most recently available) reports for each of these surveys, and
compare results to the Forum77 data set spanning 2007-2011. We include multiple years of Forum77
data as we find that the DOC distributions in the Forum77 population vary only slightly over time. Below,
we describe how we process each data set, and summarize key similarities and differences between
them (Table 7.3). The final alignment of categories for cross-survey comparison is described
in Table 7.4.
Table 7.4: Alignment of categories across the Forum77, NA, TEDS and DAWN datasets for comparative purposes. Exact category terms from each survey have been preserved in this table for replicability.
Forum77 | NA | TEDS | DAWN
Alcohol | Alcohol | Alcohol | Alcohol
Cocaine | Cocaine, Crack | Cocaine/Crack | Cocaine
Hallucinogens | Hallucinogens (LSD, PCP) | PCP; Other Hallucinogens | LSD; PCP; Misc. hallucinogens
Heroin | Opiates (heroin, morphine) | Heroin | Heroin
Inhalants | Inhalants (glue, Nitrous Oxide) | Inhalants | Inhalants
Marijuana | Cannabis (pot, hashish) | Marijuana/Hashish | Marijuana; Synthetic cannabinoids
Methadone and Suboxone | Methadone/Buprenorphine | Methadone (non-RX) | Methadone/Buprenorphine
Opioids | Opioids (Oxycodone, Vicodin, Fentanyl) | Opiates/Synthetics | Opiates/Opioids
Stimulants | Ecstasy; Stimulants (speed, crystal meth) | Methamphetamine; Other Amphetamines; Other Stimulants | Amphetamines; Amphetamine-dextroamphetamine; GHB; MDMA; Methamphetamine; Methyphenidate
Sedatives | Tranquilizers (Klonopin, Valium, Xanax) | Barbituates; Benzodiazepines; Non-Barbituate sedatives; Other non-benzodiazepine tranquilizers | Barbiturates; Benzodiazepines; Ketamine; Misc. anxiolytics, sedatives and hypnotics
7.4.1 Forum77
Our classifier identifies DOCs for 19,634 (70%) of the 28,005 users who initiated discussions on Fo-
rum77, corresponding to ∼50% of the 78,507 initiating posts analyzed. This corroborates our observa-
tion that ∼55% of the posts in our 500-post training and testing sample contained DOC mentions. To
acquire a distribution of DOCs in the Forum77 population, we count, for each drug category (see
Table 7.4), the number of unique users who abused a drug in that category. We then normalize the counts
by the DOC-identifiable population size.
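The counting and normalization step can be sketched as follows. The input format (one pair per extracted DOC mention, giving a user identifier and a resolved category) and the function name are assumptions made for illustration. Because a user may report several substances, the resulting shares can sum to more than 100%.

```python
from collections import defaultdict

def doc_distribution(user_category_pairs):
    """Share of DOC-identifiable users who report each drug category."""
    users_by_category = defaultdict(set)
    identifiable_users = set()
    for user_id, category in user_category_pairs:
        users_by_category[category].add(user_id)  # sets count each user once
        identifiable_users.add(user_id)
    total = len(identifiable_users)
    if total == 0:
        return {}
    return {cat: len(users) / total for cat, users in users_by_category.items()}

# Toy input (hypothetical users): u1 mentions opioids in two posts and alcohol in one.
pairs = [
    ("u1", "Opioids"), ("u1", "Opioids"), ("u1", "Alcohol"),
    ("u2", "Opioids"), ("u3", "Methadone and Suboxone"), ("u4", "Opioids"),
]
dist = doc_distribution(pairs)
```

Users for whom no DOC was extracted never appear in the input, so the denominator is the DOC-identifiable population rather than all forum users.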
7.4.2 Narcotics Anonymous
Narcotics Anonymous (NA) conducts an annual membership survey in which respondents identify both
main drugs used as well as any other drugs used on a regular basis [2]. Responses are identified using a
checklist of drug categories (Table 7.4). As the results are published only in aggregate form, we acquired
the raw data from NA for the online component of the survey for analysis. After omitting entries that had
a 0-second response time or in which the respondent declined to answer the drug-related questions,
8,837 respondents remained.
Categorizing heroin in the NA survey data: While both DAWN and TEDS have a separate category
for heroin, NA groups heroin into the category “Opiates (heroin, morphine etc.)”. To align the NA data
set with DAWN and TEDS, we classify “Opiates (heroin, morphine, etc.)” with “Heroin”, based on the
assumption that most users in this category are using heroin rather than morphine or other opiates.
7.4.3 TEDS
The Treatment Episode Dataset (TEDS) is an annual survey detailing people’s self-reported drug use upon
admission to state and national rehabilitation facilities [241]. There is no need to process this data set
further, and we report results directly from the TEDS 2011 survey (1,844,720 respondents).
7.4.4 DAWN
The Drug Abuse Warning Network (DAWN) is a nationally representative public health surveillance sys-
tem that monitors drug-related emergency department visits to hospitals. The survey records up to 22
drugs related to an emergency room visit [231]. We considered only DAWN data set instances cor-
responding to drug misuse (131,698 instances). As 95.5% of the users in this population mention at
most three drugs, we consider only the first three substances mentioned. From these, we filter out sub-
stances that are common but not typically abused, such as insulin. Finally, we map the remaining drugs
to categories using the DAWN Drug Reference Vocabulary6.
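The DAWN processing steps above (keep the first three substances, drop those not typically abused, map the rest to categories) can be sketched as below. The exclusion set contains only insulin, the one example named above, and the category lookup is a tiny hypothetical stand-in for the DAWN Drug Reference Vocabulary; neither reflects the full lists used in our analysis.

```python
# Substances that are common in the records but not typically abused.
NOT_TYPICALLY_ABUSED = {"insulin"}

# Hypothetical excerpt standing in for the DAWN Drug Reference Vocabulary.
DRUG_TO_CATEGORY = {
    "oxycodone": "Opiates/Opioids",
    "alprazolam": "Benzodiazepines",
    "alcohol": "Alcohol",
    "cocaine": "Cocaine",
}

def visit_categories(drugs, max_drugs=3):
    """Categories for one emergency visit, following the steps in the text."""
    first = [d.lower() for d in drugs[:max_drugs]]           # first three mentions only
    kept = [d for d in first if d not in NOT_TYPICALLY_ABUSED]
    return {DRUG_TO_CATEGORY[d] for d in kept if d in DRUG_TO_CATEGORY}

# Cocaine is the fourth substance listed, so it is not considered.
example = visit_categories(["Insulin", "Oxycodone", "Alcohol", "Cocaine"])
```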
6 Available at http://www.samhsa.gov/data/dawn.aspx
[Figure 7.1 panels: FORUM77, TEDS (2011), NA (2011), DAWN (2011); each axis runs from 0% to 75%. Categories: Opioids, Suboxone, Sedatives, Alcohol, Cocaine, Heroin, Marijuana, Stimulants, Hallucinogens, Inhalants.]
Figure 7.1: Drug of choice distributions (% of population using) across the Forum77, TEDS, NA and DAWN data sets.
7.5 Results
Forum77 users struggle with opioid addiction at much higher rates than other surveyed populations
of drug users: Figure 7.1 shows substance usage distributions across the Forum77, TEDS,
NA and DAWN surveys. In the Forum77 population, prescription opioids, used by ∼70% of users, are by
far the most prevalent DOC, followed by the opioid replacement therapy opioids Methadone and Suboxone
(25%). This is more than double the population prevalence reported in any of the other three surveys.
Relatively few Forum77 users mention struggling with traditionally abused drugs: Alcohol, mar-
ijuana and cocaine are the three most prevalent DOCs in the NA, TEDS and DAWN populations (Fig-
ure 7.1). However, these three substances are conspicuously scarce in the Forum77 population. For
example, alcohol is reportedly abused by approximately 80%, 55% and 37% of the NA, TEDS and DAWN
populations, respectively, but only by 10% of Forum77 users.
After peaking in 2008, the Forum77 population slowly declines: Figure 7.2(a) shows the number of
active monthly users by DOC on Forum77. In February 2008, ∼180 unique hydrocodone users initiated
a discussion on Forum77. In contrast, the corresponding number of users for February 2014 is ∼60.
The decline in population of hydrocodone and oxycodone users is steeper than that of other DOCs. To
analyze DOC prevalence over time accounting for population decline, we normalize by population size
(Figures 7.2(b) and 7.2(c)).
Hydrocodone and oxycodone are the most prevalent DOCs on Forum77, but this prevalence de-
clines over time: Figure 7.2(b) shows the prevalence of the six most common opioids in the Forum77
[Figure 7.2(a) panel. Series: hydrocodone, oxycodone, suboxone, methadone, tramadol, heroin. Horizontal axis: 2007 to 2014. Vertical axis: Number of monthly users by SOA (20 to 180).]
(a) Number of unique monthly users for the 5 most prevalent opioids in Forum77 from 2007-2014.
[Figure 7.2(b) panel. Series: hydrocodone, oxycodone, suboxone, methadone, tramadol, heroin. Horizontal axis: 2007 to 2014. Vertical axis: Percentage of drug-identifiable population using (%) (0 to 50).]
(b) Unique monthly users for the 5 most prevalent opioids from Forum77 as a percentage of the population. LOESS [48] fit lines with 95% confidence intervals indicate trends.
[Figure 7.2(c) panel. Series: Rx opioids, ORT opioids, heroin. Horizontal axis: 2007 to 2014. Vertical axis: Percentage of drug-identifiable population using (%) (0 to 60).]
(c) Unique monthly users of opioid replacement therapy (ORT) opioids, other prescription opioids and heroin as a proportion of the Forum77 population. LOESS fit lines with 95% confidence intervals indicate trends.
Figure 7.2: Prevalence of major opioids in the Forum77 population over time.
population over time. Locally weighted smoothing (LOESS [48]) is used to fit lines to each series, and
95% confidence intervals for each fit are shown. In 2007, hydrocodone and oxycodone are utilized by
approximately 45% and 33% of the population, respectively. By 2011, they each have a prevalence of
approximately 30%, which declines to about 27% (hydrocodone) and 26% (oxycodone) by 2014.
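For readers unfamiliar with LOESS, the sketch below implements its basic form: at each point, a straight line is fitted to the nearest neighbours using tricube weights, and the fitted value at that point is kept. This is a minimal illustration under our own assumptions (the parameter name `frac`, no robustness iterations, no confidence intervals); it is not the implementation used to produce Figure 7.2.

```python
import math

def loess(xs, ys, frac=0.5):
    """Smooth ys over xs with locally weighted linear regression."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    k = max(2, min(n, math.ceil(frac * n)))  # points in each neighbourhood
    fitted = []
    for x0 in xs:
        nearest = sorted(range(n), key=lambda i: abs(xs[i] - x0))[:k]
        h = max(abs(xs[i] - x0) for i in nearest) or 1.0  # local bandwidth
        sw = swx = swy = swxx = swxy = 0.0
        for i in nearest:
            dx = xs[i] - x0                      # centre on x0 for stability
            w = (1.0 - (abs(dx) / h) ** 3) ** 3  # tricube weight, 0 at the edge
            sw += w
            swx += w * dx
            swy += w * ys[i]
            swxx += w * dx * dx
            swxy += w * dx * ys[i]
        denom = sw * swxx - swx * swx
        if abs(denom) < 1e-12:
            fitted.append(swy / sw)              # degenerate case: weighted mean
        else:
            slope = (sw * swxy - swx * swy) / denom
            fitted.append((swy - slope * swx) / sw)  # intercept = value at x0
    return fitted

# Toy series (hypothetical): a prevalence that falls by 1.5 points per month.
months = [float(m) for m in range(12)]
prevalence = [45.0 - 1.5 * m for m in months]
smoothed = loess(months, prevalence, frac=0.5)
```

A larger `frac` uses more neighbours per fit and yields a smoother line; on exactly linear input such as the toy series, the smoother returns the input unchanged.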
Opioid replacement therapy (ORT) opioids methadone and buprenorphine increase in prevalence
over time: Figure 7.2(c) aggregates the data shown in Figure 7.2(b), showing the prevalence of ORT
opioids (methadone and buprenorphine), other prescription opioids (e.g. oxycodone, hydrocodone etc.),
and heroin in the Forum77 population over time. While prescription opioids remain the most preva-
lent DOCs, this prevalence declines from about 70% to 56% over time, while ORT opioid prevalence
increases from approximately 19% to 28%.
Heroin prevalence increases slightly in 2013: On average, about 5% of Forum77 participants abuse
or misuse heroin until 2013, when the proportion of heroin users starts to increase noticeably, reaching
10% and still rising at the end of our data set (Figures 7.2(b) and 7.2(c)). Moreover,
Figure 7.2(a) indicates a small absolute increase in heroin users from mid-2013 onwards, indicating that
the increase illustrated in Figures 7.2(b) and 7.2(c) is not purely an artifact of normalizing by a
population in which hydrocodone and oxycodone users are declining.
7.6 Discussion
Prescription opioids are the dominant DOCs on Forum77, with their prevalence far exceeding
that measured in other drug-using populations. We suspect that this is the result of several factors.
First, users may be more receptive to seeking help anonymously online than discussing the issue with a
health care provider, since the healthcare provider may be the unwitting source of the opioids in the first
place [249]. Second, despite a robust evidence base for the medical treatment of opioid addiction [230],
few physicians have training in such treatment [263] and the condition remains highly stigmatized within
the medical community [176, 187]. Third, the more traditional self-help venues for addiction support,
namely Alcoholics Anonymous and Narcotics Anonymous, demand overcoming the stigma associated
with attending such meetings. The fact that opioid use disorders tend not to stem from recreational drug
use, which such venues are historically associated with, likely enhances this stigma. Finally, prescription
painkiller overdoses are growing at a significantly faster rate in the female population [8]. This, combined
with the fact that women are more likely than men to seek help online for health issues [37, 57, 90–92,
165], could partially account for the high prevalence of prescription opioid users on Forum77.
The scarcity of alcohol, marijuana and cocaine, the three most prevalent drugs present in the NA,
TEDS and DAWN surveys, could suggest a low number of recreational drug users in the Forum77
population. Alternatively, it is possible that Forum77 users are using alcohol and marijuana, but do not
see this use as problematic and so do not mention it. As we note in Table 7.3, the Forum77 data set
is unique in that users mention DOCs at their own discretion, and are not encouraged to disclose all
substances that they might be abusing. It is also possible that users approach different communities for
these issues: MedHelp, for example, has a separate, albeit very small, forum dedicated to alcoholism7.
Temporal trends indicate an increase in prevalence of opioid replacement therapy (ORT) opioids and
heroin, and a corresponding decline in other prescription opioids. It is possible, perhaps even likely, that
these trends reflect real-world drug usage: Cicero et al. [46] report a recent increase in heroin usage due
to oxycodone being more difficult to acquire and tamper with. In addition, survey data report a steady
increase in national buprenorphine usage [232] over time, and a slight decrease in non-medical use of
prescription opioids in the younger population [242]. While non-medical use of prescription opioids has
increased in the population of users 50 and older [242], this demographic is less prevalent online [7].
However, drawing epidemiological conclusions from these data without further study into what other
factors might be influencing these trends is ill-advised.
7.6.1 Limitations & Future Work
While our work is the first to analyze drug usage trends in an online population, several challenges
remain. Foremost is extending similar analyses to a variety of online forums. Analyzing multiple data
sources would yield more comprehensive insights, and would also help to triangulate features in PAT
that are universally useful for monitoring substance abuse trends.
Finally, a difficult but necessary challenge is to investigate whether and how drug usage trends re-
flected in PAT align with those observed in the real world. As we discussed in Chapter 2, online health
seeking populations are not necessarily representative of real-world populations. As such, understand-
ing the relationship between PAT-observed and real-world drug usage trends would be necessary prior
7 http://www.medhelp.org/forums/Alcoholism/show/158
to utilizing such data for monitoring and surveillance. In sum, however, our contributions in this chap-
ter both propose a viable methodology for automatically identifying DOCs from PAT, and lend the first
data-driven insights into drug usage in an online community.
7.7 Summary
Our goal in this chapter was to profile substance use in Forum77, and compare this to substance use
reported in traditionally surveyed drug-using populations. The ability to monitor population-level drug use
trends is valuable. Despite the popularity and uniqueness of OHCs focused on the topic of substance
abuse, however, no work to date focuses on automatically identifying users’ drugs of choice (DOCs) from
PAT. As such, our contributions – a method for automatically extracting and resolving DOCs, as well as
insights on the Forum77 population acquired through the application of this method – are both novel and
useful.
To automatically extract a user’s DOCs from her Forum77 initiating posts, we used manually-labeled
data to train a CRF classifier (§ 7.3.2 and § 7.3.3). We use a CRF classifier as the problem of identifying
DOCs is context sensitive: many commonly abused drugs are also used as legitimate treatments for
withdrawal. Our CRF classifier is highly accurate, achieving F1, Precision and Recall scores of 84.65%,
91.12% and 79.46%, respectively (§ 7.3.3). Finally, to facilitate analysis and comparison, we resolve
extracted entities (e.g. vics, benzos) to drugs (e.g. Vicodin, benzodiazepines), and drugs to categories
(e.g. opiates, sedatives) (§ 7.3.4).
To profile substance use on Forum77, we applied our method to the entire set of initiating posts
on Forum77 (78,507 posts authored by some 28,005 users), and compared our results to those from
three surveys: the Narcotics Anonymous annual membership survey, the Treatment Episode Dataset,
which surveys users in state-funded rehabilitation facilities, and the Drug Abuse Warning Network, which
collects data on substance abuse related admissions to emergency departments (§ 7.4). Our results
(§ 7.5) show that Forum77 users are disproportionately addicted to prescription opioids, while more
traditionally-abused substances, such as alcohol, marijuana and cocaine, are infrequently reported. Our
analyses of drug usage trends on Forum77 over time suggest that Forum77 may reflect real-world trends
in substance use.
Chapter 8
Quantifying Recovery and Relapse
8.1 Introduction
Despite the prevalence of online health forums for substance use disorders, we have little understanding
of the role that they play in the process of cessation. For example, when in the cycle of abuse are they
most helpful to users? As we noted in Chapter 7, most substance abuse data are collected at point-
of-care facilities. As such, online health communities (OHCs) are uniquely poised to offer quantified
answers to questions that have previously been answered only anecdotally. For example, in a cohort
of people with substance use disorders attempting recovery, what percentage relapse? Of those who
recover, how long do these recovery periods tend to last?
Our goal in this chapter is to educe patterns of relapse and recovery as they manifest on Forum77.
We begin by describing the process of prescription drug abuse cessation and related prior work (§ 8.2),
and describing the data samples used in this chapter (§ 8.3). We then make the following contributions:
A quantified taxonomy of phases of addiction as expressed by users on Forum77 (§ 8.4). Our taxon-
omy, developed in concert with an addiction specialist, is based on Prochaska’s Transtheoretical Model
(TTM) of behavior change [203], and serves both as a labeling rubric for mapping text to phases of
addiction, as well as a quantified summary of phase-based activity on Forum77. We use the taxonomy
to manually label initiating post sequences from 191 Forum77 users (2,266 posts total) with the labels
USING, WITHDRAWING or RECOVERING. We find that Forum77 is most heavily utilized when users are
WITHDRAWING.
An analysis of activity and linguistic features across the phases of addiction (§ 8.5). We identify
features that are characteristic of each phase, and leverage them to train a conditional random field
(CRF) model to automatically label users’ phases of addiction over their tenure on Forum77. Our CRF
achieves an F1-score of 67.6% against a baseline F1-score of 20%. Using CRF-labeled sequences, we
are able to identify (1) whether a user relapsed at some point during their tenure, and (2) whether a user
was RECOVERING at the time of her final initiating post, with F1-scores of 78% and 82%, respectively.
An analysis of transition, relapse and recovery based on the CRF-labeled phase sequences of 2,848
Forum77 users (32,345 posts) (§ 8.6 and § 8.7). We find that overall, progressive transitions are more
prevalent than regressive transitions. Moreover, despite the fact that relapse is common (almost half
of users relapse at some point during their tenure), the chances of a user RECOVERING by her final
post are favorable. Finally, we observe a significant correlation between high forum engagement (both
frequency of participation and volume of response posts authored) during a user’s phases of USING and
WITHDRAWING and the probability that she is RECOVERING when she leaves Forum77.
We discuss our results in the context of Forum77’s efficacy as a withdrawal aid, implications for
future forum design, and implications for addiction research (§ 8.8) before concluding (§ 8.9).
8.2 Background
To our knowledge, our work is the first to investigate the topic of prescription drug abuse cessation in
social media. Given the secretive and stigmatized nature of this condition [174, 176, 187], our contri-
bution provides a unique and often overlooked perspective on prescription drug abuse: that of patients
themselves. In this section, we provide an overview of prescription drug abuse as well as the traditional,
in-person mutual help groups Alcoholics Anonymous (AA) and Narcotics Anonymous (NA). Next, we
present work that, like ours, attempts to infer a person’s health state from her social media contributions.
For a review of literature analyzing the efficacy of OHC participation, we refer the reader to § 2.2.4.
8.2.1 The Prescription Drug Abuse Cycle
Prescription drug abuse (or “nonmedical use”) is defined as “the use of a medication without a prescrip-
tion, in a way other than prescribed, or for the experience or feelings elicited” [249]. Opioid pain relievers,
such as hydrocodone, oxycodone, morphine and codeine, are the most frequently abused prescription
medications [5]. In 2010, some 5.1 million Americans reported misusing prescription pain relievers in
the last month, followed by sedatives (2.6 million) and stimulants (1.1 million) [5].
Withdrawal
Withdrawal (or detoxification) is a painful process that is frequently compared to having a bad case of
influenza [6, 84]. Common withdrawal symptoms include agitation, anxiety, muscle aches, insomnia,
sweating, abdominal cramping, diarrhea, goose bumps, nausea and vomiting [6]. Typically, symptom
onset aligns with the first missed dose in the case of a “cold turkey” approach, or within a few days of dose
reduction in the case of a taper [84]. Symptom severity peaks within a few days of final exposure, and
gradually reduces as the user’s physical dependence on the drug weakens [84]. Withdrawal duration,
dependent on biological factors, drug and dosage levels, and withdrawal method, ranges broadly from
7-10 days (cold turkey) [102] to 20-35 days (methadone-assisted taper) [84].
Self-Detoxification
Research on easing the withdrawal process focuses primarily on medication-assisted detoxification over-
seen by a medical professional, with almost no work on the subject of self-detoxification. We found two
studies in which attendees of the same London methadone treatment facility were interviewed about
prior self-detoxification attempts. In both studies, most patients had attempted self-detoxification, and
many had made multiple attempts [102, 184]. The short-term success rate of achieving 24 hours of
abstinence per episode was 41% [184], while the medium-term success rate of achieving 10 days of
abstinence per episode was 24% [102]. The design of these studies naturally excludes patients who
successfully maintain long-term abstinence. When asked why their attempts had failed, subjects pointed to
lack of support during detoxification [102], as well as easy access to drugs and severity of withdrawal
symptoms [102,184]. Patient-reported strategies for effectively completing withdrawal include distraction
and avoidance, especially in the form of physical activity [102]. In addition, Green et al. [106] showed that
informing patients in full as to the type and severity of withdrawal symptoms that they were likely to expe-
rience resulted both in lower self-reported symptom severity scores as well as an increased probability
of completing the detoxification process.
Relapse & Recovery
Relapse rates for opioid use are high. Reported reuse statistics for individuals having gone through
detoxification programs range from 81-91% [103, 227]. However, long-term prognoses are more favor-
able, with evidence suggesting that 45-51% of patients may achieve sustained abstinence, and that
sustained abstinence is a gradual process [103].
“Recovery” is a hotly contested term in drug use disorder communities. Many align with the Alcoholics
Anonymous viewpoint that addiction is an incurable disease and, as such, an individual never fully
“recovers” from addiction [1]. Rather, users who reach sustained sobriety are referred to as being “in
recovery”. In this work, we refer to users who have overcome physical withdrawal as RECOVERING.
8.2.2 In-Person Mutual Help Groups
Alcoholics Anonymous (AA), founded in the 1930s, is one of the most utilized services for substance
use disorders in the world, with over 4 million members across 100 different societies [133]. It has also
given rise to other peer recovery groups for addiction, like Narcotics Anonymous (NA) and Gamblers
Anonymous (GA). AA and NA are almost entirely based on mutual support, even condemning the giving
of medical advice as outside the expertise of the group, instead encouraging members to see a doctor if
medical or psychiatric problems arise [133].
Three decades of accumulated evidence demonstrates that active participation in such groups for
addiction improves outcomes [155], although success rates are ill-defined and vary across studies [20].
A high participation level in AA is reported to be one of the strongest predictors for abstinence [190,
223]. For example, Pagano et al. [190] found that users who actively helped other AA members had
a relapse rate of 55%, while those who did not relapsed at a rate of 75%. Correspondingly, many of
the benefits of AA are thought to stem from the social network that it provides its members, who afford
each other support, role modeling and experiential advice [140]. Kelly et al. [141] find that through their
interactions with other AA members, users experience increased abstinence self-efficacy, increased
spirituality/religiosity and reduced negative affect. Having a sponsor is also thought to help newcomers
avoid relapse [237].
8.2.3 Inferring Health State from Social Media
The idea that social media users’ health states will be somehow reflected in the content that they con-
tribute, and that it may be possible to predict health state from these data, has captured the interest of
several researchers. De Choudhury et al. [69–71] analyze how postpartum depression (PPD) might be
reflected on both Twitter and Facebook. Using their findings, they leverage activity and linguistic fea-
tures to build models that can predict the onset of PPD from Facebook data [71]. In other social media
studies, both activity features, such as social engagement and connectivity, and linguistic features, such
as affect and writing style, have been shown to be useful indicators of depression [72, 129, 191, 208],
neuroticism [208] and post-traumatic stress disorder [118].
A related challenge is to identify a user’s current phase within a specific medical condition. Jha and
Elhadad [136] found that a combination of linguistic and activity features is helpful for identifying
cancer stages I–IV. Murnane and Counts [180] conducted an analysis of smoking cessation as reflected on
Twitter. They find that linguistic features of positive and negative sentiment, as well as social interac-
tion variables, were significant differentiators between users who relapsed and users who ceased their
smoking behavior during the time of the study. Finally, Wen and Rose use logistic regression and flex-
ible pattern matching over posts from an online cancer community to extract pre-defined events onto a
timeline [252].
8.3 Data
Typically, users present their own current substance use situation (e.g., drugs used and number of days
clean) in initiating posts. In contrast, users are liable to discuss a wide range of substance abuse
situations in response posts, including their own and the initiator’s. Accordingly, we restrict our analysis
to Forum77’s initiating posts, of which there are 78,507 authored by a total of 28,005 unique users.
Below, we describe the data sets that we use for taxonomy development, classifier training and testing,
and analysis.
Taxonomy Development: Our taxonomy development (§ 8.4) is an iterative process; for each iteration
we randomly sampled 1,000 of Forum77’s initiating posts.
Training & Testing Dataset: In § 8.4.4 we describe the importance of labeling sequences of initiating
posts rather than randomly sampled individual posts (as we did for taxonomy development). For our
labeled data set (§ 8.5.1) we randomly sample 200 users who had authored > 5 initiating posts on
Forum77, and all of their 2,266 initiating posts.
Analysis Dataset: We analyze all initiating post sequences of users who authored > 5 initiating posts
on Forum77. This totals 41,387 initiating posts authored by 2,848 users.
8.4 Exploring & Modeling Phases of Addiction
To systematically analyze phases of substance abuse in Forum77, we require both a valid taxonomy of
phases and a rubric mapping post text to these phases. Towards this aim, we derive a rubric based on
labels from the Transtheoretical Model (TTM) of behavior change, which we describe below.
8.4.1 Transtheoretical Model for Behavior Change
The Transtheoretical Model (TTM) is a framework that describes six stages of change that a per-
son traverses in order to manifest permanent behavior change. Established in 1997 by Prochaska &
Velicer [203], the TTM has been applied to a range of behaviors, from smoking cessation [75, 180, 247]
and substance abuse [175], to sustainable energy usage [123]. The intuitiveness and universal appli-
cability of the TTM make it a useful descriptive tool; however, care should be taken before utilizing it to
inform treatment or intervention [175,253].
According to the TTM, a person begins in the stage of pre-contemplation, in which she is not thinking
about initiating a behavior change. After contemplation, she moves on to preparation, in which she
makes preparations necessary to initiate a behavior change. The person then moves on to action, a
concerted and deliberate attempt to effect short-term behavior change. If successful, the person enters
a period of maintenance, in which she tries to sustain the behavior change in the long term. If successful,
the person eventually enters the stage of termination [203]. As there is considerable debate over whether
addiction is a terminable condition [1], we omit this stage for our purposes.
8.4.2 Rubric Development
In order to match Forum77 posts to TTM stages, we randomly sampled 1,000 initiating posts. Two au-
thors mapped these posts to stages in the TTM, assigning descriptive labels to emergent sub-categories
specific to the topic of addiction (e.g., tapering and cold turkey are both part of the TTM stage Action) in
the style of a General Inductive Approach [236]. We repeated this process several times, reviewing the
rubric with an addiction specialist prior to finalization. (Note: this is the same thematic analysis process
as that described in Figure 6.1 in § 6.5.)
8.4.3 A Taxonomy of the Phases of Addiction
Table 8.1 describes our resulting phase taxonomy, along with example posts (synthesized from genuine
posts to preserve user privacy) and the prevalence of each label in our final 1,000 initiating post sample.
Although descriptively interesting, several of the labels in the taxonomy (e.g., intent to quit and about to
quit) are rare. For parsimony, and to aid subsequent classification accuracy, we collapse labels into three
categories: USING, WITHDRAWING and RECOVERING. This improves inter-annotator agreement (over a
100-post, independently labeled sample) from a Cohen’s Kappa of 0.73 to 0.78.
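For readers unfamiliar with the statistic, Cohen's Kappa compares the agreement observed between two
annotators with the agreement expected by chance. The following is a minimal sketch only: the function
name `cohens_kappa` and the label lists are illustrative placeholders, not the annotated sample.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given the same label by both annotators.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of the annotators' marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_observed - p_chance) / (1 - p_chance)


# Illustrative placeholder labels (not the study's data).
annotator_1 = ["USING", "WITHDRAWING", "WITHDRAWING", "RECOVERING", "USING", "RECOVERING"]
annotator_2 = ["USING", "WITHDRAWING", "RECOVERING", "RECOVERING", "USING", "WITHDRAWING"]
kappa = cohens_kappa(annotator_1, annotator_2)
```

Collapsing rare labels tends to raise kappa when most disagreements fall between labels that end up in
the same collapsed category.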
8.4.4 Labeling People, not Posts
Moving forward, we want to analyze addiction phases at the level of individual people. Two factors that
emerged in our taxonomy development (see Table 8.1) convinced us that labeling randomly sampled
posts would be insufficient for such analyses, and that we should instead label users’ entire post se-
quences. The first was the high prevalence (9.8%) of n/a labels. These posts are often social in nature
and, taken independently, impossible to assign to a class. However, when read in the context of the
author’s previous and subsequent posts, the label is usually obvious (see Figure 8.1). The second factor
was the low prevalence of relapse labels. We noticed that while many users relapse, few announce the
fact directly. Rather, most users will mention a relapse when they are already committed to another ces-
sation attempt (e.g., about to quit or even quitting again). However, a relapse can still be observed in a
regressive sequence, such as WITHDRAWING → USING (see Figure 8.1). Based on these observations,
in the rest of this chapter we label sequences of posts.
8.5 Characterizing the Phases of Addiction
Phases of addiction coincide with distinct physiological and psychological states. In this section, we
analyze activity and linguistic features that might characterize an author’s phase on an initiating-day. We
define an initiating-day to be any day on which the user initiated a thread on Forum77. If the author
initiated multiple posts on the same day, we combine them for analysis. Our goal is two-fold: (1) to characterize phases
of addiction as they are expressed on Forum77, and (2) to identify discriminative features that might be
used for classification.
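The initiating-day unit described above can be sketched as follows. This is a hedged illustration: the
`(timestamp, text)` input layout, the function name `group_initiating_days` and the example posts are
assumptions made for the sketch.

```python
from collections import defaultdict
from datetime import datetime


def group_initiating_days(posts):
    """Merge one user's initiating posts into a single text per calendar day.

    `posts` is a list of (timestamp, text) pairs (an assumed input layout).
    """
    by_day = defaultdict(list)
    for timestamp, text in sorted(posts, key=lambda post: post[0]):
        by_day[timestamp.date()].append(text)
    # Days are returned in chronological order, with same-day posts concatenated.
    return [(day, "\n".join(texts)) for day, texts in sorted(by_day.items())]


# Hypothetical posts from a single user.
posts = [
    (datetime(2014, 3, 2, 21, 15), "Day 4 off vics today."),
    (datetime(2014, 3, 2, 9, 30), "Could not sleep last night."),
    (datetime(2014, 3, 5, 8, 0), "6 days today."),
]
initiating_days = group_initiating_days(posts)
```

Here the two posts from 2 March collapse into one initiating-day, so the user contributes two units of
analysis rather than three.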
Table 8.1: Addiction Phase Taxonomy derived via a thematic analysis.
Final Category | TTM Phase | Label | Description | Synthesized Example | %
USING | Pre-contemplation | Using | Subject is using substances and demonstrates no intention to quit. | it has been forever since I’ve been here and not much has changed. I am still using the prescribed amount of oxycodone for neck pain. | 3.1
USING | Pre-contemplation | Addicted | Subject is using substances and indicates that she is addicted, but demonstrates no intent to quit. | my girlfriend and i r both addicted to percs but she is taking way more than me and keeps getting chest pain once every other week. | 7.4
USING | Pre-contemplation | Relapse | Subject has used substances again after an attempt to quit. | I just messed up majorly. I was 6 days clean, doing OK-ish, when my mother stopped by with 10 Vics “incase I needed them”. Of course, being the WEAK person I am, I took them all right there. | 1.3
USING | Contemplation | Intent to quit | Subject expresses desire to stop abusing a substance in the future. | I want off roxies. is methadone the answer. I need to work daily. I cannot do withdrawls. PLEASE HELP! | 9.3
USING | Preparation | About to quit | Subject notes time and/or plan (e.g., tapering schedule) to quit. | i was planning to quit the first week of March. True to form addict fashion I’m out of both money and pills. So I’m about to go ct now instead of next week when I’d planned. | 2.5
WITHDRAWING | Action | Quitting | Subject is in withdrawal; method unspecified. | Today is my 5th day of FREEDOM! I havent experienced any w/ds yet. So much energy. | 39.1
WITHDRAWING | Action | Tapering | Subject is in withdrawal; detoxification method is a taper. | Have some Vics I am taking. I am down to 6 a day. I plan to go down to 3 a day then 1 a day until I am done! | 6.4
WITHDRAWING | Action | Cold Turkey | Subject is in withdrawal; detoxification method is cold turkey. | I am on day 6 of CT from 150mg+ a day of ocycodone. I’m doing fine just some overall anxiousness | 3.3
RECOVERING | Maintenance | In recovery | Subject has finished detoxing; no physical withdrawal symptoms expressed. | Just an update to tell you that I have 67 clean days today. I feel amazing. I sleep well now and feel good! I’ve had a lot of discussions about aftercare. | 17.8
- | - | n/a | Impossible to determine status based on post. | I’ve been away for few days and everything seems different. Anyway I hope everyone is doing great. | 9.8
8.5.1 Sample & Labeling
To study how addiction phase sequences change over time, we restrict our analysis to users who have
initiated at least 5 threads on Forum77 (n=2,848 out of 29,196 users who initiated at least one post).
Of these, we randomly sampled 200 users (∼7% of the full 2,848) and all of their initiating posts. We
discarded 9 users from the sample: two who had authored more than 100 posts, one account that
belonged to MedHelp, and six accounts for which there was no clear ownership (several different people
appeared to be using the same MedHelp account). The resulting sample contains 2,266 initiating posts
(average 11.9 posts per user) and comprises ∼5.5% of the full 41,387 initiating posts authored by the
2,848 users who have authored ≥ 5 posts on the forum.
Figure 8.1 (user timeline from first post to last post; legend: Absence, USING, WITHD., RECOV.;
annotation: Relapse): Illustration of how sequence analysis can (1) reduce NA labels by leveraging
context from surrounding posts, and (2) capture relapse events in regressive sequences without
requiring the user to explicitly state that she relapsed.
Two authors categorized each initiating post in the sample using the taxonomy presented in Table 8.1.
We labeled each user’s data in chronological order so as to transfer context learned from surrounding
labels. Disagreements (which were rare) were relabeled based on a consensus reached after discussion.
8.5.2 Activity Features
We identify 15 activity characteristics that describe an initiator’s global activity over time, her local activity
5 days prior to the initiating-day in question, and both the initiator’s and respondents’ activity on the
initiating-day. The features capture user activity volume (e.g., number of posts initiated in the last 5
days), engagement (e.g., days elapsed since last response to another user) and attention (e.g., number
of unique respondents to a user’s initiating post on the initiating-day). For a full description of all features,
as well as summary statistics of their distributions across each class, we refer the reader to Table D.3.
8.5.3 Linguistic & Content Features
LIWC Features
Differences in word use and linguistic style are believed to reveal a range of information about people,
from psychological state to social identity [196]. The Linguistic Inquiry and Word Count (LIWC) [195]
software calculates 80 linguistic variables over text. In prior work, LIWC has been used to characterize
and distinguish women suffering from Post-Partum Depression (PPD) [71], individuals at risk for depres-
sion [72] and smokers on Twitter who are at risk for relapse [180]. We calculate all 80 LIWC variables
over initiating post text as well as over all responses received on the initiating-day. We then examine
differences in these variables across the USING, WITHDRAWING and RECOVERING phases (Tables D.1
& D.2).
Days Mentioned and Question Features
In addition to the LIWC features, we calculate three variables over initiating post text. Users frequently
mention how long they have been clean at the time of posting. We extract days clean automatically by
using handwritten patterns, such as “clean X days” and “X weeks off”, where X represents a number.
We convert X to days if necessary. We also use a more relaxed version of this feature, called days
mentioned, in which we do not require the user to explicitly mention terms like “clean” or “off”. Finally,
we count the number of questions asked by identifying sentences that start with a question word and/or
end with a question mark. This feature has proved helpful in prior work [71]. We find that including these
three extra features improves classifier performance by ∼2.2%.
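The three content features can be approximated with a few regular expressions. The sketch below is
illustrative only: the specific patterns, the question-word list, the month and year conversions, and
the function names are all assumptions, not the handwritten patterns used for the classifier.

```python
import re

# Approximate unit conversions (month and year lengths are assumptions of this sketch).
UNIT_DAYS = {"day": 1, "week": 7, "month": 30, "year": 365}

# Strict patterns require a sobriety cue such as "clean" or "off".
CLEAN_PATTERNS = [
    re.compile(r"\bclean\s+(?:for\s+)?(\d+)\s+(day|week|month|year)s?\b", re.I),
    re.compile(r"\b(\d+)\s+(day|week|month|year)s?\s+(?:clean|off)\b", re.I),
]
# Relaxed pattern: any "<number> <time unit>" mention.
RELAXED_PATTERN = re.compile(r"\b(\d+)\s+(day|week|month|year)s?\b", re.I)
# Hypothetical question-word list.
QUESTION_WORDS = {"who", "what", "when", "where", "why", "how",
                  "is", "can", "should", "does", "do", "will"}


def _to_days(match):
    return int(match.group(1)) * UNIT_DAYS[match.group(2).lower()]


def days_clean(text):
    """Days clean from the first strict pattern that matches, else None."""
    for pattern in CLEAN_PATTERNS:
        match = pattern.search(text)
        if match:
            return _to_days(match)
    return None


def days_mentioned(text):
    """Days from the first '<number> <unit>' mention, with no cue word required."""
    match = RELAXED_PATTERN.search(text)
    return _to_days(match) if match else None


def count_questions(text):
    """Count sentences that start with a question word and/or end with '?'."""
    count = 0
    for sentence in re.findall(r"[^.!?]+[.!?]*", text):
        stripped = sentence.strip()
        first_word = re.match(r"[A-Za-z']+", stripped)
        starts_with_q = bool(first_word) and first_word.group(0).lower() in QUESTION_WORDS
        if stripped.endswith("?") or starts_with_q:
            count += 1
    return count
```

For instance, `days_clean("I was clean for 4 months before")` converts the match to 120 days under the
30-day month assumption, which also illustrates why USING posts can carry large days clean values.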
Phase-Specific Term Features
Finally, we count how many phase-specific words occur in both initiating post text and response
text. To determine whether a term t is particularly descriptive of a phase p, we calculate its
frequency-based odds ratio. If f_p(t) is the number of posts of phase p that contain t, \bar{t} denotes
posts that do not contain t, and \bar{p} denotes posts of any phase other than p, then:
$$\mathrm{OR}(t, p) = \frac{f_p(t) \, f_{\bar{p}}(\bar{t})}{f_p(\bar{t}) \, f_{\bar{p}}(t)}$$
The odds ratio is a measure of strength of association. We calculate the odds ratio for each term
across each phase, and retain terms with an odds ratio >2. Table 8.2 shows sample terms for both
initiating and response posts.
Table 8.2: Sample phase-specific terms for the USING, WITHDRAWING and RECOVERING categories.
Phase | Initiating Posts | Response Posts
USING | withdrawls, wants, hate, addicted, scared, tried, stop | situation, willing, treatment, withdrawl, option, advise, rehab, counseling
WITHDRAWING | rls, hot, restless, aches, slept, arms, legs, headache, wd, worst, stomach, tramadol | potassium, heating, fluids, baths, pad, showers, legs, melatonin, hot, slept, bananas
RECOVERING | craving, recovery, lately, sober, fight, truly, clean, cravings, true, worth | inspiration, accomplishment, congratulations, sharing, thank, miss, proud, paws
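The term-selection step in § 8.5.3 can be sketched with the standard 2×2 contingency form of the odds
ratio (posts in or out of the phase, with or without the term). This is a hedged illustration: the
additive `smoothing` parameter, the function name and the toy posts are assumptions of the sketch.

```python
def odds_ratio(term, phase, posts, smoothing=0.5):
    """Frequency-based odds ratio of `term` for `phase`.

    `posts` is a list of (phase_label, set_of_terms) pairs. The additive
    `smoothing` guards against division by zero; it is an assumption of
    this sketch and is not described in the text.
    """
    in_with = in_without = out_with = out_without = 0
    for label, terms in posts:
        has_term = term in terms
        if label == phase:
            in_with += has_term
            in_without += not has_term
        else:
            out_with += has_term
            out_without += not has_term
    numerator = (in_with + smoothing) * (out_without + smoothing)
    denominator = (in_without + smoothing) * (out_with + smoothing)
    return numerator / denominator


# Toy posts (hypothetical), each reduced to its set of terms.
toy_posts = [
    ("WITHDRAWING", {"restless", "legs"}),
    ("WITHDRAWING", {"restless"}),
    ("WITHDRAWING", {"restless", "hot"}),
    ("WITHDRAWING", {"slept"}),
    ("USING", {"restless"}),
    ("USING", {"scared"}),
    ("RECOVERING", {"clean"}),
    ("RECOVERING", {"sober"}),
]
ratio = odds_ratio("restless", "WITHDRAWING", toy_posts, smoothing=0)
keep_term = ratio > 2  # retention threshold stated in the text
```

In the toy data, "restless" appears in three of four WITHDRAWING posts and one of four other posts, so
it clears the threshold of 2.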
8.5.4 Results: Activity and Linguistic Features
We present linguistic features over initiating posts in Table D.1, linguistic features over response posts in
Table D.2, and activity features in Table D.3. Unless otherwise mentioned, we use Kruskal-Wallis tests
to assess statistical significance. A non-parametric test is appropriate for data that are not expected to
follow a normal distribution (such as ours), and a Kruskal-Wallis test determines whether at least one of
the three phase distributions differs significantly from the others.
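As a rough illustration of the test, the sketch below computes the tie-corrected Kruskal-Wallis H
statistic for three groups in plain Python. It relies on the chi-square approximation, whose tail
probability for 2 degrees of freedom is exactly exp(-H/2); that approximation is coarse for very small
samples. The function name and the feature values are hypothetical.

```python
import math


def kruskal_wallis(*groups):
    """Tie-corrected Kruskal-Wallis H and chi-square approximation p-value.

    The closed form exp(-H / 2) is the chi-square tail probability for
    2 degrees of freedom, so this sketch supports exactly three groups.
    """
    if len(groups) != 3 or any(len(group) == 0 for group in groups):
        raise ValueError("this sketch expects exactly three non-empty groups")
    pooled = sorted((value, idx) for idx, group in enumerate(groups) for value in group)
    n = len(pooled)
    rank_sums = [0.0, 0.0, 0.0]
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        average_rank = (i + 1 + j) / 2  # mean of the 1-based ranks i+1 .. j
        for k in range(i, j):
            rank_sums[pooled[k][1]] += average_rank
        ties = j - i
        tie_term += ties ** 3 - ties
        i = j
    h = 12 / (n * (n + 1)) * sum(r * r / len(g) for r, g in zip(rank_sums, groups)) - 3 * (n + 1)
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        return 0.0, 1.0  # every value is identical
    h /= correction
    return h, math.exp(-h / 2)


# Hypothetical "days since last activity" values for three phases.
using = [40, 55, 38, 61]
withdrawing = [18, 12, 20, 25]
recovering = [17, 22, 15, 19]
h_stat, p_value = kruskal_wallis(using, withdrawing, recovering)
```

A small p-value indicates only that some phase differs; identifying which one requires follow-up
pairwise comparisons.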
Our feature analysis indicates that both users’ activity and users’ content and linguistic characteristics
differ measurably across addiction phases. We discuss particularly descriptive features of each phase
below.
USING: This phase is characterized by long absences from the forum and, correspondingly, low levels
of recent activity. Users who are USING have, on average, been absent from forum participation in all
capacities for more than twice as long as users who are WITHDRAWING or RECOVERING (40 vs. ∼18
days since last activity). A longer absence from the forum may partially explain why USING posts are, on
average, longer (208 vs. ∼180 words): users must account for lost time and bring their audience back
up to speed.
Both days clean and days mentioned vary widely in USING posts, and have surprisingly high median
values. Examining the underlying data provides an explanation: users who are USING often mention how
long they had been clean prior to relapse in statements such as, “I was clean for 4 months before...” or
“I would have had 717 days clean today”.
Finally, USING posts offer the lowest levels of positive affect (16% less than WITHDRAWING and 32%
less than RECOVERING), and the highest levels of discussion around the topic of health (16% more
than WITHDRAWING and 36% more than RECOVERING); characteristics that are mirrored in responses to
USING posts. The lack of positivity resonates with the fact that users who are USING have either relapsed
or failed to progress towards recovery.
WITHDRAWING: In recent activity, users who are WITHDRAWING issue more initiating posts and self
responses than those who are USING or RECOVERING. In addition, they have the smallest average
number of days since last initiating post (21 vs. 31 RECOVERING and 50 USING) and days since last
self-response (29 vs. 42 RECOVERING and 66 USING).
As we might expect, WITHDRAWING users express the lowest numbers of days clean and days men-
tioned. In addition there is a great deal more language about feeling, biological processes and the body.
These observations align with the nature of detoxification as an uncomfortable physical process from
which people constantly seek relief [84].
Responses to WITHDRAWING posts are not particularly distinctive. Aside from expressing slightly
more anxiety, and writing slightly more about feeling and the body, other linguistic variables tend to take
on a value somewhere in between those of responses to USING and RECOVERING. It is possible that
respondents try to influence users from one side of the spectrum to the other, modifying their language
according to the user’s progress.
RECOVERING: These users are highly active, especially in the area of responding to other people’s
posts. In recent activity they issue, on average, 15.2 responses to other people’s threads, compared to
5.5 by users who are WITHDRAWING and 1.9 by users who are USING. Moreover, unlike WITHDRAWING
and USING users, their ratio of initiating posts to responses authored tends to be <1.
Linguistic features also suggest that RECOVERING users tend to focus on others. The pronoun you
is used almost 100% more while the I pronoun is used less, and language is more social. Moreover,
users express significantly more positive affect (25% more than WITHDRAWING, 48% more than USING)
and less anxiety (18% less than WITHDRAWING, 16% less than USING). The evident outward focus of
initiating posts from RECOVERING users resonates with the 12th step in traditional twelve-step programs
such as AA, which encourage people to strengthen their sobriety by using their experiences to help
others achieve it [1].
Responses to RECOVERING posts are distinct in that they express substantially more positive affect
(27% more than responses to WITHDRAWING, 57% more than responses to USING). They also tend to
host a notable quantity of exclamation marks (100% more than WITHDRAWING, 350% more than USING).
Inspection reveals that this is an expression of excitement and encouragement in response to good
news, for example, “hoooooorrrraaaahhhhh!!!!!!!!!” and “I am so PROUD of YOU!!!!!”.
8.6 Automatically Classifying Addiction Phase
Informed by our feature analysis, we next train a statistical classifier to automatically label Forum77 posts
as USING, WITHDRAWING or RECOVERING. Analyses of phase sequences can give insight into events
such as relapse and recovery. Our classifier allows us to scale such analyses to the entire Forum77 data
set. Below, we describe our classifier and report its performance. We discuss relapse and recovery in
§ 8.7.
8.6.1 Model & Features
A user’s path through the different phases of addiction forms a natural sequence. A conditional random
field (CRF) [151] is a probabilistic graphical model that performs inference over sequences, rather than
individual data points. By taking into account prior and subsequent data items in a sequence, CRFs
are context sensitive. For example, unlike a CRF, a non-sequence-based classifier might have difficulty
classifying a post like, “I’ve been away for a few days and everything seems different. Anyway I hope
everyone is doing great...”, even if it was sandwiched between two posts that were obviously USING, as
the post itself contains no clues as to the user’s phase.
Accordingly, we train a 3-class CRF to annotate a user’s sequence of initiating-days with the labels
USING, WITHDRAWING or RECOVERING. We use a version of the Stanford Named Entity Recognizer
package, a trainable Java implementation of a CRF classifier¹, adapted to analyze sequences of
documents (the default unit of analysis is a token). Tables D.1, D.2 and D.3 indicate the subset of
features that we used for classifier training. We selected features based on apparent discriminability
and iterative evaluation through 10-fold cross validation. In order to improve robustness and model
potentially non-linear responses, we binned numeric features into octiles: ranks that divide the data
evenly into 8 groups. While using quartiles is arguably more common in standard practice, we found that
using octiles improved classifier performance.
¹ http://nlp.stanford.edu/software/CRF-NER.shtml
Table 8.3: CRF performance scores aggregated over 10 runs of 10-fold cross validation, with randomly
shuffled input sets.
Label | Precision | Recall | F1 score | Accuracy
Combined | 68.3 | 68.0 | 67.6 | 69.8
USING | 62.4 | 61.7 | 61.4 | -
WITHDRAWING | 70.6 | 71.9 | 70.9 | -
RECOVERING | 72.1 | 71.2 | 70.9 | -
Baseline | 14.0 | 33.0 | 20.0 | 43.0
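Octile binning of a numeric feature can be sketched as below. This is a minimal illustration, assuming
cut points from Python's `statistics.quantiles` (other quantile conventions are equally plausible); the
function names and the feature values are hypothetical.

```python
from bisect import bisect_right
from statistics import quantiles


def fit_octile_edges(values):
    """Seven cut points that split the values into 8 roughly equal-sized groups."""
    return quantiles(values, n=8)


def to_octile(value, edges):
    """Map a numeric feature value to a bin index in 0..7."""
    return bisect_right(edges, value)


# Hypothetical feature values, e.g. word counts of initiating posts.
word_counts = list(range(1, 17))
edges = fit_octile_edges(word_counts)
binned = [to_octile(value, edges) for value in word_counts]
```

Values outside the range seen when fitting the edges fall into the first or last bin, which caps the
influence of extreme values on the classifier.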
8.6.2 Performance
Table 8.3 shows precision, recall and F1 scores for the CRF classifier. Our classifier achieves an F1
score of 67.6% against a baseline F1 score of 20.0%, acquired by labeling each instance with the
majority class, WITHDRAWING.
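One reading that reproduces the baseline figures is macro-averaging over the three classes when every
instance is labeled with the majority class. The sketch below assumes that reading and a hypothetical
class mix in which the majority class makes up 43% of instances; it is not the evaluation code.

```python
def macro_scores(gold, predicted, labels):
    """Macro-averaged precision, recall and F1, plus overall accuracy."""
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, predicted) if g == p == label)
        n_predicted = sum(1 for p in predicted if p == label)
        n_gold = sum(1 for g in gold if g == label)
        precision = tp / n_predicted if n_predicted else 0.0
        recall = tp / n_gold if n_gold else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    accuracy = sum(1 for g, p in zip(gold, predicted) if g == p) / len(gold)
    k = len(labels)
    return sum(precisions) / k, sum(recalls) / k, sum(f1s) / k, accuracy


LABELS = ["USING", "WITHDRAWING", "RECOVERING"]
# Hypothetical class mix: the majority class accounts for 43% of instances.
gold = ["USING"] * 23 + ["WITHDRAWING"] * 43 + ["RECOVERING"] * 34
majority = max(LABELS, key=gold.count)
precision, recall, f1, accuracy = macro_scores(gold, [majority] * len(gold), LABELS)
```

Under these assumptions the baseline scores come out near 0.14, 0.33, 0.20 and 0.43, because the two
minority classes contribute zero precision, recall and F1 to the averages.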
It is useful to know which labels the CRF is likely to confuse. Figure 8.2 shows the CRF classifier’s
confusion matrix. Diagonal entries indicate counts of correctly-classified instances. The strong diagonal
indicates a relatively high level of accuracy. Most classification errors occur between adjacent phases:
confusions between USING and WITHDRAWING and between WITHDRAWING and RECOVERING are common, but
confusions between USING and RECOVERING are less so. This resonates with a point prevalent in the
addiction literature: stages of recovery are not black and white but rather fall on a spectrum [79,168].
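The adjacent-phase pattern can be checked directly from the cell values of Figure 8.2. The sketch below
is illustrative: the helper names are hypothetical, and both quantities it computes are symmetric, so
they do not depend on which axis holds the gold labels.

```python
LABELS = ["USING", "WITHDRAWING", "RECOVERING"]
# Cell values as printed in Figure 8.2 (aggregated over cross-validation runs).
CONFUSION = [
    [327.2, 131.8, 62.2],
    [150.2, 686.9, 142.7],
    [52.2, 139.8, 560.2],
]


def accuracy(matrix):
    """Share of the total mass that lies on the diagonal."""
    total = sum(sum(row) for row in matrix)
    return sum(matrix[i][i] for i in range(len(matrix))) / total


def pairwise_confusion(matrix, labels):
    """Total off-diagonal mass for each unordered pair of labels."""
    pairs = {}
    for i in range(len(labels)):
        for j in range(i + 1, len(labels)):
            pairs[(labels[i], labels[j])] = matrix[i][j] + matrix[j][i]
    return pairs


overall_accuracy = accuracy(CONFUSION)
confusions = pairwise_confusion(CONFUSION, LABELS)
```

The two adjacent pairs each account for roughly 282 confused instances, against roughly 114 for the
USING and RECOVERING pair.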
8.6.3 Results
We analyze the result of applying our CRF classifier to all Forum77 members
who have initiated > 5 posts (2,848 users, 32,345 initiating posts). Our results give us insight into
common transitions between addiction phases, enabling us to answer questions such as, “If a user is
WITHDRAWING today, how likely is it that she will be RECOVERING on her next initiating-day?” and “What
is the most frequent phase change observed on Forum77?”
Figure 8.2: Confusion matrix for our CRF classifier aggregated across 10 randomized runs of 10-fold
cross validation. Rows are CRF labels; columns are gold labels.
CRF \ Gold | Using | Withd. | Recov.
Using | 327.2 | 131.8 | 62.2
Withd. | 150.2 | 686.9 | 142.7
Recov. | 52.2 | 139.8 | 560.2
Figure 8.3(a) shows the normalized transition frequency matrix for USING, WITHDRAWING and RE-
COVERING. The most common transitions lie along the diagonal, indicating that users typically initiate
consecutive posts in any one phase. Self-transitions aside, the progressive edges between consecutive
stages (USING→ WITHDRAWING and WITHDRAWING→ RECOVERING) are the most common, accounting
for approximately 6% and 5.2% of total transitions, respectively. In contrast, regressive edges between
consecutive stages (WITHDRAWING → USING and RECOVERING → WITHDRAWING) are less common,
accounting for 2.6% and 1.1% of total transitions, respectively.
Figure 8.3(b) shows conditional transition probabilities across states. The likelihood of a same-
state transition increases with the progressiveness of the state. For example, there is a 71% chance
that a USING user will be USING in her next post, an 81% chance that a WITHDRAWING user will be
WITHDRAWING in her next post, and a 91% chance that a RECOVERING user will be RECOVERING in her
next post.
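The two views of the transition data are related by row normalization: dividing each normalized
frequency by its source-state total yields the conditional probability. The sketch below is
illustrative; the function name and the toy sequences are hypothetical, while `FIG_A` holds the values
printed in Figure 8.3(a).

```python
from collections import Counter


def transition_tables(sequences):
    """Return (normalized frequencies, conditional probabilities), keyed by (source, target)."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))  # consecutive initiating-days
    total = sum(counts.values())
    source_totals = Counter()
    for (source, _target), count in counts.items():
        source_totals[source] += count
    normalized = {pair: count / total for pair, count in counts.items()}
    conditional = {pair: count / source_totals[pair[0]] for pair, count in counts.items()}
    return normalized, conditional


# Toy labeled sequences (hypothetical users).
toy_sequences = [
    ["USING", "WITHDRAWING", "WITHDRAWING", "RECOVERING"],
    ["WITHDRAWING", "USING", "WITHDRAWING"],
]
normalized, conditional = transition_tables(toy_sequences)

# Row-normalizing the percentages of Figure 8.3(a) gives the values of Figure 8.3(b).
FIG_A = [
    [17.35, 6.04, 1.12],
    [2.56, 33.85, 5.23],
    [1.78, 1.11, 30.96],
]
row_normalized = [[100 * value / sum(row) for value in row] for row in FIG_A]
```

For example, the first row of `row_normalized` is approximately 70.79, 24.64 and 4.57, matching the
USING row of the conditional probabilities.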
Figure 8.4 shows the distributions of phase length in days for each phase. We calculate phase
length as the number of days between the first and last post in a contiguous sequence. The typical
WITHDRAWING phase lengths align well with those reported in the literature on addiction, which suggests
a 7–35 day duration depending on the detoxification method used, as well as other factors [84,102].
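The phase-length definition above can be sketched as a run-length pass over a user's labeled
initiating-days. The `(date, phase)` input layout, the function name and the example dates are
assumptions made for illustration.

```python
from datetime import date
from itertools import groupby


def phase_lengths(labeled_days):
    """Length in days of each contiguous same-phase run.

    `labeled_days` is a chronologically ordered list of (date, phase) pairs.
    """
    lengths = []
    for phase, run in groupby(labeled_days, key=lambda item: item[1]):
        run = list(run)
        # Days between the first and last post of the contiguous run.
        lengths.append((phase, (run[-1][0] - run[0][0]).days))
    return lengths


# Hypothetical labeled sequence for one user.
labeled_days = [
    (date(2014, 1, 1), "USING"),
    (date(2014, 1, 3), "WITHDRAWING"),
    (date(2014, 1, 6), "WITHDRAWING"),
    (date(2014, 1, 12), "WITHDRAWING"),
    (date(2014, 2, 1), "RECOVERING"),
]
lengths = phase_lengths(labeled_days)
```

Under this definition a phase observed on a single initiating-day has length 0, so observed lengths are
lower bounds on how long the user actually spent in the phase.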
[Figure 8.3 image omitted. Two 3×3 matrices over Using, Withd. and Recov.; rows are the source state and columns the target state.
(a) Normalized transition frequencies (% of all transitions):
            Using   Withd.  Recov.
  Using     17.35    6.04    1.12
  Withd.     2.56   33.85    5.23
  Recov.     1.78    1.11   30.96
(b) Conditional transition probabilities (%):
            Using   Withd.  Recov.
  Using     70.79   24.64    4.57
  Withd.     6.15   81.29   12.56
  Recov.     5.26    3.28   91.46]
Figure 8.3: (a) Normalized transition frequencies between addiction phases (e.g., USING → RECOVERING edges comprise 1.12% of the total transitions in the CRF-labeled data) and (b) conditional transition probabilities (e.g., the probability of a user moving from USING to RECOVERING is 4.57%).
8.7 Automatically Classifying Relapse and Recovery
Relapse and recovery are critical events in the process of addiction that are often viewed as “failure”
or “success”. Prior work in the addiction literature suggests that recovery is a long, iterative process
of which relapse is a part [103]. Leveraging our CRF classifier, we present methods for identifying (1)
if a user has relapsed during her tenure on the forum, and (2) if a user is RECOVERING on her last
initiating-day on Forum77. We then investigate if relapse adversely correlates with a user’s chance of
RECOVERING. Finally, we identify activity features during USING and WITHDRAWING phases that discrim-
inate between users who wrote their final post on Forum77 in a state of RECOVERING, and those who
did not.
8.7.1 Identifying Relapse
To identify a relapse incident, we codify three regressive transition patterns:
RECOVERING→ { WITHDRAWING, USING }
WITHDRAWING→ USING
WITHDRAWING→ (45+ days absent)→ WITHDRAWING
[Figure 8.4 image omitted. Three distribution plots, one per panel: USING phase length (days), WITHDRAWING phase length (days), RECOVERING phase length (days). Legend: Median; Q1 – Q3; (1.5 * IQR) within Q1, Q3.]
Figure 8.4: Distributions of phase lengths. A red bar indicates the median value, while the dark blue region indicates the middle spread. The light blue region indicates values that fall within 1.5 × the interquartile range of the middle spread.
This last pattern is based on the observation that a general window for withdrawal duration is 7-35
days [84, 103]. As such, if a user was absent for more than 45 days, and then returned in a state of
WITHDRAWING, it is likely that she failed in her initial attempt and has restarted. While it is possible that
this pattern will capture individuals on a slow taper, in our experience it is unlikely that such users would
be inactive for a full 45 days.
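The three regressive patterns can be sketched as a single pass over a user's labeled sequence. This is an illustrative sketch only: the (date, phase) data layout and the function name are hypothetical. The pattern is written as "45+ days" while the prose says "more than 45 days"; the sketch treats a gap of exactly 45 days as an absence.

```python
from datetime import date

# Phases ordered from least to most progressed.
ORDER = {"USING": 0, "WITHDRAWING": 1, "RECOVERING": 2}


def has_relapse(posts, absence_days=45):
    """Return True if any regressive pattern occurs in a user's sequence.

    `posts` is a chronologically sorted list of (date, phase) pairs.
    """
    for (prev_day, prev_phase), (day, phase) in zip(posts, posts[1:]):
        # Patterns 1 and 2: any move to an earlier phase
        # (RECOVERING -> WITHDRAWING or USING, WITHDRAWING -> USING).
        if ORDER[phase] < ORDER[prev_phase]:
            return True
        # Pattern 3: WITHDRAWING, a long absence, then WITHDRAWING again.
        gap = (day - prev_day).days
        if prev_phase == phase == "WITHDRAWING" and gap >= absence_days:
            return True
    return False


# Hypothetical sequences.
steady = [(date(2014, 1, 1), "USING"), (date(2014, 1, 5), "WITHDRAWING"),
          (date(2014, 1, 20), "RECOVERING")]
restarted = [(date(2014, 1, 1), "WITHDRAWING"), (date(2014, 3, 15), "WITHDRAWING")]
print(has_relapse(steady), has_relapse(restarted))
```

Because the check is applied to consecutive initiating-days only, a single mislabeled post can create or hide a relapse, which is one reason to evaluate it on both gold and CRF-labeled sequences.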
We identify whether a user relapsed or not during her tenure on Forum77 by testing whether any of
the above patterns exist in her sequence of phase transitions. To evaluate the efficacy of this approach,
we apply it to both the gold label sequences as well as the CRF-labeled sequences in our labeled sample
data set. Using this technique, we achieve an F1-score of 78% and accuracy of 78% in identifying
Relapse and No relapse, compared to baseline scores of 33.9% and 51.3% if we labeled each user with
the majority class, No relapse (Table 8.4).

Table 8.4: Performance for identifying relapse events (top) and whether a user’s final state is RECOVERING (bottom). Scores across classes are given in the Combined rows.

Identifying a relapse event
Label         Precision  Recall  F1 score  Accuracy
Combined          79.92   78.18     78.04     78.42
Relapse           86.11   66.67     75.15
No relapse        73.73   89.69     80.93
Baseline          25.65   50.00     33.91     51.30

Identifying final initiating post phase
Label         Precision  Recall  F1 score  Accuracy
Combined          81.47   81.52     81.49     81.57
RECOVERING        79.78   80.68     80.23
¬RECOVERING       83.17   82.35     82.76
Baseline          26.84   50.00     34.93     53.40
8.7.2 Identifying Recovery
To identify whether a user was RECOVERING when she last initiated a post on Forum77, we simply
examine the final phase label in her transition sequence. Using the CRF-labeled sequences, we classify
a user’s last post as RECOVERING or ¬RECOVERING with an F1-score of 81.5% and accuracy of 81.6%;
the comparative baselines are 34.9% and 53.4%, in which all last posts are labeled as ¬RECOVERING
(Table 8.4).
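The Combined rows of Table 8.4 appear consistent with macro-averaging (an unweighted mean over the two classes), and the relapse Baseline row with labeling every user as the majority class. The sketch below illustrates both calculations under that assumption, using the relapse figures reported in Table 8.4; the function names are hypothetical.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0 when both are 0)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(values):
    """Unweighted mean across classes."""
    return sum(values) / len(values)


def majority_baseline(majority_share):
    """Macro precision, recall and F1 (in %) when every item gets the majority label.

    The majority class has precision equal to its share and recall of 100;
    the other class is never predicted, so its scores are taken as 0.
    """
    precisions = [majority_share, 0.0]
    recalls = [100.0, 0.0]
    f1s = [f1(p, r) for p, r in zip(precisions, recalls)]
    return macro_average(precisions), macro_average(recalls), macro_average(f1s)


# Per-class (precision, recall) for Relapse and No relapse, from Table 8.4.
per_class = [(86.11, 66.67), (73.73, 89.69)]
combined_f1 = macro_average([f1(p, r) for p, r in per_class])
baseline = majority_baseline(51.30)
print(round(combined_f1, 2), [round(x, 2) for x in baseline])
```

With these inputs the sketch gives a combined F1 of 78.04 and baseline precision, recall and F1 of 25.65, 50.0 and 33.91, matching the relapse rows of the table.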
8.7.3 Results
Using the methods described above, we identify users who are RECOVERING at the time of their last
initiating post on Forum77, as well as users who have relapsed at least once during their tenure on
Forum77. We apply this analysis to the entirety of the Forum77 membership base who have initiated >
5 posts (2,848 users, 32,345 initiating posts).
[Figure 8.5 image omitted. Sankey diagram from first post to last post. First post: Using 48%, Withd. 44%. Relapse 48%, No relapse 52%. Last post: Using 17%, Withd. 37%, Recov. 46%.]
Figure 8.5: Aggregated user transitions from start to end state. Bar widths denote population proportion. For example, 48% of users in our sample relapsed during their tenure on Forum77.
Do users tend to recover on Forum77? Overall, users progress towards recovery during their tenure.
Figure 8.5 shows the distribution over start state, relapse, and end state for the 2,848 users described
above. Most users first initiate contact on the forum when they are USING (48%), followed by
WITHDRAWING (44%). In contrast, only 17% of users are USING by the time of their last post, while 37% are
WITHDRAWING and 46% are RECOVERING.
Does relapsing hurt recovery likelihood? Roughly half of users experience a relapse during their
tenure. Users who experience no relapse are significantly more likely to end in RECOVERING than users
who relapse (53.4% vs. 44.4% end in RECOVERING, χ²(1) = 55.1, p < 0.001). Despite this, RECOVERING
is still the most likely end state for Forum77 users who relapse.
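A comparison of this kind can be made with a Pearson chi-square test on a 2×2 contingency table (relapse status by end state). The sketch below shows only the form of the calculation: the counts are made up for illustration and are not the Forum77 data, the function name is hypothetical, and the text does not say whether a continuity correction was applied (none is used here).

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic (1 degree of freedom) for a 2x2 table.

    `table` is [[a, b], [c, d]] with rows as groups and columns as outcomes.
    No continuity correction is applied.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical counts, NOT taken from the dissertation:
# rows = (no relapse, relapse); columns = (RECOVERING, not RECOVERING).
hypothetical = [[60, 40], [45, 55]]
print(round(chi_square_2x2(hypothetical), 2))
```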
Are relapses associated with longer tenure? Given the documented prevalence of relapse [103,
227], the observation that more than half of the users in our data set experience no relapse is surprising.
Analyzing tenure values reveals that the average tenure of no relapse users is 128 days, compared to
418 days for users who relapse. One hypothesis is that some users with no observed relapse do relapse
after leaving the forum and do not return.
What differentiates users who are ultimately RECOVERING? We define a user as active if she initiated
a post on the forum in the last 45 days of our data set, and remove active users from this analysis. We then analyze
users’ global activity characteristics (Table D.3) aggregated over their USING and WITHDRAWING posts
(RECOVERING posts are omitted as this is the phenomenon that we are studying). Table 8.5 shows the
results.
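The exclusion of active users can be sketched as a simple date filter. This is a hypothetical illustration: the mapping from user id to last initiating-post date, the function name and the example dates are assumptions, and a post made exactly 45 days before the end of the data set is treated as falling inside the window.

```python
from datetime import date, timedelta


def inactive_users(last_initiating_post, data_end, window_days=45):
    """Keep users whose last initiating post is older than the activity window.

    `last_initiating_post` maps a user id to the date of her last initiating post.
    A user is treated as active if that post falls within the final
    `window_days` days of the data set.
    """
    cutoff = data_end - timedelta(days=window_days)
    return {
        user: last_day
        for user, last_day in last_initiating_post.items()
        if last_day < cutoff
    }


# Hypothetical users and data set end date.
users = {"u1": date(2014, 5, 20), "u2": date(2013, 11, 2)}
kept = inactive_users(users, data_end=date(2014, 6, 1))
print(sorted(kept))
```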
Table 8.5: Comparison of activity features for users who are and are not RECOVERING in their last initiating post. Per-user values are aggregated over USING and WITHDRAWING posts. Statistical significance is determined using Kruskal-Wallis tests (*** p < 0.001) after Bonferroni corrections.

                                                RECOVERING                  not RECOVERING
Activity Characteristic                     p      Mean   Med.    IQR    MAD   Mean   Med.    IQR    MAD
# initiating posts authored                 ***    8.99      5      8   4.44   9.89      6      6   2.96
# self responses authored                   ***   19.56      8     16  10.37  17.04      9     16   8.89
# responses authored                        ***   45.56      9     31  13.34  33.81      8     24  10.37
# initiating posts / # responses authored   ***    0.73   0.50   0.76   0.44   1.04   0.67   0.83   0.49
Days since last init.                       ***   16.39   3.33  12.41   3.95  27.05   8.30  28.36  10.53
Days since last self-response               ***   17.47   3.00  13.38   3.95  29.53   8.29  31.45  10.81
Days since last response                    ***   15.92   1.66   7.32   2.47  25.30   4.37  21.75   5.99
Days since last activity                    ***   14.11   1.80   6.09   1.90  20.94   4.80  20.09   5.79
# self responses                            ***    1.93   1.50   1.64   1.19   1.83   1.50   1.50   1.11
# replies received                          ***    5.63   5.00   3.40   2.37   5.56   4.83   3.30   2.29
# respondents                               ***    4.09   3.83   2.00   1.60   4.01   3.70   2.03   1.42
Users who leave the forum in a state of RECOVERING are significantly more engaged in forum activity,
even when they are USING and WITHDRAWING. The average time lapse between any form of activity
(initiation, self-response and response) is about 30% shorter for those who are RECOVERING when
they leave. Moreover, their activity is focused outwardly on other community members: users who are
RECOVERING author, on average, about 35% more responses than those who are ¬RECOVERING (average
45.6 vs. 33.8), but author slightly fewer initiating posts (average 9.0 vs. 9.9). These results resonate
strongly with prior work on AA that finds that both active participation in AA and explicitly focusing on
helping other members correlate with sustained abstinence [190,223].
8.8 Discussion
Our motivating goals were to study phases of addiction as they manifest on Forum77 and to analyze
the forum’s effectiveness in promoting recovery. In this section, we discuss Forum77’s efficacy as a tool
for supporting users through withdrawal, relapse and sustained recovery, drawing on post excerpts to
contextualize our findings. We then discuss how our results might inform future interface design, before
touching on potential implications for addiction treatment.
8.8.1 Use and Efficacy of Forum77
Supporting Withdrawal: Our results suggest that Forum77 is an effective tool for helping users through
opioid withdrawals and physical detoxification. In general, users progress more often than they regress
(Figure 8.3), and these local progressions translate into a global trend of many users reaching a state of
RECOVERING during their tenure. When first initiating a post, 48% of users are USING, 44% WITHDRAWING
and 8% RECOVERING; in their most recent initiating post, however, only 17% of users are USING,
37% are WITHDRAWING and 46% are RECOVERING, despite the fact that almost half of the population
experiences a relapse (Figure 8.5). If we interpret our results as a 46% success rate on users’ final
detoxification attempt before leaving the forum, this is an improvement over self-detoxification success
rates reported in the addiction literature [102, 184]. We must be cautious here, however, as we are
comparing across different study designs.
Forum77’s efficacy at supporting detoxification may be attributable, in part, to both the strong social
support and the detailed information on withdrawal that members receive from each other. Both of these
factors have been shown to improve withdrawal outcomes [102, 106, 184], and qualitative remarks from
users suggest that Forum77 meets the mark on both. “I have tried to cope by myself for too long. Its
so hard to deal with something like addiction by your self”, wrote one user. “[T]here is so much support
and advice on getting through this and addiction I am living proof it works!!!!!!”, and “i was on here once
before and was able to achieve 9 months of sobriety due to the support i had here and from meetings.”
remarked others. In other cases, simply discovering a supportive community might galvanize a cessation
attempt: “up until 3 weeks ago, I had no intentions of quitting, i was just looking to find some stuff on
addiction...and i just happened to run across this forum...”.
Relapse and Shame: Despite the favorable finding that RECOVERING is the most common end state
among users (Figure 8.5), we do not know whether they maintain this state upon
leaving. It is possible that the same strong support network that helps users through detoxification
deters them from wanting to admit a relapse. Quantitatively, although almost half of our sample relapsed
(Figure 8.5), we rarely observed posts in which users reported a relapse immediately after the fact
(Table 8.1).
The hypothesis that users are too ashamed to admit relapse until they implement a renewed attempt
to quit is qualitatively well supported. Statements such as “I suck!! I am so sorry, I’ve been too em-
barrased too admit I fell off the proverbial wagon around Christmas.” are common. Others, such as
“haven’t posted in a few weeks because, of course, i slipped up and am ashamed. but now i am back
on track with the sub” and “Im in day 3 of detox, i was too embarassed to post the first 3 days...” echo
these sentiments, and suggest that some users feel that a new detoxification effort is required as proof
of commitment before returning to the community.
Supporting Sustained Recovery: Without observing users’ behavior outside the forum, we cannot
quantify Forum77’s effectiveness at supporting long term recovery. Qualitatively, however, some users
feel that this is something that Forum77 could improve upon. One user summarizes: “I wonder if there
is not a need for a forum community for long-term support. This community is great, but is skewed
towards the short-term wd symptoms and getting through the initial physical pain of wd.”. Also prevalent
are observations that the forum does not sufficiently prepare users to handle post-acute withdrawal
syndrome (PAWS): “I wish people would warn others about this PAWS thing”, wrote one user. “i was
doing so good i made it to about 100 days sober ... the PAWS really got me”, expressed another.
Moreover, users who return to Forum77 after some time may find that their support network has moved
on. One user who was struggling not to relapse asked “Where are all of the friends i made here that I no
longer see?!?”.
Other users, however, give qualitative evidence in support of Forum77’s efficacy at aiding sustained
recovery. “I have not posted much lately but continue to log on and read ppls posts and I believe that
is a key aspect in my recovery”, states one user. Another wrote “when I get a craving I come here
and read, even if I read it before, it helps me think of what I went through what I’m going through and
how others cope”. We found that higher engagement, in the form of activity levels and volumes of
responses contributed, correlate with the chances of a user being in a phase of RECOVERING by the end
of her tenure. Extending this idea, one possibility is that remaining engaged with the forum (even in the
form of “lurking”) after reaching a state of RECOVERING helps to prevent relapses in a similar way that
continued participation in AA correlates with longer periods of sobriety [190,223]. A deeper analysis into
the mechanisms through which Forum77 does and does not support long-term recovery is an important
topic for future work.
8.8.2 Implications for Forum Design
Computational tools for automatically identifying addiction phases, relapses, and whether a user’s tenure
ends in RECOVERING could prove valuable to communities like Forum77. One question commonly asked
by users is what to expect when they quit their drug of choice, and having access to this information has
been shown to improve the chances of a successful cessation attempt [106]. Using phase sequence
data labeled by our CRF classifier, users could set realistic expectations by exploring patterns based
on thousands of others’ prior experiences. Having a realistic perspective of the process of relapse and
recovery may also reduce the number of instances in which users feel too embarrassed or ashamed to
return to Forum77 after relapsing. Finally, exposing such data could help people find others who exhibit
similar patterns to their own. Finding “people like me” is one of the primary stated reasons for user
participation in online health communities [90].
While Forum77 appears to promote detoxification effectively, we observed that users have mixed
feelings about how well it supports sustained recovery. It is possible that this could be addressed via
altering community dynamics. For example, as we suggested above, continued participation in Forum77
post RECOVERING might help users achieve sustained recovery. Efforts focused on decreasing user
churn and increasing member retention could support this. Alternatively, in a similar vein to AA’s spon-
sorship program, which is thought to promote sustained recovery [237], we might consider automatically
matching newcomers with long-term members who would act as formal mentors (or sponsors). Finally,
it is possible that the community dynamics that support detoxification are different from those that would
support sustained recovery. In this case, a forward reference to a different community might help
RECOVERING Forum77 users plan what to do next.
8.8.3 Implications for Addiction Treatment
Forum77 accrues, at scale, information that is difficult to acquire through formal medical channels. First,
abusing prescription drugs usually entails deceiving one’s doctor. Second, addiction research data are
typically acquired at point-of-care facilities (e.g., emergency rooms) or surveys at high schools or col-
leges. Although the ethics and privacy of such analyses must be carefully considered, it is possible that
data extracted from sites like Forum77 (e.g., CRF-based transition frequencies, recovery trends, etc.)
could help medical professionals and policy makers better understand patients’ experiences with drug
abuse. For example, insight into the day to day difficulties of opioid-assisted withdrawal might inform
policy for improving the management of this popular treatment down the road. It is also possible that
research like ours could illuminate poorly understood aspects of addiction: to our knowledge, ours is the
first attempt to quantify the cycle of addiction.
8.8.4 Limitations
One limitation of this work is the selection bias of our subjects: users who come to Forum77 are likely
already open to (or at least, considering) the possibility of quitting. This problem is well known to those
hoping to analyze the efficacy of Alcoholics Anonymous [20]. As such, care should be taken in applying
our results to a more general population who misuse prescription medication. We cannot assume, for
example, that a random sample of people who misuse prescription medication would similarly progress
towards recovery if they were asked to participate in Forum77. We also cannot draw epidemiological
conclusions that apply to the population as a whole from these data. However, the size of Forum77,
the prevalence of the opioid epidemic, and the increasing popularity of online health communities alone
make the forum worth studying.
Another limitation is the acceptable, but still improvable, accuracy of our CRF classifier. While we
were able to use CRF-based sequences to identify, with high accuracy, both relapse and whether a user’s
final post was written when she was RECOVERING, improving our underlying classifier performance would
open up more nuanced analyses. Finally, having page view data would allow us to incorporate measures
of passive participation (“lurking”), which would add a new dimension to our study. We hope to address
such opportunities in future work.
8.9 Summary
Our goal in this chapter was to analyze the process of opioid withdrawal, recovery and relapse on
Forum77, MedHelp’s Addiction: Substance Abuse community. Drawing on literature from the addiction
community, we first present an overview of prescription drug abuse along with key concepts and
terminology (§ 8.2). Next, using Prochaska’s Transtheoretical Model for behavior change, we develop a
taxonomy of phases of addiction that comprises three main categories: USING, WITHDRAWING and
RECOVERING (§ 8.4). The majority of initiating posts are authored when users are WITHDRAWING. Next, we
analyze linguistic and behavioral features across the USING, WITHDRAWING and RECOVERING phases.
Several significant differences characterize each phase (§ 8.5), and we leverage these results to train a
CRF model to automatically annotate users’ phase sequences (§ 8.6). We can identify relapse events,
and whether a user was RECOVERING when she authored her final post, with high accuracy from our
CRF-annotated sequences (§ 8.7).
Applying our classifiers to 2,848 users (§ 8.6.3 and § 8.7.3) reveals that progressive transitions to-
wards RECOVERING are much more prevalent than regressive transitions. Moreover, despite the fact that
almost 50% of users relapse during their tenure, leaving Forum77 in a state of RECOVERING is the most
probable outcome for all users. Finally, we find that increased participation in the community correlates
with a user RECOVERING by the end of her tenure: users who are RECOVERING by their final initiating
post are significantly more engaged with the community when they are USING and WITHDRAWING than
users who are ¬RECOVERING by their final initiating post.
To our knowledge, ours is the first work to investigate the efficacy of online mutual help groups for
prescription drug abuse. Our results, which help to illuminate a previously poorly understood resource,
suggest that Forum77 is an effective detoxification aid. Based on our findings, we also highlight several
ways in which Forum77 might be enhanced to better support its users (§ 8.8), such as exposing aggre-
gate user data describing the cycle of addiction, or matching newcomers with sponsors. Finally, as the
type of information shared on Forum77 is difficult to acquire at scale through traditional channels, we
note that the tools and insights presented here may be of use to the addiction research community.
Chapter 9
Conclusion
This dissertation presents both methods for automatically extracting medically-relevant data from patient
authored text (PAT) and insights derived through the application of these methods. In concert,
our contributions underscore PAT’s latent potential for illuminating poorly understood or clandestine
medical topics that may be invisible to traditional medical data collection, and offer viable methods
that dramatically improve our ability to realize this potential. In this final chapter, we reiterate the
contributions of this thesis (§ 9.1) and present principal opportunities for future research (§ 9.2) before offering
concluding thoughts (§ 9.3).
9.1 Contribution Summary
Our work is predicated on the observation that despite being both abundant and uniquely valuable,
patient authored text (PAT) is a heavily underutilized health data resource. In Chapter 2 we present
an overview of prior work describing online health seeking behavior and, more specifically, online health
community (OHC) participation. Synthesized via a cross-disciplinary literature review, this chapter serves
to illuminate how people use the Internet as a health resource. In Chapter 3 we present a novel review
of prior work that utilizes PAT as a primary data source. We discuss goals, data sources, methodological
approaches and outcomes, providing a contextual background against which to interpret and evaluate
the rest of our work. To our knowledge, this review is the first such synthesis of prior work focused on
extracting value from PAT.
The development of ADEPT (Chapter 5) – our CRF classifier that automatically identifies medically-
relevant terms in PAT – was prompted by our observation that existing biomedical term annotation toolk-
its perform poorly on PAT. While statistical classifiers present an attractive alternative, acquiring large,
expert-annotated PAT corpora on which to train and test them is a major challenge. To this end, we show
that a crowd of non-experts yields annotations comparable in quality to experts’ for the PAT medical term
identification task. Our result offers an alternative method for acquiring large annotated PAT corpora both
quickly and cheaply. However, our task design failed to yield similar quality results for more specific PAT
annotation tasks (e.g. identifying all symptom terms). This underscores the tradeoff between design-
ing crowdsourcing tasks and annotating the data oneself. Applying ADEPT to large PAT corpora yields
high-level insights useful for summarization and hypothesis generation; however, the tool is too broad
for fine-grained analysis. For higher-resolution insights, we narrow our focus to the topic of addiction: a
highly prevalent but stigmatized medical condition.
Understanding why people author PAT is crucial for matching it with appropriate research questions.
In Chapter 6, we investigate users’ motivations for participating in Forum77: MedHelp’s Addiction: Sub-
stance Abuse community. Our thematic analysis over initiating posts concurs with prior work stating that,
in general, people seek both informational and emotional support from OHCs. However, our analysis
also reveals distinct sub-categories of these two kinds of support. Of particular interest is the update:
a prevalent emotional support seeking post in which the user does not explicitly request a community
response. We train two logistic regression classifiers: the first distinguishes emotional from informational
support-seeking posts; the second, update from non-update posts. Applying these to the entire Forum77
data set reveals that update posts garner slightly more responses on average than non-update posts.
The prevalence of update posts suggests that users value the forum as a place where their personal
progress can be witnessed by others and recorded for posterity. Forum77 also serves as a repository
for information on opioid withdrawal. In fact, Thomas’ Recipe, a protocol for medication-assisted opioid
withdrawal that evolved on Forum77, suggests that Forum77 users actively collaborate on developing
effective treatment protocols.
In Chapter 7 we investigate the distribution of drugs of choice (DOCs) in the Forum77 population. A
close reading indicates that identifying DOCs is a context sensitive problem, as a variety of substances
can serve as either the object of addiction or its treatment. A CRF classifier trained on manually annotated data is able to
identify DOCs with high accuracy. Our resulting analysis, which compares the Forum77 DOC distribution
to those of other drug-using populations, reveals that the Forum77 population struggles disproportion-
ately more with prescription opioids, and disproportionately less with traditionally abused substances
such as alcohol, marijuana and cocaine. While it is difficult to ascertain whether Forum77 reflects real-
world drug use trends, our results do suggest that Forum77 represents a population of drug users that
is not well covered by existing monitoring systems.
Finally, in Chapter 8, we analyze the process of opioid withdrawal, recovery and relapse on
Forum77. Through a thematic analysis, we develop a taxonomy describing phases of addiction based on
Prochaska’s Transtheoretical Model for behavior change. Phases of addiction are accompanied by distinct
physiological and psychological changes, and this is mirrored in users’ usage of the site: exploring activ-
ity and linguistic features from posts across the phases USING, WITHDRAWING and RECOVERING reveals
several significant differences. We leverage these differences to train a sequence-based CRF model
to annotate users’ phase sequences automatically. We can also identify relapse events from these se-
quences, as well as whether a user’s final post was made in a state of RECOVERING, with high accuracy.
Our resulting analysis of all Forum77 users’ transition sequences indicates that despite the fact that
relapse is common, leaving the forum in a state of RECOVERING remains the most probable outcome.
Moreover, we show that high engagement with the community correlates with the probability of a user
RECOVERING by her last initiating post on the forum. Overall, these results suggest that Forum77 is an
effective detoxification aid. To our knowledge, this work is the first that attempts to quantify the phases
of addiction and the transitions between them.
9.2 Future Work
Given the high level of enthusiasm currently surrounding health-related technology, our
contributions present a timely foundation and reference. However, many limitations to realizing the full
value of PAT remain. In this section, we articulate key opportunities for future research.
9.2.1 Supporting the Methodological Process
Figure 9.1 (replicated from Chapter 1) illustrates the stages of our methodological process for extracting
insights from PAT. At present, most of the stages in the main process (top row) must be cobbled together
in an ad-hoc fashion by the researcher. This hurts efficiency and replicability, and makes comparison
between studies difficult. Developing this process into more of a standardized pipeline would enable closer
synergy between disparate research efforts, and make it easier to identify quality results. We suggest
several areas for improvement below.
[Figure 9.1 image omitted. Flow diagram; text labels as extracted: PAT, Content Schema, Labeled Data (human), Classifier Features, Labeled Data (auto), Processed Data, Insights, Medical Discovery, PAT interface design. Edge labels: close reading, annotation, training, tuning, schema revision, processing & analysis, application. Legend: Future Work.]
Figure 9.1: Our general methodological process. Nodes in grey show avenues for future work supported by our contributions.
Interface Support for Thematic Analysis
Thematic analyses are frequently used to develop deep insights into text-based corpora and to inform
future analyses. Moreover, as we note in Chapters 6 and 8, not only do the results of thematic analyses
stand as their own qualitative contribution, they also indicate junctions at which we may shift from a close-
reading to a large-scale, automated analysis. In spite of their complexity and importance, there is no
interface support for thematic analyses: provenance of this iterative process is never recorded; reasons
(and supporting examples) for making particular decisions about categories are lost; and the clustering,
combining, and splitting of categories is done primarily in the researchers’ working memories. Based on
our own experience, a starting point for interface support would provide visual “sand boxes” for comparing
and organizing data elements into categories; support flagging items that either especially support,
or especially challenge, the proposed taxonomy; and facilitate the easy expression of categorization rules.
Aside from making thematic analyses more efficient and consistent, externalizing the process in this
fashion would make resulting taxonomies easier for a third party to verify, compare against and reuse.
Improved Tools for Annotation
Related to the matter of interface support for thematic analyses is interface support for data annotation.
In our work, we conducted this process primarily through the use of shared spreadsheets. While this
makes data output easy, it hinders comparison between non-adjacent data elements; does not support
the capture of spontaneous updates to annotation rules that arise from encountering novel examples;
and only weakly supports collaboration between annotators. Examples of features that an annotation in-
terface might provide include visual support for clustering and comparing data elements; automatic label
suggestions based on underlying text analytics; iterative updating of annotation rules in response to new
data elements; and automatic evaluation of inter-annotator agreement that facilitates rapid exploration of
agreements and errors. Not only would such an interface make the annotation process faster and more
consistent, but it may also encourage standardization in annotation and reporting practices.
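As one illustration of the automatic inter-annotator agreement evaluation suggested above, the sketch below computes Cohen's kappa for two annotators. The choice of kappa, the function name and the example labels are assumptions made for illustration; the text does not prescribe a particular agreement measure.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labeling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability that both annotators pick the same label
    # if each labels independently according to her own label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of four posts.
annotator_1 = ["USING", "USING", "WITHDRAWING", "RECOVERING"]
annotator_2 = ["USING", "WITHDRAWING", "WITHDRAWING", "RECOVERING"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(round(kappa, 3))
```

An annotation interface could recompute such a score after each batch and surface the items on which annotators disagree.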
Mapping the Limits of the Crowd in PAT Annotation Tasks
In Chapter 5 we showed that the crowd can replace medical experts for some PAT annotation tasks. However,
correctly designing crowdsourcing tasks is sufficiently time-consuming that, in subsequent chapters,
we elected to annotate our data manually. Exploring the crowd’s ability to perform a variety of PAT annotation
tasks nonetheless remains a crucial avenue for future work. Without it, it would be difficult to scale
analyses such as ours to larger forums or to multiple data sets. More importantly, crowd annotation would
make it easier to create and share large, labeled corpora within the research community. Due to our
data sharing agreement with MedHelp, we were unable to share any of our labeled data sets. However,
making a large, labeled PAT corpus available to the public would be the most direct way to stimulate
research on these topics.
9.2.2 PAT Interface Design & Support
Despite their popularity, the general structure of online health communities (OHCs) has barely changed
since the late 1990s. However, both insights and classifiers derived through the PAT analysis pipeline
could prove valuable if incorporated into OHCs. As we show in Figure 9.1, closing this loop may create
a virtuous cycle, in which interface improvements lead to higher volumes and quality of
PAT. This, in turn, would lead to more fine-grained insights and improved classifiers. While we do not
implement any interface changes in this work, we have several suggestions.
Expose Aggregate Data to Users
OHC participants spend hours doing tasks that often amount to simple aggregation, such as calculating
treatment popularity, establishing what Forum77’s most popular DOC is, and estimating the probability
of a successful detoxification attempt conditional on a specific withdrawal method. This is inefficient: not
only are OHCs difficult to navigate for these sorts of tasks, but often many users will conduct identical
analyses at different points in time. In the best case, exposing such data to users could alleviate users’
need to reinvent the wheel for each analysis, freeing their time for alternative tasks.
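The aggregations described above are simple to express once posts carry structured labels. The sketch below is illustrative only: the record fields ("method", "success") and the toy records are hypothetical and are not drawn from Forum77 data.

```python
from collections import Counter, defaultdict


def treatment_popularity(records):
    """Count how many records mention each withdrawal method."""
    return Counter(r["method"] for r in records)


def success_rate_by_method(records):
    """Estimate P(success | method) as successes / attempts per method."""
    totals, successes = defaultdict(int), defaultdict(int)
    for r in records:
        totals[r["method"]] += 1
        successes[r["method"]] += int(r["success"])
    return {m: successes[m] / totals[m] for m in totals}


# Made-up records for illustration only; these are not Forum77 data.
toy_records = [
    {"method": "taper", "success": True},
    {"method": "taper", "success": False},
    {"method": "cold turkey", "success": False},
]
popularity = treatment_popularity(toy_records)
rates = success_rate_by_method(toy_records)
```

In practice such estimates would need to account for reporting bias, since users who relapse may be less likely to post an outcome.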
Support Data Entry
One critique of PAT is that it is often incomplete: it frequently omits relevant medical information.
Nudging users towards providing more complete accounts of their conditions would enrich our analyses
and enhance PAT’s credibility as a data source. One example is “symptom autocomplete”: rather than
relying on users to remember and list all of their symptoms (some of which may not even be severe
enough to notice), it would be relatively straightforward to automatically suggest (or “autocomplete”)
symptoms based on the ones already entered.
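One simple way to realize such suggestions is to rank candidate symptoms by how often they co-occur with the symptoms already entered. The sketch below assumes that each prior post has already been reduced to a list of symptom strings; the function names and the scoring rule are our own illustrative choices, not a description of an implemented system.

```python
from collections import Counter
from itertools import permutations


def build_cooccurrence(symptom_lists):
    """For each symptom, count how often every other symptom appears with it."""
    cooc = {}
    for symptoms in symptom_lists:
        for a, b in permutations(set(symptoms), 2):
            cooc.setdefault(a, Counter())[b] += 1
    return cooc


def suggest(cooc, entered, k=3):
    """Rank symptoms not yet entered by total co-occurrence with entered ones."""
    scores = Counter()
    for s in entered:
        for other, count in cooc.get(s, {}).items():
            if other not in entered:
                scores[other] += count
    # Sort by descending score, breaking ties alphabetically for stability.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [symptom for symptom, _ in ranked[:k]]
```

A deployed version would likely normalize symptom mentions first (for example, with a term extractor) so that spelling variants count as the same symptom.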
Automatically Construct User Timelines
Personal timelines are commonplace in social media and the quantified self movement. Our work on
users’ reasons for participating in Forum77 (Chapter 6) indicates that they value its archival features.
Making it easier for users to browse their histories, especially histories enhanced with structured data
provided by classifiers, could facilitate an array of tasks, from discovering behavioral patterns to finding
other “people like them”. Quantitative uses aside, a timeline comprises a narrative of important life
events, failures, and accomplishments that would have strong emotional significance to users. Given the
chance, it is likely that users would take it upon themselves to curate their own timelines: a situation that
could be leveraged to have users label their own data.
9.2.3 Making the Leap to Medical Discoveries
Our work adds to a growing body of evidence that medically relevant insights can be automatically extracted
from PAT. However, the holy grail is to move from medical insights to actionable medical discoveries. In
our own work, efforts along these lines might include extending our work on identifying drugs of choice
(Chapter 7) to support real-time identification of new drugs, or extending our work on phases of addiction
(Chapter 8) to prove that participation in Forum77 measurably reduces the number of relapses that
someone experiences. However, making such leaps is nontrivial. Challenges include understanding how
signals in PAT correspond to real-world trends, in spite of the fact that PAT rarely contains demographic
data; clinically verifying results, which is both slow and expensive; and developing new experimental
designs that are compatible with online health seeking behavior. Such challenges could only be met
through a close-knit collaboration with medical professionals who agree that PAT is a valuable data
source.
9.3 Concluding Remarks
Patient authored text is the abundant byproduct of hours of human intelligence spent on complex, health-
related problem solving tasks. As long as this valuable resource is underutilized, researchers, patients
and medical professionals alike will be deprived of the unique insights and benefits that it has to offer.
Although this dissertation takes a step towards leveraging some of the considerable work that patients do
in managing their own health, this is only the tip of the iceberg: we anticipate a future in which technology
creates, supports and encourages synergy between patients, providers and data.
Appendix A
ADEPT Supplementary Material
Table A.1: The following features are specified when training our CRF. Other features retain their default values as described at http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ie/NERFeatureFactory.html
Property Name Type Value Description
useClassFeature boolean TRUE Include a feature for the class (as a class marginal). Puts a prior on the classes which is equivalent to how often the feature appeared in the training data.
useWord boolean TRUE Gives you feature for w
useNGrams boolean TRUE Make features from letter n-grams, i.e., substrings of the word
noMidNGrams boolean TRUE Do not include character n-gram features for n-grams that contain neither the beginning or end of the word
useDisjunctive boolean TRUE Include in features giving disjunctions of words anywhere in the left or right disjunctionWidth words (preserving direction but not position)
maxNGramLeng int 7 If this number is positive, n-grams above this size will not be used in the model
usePrev boolean TRUE Gives you feature for (pw,c), and together with other options enables other previous features, such as (pt,c) [with useTags)
useNext boolean TRUE Gives you feature for (nw,c), and together with other options enables other next features, such as (nt,c) [with useTags)
useSequences boolean TRUE
usePrev boolean TRUE
useNext boolean TRUE
maxLeft int 1 The number of things to the left that have to be cached to run the Viterbi algorithm: the maximum context of class features used.
useTypeSeqs boolean TRUE Use basic zeroeth order word shape features.
useTypeSeqs2 boolean TRUE Add additional first and second order word shape features
useTypeySequences boolean TRUE Some first order word shape patterns.
wordShape String chris2useLC Either none for no wordShape use, or the name of a word shape function recognized by WordShapeClassifier.lookupShaper(String)
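For readers who wish to reuse these settings, the sketch below renders the values in Table A.1 as Java-style properties text, the kind of file that Stanford NER's CRFClassifier can read through its -prop option. The property names and values are copied from the table; the dictionary name, the helper function, and the exact output layout are assumptions made for illustration, and training would additionally require properties not listed here (for example, the training file and column map).

```python
# Property names and values are copied from Table A.1. The dictionary and
# function names are hypothetical and exist only for this illustration.
TABLE_A1 = {
    "useClassFeature": True,
    "useWord": True,
    "useNGrams": True,
    "noMidNGrams": True,
    "useDisjunctive": True,
    "maxNGramLeng": 7,
    "usePrev": True,
    "useNext": True,
    "useSequences": True,
    "maxLeft": 1,
    "useTypeSeqs": True,
    "useTypeSeqs2": True,
    "useTypeySequences": True,
    "wordShape": "chris2useLC",
}


def to_properties(settings):
    """Render a settings dict as 'name = value' lines, one per property."""
    lines = []
    for name, value in settings.items():
        # Check bool before anything else: Java expects lowercase true/false.
        if isinstance(value, bool):
            value = "true" if value else "false"
        lines.append(f"{name} = {value}")
    return "\n".join(lines) + "\n"


properties_text = to_properties(TABLE_A1)
```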
Appendix B
F77 Purpose Supplementary Material
Table B.1: Features used to train our purpose classifiers, which distinguish emotional from informational support seeking, as well as update from non-update posts.
Feature Name Description
containsQuestion whether the post contains a question (binary)
numQuestions number of questions the post contains
unigrams all words present in the post
bigrams all bigrams (two consecutive words) present in the post
timeMentioned number of days clean time (if mentioned). Extracted using the following two patterns:
X:NUM (day|days|week|weeks|months|month|year|years) (clean|off)
on? "day|days" X:NUM
where NUM is any number and "|" represents the OR operator. We then convert weeks/months/years to days and use the number of days as the feature value. The default value is 0.
numPosWords number of words with a positive sentiment score in SentiWordNet of > 0.8
numNegWords number of words with a negative sentiment score in SentiWordNet of > 0.8
daysMentioned whether the user mentions a number followed by the term “day” or “days”
days since last initiating post the number of days since the user’s last initiating post
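The two timeMentioned patterns in Table B.1 can be approximated with regular expressions, as in the sketch below. This is one reading of the patterns rather than the original extraction code: the conversion factors (7 days per week, 30 per month, 365 per year), the precedence of the first pattern over the second, and the function name are all assumptions.

```python
import re

# Assumed conversion factors; the exact values used are not stated in the table.
DAYS_PER_UNIT = {"day": 1, "week": 7, "month": 30, "year": 365}

# Pattern 1: "<NUM> <unit> (clean|off)", e.g. "3 weeks clean".
PATTERN_UNIT = re.compile(
    r"\b(\d+)\s+(day|week|month|year)s?\s+(?:clean|off)\b", re.IGNORECASE
)
# Pattern 2: optional "on", then "day" or "days", then <NUM>, e.g. "on day 5".
PATTERN_DAY = re.compile(r"\b(?:on\s+)?days?\s+(\d+)\b", re.IGNORECASE)


def time_mentioned(post):
    """Return the clean time mentioned in a post, in days (default 0)."""
    match = PATTERN_UNIT.search(post)
    if match:
        return int(match.group(1)) * DAYS_PER_UNIT[match.group(2).lower()]
    match = PATTERN_DAY.search(post)
    if match:
        return int(match.group(1))
    return 0
```

Only digit-based numbers are matched here; spelled-out numbers such as "three weeks clean" would need an additional normalization step.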
Appendix C
F77 Drug of Choice Supplementary Material
Table C.1: Drug term resolution map, manually compiled from classifier output. The i column indicates whether the drug category is included in our analysis in Chapter 7.
Category Drug name Resolved drug terms i
alcohol alcohol acholic, acoholic, alcahol, alchohol, alchol, alcholo, alcohol, alcoholic, alcoholoc, alcolhol, alcololic, alocholic, alocohol, champagne, beer, beers, vodka, wine, drink alcohol, drink beer, drinking alcohol, drink wine, drinking beer, drinking wine, drinks, beer bottles, beer drinking, alcohol drinking, alcohol drinks, alcoholic drink, drink, drinking ◦
cigarettes cigarettes cigarettes, cigaretts, cigarrettes, cigars, cigerattes, ciggaretes, ciggarettes, ciggaretts, ciggerettes, ciggies, ciggs, cigrattes, cigs, smoke, smoke cigarets, smoke cigarettes, smoke cigs, smoked, smoker, smokes, smokes cigarettes, smokes ciggaretts, smokes cigs, smokin, smokin cigs, nicotine, smoking cigarettes, smoking, smoking cigs ◦
cocaine cocaine cocaine, cocain, cocaine, cocane, coccaine, cociane, coke, coaine, powder, smoke cocaine, smoke coke, smokin coke, smoking coke, smoking crack, smoking crack cocaine ◦
hallucinogens hallucinogens hallucinogens, mescaline ◦
psilocybin mushroom psilocybin mushroom, mushrooms, shrooms, psychedelics ◦
heroin heroin heroin, herioin, herion, heroin, heroin cocaine, heroine, smoking heroin, smack, smoke heroin, heroin heroin, heroin smoking ◦
marijuana marijuana marijuana, marajuana, marihuana, marijanna, marijauna, marijuan, marijuana, marijuana smoker, marijuanna, marijuanna smoker, marjuana, marjuana smoke, pot, pot brownies, pot smoke, pot smoker, pot smokers, pot smokin, weed, weed smoker, smoke marijuana, smoke marijuanna, smoke pot, smoke weed, smoked pot, smokes marijuana, smokes pot, smokes weed, smokin pot, smoking marijuana, smoking pot, smoking weed, dope, pot smoking, smoking weed, smoking dope, smoke dope, hash, hashish, smoked weed, smokin dope, marijuana smoke, marijuana smoked, marijuana smoking ◦
methadone methadone methadone, mehadone, mehtadone, mehtadone pain killers, metadone, methadoen, methadome, methadon, methadone, methadone pain killers, methadones, methadont, methadose, methandone, methaodne, methatdone, methdaone, methdone, methedome, methedone, methodone, methodone pain pills, methondone, methone, mdone ◦
suboxone sub, suoxone, subbies, subboxin, subboxine, subboxone, subetex, subitext, subloxone, subo, subone, subonoxe, subooxone, subotex, subox, suboxan, suboxe, suboxen, suboxene, suboxens, suboxin, suboxine, suboxins, suboxne, suboxom, suboxome, suboxon, suboxone, suboxones, suboxtone, suboxyn, suboxzone, subozone, subroxone, subs, soboxan, soboxen, soboxene, soboxin, soboxine, soboxion, soboxon, soboxone, soboxones, sabonxon, saboxan, saboxen, saboxin, saboxins, saboxon, saboxone, subtex, subutec, subutek, subutex, subutext, subutox, subuxone, subx, subxone, syboxin, syboxone, symboxin, buprenorphine, buprenorphine, bupenorphine, bupenorphrine, bupernepherine, bupernorphine, bupremorphine, buprenex, buprenophine, buprenorphene, buprenorphine, bupreorphine ◦
opioid codeine codeine, codeiene, codein, codeine, codeine otc pills, codeine painkillers, codeine sulphate, codene, codene pain pills, codien, codiene, codiene painkillers, codiens, codine, codone, coedine, tylenol 3, tylenol3 ◦
dextropropoxyphene dextropropoxyphene, darovcet, darv, darvacet, darvacets, darvaset, darvecet, darvecette, darvicet, darviset, darvo, darvocet, darvocets, darvocett, darvocette, darvon, darvoncet, darvos, darvoset, darvs, darvys, davocet, davort, dextropropoxyphene ◦
dialudid diladid, diladin, diladud, dilantin, dilatin, dilaudad, dilauded, dilaudeds, dilaudid, dilaudin, dilauid, dillauded, dilodid, dilodids, diloted, dilotid, dilotted, diloudid, diluadid, diluadids, diludid, diluidid, hydromorphone, hydromophone, hydromorophone, hydromorphcontin, hydromorphine, hydromorphone ◦
fentanyl actiq, fenatyl, fentaly, fentanol, fentanyl, fentanyl pain patch, fentanyl pain patches, fentayl, fentenal, fentenyl, fentinol, fentnyl, fentora, fentyl, fentynal, fentynal pain patches, fentynl, fentynol, fentynyl, fetynal ◦
hydrocodone hydrocodone, hrdrocodone, hudro, hycodan, hycodne, hydo, hydocodone, hydorcodone, hydors, hydos, hydos-75, hydr, hydracodone, hydrco, hydrcodene, hydrcodone, hydro, hydro codeine, hydro-codone, hydroc, hydrocdone, hydrochodone, hydroco, hydrocod, hydrocodan, hydrocode, hydrocodeine, hydrocoden, hydrocodene, hydrocodien, hydrocodiene, hydrocodin, hydrocodine, hydrocodine pills, hydrocodne, hydrocodon, hydrocodone, hydrocodones, hydrocodons, hydrocondone, hydrocondone pain medication, hydrocone, hydrocordon, hydrocordone, hydrodcodone, hydrododone, hydrodone, hydromet, hydromorphone hydrochloride, hydros, hydrycodone, hyrdo, hyrdocodone, hyrdos, hyrdro, hyrdrocodone, hyrdros, hyro, hyrocodone, hyros, smoke hydro ◦
lortab lortab, loratab, loratabs, loratb, lorcet, lorcets, lorcett, lorecet, lorecets, lorects, loretab, loricet, loritab, loritabs, lorocet, lorocets, lorotabs, lorset, lortab, lortab◦, lortab◦-5, lortabs, lotab, lotabs, loracet, loracets ◦
meperidine meperidine, demerol, demeral, demerol, demoral, demorol ◦
morphine morphine, mophine, moraphine, morhine, morhphine, morhpine, moriphine, morophine, morp, morphane, morpheine, morphen, morphene, morphin, morphine, morphines, morphone, morpine, mscontin, morphine mscontin, morphine sulf, morphine sulphate, ms-contin, avinza, ms contin, oramorph, kadian ◦
norco norco, noco, noraco, norc, norce, norco, norco vicodin, norcos, norcs, nordco, noreco, norko, noroco, norocs, narco, narcos ◦
opiates opiates, opates, opiade, opiants, opiat, opiate, opiates, opiats, opiete, opiets, opiot, opiote, opiotes, opitaes, opitate, opitates, opites, oopiate, opaite, opaite pain meds, opaites, opiate meds, opiate narcotic pain pills, opiate narcotics, opiate pain killer, opiate pain killers, opiate pain medication, opiate pain medications, opiate pain medicines, opiate pain meds, opiate pain pill, opiate pain pills, opiate painkillers, opium, opiads-heroine/percs/hydro, opiate drug, opiate narcotic pain, opiate pain, opiate pain med, opiates percs, opiates vicodin, opiates xanax, oppiates, smoking opium ◦
opioids opioids, opiod, opiods, opioid, opioids, opoid, opoids, opiod drug, opiod narcotic, opioid meds, opioid pain med, opioid pain medications, opioid pain meds ◦
oxycodone oxycodone, roxcodone, roxi, roxicdone, roxicet, roxicets, roxicodne, roxicodone, roxicodones, roxicontin, roxicontins, roxicotin, roxies, roxiodone, roxis, roxocodone, roxy, roxy codone, roxy3, roxy4, roxycet, roxycodine, roxycodone, roxycodones, roxycontin, roxycontins, roxycotin, roxycottin, roxys, oxcodone, oxcontin, oxcotin, oxcy, oxcycodone, oxcycontin, oxcycotin, oxcyontin, oxcys, oxen, oxey, oxeys, oxi, oxicoden, oxicodon, oxicodone, oxicontin, oxicotin, oxicotines, oxicoton, oxie, oxie codine, oxies, oxocodone, oxtcontin, oxxy, oxy, oxy codone, oxy contin, oxy-contin, oxy4, oxy8, oxy8s, oxyc, oxyco, oxycocet, oxycocets, oxycod, oxycode, oxycodeine, oxycoden, oxycodene, oxycodin, oxycodine, oxycodne, oxycodon, oxycodone, oxycodones, oxycodpne, oxycoidone, oxycondin, oxycondone, oxyconin, oxycontiin, oxycontin, oxycontine, oxycontins, oxyconton, oxycontontin, oxycoontin, oxycotdin, oxycoten, oxycotin, oxycotine, oxycotins, oxycotion, oxycoton, oxycotten, oxycottin, oxycottins, oxycotton, oxydocone, oxydodone, oxydone, oxyicodone, oxyies, oxyir, oxynorm, oxys, oxytocin, oxyxodones, oxyz, oycodone, blues, blue pills, ocs, ocycodone, oxy hydro, oxyocs, oxy vics, oxy-norm, oxy/percs/tabs, oxycodone oxycontin, oxycodone pain meds, oxycontin, oxycotontin, smoking oxy, smoking oxycontin ◦
oxymorphone oxymorphone, opana, opanas ◦
percocet percocet, perc, percacet, percacets, percaset, percasets, perccet, percecet, percecets, percet, percets, percicet, percks, perco, percocect, percocet, percocete, percocets, percocett, percocette, percocetts, percocite, percocoet, percoct, percodan, percodone, percoet, percoets, percoset, percosets, percot, percote, percots, percs, perkacet, perkeset, perkocet, perkocets, perks, perocaet, perocet, perocets, persocet, pecocet, pecocets, pers, perts ◦
tramadol tramadol, tradol, tram, tramacet, tramadal, tramadaol, tramado, tramadol, tramadole, tramadols, tramadon, tramal, tramdol, tramedol, tramidol, trammadol, tramodal, tramodol, tramol, trams, tranadol, ulram, ultam, ultracet, ultram, ultrams, ultrm, ultrum ◦
vicodin vics, vicks, vic, vicadan, vicaden, vicadin, vicadine, vicadon, viccodin, viccoding, vicdin, vicdon, vicdone, viciden, vicidin, vicidine, vicidon, vicidons, viciodin, vico, vicodan, vicodein, vicodeine, vicoden, vicodene, vicodens, vicodent, vicodien, vicodine, vicodines, vicoding, vicodins, vicodion, vicodn, vicodon, vicodone, vicodyn, vicoin, vicondin, vicos, vicotin, vidodin, vik, vikcs, vike, vikes, vikoden, vikodin, viks, viocdin, viocidin, viocoden, viodin, vivodin, vivodins, vocidin, vocodin, vicodin ◦
vicoprofen vicaprofen, vicobrofin, vicoprofen, vicoprofin, vicoprohen, vicoprophen, vicoprophin, vicroprofen, vicuprofen ◦
OTC acetaminophen acetaminophen, acetamenophin, acetamenophine, acetaminaphen, acetaminaphin, acetaminophen, acetem, aceteminophen, acetomenophine, acetominophen, acetominophin ◦
benadryl benadryl, benadril, benadryl, benadryll, bendryl, benedryl, benodryl, benydryl ◦
dextromethorphan dextromethorphan, dxm ◦
ibuprofen advil, ibeprofen, ibogaine, ibp, ibprofen, ibprofin, ibprohin, ibprophin, ibu, ibupofen, ibupro, ibuprofen, ibuprofin, ibuprophen, ibuprophin, ibuprophren, ibupropin, mortin, mortrin, motrin, neurofen, neurophen, nurofen ◦
melatonin melantonin, melatonin, meletonin, melitonin, melotonin ◦
naproxen naproxen, aleeve, aleve, aleive, alieve, alleve ◦
nyquil nyquil, nyquill ◦
paracetamol paracetamol, paracetemol, paracetomal, paracetomol, parecetamol ◦
tylenol tyelonol, tyenol, tyl, tylanol, tylenal, tylenol, tylenol oc, tyleonol, tylinol, tylonal, tylonel, tylonol, tylox, tyloxes, tynenol, tyneol, tylenol ◦
sedative alprazolam alpralozam, alprazalam, alprazolam, alprozalam, alprozolam ◦
ativan ativan, adavan, adavant, adavin, adivan, advan ◦
barbiturates barbiturates, barbituates, butalbital, phenobarbital, barbs ◦
benzodiazepine benzodiazepine, benzo, benzocaine, benzodiazapenes, benzodiazapines, benzodiazepams, benzodiazepines, benzodiazpines, benzoes, benzoids, benzos, oxazepam ◦
buspirone buspirone, buspar ◦
chlordiazepoxide chlordiazepoxide, librium ◦
clonazepam clonazepam, klnopin, klodopin, klon, klonapin, klonapins, klonepin, kloni, klonidine, klonipin, klonipin oxycontins, klono, klonoin, klonoipn, klonopan, klonopin, klonopine, klonopines, klonopins, klonpin, klonpion, klonzapam, klopin, kloponin, kolonapin, kolonipin, kolonopin, kolonopins, kpins, clonazepam, clonazepams, clonazepham, clonozepam, clonapin, clonapine, clonopin, clonopine, clonipin, clonipine, clonipins, colonopin ◦
diazepam diazapam, diazapams, diazepam, diazipam ◦
eszopiclone eszopiclone, lunesta ◦
fioricet fioricet, fioricet, fierocet, fioracet, fiorcet, fiorecet, fiorecett, fioricet, fiorocet, fiurecet, floricet, fiorinal, fiorinals, fiorinol, fiornal, fiorinal, fiorinals, fiorinol, fiornal ◦
flunitrazepam flunitrazepam, rohypnol ◦
gabapentin gabapentin, nerontin, neuotin, neuratin, neurontin, neuronton, neurontrin, neurotin, neuroton, neurontin ◦
ghb ◦
lorazepam lorazapam, lorazapan, lorazepam, lorazepan, lorazopam, lorazpam, lorezapam, lorezepam ◦
soma soma, soma pills, somas ◦
valium valium, valiums, vallium, vallum, valuem, valuim, valuims, valum, valume ◦
xanax xaanx, xana, xanac, xanacs, xananx, xanax, xanex, xanix, xannax, xannies, xantax, xantex, xanx, xanxa, xanxax, xnax, xznax, zanax, zanaz, zanex, zanix, zannax, zantac, zanx ◦
zolpidem zolpidem, ambein, ambian, ambiem, ambien ◦
sedatives sedatives, ketamine ◦
stimulant adderall adderal, adderal, adderall, adderalll, adderol, adderral, adderrall, adderrol, addreall, aderol, aderoll, aderrall, dexedrine, dextroamphetamine ◦
amphetamine amphetamine, amphetamines ◦
amphetamine ◦
LSD lsd, acid ◦
mdma mdma, ecstacy, ecstasty, ecstasy, exstacy, extacy ◦
methamphetamine methamphetamine, meth, meth smoker, methamphedamines, methamphetamine, methamphetamines, methamphetimines, methanphetamine, smoking meth ◦
methylphenidate methylphenidate, ritalin, ritilan, ritilin, concerta ◦
modafinil alertec ◦
general general meds, drugs, drug, med
narcotics narcotics, narc, narc meds, narc pain meds, narc painkillers, narcan, narcanon, narcatic, narcatics, narcodics, narcotic, narcotic meds, narcotic pain killers, narcotic pain medication, narcotic pain medications, narcotic pain medicine, narcotic pain medicines, narcotic pain meds, narcotic pain pill, narcotic pain pills, narcotic pain reliever, narcotic pain relievers, narcotic pain-killers, narcotic painkillers, narcotic pills, narcotics, narcotis, narcs, narctoics
painkillers pain pill, analgesic, analgesics, pain meds, pain pills, pain killers, painkillers, pain medication, pain medicine, pain medications, pain relievers, pain kiilers, pain killer, pain killer pills, pain killlers, pain kills, pain kller, painpills, pain med, pain reliever, painkillers hydros, painkliiers, painmeds, painsmeds, pill, pills, narcotic pain, pils, ls, pilss, pharmaceuticals, pain
antidepressant amitriptyline amitriptyline, amiltriptyline, amitriptaline, amitripthyline, amitriptyline, amitripyline, amitryptaline, amitryptilline
aripiprazole aripiprazole, abilfy, abilify
citalopram citalopram, celexa, celexia, celxa
duloxetine duloxetine, cymbalta, cybalta, cymalta, cymbalata, cymbalta, cymbalts, cymbata, cymbolta, cynbalta
fluoxetine prozac, fluoxetine, fluoxtine
lexapro
paroxetine paroxetine, paroxetine, paroxotine, paxatine, paxial, paxil, paxill
trazodone trazadone, trazodone
venlafaxine venlafaxine, effexor, efforex, effxor, eflexor
wellbutrin bupropion, buproprio, wellbutrin, welbrutrin, welbutrin, wellbrutrin, wellbutrin
zoloft zoloft, zoloff
NA albuterol albuterol sulphate, albuterol
amoxicillin amoxicillin, amoxcillin, amoxicillin, amoxxillin
antibiotics antibiotics, anitbiotics, anitdepressants, anphetamines, antabuse, antibiotic, antibiotics, antibitics, antibotics
carisoprodol carisoprodol
clonidine clonidine, cloadine, clondine, clonidine, clonine, clonodin, clonodine, colodine, colondine, colonidine, colonodine
cyclobenzaprine cyclobenzaprine, flexaril, flexarill, flexeral, flexerall, flexeril, flexerill, flexerils, flexerol, flexiril, flexirils, flexirl, flexril, flexrill
naloxone naloxone, nalorex, naloxone
naltrexone naltrexone, naltexone, naltraxone, naltrex, naltrexone, naltrexone hydrochloride, naltexone, naltraxone, naltrex, naltrexone, naltrexone hydrochloride
prednisone prednisone, predensone, predinsone, predisolone, predisone, prednisolone, prednison, prednisone, prednizone
pregabalin pregabalin, lyrica
quetiapine quetiapine, seraquel, seraquil, sereoqol, serequel, serequil, serezone, seriquil, seroqel, seroquel, seroquell, seroquels, seroquil, serqual, serquil
steroids steroids, roids
vitamins vitamins, vitaimns, vitamans, vitamians, vitamines, vitamins, vitamns, vitams, vitc, vite, vitiamins, vitiams, vitimans, vitimins, vits, supplements
zaleplon zaleplon, sonata
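A resolution map of this kind is typically applied by inverting it into a lookup from surface term to canonical drug name. The sketch below uses three rows copied from Table C.1; the variable and function names are hypothetical, and lowercasing and whitespace stripping are assumed normalization steps rather than a documented part of our pipeline.

```python
# A small excerpt of Table C.1: canonical drug name -> resolved surface terms.
RESOLUTION_EXCERPT = {
    "zolpidem": ["zolpidem", "ambein", "ambian", "ambiem", "ambien"],
    "pregabalin": ["pregabalin", "lyrica"],
    "steroids": ["steroids", "roids"],
}


def build_lookup(resolution_map):
    """Invert the map so that each surface term points to its drug name."""
    lookup = {}
    for drug, terms in resolution_map.items():
        for term in terms:
            lookup[term.lower()] = drug
    return lookup


def resolve(term, lookup):
    """Return the canonical drug name for a term, or None if unresolved."""
    return lookup.get(term.strip().lower())


LOOKUP = build_lookup(RESOLUTION_EXCERPT)
```

Returning None for unknown terms keeps unresolved classifier output visible, so that it can be reviewed and added to the map by hand.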
Table C.2: The default feature list for Stanford’s NER classifier is at nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/ie/NERFeatureFactory.html. Here, we list all features whose default values were changed to train our DOC classifier.
Feature Name Feature Value
useTag true
useClassFeature true
useWord true
maxNGramLeng 3
useNGrams true
usePrev true
useNext true
useSequences true
usePrevSequences true
maxLeft 1
useTypeSeqs false
useTypeSeqs2 false
useTypeySequences false
wordShape chris2useLC
useLemmas true
useDistSim true
distSimLexicon We used Twitter word clusters [189] and word clusters generated using the Brown hierarchical word clustering algorithm [32,157] on all MedHelp posts.
useDisjunctive true
disjunctionWidth 3
cleanGazette true
gazette We utilized a dictionary composed from several online lists of commonly misused substances. Table C.3 shows all dictionary terms.
Table C.3: Gazette of common substances used as a feature in the DOC classifier. This gazette was compiled from a range of online resources.
Acamprosate, acid, actiq, adderall, aerosol propellants, alcohol, alprazolam, ambien, amidone, amobarbital, amphetamine, amphetamines, amytal, anadrol, anexsia, angel dust, antabuse, apache, ativan, avinza
Barbs, beer, bennies, bidis, big o, biocodone, biocondone, biphetamine, biscuits, black beauties, black stuff, blue heaven, blues, blunt, buprenorphine, butalbital, butane propane, butorphanol
Cactus, campral, captain cody, carisoprodol, cat valium, chalk, charlie, china girl, china white, chlordiazepoxide, cigarettes, cigars, clarity, clonazepam, clonidine, cocaine, cocaine hydrochloride, codeine, cody, coke, concerta, crack, crack cocaine, crank, crosses, crystal, crystal meth, cubes, cyclohexyl
Damason-p, dance fever, darvocet, darvon, demerol, demmies, depade, depo-testosterone, desoxyn, dexedrine, dextroamphetamine, dextromethorphan, dextropropoxyphene, dextrostat, di-gesic, diacetylmorphine, diazepam, dicodid, dilaudid, dillies, disulfiram, dolophine, dope, downers, duodin, durabolin, duragesic, duramorph, dxm
Ecstasy, empirin, empirin with codeine, equipoise, eszopiclone
Fentanyl, fioricet, fiorinal, fiorinal with codeine, fizzies, flake, flunitrazepam, forget-me pill
Gamma-hydroxybutyrate, ganja, gasoline, georgia home boy, ghb, glues, goodfella, goop, grievous bodily harm, gym candy
Halcion, hash, hash oil, hearts, hemp, heroin, hillbilly, hycodan, hydrococet, hydrocodone, hydromorphone, hydros
Inhalant, isoamyl isobutyl
Jackpot, jif, joint
Kadian, kapanol, ketalar sv, ketamine, klonopin
La turnaround, laam, laudanum, laughing gas, levacetylmethadol, librium, liquid ecstasy, liquid x, liquor, little smoke, lorazepam, lorcet, lortab, love boat, lover’s speed, lsd, luminal, lunesta, lysergic acid diethylamide
Magic mint, magic mushrooms, maria pastora, marijuana, mary jane, meperidine, meperidine hydrochloride, mesc, mescaline, meth, methadone, methadose, methadrine, methamphetamine, methaqualone, methylphenidate, mexican valium, microdot yellow sunshine, miss emma, monkey, morphine, mrs. o, ms contin, msir, murder 8, mushrooms
Naltrexone, nembutal, nitrites, nitrous oxide, norco, numorphone, numporphan
O bomb, o.c., octagons, opana, opium, oramorph, orlaam, oxandrin, oxy, oxycet, oxycodone, oxycontin, oxycotton
Paint thinners, palladone, panacet, paregoric, pcp, peace, peace pill, pentobarbital, percocet, percocet:oxy, percodan, percs, peyote, phencyclidine, phennies, phenobarbital, poppers, pot, pumpers, purple passion
Quaalude
R-ball, red birds, reds, reefer, revia, ritalin, roach, robitussin, robitussin a, robitussin a-c, robitussin b, robitussin c, robo, robotripping, roche, rohypnol, roids, roofies, roofinol, rophies, roxanol, roxicodone, roxicondone, ryzolt
Sally-d, salvia, schoolboy, secobarbital, seconal, shepherdess’s herb, shrooms, sinsemilla, skag, skippy, sleeping pills, smack, smoke, snappers, solvents, soma, sonata, special k, speed, steroids, stilnox, stop signs, sublimaze, suboxone, subutex, symtan
Tango and cash, temesta, the smart drug, tnt, tooies, toot, tramadol, tramal, tranks, triazolam, triple c, truck drivers, tussionex, tylenol, tylenol with codeine, tylox
Ultram, uppers
Valium, vicodin, vicoprofen, vike, vitamin k, vitamin r, vivitrol
Watson-387, weed, white horse, white stuff, wine
Xanax, xodol
Yellow jackets, yellows
Zaleplon, zolpidem, zydone
Appendix D
F77 Phase Supplementary Material
Table D.1: LIWC features for the three classes in the labeled dataset over initiating posts. Only statistically significant variables are shown. Statistical significance is determined using Kruskal-Wallis tests (* p < 0.05; ** p < 0.005; *** p < 0.001) after Bonferroni corrections to adjust for family-wise error rate across all 184 variables (includes activity features). Column c denotes (◦) if the feature is used in our CRF classifier.
Initiating Post Linguistic Features
Feature c p USING (Mean Median SD) WITHDRAWAL (Mean Median SD) POST-WITHDRAWAL (Mean Median SD)
Word count ◦ * 208.20 151 211.06 178.92 127.00 168.81 183.23 124.50 209.24
Dic ◦ *** 89.26 90.17 4.89 88.10 88.89 6.26 89.38 90.54 6.59
Numerals ◦ *** 1.28 0.89 1.51 1.75 1.33 1.97 1.32 0.83 2.04
Function words ◦ *** 60.50 60.92 5.31 58.40 59.28 6.45 59.74 60.48 7.07
Pronoun *** 18.51 18.68 4.32 16.99 17.17 4.70 17.97 18.16 5.28
Personal pronoun ◦ *** 12.83 13.05 3.83 11.49 11.54 4.19 11.88 11.86 4.60
Pronoun: I ◦ *** 9.72 9.97 3.60 9.02 9.14 3.76 7.89 8.18 4.31
Pronoun: you *** 0.98 0.41 1.70 1.02 0.13 2.04 2.05 0.99 2.89
Pronoun: he/she ◦ *** 1.14 0 2.08 0.74 0 1.82 1 0 2.46
Pronoun: they ◦ *** 0.65 0.20 1.05 0.47 0 1.12 0.54 0 1.03
Pronoun: impers. * 5.68 5.33 2.82 5.49 5.26 2.81 6.09 5.76 3.35
Verb ◦ ** 18.54 18.69 3.76 17.64 17.59 4.20 18.13 17.96 4.91
Present tense ◦ *** 12.56 12.55 3.90 11.53 11.24 4.09 11.95 11.63 4.45
Numbers ◦ ** 0.71 0.48 0.93 0.75 0.37 1.12 0.54 0 0.89
Social ◦ *** 7.60 6.59 4.79 6.38 5.26 5.18 8.85 7.89 5.90
Humans ◦ * 0.49 0 0.76 0.40 0 0.79 0.57 0 1.04
Affect ◦ *** 5.30 5.00 2.76 5.76 5.54 3.09 6.41 6.11 3.52
Affect: positive ◦ *** 2.80 2.45 1.99 3.33 2.86 2.85 4.14 3.50 3.16
Affect: anxiety ◦ ** 0.61 0.25 0.88 0.55 0 0.98 0.45 0 0.90
Cognitive Mech. ◦ * 17.27 16.98 4.50 17.14 17.09 4.95 17.93 17.96 5.11
Certain ◦ * 1.21 1.03 1.22 1.41 1.21 1.41 1.57 1.36 1.53
Inhibition ◦ * 0.50 0.23 0.70 0.41 0 0.74 0.43 0 0.76
See ◦ * 0.34 0 0.65 0.30 0 0.80 0.50 0 1.14
Feel ◦ *** 0.73 0.45 1.10 1.18 0.83 1.50 0.85 0.50 1.23
Biological ◦ *** 3.87 3.46 2.63 4.01 3.70 2.90 3.31 2.89 2.72
Body ◦ *** 0.58 0 1 1.13 0.63 1.53 0.68 0 1.12
Health ◦ *** 3.00 2.63 2.29 2.58 2.13 2.36 2.20 1.72 2.25
Relative ◦ *** 13.46 13.39 4.65 15.04 14.75 5.25 13.72 13.61 5.23
Time ◦ *** 7.24 6.86 3.46 8.51 7.87 4.21 7.33 7.02 4.23
Home ◦ *** 0.30 0 0.54 0.40 0 0.77 0.68 0.14 1.18
Comma ◦ ** 3.01 2.17 3.36 2.75 1.94 3.27 2.19 1.63 2.43
QMark ◦ * 1.35 0.52 2.87 1.34 0.40 2.58 1.50 0 4.92
Other Punctuation ◦ *** 0.81 0 1.77 0.89 0 1.91 0.62 0 2.05
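The significance markers in this appendix follow from a Bonferroni correction over 184 variables. The sketch below shows one common way to apply the correction, by scaling each raw p-value by the number of tests before comparing it with the thresholds in the caption. Whether p-values or thresholds were scaled in our analysis is not specified here, so this formulation is an assumption, as are the function names; the Kruskal-Wallis test itself is not reimplemented.

```python
N_VARIABLES = 184  # number of variables tested, as stated in the table captions


def bonferroni(p_values, n_tests=N_VARIABLES):
    """Scale each raw p-value by the number of tests, capping at 1."""
    return [min(1.0, p * n_tests) for p in p_values]


def stars(p_adjusted):
    """Map an adjusted p-value to the markers used in the table captions."""
    if p_adjusted < 0.001:
        return "***"
    if p_adjusted < 0.005:
        return "**"
    if p_adjusted < 0.05:
        return "*"
    return ""  # not significant: such variables are omitted from the tables
```

Under this formulation a raw p-value must fall below roughly 0.05 / 184 (about 0.00027) to earn a single star, which is why only strongly separated features appear in the tables.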
Table D.2: LIWC features for the three classes in the labeled dataset. Only statistically significant variables are shown. Statistical significance is determined using Kruskal-Wallis tests (* p < 0.05; ** p < 0.005; *** p < 0.001) after Bonferroni corrections to adjust for family-wise error rate across all 184 variables (includes activity features). Column c denotes (◦) if the feature is used in our CRF classifier.

Response Post Linguistic Features

                                 USING              WITHDRAWAL         POST-WITHDRAWAL
Feature             c p     Mean Median     SD   Mean Median     SD   Mean Median     SD
Word count            *** 494.69 347.00 506.67 427.38 284.00 487.46 356.29 210.50 439.75
Words per sentence    ***  19.21  15.40  18.60  17.04  14.09  14.73  14.98  12.99  14.25
Numerals            ◦ *     0.75   0.43   1.02   0.95   0.68   1.12   0.95   0.56   1.49
Function words        ***  59.01  59.85   4.56  56.95  57.69   5.41  55.82  57.06   7.17
Personal Pronouns     ***  10.86  11.36   3.99  10.21  10.53   3.81  10.86  11.58   4.71
Pronoun: she/he       **    0.68      0   1.35   0.44      0   1.16   0.64      0   1.63
Pronoun: they         ***   0.66   0.41   0.91   0.49   0.27   0.66   0.49   0.13   0.90
Pronoun: impers.      **    5.48   5.67   2.20   5.57   5.78   2.36   5.10   5.32   2.75
Article               ***   4.91   4.98   2.06   4.75   4.96   2.02   4.20   4.41   2.23
Verb                  **   17.26  18.15   4.94  17.13  17.82   4.88  16.09  17.23   5.78
Aux. verb             ***  10.67  11.11   3.51  10.37  10.68   3.44   9.66  10.33   3.96
Future                ***   1.50   1.44   1.07   1.50   1.43   1.13   1.10   1.01   1.03
Preposition           ***  11.63  12.27   3.57  11.19  11.66   3.38  10.61  11.51   4.14
Conjunction           ***   6.39   6.76   2.33   6.18   6.58   2.46   5.72   6.13   2.69
Quantitative          ***   3.00   2.99   1.52   2.94   2.88   1.64   2.50   2.58   1.67
Social              ◦ ***  10.26  10.11   4.77   8.83   8.75   4.23   9.78   9.81   5.45
Affect              ◦ ***   5.73   5.76   2.68   6.55   6.34   3.25   7.54   7.33   4.31
Affect: positive    ◦ ***   3.72   3.53   2.43   4.61   4.10   3.17   5.84   5.13   4.36
Affect: negative    ◦ ***   1.96   1.92   1.34   1.90   1.87   1.33   1.67   1.50   1.51
Affect: anxiety     ◦ ***   0.36   0.24   0.47   0.40   0.23   0.55   0.32      0   0.61
Cognitive Proc.       ***  19.37  17.81   7.77  18.71  17.43   7.83  18.77  16.80     10
Discrepancy           ***   2.32   2.32   1.31   1.92   1.88   1.33   1.63   1.60   1.30
Tentative             ***   3.35   3.25   1.79   3.12   3.09   1.77   2.55   2.45   1.96
Exclusive             ***   3.35   3.40   1.62   3.07   3.18   1.66   2.56   2.60   1.83
Perceptual proc.      ***   1.52   1.48   1.07   1.90   1.81   1.34   1.87   1.68   1.55
Feel                  ***   0.64   0.53   0.70   0.91   0.76   0.85   0.65   0.45   0.76
Biological            ***   3.46   3.20   2.17   3.42   3.22   2.46   2.71   2.41   2.39
Body                  ***   0.52   0.28   0.78   0.78   0.45   1.08   0.52   0.19   0.90
Health              ◦ ***   2.68   2.45   1.85   2.24   1.95   1.90   1.70   1.32   1.76
Sexual                ***   0.15      0   0.35   0.14      0   0.36   0.30      0   0.89
Ingestion             *     0.17      0   0.39   0.30      0   0.66   0.25      0   0.71
Relativity            **   11.46  11.82   4.39  12.36  12.68   4.70  11.90  12.50   5.37
Time                  **    5.29   5.10   2.90   5.88   6.06   3.12   5.66   5.69   3.33
Money                 *     0.32   0.13   0.55   0.28      0   0.56   0.23      0   0.42
Assent              ◦ ***   0.27   0.07   0.50   0.40   0.18   0.81   0.62   0.27   2.01
Colon                 **    0.09      0   0.20   0.15      0   0.42   0.27      0   0.84
Exclamation         ◦ ***   1.02   0.34   1.79   2.25   0.82   5.08   4.52   1.68   8.40
Dash                  **    0.79   0.28   2.08   0.82      0   2.20   0.62      0   1.64
Other punct.        ◦ ***   3.41   2.84   2.55   4.29   3.53   3.22   5.64   4.29   6.35
All punct.          ◦ ***  22.07  21.51   9.71  25.75  23.69  14.52  29.69  26.82  19.27
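
The significance markers reported in Table D.2 can be illustrated with a short sketch. It is not the analysis code used for this work: it is a minimal, standard-library example that computes a tie-corrected Kruskal-Wallis H statistic for exactly three groups, obtains the p-value from the closed-form chi-square tail with two degrees of freedom, and applies a Bonferroni adjustment for 184 tested variables. The function names and the toy data are assumptions made for illustration.

```python
import math
from itertools import chain


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def kruskal_wallis_3(a, b, c):
    """Tie-corrected H statistic and p-value for exactly three groups."""
    groups = [a, b, c]
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum * rank_sum / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    correction = 1 - sum(t ** 3 - t for t in counts.values()) / (n ** 3 - n)
    if correction == 0:  # every observation identical: no evidence of difference
        return 0.0, 1.0
    h /= correction
    # With three groups H is referred to chi-square with 2 degrees of freedom,
    # whose upper tail probability is exactly exp(-h / 2).
    return h, math.exp(-h / 2)


def bonferroni(p, n_tests=184):
    """Bonferroni-adjusted p-value; 184 is the variable count from the caption."""
    return min(1.0, p * n_tests)


def stars(p_adjusted):
    """Map an adjusted p-value to the markers used in the table captions."""
    if p_adjusted < 0.001:
        return "***"
    if p_adjusted < 0.005:
        return "**"
    if p_adjusted < 0.05:
        return "*"
    return ""


# Toy data (hypothetical): three perfectly separated groups of three values.
h, p = kruskal_wallis_3([1, 2, 3], [4, 5, 6], [7, 8, 9])
marker = stars(bonferroni(p))
```

With only nine toy observations the unadjusted p-value is about 0.027, which does not survive multiplication by 184; the markers in the table therefore reflect much stronger separation between classes than this small example produces.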
Table D.3: Activity and content-based features for the three classes in the labeled dataset. Statistical significance is determined using Kruskal-Wallis tests (* p < 0.05; ** p < 0.005; *** p < 0.001) after Bonferroni corrections to adjust for family-wise error rate across all 184 variables (includes 160 LIWC variables). Column c denotes (◦) if the feature is used in our CRF classifier.

                                                          USING                     WITHDRAWING                  RECOVERING
Feature                                   c p     Mean    Med    IQR    MAD   Mean    Med    IQR    MAD   Mean    Med    IQR    MAD

Activity Characteristics

All time
# initiating posts authored                 ***   8.84   5.00  10.00   5.93   8.78   5.00   8.00   4.45  20.73  14.00  22.00  13.34
# self responses authored                   ***  13.93   5.00  18.00   7.41  13.80   8.00  15.00   8.90  33.26  23.00  36.25  23.72
# responses authored                        ***  26.90   6.00  21.00   8.90  23.61   8.00  21.00  10.38 178.69  67.00 159.25  83.77
# initiating posts / # responses authored ◦ ***   1.36   1.00   1.31   1.02   1.28   0.82   1.26   0.85   0.53   0.22   0.35   0.21
Days since last init. post                ◦ ***  50.94   5.00  24.00   5.93  21.04   2.00   5.00   1.48  31.04   4.00  12.00   4.45
Days since last self resp.                ◦ ***  66.34   9.00  43.50  11.86  29.94   2.00   8.00   1.48  42.05   6.00  17.00   7.41
Days since last response                  ◦ ***  73.37   5.00  27.00   5.93  33.51   2.00   6.00   1.48  28.68   2.00   5.00   1.48
Days since last activity                    ***  39.56   3.00  13.00   2.97  16.66   1.00   2.00   0.00  17.76   1.00   4.00   0.00

Last 5 days
# initiating posts authored               ◦ ***   0.93   0.00   1.00   0.00   2.01   1.00   3.00   1.48   1.81   1.00   3.00   1.48
# self responses authored                 ◦ ***   1.37   0.00   2.00   0.00   3.32   1.00   4.00   1.48   2.89   0.00   4.00   0.00
# responses authored                      ◦ ***   1.87   0.00   2.00   0.00   5.48   1.00   6.00   1.48  15.20   5.00  16.00   7.41
# initiating posts / # responses authored ◦ ***   1.02   1.00   0.00   0.00   1.06   1.00   0.58   0.64   0.58   0.33   0.87   0.42

Today
# replies received                          .     5.15   4.00   5.00   2.97   5.52   4.00   5.00   2.97   6.09   4.00   6.00   4.45
# respondents                               .     3.82   3.00   3.00   2.97   4.05   3.00   3.00   2.97   4.68   3.00   4.00   2.97
# self responses                            **    1.57   1.00   2.00   1.48   1.89   1.00   3.00   1.48   1.53   1.00   2.00   1.48

Post and Response Content Characteristics

Initiating
Days clean                                ◦ *** 421.15  14.00 175.00  17.79  47.50   5.00   7.00   4.45 125.97  45.00  74.00  43.00
Days mentioned                            ◦ ***  52.10  10.00  38.25  11.86  19.08   5.00   7.00   4.45  57.03  27.00  48.00  28.17
# questions                               ◦ **    2.94   2.00   3.00   1.48   2.35   2.00   2.00   1.48   2.60   2.00   2.00   1.48
# USING terms                             ◦ ***   0.73   0.00   1.00   0.00   0.35   0.00   1.00   0.00   0.25   0.00   0.00   0.00
# WITHDRAWING terms                       ◦ ***   0.50   0.00   1.00   0.00   1.11   1.00   2.00   1.48   0.44   0.00   1.00   0.00
# RECOVERING terms                        ◦ ***   0.38   0.00   1.00   0.00   0.39   0.00   1.00   0.00   0.94   1.00   1.00   1.48

Responses
# USING terms                             ◦ **    0.31   0.00   0.00   0.00   0.19   0.00   0.00   0.00   0.18   0.00   0.00   0.00
# WITHDRAWING terms                       ◦ ***   0.86   0.00   1.00   0.00   1.18   1.00   2.00   1.48   0.76   0.00   1.00   0.00
# RECOVERING terms                        ◦ ***   0.53   0.00   1.00   0.00   0.53   0.00   1.00   0.00   0.78   0.00   1.00   0.00
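
The Med, IQR and MAD columns of Table D.3 are robust summaries suited to heavily skewed activity counts. The sketch below shows one way such values could be computed; it is an illustration, not the code used for this work. Two choices are assumptions: the MAD is multiplied by 1.4826 (many tabulated MAD values, such as 1.48, 2.97, 4.45 and 5.93, are close to multiples of that constant, which suggests but does not confirm this scaling), and quartiles use the "inclusive" rule of Python's statistics module. The function name and toy data are hypothetical.

```python
from statistics import median, quantiles

# Assumed scaling constant that makes the MAD comparable to a standard
# deviation under normality; not stated in the table itself.
MAD_SCALE = 1.4826


def robust_summary(values):
    """Return median, interquartile range and scaled MAD for a sample."""
    med = median(values)
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    mad = MAD_SCALE * median(abs(v - med) for v in values)
    return {"median": med, "iqr": q3 - q1, "mad": mad}


# Toy data (hypothetical): nine evenly spaced counts.
summary = robust_summary([1, 2, 3, 4, 5, 6, 7, 8, 9])
```

For the toy sample the median absolute deviation is 2, so the scaled MAD rounds to 2.97, matching the granularity visible in several rows of the table.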
Bibliography
[1] Alcoholics Anonymous (“Big Book,” 4th ed.). AA World Services, Inc. (2001). [Online: http:
//www.aa.org/bigbookonline, accessed 20-May-2014].
[2] Narcotics Anonymous Annual Membership Survey. Narcotics Anonymous (2011). [Online:
http://www.na.org/admin/include/spaw2/uploads/pdf/PR/NA_Membership_Survey.pdf,
accessed 12-August-2013].
[3] Vital signs: Overdoses of prescription opioid pain relievers United States, 1999-2008. Center for
Disease Control. Morbidity and Mortality Weekly Report. (2011). [Online: http://www.cdc.gov/
mmwr/preview/mmwrhtml/mm6043a4.htm, accessed 93/4/2014.].
[4] Addiction medicine, closing the gap between science and practice. CASAColumbia (2012). [On-
line: http://www.casacolumbia.org/download/file/fid/1177, accessed 4/5/2014.].
[5] Commonly abused prescription drugs. National Institute on Drug Abuse (2012). [Online: http://
www.drugabuse.gov/sites/default/files/rx_drugs_placemat_508c_10052011.pdf, ac-
cessed 28-May-2014].
[6] Opiate withdrawal. MedlinePlus - U.S. National Library of Medicine (2012). [Online: http://www.
nlm.nih.gov/medlineplus/ency/article/000949.htm, accessed 28-May-2014].
[7] Internet user demographics. [Online: http://www.pewinternet.org/data-trend/
internet-use/latest-stats/, accessed 7/1/2014].
[8] Prescription painkiller overdoses: A growing epidemic, especially among women. Vital Signs.
CS238899B. Center for Disease Control. (2013). [Online: http://www.cdc.gov/vitalsigns/
pdf/2013-07-vitalsigns.pdf, accessed 9/4/2014].
[9] State and County QuickFacts. U.S. Census Bureau (2013). [Online: http://quickfacts.
census.gov/qfd/states/00000.html, accessed 28-August-2014].
139
BIBLIOGRAPHY 140
[10] Achrekar, H., Gandhe, A., Lazarus, R., Yu, S.-H., and Liu, B. Predicting flu trends using Twitter
data. In Computer Communications Workshops, IEEE (2011), 702–707.
[11] Ahmad, F., Hudak, P. L., Bercovitz, K., Hollenberg, E., and Levinson, W. Are physicians ready for
patients with Internet-based health information? Journal of Medical Internet Research 8, 3 (2006),
e22.
[12] Alpers, G. W., Winzelberg, A. J., Classen, C., Roberts, H., Dev, P., Koopman, C., and Barr Taylor,
C. Evaluation of computerized text analysis in an Internet breast cancer support group. Computers
in Human Behavior 21, 2 (2005), 361–376.
[13] Anand, S. G., Feldman, M. J., Geller, D. S., Bisbee, A., and Bauchner, H. A content analysis
of e-mail communication between primary care providers and parents. Pediatrics 115, 5 (2005),
1283–1288.
[14] Anderson, J. G., Rainey, M. R., and Eysenbach, G. The impact of cyberhealthcare on the
physician–patient relationship. Journal of Medical Systems 27, 1 (2003), 67–84.
[15] Aramaki, E., Maskawa, S., and Morita, M. Twitter catches the flu: detecting influenza epidemics
using Twitter. In Empirical Methods in Natural Language Processing, ACL (2011), 1568–1576.
[16] Aronson, A. R. Effective mapping of biomedical text to the UMLS Metathesaurus: the MetaMap
program. In American Medical Informatics Association Annual Symposium, AMIA (2001), 17.
[17] Aronson, A. R., and Lang, F.-M. An overview of MetaMap: historical perspective and recent
advances. Journal of the American Medical Informatics Association 17, 3 (2010), 229–236.
[18] Ayers, J. W., Ribisl, K. M., and Brownstein, J. S. Tracking the rise in popularity of electronic nicotine
delivery systems (electronic cigarettes) using search query surveillance. American Journal of
Preventive Medicine 40, 4 (2011), 448–453.
[19] Baccianella, S., Esuli, A., and Sebastiani, F. SentiWordNet 3.0: An enhanced lexical resource for
sentiment analysis and opinion mining. In Language Resources and Evaluation (2010).
[20] Bebbington, P. E. The efficacy of Alcoholics Anonymous: the elusiveness of hard data. The British
Journal of Psychiatry 128, 6 (1976), 572–580.
BIBLIOGRAPHY 141
[21] Bell, V. Online information, extreme communities and Internet therapy: Is the Internet good for our
mental health? Journal of Mental Health 16, 4 (2007), 445–457.
[22] Bender, J. L., Jimenez-Marroquin, M.-C., and Jadad, A. R. Seeking support on Facebook: a
content analysis of breast cancer groups. Journal of Medical Internet Research 13, 1 (2011), e16.
[23] Benton, A., Ungar, L., Hill, S., Hennessy, S., Mao, J., Chung, A., Leonard, C. E., and Holmes,
J. H. Identifying potential adverse effects using the web: A new approach to medical hypothesis
generation. Journal of Biomedical Informatics 44, 6 (2011), 989–996.
[24] Berger, M., Wagner, T. H., and Baker, L. C. Internet use and stigmatized illness. Social Science &
Medicine 61, 8 (2005), 1821–1827.
[25] Berland, G. K., Elliott, M. N., Morales, L. S., Algazy, J. I., Kravitz, R. L., Broder, M. S., Kanouse,
D. E., Munoz, J. A., Puyol, J.-A., Lara, M., et al. Health information on the Internet: accessibility,
quality, and readability in English and Spanish. Journal of the American Medical Association 285,
20 (2001), 2612–2621.
[26] Bernstein, M. S., Little, G., Miller, R. C., Hartmann, B., Ackerman, M. S., Karger, D. R., Crowell,
D., and Panovich, K. Soylent: a word processor with a crowd inside. In User Interface Software
and Technology, ACM (2010), 313–322.
[27] Birnbaum, H. G., White, A. G., Schiller, M., Waldman, T., Cleveland, J. M., and Roland, C. L.
Societal costs of prescription opioid abuse, dependence, and misuse in the United States. Pain
Medicine 12, 4 (2011), 657–667.
[28] Biyani, P., Caragea, C., Mitra, P., and Yen, J. Identifying emotional and informational support in
online health communities. In Computational Linguistics, ICCL (2014), 827–836.
[29] Braithwaite, D. O., Waldron, V. R., and Finn, J. Communication of social support in computer-
mediated groups for people with disabilities. Health Communication 11, 2 (1999), 123–151.
[30] Braun, V., and Clarke, V. Using thematic analysis in psychology. Qualitative Research in Psychol-
ogy 3, 2 (2006), 77–101.
[31] Brennan, P. F., and Aronson, A. R. Towards linking patients and clinical information: detecting
UMLS concepts in e-mail. Journal of Biomedical Informatics 36, 4 (2003), 334–341.
BIBLIOGRAPHY 142
[32] Brown, P. F., Desouza, P. V., Mercer, R. L., Pietra, V. J. D., and Lai, J. C. Class-based n-gram
models of natural language. In Computational Linguistics, vol. 18, ICCL (1992), 467–479.
[33] Brownstein, J. S., Freifeld, C. C., Reis, B. Y., and Mandl, K. D. Surveillance Sans Fron-
tieres: Internet-based emerging infectious disease intelligence and the HealthMap project. PLoS
Medicine 5, 7 (2008), e151.
[34] Buchanan, H., and Coulson, N. S. Accessing dental anxiety online support groups: An exploratory
qualitative study of motives and experiences. Patient Education and Counseling 66, 3 (2007),
263–269.
[35] Buehler, J. W., Berkelman, R. L., Hartley, D. M., and Peters, C. J. Syndromic surveillance and
bioterrorism-related epidemics. Emerging Infectious Diseases 9, 10 (2003), 1197.
[36] Buis, L. R. Emotional and informational support messages in an online hospice support commu-
nity. Computers Informatics Nursing 26, 6 (2008), 358–367.
[37] Bundorf, M. K., Wagner, T. H., Singer, S. J., and Baker, L. C. Who searches the Internet for health
information? Health Services Research 41, 3p1 (2006), 819–836.
[38] Butler, D. When google got flu wrong. Nature 494, 7436 (2013), 155.
[39] Card, S. K., Mackinlay, J. D., Pirolli, P. L., and Pitkow, J. E. Method and apparatus for clustering a
collection of linked documents using co-citation analysis, 2000. US Patent 6,038,574.
[40] Carmichael, A. Infertility-Asthma Link Confirmed. Cure Together Blog. [Online:
www.curetogether.com/blog/2011/03/07/infertility-asthma-link-confirmed, ac-
cessed 15-Sept-2013].
[41] Carneiro, H. A., and Mylonakis, E. Google trends: a web-based tool for real-time surveillance of
disease outbreaks. Clinical Infectious Diseases 49, 10 (2009), 1557–1564.
[42] Cartright, M.-A., White, R. W., and Horvitz, E. Intentions and attention in exploratory health search.
In Research and Development in Information Retrieval, ACM SIGIR (2011), 65–74.
[43] Chapman, W. W., Fiszman, M., Dowling, J. N., Chapman, B. E., and Rindflesch, T. C. Identifying
respiratory findings in emergency department reports for biosurveillance using MetaMap. Medinfo
11, Pt 1 (2004), 487–91.
BIBLIOGRAPHY 143
[44] Chary, M., Genes, N., McKenzie, A., and Manini, A. F. Leveraging social networks for toxicovigi-
lance. Journal of Medical Toxicology 9, 2 (2013), 184–191.
[45] Chee, B. W., Berlin, R., and Schatz, B. Predicting adverse drug events from personal health
messages. In American Medical Informatics Association Annual Symposium, AMIA (2011), 217.
[46] Cicero, T. J., Ellis, M. S., and Surratt, H. L. Effect of abuse-deterrent formulation of oxycontin. New
England Journal of Medicine 367, 2 (2012), 187–189.
[47] Civan, A., and Pratt, W. Threading together patient expertise. In American Medical Informatics
Association Annual Symposium, AMIA (2007), 140.
[48] Cleveland, W. S., and Devlin, S. J. Locally weighted regression: an approach to regression analy-
sis by local fitting. Journal of the American Statistical Association 83, 403 (1988), 596–610.
[49] Cline, R. J., and Haynes, K. M. Consumer health information seeking on the Internet: the state of
the art. Health Education Research 16, 6 (2001), 671–692.
[50] Cohen, J. Weighted kappa: Nominal scale agreement provision for scaled disagreement or partial
credit. Psychological Bulletin 70, 4 (1968), 213.
[51] Coiera, E. Information epidemics, economics, and immunity on the Internet: We still know so little
about the effect of information on public health. British Medical Journal 317, 7171 (1998), 1469.
[52] Collier, N., Doan, S., Kawazoe, A., Goodwin, R. M., Conway, M., Tateno, Y., Ngo, Q.-H., Dien, D.,
Kawtrakul, A., Takeuchi, K., et al. Biocaster: detecting public health rumors with a web-based text
mining system. Bioinformatics 24, 24 (2008), 2940–2941.
[53] Cooper, C. P., Mallon, K. P., Leadbetter, S., Pollack, L. A., and Peipins, L. A. Cancer Internet
search activity on a major search engine, United States 2001-2003. Journal of Medical Internet
Research 7, 3 (2005), e36.
[54] Corazza, O., Valeriani, G., Bersani, F. S., Corkery, J., Martinotti, G., Bersani, G., and Schifano,
F. “Spice”, “Kryptonite”, “Black Mamba”: An Overview of Brand Names and Marketing Strategies
of Novel Psychoactive Substances on the Web. Journal of Psychoactive Drugs 46, 4 (2014),
287–294.
BIBLIOGRAPHY 144
[55] Corley, C., Mikler, A. R., Singh, K. P., and Cook, D. J. Monitoring influenza trends through mining
social media. In Bioinformatics and Computational Biology (2009), 340–346.
[56] Corley, C. D., Cook, D. J., Mikler, A. R., and Singh, K. P. Text and structural data mining of influenza
mentions in web and social media. International Journal of Environmental Research and Public
Health 7, 2 (2010), 596–615.
[57] Cotten, S. R., and Gupta, S. S. Characteristics of online and offline health information seekers
and factors that discriminate between them. Social Science & Medicine 59, 9 (2004), 1795–1806.
[58] Coulson, N. S. Receiving social support online: an analysis of a computer-mediated support group
for individuals living with irritable bowel syndrome. CyberPsychology & Behavior 8, 6 (2005), 580–
584.
[59] Coulson, N. S., Buchanan, H., and Aubeeluck, A. Social support in cyberspace: a content analysis
of communication within a Huntington’s disease online support group. Patient Education and
Counseling 68, 2 (2007), 173–178.
[60] Coulson, N. S., and Knibb, R. C. Coping with food allergy: exploring the role of the online support
group. CyberPsychology & Behavior 10, 1 (2007), 145–148.
[61] Coursaris, C. K., and Liu, M. An analysis of social support exchanges in online HIV/AIDS self-help
groups. Computers in Human Behavior 25, 4 (2009), 911–918.
[62] Culotta, A. Towards detecting influenza epidemics by analyzing Twitter messages. In workshop
on Social Media Analytics, ACM (2010), 115–122.
[63] Culotta, A. Lightweight methods to estimate influenza rates and alcohol sales volume from Twitter
messages. Language Resources and Evaluation 47, 1 (2013), 217–238.
[64] Culver, J. D., Gerr, F., Frumkin, H., et al. Medical information on the Internet. Journal of General
Internal Medicine 12, 8 (1997), 466–470.
[65] Curtis, B., Alanis-Hirsch, K., Kaynak, O., Cacciola, J., Meyers, K., and McLellan, A. T. Using
web searches to track interest in synthetic cannabinoids (aka “herbal incense”). Drug and Alcohol
Review 34, 1 (2014), 105–108.
BIBLIOGRAPHY 145
[66] Dasgupta, N., Freifeld, C., Brownstein, J. S., Menone, C. M., Surratt, H. L., Poppish, L., Green,
J. L., Lavonas, E. J., and Dart, R. C. Crowdsourcing black market prices for prescription opioids.
Journal of Medical Internet Research 15, 8 (2013), e178.
[67] Davison, K. P., Pennebaker, J. W., and Dickerson, S. S. Who talks? The social psychology of
illness support groups. American Psychologist 55, 2 (2000), 205.
[68] De Bock, G. H., Jacobi, C. E., Seynaeve, C., Krol-Warmerdam, E. M., Blom, J., Van Asperen, C. J.,
Cornelisse, C. J., Klijn, J. G., Devilee, P., Tollenaar, R. A., et al. A family history of breast cancer
will not predict female early onset breast cancer in a population-based setting. BMC Cancer 8, 1
(2008), 203.
[69] De Choudhury, M., Counts, S., and Horvitz, E. Major life changes and behavioral markers in social
media: case of childbirth. In Computer Supported Cooperative Work, ACM (2013), 1431–1442.
[70] De Choudhury, M., Counts, S., and Horvitz, E. Predicting postpartum changes in emotion and
behavior via social media. In Human Factors in Computing Systems, ACM (2013), 3267–3276.
[71] De Choudhury, M., Counts, S., Horvitz, E. J., and Hoff, A. Characterizing and predicting post-
partum depression from shared Facebook data. In Computer Supported Cooperative Work, ACM
(2014), 626–638.
[72] De Choudhury, M., Gamon, M., Counts, S., and Horvitz, E. Predicting depression via social media.
In International Conference on Weblogs and Social Media, AAAI (2013).
[73] Deluca, P., Davey, Z., Corazza, O., Di Furia, L., Farre, M., Flesland, L. H., Mannonen, M., Majava,
A., Peltoniemi, T., Pasinetti, M., et al. Identifying emerging trends in recreational drug use; out-
comes from the Psychonaut Web Mapping Project. Progress in Neuro-Psychopharmacology and
Biological Psychiatry 39, 2 (2012), 221–226.
[74] Diaz, J. A., Griffith, R. A., Ng, J. J., Reinert, S. E., Friedmann, P. D., and Moulton, A. W. Patients’
use of the Internet for medical information. Journal of General Internal Medicine 17, 3 (2002),
180–185.
[75] DiClemente, C. C., Prochaska, J. O., Fairhurst, S. K., Velicer, W. F., Velasquez, M. M., and Rossi,
J. S. The process of smoking cessation: an analysis of precontemplation, contemplation, and
preparation stages of change. Journal of Consulting and Clinical Psychology 59, 2 (1991), 295.
BIBLIOGRAPHY 146
[76] Dingare, S., Nissim, M., Finkel, J., Manning, C., and Grover, C. A system for identifying named
entities in biomedical text: How results from two evaluations reflect on both the system and the
evaluations. Comparative and Functional Genomics 6, 1-2 (2005), 77–85.
[77] Doing-Harris, K. M., and Zeng-Treitler, Q. Computer-assisted update of a consumer health vocab-
ulary through mining of social network data. Journal of Medical Internet Research 13, 2 (2011),
e37.
[78] Dunning, T. Accurate methods for the statistics of surprise and coincidence. Computational Lin-
guistics 19, 1 (1993), 61–74.
[79] DuPont, R. L., McLellan, A. T., White, W. L., Merlo, L. J., and Gold, M. S. Setting the standard
for recovery: Physicians’ health programs. Journal of Substance Abuse Treatment 36, 2 (2009),
159–171.
[80] Esquivel, A., Meric-Bernstam, F., and Bernstam, E. V. Accuracy and self correction of information
received from an Internet breast cancer list: content analysis. British Medical Journal 332, 7547
(2006), 939–942.
[81] Eysenbach, G. Infodemiology: tracking flu-related searches on the web for syndromic surveillance.
In American Medical Informatics Association Annual Symposium, AMIA (2006), 244–248.
[82] Eysenbach, G., and Kohler, C. How do consumers search for and appraise health information on
the world wide web? Qualitative study using focus groups, usability tests, and in-depth interviews.
British Medical Journal 324, 7337 (2002), 573.
[83] Eysenbach, G., Powell, J., Kuss, O., and Sa, E.-R. Empirical studies assessing the quality of
health information for consumers on the world wide web: a systematic review. Journal of the
American Medical Association 287, 20 (2002), 2691–2700.
[84] Farrell, M. Opiate withdrawal. Addiction 89, 11 (1994), 1471–1475.
[85] Fernandez-Luque, L., Karlsen, R., and Bonander, J. Review of extracting information from the
social web for health personalization. Journal of Medical Internet Research 13, 1 (2011), e15.
[86] Finfgeld, D. L. Therapeutic groups online: the good, the bad, and the unknown. Issues in Mental
Health Nursing 21, 3 (2000), 241–255.
BIBLIOGRAPHY 147
[87] Finkel, J., Dingare, S., Nguyen, H., Nissim, M., Manning, C., and Sinclair, G. Exploiting context
for biomedical entity recognition: from syntax to the web. In joint workshop on Natural Language
Processing in Biomedicine and its Applications, ACL (2004), 88–91.
[88] Fleiss, J. L. Measuring nominal scale agreement among many raters. Psychological Bulletin 76,
5 (1971), 378.
[89] Fox, N., Ward, K., and O’Rourke, A. Pro-anorexia, weight-loss drugs and the Internet: an “anti-
recovery” explanatory model of anorexia. Sociology of Health & Illness 27, 7 (2005), 944–971.
[90] Fox, S. Peer-to-Peer Health Care. Pew Internet & American Life Project, 2011. [Online:
http://pewinternet.org/Reports/2011/P2PHealthcare/Summary-of-Findings.aspx, ac-
cessed 6-January-2014].
[91] Fox, S., and Duggan, M. Health Online. Pew Internet & American Life Project, 2013.
[Online: http://pewinternet.org/Reports/2013/Health-online/Summary-of-Findings.
aspx, accessed 2-April-2013].
[92] Fox, S., and Rainie, L. Vital Decisions: How Internet Users Decide what In-
formation to Trust when They Or Their Loved Ones are Sick. Pew Internet &
American Life Project, 2002. [Online: http://www.pewinternet.org/2002/05/22/
vital-decisions-a-pew-internet-health-report/, accessed 2-April-2013].
[93] Franklin, V. L., Waller, A., Pagliari, C., and Greene, S. A. A randomized controlled trial of Sweet
Talk, a text-messaging system to support young people with diabetes. Diabetic Medicine 23, 12
(2006), 1332–1338.
[94] Frantzi, K., Ananiadou, S., and Mima, H. Automatic recognition of multi-word terms: the c-
value/nc-value method. International Journal on Digital Libraries 3, 2 (2000), 115–130.
[95] Friedrich, C. M., Revillion, T., Hofmann, M., and Fluck, J. Biomedical and chemical named entity
recognition with conditional random fields: the advantage of dictionary features. In Semantic
Mining in Biomedicine, vol. 7 (2006), 85–89.
[96] Frost, J. H., and Massagli, M. P. Social uses of personal health information within PatientsLikeMe,
an online community: what can happen when patients have access to one anothers data. Journal
of Medical Internet Research 10, 3 (2008), e15.
BIBLIOGRAPHY 148
[97] Gade, E. J., Thomsen, S. F., Lindenberg, S., Kyvik, K. O., Lieberoth, S., and Backer, V. Asthma
affects time to pregnancy and fertility: a register-based twin study. European Respiratory Journal
43, 4 (2014), 1077–1085.
[98] Gavin, J., Rodham, K., and Poyer, H. The presentation of “pro-anorexia” in online group interac-
tions. Qualitative Health Research 18, 3 (2008), 325–333.
[99] Gibbs, R. D., Gibbs, P. H., and Henrich, J. Patient understanding of commonly used medical
vocabulary. The Journal of Family Practice 25, 2 (1987), 176–178.
[100] Ginsberg, J., Mohebbi, M. H., Patel, R. S., Brammer, L., Smolinski, M. S., and Brilliant, L. Detecting
influenza epidemics using search engine query data. Nature 457, 7232 (2008), 1012–1014.
[101] Gooden, R. J., and Winefield, H. R. Breast and prostate cancer online discussion boards a the-
matic analysis of gender differences and similarities. Journal of Health Psychology 12, 1 (2007),
103–114.
[102] Gossop, M., Battersby, M., and Strang, J. Self-detoxification by opiate addicts. a preliminary
investigation. The British Journal of Psychiatry 159, 2 (1991), 208–212.
[103] Gossop, M., Green, L., Phillips, G., and Bradley, B. Lapse, relapse and survival among opiate
addicts after treatment. A prospective follow-up study. The British Journal of Psychiatry 154, 3
(1989), 348–353.
[104] Grandinetti, D. A. Doctors and the web. Help your patients surf the Net safely. Medical Economics
77, 5 (2000), 186.
[105] Gray, N. J., Klein, J. D., Noyce, P. R., Sesselberg, T. S., and Cantrill, J. A. Health information-
seeking behaviour in adolescence: the place of the Internet. Social Science & Medicine 60, 7
(2005), 1467–1478.
[106] Green, L., and Gossop, M. Effects of information on the opiate withdrawal syndrome. British
Journal of Addiction 83, 3 (1988), 305–309.
[107] Greene, J. A., Choudhry, N. K., Kilabuk, E., and Shrank, W. H. Online social networking by pa-
tients with diabetes: a qualitative evaluation of communication with Facebook. Journal of General
Internal Medicine 26, 3 (2011), 287–292.
BIBLIOGRAPHY 149
[108] Grimes, A., Landry, B. M., and Grinter, R. E. Characteristics of shared health reflections in a local
community. In Computer Supported Cooperative Work, ACM (2010), 435–444.
[109] Grishman, R., Huttunen, S., and Yangarber, R. Information extraction for enhanced access to
disease outbreak reports. Journal of Biomedical Informatics 35, 4 (2002), 236–246.
[110] Guest, G., MacQueen, K. M., and Namey, E. E. Applied Thematic Analysis. Sage, 2011.
[111] GuoDong, Z., and Jian, S. Exploring deep knowledge resources in biomedical name recognition.
In workshop on Natural Language Processing in Biomedicine and its Applications, ACL (2004),
96–99.
[112] Gupta, S., MacLean, D. L., Heer, J., and Manning, C. D. Induced lexico-syntactic patterns improve
information extraction from online medical forums. Journal of the American Medical Informatics
Association 21, 5 (2014), 902–909.
[113] Hampton, T. Warning system aims to detect emerging trends in illegal drug use. Journal of the
American Medical Association 312, 8 (2014), 779–779.
[114] Hansen, D. L., Derry, H. A., Resnick, P. J., and Richardson, C. R. Adolescents searching for
health information on the Internet: an observational study. Journal of Medical Internet Research
5, 4 (2003), e25.
[115] Hansen, R. N., Oster, G., Edelsberg, J., Woody, G. E., and Sullivan, S. D. Economic costs of
nonmedical use of prescription opioids. The Clinical Journal of Pain 27, 3 (2011), 194–202.
[116] Hardey, M. Doctor in the house: the Internet as a source of lay health knowledge and the challenge
to expertise. Sociology of Health & Illness 21, 6 (1999), 820–835.
[117] Hardey, M. the story of my illness: Personal accounts of illness on the Internet. Health: 6, 1 (2002),
31–46.
[118] Harman, G. A., Coppersmith, C. T., and Dredze, M. H. Measuring post traumatic stress disorder
in Twitter. In International Conference on Weblogs and Social Media, AAAI (2014), 579–582.
[119] Harpaz, R., DuMouchel, W., Shah, N. H., Madigan, D., Ryan, P., and Friedman, C. Novel data-
mining methodologies for adverse drug event discovery and analysis. Clinical Pharmacology &
Therapeutics 91, 6 (2012), 1010–1021.
BIBLIOGRAPHY 150
[120] Harris, S., and Gerich, E. Retiring the NSFNET Backbone Service: Chronicling the end of an era.
Connexions 10, 4 (1996).
[121] Hartzband, P., and Groopman, J. Untangling the Web: patients, doctors, and the Internet. New
England Journal of Medicine 362, 12 (2010), 1063–1066.
[122] Hartzler, A., and Pratt, W. Managing the personal side of health: How patient expertise differs
from the expertise of clinicians. Journal of Medical Internet Research 13, 3 (2011), e62.
[123] He, H. A., Greenberg, S., and Huang, E. M. One size does not fit all: applying the transtheoretical
model to energy feedback technology design. In Human Factors in Computing Systems, ACM
(2010), 927–936.
[124] He, Y., and Kayaalp, M. Biological entity recognition with conditional random fields. In American
Medical Informatics Association Annual Symposium, AMIA (2008), 293.
[125] Hearst, M. S. A simple algorithm for identifying abbreviation definitions in biomedical text. In
Pacific Symposium on Biocomputing (2003), 451–462.
[126] Heer, J., and Bostock, M. Crowdsourcing graphical perception: using Mechanical Turk to assess
visualization design. In Human Factors in Computing Systems, ACM (2010), 203–212.
[127] Heffernan, R., Mostashari, F., Das, D., Karpati, A., Kulldorff, M., Weiss, D., et al. Syndromic
surveillance in public health practice, New York City. Emerging Infectious Diseases 10, 5 (2004),
858–864.
[128] Henning, K. J. What is syndromic surveillance? Morbidity and Mortality Weekly Report (2004),
7–11.
[129] Homan, C. M., Lu, N., Tu, X., Lytle, M. C., and Silenzio, V. Social structure and depression in
TrevorSpace. In Computer supported Cooperative Work, ACM (2014), 615–625.
[130] Houston, T. K., Cooper, L. A., and Ford, D. E. Internet support groups for depression: a 1-year
prospective cohort study. American Journal of Psychiatry 159, 12 (2002), 2062–2068.
[131] Høybye, M. T., Johansen, C., and Tjørnhøj-Thomsen, T. Online interaction. Effects of storytelling
in an Internet breast cancer support group. Psycho-Oncology 14, 3 (2005), 211–220.
BIBLIOGRAPHY 151
[132] Hulth, A., and Rydevik, G. Web query-based surveillance in Sweden during the influenza A (H1N1)
2009 pandemic, April 2009 to February 2010. Euro Surveillance 16, 18 (2011).
[133] Humphreys, K. Circles of recovery: Self-help organizations for addictions. Cambridge Univ. Press,
2004.
[134] Hwang, K. O., Ottenbacher, A. J., Green, A. P., Cannon-Diehl, M. R., Richardson, O., Bernstam,
E. V., and Thomas, E. J. Social support in an Internet weight loss community. International Journal
of Medical Informatics 79, 1 (2010), 5–13.
[135] Jamison-Powell, S., Linehan, C., Daley, L., Garbett, A., and Lawson, S. I can’t get no sleep:
discussing #insomnia on Twitter. In Human Factors in Computing Systems, ACM (2012), 1501–
1510.
[136] Jha, M., and Elhadad, N. Cancer stage prediction based on patient online discourse. In workshop
on Biomedical Natural Language Processing, ACL (2010), 64–71.
[137] Johnson, H. A., Wagner, M. M., Hogan, W. R., Chapman, W., Olszewski, R. T., Dowling, J., Barnas,
G., et al. Analysis of web access logs for surveillance of influenza. Studies in Health Technology
and Informatics 107, Pt 2 (2004), 1202–1206.
[138] Jonquet, C., Shah, N. H., and Musen, M. A. The Open Biomedical Annotator. In summit on
Translational Bioinformatics, AMIA (2009), 56.
[139] Kandel, D. B. Stages and pathways of drug involvement: Examining the gateway hypothesis.
Cambridge University Press, 2002.
[140] Kaskutas, L. A., Bond, J., and Humphreys, K. Social networks as mediators of the effect of
Alcoholics Anonymous. Addiction 97, 7 (2002), 891–900.
[141] Kelly, J. F., Hoeppner, B., Stout, R. L., and Pagano, M. Determining the relative importance of
the mechanisms of behavior change within Alcoholics Anonymous: a multiple mediator analysis.
Addiction 107, 2 (2012), 289–299.
[142] Kendall, L., Hartzler, A., Klasnja, P., and Pratt, W. Descriptive analysis of physical activity conver-
sations on Twitter. In extended abstracts on Human Factors in Computing Systems, ACM (2011),
1555–1560.
BIBLIOGRAPHY 152
[143] Keselman, A., Smith, C. A., Divita, G., Kim, H., Browne, A. C., Leroy, G., and Zeng-Treitler, Q.
Consumer health concepts that do not map to the UMLS: where do they fit? Journal of the
American Medical Informatics Association 15, 4 (2008), 496–505.
[144] Keselman, A., Tse, T., Crowell, J., Browne, A., Ngo, L., and Zeng, Q. Assessing consumer health
vocabulary familiarity: an exploratory study. Journal of Medical Internet Research 9, 1 (2007), e5.
[145] Kim, J.-D., Ohta, T., Tateisi, Y., and Tsujii, J. GENIA corpus – a semantically annotated corpus for
bio-textmining. Bioinformatics 19, suppl 1 (2003), i180–i182.
[146] Kim, J.-D., Ohta, T., Tsuruoka, Y., Tateisi, Y., and Collier, N. Introduction to the bio-entity recog-
nition task at JNLPBA. In joint workshop on Natural Language Processing in Biomedicine and its
Applications, ACL (2004), 70–75.
[147] Kittur, A., Chi, E. H., and Suh, B. Crowdsourcing user studies with Mechanical Turk. In Human
Factors in Computing Systems, ACM (2008), 453–456.
[148] Klemm, P., Bunnell, D., Cullen, M., Soneji, R., Gibbons, P., and Holecek, A. Online cancer support
groups: a review of the research literature. Computers, Informatics, Nursing (2003).
[149] Kummervold, P. E., Gammon, D., Bergvik, S., Johnsen, J.-A. K., Hasvold, T., and Rosenvinge, J. H.
Social support in a wired world: use of online mental health forums in Norway. Nordic Journal of
Psychiatry 56, 1 (2002), 59–65.
[150] LaCoursiere, S. P., Knobf, M. T., and McCorkle, R. Cancer patients’ self-reported attitudes about
the Internet. Journal of Medical Internet Research 7, 3 (2005), e22.
[151] Lafferty, J., McCallum, A., and Pereira, F. C. Conditional random fields: Probabilistic models for
segmenting and labeling sequence data. In International Conference on Machine Learning, ACM
(2001), 282–289.
[152] Lamb, A., Paul, M. J., and Dredze, M. Separating fact from fear: Tracking flu infections on Twitter.
In North American Chapter of the ACL: Human Language Technologies, ACL (2013), 789–795.
[153] Lasker, J. N., Sogolow, E. D., and Sharim, R. R. The role of an online community for people with
a rare disease: content analysis of messages posted on a primary biliary cirrhosis mailing list.
Journal of Medical Internet Research 7, 1 (2005), e10.
[154] Leaman, R., Wojtulewicz, L., Sullivan, R., Skariah, A., Yang, J., and Gonzalez, G. Towards
Internet-age pharmacovigilance: extracting adverse drug reactions from user posts to health-
related social networks. In workshop on Biomedical Natural Language Processing, ACL (2010),
117–125.
[155] Lembke, A., and Humphreys, K. Self-Help Organizations for Substance Use Disorders. Oxford Univ.
Press, 2009.
[156] Lewis, T. Seeking health information on the Internet: lifestyle choice or bad attack of cyberchon-
dria? Media, Culture & Society 28, 4 (2006), 521–539.
[157] Liang, P. Semi-supervised learning for natural language. PhD thesis, Massachusetts Institute of
Technology, 2005.
[158] Lieberman, M. A., Golant, M., Giese-Davis, J., Winzelberg, A., Benjamin, H., Humphreys, K.,
Kronenwetter, C., Russo, S., and Spiegel, D. Electronic support groups for breast carcinoma.
Cancer 97, 4 (2003), 920–925.
[159] MacLean, D. L., and Heer, J. Identifying medical terms in patient-authored text: a crowdsourcing-
based approach. Journal of the American Medical Informatics Association 20, 6 (2013), 1120–
1127.
[160] Malik, S. H., and Coulson, N. The male experience of infertility: a thematic analysis of an online
infertility support group bulletin board. Journal of Reproductive and Infant Psychology 26, 1 (2008),
18–30.
[161] Malik, S. H., and Coulson, N. S. Coping with infertility online: An examination of self-help mech-
anisms in an online infertility support group. Patient Education and Counseling 81, 2 (2010),
315–318.
[162] Maloney-Krichmar, D., and Preece, J. A multilevel analysis of sociability, usability, and community
dynamics in an online health community. ACM Transactions on Computer-Human Interaction 12,
2 (2005), 201–232.
[163] Mandl, K. D., Overhage, J. M., Wagner, M. M., Lober, W. B., Sebastiani, P., Mostashari, F., Pavlin,
J. A., Gesteland, P. H., Treadwell, T., Koski, E., et al. Implementing syndromic surveillance: a
practical guide informed by the early experience. Journal of the American Medical Informatics
Association 11, 2 (2004), 141–150.
[164] Mankoff, J., Kuksenok, K., Kiesler, S., Rode, J. A., and Waldman, K. Competing online viewpoints
and models of chronic illness. In Human Factors in Computing Systems, ACM (2011), 589–598.
[165] Mayer, D. K., Terrin, N. C., Kreps, G. L., Menon, U., McCance, K., Parsons, S. K., and Mooney,
K. H. Cancer survivors information seeking behaviors: a comparison of survivors who do and do
not seek information about cancer. Patient Education and Counseling 65, 3 (2007), 342–350.
[166] Mayer, M., and Till, J. The Internet: a modern Pandora’s box? Quality of Life Research 5, 6
(1996), 568–571.
[167] McCray, A. T., Loane, R. F., Browne, A. C., and Bangalore, A. K. Terminology issues in user
access to web-based medical information. In American Medical Informatics Association Annual
Symposium, AMIA (1999), 107.
[168] McLellan, A. T. What is recovery? Revisiting the Betty Ford Institute consensus panel definition.
Journal of Substance Abuse Treatment (2010), 109–113.
[169] McLellan, A. T., Lewis, D. C., O’Brien, C. P., and Kleber, H. D. Drug dependence, a chronic medical
illness: implications for treatment, insurance, and outcomes evaluation. Journal of the American
Medical Association 284, 13 (2000), 1689–1695.
[170] McNeil, K., Brna, P., and Gordon, K. Epilepsy in the Twitter era: a need to re-tweet the way we
think about seizures. Epilepsy & Behavior 23, 2 (2012), 127–130.
[171] Medawar, C., Herxheimer, A., Bell, A., and Jofre, S. Paroxetine, Panorama and user reporting of
ADRs: Consumer intelligence matters in clinical practice and post-marketing drug surveillance. The
International Journal of Risk and Safety in Medicine 15, 3 (2002), 161–169.
[172] MedlinePlus use by quarter. National Library of Medicine (2013). [Online:
http://www.nlm.nih.gov/medlineplus/usestatistics.html, accessed 25-August-2014].
[173] Meier, A., Lyons, E. J., Frydman, G., Forlenza, M., and Rimer, B. K. How cancer survivors provide
support on cancer-related Internet mailing lists. Journal of Medical Internet Research 9, 2 (2007),
e12.
[174] Merrill, J. O., Rhodes, L. A., Deyo, R. A., Marlatt, G. A., and Bradley, K. A. Mutual mistrust in the
medical care of drug users. Journal of General Internal Medicine 17, 5 (2002), 327–333.
[175] Migneault, J. P., Adams, T. B., and Read, J. P. Application of the transtheoretical model to sub-
stance abuse: historical development and future directions. Drug and Alcohol Review 24, 5 (2005),
437–448.
[176] Miller, N. S., Sheppard, L. M., Colenda, C. C., and Magen, J. Why physicians are unprepared
to treat patients who have alcohol- and drug-related disorders. Academic Medicine 76, 5 (2001),
410–418.
[177] Mo, P. K., and Coulson, N. S. Exploring the communication of social support within virtual commu-
nities: A content analysis of messages posted to an online HIV/AIDS support group. Cyberpsy-
chology & Behavior 11, 3 (2008), 371–374.
[178] Morahan-Martin, J. M. How Internet users find, evaluate, and use online health information: a
cross-cultural review. CyberPsychology & Behavior 7, 5 (2004), 497–510.
[179] Mulveen, R., and Hepworth, J. An interpretative phenomenological analysis of participation in a
pro-anorexia Internet site and its relationship with disordered eating. Journal of Health Psychology
11, 2 (2006), 283–296.
[180] Murnane, E. L., and Counts, S. Unraveling abstinence and relapse: smoking cessation reflected
in social media. In Human Factors in Computing Systems, ACM (2014), 1345–1354.
[181] Murray, E., Lo, B., Pollack, L., Donelan, K., Catania, J., Lee, K., Zapert, K., and Turner, R. The
impact of health information on the Internet on health care and the physician-patient relationship:
national U.S. survey among 1.050 U.S. physicians. Journal of Medical Internet Research 5, 3
(2003).
[182] Nettleton, S., Burrows, R., and O’Malley, L. The mundane realities of the everyday lay use of the
Internet for health, and their consequences for media convergence. Sociology of Health & Illness
27, 7 (2005), 972–992.
[183] Nikfarjam, A., and Gonzalez, G. H. Pattern mining for extraction of mentions of adverse drug
reactions from user comments. In American Medical Informatics Association Annual Symposium,
AMIA (2011), 1019.
[184] Noble, A., Best, D., Man, L.-H., Gossop, M., and Strang, J. Self-detoxification attempts among
methadone maintenance patients: what methods and what success? Addictive Behaviors 27, 4
(2002), 575–584.
[185] Nonnecke, B., and Preece, J. Shedding light on lurkers in online communities. Ethnographic Stud-
ies in Real and Virtual Environments: Inhabited Information Spaces and Connected Communities
(1999), 123–128.
[186] Nonnecke, B., and Preece, J. Lurker demographics: Counting the silent. In Human Factors in
Computing Systems, ACM (2000), 73–80.
[187] Olsen, Y., and Sharfstein, J. M. Confronting the stigma of opioid use disorder – and its treatment.
Journal of the American Medical Association 311, 14 (2014), 1393–1394.
[188] Owen, J. E., Giese-Davis, J., Cordova, M., Kronenwetter, C., Golant, M., and Spiegel, D. Self-
report and linguistic indicators of emotional expression in narratives as predictors of adjustment to
cancer. Journal of Behavioral Medicine 29, 4 (2006), 335–345.
[189] Owoputi, O., O’Connor, B., Dyer, C., Gimpel, K., Schneider, N., and Smith, N. A. Improved part-
of-speech tagging for online conversational text with word clusters. In North American Chapter of
the ACL: Human Language Technologies, ACL (2013), 380–390.
[190] Pagano, M. E., Friend, K. B., Tonigan, J. S., and Stout, R. L. Helping other alcoholics in Alcoholics
Anonymous and drinking outcomes: Findings from Project MATCH. Journal of Studies on Alcohol
65, 6 (2004), 766.
[191] Park, S., Lee, S. W., Kwak, J., Cha, M., and Jeong, B. Activities on Facebook reveal the depressive
state of users. Journal of Medical Internet Research 15, 10 (2013), e217.
[192] Parker, K., and Wang, W. Modern Parenthood. Pew Internet & American Life Project, 2013. [Online:
http://www.pewsocialtrends.org/2013/03/14/modern-parenthood-roles-of-moms-and-dads-converge-as-they-balance-work,
accessed 2-April-2013].
[193] Paul, M. J., and Dredze, M. A model for mining public health topics from Twitter. In Health, vol. 11
(2012), 16–6.
[194] Peat, H. J., and Willett, P. The limitations of term co-occurrence data for query expansion in
document retrieval systems. JASIS 42, 5 (1991), 378–383.
[195] Pennebaker, J. W., Francis, M. E., and Booth, R. J. Linguistic inquiry and word count: LIWC 2001.
Mahwah: Lawrence Erlbaum Associates 71 (2001).
[196] Pennebaker, J. W., Mehl, M. R., and Niederhoffer, K. G. Psychological aspects of natural language
use: Our words, our selves. Annual Review of Psychology 54, 1 (2003), 547–577.
[197] Ploderer, B., Smith, W., Howard, S., Pearce, J., and Borland, R. Patterns of support in an online
community for smoking cessation. In International Conference on Communities and Technologies,
ACM (2013), 26–35.
[198] Polgreen, P. M., Chen, Y., Pennock, D. M., Nelson, F. D., and Weinstein, R. A. Using Internet
searches for influenza surveillance. Clinical Infectious Diseases 47, 11 (2008), 1443–1448.
[199] Potts, H. W., and Wyatt, J. C. Survey of doctors’ experience of patients using the Internet. Journal
of Medical Internet Research 4, 1 (2002), e5.
[200] Powell, J., and Clarke, A. Internet information-seeking in mental health population survey. The
British Journal of Psychiatry 189, 3 (2006), 273–277.
[201] Pratt, W., and Yetisgen-Yildiz, M. A study of biomedical concept identification: MetaMap vs. people.
In American Medical Informatics Association Annual Symposium, AMIA (2003), 529.
[202] Preece, J., Nonnecke, B., and Andrews, D. The top five reasons for lurking: improving community
experiences for everyone. Computers in Human Behavior 20, 2 (2004), 201–223.
[203] Prochaska, J. O., and Velicer, W. F. The transtheoretical model of health behavior change. Ameri-
can Journal of Health Promotion 12, 1 (1997), 38–48.
[204] Pyysalo, S., Ginter, F., Heimonen, J., Bjorne, J., Boberg, J., Jarvinen, J., and Salakoski, T. Bioinfer:
a corpus for information extraction in the biomedical domain. BMC Bioinformatics 8, 1 (2007), 50.
[205] Rainie, L., and Fox, S. The Online Health Care Revolution. Pew Internet &
American Life Project, 2000. [Online:
http://www.pewinternet.org/2000/11/26/the-online-health-care-revolution/, accessed 2-April-2013].
[206] Ravert, R. D., Hancock, M. D., and Ingersoll, G. M. Online forum messages posted by adolescents
with type 1 diabetes. The Diabetes Educator 30, 5 (2004), 827–834.
[207] Reis, B. Y., and Mandl, K. D. Time series modeling for syndromic surveillance. BMC Medical
Informatics and Decision Making 3, 1 (2003), 2.
[208] Resnik, P., Garron, A., and Resnik, R. Using topic modeling to improve prediction of neuroti-
cism and depression. In Conference on Empirical Methods in Natural Language Processing, ACL
(2013), 1348–1353.
[209] Rideout, V. Generation Rx.com. What are young people really doing online? Marketing Health
Services 22, 1 (2002), 26.
[210] Risk, A., and Petersen, C. Health information on the Internet: quality issues and international
initiatives. Journal of the American Medical Association 287, 20 (2002), 2713–2715.
[211] Rodgers, S., and Chen, Q. Internet community group participation: Psychosocial benefits for
women with breast cancer. Journal of Computer-Mediated Communication 10, 4 (2005).
[212] Ruau, D., Mbagwu, M., Dudley, J. T., Krishnan, V., and Butte, A. J. Comparison of automated and
human assignment of MeSH terms on publicly-available molecular datasets. Journal of Biomedical
Informatics 44 (2011), S39–S43.
[213] Sadilek, A., Brennan, S., Kautz, H., and Silenzio, V. nEmesis: Which restaurants should you avoid
today? In Human Computation and Crowdsourcing, AAAI (2013).
[214] Saha, S. K., Sarkar, S., and Mitra, P. Feature selection techniques for maximum entropy based
biomedical named entity recognition. Journal of Biomedical Informatics 42, 5 (2009), 905–911.
[215] Salem, D. A., Bogat, G. A., and Reid, C. Mutual help goes on-line. Journal of Community Psy-
chology 25, 2 (1997), 189–207.
[216] Sanderson, M., and Croft, B. Deriving concept hierarchies from text. In Research and Develop-
ment in Information Retrieval, ACM SIGIR (1999), 206–213.
[217] Sanford, A. A. “I can air my feelings instead of eating them”: Blogging as social support for the
morbidly obese. Communication Studies 61, 5 (2010), 567–584.
[218] Scanfeld, D., Scanfeld, V., and Larson, E. L. Dissemination of health information through social
networks: Twitter and antibiotics. American Journal of Infection Control 38, 3 (2010), 182–188.
[219] Schatz, B. R., Johnson, E. H., Cochrane, P. A., and Chen, H. Interactive term suggestion for
users of digital libraries: Using subject thesauri and co-occurrence lists for information retrieval. In
International Conference on Digital libraries, ACM (1996), 126–133.
[220] Seale, C., Ziebland, S., and Charteris-Black, J. Gender, cancer experience and Internet use: a
comparative keyword analysis of interviews and online cancer support groups. Social Science &
Medicine 62, 10 (2006), 2577–2590.
[221] Seifter, A., Schwarzwalder, A., Geis, K., and Aucott, J. The utility of Google Trends for epidemio-
logical research: Lyme disease as an example. Geospatial Health 4, 2 (2010), 135–137.
[222] Settles, B. Biomedical named entity recognition using conditional random fields and rich feature
sets. In joint workshop on Natural Language Processing in Biomedicine and its Applications, ACL
(2004), 104–107.
[223] Sheeren, M. The relationship between relapse and involvement in Alcoholics Anonymous. Journal
of Studies on Alcohol and Drugs 49, 1 (1988), 104.
[224] Shuyler, K. S., and Knight, K. M. What are patients seeking when they turn to the Internet?
Qualitative content analysis of questions asked by visitors to an orthopaedics web site. Journal of
Medical Internet Research 5, 4 (2003), e24.
[225] Sillence, E., Briggs, P., Harris, P. R., and Fishwick, L. How do patients evaluate and make use of
online health information? Social Science & Medicine 64, 9 (2007), 1853–1862.
[226] Smith, C. A., and Wicks, P. J. PatientsLikeMe: Consumer health vocabulary as a folksonomy. In
American Medical Informatics Association Annual Symposium, AMIA (2008), 682.
[227] Smyth, B., Barry, J., Keenan, E., and Ducray, K. Lapse and relapse following inpatient treatment
of opiate dependence. Irish Medical Journal 103, 6 (2010), 176–179.
[228] Snow, R., O’Connor, B., Jurafsky, D., and Ng, A. Y. Cheap and fast—but is it good? Evaluating
non-expert annotations for natural language tasks. In Empirical Methods in Natural Language
Processing, ACL (2008), 254–263.
[229] Sproule, B., Brands, B., Li, S., and Catz-Biro, L. Changing patterns in opioid addiction – charac-
terizing users of oxycodone and other opioids. Canadian Family Physician 55, 1 (2009), 68–69.
[230] Strang, J., Babor, T., Caulkins, J., Fischer, B., Foxcroft, D., and Humphreys, K. Drug policy and
the public good: evidence for effective interventions. The Lancet 379, 9810 (2012), 71–83.
[231] Substance Abuse and Mental Health Services Administration. Drug Abuse Warning Network,
2011: National Estimates of Drug-Related Emergency Department Visits. HHS Publication No.
(SMA) 13-4760, DAWN Series D-39. Rockville, MD: Substance Abuse and Mental Health Services
Administration, 2013.
[232] Substance Abuse and Mental Health Services Administration, Center for Behavioral Health Statis-
tics and Quality. The N-SSATS report: Trends in the use of methadone and buprenorphine at
substance abuse treatment facilities: 2003 to 2011. Rockville, MD. 2013.
[233] Sullivan, C. F. Gendered cybersupport: A thematic analysis of two online cancer support groups.
Journal of Health Psychology 8, 1 (2003), 83–104.
[234] Sullivan, S. J., Schneiders, A. G., Cheang, C.-W., Kitto, E., Lee, H., Redhead, J., Ward, S., Ahmed,
O. H., and McCrory, P. R. What’s happening? A content analysis of concussion-related traffic on
Twitter. British Journal of Sports Medicine 46, 4 (2012), 258–263.
[235] Teodoro, R., and Naaman, M. Fitter with Twitter: Understanding personal health and fitness
activity in social media. In International Conference on Weblogs and Social Media (2013).
[236] Thomas, D. R. A general inductive approach for analyzing qualitative evaluation data. American
Journal of Evaluation 27, 2 (2006), 237–246.
[237] Tonigan, J. S., and Rice, S. L. Is it beneficial to have an Alcoholics Anonymous sponsor? Psy-
chology of Addictive Behaviors 24, 3 (2010), 397.
[238] Tsai, R. T.-H., Wu, S.-H., Chou, W.-C., Lin, Y.-C., He, D., Hsiang, J., Sung, T.-Y., and Hsu, W.-L.
Various criteria in the evaluation of biomedical named entity recognition. BMC Bioinformatics 7, 1
(2006), 92.
[239] Tsai, T.-h., Chou, W.-C., Wu, S.-H., Sung, T.-Y., Hsiang, J., and Hsu, W.-L. Integrating linguistic
knowledge into a conditional random field framework to identify biomedical named entities. Expert
Systems with Applications 30, 1 (2006), 117–128.
[240] Turner-McGrievy, G. M., and Tate, D. F. Weight loss social support in 140 characters or less: use of
an online social network in a remotely delivered weight loss intervention. Translational Behavioral
Medicine 3, 3 (2013), 287–294.
[241] United States Department of Health and Human Services. Substance Abuse and Men-
tal Health Services Administration. Center for Behavioral Health Statistics and Quality.
Treatment Episode Data Set – Admissions (TEDS-A), 2011. ICPSR34876-v3. Ann Arbor,
MI: Inter-university Consortium for Political and Social Research [distributor], 2014-09-11.
http://doi.org/10.3886/ICPSR34876.v3.
[242] U.S. Department of Health and Human Services. Substance Abuse and Mental Health Services
Administration. Results from the 2010 National Survey on Drug Use and Health: Summary of
National Findings. [Online: http://www.samhsa.gov/data/nsduh/2k10nsduh/2k10results.htm,
accessed 15-Sept-2013].
[243] Ussher, J., Kirsten, L., Butow, P., and Sandoval, M. What do cancer support groups provide which
other supportive relationships do not? The experience of peer support groups for people with
cancer. Social Science & Medicine 62, 10 (2006), 2565–2576.
[244] Van Hout, M. C., and Bingham, T. Silk road, the virtual drug marketplace: a single case study of
user experiences. International Journal of Drug Policy 24, 5 (2013), 385–391.
[245] van Rijsbergen, C. J. A theoretical basis for the use of co-occurrence data in information retrieval.
Journal of Documentation 33, 2 (1977), 106–119.
[246] van Uden-Kraan, C. F., Drossaert, C. H., Taal, E., Seydel, E. R., and van de Laar, M. A. Self-
reported differences in empowerment between lurkers and posters in online patient support
groups. Journal of Medical Internet Research 10, 2 (2008), e18.
[247] Velicer, W. F., Prochaska, J. O., Fava, J. L., Norman, G. J., and Redding, C. A. Smoking ces-
sation and stress management: Applications of the transtheoretical model of behavior change.
Homeostasis in Health and Disease 38 (1998), 216–233.
[248] Vlahovic, T. A., Wang, Y.-C., Kraut, R. E., and Levine, J. M. Support matching and satisfaction
in an online breast cancer support community. In Human Factors in Computing Systems, ACM
(2014), 1625–1634.
[249] Volkow, N. D. Prescription drugs: Abuse and addiction, 2005. [Online:
http://www.drugabuse.gov/sites/default/files/rxreportfinalprint.pdf, accessed 9/4/2014].
[250] Wang, Y.-C., Kraut, R., and Levine, J. M. To stay or leave? The relationship of emotional and
informational support to commitment in online health support groups. In Computer Supported
Cooperative Work, ACM (2012), 833–842.
[251] Warner, M., Chen, L. H., Makuc, D. M., Anderson, R. N., and Minino, A. M. Drug poisoning deaths
in the United States, 1980-2008. NCHS Data Brief, 81 (2011), 1–8.
[252] Wen, M., and Rose, C. P. Understanding participant behavior trajectories in online health support
groups using automatic extraction methods. In International Conference on Supporting Group
Work, ACM (2012), 179–188.
[253] West, R. Time for a change: putting the transtheoretical (stages of change) model to rest. Addic-
tion 100, 8 (2005), 1036–1039.
[254] White, R. W., and Horvitz, E. Cyberchondria: studies of the escalation of medical concerns in web
search. ACM Transactions on Information Systems 27, 4 (2009), 23.
[255] White, R. W., and Horvitz, E. Web to world: Predicting transitions from self-diagnosis to the pursuit
of local medical assistance in web search. In American Medical Informatics Association Annual
Symposium, AMIA (2010), 882.
[256] White, R. W., and Horvitz, E. Studies of the onset and persistence of medical concerns in search
logs. In Research and Development in Information Retrieval, ACM SIGIR (2012), 265–274.
[257] White, R. W., Tatonetti, N. P., Shah, N. H., Altman, R. B., and Horvitz, E. Web-scale pharmacovigi-
lance: listening to signals from the crowd. Journal of the American Medical Informatics Association
20, 1 (2013), 404–408.
[258] Wicks, P., Keininger, D. L., Massagli, M. P., de la Loge, C., Brownstein, C., Isojarvi, J., and Heywood, J.
Perceived benefits of sharing health data between people with epilepsy on an online
platform. Epilepsy & Behavior 23, 1 (2012), 16–23.
[259] Wicks, P., Massagli, M., Frost, J., Brownstein, C., Okun, S., Vaughan, T., Bradley, R., and Hey-
wood, J. Sharing health data for better outcomes on PatientsLikeMe. Journal of Medical Internet
Research 12, 2 (2010), e19.
[260] Wicks, P., Vaughan, T. E., Massagli, M. P., and Heywood, J. Accelerated clinical discovery using
self-reported patient data collected online and a patient-matching algorithm. Nature Biotechnology
29, 5 (2011), 411–414.
[261] Wilson, J. L., Peebles, R., Hardy, K. K., and Litt, I. F. Surfing for thinness: a pilot study of pro–
eating disorder web site usage in adolescents with eating disorders. Pediatrics 118, 6 (2006),
e1635–e1643.
[262] Wilson, K., and Brownstein, J. S. Early detection of disease outbreaks using the Internet. Cana-
dian Medical Association Journal 180, 8 (2009), 829–831.
[263] Wood, E., Samet, J. H., and Volkow, N. D. Physician education in addiction medicine. Journal of
the American Medical Association 310, 16 (2013), 1673–1674.
[264] Xu, R., Supekar, K., Morgan, A., Das, A., and Garber, A. Unsupervised method for automatic con-
struction of a disease dictionary from a large free text collection. In American Medical Informatics
Association Annual Symposium, AMIA (2008), 820.
[265] Yang, C. C., Jiang, L., Yang, H., and Tang, X. Detecting signals of adverse drug reactions from
health consumer contributed content in social media. In workshop on Health Informatics, ACM
SIGKDD (2012).
[266] Yang, C. C., Yang, H., Jiang, L., and Zhang, M. Social media mining for drug safety signal detec-
tion. In workshop on Smart Health and Wellbeing, ACM (2012), 33–40.
[267] Yang, Z., Lin, H., and Li, Y. Exploiting the contextual cues for bio-entity name recognition in
biomedical literature. Journal of Biomedical Informatics 41, 4 (2008), 580–587.
[268] Yates, A., and Goharian, N. ADRTrace: detecting expected and unexpected adverse drug re-
actions from user reviews on social media sites. In Advances in Information Retrieval. Springer,
2013, 816–819.
[269] Yates, A., Goharian, N., and Frieder, O. Extracting adverse drug reactions from forum posts and
linking them to drugs. In workshop on Health Search and Discovery, ACM SIGIR (2013).
[270] Ybarra, M. L., and Eaton, W. W. Internet-based mental health interventions. Mental Health Ser-
vices Research 7, 2 (2005), 75–87.
[271] Yeh, A., Morgan, A., Colosimo, M., and Hirschman, L. BioCreAtIvE task 1A: gene mention finding
evaluation. BMC Bioinformatics 6, Suppl 1 (2005), S2.
[272] Zeng, Q., Kogan, S., Ash, N., Greenes, R., and Boxwala, A. Characteristics of consumer termi-
nology for health information retrieval. Methods of Information in Medicine 41, 4 (2002), 289–298.
[273] Zeng, Q. T., and Tse, T. Exploring and developing consumer health vocabularies. Journal of the
American Medical Informatics Association 13, 1 (2006), 24–29.
[274] Zeng, Q. T., Tse, T., Divita, G., Keselman, A., Crowell, J., Browne, A. C., Goryachev, S., and Ngo,
L. Term identification methods for consumer health vocabulary development. Journal of Medical
Internet Research 9, 1 (2007), e4.
[275] Ziebland, S., Chapple, A., Dumelow, C., Evans, J., Prinjha, S., and Rozmovits, L. How the Internet
affects patients’ experience of cancer: a qualitative study. British Medical Journal 328, 7439
(2004), 564.