The Pennsylvania State University

The Graduate School

College of Information Sciences and Technology

GENETIC TESTING SERVICE ADOPTION BY USERS AND THEIR DATA SHARING

PREFERENCES

A Thesis in

Information Sciences and Technology

by

William R. Aurite

© 2017 William R. Aurite

Submitted in Partial Fulfillment

of the Requirements

for the Degree of

Master of Science

May 2017

The thesis of William R. Aurite was reviewed and approved* by the following:

Andrea Tapia Director of Graduate Programs Associate Professor of Information Sciences and Technology Thesis Co-Advisor

Jessica Kropczynski, Lecturer of Information Sciences and Technology

Edward Glantz, Senior Lecturer of Information Sciences and Technology; Academic Program Coordinator, Information Systems MPS

Jens Grossklags Chair for Cyber Trust Professor of Informatics Technical University Munich

Thesis Co-Advisor

*Signatures are on file in the Graduate School

Abstract

Direct-to-consumer genetic testing services have expanded alongside the proliferation of

the Web. Greatly simplified access to the Web allows consumers of these services to receive

detailed, personalized reports about their ancestry, health, phenotypic and genotypic information.

In addition to determining the test-taker’s genetic makeup, genetic details of the test-taker’s family

members are also indirectly revealed through direct-to-consumer genetic testing. As such, taking

a genetic test involves both personal and interdependent privacy considerations, which

serve as the main motivation for this thesis. We find that these considerations play important roles

in genetic test-taking service adoption, genetic test-taking service recommendation, and trust in

organizations or institutions receiving test-taker data.

We conduct two studies using the methodology of factorial vignette surveys. In study one,

we assess how attitudes and perceptions of interdependency influence genetic test service

adoption. Specifically, we examine the factors that make someone more or less likely to take a

genetic test, along with the factors that make an individual more or less likely to recommend a test.

Additionally, we assess the degree to which psychological factors influence stated adoption choices

and privacy concerns by studying the influence of different thinking styles (construal level). In

study two, we investigate how variables of ethnicity, age, genetic markers, and association of data

with the individual’s name affect the likelihood of sharing data with different types of

organizations. We also investigate elements of personal and interdependent privacy concerns. We

document the significant role these factors have in the decision to share or not share genetic data

with a third party. We also propose a deterministic model that accounts for differences in sharing

preferences among individuals who share data with academic, medical, or governmental

organizations.

Contents

List of Figures
List of Tables

Chapter 1
1.1 Introduction
1.2 Research Questions
1.3 Structure of Thesis

Chapter 2: Background and Related Work
2.1 A Brief Overview on the History of Genetics
2.2 Genomics of Kin
2.3 Security Considerations and Cryptographic Techniques
2.4 Privacy Laws
2.5 Opinions on the Importance of Privacy in Genetic Research

Chapter 3: Understanding Interdependent Privacy Concerns and Likely Use Factors for Genetic Testing: A Vignette Study
3.1 Introduction
3.2 Related Work
3.3 Methodology
3.3.1 Overview of Study
3.3.2 Design of Vignettes
3.3.3 Procedures and Participants
3.4 Results
3.5 Discussion
3.6 Conclusion

Chapter 4: A Vignette Study On Personal and Interdependent Privacy Concerns, and Sharing Intentions for Genetic Data
4.1 Introduction
4.2 Related Work
4.3 Methodology
4.3.1 Research Questions and Overview of Study
4.3.2 Design of Vignettes
4.3.3 Demographic and Experience Factors
4.3.4 Procedures and Participants
4.4 Results
4.5 Discussion of Results, and Concluding Remarks

Chapter 5: Amazon Mechanical Turk

Chapter 6: Conclusions and Remarks

Bibliography

Appendix A
Details of Survey 1: Understanding Interdependent Privacy Concerns and Likely Use Factors for Genetic Testing: A Vignette Study

Appendix B
Details of Survey 2: A vignette study on personal and interdependent privacy concerns, and sharing intentions for genetic data

List of Figures

1 Study 1: Dependent Variables (Non-Breach Scenarios)

2 Study 1: Comparison: Breach vs. Non-Breach

List of Tables

1 Study 2: Multinomial logistic regression models for Vignette 1 for each dependent variable

2 Study 2: Multinomial logistic regression models for Vignette 2 for each dependent variable

Chapter 1

1.1 Introduction

The recent proliferation and advancement of science associated with personal and

personalized medicine has brought substantial benefit to those who seek information endemic to

such advances. Personalized genetic and genomic testing are two such examples of these recent

scientific and medical breakthroughs. Over the past several years, this science has been

commercialized, allowing individuals to purchase a genetic test that affords them the opportunity

to learn their ancestral history, carrier status, and trait characteristics. Doctors and patients are now

able to better understand the biological and environmental factors that affect an individual’s health

(Kaufman et al., 2009; Telenti et al., 2014). One of the key issues of the broad dissemination and

availability of genetic data is consideration of data privacy (Rodriguez et al., 2013; Erlich and

Narayanan, 2014).

Genetic data, the basis from which we learn about our genetics and genome, is a unique

data type that differentiates itself from other data types (Naveed et al., 2015). Genetic and genomic

data include personal information endemic to the individual (Naveed et al., 2015; Humbert et al.,

2014; Lemke et al., 2010). Furthermore, genetic data can provide information about the health of

an individual’s family members (Naveed et al., 2015). These characteristics make genetic data

valuable to an individual and their family (Naveed et al., 2015; Humbert et al., 2013; Pulley et al., 2008).

But the ability to tie genetic data back to an individual, or an individual’s family, is also the basis

upon which privacy concerns associated with genetic data have arisen (Hull et al., 2008; Pulley et

al., 2008; Haga et al., 2011; Naveed et al., 2015; Garrison et al., 2015).

Federal statutes like the Genetic Information Nondiscrimination Act (GINA) make it illegal

for health insurance providers “to use or require genetic information to make decisions about a

person’s insurance eligibility or coverage” and make it illegal for employers “to use a person’s

genetic information when making decisions about hiring, promotion, and several other terms of

employment” (National Institutes of Health, 2016). However, GINA does not protect from genetic

discrimination in every circumstance. GINA does not apply when “an employer has fewer than 15

employees,” and does not protect individuals in the U.S. military “or those receiving health

benefits through the Veterans Health Administration or Indian Health Service” (National Institutes

of Health, 2016; Green et al., 2015). Furthermore, GINA does not protect from genetic

discrimination in forms of insurance other than health insurance (National Institutes of Health,

2016). Abuse of genetic data, including data leaks, computer intrusions, and non-disclosure

violations, leaves individuals susceptible to blackmail and genetic discrimination practices

(Gottlieb, 2001; Naveed et al., 2015).

When an individual takes a genetic test, they are not only revealing information about

themselves, they are also revealing information about their family members (Naveed et al., 2015).

This qualifies this data type as having “interdependent privacy considerations” (Yu and

Grossklags, 2016). The interdependency of privacy refers to the phenomenon that, in an

interconnected setting, the privacy of individual users not only depends on their own behaviors,

but is also affected by the decisions of others (Yu and Grossklags, 2016). Personal and

interdependent privacy concerns, in relation to the sharing of genetic data, form the main pillars of

this thesis. In this thesis, we explore how interdependent privacy concerns affect the willingness

of individuals to share their genetic data. We also examine the willingness to take, or the

willingness to recommend taking, a genetic test. The element of trust is also examined across both

papers presented in this thesis. To our knowledge, the varied conditions under which individuals

would be willing to share their genetic data, alongside scenarios under which individuals would

be willing to take a genetic test, have not been examined thoroughly. Additionally, the opinions

and perspectives of family members who have had relatives take a genetic test have not been

covered to our knowledge.

1.2 Research Questions

Based on the current and rapid expansion of personalized medicine, and with regard to the

ever-increasing affordability and accessibility of genetic testing, we seek to understand the

variables and factors that influence genetic test service adoption, and isolate elements that affect

sharing preferences for individuals who have taken a genetic test. With this in mind, we present

the following research questions:

Study 1, Research Question 1: What is the impact of factors varying the context of a

genetic test scenario on the assessment of intention to use and intention to recommend a genetic

testing service, perceived contributions to health awareness, and trust in the service provider?

Study 1, Research Question 2: What is the impact of factors varying the context of a

genetic test scenario and data breach scenario on the assessment of personal privacy and

interdependent privacy concerns?

Study 1, Research Question 3: Is a higher construal level of the portrayed decision-maker

associated with an increase in the assessment of the intention to use and intention to recommend a

genetic testing service, perceived contributions to health awareness, and trust in the service

provider?

Study 1, Research Question 4: Is a higher construal level of the portrayed decision-maker

associated with an increase in the assessment of personal privacy and interdependent privacy

concerns?

Study 2, Research Question 1: To which degree do demographic characteristics and

factors related to genetic testing impact perceptions of trust, personal and interdependent privacy,

and the intention of sharing own genetic data with a third party?

Study 2, Research Question 2: To which degree do demographic characteristics and

factors related to genetic testing impact perceptions of trust, personal and interdependent privacy,

and the intention of recommending to another individual to share genetic data with a third party?

1.3 Structure of Thesis

In presenting the contributions described above, the rest of this thesis is structured as

follows. Chapter 2 presents a review of previous work conducted that relates to genetics and

genetic data privacy. Chapter 3 describes our first study entitled Understanding Interdependent

Privacy Concerns and Likely Use Factors for Genetic Testing: A Vignette Study. Chapter 4

describes our second study entitled A Vignette Study On Personal and Interdependent Privacy

Concerns, and Sharing Intentions for Genetic Data. A discussion of the survey tool Amazon

Mechanical Turk is presented in Chapter 5. Lastly, concluding remarks are presented in Chapter 6.

Chapter 2: Background and Related Work

2.1 A Brief Overview on the History of Genetics

The completion of The Human Genome Project (HGP) in 2003 gave rise to the genomic

data era, and inspired a scientific revolution that has led to advances in research and medical

techniques across a wide array of fields (Guttmacher and Collins, 2003; Naveed et al., 2015).

Significant contributions in the field of genetics and molecular biology influenced the proliferation

of knowledge associated with modern-day genomics and the human genome (Naveed et al., 2015;

Collins et al., 2003). These modern insights are based on numerous key achievements dating back

to Gregor Mendel.

Mendel, a scientist and Augustinian Friar, often cited as the father of modern genetics,

established several rules seen in the field of genetics today (Office of History, NIH, 2016). Mendel

discovered that certain desirable traits in pea plants of one generation, including height, color, and

seed shape, could be inherited by the next generation (Office of History, NIH). This phenomenon

became known as ‘Mendelian Inheritance,’ and helped shape the core of modern genetics (Office

of History, NIH). After Mendel, Oswald Avery made his own discovery that helped shape the

genomic era (Office of History, NIH). Avery discovered that the substance responsible for

“inheritable change” in a disease-causing bacterium was deoxyribonucleic acid (DNA) (Office of

History, NIH). In 1944, Avery and his colleagues MacLeod and McCarty suggested that DNA

was responsible for transferring genetic information (Office of History, NIH). Building on the

work produced by Avery, MacLeod, and McCarty, Erwin Chargaff determined that DNA differed

from species to species, and that the basic building blocks of DNA consisted of nucleobases

adenine (A), thymine (T), cytosine (C) and guanine (G) (Office of History, NIH). Then, in the

early 1950s, James Watson and Francis Crick, working in conjunction with Maurice Wilkins and

Rosalind Franklin, developed their groundbreaking model of the DNA double-helix structure

(Office of History, NIH). After the finding of the DNA double-helix, other pioneering discoveries,

including the various mechanisms involved in translating DNA information into proteins, were

soon to follow. Such breakthroughs in the fields of genetics and molecular biology helped HGP

successfully determine the sequence of nucleotide base pairs that comprise human DNA (Office

of History, NIH).

To this day, HGP remains the world’s largest biological collaborative research project, and

represents the flash point for all genetic and genomic breakthroughs and accomplishments that

have come thereafter (Tripp and Grueber, 2011).

2.2 Genomics of Kin

Genetic data feature several characteristics that give rise to privacy and interdependency

concerns and considerations (Naveed et al., 2015; Humbert et al., 2013; Heeney et al., 2010).

Descriptive in nature, genetic data distinguishes itself from other data types in that the data can be

easily tied back to the provider, either directly or by inference (Humbert et al., 2013).

Characteristics particular to genetic data include 1) health, 2) traceability, 3) uniqueness, and 4)

kinship (Naveed et al., 2015). The first characteristic, health, denotes that genetic data can be used

to assess the health of an individual, and, by extension, the behavior of an individual (Naveed et

al., 2015). Direct-to-consumer genetic tests currently offer trait and carrier status reports that afford

individuals the opportunity to learn whether they are carriers of certain inherited diseases, and which traits,

such as facial features and taste preference, are due to their unique DNA (23 and Me). The second

characteristic of genetic data, traceability, refers to the fact that DNA does not change

much over time, and, consequently, can be traced back to one individual (Naveed et al., 2015).

The ability to tie DNA to one individual is known as uniqueness, the third characteristic (Naveed

et al., 2015; Humbert et al., 2013; Telenti et al., 2014). Kinship, the fourth characteristic, refers to

the notion that DNA and genetic data contain information about an individual’s blood relatives

(Garrison et al., 2015; Naveed et al., 2015; NIH). Kinship is perhaps most important when

discussing interdependency and privacy concerns. One of the most famous examples of genetic

data interdependency is the case of Henrietta Lacks (Skloot, 2011). Although Ms. Lacks died of

cervical cancer in 1951, cells from her body were studied, and eventually used to derive important

cell lines (HeLa) that are still in medical use today (Skloot, 2011). The researchers who studied

Ms. Lacks’ genomic profile also published it without the consent of her family members, thereby

making segments of their genomic profiles widely known as well (Skloot, 2011; Naveed et al., 2015). Ms.

Lacks’ family members remain concerned about potential discrimination they may

face as a result of the publication of Henrietta’s genome (Skloot, 2011; Naveed et al., 2015).

Genomic data can be made public through a variety of mechanisms (Humbert et al., 2013;

Telenti et al., 2014; Naveed et al., 2015). These mechanisms include leaks, theft, and publication

on genome-sharing websites like openSNP (Greshake et al., 2014; Naveed et al., 2015). With widely

available means to acquire the genetic data of another individual, different privacy laws and

cryptographic techniques have been developed in an attempt to ensure the security of an

individual’s genetic data (Naveed et al., 2015).

2.3 Security Considerations and Cryptographic Techniques

Once genetic data has been entered into a system, there are several ways for an individual

with malicious intent to obtain it. One of the most common ways for an individual to obtain the

genetic data of another is by inference attack (Humbert et al., 2013). An inference attack is a data

mining technique in which information can be inferred with a high level of confidence based on

other, publicly or illegally obtained information (Telenti et al., 2014; Malin, 2005). It has been

estimated that 60% of the U.S. population can be identified through a combination of date of birth,

sex, and 5-digit zip code, making identity tracing of genetic data plausible (Sweeney, 2000; Golle,

2006; Erlich and Narayanan, 2014). Another study reported the identification of 30% of

individuals who participated in the Personal Genome Project (PGP) using inference methods

(Sweeney et al., 2013; Erlich and Narayanan, 2014).
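The idea behind identification through quasi-identifiers can be illustrated with a minimal sketch. It is not the method used in the cited studies; it only counts how many records in a small, hand-made table are unique on the combination of date of birth, sex, and 5-digit ZIP code. The records and the function name `unique_fraction` are hypothetical and chosen purely for illustration.

```python
from collections import Counter

# Hypothetical, hand-made records: (date_of_birth, sex, zip5). Not real data.
records = [
    ("1980-01-15", "F", "16801"),
    ("1980-01-15", "F", "16801"),
    ("1975-06-02", "M", "16801"),
    ("1990-11-30", "F", "16803"),
    ("1962-03-09", "M", "16802"),
    ("1988-07-21", "F", "16802"),
]


def unique_fraction(rows):
    """Return the share of rows whose quasi-identifier combination occurs exactly once."""
    if not rows:
        return 0.0
    counts = Counter(rows)
    unique = sum(1 for row in rows if counts[row] == 1)
    return unique / len(rows)


fraction = unique_fraction(records)
print(f"{fraction:.0%} of the toy records are unique on (date of birth, sex, ZIP)")
```

A record that is unique on these three fields can be linked to any other dataset that carries the same fields together with a name, which is the linkage step that identity tracing relies on.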

This inference attack schema can gather initial information from social network websites,

such as Facebook and Twitter; ancestry and familial websites like Ancestry.com; research

databases; public records, such as the census and court dockets; and health and genome websites

like 23 and Me, and openSNP (Telenti et al., 2014). An attacker can then de-anonymize this

information by matching the victim’s phenotypic, demographic, and administrative information

across social media platforms and with the help of familial and pedigree trees (Telenti et al., 2014).

A simpler way for a person with malicious intent to directly access an individual’s genetic data would be

to break into the server hosting the information, assuming the data stored is not encrypted (Naveed

et al., 2015; Telenti et al., 2014).

In an effort to thwart this malicious activity, companies associated with the collection of

genetic data often implement a wide variety of techniques to either secure or anonymize their data

(Lauter et al., 2014). These approaches include policy-based solutions, de-identification of data,

approximate query answering, and technological solutions based on cryptography (Lauter et al.,

2014). There are currently a number of repositories housing privately and publicly shared genomic

data, including openSNP, the 1000 Genomes Project (TGP), the International Cancer Genome

Consortium (ICGC), and the Database of Genotypes and Phenotypes (dbGaP) (Lauter et al., 2014).

Homomorphic encryption is an emergent technique being deployed in some databases to protect

data (Lauter et al., 2014). This encryption technique allows for computation to be performed on

ciphertext, generating an encrypted result that matches the result of the operations executed on

plaintext (Yi et al., 2014). Homomorphic encryption allows for data to be exchanged between

different services without exposing the unencrypted data along the way (Stuntz, 2010). This makes

the use of homomorphic encryption an appealing option for many working in the field of genetic

data. However, some researchers have argued that homomorphic and other cryptographic

techniques do not go far enough (Humbert et al., 2013). They note that cryptographic algorithms in use today may be

broken within 30 years, so encryption alone cannot guarantee lifetime protection of genetic data (Humbert et

al., 2013). Because of the rapid expansion and commoditization of direct-to-consumer genetic

testing, the need to protect genetic data from leaks, theft, and attacks remains a priority (Telenti et

al., 2014; Garrison et al., 2015).
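The property of computing on ciphertext can be made concrete with a toy sketch. The example below uses the Paillier cryptosystem, which is only additively homomorphic; it is swapped in here for brevity and is not the scheme discussed by Lauter et al. (2014). The primes are deliberately tiny and insecure, and all function names are hypothetical. The sketch assumes Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import random


def keygen(p=101, q=113):
    """Toy Paillier key generation. The tiny primes are for illustration only."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse of lam mod n (valid because g = n + 1)
    return n, (lam, mu)


def encrypt(n, m):
    """Encrypt an integer 0 <= m < n under public key n with generator g = n + 1."""
    n2 = n * n
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2


def decrypt(n, priv, c):
    """Recover the plaintext from ciphertext c using the private key (lam, mu)."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying two ciphertexts yields an encryption of the sum of the plaintexts."""
    return (c1 * c2) % (n * n)


n, priv = keygen()
c_sum = add_encrypted(n, encrypt(n, 17), encrypt(n, 25))
print(decrypt(n, priv, c_sum))  # the party holding the key recovers 17 + 25
```

In this sketch, a service that holds only the two ciphertexts can produce the encrypted sum without ever seeing 17 or 25, which mirrors the exchange-without-exposure property described above.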

2.4 Privacy Laws

Technology and the rapid rise in popularity of genetic testing have been forces that

shifted genetics into the public sphere and, therefore, into the legislative process (Green et al., 2015;

NIH). After several years of promotion by scientists and genetics advocacy groups, The Genetic

Information Nondiscrimination Act (GINA) was finally passed into law in 2008 (Green et al.,

2015; NIH). This federal statute 1) prohibits the requirement of genetic data to make a decision

about hiring or promotional practices, and 2) prohibits health insurers from requiring genetic data

to make decisions about eligibility or coverage (NIH-Genetics Home Reference; Green et al.,

2015). GINA was the first federal anti-discrimination law which addressed an issue where no well-

documented history of discrimination or abuse existed (Green et al., 2015). Still, the law passed

with bi-partisan support (NIH). Prior to the passage of GINA, state legislatures had been examining

ways to ensure the protection of a person’s genetic data (NCSL). At that time (i.e., 2007-2008),

states had written bills that required consent to: perform/acquire a genetic test (12 states),

obtain/access genetic information (7 states), retain genetic information (8 states), and disclose

genetic information (27 states) (NCSL). Since then, many more states have written and passed

genetic privacy laws that build and expand upon the laws that existed in 2007 (Health Lawyers).

Still, to our knowledge, no law or laws exist today that require the additional consent of a family

member or relative.

2.5 Opinions on the Importance of Privacy in Genetic Research

One study surveyed biomedical research experts to

determine the level of concern these experts had regarding privacy and security of genetic and

genomic data (Naveed et al., 2015). Of note, 20% of the respondents said that genetic data should

be treated no differently than other sensitive health data, 28% of the respondents said that

advantages of a genome-based healthcare system will justify any harm that comes from genetic

data breaches, and none of the respondents said that genomic privacy was irrelevant (Naveed et

al., 2015).

The issues of data identifiability and consent appear to be important to most participants in

genetic studies (Garrison et al., 2015). Regarding the identifiability of samples, one study found that

78% of Native Hawaiians and 66% of whites would prefer that their consent be required for

research if their data were identifiable (Fong et al., 2004). Another study found that 81% of

participants in a US telephone survey stated that they would want to be notified of the research

being conducted with their sample should it be identifiable, while 57% also stated that they would

require permission for their samples to be used if the samples were identifiable (Hull et al., 2008).

Kaufman et al. (2009) found that privacy was a key determinant that influenced one’s decision to

participate in genetic testing. This study of 4,659 US adults found that 90%

were concerned about privacy, 56% were concerned about researchers having their genetic

information, while an additional 37% would worry about their data being used against them

(Kaufman et al., 2009).

Chapter 3: Understanding Interdependent Privacy Concerns and Likely Use Factors for Genetic Testing: A Vignette Study

Calls to address concerns about genetic privacy have led to a growing number of studies

from diverse disciplines including medical science and computer security. We complement these

studies by investigating perceived personal and interdependent privacy concerns in the context of

two important genetic privacy scenarios by deploying a factorial vignette survey taking a third-

person perspective. In a genetic test adoption scenario, we investigate how privacy concerns are

shaped by contextual factors including genetic predisposition to ailment, money required to take a

test, whether or not the testing service is new or established, and whether genetic test data will

likely contribute to the finding of a cure. In a data breach scenario, we also consider the scale of

the data breach and whether leaked data contains test takers’ names. We further manipulate the

description of the thinking style of the portrayed third-person decision-maker to start a discussion

on the impact of psychological factors in the context of genetic privacy.

3.1 Introduction

The proliferation and advancement of genetic testing capabilities have made the technology

more accessible and affordable for individuals. Genomic tests can be used to cover a broad array

of clinical and diagnostic procedures, including gene variant detection and the determination of paternal relationships

(Naveed et al., 2015). While genetic tests are typically focused on helping to discern genetic and

physical traits of the test-taker, they also provide insights into the genetic material of the test-

taker’s family members (Greenbaum et al., 2011; Heeney et al., 2010; Naveed et al., 2015). As

such, genetic data triggers inherent interdependent privacy considerations (Humbert et al., 2014;

Naveed et al., 2015).

In this paper, we report the results of a factorial vignette survey (Aviram, 2012), in which

respondents evaluate two scenarios related to the potential utilization of genetic testing services.

The vignette technique was developed to study diverse problems in which evaluations are made

by individuals, ultimately uncovering mechanisms in human choice behavior. It can be used to

expose underlying valuations and preferences of individuals by presenting hypothetical situations

to which they are asked to respond (Aviram, 2012).

Our key objective is to examine and understand the role of personal and interdependent

privacy concerns in the context of genetic testing. Concretely, in a genetic test adoption scenario,

we investigate how privacy concerns are shaped by contextual factors including genetic

predisposition to ailment, money required to take a test, whether or not the testing service is new

or established, and whether genetic test data will likely contribute to the finding of a cure. In a data

breach scenario, we also consider the scale of the data breach and whether leaked data contains

test takers’ names.

We also contribute to the debate on the degree to which psychological factors influence genetic

test adoption and privacy concerns by studying the influence of different thinking styles (in

particular, construal levels (Trope et al., 2003)) in the given scenarios. An individual’s construal

level has been shown to influence the decision-making process, and to affect the subject matter’s

perceived level of detail and scope (Fleischmann et al., 2015).

In the following, we present the exploratory research questions which guide our work:

RQ1: What is the impact of factors varying the context of a genetic test scenario on the assessment

of intention to use and intention to recommend a genetic testing service, perceived contributions

to health awareness, and trust in the service provider?

RQ2: What is the impact of factors varying the context of a genetic test scenario and data breach

scenario on the assessment of personal privacy and interdependent privacy concerns?

RQ3: Is a higher construal level of the portrayed decision-maker associated with an increase in the

assessment of the intention to use and intention to recommend a genetic testing service, perceived

contributions to health awareness, and trust in the service provider?

RQ4: Is a higher construal level of the portrayed decision-maker associated with an increase in the

assessment of personal privacy and interdependent privacy concerns?

We proceed as follows. In Section 3.2, we review related work. In Section 3.3, we present

our study methodology, vignette survey design, and procedures for participant recruitment. We

analyze the collected data in Section 3.4. Finally, we discuss our results and offer concluding

remarks in Sections 3.5 & 3.6, respectively.

3.2 Related Work

Genomic data has several challenging characteristics, specifically that it 1) contains

information about the health and behavior of an individual, 2) is static over time and inherently

unique to an individual, 3) can be easily distinguished, and 4) allows one to either directly ascertain or

infer genetic information about blood relatives (Naveed et al., 2015). The fourth aforementioned

characteristic of genetic material is the key motivation to investigate interdependent genetic

privacy. Privacy concerns may be triggered as a result of data breaches or mere accumulation of

genetic information from data repositories, including genome sharing websites, health websites,

and even ancestral websites (Telenti et al., 2014). Gathered information can be utilized to

deanonymize the genome of an individual (Telenti et al., 2014). Moreover, once the identity of a

test-taker is uncovered, an interested entity (whether malicious or not) can learn and infer specific

genetic traits of the individual’s family members (Humbert et al., 2013). How best to

protect and secure genetic data has been the subject of recent research projects, which effectively

address a number of security and privacy challenges (Naveed et al., 2015). Recent research has

also investigated the privacy practices of direct-to-consumer genetic testing services, such as

23andMe, via company-issued privacy notices and terms of service statements (Phillips, 2015).

How privacy laws affect the proliferation of personalized medicine (in particular, genetic testing)

has been researched empirically, finding that re-disclosure approaches encourage genetic testing,

while informed consent approaches deter it (Miller et al., 2009). This suggests a need to find the

right mix of incentives and regulation to improve the market for genetic testing services and the

ways in which consumers’ data is stored, sold, and shared (Phillips, 2015).

Studies have also examined the opinions of biomedical researchers and genetic test-takers

alike, with regard to the complexities of genetic privacy and genetic research (Naveed et al., 2015).

Participants of genetic testing and biobank research appear to be more willing to share data with

other researchers and academic institutions, compared to sharing data with national databases or

federal repositories (Beskow et al., 2008; Brothers et al., 2011; Garrison et al., 2015; Kaufman et

al., 2009a; Kaufman et al., 2009b). Participants expressed concern about the government having

access to their samples and government involvement in medical database operation and

management (Brothers et al., 2011; Kaufman et al., 2009a; Kaufman et al., 2009b). Another study

examined factors correlated with preferring broad consent pertaining to genomic research, finding

that participants who were more likely to choose broad consent also believed that participating

would make them feel “like I was contributing to society,” and would accelerate medical

treatments and advancements (Platt et al., 2013).

3.3 Methodology

3.3.1 Overview of Study

Our study was composed of three unique vignettes (i.e., hypothetical decision-making

situations). The first two vignettes focused on the decision-making regarding taking a genetic test

(or not), whereas the third vignette represented a security scenario with a data breach involving

genetic data. Across all three scenarios, we ask participants for their evaluation from the

perspective of a portrayed third-person decision-maker. Each vignette included scenario-

appropriate context manipulations and follow-up measures (i.e., dependent variables) with which

we aimed to explore our set of exploratory research questions addressing genetic test adoption

decision, privacy, and psychological factors.

Adequately assessing whether a test should be taken as well as understanding the involved privacy

considerations is challenging for individuals. To this effect, a major component of our vignettes

involved varying the description of the portrayed decision-maker with regard to their

construal level (Trope et al., 2003). A high construal level indicates that decision-makers perceive

a higher psychological distance to the object/event of choice and consider the choice from an

abstract perspective. In contrast, a low construal level indicates a closer psychological distance

and an assessment more focused on concrete aspects of the situation (Trope et al., 2003). A key

prediction from the theory is that a higher construal level is assumed to be associated with a higher

focus on striving for longer term objectives in relative comparison to a lower construal level (Fujita

et al., 2006). We interpret the decision to seek a genetic test to be most impactful regarding long

term health. At the same time, privacy (whether personal or interpersonal) is also fundamentally

associated with consequences stretched over a longer time horizon (Acquisti and Grossklags, 2005;

Grossklags et al., 2014).

3.3.2 Design of Vignettes

Vignettes 1 and 2: The first two vignettes were designed to evaluate the decision-making regarding taking

a genetic test (or not) and its personal and interdependent privacy consequences, and to understand the

impact of construal level on the participants’ assessment. As context manipulations we designated

the following independent variables: 1) whether or not a genetic testing company was new or

established, 2) whether the protagonist was considering their immediate or extended family

members, 3) how low or high the protagonist’s proclivity for genetic ailments was, 4) how

expensive the genetic test was, and 5) how likely the chance is of finding a cure for a disease if

more genetic data is available. The manipulation of construal level followed closely the design

developed by Fleischmann et al. (2015). Vignettes 1 and 2 began with a manipulation of construal

level followed by a common text containing the context manipulations.

In our first vignette, we describe John and his personality based on characteristics ascribed

to a high construal level individual. In this instance, we describe John as a big picture decision-

maker, and a person who tends to make gut decisions, reflecting the high construal level

description of an individual who thinks abstractly, and is more focused on bigger picture concerns

(Fujita et al., 2006).

High Construal Level (Vignette 1:) John is middle-aged man who works as a mid-level

manager at Western Frontier Software. At his position, John is well respected for being a big

picture decision maker, who always establishes a solid overview of goals to solve complex

problems. Although it sometimes ends up going poorly, he also tends to make “gut decisions” and

focus on the essential information when making these choices.

For our second vignette, we describe a character named Carlton and his personality

ascribed to a low construal level. As such, he was described as an individual who is more

concerned with short-term interests and can be weighed down by (at times) excessive

nearsightedness (Fujita et al., 2006).

Low Construal Level (Vignette 2:) Carlton is middle-aged man who works as a mid-level

manager at the Cryodine Software Company. At his position, Carlton is well respected for being

someone who is a very diligent decision maker, who always wants to consider all of the details of

a problem before attempting to solve it. Although it sometimes causes his superiors some

frustration, he tends to decide everything in a very patient, rational manner, and is heavily detail

focused.

Context Manipulations (Vignettes 1 and 2:) Recently, John/Carlton has been thinking a

lot about his health, and has heard of several (new, unknown/established) companies that now

offer genetic tests. These genetic tests could be used to potentially review information about

himself that doctors may not be able to find until any medical problems emerge, if they ever did.

Carlton knows that several of his (immediate/extended) family members have a (low/high)

proclivity for genetic problems, such as cancer, deafness, and other diseases. Upon further

research into these companies, Carlton finds that he can take a genetic test for (under $100/over

$100). In addition to this, Carlton learns that after genetic tests have been conducted, each

company anonymizes the results and submits them to the National Institutes of Health. If enough

genetic tests are submitted, the NIH believes it has a (low/high) chance of finding cures for a

number of diseases.
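
As a rough illustration of the factorial structure, the five two-level context manipulations in the common text above yield 2^5 = 32 possible versions of the vignette text. The Python sketch below enumerates them. The factor labels and the random assignment step are assumptions made for illustration and do not describe the survey software actually used in the study.

```python
import itertools
import random

# Two-level context manipulations taken from the vignette text above.
# The dictionary keys are hypothetical labels chosen for this sketch.
FACTORS = {
    "company": ["new, unknown", "established"],
    "family": ["immediate", "extended"],
    "proclivity": ["low", "high"],
    "cost": ["under $100", "over $100"],
    "cure_chance": ["low", "high"],
}


def all_conditions(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels))
            for levels in itertools.product(*factors.values())]


def assign_condition(factors, rng):
    """Draw one level per factor at random (an assumed assignment scheme)."""
    return {name: rng.choice(levels) for name, levels in factors.items()}


conditions = all_conditions(FACTORS)
print(len(conditions))  # 32
print(assign_condition(FACTORS, random.Random(0)))
```

Because the construal level manipulation is carried by the two character descriptions (John and Carlton), it is not included as a sixth factor in this sketch.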

Upon reading each of the first two vignettes, participants were asked to respond to several

follow-up questions on 5-point Likert scales, anchored at 1 = strongly disagree, and 5 = strongly

agree. The dependent variables were presented as follows. Likely usage: Carlton intends to sign

up for the service and have his genome tested rather than not sign up for the service; Carlton’s

intentions are to use the genetic service rather than any alternative means (such as see a geneticist,

or other specialist in-person). Recommended usage (for others): John intends to recommend the

service to his family and friends, rather than not recommend the service; John’s intentions are to

recommend a genetic testing service to his friends and family, rather than any other alternative

means. Along these lines, we also explicitly tested for health awareness: Using the genetic testing

services will enhance Carlton’s personal health awareness. To explore concepts of individual

privacy, the following was asked: Carlton should be concerned about his genetic privacy when he

submits his genetic data to the company. Interdependent privacy was measured with the following:

Carlton should be concerned about his relatives’ genetic privacy when he submits his genetic data

to the company; Carlton should be concerned about his genetic privacy if his relatives submit their

data to the company. Trust in a genetic testing company was measured through the following:

Carlton believes the genetic testing service will manage his data securely.

Vignette 3: In the third vignette, we introduced a new character who had recently received

an email from a genetic testing company which suffered a data breach. As contextual

manipulations we considered the following independent variables: 1) the number of records stolen,

2) whether or not names were attached to the data that was stolen, 3) what is the family relation to

the individual who had their records stolen, and 4) what level of severity the family’s genetic

propensity for diseases is.

Context Manipulations (Vignette 3:)

Dear Tom,

Last week, attackers were able to breach our servers obtaining access to user accounts. They stole

(1,000-10,000/ 15,000-100,000/ 225,000-1,000,000+) records with (names attached/not

attached). (Your/Your immediate family member’s/Your extended family member’s) data

was compromised in this breach. We are informing you of this breach because genetic information

reveals a great deal about the health history, medical pre-dispositions, and disease carrier status

of the test taker. Since family members share genetic information, and to some extent, propensity

for disease, this data breach could reveal some genetic information about (you/your immediate

family members/your extended family members) as well. Our records indicate that (you

have/your immediate family member has/your extended family member has) a history of

(mild/moderate/severe) genetic disorders. It is not yet clear what the party affiliated with the

breach might do with this data, but the potential exists for this information to reach insurers,

increasing the costs associated with long term, life, and disability insurance.

Upon reading the third vignette, participants were asked to respond to several follow-up

questions on 5-point Likert scales, with 1 = strongly disagree, and 5 = strongly agree. The

dependent variables were presented as follows. Insurance concerns: Tom believes that this data

breach has the potential to change his health insurance policy. Third Party Privacy: Tom should

be concerned about his genetic privacy becoming available to a third party.

Personal/Interdependent Privacy: Tom should be concerned about his relatives’/his genetic

privacy becoming available to a third party due to this data breach.

Control Variables: To control for our subjects’ previous knowledge of, or experience

with, genetic testing services, we prefaced our survey with items detailing how much/how little

our participants knew about genetic testing, as well as whether or not they had ever taken a genetic

test. As we wanted to control for interdependent relationships as well, we also asked whether or

not our participants had ever had a family member take a genetic test. To account for any biases,

we also measured our participants’ understanding of each scenario, their opinion on the realism of

that scenario, as well as the participants’ ability to relate to each scenario following every vignette.

We also collected the participants’ age, gender, and highest level of education achieved for

demographic purposes.

3.3.3 Procedures and Participants

We utilized Amazon’s Mechanical Turk for participant recruitment. Upon arriving at our

survey, participants were consented and told to read an introductory paragraph which contained

basic information about what genetic testing services exist, and why these types of tests are taken.

After providing demographic responses, the participants were then presented with the vignettes

and the questions capturing the dependent variables. Upon completing our survey, participants

were compensated with $0.50 via Mechanical Turk. The described experiment was run during

August 2016. Using filtering tools provided by Mechanical Turk, our participant pool was limited

to individuals living within the United States, with previous approval ratings of 95% or higher.

The survey was written and presented in English only. To further ensure reliable data, several

“check” questions were included in the survey. Our study was approved by The Pennsylvania State

University’s Institutional Review Board (IRB).

A total of 500 individuals took part in our study. 16 participants were excluded from our

final analysis due to failing to answer check questions correctly. This placed the completion rate

at 96.8%. Of the 484 participants who successfully completed our survey, 207 were males, and

276 were females, with 1 reporting as other. The average age of our participants was 37.03

(SD=12.2) years. Our participants were split in their previous knowledge of genetic testing

services, with 46.7% being very uninformed to neither informed nor uninformed, on a 5-point Likert

scale. Perhaps more interestingly, 10.3% of our participants reported having previously taken a

genetic test, and 13.8% reported having a relative who completed a genetic test.
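
The sample figures reported above can be cross-checked with a few lines of arithmetic. The short Python sketch below uses only the counts stated in this section; the variable names are illustrative.

```python
# Figures reported in the text above.
recruited = 500
excluded = 16

completed = recruited - excluded
completion_rate = completed / recruited

# Gender breakdown reported for participants who completed the survey.
gender_counts = {"male": 207, "female": 276, "other": 1}

print(completed)                    # 484
print(f"{completion_rate:.1%}")     # 96.8%
print(sum(gender_counts.values()))  # 484
```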

Figure 1: Dependent Variables (Non-Breach Scenarios)

Figure 2: Comparison: Non-Breach vs. Breach

3.4 Results

Control Variables and Vignette Manipulation Checks: To confirm that our participants

were responding to our survey items in the context of the characters in the vignettes, we first aimed

to evaluate whether outside influences would likely impact the results. To this effect, we split the

participant pool with regard to our pre-survey variables of knowledge of genetic testing, and/or

having taken a genetic test. For participants who had previous knowledge of genetic testing, the

results (using one-way ANOVA) showed no significant differences between groups for likely

usage (F=0.809, p=0.520), recommended use (F=0.179, p=0.949), health awareness (F=1.607,

p=0.949), personal privacy (F=0.342, p=0.849), or interdependent privacy (F=0.469, p=0.758)

compared to those in our participant pool who had little to no knowledge of genetic testing. For

participants who had previously taken a genetic test, the results also showed no significant

differences between groups for likely usage (F=1.712, p=0.182), recommended use (F=0.301,

p=0.740), health awareness (F=0.855, p=0.426), personal privacy (F=0.129, p=0.879), or

interdependent privacy (F=1.859, p=0.157) as opposed to those participants who had not taken a

genetic test. We can conclude that our participants were firmly grounded in our scenarios and

external influences likely had no dominant impact.
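
For readers less familiar with the test used here, the following Python sketch shows how a one-way ANOVA F statistic is computed from group data. It is a minimal illustration using only the standard library. The three groups and their Likert responses are made up for the example and are not the study data, and the sketch stops at the F statistic and its degrees of freedom without computing a p-value.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up Likert responses (1-5) for three hypothetical knowledge groups.
low = [3, 4, 4, 5, 3, 4]
medium = [4, 4, 3, 5, 4, 3]
high = [4, 5, 3, 4, 4, 4]

f_stat, df_b, df_w = one_way_anova([low, medium, high])
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

A small F value, as in this toy example, means the group means differ little relative to the spread within groups, which is the pattern the control checks above report.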

For each of our three vignettes, we also examined the ability of our participants to

understand, believe to be realistic, and relate to each of our scenarios. Our participants reported

high scores on 5-point Likert scales for understanding (M=4.2; SD=0.721 / M=4.34; SD=0.711 /

M=4.45; SD=0.650), realism (M=4.17; SD=0.745 / M=4.16; SD=0.720 / M=4.20; SD=0.801), and

ability to relate (M=4.06; SD=0.859 / M=4.10; SD=0.835 / M=4.06; SD=0.924). In addition to

this, it was found that the varying construal levels did not reveal any significant differences for

scenario realism (F=0.253, p=0.615) or the ability to relate to the scenario (F=2.478, p=0.116);

however, we found a small but significant difference for understanding (F=12.046, p=.001). The

results suggest that participants were overall firmly situated in the developed scenarios.

Data Exploration: Figure 1 provides an overview of the measures for important dependent

variables (based on 5-point Likert scales) collected for the first two vignettes. The participants

reported a well-above average intention to use a genetic testing service (M=3.74, SD=0.70), and

trust in genetic testing companies (M=3.73, SD=0.49). We found that on average participants were

slightly less inclined to recommend a genetic testing service to a friend or relative (M=3.24,

SD=0.82). Additionally, our participants collectively reported having moderate concerns about

personal privacy (M=3.05, SD=1.20) and interdependent privacy (M=3.00, SD=1.19). Due to our

sample size, the comparison between one’s considerations of personal privacy and that of

interdependent privacy shows a small, but significant difference, in favor of individuals being more

concerned about their own privacy (t(483)=2.03, p=.04).

Research Questions 1 & 2: During the following analyses, we conducted one-way

ANOVAs to explore the interactions in each of our vignettes between our dependent and

independent variables. Beginning with vignette 1 (high construal level), we found that there were

no significant effects of new/established genetic testing services, family relationships

(immediate/extended), or cost of service on our dependent variables. We observed a significant

interaction between predisposed genetic proclivity and enhanced health awareness (F=4.717,

p<.03), indicating that with a higher genetic proclivity for disease, it was more believed that taking

a genetic test would improve health awareness. In addition, there was a high level of interaction

between increasing the chance of a cure and dependent variables. Specifically, we find that as the

chance of finding a cure for a disease (by analyzing genetic test results) increases, people are

significantly more likely to use a genetic testing service (F=18.54, p<.001), recommend that others

use a genetic testing service (F=14.798, p<.001), report a higher enhanced health awareness

(F=6.664, p<.01), have less of a concern for personal privacy (F=3.952, p<.047), and are more

likely to trust the company offering genetic services (F=6.571, p<.01). The interaction with

interdependent privacy was not significant.

For vignette 2 (low construal level), we found no significant interactions between family

relationships (immediate/extended) and any of our dependent variables. We observed an

interaction between new/established genetic testing services and concern for interdependent

privacy (F=6.132, p=.014), indicating that the newer a service is, the more likely test-takers are

to be concerned about the privacy of family members. There was also an interaction

between genetic proclivity for diseases (low/high) and dependent variables. These results show

that with higher proclivity for genetic ailments, these individuals are more likely to use a genetic

testing service (F=24.547, p<.001), recommend that others use a genetic testing service (F=15.425,

p<.001), report higher enhanced health awareness (F=15.929, p<.001), and are more likely to trust

a genetic testing service (F=5.931, p=.015). Similar to vignette 1, we also find a significant

interaction between increasing the chance of a cure (using data collected from genetic tests) and

dependent variables. This, again, indicates that as the chance of finding a cure for a genetic disease

by utilizing genetic tests increases, individuals are more likely to use (F=5.675, p=.018) and

recommend the service to others (F=7.035, p<.01), have a greater trust in the company running the

testing service (F=4.838, p<.028), and report a higher enhanced health awareness (F=9.637,

p<.01). Finally, within our second vignette, we did find an interaction between cost of a genetic

test with trust (F=5.775, p=.017) in the service. Interestingly, this suggests that as a genetic testing

service charges more money, individuals are more likely to trust said company.

For vignette 3, we found that as the number of stolen records increases, so does the concern

individuals have for their data being leaked to a third party (F=3.052, p<.05). We also observed

interactions between names being attached (or unattached) to any stolen data and dependent

variables. This indicates that if names are attached to stolen genetic information, people are more

likely to be concerned about their insurers (F=18.532, p<.001) or third parties (F=32.182, p<.001)

gaining access to the information, and the concern for personal privacy (F=30.549, p<.001) and

interdependent privacy (F=15.356, p<.001) also increases. Finally, we found no significant

interactions between variations of the family relation (i.e., you, your immediate family, your

extended family) and any of our dependent variables.

To delve deeper into the concerns for privacy, we compared aggregate measures for

personal and interdependent privacy concerns, respectively, from our first two vignettes (which

did not capture a data breach), to our third vignette (see Figure 2). The analysis suggests that the

occurrence of a data breach has a profound impact on personal and interdependent privacy

concerns. We find that, in the breach scenario, concern for personal privacy is significantly greater

compared to the aggregate measure from vignettes 1 & 2 (t(483) = -16.54, p<.01). The same

finding applies to interdependent privacy (t(483)=-18.714, p<.001). Additionally, in the event of

a data breach, individuals tend to have a greater concern for the privacy of others than they do for

themselves (t(483)=-3.37, p=.001). This stands in contrast to the scenarios without a data breach,

in which individuals have a greater concern for their personal privacy than interdependent privacy

(t(483)=2.03, p=.04).
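
The comparisons above are paired: each respondent contributes one rating per scenario, so the test is run on within-person differences and has n - 1 degrees of freedom (483 for 484 respondents). The Python sketch below shows the calculation with the standard library only. The six pairs of ratings are made up for illustration and are not the study data.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(x, y):
    """Return (t, df) for paired samples x and y of equal length."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    # t is the mean difference divided by its standard error.
    t_stat = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t_stat, n - 1


# Made-up Likert ratings (1-5) from six hypothetical respondents.
no_breach = [3, 2, 4, 3, 3, 2]
breach = [5, 4, 4, 5, 4, 4]

t_stat, df = paired_t(no_breach, breach)
print(f"t({df}) = {t_stat:.2f}")
```

With the non-breach ratings passed first, a negative t indicates higher concern in the breach scenario, which matches the sign convention of the statistics reported above.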

Research Questions 3 & 4 - Construal Level Variations: To identify interactions due to

varying construal levels across the first two vignettes, we conducted repeated measures ANOVA

analyses. We observed significant differences for each of our dependent variables with the

exception of personal privacy (F=0.971, p=0.325) and interdependent privacy (F=0.248, p=0.618).

In the situation portraying a higher construal level, participants were more likely to use (F=51.12,

p<.001), recommend use (F=17.66, p<.001), and trust (F=27.87, p<.001) genetic testing services

as well as report higher health awareness (F=8.15, p<.01) compared to the lower construal level

scenario.
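
As background on the method, a repeated measures ANOVA with a single within-subject factor at two levels (here, high versus low construal) partitions the total variation into a condition effect, a subject effect, and residual error. In this two-level case the resulting F statistic equals the square of the paired t statistic. The Python sketch below demonstrates this on made-up ratings; the data and function names are illustrative assumptions, not the analysis code used in the study.

```python
from math import sqrt
from statistics import mean, stdev


def rm_anova_two_levels(a, b):
    """Repeated measures ANOVA for one within-subject factor with two levels.

    Returns (F, df_condition, df_error).
    """
    n = len(a)
    grand = mean(a + b)
    subject_means = [(x + y) / 2 for x, y in zip(a, b)]
    ss_total = sum((v - grand) ** 2 for v in a + b)
    # Variation explained by the condition (two condition means).
    ss_cond = n * ((mean(a) - grand) ** 2 + (mean(b) - grand) ** 2)
    # Variation explained by stable differences between subjects.
    ss_subj = 2 * sum((m - grand) ** 2 for m in subject_means)
    ss_error = ss_total - ss_cond - ss_subj
    df_cond, df_error = 1, n - 1
    f_stat = (ss_cond / df_cond) / (ss_error / df_error)
    return f_stat, df_cond, df_error


# Made-up "intention to use" ratings from six hypothetical respondents.
high_construal = [4, 3, 5, 4, 4, 3]
low_construal = [3, 3, 4, 2, 4, 2]

f_stat, df1, df2 = rm_anova_two_levels(high_construal, low_construal)

# Paired t statistic on the same data, for comparison with F.
diffs = [x - y for x, y in zip(high_construal, low_construal)]
t_stat = mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

print(f"F({df1}, {df2}) = {f_stat:.3f}, t^2 = {t_stat ** 2:.3f}")
```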

3.5 Discussion

Within our first two vignettes (and without a data breach), we were able to show that the

following factors, in different contexts, can influence the adoption of use of genetic testing

services: 1) Whether or not the genetic testing service is new or established, 2) Whether or not

individuals have any proclivity towards genetic ailments, 3) the amount of money required to take

a genetic test, and 4) the chance that information gained from a genetic test can contribute to curing

a disease. In particular, any increases in the stated likelihood of finding a cure through the analysis

of aggregated genetic test data led to significant perceptional changes. The likelihood of

establishing a cure having such a strong interaction on the adoption of use of a genetic testing

service indicates that some form of altruism or selflessness is a major factor in these types of

decisions.

In the scenarios without a data breach, we observed that respondents exhibited moderate

personal and interdependent privacy concerns with a small, but significant, imbalance in favor of

personal privacy concerns. Other adoption-related factors were perceived as more important on

the common Likert-scale (see Figure 1). Further, both dimensions of privacy concerns were largely

unresponsive to the utilized contextual manipulations; except a significant decrease of personal

privacy concerns if the chance of a cure increased. In contrast, in the scenario with a data breach,

privacy considerations moved more into the foreground. From the contextual variables, the

manipulation whether data was leaked with personally identifying information attached was

significant. We further observed significant increases in both concern dimensions (compared to

the previous scenarios), as well as a reversal in the importance of personal and interdependent

privacy (see Figure 2). The latter result contributes to the ongoing debate whether individuals can

be described as “privacy egoists” (Pu et al., 2015; Pu et al., 2016).

Addressing our third and fourth research questions, when examining the impact of

construal level variations (in the first two vignettes), our data showed in scenarios with portrayals

of higher construal levels that respondents are more likely to use, recommend, and trust genetic

testing services. Overall, it was apparent that construal level was just as important as the contextual

manipulations within our vignettes, indicating that thinking style attributes may be a critical factor

for an individual choosing to partake or not partake in a genetic test. Considering the large

influence of the construal level on measures with regards to using, recommending, or trusting a

genetic testing service, it is somewhat surprising that personal and interdependent privacy concerns

are not impacted by this manipulation. While we expected that a higher construal level positively

impacts long-term considerations such as health and privacy, the desire to protect privacy may

appear in conflict with an individual’s long-term desire for health (Acquisti and Grossklags, 2005).

These long-term trade-offs are currently understudied and are motivation for follow-up research.

3.6 Conclusion

Genomic information is a complex data type which not only offers genetic and physical

trait discernment of the test-taker, but can also offer insights into genetic predispositions of the

test-taker’s family members. As such, the decision to take part in a genetic test triggers interdependent

privacy considerations. In addition, many factors relating to how and why

people choose to take genetic tests are still insufficiently understood.

With this study, we demonstrate that cost, medical history, and the desire to contribute in

part to the finding of a cure all have a shared impact on determining whether or not an individual

may take a genetic test. In addition, personal and interdependent privacy concerns are shaping the

decision-making process; primarily in the context of a data breach. In summary, this work

represents a critical next step in dismantling the complex perceptional and behavioral challenges

of use rationale, and interdependent privacy as it relates to genomic health data. Our study also

serves as a methodological starting point for future work to investigate factors influencing the use,

continued use, or cessation of genetic testing services.

Chapter 4: A Vignette Study On Personal and Interdependent Privacy Concerns, and Sharing Intentions for Genetic Data

Genetics and genetic data have been the subject of recent scholarly work, with significant

attention paid towards understanding genetic data consent practices and genetic data security.

Attitudes and perceptions concerning the trustworthiness of governmental or national institutions

receiving test-taker data have been explored, with varied findings, but no robust models or

deterministic relationships have been established that account for these differences. These results

also do not explore in detail the perceptions regarding other types of organizations (e.g., private

corporations). Further, considerations of privacy interdependence arising from blood relative

relationships have been absent from the conversation regarding the sharing of genetic data. This

paper reports the results from a factorial vignette survey study in which we investigate how

variables of ethnicity, age, genetic markers, and association of data with the individual’s name

affect the likelihood of sharing data with different types of organizations. We also investigate

elements of personal and interdependent privacy concerns. We document the significant role these

factors have in the decision to share or not share genetic data with a third party. We support our

findings with a series of regression analyses.

4.1 Introduction

The field of genetics has seen a rapid expansion ever since the sequencing of the human

genome was first completed in 2003 (Collins et al., 2003). The accumulation of knowledge

associated with the pursuit of such a grand challenge has led to many medical advances, treatments,

and cures. Progress in stem cell research, HIV treatment, targeted cancer therapies, and even

individualized genetic testing has benefited the lives of millions (Peck and Cox, 2009). However,

with the proliferation of genetic information, questions have emerged regarding how to best store,

protect, anonymize, and even share such data (Humbert et al., 2014; Miller et al., 2009; Naveed et

al., 2015; Heeney et al., 2010). While the degree to which individuals trust varying institutions

(whether medical, governmental, or academic) with their data has found some initial attention,

research has yet to address the personal and interdependent privacy considerations

that go into the decision not only to undergo a genetic test, but also to share genetic data (Kaufman et al.,

2009a; Garrison et al., 2015; Lemke et al., 2010). Previous studies have found that a majority of

individuals would be willing to share genetic data, but that these sharing decisions are associated

with significant concerns. For example, while a national survey found that 80% of respondents

would be willing to share their data with the government, 75% of the sample also expressed

concern over the government having access to their data (Kaufman et al., 2009a).

This chapter reports the results from a factorial vignette survey study in which we

investigate how variables of ethnicity, age, genetic markers, and association of data with the

individual’s name affect the willingness to share data with different types of organizations. We

also investigate elements of personal and interdependent privacy concerns. We document the

significant role these factors have in the decision to share or not share genetic data with a third

party. We support our findings with a series of regression analyses.

We proceed as follows. In Section 4.2, we present a concise overview of previous research

on genetic data sharing and genetic privacy. In Section 4.3, we continue with the development of

our research questions, and research methodology. We present our results in Section 4.4. We

discuss our findings and offer concluding remarks in Section 4.5.

4.2 Related Work

Genetic data is a unique data type that has clear and defined characteristics that distinguish

it from other data types (Naveed et al., 2015). Although computational and descriptive in nature,

genetic data has certain immutable features that can be easily tied back to the provider of the data,

either directly or by inference (Humbert et al., 2013). These features can be categorized broadly

as: 1) Health/Behavior, 2) Static/Traceable, 3) Unique, 4) Value, and 5) Kinship (Naveed et al.,

2015). The first category stresses the point that DNA contains information about an individual’s

health, and by extension, behavior. The second factor highlights that DNA remains largely

unchanged over time. The third factor demonstrates that a given human’s DNA is easily

distinguishable from another human’s DNA. The fourth characteristic refers to the importance of

the information content in DNA and genetic data, which does not decline over time. The fifth,

kinship, refers to the fact that DNA contains information about blood relatives, something that this

paper investigates through the lens of interdependent privacy (Naveed et al., 2015).

Previous literature has identified and partially examined some of the following factors that

are relevant for genetic data sharing decisions: 1) trust in academic, medical, and government

institutions, 2) associated views of consent, and, in particular, broad consent, and 3) the recognition

of personal risk and group benefits in a data sharing context (Haga and O’Daniel, 2010; Garrison

et al., 2015; Kaufman et al., 2009a; Helft et al., 2007). From 2006-2015, over 20 studies were

conducted exploring genetic data sharing focusing either on trust in medical or academic

institutions, or trust in the government (Garrison et al., 2015; Pulley et al., 2008; Lemke et al.,

2010). For example, a relatively recent study surveying 4,659 U.S. adults found that approximately

92% would be generally willing to share their data with academic and medical research institutions

(Kaufman et al., 2009a). Another study by the same authors also found that 80% of 931 U.S.

veterans would be willing to share their data with these entities (Kaufman et al., 2009b). However,

research suggests that participants are more reserved about sharing their genetic data with

governmental or federal institutions (Kaufman et al., 2009a; Garrison et al., 2015; Rahm et al.,

2013; Platt et al., 2014). In a large study of nearly 5,000 U.S. adults, 80% said that they were

willing to share their data with government researchers (Kaufman et al., 2009a). However, 75% of

the same sample expressed concern about the government having this information (Kaufman

et al., 2009a). Likewise, in a focus group study conducted in North Carolina, more than half of the

participants expressed concerns about the government gaining access to any biorepository

(Beskow et al., 2008).

Broad consent is another emergent area in the field of genetic data sharing (Garrison et al.,

2015). Broad consent is the practice in which participants agree to have their genomic data retained

for any future use or research by a biobank, medical, or governmental institution (Garrison et al.,

2015). Broad consent may be used in the form of opt-in and opt-out clauses (Garrison et al., 2015;

Simon et al., 2011). The former would require the permission from the participant for any entity

to use their information, while the latter would not (Hull et al., 2008; Garrison et al., 2015). To our

knowledge, five studies have shown that survey respondents favor the opt-in approach; however,

other studies show a preference for the opt-out approach (Hull et al., 2008; Pulley et al., 2008;

Simon et al., 2011; Schwartz et al., 2001; Thiel et al., 2014; Pentz et al., 2006; Ludman et al.,

2010).

Concerns about data sharing may also be triggered by other factors. For example, a recent

study of 4,050 Vanderbilt University faculty and staff found that they were 18.5% more likely to

share data with a national database if their information were to be de-identified (Brothers et al.,

2011). Weidman et al. explored personal and interdependent privacy concerns (i.e., test-taking also

reveals insights into the genetic material of the test-taker’s blood relatives) in the decision-making

context on whether to take a genetic test (Weidman et al., 2016). They also investigated how

privacy concerns shift when data security is compromised by a breach of the genetic test institution.

However, we are unaware of any study considering the role of personal and interdependent privacy

concerns in the context of genetic data sharing.

4.3 Methodology

4.3.1 Research Questions and Overview of Study

The principal objective of our study is to examine the role of key factors in the genetic data

sharing decision-making process. Our exploratory research questions are motivated by the

following observations. Previous research has suggested that the willingness of test-takers to share

genetic data depends on the type of institution (Kaufman et al., 2009a). Further, trust in the

institution has been considered in previous work on genetic data sharing (McCarty, 2011).

However, we are unaware of any studies that conduct a direct comparison for such preferences

across institutional types. To this effect, we consider a comparison between the following

institutional types: private medical research organization, private academic research organization,

and governmental research organization.

Further, the relative importance of personal and interdependent privacy considerations has

only been studied in the context of the intention to take a test (Weidman et al., 2016), but not in

the related context of data sharing. In our work, we focus on privacy preferences in genetic data

sharing scenarios and measure individuals’ personal privacy concerns as well as interdependent

privacy concerns. In the latter case, we compare whether individuals can correctly differentiate

between the absence of interdependent privacy consequences (close friend; no blood relative) and

the presence of significant interdependent privacy consequences (sibling; i.e., close blood relative).

We further consider whether the test-taker’s genetic data will be shared in an anonymous or

identified fashion, which has previously been identified (chapter 3) as a factor in genetic data

scenarios (Brothers et al., 2011). In addition, privacy concerns and the intention to share genetic

data with a research institution likely depend on the presence or absence of a genetic condition

(genetic marker), which we consider in our study.

We further consider demographic characteristics including age, gender, ethnicity, income,

and education level, and relevant experience factors including whether the participant has taken a

genetic test, or personally knows a family member/friend who has taken a genetic test, which are

relevant to the genetic data sharing scenario.

Finally, we distinguish between two basic decision-making perspectives. First, we consider

a genetic data sharing situation in which the decision-maker has to consider the sharing of his/her

own data. In the second scenario, we place the decision-maker in the role of an adviser who has to

state a recommendation about the sharing of genetic data to another individual.

We summarize our approach in the following two research questions.

RQ1: To which degree do demographic characteristics and factors related to genetic

testing impact perceptions of trust, personal and interdependent privacy, and the intention of

sharing own genetic data with a third party?

RQ2: To which degree do demographic characteristics and factors related to genetic

testing impact perceptions of trust, personal and interdependent privacy, and the intention of

recommending to another individual to share genetic data with a third party?

To address our research questions, we embedded the aforementioned factors into relevant

scenarios, delivered by a vignette survey, as well as follow-up questions. The vignette survey

technique was designed to uncover the mechanisms and underlying choice preferences of the

respondent, in addition to revealing the valuations and considerations of each person (Aviram,

2012). Across each scenario, we asked participants for their evaluation of a hypothetical situation.

All vignettes included appropriate context manipulations and follow-up measures. The study was

approved by The Pennsylvania State University’s Institutional Review Board (IRB).

4.3.2 Design of Vignettes

The first vignette displayed to the respondents was designed to evaluate the decision-

making process regarding personal and interdependent privacy concerns in the context of sharing

data with a third party. We designed the vignettes to be compared from the perspective of the test

taker (with hypothetical attributes, e.g., whether this person carries a genetic disease), and from the

hypothetical perspective of the test taker’s close friend or sibling. The first vignette asks direct

questions to the respondent in reference to our context manipulations. The second vignette asks

the respondent to answer questions as if their friend or sibling were seeking advice in a scenario

in which they were asked to share their genetic data. We did this to investigate the effect of personal

privacy considerations versus privacy considerations of, and for, others. The independent

variables for vignette 1 were as follows: 1) whether or not the test taker carries a genetic disease,

2) the third party with which the data is being shared (government research group, private medical

research group, academic research group), and, 3) whether or not name affiliation is present with

the data. The independent variables for vignette 2 were as follows: 1) whether the person who took

the test is a close friend or sibling, 2) whether or not this person carries a genetic disease, 3) the

third party with which the data is being shared (government research group, private medical group,

or academic research group), and 4) whether or not name affiliation is present with the data.
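
The factor lists above imply a factorial design. As a minimal sketch, assuming every combination of factor levels was fielded (the number of conditions is not stated explicitly in the text), the conditions for each vignette can be enumerated as follows. The constant and function names are illustrative and do not come from the survey instrument.

```python
from itertools import product

# Factor levels taken from the vignette descriptions; names are illustrative.
GENETIC_DISEASE = ["carries disease", "does not carry disease"]
REQUESTER = [
    "government research group",
    "private medical research group",
    "academic research group",
]
NAME_AFFILIATION = ["name attached", "anonymized"]
RELATION = ["close friend", "sibling"]  # used in Vignette 2 only


def vignette1_conditions():
    """Every combination of the three Vignette 1 factors."""
    return list(product(GENETIC_DISEASE, REQUESTER, NAME_AFFILIATION))


def vignette2_conditions():
    """Vignette 2 adds the friend/sibling factor to the same three factors."""
    return list(product(RELATION, GENETIC_DISEASE, REQUESTER, NAME_AFFILIATION))


if __name__ == "__main__":
    print(len(vignette1_conditions()))  # 2 * 3 * 2 = 12
    print(len(vignette2_conditions()))  # 2 * 2 * 3 * 2 = 24
```

Under this assumption, Vignette 1 has 12 conditions and Vignette 2 has 24.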

4.3.2.1 Vignette 1 and Contextual Manipulations

You have been thinking about your health. In online searches about how to be healthier,

you came across and read about the recent advances in genetic testing. In your reading, you have

learned that genetic tests can inform you about possible genetic traits, conditions, or abnormalities

that you, or your family could have. After careful consideration, you decide to take a genetic test.

A few weeks later, you receive your results back, complete with information about your entire

genome, as well as information on familial traits and possible genetic abnormalities. Through

these results, you were informed that you (do/don’t carry a) genetic disease that could negatively

affect your life. A few days after receiving your results, you receive a verified email from the

genetic testing company stating that a(n) (academic research team/ private research group/

government research group) has asked you, and others, to share your genetic information,

specifically your genome with (your name anonymized/not anonymized). The (academic research

team/ private research group/ government research group) claims that sharing this information

could help them discover possible treatments, unexplored genetic interactions, and even possible

cures.

After reading vignette 1, respondents were asked to respond to several follow-up questions.

Dependent variables were presented as follows: Likelihood of submitting data: What is the

likelihood of you submitting your data to a(n) academic research team/ private research group/

government research group? A 5-point Likert scale followed this question with available responses

ranging from ‘extremely unlikely’ to ‘extremely likely.’ Likelihood of abuse: How likely is the

(academic research team/ private research group/ government research group) to abuse your data

in some way without your permission? A 5-point Likert scale followed this question with available

responses ranging from ‘extremely unlikely’ to ‘extremely likely.’ To explore concepts related to

personal privacy, we asked the following question: ‘If you would share your data, what would be

your level of concern for your own privacy?’ A 5-point Likert scale followed this question with

available responses ranging from ‘highly concerned’ to ‘not concerned at all.’ To explore aspects

of interdependent privacy, we asked the following questions: ‘If you would share your data, what

would be your level of concern for the personal privacy of your friends or colleagues (non-family

members)?’ and ‘If you would share your data, what would be your level of concern for the privacy

of your family members?’ A 5-point Likert scale followed both of these interdependent privacy

questions with available responses ranging from ‘highly concerned’ to ‘not concerned at all.’

Along the lines of interdependent privacy, we displayed the question ‘Do you think your family

member(s) would recommend that you share your genetic data?’ A 5-point Likert scale followed

this question with available responses ranging from ‘definitely not’ to ‘definitely yes.’ To

investigate concepts related to trust we asked the following question: ‘To what extent do you agree

that you could trust the (private medical research group /private research group/ government

research group) to manage your genetic data responsibly?’ A 5-point Likert scale followed this

question with available responses ranging from ‘strongly disagree’ to ‘strongly agree.’

4.3.2.2 Vignette 2 and Contextual Manipulations

After thinking about it for some time, your (close friend/sibling) has just completed

(her/his) first genetic test. It was found that they (do/don’t carry) a genetic disease that could

negatively affect her/his life. Your (close friend/sibling) just let you know that they received a

verified email from the genetic testing company asking them, and others, to share her/his genetic

data with (an academic research group/private research group/government research group).

This (academic research group/private research group/government research group) claims that

sharing her/his genetic data could help them to discover possible treatments, unknown genetic

interactions, and even possible cures. The (academic research group/ private research group/

government research group) will (have/not have) your (close friend/sibling)’s name affiliated

with the data. Your (close friend/sibling) has come to you to ask for your opinion and advice on

this matter.

After reading vignette 2, respondents were asked to respond to several follow-up questions.

Dependent variables were presented as follows: Likelihood of abuse: How likely, in your opinion,

is the (academic research group/ private research group/ government research group) to abuse your

(close friend’s/sibling’s) data in some way (i.e. share to another institution without your

permission, publish your data publicly, etc.)? A 5-point Likert scale followed this question with

available responses ranging from ‘extremely unlikely’ to ‘extremely likely.’ Likelihood of

recommendation to share: ‘Would you recommend that your (close friend/ sibling) should share

her/his genetic data?’ A 5-point Likert scale followed this question with available responses

ranging from ‘definitely not’ to ‘definitely yes.’ To explore concepts related to personal privacy,

in this context, we asked the following question: ‘If your close friend/ sibling shares her/his genetic

data, what would be your level of concern for your personal privacy?’ A 5-point Likert scale

followed this question with available responses ranging from ‘highly concerned’ to ‘not concerned

at all.’ To investigate aspects of interdependent privacy, we asked the following question: ‘If your

close friend/ sibling would share her/his data, what would be your level of concern for the personal

privacy of her/his other friends or colleagues (non-family members)?’ A 5-point Likert scale

followed this question with available responses ranging from ‘highly concerned’ to ‘not concerned

at all.’ Additionally, we asked ‘If your close friend/ sibling would share her/his genetic data, what

would be your level of concern for the personal privacy of her/his family members?’ A 5-point

Likert scale followed this question with available responses ranging from ‘highly concerned’ to

‘not concerned at all.’ Along these lines, we asked, ‘If your close friend/ sibling shares her/his

genetic data, what would be your level of concern for her/his personal privacy?’ A 5-point Likert

scale followed this question with available responses ranging from ‘highly concerned’ to ‘not

concerned at all.’ To investigate concepts related to trust, we displayed the question: ‘To what

extent do you agree that you could trust the academic research group/ private research group/

government research group to manage your close friend’s/ sibling’s genetic data responsibly?’ A

5-point Likert scale followed this question with available responses ranging from ‘strongly

disagree’ to ‘strongly agree.’

4.3.3 Demographic and Experience Factors

We prefaced our survey with a self-assessment question to determine how much the subject

knows about genetic testing. We also asked whether they had previously taken a genetic test. To

control for relevant experiences via relationships, we asked whether or not the subject had a

family member or friend take a genetic test. We also collected participants’ age, gender,

race/ethnicity, income, and education level for demographic purposes.

4.3.4 Procedures and Participants

We utilized Amazon’s Mechanical Turk to recruit our participants, while the factorial vignette

survey itself was hosted and deployed on Qualtrics. Upon arrival

at our survey site, participants were presented with an introductory paragraph to ground everyone

with the same basic knowledge about genetic testing. This statement included a brief history of

genetic testing, examples of what genetic data analysis can identify and what test results can be

used for. Upon the completion of our survey, participants were compensated with US$0.50.

The survey was run during the month of November 2016. Using filtering

tools provided by Mechanical Turk, we were able to limit our participant pool to those with

approval ratings above 95% (to ensure the quality of results), while also limiting responses to users

with American Internet Protocol addresses. To further ensure the quality of our results, “check”

questions were utilized to make sure participants were carefully reading our questions. Participants

had the vignettes presented to them in a random order. The survey was written in English. A total

of 559 participants took part in our survey. After excluding participants who failed the check questions, 551

participants and their responses were retained for further analysis (98.6% of those who took part).

Of these respondents, 59.3% were female and 40.7% were male. In addition, 18.5% of participants reported

having previously taken a genetic test, while 66.8% stated that they would consider taking a

genetic test.
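
The screening step described above can be sketched as a simple filter. This is an illustration only: the field name `check_select_77` is hypothetical (modeled on the “select number 77” item mentioned in Chapter 5), and the actual export format of the survey platform is not described in the text.

```python
def passes_checks(response, answer_key):
    """True if every check question was answered as instructed."""
    return all(response.get(q) == expected for q, expected in answer_key.items())


def retention_rate(n_recruited, n_retained):
    """Share of recruited participants kept for analysis, in percent."""
    return 100.0 * n_retained / n_recruited


if __name__ == "__main__":
    # Hypothetical field name and toy responses, for illustration only.
    key = {"check_select_77": 77}
    sample = [{"check_select_77": 77}, {"check_select_77": 12}]
    kept = [r for r in sample if passes_checks(r, key)]
    print(len(kept))  # 1

    # Counts reported in the text: 559 recruited, 551 retained.
    print(round(retention_rate(559, 551), 1))  # 98.6
```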

4.4 Results

We isolated our participant pool based on whether the participants themselves had taken

(or considered taking) a genetic test (18.3% and 66.2%, respectively), or if they had known a

family member (21.4%) or friend (31.6%) who had taken a genetic test. For our participants who

had previously taken a genetic test, the results (using one-way ANOVA) showed significant

differences for likelihood of sharing data (F=3.316, p<.05), and their perception whether their

families would recommend them to share their genetic data with third parties (F=4.855, p<.05).

More specifically, participants who previously took a genetic test were more likely to share their

data, and were more likely to believe that their family would recommend that they should share

their own data. For participants who had considered taking a genetic test (but had not yet taken

one), the results showed no significant differences for any of our outcomes. This was also the case

for participants who had family members who had taken a genetic test. Lastly, for participants who

knew a friend who had taken a genetic test, their concern for the privacy of the described friend in

the vignette scenario was significantly greater than for those who had not known a friend who had

taken a genetic test (F=4.824, p<.01). The observed effects represent intuitive influences of

demographic and experience factors on outcome measures.
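
For readers who want to see how the F statistics above are formed, the following is a minimal sketch of a one-way ANOVA computed from scratch. The scores in the usage example are toy values, not study data, and the function name is our own.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Toy Likert-style scores for two groups (not study data).
    f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4]])
    print(f_stat, df_b, df_w)  # 1.5 1 4
```

The p-value then follows from the F distribution with the two degrees of freedom; a statistics package would normally be used for that step.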

Table 1. Multinomial logistic regression models for Vignette 1 for each dependent variable. Independent variables are shown with their significance levels within each respective model.

Model (Dependents)               ethnicity   age    genMarker   dataReq.   isDataLabelled
Likely Sharing                   .211        .011   .032        .063       .000
Trust                            .233        .004   .055        .004       .000
Family Privacy Concern           .037        .066   .373        .010       .000
Friend Privacy Concern           .027        .055   .767        .032       .002
Personal Privacy                 .011        .025   .080        .001       .000
Family Recommendation to Share   .037        .330   .018        .230       .000
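
As a reading aid for Table 1, the sketch below lists, for each model, the independent variables whose reported significance falls below a threshold. The values are transcribed from the table; the 0.05 threshold is an assumption on our part, chosen to match the conventional level used elsewhere in this chapter.

```python
ALPHA = 0.05  # assumed threshold
PREDICTORS = ["ethnicity", "age", "genMarker", "dataReq", "isDataLabelled"]

# Significance levels transcribed from Table 1 (Vignette 1).
TABLE_1 = {
    "Likely Sharing": [0.211, 0.011, 0.032, 0.063, 0.000],
    "Trust": [0.233, 0.004, 0.055, 0.004, 0.000],
    "Family Privacy Concern": [0.037, 0.066, 0.373, 0.010, 0.000],
    "Friend Privacy Concern": [0.027, 0.055, 0.767, 0.032, 0.002],
    "Personal Privacy": [0.011, 0.025, 0.080, 0.001, 0.000],
    "Family Recommendation to Share": [0.037, 0.330, 0.018, 0.230, 0.000],
}


def significant_predictors(p_values, alpha=ALPHA):
    """Names of predictors whose reported significance is below alpha."""
    return [name for name, p in zip(PREDICTORS, p_values) if p < alpha]


if __name__ == "__main__":
    for model, p_values in TABLE_1.items():
        print(model, "->", significant_predictors(p_values))
```

Read this way, name affiliation (isDataLabelled) is the only independent variable below the threshold in all six Vignette 1 models.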

To explain our outcome variables in more depth, we followed an established method to

analyze vignette study data by applying multinomial logistic regressions (Lauber et al., 2003;

McKinstry, 2000), using demographic and experience factors, as well as the factors of our vignette

manipulations. Exploratory analysis showed that the following factors had no significant impact

in our models: gender, income, education, having taken or considering to take a genetic test, having

a family member or friend who took a genetic test, and whether the participant had a sibling. We

therefore conducted the final regression analyses without these factors. This left us with the

following independent variables to form the basis of our regression analyses: ethnicity, age, family

relation (Vignette 2), whether an individual carries a genetic marker, the type of data requester, and whether the submitted

data is labeled with the test taker’s name. The series of models to explain the different outcome

variables for the first vignette are shown in Table 1. We conducted equivalent analyses for the

second vignette, and the results for these models are shown in Table 2. We report significances in

these tables only, due to space constraints. The directionality of selected significant relationships is

discussed in the following section.
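
For readers unfamiliar with the model family, a multinomial logistic regression assigns each response category a linear score and converts the scores into probabilities, with one category serving as the reference. The sketch below shows only this prediction step. The coefficients and the 0/1 coding of the predictors are invented for illustration and are not estimates from this study.

```python
import math


def category_probabilities(x, coefs):
    """Multinomial logit prediction.

    x: list of predictor values.
    coefs: one coefficient list (intercept first) per non-reference category.
    Returns the category probabilities, reference category first.
    """
    scores = [0.0]  # the reference category has its score fixed at 0
    for beta in coefs:
        scores.append(beta[0] + sum(b * v for b, v in zip(beta[1:], x)))
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Hypothetical predictors: [data is anonymized, genetic marker present].
    x = [1, 1]
    # Hypothetical coefficients for the four non-reference points of a
    # 5-point response scale (intercept, anonymized, marker).
    coefs = [
        [0.2, 0.1, 0.0],
        [0.1, 0.4, 0.2],
        [-0.3, 0.8, 0.5],
        [-0.9, 1.0, 0.7],
    ]
    probs = category_probabilities(x, coefs)
    print([round(p, 3) for p in probs])
```

Fitting such a model, and obtaining the significance levels reported in the tables, would normally be done with a statistics package rather than by hand.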

4.5 Discussion of Results, and Concluding Remarks

Within our vignettes, we measured respondents’ likelihood of sharing data, along with

elements of trust, concern for family members’ privacy, concern for friends’ privacy, and personal

privacy, as well as the perceived family recommendation to share (Vignette 1). We explored how these

variables were related to ethnicity, age, genetic disease presence, the data requester (government

research groups, private academic research groups, and private medical groups), and name

affiliation. We observed several notable relationships. First, we want to highlight factors which

triggered effects across most of the investigated regression models. In particular, we observe that

if test takers’ names are affiliated with the data, it triggers a significant effect for all dependent

variables. Likewise, ethnicity (except for Caucasian) is shown to be a significant influence on most

measured dependents. Second, we briefly discuss additional effects for the different models across

the two vignettes. Individuals are more likely to engage in sharing of their genetic data if they are

older (Vignette 1) and have a genetic marker present. We also note that individuals report to trust

private academic research institutions the most, followed by private medical research groups, and

governmental research groups. These findings are supported by other papers exploring institutional

trust (Garrison et al., 2015; Kaufman et al., 2009b). Individuals’ concern for a family member’s

privacy is significantly influenced by the type of entity requesting data (Vignette 1) and age

(Vignette 2). We also observed a significant relationship (p = .032, Table 1) between concern for the privacy of

a friend/colleague and the type of entity requesting data (Vignette 1). For personal privacy, age is

of significant impact as well as the type of entity requesting data (Vignette 1). In addition, in

Vignette 2 the type of relationship to the portrayed decision-maker (close friend or sibling)

significantly influences the concern for personal privacy. In Vignette 1, we also measured whether

the participant believed that their family would recommend sharing their genetic data. We observe that this

perception is significantly influenced by whether a genetic marker is present. In Vignette 2, we

also measured the survey participants’ concern for the portrayed decision-maker, but find no

additional effects.

Table 2. Multinomial logistic regression models for Vignette 2 for each dependent variable. Independent variables are shown with their significance level within each respective model.

Model (Dependents)       ethnicity   age    familyRelation   genMarker   dataReq.   isDataLabelled
Recommended Sharing      .144        .122   .212             .013        .633       .000
Trust                    .024        .304   .091             .038        .078       .000
Family Privacy Concern   .037        .001   .523             .969        .120       .000
Friend Privacy Concern   .145        .271   .234             .759        .429       .003
Personal Privacy         .035        .047   .000             .737        .806       .011
Concern for Test-Taker   .019        .097   .377             .644        .151       .000

Our findings have implications for both medical and commercial groups and organizations.

As it relates to the acquisition of genetic data for medical research, we complement existing

research on data sharing, which may help to boost recruitment in such efforts. As it relates to

commercial organizations, we believe that our results are important to genetic testing service

providers. Our findings show that name affiliation of genetic data is significantly related to key

variables such as trust in the provider, personal privacy, and the likelihood of sharing.

Demonstration of effective technical assurances (e.g., anonymization of genetic data) and

institutional measures (e.g., comprehensive security and privacy policies and audits) to ensure

consumer privacy could go a long way in recruitment efforts.

Genetic data is a unique data type triggering explicit and unavoidable interdependent privacy

concerns. The ability to uniquely identify an individual, predict health-related issues, and even

learn about familial history, makes handling and sharing genetic data a unique challenge for

security and medical experts alike. In this study, we have shown that highly relevant factors such

as institutional trust, concern for family members’ and friends’ privacy, and the likelihood of sharing genetic

data are closely associated with factors that we manipulated in the vignettes, such as carrying a

genetic marker and the institutional type of the data requester, as well as the demographic factors

age and ethnicity. Our current work is an important stepping-stone to the development of a

comprehensive explanatory model for the intention to share genetic data which is currently missing

despite 20 years of research on genetic data sharing practices.

Chapter 5: Amazon Mechanical Turk

Both chapter three and chapter four, which incorporate study one and study two,

respectively, use Amazon Mechanical Turk as the mechanism of data collection. Amazon

Mechanical Turk allows “requesters,” such as academics and scholars, to outsource tasks (so-

called Human Intelligence Tasks - HITs) as a means of collecting data. In the case of study one

and study two, we outsourced our survey to individuals who execute HITs on Amazon Mechanical

Turk. Amazon Mechanical Turk has been used in the past by researchers in the fields of privacy

and security, and demonstrates a novel way for researchers to collect data from a large pool of

subjects from across the United States or worldwide (Pu et al., 2016; Christin et al., 2012;

Grossklags et al., 2014). The demographic mix of “Turkers” appears to be more diverse than that of a

convenience university sample, allowing “requesters” to collect data that is more representative of

the target population (Cam et al., 2007). Furthermore, slightly over half of the “Turker” population

is female, which is reflective of the current female population percentage (Mason et al., 2012).

A critique of Amazon Mechanical Turk is that “requesters” pay “Turkers” money to

complete tasks, leading to questions of data quality. However, to address this concern, we

introduced “check” questions to each “Turker” to ensure quality of the data. To accomplish this,

we would ask simple questions such as “select number 77” or “what company did Tom work for?”

occasionally during the survey to ensure that each “Turker” was taking enough time to read and

comprehend our survey. A number of studies have been performed to evaluate the consistency and

quality of data originating from Amazon Mechanical Turk (Mason et al., 2012; Ross et al., 2010).

Horton et al. (2011) found no significant differences between online (Amazon Mechanical Turk) and

traditional (laboratory) versions of a behavioral study, despite the relative anonymity of

Mechanical Turk.

Chapter 6: Conclusions and Remarks

The vast and rapid expansion of personalized medicine has led to tangible

improvements in the lives of many. Specifically, personalized medicine has allowed doctors and

patients alike to assess existing genetic variations and predispositions that could manifest into

disease (NCBI, 2007). This, in turn, has made it possible for better guided and informed decisions

in medical treatment. However, as highlighted in this thesis, the growth in genetic testing has given

rise to many concerns regarding breaches of data, and personal and interdependent privacy.

In our first study, we examined likely use factors for genetic testing under the lens of

interdependent privacy. We showed that medical history, cost, and the desire to contribute

to finding a cure all affected the likelihood of an individual signing up for a genetic test.

It was also found that personal and interdependent privacy concerns are likely shaping this

decision-making process. Our second study explored sharing intentions for genetic data, also under

the scope of interdependent privacy. In this second study, we document the significant role

ethnicity, age, presence of genetic markers, and name affiliation have in the decision to share or

not share genetic data.

We have contributed to the existing literature in the areas of privacy, genetics, and trust,

while also complementing existing studies examining trustworthiness of institutions receiving test-

taker data. However, more work is still needed to investigate the factors influencing the continued

use or termination of testing services. In addition, others have the opportunity to build on this

work by developing a model that details the intention to share genetic data.

Direct-to-consumer genetic testing companies, governmental and academic research

organizations might also find this work useful. We show which factors make an individual more

likely to take a genetic test, while also demonstrating which variables factor into the data sharing

process. These findings may assist said institutions in recruiting additional or future clientele.

As the field of personalized medicine and genetics continues to expand, we must remember

that genetic data is a uniquely sensitive and vulnerable data type. It has the potential to improve

the health of an individual, and, at the same time, it harbors the possibility of misuse and

exploitation. As this burgeoning field continues to evolve and advance, we must develop security

measures and data-sharing policy practices that evolve with it.

Bibliography

23andMe. (2017). Carrier Status Reports and Ancestry Reports. 23andMe. Retrieved from https://www.23andme.com/dna-ancestry/

Acquisti, A., & Grossklags, J. (2005). Privacy and rationality in individual decision making. IEEE Security & Privacy, 3(1), 26-33.

Aviram, H. (2012). What would you do? Conducting web-based factorial vignette surveys. In Handbook of survey methodology for the social sciences (pp. 463-473).

Beskow, L. M., & Dean, E. (2008). Informed consent for biorepositories: assessing prospective participants' understanding and opinions. Cancer Epidemiology and Prevention Biomarkers, 17(6), 1440-1451.

Braun, K. L., Tsark, J. U., Powers, A., Croom, K., Kim, R., Gachupin, F. C., & Morris, P. (2014). Cancer patient perceptions about biobanking and preferred timing of consent. Biopreservation and biobanking, 12(2), 106-112.

Brothers, K. B., Morrison, D. R., & Clayton, E. W. (2011). Two large-scale surveys on community attitudes toward an opt-out biobank. American Journal of Medical Genetics Part A, 155(12), 2982-2990.

Christin, N., Egelman, S., Vidas, T., & Grossklags, J. (2011, February). It’s all about the Benjamins: An empirical study on incentivizing users to ignore security advice. In International Conference on Financial Cryptography and Data Security (pp. 16-30). Springer Berlin Heidelberg.

Collins, F. S., Morgan, M., & Patrinos, A. (2003). The Human Genome Project: lessons from large-scale biology. Science, 300(5617), 286-290.

Conley, J. M., Mitchell, R., Cadigan, R. J., Davis, A. M., Dobson, A. W., & Gladden, R. Q. (2012). A trade secret model for genomic biobanking. The Journal of Law, Medicine & Ethics, 40(3), 612-629.

Erlich, Y., & Narayanan, A. (2014). Routes for breaching and protecting genetic privacy. Nature Reviews Genetics, 15(6), 409-421.

Fleischmann, M., Grupp, T., Amirpur, M., & Benlian, A. (2015). When Updates Make a User Stick: Software Feature Updates and their Differential Effects on Users’ Continuance Intentions. Thirty Sixth International Conference on Information Systems.

Fong, M., Braun, K. L., & Chang, R. M. (2004). Native Hawaiian preferences for informed consent and disclosure of results from research using stored biological specimens. Pac Health Dialog, 11(2), 154-159.

Fujita, K., Trope, Y., Liberman, N., & Levin-Sagi, M. (2006). Construal levels and self-control. Journal of Personality and Social Psychology, 90(3), 351.

Garrison, N. A., Sathe, N. A., Antommaria, A. H. M., Holm, I. A., Sanderson, S. C., Smith, M. E., & Clayton, E. W. (2015). A systematic literature review of individuals' perspectives on broad consent and data sharing in the United States. Genetics in Medicine, 18(7), 663-671.

Golle, P. (2006). Revisiting the uniqueness of simple demographics in the US population. In Proceedings of the 5th ACM workshop on Privacy in electronic society (pp. 77-80). ACM.

Green, R. C., Lautenbach, D., & McGuire, A. L. (2015). GINA, genetic discrimination, and genomic medicine. New England Journal of Medicine, 372(5), 397-399.

Greenbaum, D., Sboner, A., Mu, X. J., & Gerstein, M. (2011). Genomics and privacy: implications of the new reality of closed data for the field. PLoS Comput Biol, 7(12), e1002278.

Greshake, B., Bayer, P. E., Rausch, H., & Reda, J. (2014). OpenSNP–a crowdsourced web resource for personal genomics. PLoS One, 9(3), e89204.

Grossklags, J., & Barradale, N. J. (2014, July). Social status and the demand for security and privacy. In International Symposium on Privacy Enhancing Technologies Symposium (pp. 83-101).

Grossklags, J., & Reitter, D. (2014, July). How task familiarity and cognitive predispositions impact behavior in a security game of timing. In Computer Security Foundations Symposium (CSF), 2014 IEEE 27th (pp. 111-122). IEEE.

Haga, S. B., & O’Daniel, J. (2011). Public perspectives regarding data-sharing practices in genomics research. Public health genomics, 14(6), 319-324.

Heeney, C., Hawkins, N., de Vries, J., Boddington, P., & Kaye, J. (2010). Assessing the privacy risks of data sharing in genomics. Public health genomics, 14(1), 17-25.

Helft, P. R., Champion, V. L., Eckles, R., Johnson, C. S., & Meslin, E. M. (2007). Cancer patients' attitudes toward future research uses of stored human biological materials. Journal of Empirical Research on Human Research Ethics, 2(3), 15-22.

Horton, J. J., Rand, D. G., & Zeckhauser, R. J. (2011). The online laboratory: Conducting experiments in a real labor market. Experimental Economics, 14(3), 399-425.

Hull, S. C., Sharp, R. R., Botkin, J. R., Brown, M., Hughes, M., Sugarman, J., & Wilfond, B. S. (2008). Patients' views on identifiability of samples and informed consent for genetic research. The American Journal of Bioethics, 8(10), 62-70.

Humbert, M., Ayday, E., Hubaux, J. P., & Telenti, A. (2013, November). Addressing the concerns of the Lacks family: quantification of kin genomic privacy. In Proceedings of the 2013 ACM SIGSAC conference on Computer & communications security (pp. 1141-1152). ACM.

Humbert, M., Ayday, E., Hubaux, J. P., & Telenti, A. (2015). Interdependent privacy games: the case of genomics (No. EPFL-REPORT-203825).

Kaufman, D., Murphy, J., Erby, L., Hudson, K., & Scott, J. (2009a). Veterans' attitudes regarding a database for genomic research. Genetics in Medicine, 11(5), 329-337.

Kaufman, D. J., Murphy-Bollinger, J., Scott, J., & Hudson, K. L. (2009b). Public opinion about the importance of privacy in biobank research. The American Journal of Human Genetics, 85(5), 643-654.

Lauber, C., Nordt, C., Falcato, L., & Rössler, W. (2003). Do people recognise mental illness?. European archives of psychiatry and clinical neuroscience, 253(5), 248-251.

Lauter, K., López-Alt, A., & Naehrig, M. (2014, September). Private computation on encrypted genomic data. In International Conference on Cryptology and Information Security in Latin America (pp. 3-27).

Lemke, A. A., Wolf, W. A., Hebert-Beirne, J., & Smith, M. E. (2010). Public and biobank participant attitudes toward genetic research participation and data sharing. Public Health Genomics, 13(6), 368-377.

Ludman, E. J., Fullerton, S. M., Spangler, L., Trinidad, S. B., Fujii, M. M., Jarvik, G. P., & Burke, W. (2010). Glad you asked: participants' opinions of re-consent for dbGap data submission. Journal of Empirical Research on Human Research Ethics, 5(3), 9-16.

Malin, B. A. (2005). An evaluation of the current state of genomic data privacy protection technology and a roadmap for the future. Journal of the American Medical Informatics Association, 12(1), 28-34.

Mason, W., & Suri, S. (2012). Conducting behavioral research on Amazon’s Mechanical Turk. Behavior research methods, 44(1), 1-23.

McCarty, C. A., Garber, A., Reeser, J. C., & Fost, N. C. (2011). Study newsletters, community and ethics advisory boards, and focus group discussions provide ongoing feedback for a large biobank. American journal of medical genetics Part A, 155(4), 737-741.

McGuire, A. L., Oliver, J. M., Slashinski, M. J., Graves, J. L., Wang, T., Kelly, P. A., ... & Treadwell-Deering, D. (2011). To share or not to share: a randomized trial of consent for data sharing in genome research. Genetics in Medicine, 13(11), 948-955.

Miller, A. R., & Tucker, C. (2009). Privacy protection and technology diffusion: The case of electronic medical records. Management Science, 55(7), 1077-1093.

Naveed, M., Ayday, E., Clayton, E. W., Fellay, J., Gunter, C. A., Hubaux, J. P., Malin, B. A., & Wang, X. (2015). Privacy in the genomic era. ACM Computing Surveys (CSUR), 48(1), 6.

National Center for Biotechnology Information. (2007). Understanding Human Genetic Variation. Retrieved from https://www.ncbi.nlm.nih.gov/books/NBK20363/

National Conference of State Legislatures. (2008). Genetic Privacy Laws. Retrieved from http://www.ncsl.org/research/health/genetic-privacy-laws.aspx

National Institutes of Health. (2016). Genetic Discrimination. Retrieved from https://www.genome.gov/10002077/genetic-discrimination/

Office of History, National Institutes of Health. (2016). Deciphering the Genetic Code. Retrieved from https://history.nih.gov/exhibits/nirenberg/HS1_mendel.htm

Oliver, J. M., Slashinski, M. J., Wang, T., Kelly, P. A., Hilsenbeck, S. G., & McGuire, A. L. (2011). Balancing the risks and benefits of genomic data sharing: genome research participants’ perspectives. Public Health Genomics, 15(2), 106-114.

Peck, P., & Cox, L. (2009, December 17). The Top 10 Medical Advances of the Decade. ABC News. Retrieved from http://abcnews.go.com/Health/Decade/genome-hormones-top-10-medical-advances-decade/story?id=9356853

Pentz, R. D., Billot, L., & Wendler, D. (2006). Research on stored biological samples: views of African American and White American cancer patients. American Journal of Medical Genetics Part A, 140(7), 733-739.

Phillips, A. M. (2015, May). Genomic Privacy and Direct-to-Consumer Genetics: Big Consumer Genetic Data--What's in that Contract?. In Security and Privacy Workshops (SPW), 2015 IEEE (pp. 60-64). IEEE.

Platt, J., Bollinger, J., Dvoskin, R., Kardia, S. L., & Kaufman, D. (2013). Public preferences regarding informed consent models for participation in population-based genomic research. Genetics in Medicine, 16(1), 11-18.

Pu, Y., & Grossklags, J. (2016). Towards a Model on the Factors Influencing Social App Users’ Valuation of Interdependent Privacy. Proceedings on Privacy Enhancing Technologies, 2016(2), 61-81.

Pu, Y., & Grossklags, J. (2015). Using conjoint analysis to investigate the value of interdependent privacy in social app adoption scenarios. International Conferences on Information Systems (ICIS).

Pulley, J. M., Brace, M. M., Bernard, G. R., & Masys, D. R. (2008). Attitudes and perceptions of patients towards methods of establishing a DNA biobank. Cell and tissue banking, 9(1), 55-65.

Rahm, A. K., Wrenn, M., Carroll, N. M., & Feigelson, H. S. (2013). Biobanking for research: a survey of patient population attitudes and understanding. Journal of community genetics, 4(4), 445-450.

Rodriguez, L. L., Brooks, L. D., Greenberg, J. H., & Green, E. D. (2013). The complexities of genomic identifiability. Science, 339(6117), 275-276.

Ross, J., Irani, L., Silberman, M., Zaldivar, A., & Tomlinson, B. (2010, April). Who are the crowdworkers?: shifting demographics in mechanical turk. In CHI'10 extended abstracts on Human factors in computing systems (pp. 2863-2872). ACM.

Schwartz, M. D., Rothenberg, K., Joseph, L., Benkendorf, J., & Lerman, C. (2001). Consent to the use of stored DNA for genetics research: a survey of attitudes in the Jewish population. American journal of medical genetics, 98(4), 336-342.

Simon, C. M., L'heureux, J., Murray, J. C., Winokur, P., Weiner, G., Newbury, G., Shinkunas, L., & Zimmerman, B. (2011). Active choice but not too active: public perspectives on biobank consent models. Genetics in Medicine, 13(9), 821-831.

Skloot, R. (2011). The immortal life of Henrietta Lacks. Broadway Books.

Smith, S., Nielson, P., & Kennedy, B. (2011). Genetic Privacy Laws: 50 State Survey. Retrieved from https://www.healthlawyers.org/hlresources/Public%20Documents/50state_chart_final.pdf

Stuntz, C. (2010). What is Homomorphic Encryption, and Why Should I Care?

Sweeney, L. (2000). Simple demographics often identify people uniquely. Health (San Francisco), 671, 1-34.

Sweeney, L., Abu, A., & Winn, J. (2013). Identifying participants in the personal genome project by name. Working Paper.

Telenti, A., Ayday, E., & Hubaux, J. P. (2014). On genomics, kin, and privacy. F1000Research, 3.

Thiel, D. B., Platt, T., Platt, J., King, S. B., & Kardia, S. L. (2014). Community perspectives on public health biobanking: an analysis of community meetings on the Michigan BioTrust for Health. Journal of community genetics, 5(2), 125-138.

Trope, Y., & Liberman, N. (2003). Temporal construal. Psychological review, 110(3), 403.

Vellinga, A., Cormican, M., Hanahoe, B., Bennett, K., & Murphy, A. W. (2011). Opt-out as an acceptable method of obtaining consent in medical research: a short report. BMC medical research methodology, 11(1), 40.

Weidman, J., Aurite, W. R., & Grossklags, J. (2016). Understanding Interdependent Privacy Concerns and Likely Use Factors for Genetic Testing: A Vignette Study. 3rd International Workshop on Genome Privacy and Security (GenoPri’16).

Wendler, D., & Emanuel, E. (2002). The debate over research on stored biological samples: what do sources think?. Archives of internal medicine, 162(13), 1457-1462.

Yi, X., Paulet, R., & Bertino, E. (2014). Homomorphic encryption and applications (Vol. 3). Berlin: Springer.

Appendix A

Details of Survey 1: Understanding Interdependent Privacy Concerns and

Likely Use Factors for Genetic Testing: A Vignette Study

Before beginning the survey, please consider the following: Over the past several years the rapid

advancement of technology has made the sequencing and analysis of an individual’s genome

widely available and affordable. The benefits of such genomic testing include: diagnosing disease,

identifying gene changes that are responsible for disease, and identifying gene changes that could

be passed to children. Unlike other types of data, genomic data describes an individual’s own

health and behavior, contains information about blood relatives, and remains unchanged over time.

These shared genetic properties are also shared with other members of the family as well. As such,

if one generation of your family has a pre-disposition to heart disease, it is likely that the next will

have a similar predisposition. Therefore, when you decide to share genetic information about

yourself, you are also deciding to share genetic information about your relatives as well. Hence,

the decision to receive a genomic test contains interdependent privacy considerations. That is, the

privacy of a given individual is bound to be affected by the decisions of others. With this in mind,

please read and carefully consider the following questions.

Q53 What is your age?

Q52 What is your gender?

○ Male (1)
○ Female (2)
○ Other (3)
○ Prefer not to answer (4)

Q56 What is your highest completed level of education?

○ Less than high school (11)
○ High school graduate (12)
○ Some college (13)
○ 2 year degree (14)
○ 4 year degree (15)
○ Professional degree (16)
○ Doctorate (17)

Q58 How informed are you about genetic testing?

○ Very uninformed (1)
○ Somewhat uninformed (2)
○ Neither informed or uninformed (3)
○ Somewhat informed (4)
○ Very informed (5)

Q51 Have you ever had a genetic test conducted?

○ Yes (1)
○ No (2)
○ Prefer not to say (3)

Q55 Are you concerned or worried about being diagnosed with cancer or some kind of genetic

disorder?

○ Extremely concerned (1)
○ Somewhat concerned (2)
○ Neither concerned or unconcerned (3)
○ Somewhat unconcerned (4)
○ Extremely unconcerned (5)

Q50 Have any of your family members had a genetic test conducted?

○ Yes (1)
○ No (2)
○ I don't know (4)
○ Prefer not to say (3)

Q49 Does cancer or any genetic predisposition/disorder run in your family?

○ Yes (1)
○ No (2)
○ I don't know (4)
○ Prefer not to say (3)

Q54 Are you concerned or worried about being diagnosed with a disease that your family

member has had?

○ Yes (1)
○ No (2)
○ Prefer not to say (3)

Q57 How likely are you to ever participate in genomic testing services?

○ Extremely unlikely (23)
○ Somewhat unlikely (24)
○ Neither likely nor unlikely (25)
○ Somewhat likely (26)
○ Extremely likely (27)

Q70 Please read the following scenario below and answer the questions that follow.

John is middle-aged man who works as a mid-level manager at Western Frontier Software. At his

position, John is well respected for being a “big picture” decision maker, who always establishes

a solid overview of goals to solve complex problems. Although it sometimes ends up going poorly,

he also tends to make “gut decisions” and focus on the essential information when making these

choices. Recently, John has been thinking a lot about his health, and has heard of several (new,

unknown/established) companies that now offer genetic tests. These genetic tests could be used to

potentially review information about himself that doctors may not be able to find until any medical

problems emerge, if they ever did. John knows that several of his (immediate/extended) family

members have a (low/high) proclivity for genetic problems, such as cancer, deafness, and other

diseases. Upon further research into these companies, John finds that he can take a genetic test for

(under $100/over $100). In addition to this, John learns that after genetic tests have been

conducted, each company anonymizes the results and submits them to the National Institutes of

Health. If enough genetic tests are submitted, the NIH believes it has a (low/high) chance of finding

cures for a number of diseases.

Q2 John intends to sign up for the service and have his genome tested rather than not sign up for

the service.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q4 John intends to recommend the service to his family and friends, rather than not recommend

the service.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q40 What company does John work for?

○ McDonald's Corporation (1)
○ Vineyard Vines (2)
○ Western Frontier Software (3)
○ Cryogenic Software (4)
○ WestCoast Offense Software (5)

Q3 John’s intentions are to use the genetic testing service rather than any alternative means (such

as see a geneticist, or other specialist in-person).

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q37 Using the genetic testing services will enhance John’s personal health awareness.

○ Strongly disagree (17)
○ Somewhat disagree (18)
○ Neither agree nor disagree (19)
○ Somewhat agree (20)
○ Strongly agree (21)

Q48 For quality control, please select the number seventy-seven below.

○ 1 (1)
○ 77 (2)
○ 153 (3)
○ 22 (4)
○ 305 (5)

Q6 John should be concerned about his genomic privacy when he submits his genomic data to

the company.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q7 John should be concerned about his relatives' genomic privacy when he submits his genomic

data to the company.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q8 John believes that the potential sacrifice of his personal genomic privacy is worth getting a

genomic test.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q5 John’s intentions are to recommend the genetic testing service to his friends and family,

rather than any other alternative means (such as see a geneticist, or other specialist in-person).

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q9 John believes the genetic testing service will manage his data securely.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q39 John should be concerned about his genomic privacy if his relatives submit their genomic

data to the company.

○ Strongly disagree (16)
○ Somewhat disagree (17)
○ Neither agree nor disagree (18)
○ Somewhat agree (19)
○ Strongly agree (20)

Q59 How well did you feel you understood the scenario above?

○ I did not understand the scenario at all (1)
○ I had some trouble understanding the scenario (2)
○ I found it neither difficult or easy to understand the scenario. (3)
○ I was able to understand the scenario without much trouble. (4)
○ I had no trouble understanding the scenario. (5)

Q60 How realistic did you find the scenario above to be?

○ Not realistic at all (1)
○ Somewhat unrealistic (2)
○ Neither realistic or unrealistic (3)
○ Somewhat realistic (4)
○ Extremely realistic (5)

Q61 How difficult or easy would it be for you to place yourself in the hypothetical scenario

above?

○ Extremely difficult (23)
○ Somewhat difficult (24)
○ Neither easy nor difficult (25)
○ Somewhat easy (26)
○ Extremely easy (27)

Q71 Please read the following scenario below and answer the questions that follow.

Q10 Carlton is middle-aged man who works as a mid-level manager at the Cryodine Software

Company. At his position, Carlton is well respected for being someone who is a very diligent

decision maker, who always wants to consider all of the details of a problem before attempting to

solve it. Although it sometimes causes his superiors some frustration, he tends to decide

everything in a very patient, rational manner, and is heavily detail focused. Recently, Carlton has

been thinking a lot about his health, and has heard of several (new, unknown/established)

companies that now offer genetic tests. These genetic tests could be used to potentially review

information about himself that doctors may not be able to find until any medical problems

emerge, if they ever did. Carlton knows that several of his (immediate/extended) family

members have a (low/high) proclivity for genetic problems, such as cancer, deafness, and other

diseases. Upon further research into these companies, Carlton finds that he can take a genetic test

for (under $100/over $100). In addition to this, Carlton learns that after genetic tests have been

conducted, each company anonymizes the results and submits them to the National Institutes of

Health. If enough genetic tests are submitted, the NIH believes it has a (low/high) chance of

finding cures for a number of diseases.

Q12 Carlton intends to sign up for the service and have his genome tested rather than not sign up

for the service.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q15 Carlton’s intentions are to recommend the genetic testing service to his friends and family,

rather than any other alternative means (such as see a geneticist, or other specialist in-person).

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q46 What company does Carlton work for?

○ Cyberdive Software (1)
○ Corporate Software Inc. (2)
○ Burger King (3)
○ Cryptic Software (4)
○ Cryodine Software (5)

Q13 Carlton’s intentions are to use the genetic testing service rather than any alternative

means, (such as see a geneticist, or other specialist in-person).

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q16 Carlton should be concerned about his genomic privacy when he submits his genomic data

to the company.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q47 For quality control, please select the number seven below.

○ 22 (1)
○ 86 (2)
○ 3 (3)
○ -80 (4)
○ 7 (5)

Q17 Carlton should be concerned about his relatives' genomic privacy when he submits his

genomic data to the company.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q43 John should be concerned about his genomic privacy if his relatives submit their genomic

data to the company.

○ Strongly disagree (16)
○ Somewhat disagree (17)
○ Neither agree nor disagree (18)
○ Somewhat agree (19)
○ Strongly agree (20)

Q18 Carlton believes that the potential sacrifice of his personal genomic privacy is worth getting

a genomic test.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q14 Carlton intends to recommend the service to his family and friends, rather than not

recommend the service.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q19 Carlton believes the genetic testing service will manage his data securely.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q42 Using the genetic testing services will enhance Carlton’s personal health awareness.

○ Strongly disagree (18)
○ Somewhat disagree (19)
○ Neither agree nor disagree (20)
○ Somewhat agree (21)
○ Strongly agree (22)

Q63 How well did you feel you understood the scenario above?

○ I did not understand the scenario at all (1)
○ I had some trouble understanding the scenario (2)
○ I found it neither difficult or easy to understand the scenario. (3)
○ I was able to understand the scenario without much trouble. (4)
○ I had no trouble understanding the scenario. (5)

Q64 How realistic did you find the scenario above to be?

○ Not realistic at all (1)
○ Somewhat unrealistic (2)
○ Neither realistic or unrealistic (3)
○ Somewhat realistic (4)
○ Extremely realistic (5)

Q65 How difficult or easy would it be for you to place yourself in the hypothetical scenario

above?

○ Extremely difficult (23)
○ Somewhat difficult (24)
○ Neither easy nor difficult (25)
○ Somewhat easy (26)
○ Extremely easy (27)

Q20 Imagine for a moment that Tom has received the following email from a genetic testing

services provider:

Dear Tom,

Last week, attackers were able to breach our servers obtaining access

to user accounts. They stole (NumRecords) records with (namesAttached/NotAttached).

(FamMember) data was compromised in this breach. We are informing you of this breach because

genetic information reveals a great deal about the health history, medical pre-dispositions, and

disease carrier status of the test taker. Since family members share genetic information, and to

some extent, propensity for disease, this data breach could reveal some genetic information about

(you/family members**) as well. Our records indicate that (FamMember) a history of (Disease

propensity) genetic disorders. It is not yet clear what the party affiliated with the breach might do

with this data, but the potential exists for this information to reach insurers, increasing the costs

associated with long term, life, and disability insurance.

Q21 Tom believes that this data breach has the potential to change his health insurance policy.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q23 Tom should be concerned about his genomic privacy becoming available to a third party

due to this data breach.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q24 Tom should be concerned about his relatives' genomic privacy becoming available to a third

party due to this data breach.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q44 Who is the email in this story addressed to?

○ Tom (1)
○ Timmy (2)
○ Theodore (3)
○ Teddy (4)
○ Tiberius (5)

Q25 Tom believes that the potential sacrifice of his personal genomic privacy was worth getting

a genomic test.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q26 Tom believes that the potential sacrifice of his family member’s genomic privacy was worth

it for them to receive a genomic test.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q27 Tom blames the genetic testing service provider for the data breach.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q45 In the story above, how many records were stolen?

○ 0-1,000 (1)
○ 225,000-1,000,000+ (2)
○ 1,000,000+ (3)
○ 15,000-100,000 (4)
○ 1,000-10,000 (5)

Q28 Tom feels concern for himself because of the stolen data.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q29 Tom feels concern for his family members because of the stolen data.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q30 Tom blames the attackers for the data breach.

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q67 How well did you feel you understood the scenario above?

○ I did not understand the scenario at all (1)
○ I had some trouble understanding the scenario (2)
○ I found it neither difficult or easy to understand the scenario. (3)
○ I was able to understand the scenario without much trouble. (4)
○ I had no trouble understanding the scenario. (5)

Q68 How realistic did you find the scenario above to be?

○ Not realistic at all (1)
○ Somewhat unrealistic (2)
○ Neither realistic or unrealistic (3)
○ Somewhat realistic (4)
○ Extremely realistic (5)

Q69 How difficult or easy would it be for you to place yourself in the hypothetical scenario above?

○ Extremely difficult (23)
○ Somewhat difficult (24)
○ Neither easy nor difficult (25)
○ Somewhat easy (26)
○ Extremely easy (27)

Appendix B

Details of Survey 2: A vignette study on personal and interdependent privacy concerns, and sharing intentions for genetic data

Before beginning the survey, please consider the following: Over the past several years the rapid advancement of technology has made the sequencing and analysis of an individual’s genome widely available and affordable. The benefits of such genomic testing include: diagnosing disease, identifying gene changes that are responsible for disease, and identifying gene changes that could be passed to children. Unlike other types of data, genomic data describes an individual’s own health and behavior, contains information about blood relatives, and remains unchanged over time. These genetic properties are also shared with other members of the family as well. As such, if one generation of your family has a pre-disposition to heart disease, it is likely that the next will have a similar predisposition. Therefore, when you decide to share genetic information about yourself, you are also deciding to share genetic information about your relatives. Hence, the decision to receive a genomic test contains interdependent privacy considerations. That is, the privacy of a given individual is bound to be affected by the decisions of others.

First, you will be asked to fill out a brief series of demographic questions. Following this, please keep this introduction paragraph in mind when reading the presented scenarios.

Q1 What is your gender?

○ Male (1)
○ Female (2)
○ Other (3)

Q3 What is your race/ethnicity?

○ White (1)
○ Black or African American (2)
○ Hispanic (3)
○ American Indian or Alaska Native (4)
○ Asian (5)
○ Native Hawaiian or Pacific Islander (6)
○ Other (7)

Q4 What is your age?

○ Under 18 (1)
○ 18 - 24 (2)
○ 25 - 34 (3)
○ 35 - 44 (4)
○ 45 - 54 (5)
○ 55 - 64 (6)
○ 65 - 74 (7)
○ 75 - 84 (8)
○ 85 or older (9)

Q5 What is your annual income?

○ Less than $10,000 (1)
○ $10,000 - $19,999 (2)
○ $20,000 - $29,999 (3)
○ $30,000 - $39,999 (4)
○ $40,000 - $49,999 (5)
○ $50,000 - $59,999 (6)
○ $60,000 - $69,999 (7)
○ $70,000 - $79,999 (8)
○ $80,000 - $89,999 (9)
○ $90,000 - $99,999 (10)
○ $100,000 - $149,999 (11)
○ More than $150,000 (12)

Q6 What is your level of education?

○ Less than high school (1)
○ High school graduate (2)
○ Some college (3)
○ 2 year degree (4)
○ 4 year degree (5)
○ Professional degree (6)
○ Doctorate (7)

Q7 Have you ever taken a genetic test?

○ Yes (1)
○ No (2)

Q30 For quality control purposes, please select the number 'thirty-three' below.

○ 72 (1)
○ 38 (2)
○ 9 (3)
○ 33 (4)

Q8 Have you ever considered taking a genetic test?

○ Yes (1)
○ No (2)

Q9 Have you known anyone (non-family member) who has taken a genetic test?

○ Yes (1)
○ No (2)
○ I don't know (3)

Q10 To your knowledge, has another member of your family taken a genetic test?

○ Yes (1)
○ No (2)
○ I don't know (3)

Q11 Do you have a sibling?

○ Yes (1)
○ No (2)

Q12 Please complete the following questions about things you may or may not have done in the past.

|  | Never (1) | Once (2) | More than Once (3) | Often (4) | Very Often (5) |
|---|---|---|---|---|---|
| I have covered the purchase of lunch for a friend. (1) | ○ | ○ | ○ | ○ | ○ |
| I have given directions to a stranger. (2) | ○ | ○ | ○ | ○ | ○ |
| I have donated goods or clothes to a charity. (3) | ○ | ○ | ○ | ○ | ○ |
| I have done volunteer work for a charity or other organization (4) | ○ | ○ | ○ | ○ | ○ |
| I have delayed an elevator and held the door open for a stranger. (5) | ○ | ○ | ○ | ○ | ○ |
| I have donated blood. (6) | ○ | ○ | ○ | ○ | ○ |
| I have helped a friend or acquaintance move homes/apartments. (7) | ○ | ○ | ○ | ○ | ○ |
| I have given money to a charity. (8) | ○ | ○ | ○ | ○ | ○ |
| I have allowed someone to go ahead of me in a lineup (at a coffee shop, in the supermarket). (9) | ○ | ○ | ○ | ○ | ○ |
| I have offered my seat on a bus or train to a stranger who was elderly. (10) | ○ | ○ | ○ | ○ | ○ |

Q13 Scenario X: You have been thinking about your health. In online searches about how to be healthier, you came across and read about the recent advances in genetic testing. In your reading, you have learned that genetic tests can inform you about possible genetic traits, conditions, or abnormalities that you, or your family could have. After careful consideration, you decide to take a genetic test. A few weeks later, you receive your results back, complete with information about your entire genome, as well as information on familial traits and possible genetic abnormalities. Through these results, you were informed that you (you carry/don’t carry- GEM) a(any) genetic disease that could negatively affect your life. A few days after receiving your results, you receive a verified email from the genetic testing company stating that a(n) (WDS) has asked you, and others, to share your genetic information, specifically your genome with (IDL). The (WDS) claims that sharing this information could help them discover possible treatments, unexplored genetic interactions, and even possible cures.

Q28 After reading the above scenario carefully, please respond to the questions below. Feel free to reference this story again at any time to complete your responses.

Q14 To what extent do you agree that you could trust the (WDS) to manage your genetic data responsibly?

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q15 How likely is the (WDS) to abuse your data in some way (i.e. share with another institution without your permission, publish your data publicly without your permission, etc.)?

○ Extremely unlikely (1)
○ Somewhat unlikely (2)
○ Neither likely nor unlikely (3)
○ Somewhat likely (4)
○ Extremely likely (5)

Q16 What is the likelihood of you submitting your data to the (WDS)?

○ Extremely unlikely (1)
○ Somewhat unlikely (2)
○ Neither likely nor unlikely (3)
○ Somewhat likely (4)
○ Extremely likely (5)

Q17 If you would share your data, what would be your level of concern for the privacy of your family members?

○ Highly concerned (1)
○ Slightly concerned (2)
○ Neither concerned or not unconcerned (3)
○ Slightly unconcerned (4)
○ Not concerned at all (5)

Q31 For quality control reasons, please select the answer that is labeled as 'yellow' below.

○ Yellow (1)
○ Purple (2)
○ Yams (3)
○ Green (4)

Q18 If you would share your data, what would be your level of concern for the personal privacy of your friends or colleagues (non-family members)?

○ Highly concerned (1)
○ Slightly concerned (2)
○ Neither concerned or not unconcerned (3)
○ Slightly unconcerned (4)
○ Not concerned at all (5)

Q19 If you would share your data, what would be your level of concern for your own privacy?

○ Highly concerned (1)
○ Slightly concerned (2)
○ Neither concerned or not unconcerned (3)
○ Slightly unconcerned (4)
○ Not concerned at all (5)

Q32 Do you think your family member(s) would recommend that you share your genetic data?

○ Definitely not (1)
○ Probably not (2)
○ Might or might not (3)
○ Probably yes (4)
○ Definitely yes (5)

Q20 Scenario Z: After thinking about it for some time, your (WTT) has just completed (her/his) first genetic test. It was found that they (carry/don’t carry-GEM) a(any) genetic disease that could negatively affect her/his life. Your (WTT) just let you know that they received a verified email from the genetic testing company asking them, and others, to share her/his genetic data with (WDS). This (WDS) claims that sharing her/his genetic data could help them to discover possible treatments, unknown genetic interactions, and even possible cures. The (WDS) will (have/not have- IDL) your (WTT)’s name affiliated with the data. Your (WTT) has come to you to ask for your opinions and advice on this matter.

Q29 After reading the above scenario carefully, please respond to the questions below. Feel free to reference this story again at any time to complete your responses.

Q21 To what extent do you agree that you could trust the (WDS) to manage your (WTT)’s genetic data responsibly?

○ Strongly disagree (1)
○ Somewhat disagree (2)
○ Neither agree nor disagree (3)
○ Somewhat agree (4)
○ Strongly agree (5)

Q22 How likely, in your opinion, is the (WDS) to abuse your (WTT)’s data in some way (i.e. share to another institution without your permission, publish your data publicly, etc.)?

○ Extremely unlikely (1)
○ Somewhat unlikely (2)
○ Neither likely nor unlikely (3)
○ Somewhat likely (4)
○ Extremely likely (5)

Q23 Would you recommend that your (WTT) should share her/his genetic data?

○ Definitely not (1)
○ Probably not (2)
○ Might or might not (3)
○ Probably yes (4)
○ Definitely yes (5)

Q24 If your (WTT) shares her/his genetic data, what would be your level of concern for your personal privacy?

○ Highly concerned (1)
○ Slightly concerned (2)
○ Neither concerned or not unconcerned (3)
○ Slightly unconcerned (4)
○ Not concerned at all (5)

Q25 If your (WTT) shares her/his genetic data, what would be your level of concern for her/his personal privacy?

○ Highly concerned (1)
○ Slightly concerned (2)
○ Neither concerned or not unconcerned (3)
○ Slightly unconcerned (4)
○ Not concerned at all (5)

Q26 If your (WTT) would share her/his genetic data, what would be your level of concern for the personal privacy of her/his family members?

○ Highly concerned (1)
○ Slightly concerned (2)
○ Neither concerned or not unconcerned (3)
○ Slightly unconcerned (4)
○ Not concerned at all (5)

Q27 If your (WTT) would share her/his data, what would be your level of concern for the personal privacy of her/his other friends or colleagues (non-family members)?

○ Highly concerned (1)
○ Slightly concerned (2)
○ Neither concerned or not unconcerned (3)
○ Slightly unconcerned (4)
○ Not concerned at all (5)