Equity Weighting in the Economic Evaluation of Healthcare
Richard Norman
Centre for Health Economics Research and Evaluation (CHERE), University of
Technology, Sydney, PO BOX 123, Broadway, Sydney 2007
Contact details: Telephone (02) 9514 4732; Email [email protected]
Abstract
Outcome measurement in economic evaluation of healthcare considers outcomes
independent of to whom they accrue. This paper reports on a discrete choice
experiment eliciting population preferences regarding the allocation of health gain
between groups of potential patients. A random-effects probit model is estimated, and
converted into equity weights for use in economic evaluation. On average, the
modelling predicts relatively high social value on health gains for non-smokers,
carers, those with a low income and those with an expected age of death less than 45
years. For decision-makers, whether a formal equity weighting system represents an
improvement on more informal approaches to weighing up equity and efficiency
concerns remains uncertain.
Preface
Thesis title: Outcome valuation in the economic evaluation of healthcare
Supervisors: Professor Jane Hall, A/Professor Rosalie Viney, Professor Debbie Street
Economic evaluation of healthcare interventions (such as pharmaceuticals, medical devices and technologies) considers both the effect of the intervention on patients, and the costs borne by the government and often the individual themselves. This simultaneous consideration of costs and benefits is now standard practice in reimbursement decisions, both in Australia and elsewhere. This thesis focuses on the assessment of benefits, specifically how we place a value on the health changes patients experience as a result of a health care intervention.
There is a well-established framework for how outcomes are valued in health care, but this framework is built on a number of contentious assumptions. For example, health is assumed to be the sole outcome of a healthcare system, and society is assumed to be inequality-neutral. This thesis identifies and explains these assumptions and then focuses on testing two of them in the empirical chapters. The empirical chapters in this thesis consider these issues, using a discrete choice experiment (DCE).
The thesis demonstrates how these concerns might be overcome by augmenting the existing decision-making framework with relatively easily-collected stated preference data, and offers a template for other analyses exploring other parts of how health outcomes should be valued.
The thesis takes the following form:
Chapter I: The measurement of outcomes in economic evaluation of health interventions
Chapter II: Measuring health-related quality of life – standard and novel approaches
Chapter III: Discrete choice experiments: Principles and application for health gain
Chapter IV: Some principles for designing discrete choice experiments
Chapter V: Using a discrete choice experiment to value health profiles in the SF-6D
Chapter VI: Equity weights for use in economic evaluation
Chapter VII: Conclusions and implications
This paper presents the findings from Chapter VI, reporting a DCE exploring preferences around distribution of health, and suggesting how analysts might use these results in policy decisions.
Introduction
Economic evaluation of new healthcare interventions is increasingly mandated in
decisions relating to government and insurer reimbursement. Unlike neo-classical
welfare economics, it is often the case that health rather than utility is considered to be
the central outcome used to evaluate the appropriateness of possible health
expenditure. The motivation for health to be considered as distinct from other areas of
economic evaluation is reflected in the specific egalitarianism of Tobin (1970):
“This is the view that certain specific scarce commodities should be
distributed less unequally than the ability to pay for them. Candidates for
such sentiments include basic necessities of life, health, and citizenship.”
(p.263)
The dominant approach to economic evaluation employs a criterion similar to Kaldor-
Hicks, but places health as the central desideratum. It assumes the policy maker will
choose the course of action which maximises total health. This is noteworthy in that it
entails that the standard application of the Kaldor-Hicks criterion is redundant, in that
a utility-maximising intervention does not necessarily maximise health. The retention
of the Kaldor-Hicks criterion requires an assumption that the marginal utility of health
is constant and equal across individuals. Simple maximisation of health is relatively
straightforward to apply, but represents a considerable constraint on preferences, even
if we are satisfied to adopt health as the central concept of importance.
Williams (1997) set out the ‘Fair Innings’ argument and its application in equity
weighting in a seminal paper. He highlighted the fundamental difference
between death at 25 and death at 85, and proposed weighting gains that accrue to
those unlikely to make some threshold life expectancy (a Fair Innings) more highly.
This work has been built on, with a series of studies showing that the assumptions
implicit in the health-maximising QALY (or life year (LY)) model are not consistent
with the choices demonstrated when community respondents are surveyed. This paper
addresses one of these concerns, namely the assumption that the value people place on an outcome
(be it LY, QALY or something else) does not depend on the recipient to whom it
accrues (see Dolan (2005) for a review of the broader area). One solution to this is to
weight these outcomes according to the person receiving them. If society values the
health gains accruing to a particular group relatively highly, the equity weight for that
group exceeds one, all outcomes gained by that group are multiplied by that factor,
and an incremental cost-effectiveness ratio (ICER) for an intervention in that group
would be relatively lower than without equity weights. There are two distinct (but
linked) sources of explanation for equity weights which can best be expressed using
indifference curves as shown in Figure 1, namely aversion to inequality and a
preference for discrimination independent from inequality.
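The effect of an equity weight on an ICER, as described above, can be illustrated with a minimal sketch. The function name and the cost and outcome figures are hypothetical and chosen only for illustration; they are not taken from the study.

```python
def equity_weighted_icer(delta_cost, delta_outcome, equity_weight=1.0):
    """Incremental cost per equity-weighted unit of outcome (e.g. per QALY).

    delta_cost and delta_outcome are the incremental cost and incremental
    outcome of the intervention; equity_weight multiplies the outcome.
    """
    weighted_outcome = delta_outcome * equity_weight
    if weighted_outcome == 0:
        raise ValueError("weighted incremental outcome must be non-zero")
    return delta_cost / weighted_outcome


# Hypothetical intervention: 30,000 extra cost for 2 extra QALYs.
unweighted = equity_weighted_icer(30_000, 2.0)    # weight of 1 (standard approach)
weighted = equity_weighted_icer(30_000, 2.0, 1.5)  # hypothetical weight above 1

print(unweighted, weighted)
```

With a weight above one the weighted ICER falls below the unweighted one, so the intervention looks more cost-effective for the favoured group; a weight below one has the opposite effect.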
Figure 1: Indifference curves under different sets of societal preferences
In a standard application of the LY or QALY model, the societal indifference curve
would be blind to the distribution of health, instead focusing on total health. This is
represented by the straight black indifference curve. However, this may not best
reflect the views of the community. Firstly, it is likely that the average member of
society is inequality averse (Williams, 1997). If this is the case, the indifference curve
in terms of expected age of death or total life expectancy (LE) is convex with respect
to the origin. Thus, the indifference curve moves from the black line (at which the
distribution of outcome is irrelevant to societal utility) to the dotted black line where
society is willing to sacrifice some total LE across the two groups for less inequality,
represented by the area between the two curves. At this point, it is possible to
constrain the utility function to be monotonic. This would imply that health gain
to a group cannot be negative in terms of social utility. However, in this study, I chose
not to make such an assumption as a non-monotonic utility function might simply
reflect extreme inequality aversion.
[Figure 1 axes: QALE (Group A) and QALE (Group B), with the line of equality and points g and h marked.]
Under this dotted black indifference curve, the society is still indifferent to the
characteristics of the individual beyond their expectation of total (quality adjusted)
life expectancy. Thus, point g and h, which have the same total health endowment and
degree of inequality, are valued equally. However, there may be characteristics which
impact on societal preferences for allocation of health gain beyond this, including
non-health characteristics such as gender, income etc. If society is willing to discriminate
on these other characteristics, the solid grey indifference curve is possible in which
societal aversion to inequality depends on who is relatively disadvantaged (i.e. it is
non-symmetrical around the line of equality).
Olsen et al. (2003) identified existing studies which give a variety of individual
characteristics that may impact on the societal valuation of the health gain accruing to
that individual. It should be noted that this is somewhat limited as the characteristics
given by Olsen et al. have generally been considered in isolation; respondents may be
imputing other characteristics of the healthcare recipients which are correlated with
the characteristic of interest. For example, people might be willing to discriminate
against women as they assume that women live longer (and not because they
intrinsically prefer outcomes accruing to males). Discrete choice experiments (DCE)
are a useful tool to investigate societal preferences for health allocation. If the DCE is
constructed appropriately, it can be used to identify the impact of individual
characteristics (e.g. gender, age etc) independent of all others.
Existing evidence on societal value of health gains accounting for this (by
investigating multiple attributes simultaneously) is limited. In Australia, formal
inclusion of equity weighting into economic evaluation for reimbursement decisions
is not mandated, perhaps reflecting the paucity of evidence in the area. One study was
limited to a set of undergraduate student respondents (Schwappach, 2003). A more
recent study has used a discrete choice experiment (DCE) to generate distributional
weights (Lancsar, et al., 2011). Interestingly, this study concluded that the weighting
of outcomes is generally not advisable, except in a small number of situations and in
those cases the impact of using weights is relatively small. The conclusions from
DCEs are potentially sensitive to the way the question is asked, the choice of
dimensions and levels, the sample, and the method of analysis. As our work differs in
each of these dimensions, it is of interest to identify if the finding of Lancsar et al. is
replicated.
Thus, the aim of the paper is to identify some key characteristics which might impact
on the societal valuation of health gains accruing to groups with those characteristics;
to conduct a discrete choice experiment which can identify the effect of each of the
characteristics independent of the others; to explore (and potentially explain)
heterogeneous responses to the discrete choice experiment; and to produce a set of
equity weights for these characteristics for use in economic evaluation.
Methods
Dimensions and levels
Identification of appropriate dimensions and levels took place through considering the
dimensions with some published evidence, and piloting some suggested levels in a
small discrete choice experiment (Norman and Gallego, 2008). The major source of
dimensions with supporting evidence was a review by Olsen et al. (2003) which
suggested a number of possible dimensions that might be important, dividing them into
those relating to a person’s relation to others, those relating to their illness, and those
relating to their self. From these, a smaller set were selected with the aim of including
some from each of these three categories. The selected dimensions were gender,
smoking status, income (or socio-economic status), whether the individual maintained
a healthy lifestyle, carer status and total life expectancy. This is not an exhaustive set
of characteristics over which people might discriminate; rather, these are a
convenient and obvious set which can help to identify the degree to which people
agree or disagree with the standard health-maximising approach. It was decided to
limit the results to this subset as including extra dimensions has the potential to
significantly impact on the number of choice sets required. Therefore, the results
presented here should not be interpreted as claiming that there are no other
characteristics which might impact on preferences, only that, over a set of obvious
candidates, respondents either do or do not correspond to the assumptions of the
QALY model.
Two prominent dimensions are not included in the experiment, namely severity and
current age. Severity was omitted to limit the complexity of the experiment for the
respondent. The choice sets in the experiment were cognitively challenging and it was
decided that the benefit of including an additional health profile would be outweighed
by the increased difficulty of the task. Therefore, it was implicitly assumed that
attitudes to inequality in life expectancy are translatable into attitudes to inequality in
quality-adjusted life expectancy.
Age was excluded as the interaction between age and life expectancy meant that there
would either be a considerable number of implausible health states (e.g. old age, high
remaining life expectancy), or a very narrow range of levels for one or both of age and
remaining life expectancy. It was decided that what mattered most was the
expectation of total health (simplified to life expectancy), and ignored the proportion
of that expected endowment that had already been used. Thus, only life expectancy
was included in the experiment.
Generally, binary levels for each dimension were adopted. Thus, smoking status,
healthy lifestyle, carer status, gender and income were defined as yes/no (or higher than
median / lower than median for income, or male / female for gender). To allow more
detailed investigation of the impact of total life expectancy, four levels (30 years, 45
years, 60 years, 75 years) were used, and the increase in total life expectancy from the
hypothetical program was specified to be one of 1 year, 3 years, 6 years and 10 years.
Experimental design
A 2^5 orthogonal main effects plan of strength 4 in 16 runs was selected, and paired
with each combination of two four-level attributes representing current total life
expectancy and gain in life expectancy associated with the healthcare program. These
256 health profiles were used as the starting design for the construction of a shifted
discrete choice experiment. To produce the other health profile in each choice set we
defined the shift by a generator, as outlined by Street and Burgess (2007). The three
generators that were used were (1,1,1,1,0,0,0), (1,1,0,0,1,1,0), and (1,0,1,0,1,0,1).
These were selected to allow estimation of main effects and all two-factor
interactions, for reasons that will be explained in the analysis section. After duplicate
choice sets were removed, this left 640 choice sets. There is evidence to suggest that
up to 16 choice sets is both acceptable to respondents and does not significantly affect
responses (Coast, et al., 2006, Hall, et al., 2006). Therefore, the 640 choice sets were
divided into 40 versions of 16 choice sets, to which the respondents were randomly
assigned (although the total number of respondents in each block was controlled to be
equal). Which option was presented as Programme A or Programme B was
randomised to prevent position bias.
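The construction described above can be sketched in a few lines. This is an illustration under stated assumptions rather than the design code used in the study: it assumes the 16-run fraction is the half fraction of 2^5 in which the fifth binary attribute equals the parity of the first four, that the attribute order is five binary attributes followed by the two four-level attributes, that shifts are applied modulo the number of levels, and that "duplicate" means the same unordered pair of profiles.

```python
from itertools import product

# Assumed attribute order: five binary attributes, then two four-level ones.
LEVELS = (2, 2, 2, 2, 2, 4, 4)

# Generators as reported in the text.
GENERATORS = [
    (1, 1, 1, 1, 0, 0, 0),
    (1, 1, 0, 0, 1, 1, 0),
    (1, 0, 1, 0, 1, 0, 1),
]


def starting_profiles():
    """Half fraction of 2^5 in 16 runs, crossed with the 4 x 4 combinations."""
    fraction = [
        (a, b, c, d, (a + b + c + d) % 2)  # assumed defining relation
        for a, b, c, d in product(range(2), repeat=4)
    ]
    return [
        run + (le, gain)
        for run in fraction
        for le, gain in product(range(4), repeat=2)
    ]


def shift(profile, generator):
    """Add the generator to the profile, attribute by attribute, modulo levels."""
    return tuple((x + g) % n for x, g, n in zip(profile, generator, LEVELS))


def build_choice_sets():
    """Pair every starting profile with its shifted partner; drop duplicate pairs."""
    profiles = starting_profiles()
    choice_sets = set()
    for generator in GENERATORS:
        for profile in profiles:
            # frozenset treats {A, B} and {B, A} as the same choice set
            choice_sets.add(frozenset((profile, shift(profile, generator))))
    return profiles, choice_sets


profiles, choice_sets = build_choice_sets()
print(len(profiles), len(choice_sets), len(choice_sets) // 16)
```

Under these assumptions the first generator shifts only binary attributes, so each of its pairs is produced twice and half are dropped, which leaves 640 distinct pairs from the 256 starting profiles, or 40 blocks of 16.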
Survey administration and sample recruitment
The survey was administered through an electronic data collection website. The data
collection occurred in May 2010. An example choice set is provided in Figure 2.
Figure 2: An Example Choice Set
An online panel of respondents recruited by Pure Profile Pty was used for the survey.
The panel provider ensures that the members of the panel are broadly representative
of the Australian population. Each survey respondent was paid a small sum
(approximately $15), dependent on the time they spent answering the choice sets, to
complete the survey. They used a web link to access the survey, so were able to self-
complete at their convenience. To aid the respondent, a thorough description of the
task was provided at the beginning of the survey and a help button was available
throughout the task. This provided information on how to respond, but deliberately
did not provide any advice specific to the characteristics presented in the choice sets.
Respondents then completed the task for the 16 choice sets to which they had been
allocated. Following this, they answered a series of personal questions including gross
household income, smoking status, ethnicity, country of birth, number of dependents,
level of education, age and gender. Finally, they were asked how difficult they had
found the task, selecting one of five levels of difficulty ranging from very difficult to
very easy. They were also given the opportunity to provide a free-text response
outlining their impression of the survey.
Analysis
An additive utility function with gain in total life expectancy and the characteristics of
the potential respondents would be inappropriate because, as the gain from the
hypothetical health program tends to zero, the utility of the program should similarly
tend to zero. This is analogous to the zero condition implicit in the QALY model
(Bleichrodt and Johannesson, 1997, Bleichrodt, et al., 1997). Therefore, an amended
utility model was adopted, (denoted by Utility Function 1) in which the utility of
option j in choice set s for survey respondent i was assumed to be
$$U_{isj} = \alpha\,\mathrm{GAIN}_{isj} + \beta X'_{isj}\,\mathrm{GAIN}_{isj} + \nu_i + \varepsilon_{isj}, \qquad (1)$$
where GAIN is the gain in total life expectancy accruing to the hypothetical
population group if the intervention were implemented and $X'_{isj}$ is a set of
characteristics of the hypothetical population group (current total life expectancy
(dummy coded), gender, smoking status, carer status, whether they lead a healthy
lifestyle, income). The error term ($\nu_i + \varepsilon_{isj}$) consists of a person-specific error term
distributed iid normal and a conventional random error term distributed iid normal.
An important point to note is that the characteristics of potential patients are
investigated through two-factor interaction terms rather than through the main effect.
Thus, while the experimental design allows for two-factor interactions in the strict
sense, the utility function required for this type of investigation means that
interactions between patient characteristics (e.g. smoking x carer status) cannot
necessarily be estimated in an unbiased way.
For the derivation of equity weights from this utility function, it is necessary to
identify the marginal utility of GAIN, which (dropping the subscript) is
$$\frac{\partial U}{\partial \mathrm{GAIN}} = \alpha + \beta X'. \qquad (2)$$
To account for the possible non-linearity of utility with respect to gain in total life
expectancy, a more flexible utility function denoted as Utility Function 2 was
investigated
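$$U_{isj} = \alpha\,\mathrm{GAIN}_{isj} + \rho\,\mathrm{GAIN}_{isj}^2 + \beta X'_{isj}\,\mathrm{GAIN}_{isj} + \tau X'_{isj}\,\mathrm{GAIN}_{isj}^2 + \nu_i + \varepsilon_{isj}, \qquad (3)$$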
with a corresponding marginal utility of GAIN being
$$\frac{\partial U}{\partial \mathrm{GAIN}} = \alpha + \beta X' + 2\,\mathrm{GAIN}\,(\rho + \tau X'). \qquad (4)$$
The linearity of utility with respect to time is relaxed by introducing the $\rho\,\mathrm{GAIN}^2$
term in Utility Function 2. In addition, the assumption that the change in total utility
associated with the health gain being received by a different group of hypothetical
respondents is independent of the total gain was relaxed by introducing
the $\tau X'_{isj}\,\mathrm{GAIN}^2$ term. This is analogous to relaxing the assumption of risk neutrality
over life years in the QALY model (Bleichrodt, et al., 1997).
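To make the dependence of the marginal utility on GAIN concrete, the following sketch evaluates the deterministic part of Utility Function 2 and the marginal utility in Equation 4, and checks one against the other numerically. The function names are illustrative. The coefficient values are the Gain, Gain x smoker and corresponding squared-term estimates reported for Utility Function 2 in Table 2, applied to a group described by a single characteristic (smoker) purely for illustration.

```python
def utility(gain, x, alpha, beta, rho, tau):
    """Deterministic part of Utility Function 2 for characteristics x."""
    bx = sum(b * xi for b, xi in zip(beta, x))
    tx = sum(t * xi for t, xi in zip(tau, x))
    return (alpha + bx) * gain + (rho + tx) * gain ** 2


def marginal_utility(gain, x, alpha, beta, rho, tau):
    """Equation 4: alpha + beta.x + 2 * GAIN * (rho + tau.x)."""
    bx = sum(b * xi for b, xi in zip(beta, x))
    tx = sum(t * xi for t, xi in zip(tau, x))
    return alpha + bx + 2 * gain * (rho + tau_x_guard(tx))


def tau_x_guard(value):
    """Identity helper kept separate so the formula above reads term by term."""
    return value


# Table 2, Utility Function 2: Gain, Gain x smoker, Gain^2, Gain^2 x smoker.
params = dict(alpha=0.2089, beta=[-0.1851], rho=-0.0096, tau=[0.0139])
smoker = [1]

mu_5 = marginal_utility(5, smoker, **params)
mu_10 = marginal_utility(10, smoker, **params)

# Central finite difference of the utility function as a check on Equation 4.
h = 1e-4
numeric_5 = (utility(5 + h, smoker, **params) - utility(5 - h, smoker, **params)) / (2 * h)

print(mu_5, mu_10, numeric_5)
```

Because the marginal utility differs between a gain of 5 and a gain of 10 years, any ratio of marginal utilities built from it is specific to the value of GAIN at which it is evaluated.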
A random-effects (RE) probit was used to model the data. As this contains a person-specific
error term $\nu_i$, choices made by an individual will not be independent. Models
resulting from Utility Functions 1 and 2 are compared using the Akaike and Bayesian
Information Criteria which contrast the model fit accounting for the number of
parameters estimated (Akaike, 1974, Schwarz, 1978).
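The two criteria can be written down directly. The sketch below uses the log likelihoods reported in Table 2, which are rounded. The parameter counts (11 and 20) and the number of observations (553 respondents x 16 choice sets = 8,848) are assumptions inferred from that table and from the Results, not figures stated in the paper, so the output should only be expected to land close to the published values.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian (Schwarz) Information Criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


N_OBS = 553 * 16  # assumed: completers x choice sets each

# (rounded log likelihood from Table 2, assumed number of parameters)
models = {
    "Utility Function 1": (-5570, 11),
    "Utility Function 2": (-5496, 20),
}

results = {
    name: (aic(ll, k), bic(ll, k, N_OBS)) for name, (ll, k) in models.items()
}

for name, (a, b) in results.items():
    print(f"{name}: AIC = {a:.0f}, BIC = {b:.0f}")
```

Lower values indicate the preferred model on either criterion; the BIC penalises the additional parameters of Utility Function 2 more heavily than the AIC does.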
There has been considerable discussion regarding the appropriate techniques for
deriving welfare measures from stated preference experiments, with the leading
candidates being marginal rates of substitution (MRS) (McIntosh and Ryan, 2002)
and the Hicksian compensating variation (CV) (Lancsar, et al., 2007, Lancsar and
Savage, 2004, Ryan, 2004, Santos Silva, 2004). In this study, an approach similar to
the MRS was employed using GAIN as the numeraire. Thus, the value of an
additional year of life for a hypothetical group is divided by the value of an additional
year for some reference group. For convenience, this reference group was selected to
be the ‘average’ group in society, under the assumptions that 50% of people in society
are female, that 50% have above average income, that 50% have a healthy lifestyle,
that 20% are smokers, that 40.8% are carers, and that the average person has a total
life expectancy of 75. The carer figure is a composite term including the 2.6 million
Australians estimated by the Australian Bureau of Statistics to provide assistance to
those who needed help because of disability or old age, the 2.363 million couple
families with children (so 4.726 million parents) and the 1.944 million single parents
(both parenting statistics are taken from the 2006 census (Australian Bureau of
Statistics, 2006)), divided by the estimated total population as of 12th October 2011 of
22.731 million (Australian Bureau of Statistics, 2011). Thus, the marginal utility of
GAIN for this reference group under Utility Function 1 is $\alpha + \beta \bar{X}'$, and the equity
weight $E_g$ for the hypothetical group g is then
$$E_g = \frac{\alpha + \beta X'_g}{\alpha + \beta \bar{X}'}. \qquad (5)$$
Confidence intervals for each of the equity weights were bootstrapped using 50
replications.
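As a worked illustration of Equation 5, the sketch below computes the carer share of the reference group and then the equity weight for two hypothetical groups, using the Utility Function 1 point estimates reported in Table 2. The dictionary keys and function names are illustrative choices. Because the published coefficients are rounded, the results should agree with Table 3 only to about two decimal places, and the bootstrapped confidence intervals are not reproduced here.

```python
# Utility Function 1 point estimates as reported in Table 2.
ALPHA = 0.1092  # coefficient on GAIN
BETA = {
    "female": 0.0035,
    "high_income": -0.0079,
    "smoker": -0.0739,
    "healthy": 0.0154,
    "carer": 0.0317,
    "le45": 0.0140,
    "le60": 0.0097,
    "le75": -0.0094,
}

# Carer share: (carers + parents in couple families + single parents) / population,
# all in millions, as described in the text.
carer_share = (2.6 + 2 * 2.363 + 1.944) / 22.731

# 'Average' reference group: population shares for each characteristic.
REFERENCE = {
    "female": 0.5,
    "high_income": 0.5,
    "smoker": 0.2,
    "healthy": 0.5,
    "carer": carer_share,
    "le75": 1.0,
}


def marginal_utility(group):
    """Equation 2: alpha + beta.X for a group given as {characteristic: value}."""
    return ALPHA + sum(BETA[name] * value for name, value in group.items())


def equity_weight(group):
    """Equation 5: marginal utility of the group relative to the reference group."""
    return marginal_utility(group) / marginal_utility(REFERENCE)


# High income, male, smoker, healthy lifestyle, carer, life expectancy of 30
# (life expectancy of 30 is the base level, so no dummy is switched on).
group_a = {"high_income": 1, "smoker": 1, "healthy": 1, "carer": 1}

# Low income, female, non-smoker, unhealthy lifestyle, non-carer, life expectancy of 75.
group_b = {"female": 1, "le75": 1}

print(round(carer_share, 3), round(equity_weight(group_a), 2), round(equity_weight(group_b), 2))
```

Anchoring on the reference group means a group whose characteristics match the population shares receives a weight of exactly one.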
Clearly, this approach can be replicated under the more flexible Utility Function 2
replacing the marginal utility term for GAIN with that stated in Equation 4. Due to the
large number of combinations of individuals and values of GAIN, four hypothetical
groups are constructed and weights are generated for two different time points (5
years and 10 years). The four hypothetical groups are (1) the most favoured type of
person; (2) the least favoured type of person; (3) a high earning female who is a
smoker but otherwise has a healthy lifestyle, with a life expectancy of 45; and (4) a
low-earning male who is a non-smoker but leads an unhealthy lifestyle, is not a carer,
with a life expectancy of 75 years.
The advantage of anchoring equity weights such that the mean respondent is valued at
1 is that, if an intervention increases life expectancy by a fixed amount for all
members of society (and hence does not affect differentials between individuals), it
has the same incremental QALYs whether the QALYs are equity-weighted or not.
Additionally, ‘rules-of-thumb’ relating to acceptable ICERs (such as NICE’s £20,000-
£30,000 per QALY) remain relevant under this equity weights system (National
Institute for Health and Clinical Excellence, 2008).
Observable heterogeneity
To investigate heterogeneity in responses, most of the demographic characteristics of
respondents were able to be matched with the dimensions and levels within the DCE
(the exception being total life expectancy which could not be easily determined for the
survey respondent). Thus, for example, it was possible to determine if smokers
differed in their response pattern to non-smokers or ex-smokers, both in terms of
smoking and other dimensions. While blocking can impact on the reliability of
subgroup analysis, the subgroups were large enough to ensure that all blocks were
included in all subgroups. Using the simpler Utility Function 1, the RE probit was re-
run with the sample split as per the dimensions in the experiment. Thus, the results of
male respondents were contrasted with the response of female respondents for
example. To facilitate comparison between samples, it was necessary to account for
scale effects by dividing through by one of the coefficients; this is analogous to
comparing willingness to pay estimates. In this case, the coefficient on GAIN was
used as numeraire. A likelihood ratio test was administered for each analysis, with the
more constrained model being that estimated in Equation 1, and the less constrained
one adding interaction terms between the survey respondent characteristic of interest
and each of the parameters estimated in the model.
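The reason for dividing through by the GAIN coefficient can be shown with a short sketch. It takes the pooled Utility Function 1 estimates from Table 2 and a second, purely hypothetical set of coefficients in which every value is doubled, as would happen if a subgroup had the same preferences but a smaller error variance. The subgroup figures are invented for illustration and are not results from the study.

```python
def normalise(coefficients, numeraire="gain"):
    """Express each coefficient relative to the numeraire coefficient."""
    base = coefficients[numeraire]
    return {
        name: value / base
        for name, value in coefficients.items()
        if name != numeraire
    }


# Pooled Utility Function 1 estimates as reported in Table 2.
pooled = {
    "gain": 0.1092,
    "female": 0.0035,
    "high_income": -0.0079,
    "smoker": -0.0739,
    "healthy": 0.0154,
    "carer": 0.0317,
}

# Hypothetical subgroup: identical preferences, different scale.
rescaled = {name: 2 * value for name, value in pooled.items()}

pooled_ratios = normalise(pooled)
rescaled_ratios = normalise(rescaled)

for name in pooled_ratios:
    print(name, round(pooled_ratios[name], 3), round(rescaled_ratios[name], 3))
```

The raw coefficients of the two sets differ by a factor of two, but the ratios to the GAIN coefficient coincide, so only differences in the ratios point to differences in preferences between subgroups.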
Results
Seven hundred and forty nine people entered the survey and were eligible to
participate. Thirty-two of these were excluded as the sample had reached its
maximum quota. Of the remaining 717 respondents, 616 answered at least one choice
set (i.e. they did not withdraw before the task began). Of these, 553 completed all
choice sets within the survey, giving a completion rate of 89.8% relative to those that
started the task (and were therefore randomised to a block), and 77.1% relative to the
population who entered the task and were willing to participate. The free text
responses generally suggested the respondents understood the task, and provided
some reasons for the choices they made in the experiment. Of these 553, one
respondent completed the choice task (and formed part of the analysis set) but did not
complete the demographic section. Table 1 outlines some basic characteristics of the
sample of 552 relative to the general Australian population.
Table 1: Representativeness of DCE Sample

Characteristic               Value / Range                                         Sample    Population*
Gender                       Female                                                56.16%    56.09%
Age (years)                  16-29                                                 26.63%    21.33%
                             30-44                                                 34.96%    23.98%
                             45-59                                                 23.01%    22.40%
                             60-74                                                 11.05%    14.00%
                             75+                                                    0.54%    18.29%
Highest level of education   No further / higher education                         33.69%    60.51%
                             Trade certificate                                     30.43%    22.24%
                             Bachelor’s degree or above                            35.87%    17.26%
Gross household income       <$20,000                                               7.84%    15.77%
                             $20,000 - $40,000                                     15.88%    23.02%
                             $40,001 - $60,000                                     20.59%    17.64%
                             $60,001 - $80,000                                     17.84%    13.87%
                             $80,001 - $100,000                                    15.29%    11.03%
                             $100,001 +                                            22.55%    18.67%
Smoking                      Current smoker                                        18.66%    23.00%
                             Past smoker                                           26.99%    30.00%
                             Never smoker                                          54.35%    47.00%
Carer                        Unpaid family carers / total Australian population    19.75%    11.89%

* All data sourced from ABS (Australian Bureau of Statistics, 2002, Australian Bureau of Statistics, 2005, Australian Bureau of Statistics, 2006, Australian Bureau of Statistics, 2007), other than the proportion of carers (Carers Australia website)
The representativeness of the sample differs by characteristic. The gender breakdown
is close to the population. Those over 75 years old are under-represented, which is a
problem for generalisability in that group. People in the sample are relatively over-
educated and have a higher income than average.
Considering choice sets in which the gain differed between groups, the proportion in
which the option producing the fewer years of additional life expectancy was selected
was 32.3%. Thus, gain is important, but not the sole determinant of choice in the
experiment.
Similarly, of the 553 complete respondents, 106 never selected an option involving
the fewer number of additional years of life. This means that the remaining 447 were
willing to trade aggregate life years in order to focus health gain towards specific
members of society.
Table 2: RE Probit Results

Mean (SE)               Utility Function 1     Utility Function 2
Constant                -0.0350 (0.0139)**     -0.0351 (0.0140)**
Gain (years)             0.1092 (0.0068)***     0.2089 (0.0282)***
Gain x female            0.0035 (0.0024)       -0.0043 (0.0095)
Gain x high income      -0.0079 (0.0028)***    -0.0252 (0.0103)**
Gain x smoker           -0.0739 (0.0033)***    -0.1851 (0.0136)***
Gain x healthy life      0.0154 (0.0046)***     0.0487 (0.0163)***
Gain x carer             0.0317 (0.0027)***     0.1041 (0.0108)***
Gain x LE45              0.0140 (0.0054)**      0.0240 (0.0198)
Gain x LE60              0.0097 (0.0062)        0.0347 (0.0223)
Gain x LE75             -0.0094 (0.0055)*      -0.0211 (0.0199)
Gain^2 (years)                                 -0.0096 (0.0027)***
Gain^2 x female                                 0.0009 (0.0011)
Gain^2 x high income                            0.0020 (0.0011)*
Gain^2 x smoker                                 0.0139 (0.0016)***
Gain^2 x healthy life                          -0.0037 (0.0017)**
Gain^2 x carer                                 -0.0088 (0.0013)***
Gain^2 x LE45                                  -0.0011 (0.0021)
Gain^2 x LE60                                  -0.0027 (0.0023)
Gain^2 x LE75                                   0.0013 (0.0021)
Lnsig2u                 -13.2502 (10.0324)     -13.5833 (11.083)
Sigma u                  0.0013 (0.0067)        0.0011 (0.0062)
Log likelihood          -5570                  -5496
AIC                      11161                  11032
BIC                      11239                  11174

Levels of statistical significance: *=10%; **=5%; ***=1%
The RE probit results are presented in Table 2. Under Utility Function 1, respondents
were willing to discriminate in favour of programmes with a greater health gain, and
to recipients who had a lower income, were non-smokers, were carers, or had life
expectancies of 45 (relative to those with the base total life expectancy of 30 years).
Under Utility Function 2, the coefficients on the linear component of GAIN show a
similar pattern to that in Utility Function 1, but cannot be easily compared as further
interaction terms are estimated. The quadratic terms are statistically significant at the
5% level for the main effect on GAIN (suggesting diminishing marginal utility of
time), and on smoking (positive), healthy lifestyles and carer status (both negative).
Thus, the discrimination against smokers exhibited throughout is relatively larger for
smaller values of GAIN, while the discrimination in favour of those with healthy
lifestyles or with dependents was relatively larger for smaller values of GAIN. Under
both Information Criteria, Utility Function 2 is preferred. A set of equity weights
based on this utility function can be generated, but are necessarily time specific (as
shown in Equation 4). Therefore, the relative ranking of two different hypothetical
recipients of health care depends on the number of years they would receive under
different values of GAIN. In other words, if x extra years of life expectancy for group
A is preferred to x years extra life expectancy for group B, this relative preference
does not necessarily hold if each group receives y years. This is a significant
complication to the operationalisation of an equity weights system, although such weights can be
generated in the same way using Equations 4 and 5. However, the additional
complexity of such an approach leads us to recommend the results from Utility
Function 1 to be the preferred option. The equity weights based on Utility Function 1
are presented in Table 3, with the weights plotted in a histogram in Figure 3.
Table 3: Equity Weights

Income  Smoker  Healthy life?  Carer  Life Expectancy  Male Equity Weight (95% CI)  Female Equity Weight (95% CI)
High    Yes     Yes            Yes    30               0.72 (0.58-0.86)             0.75 (0.62-0.89)
High    Yes     Yes            Yes    45               0.86 (0.71-1.00)             0.89 (0.73-1.05)
High    Yes     Yes            Yes    60               0.81 (0.71-0.92)             0.85 (0.72-0.98)
High    Yes     Yes            Yes    75               0.63 (0.54-0.71)             0.66 (0.59-0.74)
High    Yes     Yes            No     30               0.41 (0.30-0.53)             0.45 (0.32-0.58)
High    Yes     Yes            No     45               0.55 (0.43-0.67)             0.58 (0.45-0.72)
High    Yes     Yes            No     60               0.51 (0.38-0.64)             0.54 (0.44-0.64)
High    Yes     Yes            No     75               0.32 (0.23-0.42)             0.36 (0.26-0.45)
High    Yes     No             Yes    30               0.57 (0.44-0.70)             0.61 (0.49-0.72)
High    Yes     No             Yes    45               0.71 (0.57-0.85)             0.74 (0.63-0.85)
High    Yes     No             Yes    60               0.67 (0.55-0.78)             0.70 (0.60-0.80)
High    Yes     No             Yes    75               0.48 (0.39-0.57)             0.51 (0.43-0.60)
High    Yes     No             No     30               0.27 (0.15-0.38)             0.30 (0.16-0.44)
High    Yes     No             No     45               0.40 (0.28-0.52)             0.43 (0.28-0.59)
High    Yes     No             No     60               0.36 (0.26-0.46)             0.39 (0.28-0.51)
High    Yes     No             No     75               0.17 (0.05-0.30)             0.21 (0.07-0.35)
High    No      Yes            Yes    30               1.43 (1.27-1.60)             1.47 (1.27-1.66)
High    No      Yes            Yes    45               1.57 (1.39-1.75)             1.60 (1.42-1.78)
High    No      Yes            Yes    60               1.53 (1.35-1.71)             1.56 (1.40-1.72)
High    No      Yes            Yes    75               1.34 (1.25-1.43)             1.38 (1.32-1.44)
High    No      Yes            No     30               1.13 (0.99-1.26)             1.16 (1.05-1.28)
High    No      Yes            No     45               1.26 (1.13-1.40)             1.30 (1.12-1.47)
High    No      Yes            No     60               1.22 (1.09-1.35)             1.26 (1.11-1.40)
High    No      Yes            No     75               1.04 (0.98-1.09)             1.07 (1.01-1.14)
High    No      No             Yes    30               1.29 (1.16-1.42)             1.32 (1.19-1.45)
High    No      No             Yes    45               1.42 (1.27-1.57)             1.46 (1.30-1.62)
High    No      No             Yes    60               1.38 (1.23-1.53)             1.41 (1.25-1.56)
High    No      No             Yes    75               1.20 (1.13-1.26)             1.23 (1.17-1.29)
High    No      No             No     30               0.98 (0.87-1.09)             1.01 (0.88-1.15)
High    No      No             No     45               1.12 (1.00-1.23)             1.15 (1.02-1.28)
High    No      No             No     60               1.07 (0.93-1.22)             1.11 (0.97-1.24)
High    No      No             No     75               0.89 (0.82-0.96)             0.92 (0.85-1.00)
Low     Yes     Yes            Yes    30               0.80 (0.67-0.93)             0.83 (0.70-0.96)
Low     Yes     Yes            Yes    45               0.93 (0.79-1.07)             0.97 (0.82-1.11)
Low     Yes     Yes            Yes    60               0.89 (0.77-1.01)             0.92 (0.77-1.07)
Low     Yes     Yes            Yes    75               0.71 (0.63-0.78)             0.74 (0.65-0.83)
Low     Yes     Yes            No     30               0.49 (0.37-0.61)             0.52 (0.42-0.63)
Low     Yes     Yes            No     45               0.63 (0.51-0.74)             0.66 (0.55-0.77)
Low     Yes     Yes            No     60               0.58 (0.47-0.69)             0.62 (0.49-0.75)
Low     Yes     Yes            No     75               0.40 (0.30-0.50)             0.43 (0.33-0.53)
Low     Yes     No             Yes    30               0.65 (0.52-0.78)             0.68 (0.57-0.80)
Low     Yes     No             Yes    45               0.78 (0.64-0.92)             0.82 (0.69-0.94)
Low     Yes     No             Yes    60               0.74 (0.64-0.85)             0.78 (0.67-0.88)
Low     Yes     No             Yes    75               0.56 (0.46-0.65)             0.59 (0.50-0.69)
Low     Yes     No             No     30               0.34 (0.23-0.45)             0.38 (0.25-0.50)
Low     Yes     No             No     45               0.48 (0.35-0.60)             0.51 (0.39-0.64)
Low     Yes     No             No     60               0.44 (0.32-0.55)             0.47 (0.36-0.58)
Low     Yes     No             No     75               0.25 (0.11-0.39)             0.28 (0.19-0.38)
Low     No      Yes            Yes    30               1.51 (1.35-1.67)             1.55 (1.39-1.70)
Low     No      Yes            Yes    45               1.65 (1.43-1.86)             1.68 (1.50-1.86)
Low     No      Yes            Yes    60               1.61 (1.43-1.78)             1.64 (1.49-1.78)
Low     No      Yes            Yes    75               1.42 (1.34-1.50)             1.45 (1.37-1.53)
Low     No      Yes            No     30               1.20 (1.08-1.33)             1.24 (1.11-1.37)
Low     No      Yes            No     45               1.34 (1.17-1.51)             1.37 (1.20-1.55)
Low     No      Yes            No     60               1.30 (1.18-1.42)             1.33 (1.19-1.47)
Low     No      Yes            No     75               1.11 (1.05-1.17)             1.15 (1.08-1.21)
Low     No      No             Yes    30               1.36 (1.19-1.53)             1.40 (1.26-1.53)
Low     No      No             Yes    45               1.50 (1.34-1.66)             1.53 (1.35-1.72)
Low     No      No             Yes    60               1.46 (1.32-1.59)             1.49 (1.32-1.66)
Low     No      No             Yes    75               1.27 (1.21-1.33)             1.31 (1.23-1.38)
Low     No      No             No     30               1.06 (0.95-1.16)             1.09 (0.98-1.20)
Low     No      No             No     45               1.19 (1.06-1.33)             1.23 (1.07-1.38)
Low     No      No             No     60               1.15 (1.02-1.27)             1.18 (1.07-1.30)
Low     No      No             No     75               0.97 (0.89-1.03)             1.00 (0.94-1.06)

The reference group was selected to be the ‘average’ group in society, under the assumptions that 50% of people in society are female, that 50% have above average income,
that 50% have a healthy lifestyle, that 20% are smokers, that 40.8% are carers, and that the average person has a total life expectancy of 75.
Figure 3: Distribution of Equity Weights
[Plot of density (y-axis) against equity weight (x-axis, 0 to 2)]
Under Utility Function 2, the equity weights of four hypothetical groups for a GAIN
of 5 and 10 years are presented in Table 4.
Table 4: Some Selected Equity Weights under Utility Function 2
Person | Gain = 5 years | Gain = 10 years
Most favoured (female, low income, non-smoker, healthy lifestyle, carer, life expectancy of 60) | 1.623 | 1.150
Person 2 (female, high income, smoker, healthy lifestyle, carer, life expectancy of 45) | 0.828 | 0.805
Person 3 (male, low income, non-smoker, unhealthy lifestyle, non-carer, life expectancy of 30) | 0.958 | 0.848
Least favoured (male, high income, smoker, unhealthy lifestyle, non-carer, life expectancy of 75) | 0.092 | 0.402
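The weights in Table 4 change with GAIN because Utility Function 2 contains both linear and quadratic terms in GAIN. The following is a minimal sketch of that mechanism, not the estimation or conversion method used in this paper. It assumes that a group's utility is linear-plus-quadratic in GAIN and that the equity weight is the ratio of the group's utility to the reference group's utility at the same GAIN. The function names and all coefficient values are hypothetical and chosen only for illustration.

```python
def utility(gain, linear, quadratic):
    """Utility of a health gain with a linear and a quadratic GAIN term."""
    return linear * gain + quadratic * gain ** 2


def equity_weight(gain, group, reference):
    """ASSUMED definition: group utility relative to reference utility."""
    return utility(gain, *group) / utility(gain, *reference)


# Hypothetical (linear, quadratic) coefficients, NOT estimates from the study.
REFERENCE = (1.00, -0.02)
FAVOURED = (1.60, -0.05)     # larger linear term, more negative quadratic term
DISFAVOURED = (0.40, 0.01)   # smaller linear term, positive quadratic term

for gain in (5, 10):
    w_fav = equity_weight(gain, FAVOURED, REFERENCE)
    w_dis = equity_weight(gain, DISFAVOURED, REFERENCE)
    print(f"GAIN={gain}: favoured={w_fav:.3f}, disfavoured={w_dis:.3f}")
```

With these illustrative numbers the linear and quadratic interaction terms have opposite signs, so both weights move closer to one as GAIN rises from 5 to 10.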
The results under Utility Function 2 (Table 4) suggest that the equity weight is
sensitive to the value of GAIN for some groups (for example, the most and least
favoured). The sensitivity depends on the relative size of the coefficients on the
quadratic and linear terms in GAIN that apply to a group; as the former becomes
large relative to the latter, the equity weight becomes more variable over GAIN.
Over all 128 hypothetical groups, the tendency is for smaller equity weights at
higher values of GAIN. This can be explained by the tendency for the signs of the
linear and quadratic terms to be opposite to one another. If the coefficient on the
linear term (e.g. Gain x female) is negative (positive) in Utility Function 2, the
coefficient on the quadratic term (e.g. Gain^2 x female) tends to be positive
(negative). Thus, as GAIN increases, equity weights tend towards one, and hence
towards a more conventional QALY-type model.
Observable heterogeneity
The results of Utility Function 1 if the sample is divided by gender of the survey
respondent are provided graphically in Figure 4.
Figure 4: Results by gender of respondent
The major difference between male and female responses lies in their attitudes
towards gains accruing to hypothetical cohorts of people of different genders. Both
discriminate heavily in favour of their own gender (p<0.01 in both cases). Also, male
respondents were more likely to demonstrate the Fair Innings type argument,
discriminating against those expected to live until 75 relative to those not expected to
have such longevity. While the coefficients across other dimensions appear to follow
a similar pattern, a likelihood ratio (LR) test rejects the null hypothesis of common
coefficients, meaning the less constrained model, which allows responses to differ by
the gender of the survey respondent, is a better fit (p<0.01).
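For readers unfamiliar with the likelihood ratio comparison used for these subgroup analyses, the sketch below shows the general calculation. It assumes the standard form of the test, in which twice the difference in log-likelihood between the unrestricted model (coefficients allowed to differ by subgroup) and the restricted pooled model is compared with a chi-square distribution whose degrees of freedom equal the number of additional parameters. The log-likelihoods and degrees of freedom shown are hypothetical and are not taken from this study.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    x = stat / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)              # Q(1, x) = exp(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))   # Q(1/2, x) = erfc(sqrt(x))
    # Recurrence: Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q


def lr_test(loglik_restricted, loglik_unrestricted, df):
    """Return the LR statistic and its p-value."""
    stat = 2.0 * (loglik_unrestricted - loglik_restricted)
    return stat, chi2_sf(stat, df)


# Hypothetical log-likelihoods and degrees of freedom, for illustration only.
stat, p = lr_test(loglik_restricted=-5200.0, loglik_unrestricted=-5180.0, df=12)
print(f"LR statistic = {stat:.1f}, p = {p:.5f}")
```

A small p-value leads to rejection of the pooled model in favour of the model that distinguishes the subgroups; a large p-value means the pooled model cannot be rejected.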
The graphical comparison of responses by smoking status considers three sub-groups,
namely smokers, former smokers and people who have never been smokers. These
results are illustrated in Figure 5.
Figure 5: Results by smoking status
Again, the respondents tended to display quite different preferences in the dimension
over which the sample was split. On average, smokers did not strongly favour either
smokers or non-smokers. However, the other two groups strongly discriminate against
smokers. As with the case of gender, the LR test rejects the null hypothesis (p<0.01).
Finally, as younger people were over-represented in the sample, the oldest 25% of
the sample (aged 51 and older) was compared with the remainder of the sample. These
results are presented in Figure 6.
Figure 6: Results by age of respondent
The point estimates suggest that the older cohort discriminates less against people
with a longer total life expectancy; indeed, ceteris paribus, they would prefer a
health program for people with a total life expectancy of 75 relative to one for
people with a total life expectancy of 30. However, the LR test fails to reject the
null hypothesis (p=0.1673), meaning that distinguishing the responses of older people
from the rest does not significantly improve model fit.
Conclusion and Discussion
This paper has demonstrated that simple maximisation of total health is not the
criterion on which people make health allocation decisions, reinforcing the results
reported by Dolan et al. (2005) but allowing for characteristics of the respondent to
be considered simultaneously (the studies reviewed by Dolan et al. tend to focus on a
smaller subset of characteristics, often only one). The average survey respondent was
willing to target health gain towards carers and non-smokers, even if it reduces total
life expectancy across the population. Willingness to discriminate based on the other
attributes was less strong. Additionally, some characteristics of respondents were
strong predictors of their responses. In general, people relatively favoured health gains
accruing to people with similar characteristics to themselves. The patterns regarding
total life expectancy were less clear cut, but suggested that gains accruing to people
who can expect a typical ‘Fair Innings’ are valued less than gains accruing to people
who might not receive this allocation.
The conclusion in this study that equity weights can differ significantly from 1
contrasts with that of Lancsar et al. (2011), who argue that weighting QALYs is
generally not appropriate, and would be unlikely to significantly affect the scale of
the gain accruing from a healthcare intervention. There are five possible explanations for this
divergence. Firstly, it might be that respondents in the two countries (Australia and
the UK) hold different views. Secondly, the way the question was posed may drive
the result. Inadvertent emphasis of certain aspects of the choice may cause results in
different experiments to differ; a recent example in the case of colorectal cancer
screening investigated this phenomenon (Howard and Salkeld, 2009). Thirdly, the two
studies consider quite different dimensions, particularly as those considered here
allow for a non-symmetrical indifference curve as illustrated in Figure 1.
Additionally, in dimensions that are common to both experiments (such as total life
expectancy), the levels were different. It might be argued that the dimensions selected
by Lancsar et al. are ones over which preferences are not strong (and hence unlikely to
override the conventional maximisation of LYs or QALYs). Fourthly, it might be that the
method presented here for converting regression results into equity weights produces
different weights from those that would have been produced using the compensating
variation (CV), as employed by Lancsar et al. However, if the CV is applied to our
results, the impact is fairly small: weights derived through CV are higher, but by a
maximum of 0.126. Finally, the non-linear utility function estimated here suggested that longer
periods of GAIN moved the equity weighting system towards a QALY-type model.
Since Lancsar et al. consider a larger spread of time in their experiment, it might be
that the two studies are simply reporting different parts of a common preference
curve.
The study has a number of potential limitations. While the panel was broadly
representative of the Australian population, it is arguable that membership of an
online panel is correlated with certain unobservable characteristics. With regard to
observable characteristics of respondents, our sample was generally younger than the
Australian population, although it was concluded that younger and older cohorts of
respondents displayed similar patterns of responses (in that the LR test failed to reject
the null).
As with many choice experiments, it is plausible to argue that the characteristics
investigated in this study form only a subset of those which might be important. The
trade-off between the number of choice sets in the experiment and the range of issues
that could be considered means this problem will be omnipresent; future research
might consider other areas in which people might discriminate when demonstrating
preferences for allocation of health. Future work should consider the recently
developed principles for the identification of appropriate dimensions and levels
proposed by Coast et al. (2011).
The experimental design used in this work did not explicitly allow for the
consideration of interactions between characteristics of the hypothetical patients. It is
plausible that society may, for example, be willing to discriminate against smokers,
but only if those smokers had a high income. This area of research should be explored
using an experimental design which is constructed for this purpose. While the RE-
probit does account for the panel nature of the data, it should be noted that more
flexible ways of characterising heterogeneity have been proposed, and demonstrate
considerable promise (Fiebig et al., 2010). However, in the context of generating
societal preferences for health gains, the mean response is the most important issue;
identifying the degree of agreement or disagreement with this mean is of interest, but
any attempt to implement equity weights in practice is likely to have ample practical
obstacles to overcome already.
Equity weights are conceptually straightforward, but have proven difficult to generate,
or to employ in economic evaluation. Additionally, there are ethical issues which have
to be addressed before advocating the use of equity weights. If societal preferences
are defined by a subset of the population, and they favour health gains accruing to
themselves above health gains accruing to others, that is clearly unsatisfactory,
because the distributional weights are dependent on the choice of subset. If, however,
societal preferences reflect all individual preferences equally (notwithstanding the
difficulty of doing so), it has been argued that these preferences ought to satisfy
some ethical constraints (Broome, 1991; Richardson and McKie, 2005). While
this idea is appealing, the specification of these ethical constraints is difficult.
Arguably, identifying a set of ethical constraints which is broadly acceptable to
society means that these constraints will never act on the resource allocation decisions
they were designed for. In this paper, these issues are not explicitly considered;
however, any attempt to operationalise equity weights for economic evaluation needs
to address them.
References
Akaike H. 1974. A new look at the statistical model identification. IEEE Transactions
on Automatic Control 19: 716-723.
Australian Bureau of Statistics, 2006 Census Data by Location. Available via
http://www.censusdata.abs.gov.au. Accessed 12th October 2011
Australian Bureau of Statistics, Population Clock. Available via
http://www.abs.gov.au/ausstats/abs. Accessed 12th October 2011
Bleichrodt H, Johannesson M. 1997. The validity of QALYs: An experimental test of
constant proportional tradeoff and utility independence. Medical Decision
Making 17: 21-32.
Bleichrodt H, Wakker P, Johannesson M. 1997. Characterizing QALYs by risk
neutrality. Journal of Risk and Uncertainty 15: 107-114.
Broome J. 1991. Weighing goods. Blackwell: Oxford.
Coast J, Al-Janabi H, Sutton EJ, Horrocks SA, Vosper AJ, Swancutt DR, et al. 2011.
Using qualitative methods for attribute development for discrete choice
experiments: Issues and recommendations. Health Economics.
doi: 10.1002/hec.1739.
Coast J, Flynn TN, Salisbury C, Louviere J, Peters TJ. 2006. Maximising responses to
discrete choice experiments: A randomised trial. Appl Health Econ Health
Policy 5: 249-260.
Dolan P, Shaw R, Tsuchiya A, Williams A. 2005. QALY maximisation and people's
preferences: A methodological review of the literature. Health Economics 14:
197-208.
Fiebig D, Keane M, Louviere J, Wasi N. 2010. The generalized multinomial logit
model: Accounting for scale and coefficient heterogeneity. Marketing Science
29: 393-421.
Hall J, Fiebig DG, King MT, Hossain I, Louviere JJ. 2006. What influences
participation in genetic carrier testing? Results from a discrete choice
experiment. Journal of Health Economics 25: 520-537.
Howard K, Salkeld G. 2009. Does attribute framing in discrete choice experiments
influence willingness to pay? Results from a discrete choice experiment in
screening for colorectal cancer. Value in Health 12: 354-363.
Lancsar E, Louviere J, Flynn T. 2007. Several methods to investigate relative attribute
impact in stated preference experiments. Social Science and Medicine 64:
1738-1753.
Lancsar E, Savage E. 2004. Deriving welfare measures from discrete choice
experiments: Inconsistency between current methods and random utility and
welfare theory. Health Economics 13: 901-907.
Lancsar E, Wildman J, Donaldson C, Ryan M, Baker R. 2011. Deriving distributional
weights for QALYs through discrete choice experiments. Journal of Health
Economics 30: 466-478.
McIntosh E, Ryan M. 2002. Using discrete choice experiments to derive welfare
estimates for the provision of elective surgery: Implications for discontinuous
preferences. Journal of Economic Psychology 23: 367-382.
National Institute for Health and Clinical Excellence. 2008. Social value judgements:
Principles for the development of NICE guidance (2nd edition). NICE:
London.
Norman R, Gallego G. 2008. Equity weights for economic evaluation: An Australian
discrete choice experiment, CHERE Working Paper 2008/5. CHERE: Sydney.
Olsen JA, Richardson J, Dolan P, Menzel P. 2003. The moral relevance of personal
characteristics in setting health care priorities. Social Science and Medicine
57: 1163-1172.
Richardson J, McKie J. 2005. Empiricism, ethics and orthodox economic theory:
What is the appropriate basis for decision-making in the health sector? Social
Science and Medicine 60: 265-275.
Ryan M. 2004. Deriving welfare measures in discrete choice experiments: A comment
to Lancsar and Savage (1). Health Economics 13: 909-912; discussion 919-
924.
Santos Silva JM. 2004. Deriving welfare measures in discrete choice experiments: A
comment to Lancsar and Savage (2). Health Economics 13: 913-918;
discussion 919-924.
Schwappach DLB. 2003. Does it matter who you are or what you gain? An
experimental study of preferences for resource allocation. Health Economics
12: 255-267.
Schwarz GE. 1978. Estimating the dimension of a model. Annals of Statistics 6: 461-
464.
Street DJ, Burgess L. 2007. The construction of optimal stated choice experiments:
Theory and methods. Wiley: Hoboken, New Jersey.
Tobin J. 1970. On limiting the domain of inequality. Journal of Law and Economics
13: 263-277.
Williams A. 1997. Intergenerational equity: An exploration of the 'fair innings'
argument. Health Economics 6: 117-132.