Equity Weighting in the Economic Evaluation of Healthcare

Richard Norman

Centre for Health Economics Research and Evaluation (CHERE), University of

Technology, Sydney, PO BOX 123, Broadway, Sydney 2007

Contact details: Telephone (02) 9514 4732; Email [email protected]

Abstract

Outcome measurement in economic evaluation of healthcare typically values outcomes

independently of who receives them. This paper reports on a discrete choice

experiment eliciting population preferences regarding the allocation of health gain

between groups of potential patients. A random-effects probit model is estimated, and

converted into equity weights for use in economic evaluation. On average, the

modelling predicts relatively high social value on health gains for non-smokers,

carers, those with a low income and those with an expected age of death less than 45

years. For decision-makers, whether a formal equity weighting system represents an

improvement on more informal approaches to weighing up equity and efficiency

concerns remains uncertain.

Preface

Thesis title: Outcome valuation in the economic evaluation of healthcare

Supervisors: Professor Jane Hall, A/Professor Rosalie Viney, Professor Debbie Street

Economic evaluation of healthcare interventions (such as pharmaceuticals, medical devices and technologies) considers both the effect of the intervention on patients, and the costs borne by the government and often the individual themselves. This simultaneous consideration of costs and benefits is now standard practice in reimbursement decisions, both in Australia and elsewhere. This thesis focuses on the assessment of benefits, specifically how we place a value on the health changes patients experience as a result of a health care intervention.

There is a well-established framework for how outcomes are valued in health care, but this framework is built on a number of contentious assumptions. For example, health is assumed to be the sole outcome of a healthcare system, and society is assumed to be inequality-neutral. This thesis identifies and explains these assumptions and then tests two of them in the empirical chapters, using discrete choice experiments (DCEs).

The thesis demonstrates how these concerns might be overcome by augmenting the existing decision-making framework with relatively easily-collected stated preference data, and offers a template for analyses exploring other aspects of how health outcomes should be valued.

The thesis takes the following form:

Chapter I: The measurement of outcomes in economic evaluation of health interventions

Chapter II: Measuring health-related quality of life – standard and novel approaches

Chapter III: Discrete choice experiments: Principles and application for health gain

Chapter IV: Some principles for designing discrete choice experiments

Chapter V: Using a discrete choice experiment to value health profiles in the SF-6D

Chapter VI: Equity weights for use in economic evaluation

Chapter VII: Conclusions and implications

This paper presents the findings from Chapter VI, reporting a DCE exploring preferences around distribution of health, and suggesting how analysts might use these results in policy decisions.

Introduction

Economic evaluation of new healthcare interventions is increasingly mandated in

decisions relating to government and insurer reimbursement. Unlike in neo-classical

welfare economics, health rather than utility is often considered to be

the central outcome used to evaluate the appropriateness of possible health

expenditure. The motivation for health to be considered as distinct from other areas of

economic evaluation is reflected in the specific egalitarianism of Tobin (1970):

“This is the view that certain specific scarce commodities should be

distributed less unequally than the ability to pay for them. Candidates for

such sentiments include basic necessities of life, health, and citizenship.”

(p.263)

The dominant approach to economic evaluation employs a criterion similar to Kaldor-

Hicks, but places health as the central desideratum. It assumes the policy maker will

choose the course of action which maximises total health. This is noteworthy in that it

entails that the standard application of the Kaldor-Hicks criterion is redundant, in that

a utility-maximising intervention does not necessarily maximise health. The retention

of the Kaldor-Hicks criterion requires an assumption that the marginal utility of health

is constant and equal across individuals. Simple maximisation of health is relatively

straightforward to apply, but represents a considerable constraint on preferences, even

if we are satisfied to adopt health as the central concept of importance.

Williams (1997) identified the ‘Fair Innings’ argument and its application in equity

weighting in a seminal paper. He highlighted the fundamental difference

between death at 25 and death at 85, and proposed weighting gains that accrue to

those unlikely to make some threshold life expectancy (a Fair Innings) more highly.

This work has been built on, with a series of studies showing that the assumptions

implicit in the health-maximising QALY (or life year (LY)) model are not consistent

with the choices demonstrated when community respondents are surveyed. This paper

addresses one of these concerns, namely the assumption that the value people place on an outcome

(be it LY, QALY or something else) does not depend on the recipient to whom it

accrues (see Dolan (2005) for a review of the broader area). One solution to this is to

weight these outcomes according to the person receiving them. If society values the

health gains accruing to a particular group relatively highly, the equity weight for that

group exceeds one, all outcomes gained by that group are multiplied by that factor,

and an incremental cost-effectiveness ratio (ICER) for an intervention in that group

would be relatively lower than without equity weights. There are two distinct (but

linked) sources of explanation for equity weights which can best be expressed using

indifference curves as shown in Figure 1, namely aversion to inequality and a

preference for discrimination independent from inequality.
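As a minimal illustration of the weighting mechanism just described, the Python sketch below applies a single equity weight to incremental health gains before computing an ICER. The function name and the cost and QALY figures are hypothetical, chosen only for illustration, and are not drawn from the study.

```python
def equity_weighted_icer(incremental_cost, incremental_qalys, equity_weight=1.0):
    """Return the cost per equity-weighted QALY gained."""
    weighted_gain = equity_weight * incremental_qalys
    if weighted_gain <= 0:
        raise ValueError("weighted health gain must be positive")
    return incremental_cost / weighted_gain


# Hypothetical intervention: 50,000 in extra cost for 2 extra QALYs.
unweighted = equity_weighted_icer(50_000, 2.0)
# The same intervention delivered to a group with an equity weight above one.
weighted = equity_weighted_icer(50_000, 2.0, equity_weight=1.25)
print(unweighted, weighted)
```

With a weight above one the weighted ICER falls below the unweighted ICER, which is the sense in which an intervention for a favoured group appears relatively more cost-effective.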

Figure 1: Indifference curves under different sets of societal preferences

In a standard application of the LY or QALY model, the societal indifference curve

would be blind to the distribution of health, instead focusing on total health. This is

represented by the straight black indifference curve. However, this may not best

reflect the views of the community. Firstly, it is likely that the average member of

society is inequality averse (Williams, 1997). If this is the case, the indifference curve

in terms of expected age of death or total life expectancy (LE) is convex with respect

to the origin. Thus, the indifference curve moves from the black line (at which the

distribution of outcome is irrelevant to societal utility) to the dotted black line where

society is willing to sacrifice some total LE across the two groups for less inequality,

represented by the area between the two curves. At this point, it is possible to

constrain the utility function to be monotonic. This would imply that health gain

to a group cannot be negative in terms of social utility. However, in this study, I chose

not to make such an assumption as a non-monotonic utility function might simply

reflect extreme inequality aversion.

[Figure 1: axes are QALE (Group A) and QALE (Group B); the line of equality and points g and h are marked.]

Under this dotted black indifference curve, the society is still indifferent to the

characteristics of the individual beyond their expectation of total (quality adjusted)

life expectancy. Thus, points g and h, which have the same total health endowment and

degree of inequality, are valued equally. However, there may be characteristics which

impact on societal preferences for allocation of health gain beyond this, including

non-health characteristics such as gender and income. If society is willing to discriminate

on these other characteristics, the solid grey indifference curve is possible in which

societal aversion to inequality depends on who is relatively disadvantaged (i.e. it is

non-symmetrical around the line of equality).

Olsen et al. (2003) identified existing studies which give a variety of individual

characteristics that may impact on the societal valuation of the health gain accruing to

that individual. It should be noted that this is somewhat limited as the characteristics

given by Olsen et al. have generally been considered in isolation; respondents may be

imputing other characteristics of the healthcare recipients which are correlated with

the characteristic of interest. For example, people might be willing to discriminate

against women as they assume that women live longer (and not because they

intrinsically prefer outcomes accruing to males). Discrete choice experiments (DCE)

are a useful tool to investigate societal preferences for health allocation. If the DCE is

constructed appropriately, it can be used to identify the impact of individual

characteristics (e.g. gender, age etc) independent of all others.

Existing evidence on societal value of health gains accounting for this (by

investigating multiple attributes simultaneously) is limited. In Australia, formal

inclusion of equity weighting into economic evaluation for reimbursement decisions

is not mandated, perhaps reflecting the paucity of evidence in the area. One study was

limited to a set of undergraduate student respondents (Schwappach, 2003). A more

recent study has used a discrete choice experiment (DCE) to generate distributional

weights (Lancsar, et al., 2011). Interestingly, this study concluded that the weighting

of outcomes is generally not advisable, except in a small number of situations and in

those cases the impact of using weights is relatively small. The conclusions from

DCEs are potentially sensitive to the way the question is asked, the choice of

dimensions and levels, the sample, and the method of analysis. As our work differs in

each of these dimensions, it is of interest to identify if the finding of Lancsar et al. is

replicated.

Thus, the aim of the paper is to identify some key characteristics which might impact

on the societal valuation of health gains accruing to groups with those characteristics;

to conduct a discrete choice experiment which can identify the effect of each of the

characteristics independent of the others; to explore (and potentially explain)

heterogeneous responses to the discrete choice experiment; and to produce a set of

equity weights for these characteristics for use in economic evaluation.

Methods

Dimensions and levels

Identification of appropriate dimensions and levels took place through considering the

dimensions with some published evidence, and piloting some suggested levels in a

small discrete choice experiment (Norman and Gallego, 2008). The major source of

dimensions with supporting evidence was a review by Olsen et al. (2003) which

suggested a number of possible dimensions that might be important, dividing them into

those relating to a person’s relation to others, those relating to their illness, and those

relating to their self. From these, a smaller set were selected with the aim of including

some from each of these three categories. The selected dimensions were gender,

smoking status, income (or socio-economic status), whether the individual maintained

a healthy lifestyle, carer status and total life expectancy. This is not an exhaustive set

of characteristics over which people might discriminate; rather, these are a

convenient and obvious set which can help to identify the degree to which people

agree or disagree with the standard health-maximising approach. It was decided to

limit the results to this subset as including extra dimensions has the potential to

significantly impact on the number of choice sets required. Therefore, the results

presented here should not be interpreted as claiming that there are no other

characteristics which might impact on preferences, only that, over a set of obvious

candidates, respondents either do or do not correspond to the assumptions of the

QALY model.

Two prominent dimensions are not included in the experiment, namely severity and

current age. Severity was omitted to limit the complexity of the experiment for the

respondent. The choice sets in the experiment were cognitively challenging and it was

decided that the benefit of including an additional health profile would be outweighed

by the increased difficulty of the task. Therefore, it was implicitly assumed that

attitudes to inequality in life expectancy are translatable into attitudes to inequality in

quality-adjusted life expectancy.

Age was excluded as the interaction between age and life expectancy meant that there

would either be a considerable number of implausible health states (e.g. old age, high

remaining life expectancy), or a very narrow range of levels for one or both of age and

remaining life expectancy. It was decided that what mattered most was the

expectation of total health (simplified to life expectancy), and that the proportion

of that expected endowment that had already been used could be ignored. Thus, only life expectancy

was included in the experiment.

Generally, binary levels for each dimension were adopted. Thus, smoking status,

healthy lifestyle, carer status, gender and income were defined as yes/no (or higher than

median / lower than median for income, or male / female for gender). To allow more

detailed investigation of the impact of total life expectancy, four levels (30 years, 45

years, 60 years, 75 years) were used, and the increase in total life expectancy from the

hypothetical program was specified to be one of 1 year, 3 years, 6 years and 10 years.

Experimental design

A 2^5 orthogonal main effects plan of strength 4 in 16 runs was selected, and paired

with each combination of two four-level attributes representing current total life

expectancy and gain in life expectancy associated with the healthcare program. These

256 health profiles were used as the starting design for the construction of a shifted

discrete choice experiment. To produce the other health profile in each choice set we

defined the shift by a generator, as outlined by Street and Burgess (2007). The three

generators that were used were (1,1,1,1,0,0,0), (1,1,0,0,1,1,0), and (1,0,1,0,1,0,1).

These were selected to allow estimation of main effects and all two-factor

interactions, for reasons that will be explained in the analysis section. After duplicate

choice sets were removed, this left 640 choice sets. There is evidence to suggest that

up to 16 choice sets is both acceptable to respondents and does not significantly affect

responses (Coast, et al., 2006, Hall, et al., 2006). Therefore, the 640 choice sets were

divided into 40 versions of 16 choice sets, to which the respondents were randomly

assigned (although the total number of respondents in each block was controlled to be

equal). Which option was presented as Programme A or Programme B was

randomised to prevent position bias.
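The construction described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the design code used in the study: the 16-run fraction is taken to be the half fraction in which the fifth binary attribute equals the sum of the other four modulo 2, levels are coded 0/1 for binary attributes and 0 to 3 for the two four-level attributes, each generator is added modulo the number of levels, and the split into versions is a simple systematic one. All function and variable names are hypothetical.

```python
from itertools import product

# Assumed coding: five binary attributes, then two four-level attributes.
LEVELS = (2, 2, 2, 2, 2, 4, 4)
GENERATORS = [
    (1, 1, 1, 1, 0, 0, 0),
    (1, 1, 0, 0, 1, 1, 0),
    (1, 0, 1, 0, 1, 0, 1),
]


def starting_design():
    """16-run fraction of 2^5 crossed with the full 4 x 4 factorial."""
    profiles = []
    for a, b, c, d in product(range(2), repeat=4):
        e = (a + b + c + d) % 2  # assumed defining relation for the fraction
        for life_exp, gain in product(range(4), repeat=2):
            profiles.append((a, b, c, d, e, life_exp, gain))
    return profiles


def shift(profile, generator):
    """Add the generator to the profile, attribute by attribute, modulo levels."""
    return tuple((x + g) % m for x, g, m in zip(profile, generator, LEVELS))


def choice_sets():
    """Unordered pairs {profile, shifted profile} with duplicates removed."""
    pairs = set()
    for generator in GENERATORS:
        for profile in starting_design():
            pairs.add(frozenset((profile, shift(profile, generator))))
    return sorted(tuple(sorted(pair)) for pair in pairs)


sets_ = choice_sets()
# Illustrative split into 40 versions of equal size.
blocks = [sets_[i::40] for i in range(40)]
print(len(starting_design()), len(sets_), {len(b) for b in blocks})
```

Under these assumptions the sketch yields 256 starting profiles and 640 distinct choice sets, matching the counts reported above. The duplicates arise from the first generator, which leaves both four-level attributes unchanged and therefore pairs each profile with a partner that shifts back onto it.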

Survey administration and sample recruitment

The survey was administered through an electronic data collection website. The data

collection occurred in May 2010. An example choice set is provided in Figure 2.

Figure 2: An Example Choice Set

An online panel of respondents recruited by Pure Profile Pty was used for the survey.

The panel provider ensures that the members of the panel are broadly representative

of the Australian population. Each survey respondent was paid a small sum

(approximately $15), dependent on the time they spent answering the choice sets, to

complete the survey. They used a web link to access the survey, so were able to self-

complete at their convenience. To aid the respondent, a thorough description of the

task was provided at the beginning of the survey and a help button was available

throughout the task. This provided information on how to respond, but deliberately

did not provide any advice specific to the characteristics presented in the choice sets.

Respondents then completed the task for the 16 choice sets to which they had been

allocated. Following this, they answered a series of personal questions including gross

household income, smoking status, ethnicity, country of birth, number of dependents,

level of education, age and gender. Finally, they were asked how difficult they had

found the task, selecting one of five levels of difficulty ranging from very difficult to

very easy. They were also given the opportunity to provide a free-text response

outlining their impression of the survey.

Analysis

An additive utility function with gain in total life expectancy and the characteristics of

the potential recipients would be inappropriate because, as the gain from the

hypothetical health program tends to zero, the utility of the program should similarly

tend to zero. This is analogous to the zero condition implicit in the QALY model

(Bleichrodt and Johannesson, 1997, Bleichrodt, et al., 1997). Therefore, an amended

utility model was adopted, (denoted by Utility Function 1) in which the utility of

option j in choice set s for survey respondent i was assumed to be

$U_{isj} = \alpha\,GAIN_{isj} + \beta X'_{isj}\,GAIN_{isj} + \nu_i + \varepsilon_{isj}$, (1)

where GAIN is the gain in total life expectancy accruing to the hypothetical

population group if the intervention were implemented and $X'_{isj}$ is a set of

characteristics of the hypothetical population group (current total life expectancy

(dummy coded), gender, smoking status, carer status, whether they lead a healthy

lifestyle, income). The error term ($\nu_i + \varepsilon_{isj}$) consists of a person-specific error term

distributed iid normal and a conventional random error term distributed iid normal.

An important point to note is that the characteristics of potential patients are

investigated through two-factor interaction terms rather than through the main effect.

Thus, while the experimental design allows for two-factor interactions in the strict

sense, the utility function required for this type of investigation means that

interactions between patient characteristics (e.g. smoking x carer status) can not

necessarily be estimated in an unbiased way.

For the derivation of equity weights from this utility function, it is necessary to

identify the marginal utility of GAIN, which (dropping the subscript) is

$\frac{\partial U}{\partial GAIN} = \alpha + \beta X'$. (2)

To account for the possible non-linearity of utility with respect to gain in total life

expectancy, a more flexible utility function denoted as Utility Function 2 was

investigated

$U_{isj} = \alpha\,GAIN_{isj} + \beta X'_{isj}\,GAIN_{isj} + \rho\,GAIN_{isj}^2 + \tau X'_{isj}\,GAIN_{isj}^2 + \nu_i + \varepsilon_{isj}$, (3)

with a corresponding marginal utility of GAIN being

$\frac{\partial U}{\partial GAIN} = \alpha + \beta X' + 2\,GAIN\,(\rho + \tau X')$. (4)

The linearity of utility with respect to time is relaxed by introducing the $\rho\,GAIN^2$

term in Utility Function 2. In addition, the assumption that the change in total utility

associated with the health gain being received by a different group of hypothetical

recipients is independent of the total gain is relaxed by introducing

the $\tau X'_{isj}\,GAIN^2$ term. This is analogous to relaxing the assumption of risk neutrality

over life years in the QALY model (Bleichrodt, et al., 1997).
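For completeness, the marginal utility in Equation 4 follows from differentiating Utility Function 2 term by term with respect to GAIN (subscripts dropped, and treating the error terms as constant in GAIN):

```latex
\begin{align*}
U &= \alpha\,GAIN + \beta X'\,GAIN + \rho\,GAIN^{2} + \tau X'\,GAIN^{2} + \nu + \varepsilon \\
\frac{\partial U}{\partial GAIN} &= \alpha + \beta X' + 2\rho\,GAIN + 2\tau X'\,GAIN \\
&= \alpha + \beta X' + 2\,GAIN\,(\rho + \tau X')
\end{align*}
```

Setting $\rho = \tau = 0$ recovers the marginal utility under Utility Function 1 given in Equation 2.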

A random-effects (RE) probit was used to model the data. As this contains a person-

specific error term $\nu_i$, choices made by an individual will not be independent. Models

resulting from Utility Functions 1 and 2 are compared using the Akaike and Bayesian

Information Criteria which contrast the model fit accounting for the number of

parameters estimated (Akaike, 1974, Schwarz, 1978).
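As a rough check on how the two criteria penalise the additional parameters, the sketch below computes both from the rounded log likelihoods reported in Table 2. The parameter counts (11 and 20, counting the constant and the random-effect variance term) and the number of observations (553 respondents each answering 16 choice sets) are assumptions read off the reported results, not figures stated by the estimation output.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2 ln L."""
    return 2 * n_params - 2 * log_likelihood


def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: k ln(n) - 2 ln L."""
    return n_params * math.log(n_obs) - 2 * log_likelihood


# Assumed number of observations: 553 complete respondents x 16 choice sets.
n_obs = 553 * 16
models = [("Utility Function 1", -5570, 11), ("Utility Function 2", -5496, 20)]
for name, ll, k in models:
    print(name, round(aic(ll, k)), round(bic(ll, k, n_obs)))
```

Under these assumptions the values fall within a unit or two of those in Table 2, with the small differences attributable to the rounding of the reported log likelihoods, and both criteria are lower for Utility Function 2.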

There has been considerable discussion regarding the appropriate techniques for

deriving welfare measures from stated preference experiments, with the leading

candidates being marginal rates of substitution (MRS) (McIntosh and Ryan, 2002)

and the Hicksian compensating variation (CV) (Lancsar, et al., 2007, Lancsar and

Savage, 2004, Ryan, 2004, Santos Silva, 2004). In this study, an approach similar to

the MRS was employed using GAIN as the numeraire. Thus, the value of an

additional year of life for a hypothetical group is divided by the value of an additional

year for some reference group. For convenience, this reference group was selected to

be the ‘average’ group in society, under the assumptions that 50% of people in society

are female, that 50% have above average income, that 50% have a healthy lifestyle,

that 20% are smokers, that 40.8% are carers, and that the average person has a total

life expectancy of 75. The carer figure is a composite term including the 2.6 million

Australians estimated by the Australian Bureau of Statistics to provide assistance to

those who needed help because of disability or old age, the 2.363 million couple

families with children (so 4.726 million parents) and the 1.944 million single parents

(both parenting statistics are taken from the 2006 census (Australian Bureau of

Statistics, 2006)), divided by the estimated total population as of 12th October 2011 of

22.731 million (Australian Bureau of Statistics, 2011). Thus, the marginal utility of

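GAIN for this reference group under Utility Function 1 is $\alpha + \beta \bar{X}'$, where $\bar{X}'$ denotes the characteristics of the reference group, and the equity

weight $E_g$ for the hypothetical group $g$ with characteristics $X'_g$ is then

$E_g = \frac{\alpha + \beta X'_g}{\alpha + \beta \bar{X}'}$. (5)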

Confidence intervals for each of the equity weights were bootstrapped using 50

replications.
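The calculation in Equation 5 can be illustrated with the Utility Function 1 point estimates reported in Table 2. The sketch below is not the code used for the analysis: the dictionary keys and function names are hypothetical, the reference group follows the population shares stated above, and the carer share is recomputed from the quoted counts rather than rounded to 40.8%.

```python
# Utility Function 1 point estimates as reported in Table 2.
ALPHA = 0.1092  # coefficient on GAIN
BETA = {
    "female": 0.0035, "high_income": -0.0079, "smoker": -0.0739,
    "healthy": 0.0154, "carer": 0.0317,
    "le45": 0.0140, "le60": 0.0097, "le75": -0.0094,
}

# Carer share of the population from the counts quoted in the text (millions):
# carers of the disabled or aged, parents in couple families, single parents.
carer_share = (2.6 + 2 * 2.363 + 1.944) / 22.731

# Reference ("average") group as described in the text.
REFERENCE = {
    "female": 0.5, "high_income": 0.5, "smoker": 0.2,
    "healthy": 0.5, "carer": carer_share, "le75": 1.0,
}


def marginal_utility(group):
    """Equation 2: alpha + beta X' for a group given as attribute values."""
    return ALPHA + sum(BETA[name] * value for name, value in group.items())


def equity_weight(group):
    """Equation 5: marginal utility of GAIN relative to the reference group."""
    return marginal_utility(group) / marginal_utility(REFERENCE)


# Male, high income, smoker, healthy lifestyle, carer, life expectancy of 30
# (the base level, so no life-expectancy dummy is switched on).
example = {"high_income": 1, "smoker": 1, "healthy": 1, "carer": 1}
print(round(carer_share, 3), round(equity_weight(example), 2))
```

For this example group the sketch gives a weight of about 0.72, which can be compared with the corresponding male entry in Table 3. Because the published coefficients are rounded, other weights recomputed this way may differ from the tabulated values in the second decimal place.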

Clearly, this approach can be replicated under the more flexible Utility Function 2

replacing the marginal utility term for GAIN with that stated in Equation 4. Due to the

large number of combinations of individuals and values of GAIN, four hypothetical

groups are constructed and weights are generated for two different time points (5

years and 10 years). The four hypothetical groups are (1) the most favoured type of

person; (2) the least favoured type of person; (3) a high-earning female who is a

smoker but otherwise has a healthy lifestyle, with a life expectancy of 45; and (4) a

low-earning male who is a non-smoker but leads an unhealthy lifestyle, is not a carer,

with a life expectancy of 75 years.

The advantage of anchoring equity weights such that the mean respondent is valued at

1 is that, if an intervention increases life expectancy by a fixed amount for all

members of society (and hence does not affect differentials between individuals), it

has the same incremental QALYs whether the QALYs are equity-weighted or not.

Additionally, ‘rules-of-thumb’ relating to acceptable ICERs (such as NICE’s £20,000-

£30,000 per QALY) remain relevant under this equity weights system (National

Institute for Health and Clinical Excellence, 2008).

Observable heterogeneity

To investigate heterogeneity in responses, most of the demographic characteristics of

respondents were able to be matched with the dimensions and levels within the DCE

(the exception being total life expectancy which could not be easily determined for the

survey respondent). Thus, for example, it was possible to determine if smokers

differed in their response pattern to non-smokers or ex-smokers, both in terms of

smoking and other dimensions. While blocking can impact on the reliability of

subgroup analysis, the subgroups were large enough to ensure that all blocks were

included in all subgroups. Using the simpler Utility Function 1, the RE probit was re-

run with the sample split as per the dimensions in the experiment. Thus, the results of

male respondents were contrasted with the response of female respondents for

example. To facilitate comparison between samples, it was necessary to account for

scale effects by dividing through by one of the coefficients; this is analogous to

comparing willingness to pay estimates. In this case, the coefficient on GAIN was

used as numeraire. A likelihood ratio test was administered for each analysis, with the

more constrained model being that estimated in Equation 1, and the less constrained

one adding interaction terms between the survey respondent characteristic of interest

and each of the parameters estimated in the model.

Results

Seven hundred and forty nine people entered the survey and were eligible to

participate. Thirty-two of these were excluded as the sample had reached its

maximum quota. Of the remaining 717 respondents, 616 answered at least one choice

set (i.e. they did not withdraw before the task began). Of these, 553 completed all

choice sets within the survey, giving a completion rate of 89.8% relative to those that

started the task (and were therefore randomised to a block), and 77.1% relative to the

population who entered the task and were willing to participate. The free text

responses generally suggested the respondents understood the task, and provided

some reasons for the choices they made in the experiment. Of these 553, one

respondent completed the choice task (and formed part of the analysis set) but did not

complete the demographic section. Table 1 outlines some basic characteristics of the

sample of 552 relative to the general Australian population.

Table 1: Representativeness of DCE Sample

| Characteristic | Value / Range | Sample | Population |
|---|---|---|---|
| Gender | Female | 56.16% | 56.09% |
| Age (years) | 16-29 | 26.63% | 21.33% |
| | 30-44 | 34.96% | 23.98% |
| | 45-59 | 23.01% | 22.40% |
| | 60-74 | 11.05% | 14.00% |
| | 75+ | 0.54% | 18.29% |
| Highest level of education | No further / higher education | 33.69% | 60.51% |
| | Trade certificate | 30.43% | 22.24% |
| | Bachelor’s degree or above | 35.87% | 17.26% |
| Gross household income | <$20,000 | 7.84% | 15.77% |
| | $20,000 - $40,000 | 15.88% | 23.02% |
| | $40,001 - $60,000 | 20.59% | 17.64% |
| | $60,001 - $80,000 | 17.84% | 13.87% |
| | $80,001 - $100,000 | 15.29% | 11.03% |
| | $100,001 + | 22.55% | 18.67% |
| Smoking | Current smoker | 18.66% | 23.00% |
| | Past smoker | 26.99% | 30.00% |
| | Never smoker | 54.35% | 47.00% |
| Carer | Unpaid family carers / total Australian population | 19.75% | 11.89% |

Note: All data sourced from ABS (Australian Bureau of Statistics, 2002, Australian Bureau of Statistics, 2005, Australian Bureau of Statistics, 2006, Australian Bureau of Statistics, 2007), other than the proportion of carers (Carers Australia website).

The representativeness of the sample differs by characteristic. The gender breakdown

is close to the population. Those over 75 years old are under-represented, which is a

problem for generalisability in that group. People in the sample are more highly

educated and have a higher income than the population average.

Considering choice sets in which the gain differed between groups, the proportion in

which the option producing the fewer years of additional life expectancy was selected

was 32.3%. Thus, gain is important, but not the sole determinant of choice in the

experiment.

Similarly, of the 553 complete respondents, 106 never selected an option involving

the fewer number of additional years of life. This means that the remaining 447 were

willing to trade aggregate life years in order to focus health gain towards specific

members of society.

Table 2: RE Probit Results

| Mean (SE) | Utility Function 1 | Utility Function 2 |
|---|---|---|
| Constant | -0.0350 (0.0139)** | -0.0351 (0.0140)** |
| Gain (years) | 0.1092 (0.0068)*** | 0.2089 (0.0282)*** |
| Gain x female | 0.0035 (0.0024) | -0.0043 (0.0095) |
| Gain x high income | -0.0079 (0.0028)*** | -0.0252 (0.0103)** |
| Gain x smoker | -0.0739 (0.0033)*** | -0.1851 (0.0136)*** |
| Gain x healthy life | 0.0154 (0.0046)*** | 0.0487 (0.0163)*** |
| Gain x carer | 0.0317 (0.0027)*** | 0.1041 (0.0108)*** |
| Gain x LE45 | 0.0140 (0.0054)** | 0.0240 (0.0198) |
| Gain x LE60 | 0.0097 (0.0062) | 0.0347 (0.0223) |
| Gain x LE75 | -0.0094 (0.0055)* | -0.0211 (0.0199) |
| Gain² (years) | | -0.0096 (0.0027)*** |
| Gain² x female | | 0.0009 (0.0011) |
| Gain² x high income | | 0.0020 (0.0011)* |
| Gain² x smoker | | 0.0139 (0.0016)*** |
| Gain² x healthy life | | -0.0037 (0.0017)** |
| Gain² x carer | | -0.0088 (0.0013)*** |
| Gain² x LE45 | | -0.0011 (0.0021) |
| Gain² x LE60 | | -0.0027 (0.0023) |
| Gain² x LE75 | | 0.0013 (0.0021) |
| Lnsig2u | -13.2502 (10.0324) | -13.5833 (11.083) |
| Sigma u | 0.0013 (0.0067) | 0.0011 (0.0062) |
| Log likelihood | -5570 | -5496 |
| AIC | 11161 | 11032 |
| BIC | 11239 | 11174 |

Levels of statistical significance: *=10%; **=5%; ***=1%

The RE probit results are presented in Table 2. Under Utility Function 1, respondents

were willing to discriminate in favour of programmes with a greater health gain, and

in favour of recipients who had a lower income, were non-smokers, were carers, or had life

expectancies of 45 years (relative to those with the base total life expectancy of 30 years).

Under Utility Function 2, the coefficients on the linear component of GAIN show a

similar pattern to that in Utility Function 1, but cannot be easily compared as further

interaction terms are estimated. The quadratic terms are statistically significant at the

5% level for the main effect on GAIN (suggesting diminishing marginal utility of

time), and on smoking (positive), healthy lifestyles and carer status (both negative).

Thus, the discrimination against smokers exhibited throughout is relatively larger for

smaller values of GAIN, as is the discrimination in favour of those with healthy

lifestyles or with dependents. Under

both Information Criteria, Utility Function 2 is preferred. A set of equity weights

based on this utility function can be generated, but are necessarily time specific (as

shown in Equation 4). Therefore, the relative ranking of two different hypothetical

recipients of health care depends on the number of additional years they would each

receive. In other words, if x extra years of life expectancy for group

A is preferred to x years extra life expectancy for group B, this relative preference

does not necessarily hold if each group receives y years. This is a significant

complication to the operationalisation of an equity weights system, although the weights can be

generated in the same way using Equations 4 and 5. However, the additional

complexity of such an approach leads us to recommend the results from Utility

Function 1 as the preferred option. The equity weights based on Utility Function 1

are presented in Table 3, with the weights plotted in a histogram in Figure 3.

Table 3: Equity Weights

| Income | Smoker | Healthy life? | Carer | Life Expectancy | Male Equity Weight (95% CI) | Female Equity Weight (95% CI) |
|---|---|---|---|---|---|---|
| High | Yes | Yes | Yes | 30 | 0.72 (0.58-0.86) | 0.75 (0.62-0.89) |
| High | Yes | Yes | Yes | 45 | 0.86 (0.71-1.00) | 0.89 (0.73-1.05) |
| High | Yes | Yes | Yes | 60 | 0.81 (0.71-0.92) | 0.85 (0.72-0.98) |
| High | Yes | Yes | Yes | 75 | 0.63 (0.54-0.71) | 0.66 (0.59-0.74) |
| High | Yes | Yes | No | 30 | 0.41 (0.30-0.53) | 0.45 (0.32-0.58) |
| High | Yes | Yes | No | 45 | 0.55 (0.43-0.67) | 0.58 (0.45-0.72) |
| High | Yes | Yes | No | 60 | 0.51 (0.38-0.64) | 0.54 (0.44-0.64) |
| High | Yes | Yes | No | 75 | 0.32 (0.23-0.42) | 0.36 (0.26-0.45) |
| High | Yes | No | Yes | 30 | 0.57 (0.44-0.70) | 0.61 (0.49-0.72) |
| High | Yes | No | Yes | 45 | 0.71 (0.57-0.85) | 0.74 (0.63-0.85) |
| High | Yes | No | Yes | 60 | 0.67 (0.55-0.78) | 0.70 (0.60-0.80) |
| High | Yes | No | Yes | 75 | 0.48 (0.39-0.57) | 0.51 (0.43-0.60) |
| High | Yes | No | No | 30 | 0.27 (0.15-0.38) | 0.30 (0.16-0.44) |
| High | Yes | No | No | 45 | 0.40 (0.28-0.52) | 0.43 (0.28-0.59) |
| High | Yes | No | No | 60 | 0.36 (0.26-0.46) | 0.39 (0.28-0.51) |
| High | Yes | No | No | 75 | 0.17 (0.05-0.30) | 0.21 (0.07-0.35) |
| High | No | Yes | Yes | 30 | 1.43 (1.27-1.60) | 1.47 (1.27-1.66) |
| High | No | Yes | Yes | 45 | 1.57 (1.39-1.75) | 1.60 (1.42-1.78) |
| High | No | Yes | Yes | 60 | 1.53 (1.35-1.71) | 1.56 (1.40-1.72) |
| High | No | Yes | Yes | 75 | 1.34 (1.25-1.43) | 1.38 (1.32-1.44) |
| High | No | Yes | No | 30 | 1.13 (0.99-1.26) | 1.16 (1.05-1.28) |
| High | No | Yes | No | 45 | 1.26 (1.13-1.40) | 1.30 (1.12-1.47) |
| High | No | Yes | No | 60 | 1.22 (1.09-1.35) | 1.26 (1.11-1.40) |
| High | No | Yes | No | 75 | 1.04 (0.98-1.09) | 1.07 (1.01-1.14) |
| High | No | No | Yes | 30 | 1.29 (1.16-1.42) | 1.32 (1.19-1.45) |
| High | No | No | Yes | 45 | 1.42 (1.27-1.57) | 1.46 (1.30-1.62) |
| High | No | No | Yes | 60 | 1.38 (1.23-1.53) | 1.41 (1.25-1.56) |
| High | No | No | Yes | 75 | 1.20 (1.13-1.26) | 1.23 (1.17-1.29) |
| High | No | No | No | 30 | 0.98 (0.87-1.09) | 1.01 (0.88-1.15) |
| High | No | No | No | 45 | 1.12 (1.00-1.23) | 1.15 (1.02-1.28) |
| High | No | No | No | 60 | 1.07 (0.93-1.22) | 1.11 (0.97-1.24) |
| High | No | No | No | 75 | 0.89 (0.82-0.96) | 0.92 (0.85-1.00) |
| Low | Yes | Yes | Yes | 30 | 0.80 (0.67-0.93) | 0.83 (0.70-0.96) |
| Low | Yes | Yes | Yes | 45 | 0.93 (0.79-1.07) | 0.97 (0.82-1.11) |
| Low | Yes | Yes | Yes | 60 | 0.89 (0.77-1.01) | 0.92 (0.77-1.07) |
| Low | Yes | Yes | Yes | 75 | 0.71 (0.63-0.78) | 0.74 (0.65-0.83) |
| Low | Yes | Yes | No | 30 | 0.49 (0.37-0.61) | 0.52 (0.42-0.63) |
| Low | Yes | Yes | No | 45 | 0.63 (0.51-0.74) | 0.66 (0.55-0.77) |
| Low | Yes | Yes | No | 60 | 0.58 (0.47-0.69) | 0.62 (0.49-0.75) |
| Low | Yes | Yes | No | 75 | 0.40 (0.30-0.50) | 0.43 (0.33-0.53) |
| Low | Yes | No | Yes | 30 | 0.65 (0.52-0.78) | 0.68 (0.57-0.80) |
| Low | Yes | No | Yes | 45 | 0.78 (0.64-0.92) | 0.82 (0.69-0.94) |
| Low | Yes | No | Yes | 60 | 0.74 (0.64-0.85) | 0.78 (0.67-0.88) |
| Low | Yes | No | Yes | 75 | 0.56 (0.46-0.65) | 0.59 (0.50-0.69) |
| Low | Yes | No | No | 30 | 0.34 (0.23-0.45) | 0.38 (0.25-0.50) |
| Low | Yes | No | No | 45 | 0.48 (0.35-0.60) | 0.51 (0.39-0.64) |
| Low | Yes | No | No | 60 | 0.44 (0.32-0.55) | 0.47 (0.36-0.58) |
| Low | Yes | No | No | 75 | 0.25 (0.11-0.39) | 0.28 (0.19-0.38) |
| Low | No | Yes | Yes | 30 | 1.51 (1.35-1.67) | 1.55 (1.39-1.70) |
| Low | No | Yes | Yes | 45 | 1.65 (1.43-1.86) | 1.68 (1.50-1.86) |
| Low | No | Yes | Yes | 60 | 1.61 (1.43-1.78) | 1.64 (1.49-1.78) |
| Low | No | Yes | Yes | 75 | 1.42 (1.34-1.50) | 1.45 (1.37-1.53) |
| Low | No | Yes | No | 30 | 1.20 (1.08-1.33) | 1.24 (1.11-1.37) |
| Low | No | Yes | No | 45 | 1.34 (1.17-1.51) | 1.37 (1.20-1.55) |
| Low | No | Yes | No | 60 | 1.30 (1.18-1.42) | 1.33 (1.19-1.47) |
| Low | No | Yes | No | 75 | 1.11 (1.05-1.17) | 1.15 (1.08-1.21) |
| Low | No | No | Yes | 30 | 1.36 (1.19-1.53) | 1.40 (1.26-1.53) |
| Low | No | No | Yes | 45 | 1.50 (1.34-1.66) | 1.53 (1.35-1.72) |
| Low | No | No | Yes | 60 | 1.46 (1.32-1.59) | 1.49 (1.32-1.66) |
| Low | No | No | Yes | 75 | 1.27 (1.21-1.33) | 1.31 (1.23-1.38) |
| Low | No | No | No | 30 | 1.06 (0.95-1.16) | 1.09 (0.98-1.20) |
| Low | No | No | No | 45 | 1.19 (1.06-1.33) | 1.23 (1.07-1.38) |
| Low | No | No | No | 60 | 1.15 (1.02-1.27) | 1.18 (1.07-1.30) |
| Low | No | No | No | 75 | 0.97 (0.89-1.03) | 1.00 (0.94-1.06) |

The reference group was selected to be the ‘average’ group in society, under the assumptions that 50% of people in society are female, that 50% have above average income,

that 50% have a healthy lifestyle, that 20% are smokers, that 40.8% are carers, and that the average person has a total life expectancy of 75.
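To make the role of the reference group concrete, the following is a minimal sketch of one way an equity weight could be expressed relative to the 'average' group described above. It assumes a utility function that is linear in GAIN, with the weight taken as the ratio of a group's marginal utility of GAIN to that of the reference group. The coefficient values, the dictionary keys and the function names are hypothetical and are not estimates or methods taken from this paper; only the population shares in REFERENCE come from the text.

```python
# Hypothetical coefficients on GAIN and on GAIN x attribute interactions.
# Illustrative values only; they are NOT estimates reported in the paper.
COEF = {
    "gain": 0.100,
    "female": 0.004,
    "high_income": -0.008,
    "healthy_lifestyle": 0.015,
    "smoker": -0.070,
    "carer": 0.030,
    "le_30": 0.009,
    "le_45": 0.022,
    "le_60": 0.018,
}

# Reference ('average') group, using the population shares stated in the text.
# A total life expectancy of 75 is treated as the omitted category (assumption),
# so all life-expectancy dummies are zero for the reference group.
REFERENCE = {
    "female": 0.5,
    "high_income": 0.5,
    "healthy_lifestyle": 0.5,
    "smoker": 0.2,
    "carer": 0.408,
    "le_30": 0.0,
    "le_45": 0.0,
    "le_60": 0.0,
}


def marginal_utility(profile):
    """Utility per unit of GAIN for a group; attributes not listed count as 0."""
    return COEF["gain"] + sum(COEF[name] * value for name, value in profile.items())


def equity_weight(profile):
    """Marginal utility of GAIN for the group relative to the reference group."""
    return marginal_utility(profile) / marginal_utility(REFERENCE)


# Example group: female, low income, non-smoker, healthy lifestyle, carer,
# total life expectancy of 60.
favoured = {"female": 1, "healthy_lifestyle": 1, "carer": 1, "le_60": 1}
print(round(equity_weight(favoured), 2))
```

By construction the reference group receives a weight of exactly one under this sketch, so weights above (below) one indicate groups whose gains are valued more (less) than those of the average group.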

Figure 3: Distribution of Equity Weights

[Figure: density of equity weights; x-axis: Equity Weight (0 to 2); y-axis: Density]

Under Utility Function 2, four hypothetical groups are presented alongside their

equity weights for a GAIN of 5 and 10 years in Table 4.

Table 4: Some Selected Equity Weights under Utility Function 2

Person | Gain = 5 years | Gain = 10 years
Most favoured (female, low income, non-smoker, healthy lifestyle, carer, life expectancy of 60) | 1.623 | 1.150
Person 2 (female, high income, smoker, healthy lifestyle, carer, life expectancy of 45) | 0.828 | 0.805
Person 3 (male, low income, non-smoker, unhealthy lifestyle, non-carer, life expectancy of 30) | 0.958 | 0.848
Least favoured (male, high income, smoker, unhealthy lifestyle, non-carer, life expectancy of 75) | 0.092 | 0.402
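As a quick arithmetic check on Table 4, the short sketch below uses only the eight weights reported in the table to ask how far each weight lies from one at each value of GAIN. The variable and function names are illustrative. It shows that the most and least favoured groups move towards one as GAIN rises from 5 to 10 years, while Persons 2 and 3 move slightly away from it, and that the overall spread between the highest and lowest weight roughly halves.

```python
# Equity weights copied from Table 4: (GAIN = 5 years, GAIN = 10 years).
TABLE_4 = {
    "Most favoured": (1.623, 1.150),
    "Person 2": (0.828, 0.805),
    "Person 3": (0.958, 0.848),
    "Least favoured": (0.092, 0.402),
}


def distance_from_one(weight):
    """Absolute distance between an equity weight and the unweighted value of 1."""
    return abs(weight - 1.0)


def spread(column):
    """Difference between the largest and smallest weight in one column."""
    weights = [row[column] for row in TABLE_4.values()]
    return round(max(weights) - min(weights), 3)


# True where the weight at GAIN = 10 is closer to one than at GAIN = 5.
moved_closer = {
    person: distance_from_one(w10) < distance_from_one(w5)
    for person, (w5, w10) in TABLE_4.items()
}

for person, closer in moved_closer.items():
    direction = "closer to" if closer else "further from"
    print(f"{person}: {direction} one at GAIN = 10")
print("Spread at GAIN = 5:", spread(0))
print("Spread at GAIN = 10:", spread(1))
```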

The results under Utility Function 2 suggest that the equity weight would be sensitive

to the value of GAIN for some combinations of individuals (for example, the most

and least favoured). The sensitivity will depend on the relative size of the coefficients

on the quadratic and linear terms in GAIN that apply to a group; as the former

becomes large relative to the latter, the equity weight becomes more

variable over GAIN. Over all 128 hypothetical groups, the tendency is for smaller

equity weights for higher values of GAIN. This can be explained by the tendency for

the signs of the linear and quadratic terms to be opposite to one another. If the

coefficient on the linear term (e.g. Gain x female) is negative (positive) in Utility

Function 2, the coefficient on the quadratic term (e.g. Gain² x female) appears to tend

to be positive (negative). Thus, as GAIN increases, equity weights tend towards one,

and hence towards a more conventional QALY-type model.

Observable heterogeneity

The results of Utility Function 1 if the sample is divided by gender of the survey

respondent are provided graphically in Figure 4.

Figure 4: Results by gender of respondent

The major difference between male and female responses lies in their attitudes

towards gains accruing to hypothetical cohorts of people of different genders. Both

discriminate heavily in favour of their own gender (p<0.01 in both cases). Also, male

respondents were more likely to demonstrate the Fair Innings type argument,

discriminating against those expected to live until 75 relative to those not expected to

have such longevity. While the coefficients across other dimensions appear to follow

a similar pattern, a likelihood ratio (LR) test rejects the null hypothesis of common

coefficients, meaning the less constrained model, which allows responses to differ by

respondent gender, is a better fit (p<0.01).
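The likelihood ratio comparison used here can be sketched as follows. This is a generic illustration of the test, not the paper's estimation code: the log-likelihood values and the number of extra parameters are invented for illustration, and the function names are hypothetical. The chi-square tail probability is computed with a standard-library series expansion so that the block has no external dependencies.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with df degrees
    of freedom, using the series for the regularised lower incomplete gamma
    function. Adequate for moderate test statistics, as in this illustration."""
    if x <= 0:
        return 1.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= z / (a + n)
        total += term
    lower = total * math.exp(a * math.log(z) - z - math.lgamma(a))
    return max(0.0, 1.0 - lower)


def lr_test(loglik_pooled, loglik_split, extra_params):
    """LR statistic and p-value for a pooled (restricted) model against a model
    that lets coefficients differ by respondent subgroup (unrestricted)."""
    stat = 2.0 * (loglik_split - loglik_pooled)
    return stat, chi2_sf(stat, extra_params)


# Hypothetical log-likelihoods, for illustration only (not from the paper).
stat, p = lr_test(loglik_pooled=-2500.0, loglik_split=-2480.0, extra_params=10)
print(f"LR = {stat:.1f}, p = {p:.4g}")
```

A small p-value rejects the pooled model in favour of the model with subgroup-specific coefficients; a large p-value means the test fails to reject the pooled model, so splitting the sample does not improve fit.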

The graphical comparison of responses by smoking status considers three sub-groups,

namely smokers, former smokers and people who have never been smokers. These

results are illustrated in Figure 5.

Figure 5: Results by smoking status

Again, the respondents tended to display quite different preferences in the dimension

over which the sample was split. On average, smokers did not strongly favour either

smokers or non-smokers. However, the other two groups strongly discriminate against

smokers. As with the case of gender, the LR test rejects the null hypothesis (p<0.01).

Finally, as younger people were over-represented in the sample, the oldest 25% of

the sample (aged 51 and older) was compared with the remainder of the sample. These

results are presented in Figure 6.

Figure 6: Results by age of respondent

The point estimates suggest that the older cohort discriminates less against those

with a longer life expectancy; indeed, ceteris paribus, they would prefer a health program for

people with a total life expectancy of 75 relative to one for people with a total life

expectancy of 30. However, the LR test fails to reject the null hypothesis (p=0.1673),

meaning that distinguishing the responses of older people from the rest does not

improve model fit.

Conclusion and Discussion

This paper has demonstrated that simple maximisation of total health is not the

criterion on which people make health allocation decisions, emphasising the results

reported by Dolan et al. (2005) but allowing for characteristics of the respondent to be

considered simultaneously (while the studies reported by Dolan et al. tend to focus on a

smaller subset of characteristics, often only one). The average survey respondent was

willing to target health gain towards carers and non-smokers, even if it reduces total

life expectancy across the population. Willingness to discriminate based on the other

attributes was less strong. Additionally, some characteristics of respondents were

strong predictors of their responses. In general, people relatively favoured health gains

accruing to people with similar characteristics to themselves. The patterns regarding

total life expectancy were less clear cut, but suggested that gains accruing to people

who can expect a typical ‘Fair Innings’ are valued less than gains accruing to people

who might not receive this allocation.

The conclusion in this study that equity weights can differ significantly from 1

contrasts with that of Lancsar et al. (2011), who argue that weighting QALYs is generally not

appropriate, and would be unlikely to significantly impact on the scale of the gain

accruing from a healthcare intervention. There are five possible explanations for this

divergence. Firstly, it might be that respondents in the two countries (Australia and

the UK) hold different views. Secondly, the way the question was posed may drive

the result. Inadvertent emphasis of certain aspects of the choice may cause results in

different experiments to differ; a recent example in the case of colorectal cancer

screening investigated this phenomenon (Howard and Salkeld, 2009). Thirdly, the two

studies consider quite different dimensions, particularly as those considered here

allow for a non-symmetrical indifference curve as illustrated in Figure 1.

Additionally, in dimensions that are common to both experiments (such as total life

expectancy), the levels were different. It might be argued that the dimensions selected

by Lancsar et al. are ones over which preferences are not strong (and so unlikely to

override the conventional maximisation of LYs or QALYs). Fourthly, it might be that the

method presented here for converting regression results into equity weights produces

different weights to those that would have been produced using the compensating

variation, as employed by Lancsar et al. If the CV is applied to our results, the impact

is actually fairly small. Weights derived through CV are higher, but by a maximum of

0.126. Finally, the non-linear utility function estimated here suggested that longer

periods of GAIN moved the equity weighting system towards a QALY-type model.

Since Lancsar et al. consider a larger spread of time in their experiment, it might be

that the two studies are simply reporting different parts of a common preference

curve.

The study has a number of potential limitations. While the panel was broadly

representative of the Australian population, it is arguable that membership of an

online panel is correlated with certain unobservable characteristics. With regard to

observable characteristics of respondents, our sample was generally younger than the

Australian population, although it was concluded that younger and older cohorts of

respondents displayed similar patterns of responses (in that the LR test failed to reject

the null).

As with many choice experiments, it is plausible to argue that the characteristics

investigated in this study form only a subset of those which might be important. The

trade-off between the number of choice sets in the experiment and the range of issues

that could be considered means this problem will be omnipresent; future research

might consider other areas in which people might discriminate when demonstrating

preferences for allocation of health. Future work should consider the recently

developed principles for identifying appropriate dimensions and levels

proposed by Coast et al. (2011).

The experimental design used in this work did not explicitly allow for the

consideration of interactions between characteristics of the hypothetical patients. It is

plausible that society may, for example, be willing to discriminate against smokers,

but only if those smokers had a high income. This area of research should be explored

using an experimental design which is constructed for this purpose. While the RE-

probit does account for the panel nature of the data, it should be noted that more

flexible ways of characterising heterogeneity have been proposed, and demonstrate

considerable promise (Fiebig, et al., 2010). However, in the context of generating

societal preferences for health gains, the mean response is the most important issue;

identifying the degree of agreement or disagreement with this mean is of interest, but

any attempt to implement equity weights in practice is likely to have ample practical

obstacles to overcome already.

Equity weights are conceptually straightforward, but have proven difficult to generate,

or to employ in economic evaluation. Additionally, there are ethical issues which have

to be addressed before advocating the use of equity weights. If societal preferences

are defined by a subset of the population, and they favour health gains accruing to

themselves above health gains accruing to others, that is clearly unsatisfactory,

because the distributional weights are dependent on the choice of subset. If, however,

societal preferences reflect all individual preferences equally (notwithstanding the

difficulty of doing so), it has been argued that these preferences ought to

satisfy some ethical constraints (Broome, 1991; Richardson and McKie, 2005). While

this idea is appealing, the specification of these ethical constraints is difficult.

Arguably, identifying a set of ethical constraints which is broadly acceptable to

society means that these constraints will never act on the resource allocation decisions

they were designed for. In this paper, these issues are not explicitly considered;

however, any attempt to operationalise equity weights for economic evaluation needs

to address them.

References

Akaike H. 1974. A new look at the statistical model identification. IEEE Transactions

on Automatic Control 19: 716-723.

Australian Bureau of Statistics. 2006 Census Data by Location. Available via

http://www.censusdata.abs.gov.au. Accessed 12th October 2011.

Australian Bureau of Statistics. Population Clock. Available via

http://www.abs.gov.au/ausstats/abs. Accessed 12th October 2011.

Bleichrodt H, Johannesson M. 1997. The validity of QALYs: An experimental test of

constant proportional tradeoff and utility independence. Medical Decision

Making 17: 21-32.

Bleichrodt H, Wakker P, Johannesson M. 1997. Characterizing QALYs by risk

neutrality. Journal of Risk and Uncertainty 15: 107-114.

Broome J. 1991. Weighing goods. Blackwell: Oxford.

Coast J, Al-Janabi H, Sutton EJ, Horrocks SA, Vosper AJ, Swancutt DR, et al. 2011.

Using qualitative methods for attribute development for discrete choice

experiments: Issues and recommendations. Health Economics.

doi: 10.1002/hec.1739.

Coast J, Flynn TN, Salisbury C, Louviere J, Peters TJ. 2006. Maximising responses to

discrete choice experiments: A randomised trial. Appl Health Econ Health

Policy 5: 249-260.

Dolan P, Shaw R, Tsuchiya A, Williams A. 2005. QALY maximisation and people's

preferences: A methodological review of the literature. Health Economics 14:

197-208.

Fiebig D, Keane M, Louviere J, Wasi N. 2010. The generalized multinomial logit

model: Accounting for scale and coefficient heterogeneity. Marketing Science

29: 393-421.

Hall J, Fiebig DG, King MT, Hossain I, Louviere JJ. 2006. What influences

participation in genetic carrier testing? Results from a discrete choice

experiment. Journal of Health Economics 25: 520-537.

Howard K, Salkeld G. 2009. Does attribute framing in discrete choice experiments

influence willingness to pay? Results from a discrete choice experiment in

screening for colorectal cancer. Value in Health 12: 354-363.

Lancsar E, Louviere J, Flynn T. 2007. Several methods to investigate relative attribute

impact in stated preference experiments. Social Science and Medicine 64:

1738-1753.

Lancsar E, Savage E. 2004. Deriving welfare measures from discrete choice

experiments: Inconsistency between current methods and random utility and

welfare theory. Health Economics 13: 901-907.

Lancsar E, Wildman J, Donaldson C, Ryan M, Baker R. 2011. Deriving distributional

weights for QALYs through discrete choice experiments. Journal of Health

Economics 30: 466-478.

McIntosh E, Ryan M. 2002. Using discrete choice experiments to derive welfare

estimates for the provision of elective surgery: Implications for discontinuous

preferences. Journal of Economic Psychology 23: 367-382.

National Institute for Health and Clinical Excellence. 2008. Social value judgements:

Principles for the development of NICE guidance (2nd edition). NICE:

London.

Norman R, Gallego G. 2008. Equity weights for economic evaluation: An Australian

discrete choice experiment, CHERE Working Paper 2008/5. CHERE: Sydney.

Olsen JA, Richardson J, Dolan P, Menzel P. 2003. The moral relevance of personal

characteristics in setting health care priorities. Social Science and Medicine

57: 1163-1172.

Richardson J, McKie J. 2005. Empiricism, ethics and orthodox economic theory:

What is the appropriate basis for decision-making in the health sector? Social

Science and Medicine 60: 265-275.

Ryan M. 2004. Deriving welfare measures in discrete choice experiments: A comment

to Lancsar and Savage (1). Health Economics 13: 909-912; discussion 919-

924.

Santos Silva JM. 2004. Deriving welfare measures in discrete choice experiments: A

comment to Lancsar and Savage (2). Health Economics 13: 913-918;

discussion 919-924.

Schwappach DLB. 2003. Does it matter who you are or what you gain? An

experimental study of preferences for resource allocation. Health Economics

12: 255-267.

Schwarz GE. 1978. Estimating the dimensions of a model. Annals of Statistics 6: 461-

464.

Street DJ, Burgess L. 2007. The construction of optimal stated choice experiments:

Theory and methods. Wiley: Hoboken, New Jersey.

Tobin J. 1970. On limiting the domain of inequality. Journal of Law and Economics

13: 263-277.

Williams A. 1997. Intergenerational equity: An exploration of the 'fair innings'

argument. Health Economics 6: 117-132.
