
SP Experimental Designs - Theoretical Background and Case Study

Basil Schmid

IVT, ETH Zurich

Measurement and Modeling, FS 2016

Outline

1. Introduction

2. Orthogonal and fractional factorial designs

3. Efficient designs

4. Pivot designs

5. Testing a design: A case study

6. Conclusions


Introduction

Explain how the variation of certain attributes affects the outcome of interest (causal relationship), applying a statistically efficient and effective framework (maximum amount of information with minimum amount of resources)

Kuhfeld (1994): "The best approach to design creation is to use the computer as a tool along with traditional design skills, not as a substitute for thinking about the problem"


A brief history...

• 1747: While serving as surgeon on HMS Salisbury, James Lind carried out a systematic clinical trial to compare treatments for patients with scurvy (a disease caused by a lack of vitamin C)

• Entry requirements to reduce exogenous variation

• 12 seamen were assigned to 6 treatment groups, each receiving a different diet over a 2-week period

• Other examples: Agriculture, marketing, economics

• Sir Ronald Fisher (1935): Experiments are "experience carefully planned in advance, and designed to form a secure basis of new knowledge":
  – manipulation/variation of (existing) attributes
  – formation of attribute levels
  – observation/measurement of outcomes


Experimental design

• In contrast to revealed preference (RP) data, stated preference (SP) data are generated by some systematic and planned design process → SP data may provide insights into a hypothetical market for which no RP data are available

• Formulation of statistical hypotheses to be tested

• Specification of the number of experimental units (observations) required and the population from which they will be sampled

• Specification of the randomization procedure for assigning the experimental units to the attribute levels: Sources of variation among the units are distributed over the entire experiment

• Determination of the statistical analysis that will be performed (discrete choice, multivariate regression, ...)


Orthogonal designs

• $x \perp y$: Two attribute vectors $x$ and $y$ are said to be (strictly) orthogonal if their inner product is zero. For mean-centered attributes (as in balanced, orthogonally coded designs) this is equivalent to $\mathrm{cov}(x, y) = E[(x - E(x))(y - E(y))] = 0$

• Correlations between attributes are zero and attribute levels appear equally often in combination with all other attribute levels (balance) → the effects of interest can be estimated efficiently and stochastically independently of each other

• Full factorial orthogonal design with 2 attributes $x$ and $y$ with 3 levels each ($3^2 = 9$ possible combinations; orthogonally coded):

Choice set    x    y
1            -1   -1
2            -1    0
3            -1    1
4             0   -1
5             0    0
6             0    1
7             1   -1
8             1    0
9             1    1
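The orthogonality and balance properties of this small design can be checked numerically. The following is a minimal sketch in Python with NumPy (the variable names are illustrative and not part of the original slides): it builds the 9-row full factorial and verifies that the inner product and the covariance of the two columns are zero.

```python
from itertools import product
import numpy as np

levels = [-1, 0, 1]                                   # orthogonal coding of 3 levels
design = np.array(list(product(levels, repeat=2)))    # 3^2 = 9 rows, columns x and y

x, y = design[:, 0], design[:, 1]
inner_product = int(x @ y)                            # zero for orthogonal columns
covariance = float(np.mean((x - x.mean()) * (y - y.mean())))

# Balance: every (x, y) level combination appears equally often (here exactly once)
pairs, counts = np.unique(design, axis=0, return_counts=True)

print(design)
print("inner product:", inner_product, "covariance:", covariance)
```

Because both columns have mean zero, the inner product and the covariance vanish together, which is why orthogonal coding is convenient for this check.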


Fractional factorial designs

• Full factorial design: Experiment size explodes with an increasing number of attributes and levels. E.g. 10 attributes with 3 levels: $3^{10} = 59049$ possible attribute combinations (= 59049 degrees of freedom, including the constant)

• Full factorial designs are, by definition, perfectly orthogonal in all main effects and higher-order interactions

• Use an "optimal" subset of a full factorial

• Orthogonality can be maintained under the assumption that some effects (often higher-order interactions) are zero

• However, interactions might be highly correlated with main effects:

$$U_i = \alpha X_{tt,i} + \beta X_{tc,i} + \gamma X_{tt,i} X_{tc,i} + \varepsilon_i \tag{1}$$

$$\frac{\partial U_i}{\partial X_{tc,i}} = \beta + \gamma X_{tt,i} \tag{2}$$
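How an interaction can become correlated with a main effect in a fraction is easiest to see in a very small case. The sketch below is an illustration only and deliberately uses a two-level design with three attributes A, B and C (an assumption made for brevity; the slides use three levels): in the full factorial the A×B interaction is uncorrelated with C, whereas in the half fraction defined by A·B·C = 1 it coincides with C.

```python
from itertools import product
import numpy as np

full = np.array(list(product([-1, 1], repeat=3)))    # 2^3 full factorial: A, B, C
A, B, C = full.T
corr_full = float(np.corrcoef(C, A * B)[0, 1])       # interaction vs. main effect C

half = full[A * B * C == 1]                          # half fraction (4 of 8 rows)
a, b, c = half.T
main_corr = np.corrcoef(half.T)                      # main effects among themselves
corr_half = float(np.corrcoef(c, a * b)[0, 1])       # C is now aliased with A*B

print("full factorial:", round(corr_full, 3))
print("half fraction :", round(corr_half, 3))
```

In the half fraction the main effects remain mutually uncorrelated, but a non-zero interaction (the $\gamma$ term in equation (1)) could no longer be separated from the main effect of C.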

Fractional factorial designs

• Assume a full factorial design with 10 attributes with 3 levels each ($3^{10}$ combinations): To estimate all 10 main effects, one needs at least 20 choice sets (10·(3−1) degrees of freedom)

• Hence, the 45 two-way interactions ((10−1)·10/2) as well as the many higher-order interactions (the remaining 59049 − 20 − 45 − 1 degrees of freedom, counting each two-way interaction as a single term) are ignored

• Practical considerations: Main effects typically account for 70-90% of explained variance, two-way interactions for 5-15%
  – Limit # of attribute levels: Often 2-5 levels
  – Limit # of attributes: Often 6-16 attributes
  – Only allow some two-way interactions to be different from 0 (e.g. travel time × travel cost)
  – Block design: Divide the fractional factorial into groups with the same # of choice sets in a statistically efficient way
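The degree-of-freedom arithmetic above can be reproduced in a few lines. This is a small sketch with illustrative variable names; it shows both the count used on the slide (one term per two-way interaction) and, as an additional assumption, the count obtained when each two-way interaction between three-level attributes is fully coded with (3−1)·(3−1) = 4 degrees of freedom.

```python
from math import comb

n_attr, n_levels = 10, 3

full_factorial = n_levels ** n_attr               # 3^10 attribute combinations
df_main = n_attr * (n_levels - 1)                 # 10 * (3 - 1) main-effect df
n_two_way = comb(n_attr, 2)                       # (10 - 1) * 10 / 2 attribute pairs

# Count as on the slide: one term per two-way interaction
remaining_slide = full_factorial - df_main - n_two_way - 1

# Alternative count (assumption): fully coded interactions, 4 df per pair
df_two_way_full = n_two_way * (n_levels - 1) ** 2
remaining_full = full_factorial - df_main - df_two_way_full - 1

print(full_factorial, df_main, n_two_way, remaining_slide, remaining_full)
```

Either way, the main effects use only a tiny share of the available degrees of freedom, which is what makes small fractions feasible.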


Block-designs

• Typically, a respondent receives between 6 and 15 choice sets (response burden and cognitive fatigue)

• Even fractional designs often include more choice sets than the researcher wants to assign to each respondent

• Correlation between blocks and attributes should be minimized. Otherwise, one respondent may receive a block containing, e.g., only high travel times

• Common mistake: Assign the first x choice sets to block b

• Orthogonal blocking: Block number is uncorrelated with attributes

• Good news: Most software automatically assigns choice sets to each block specified by the researcher
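The effect of the "common mistake" can be made visible by correlating the block number with the attribute columns. The sketch below is an illustration under simple assumptions: it uses a 9-row, two-attribute full factorial, three blocks of three choice sets, and a plain random search as a stand-in for the blocking routines of design software (the helper `max_block_corr` is hypothetical).

```python
from itertools import product
import numpy as np

def max_block_corr(design, blocks):
    """Largest absolute correlation between the block number and any attribute."""
    return max(abs(np.corrcoef(blocks, col)[0, 1]) for col in design.T)

# 3^2 full factorial in orthogonal coding, rows sorted by the first attribute
design = np.array(list(product([-1, 0, 1], repeat=2)))

# Common mistake: first 3 choice sets -> block 1, next 3 -> block 2, last 3 -> block 3
naive = np.repeat([1, 2, 3], 3)
naive_corr = max_block_corr(design, naive)

# Random search over balanced block assignments, keeping the least correlated one
rng = np.random.default_rng(1)
best = min((rng.permutation(naive) for _ in range(2000)),
           key=lambda b: max_block_corr(design, b))
best_corr = max_block_corr(design, best)

print("naive blocking :", round(naive_corr, 3))
print("searched blocks:", round(best_corr, 3))
```

With the naive assignment, the block number is perfectly correlated with the first attribute, so each respondent would see only one level of it.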


Some important definitions

• Unlabeled experiment: A choice experiment where alternatives have no intrinsic meaning (e.g. route 1 vs. route 2)

• Labeled experiment: A choice experiment where the alternatives are labeled. Model parameters can be estimated for each alternative independently (e.g. car vs. train vs. bus)

• Generic effect: The same model parameter for all alternatives in the utility function (e.g. travel cost)

• Alternative-specific effect: Different model parameters for each alternative in the utility function (e.g. travel time car vs. travel time bus vs. travel time train)

• Own vs. cross effect: If cross effects are present, the IID error assumption is violated


Example of unlabeled experiment


Example of labeled experiment


Orthogonal fraction of full factorial (example)

• 4 attributes with 3 levels, 3 "unlabelled" choice alternatives, $3^{4 \cdot 3}$ possible attribute level combinations: Minimum of 8 choice sets to estimate all 4 (generic) main effects

• Smallest orthogonal fraction = $3^{12-9}$ = 27 choice sets

Set   TT1  TC1  AC1  QU1   TT2  TC2  AC2  QU2   TT3  TC3  AC3  QU3
1      1    1    1    1     1    1    1    1     1    1    1    1
2      2    2    2    2     2    2    1    2     2    2    1    1
3      3    3    3    3     3    3    1    3     3    3    1    1
4      3    3    3    2     2    2    1    1     1    1    2    2
5      1    1    1    3     3    3    1    2     2    2    2    2
..    ..   ..   ..   ..    ..   ..   ..   ..    ..   ..   ..   ..
26     1    2    3    2     3    1    3    1     2    3    2    3
27     2    3    1    3     1    2    3    2     3    1    2    3

→ The design contains redundant alternatives (e.g. set 1, where all three alternatives are identical), weakly dominant alternatives and dominant alternatives
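Such problematic choice sets can be flagged automatically. The following sketch classifies the rows shown in the table; it assumes, purely for illustration, that a lower level is better for every attribute (the slides do not state the direction of the coding), and the function name `classify_choice_set` is hypothetical.

```python
import numpy as np

def classify_choice_set(alts):
    """Flag problems in one choice set, ASSUMING lower levels are better everywhere."""
    alts = np.asarray(alts)
    labels = set()
    for i in range(len(alts)):
        for j in range(len(alts)):
            if i == j:
                continue
            if np.array_equal(alts[i], alts[j]):
                labels.add("redundant")            # identical alternatives
            elif np.all(alts[i] < alts[j]):
                labels.add("dominant")             # strictly better on all attributes
            elif np.all(alts[i] <= alts[j]):
                labels.add("weakly dominant")      # better or equal on all attributes
    return labels or {"ok"}

# Rows 1, 2, 4 and 26 of the table; each alternative is (TT, TC, AC, QU)
set_1 = [[1, 1, 1, 1], [1, 1, 1, 1], [1, 1, 1, 1]]
set_2 = [[2, 2, 2, 2], [2, 2, 1, 2], [2, 2, 1, 1]]
set_4 = [[3, 3, 3, 2], [2, 2, 1, 1], [1, 1, 2, 2]]
set_26 = [[1, 2, 3, 2], [3, 1, 3, 1], [2, 3, 2, 3]]

for name, s in [("1", set_1), ("2", set_2), ("4", set_4), ("26", set_26)]:
    print("set", name, "->", sorted(classify_choice_set(s)))
```

Under this coding assumption, set 1 is redundant, set 2 contains weakly dominant alternatives, set 4 contains a dominant one, and set 26 requires a genuine trade-off.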


Problems with orthogonal designs

Reasons for moving away from orthogonal designs (OD):

• For some problems, an OD does not exist (e.g. for a limited number of choice sets predefined by the researcher) → in general, ODs require a larger sample size and lead to larger choice sets

• Behaviorally plausible choice scenarios: ODs may include dominant/weakly dominant/redundant choice sets → no information gain

• When working with preference constraints, orthogonality cannot be maintained

• Need for more sophisticated approaches: Efficient experimental designs


Efficient designs: Some basic concepts

• Efficiency: For given design requirements (violating strict orthogonality), minimize the variances of the parameter estimates, which are taken from the variance-covariance matrix of a design

• D-Efficient GLM Designs: No prior information about the parameter values (signs, magnitude). Efficiency ↔ convergence towards orthogonality

• D-Efficient MNL Designs: Efficiency measures depend on the "unknown" parameter values one wants to estimate

• In many cases, one has some sound knowledge about the sign and relative values of the design attributes (e.g. travel cost and travel time both have a negative effect on utility, leading to a positive value of time)


Example 1

[Figure: four quadrants, numbered 2 and 1 (top row) and 3 and 4 (bottom row)]

Orthogonal design with travel time and travel cost (2 alternatives, 3 levels): Quadrants 1 and 3 dominate quadrants 2 and 4


Example 2

[Figure: Cost_MIV - Cost_PT (vertical axis, -2 to 2) plotted against Time_MIV - Time_PT (horizontal axis, -2 to 2), with WLS predictions]

Efficient design with travel time and travel cost (2 alternatives, 3 levels): Elimination of dominant alternatives


Efficient designs: Some basic concepts

• Main question: How can the researcher make use of prior information in order to increase the efficiency (minimize the standard errors of the attribute coefficients, i.e. obtain more robust results) and reduce the sample size requirements?

• Example 1: Orthogonal designs make no use of prior information → time and cost attributes are uncorrelated

• Example 2: An efficient design with no dominant alternatives automatically leads to a negative correlation between time and cost → forces respondents to trade off and increases the amount of preference information for a given sample size

• D-Efficient MNL approach: Use expected parameter distributions with $\mu_k$ and $\sigma_k$ to calculate the "optimal" design


D-Efficient GLM designs

• Find a design matrix $Z$, with rows selected from a $Q \times k$ matrix $X$ where $n \ll Q$, that is optimal in some sense. $Z$ is an $n \times k$ matrix, where $k$ is the number of parameters and $n$ is the number of choice sets in the actual experiment

• Row-based Fedorov algorithm (R package AlgDesign): Selection from a predefined candidate set (after exclusion of dominant/redundant alternatives, etc.)

• Optimization criterion: Maximize the k-th root of the determinant of the normalized dispersion matrix $M \propto \Omega^{-1}$

• Assumption: Observations are independent and error terms are normally distributed

$$\max: \quad \text{D-Efficiency} = \det\left(\frac{Z'Z}{n}\right)^{1/k} \tag{3}$$
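To make criterion (3) concrete, the sketch below computes the D-efficiency of a design and improves a random starting design by exchanging rows with a candidate set. It is a simplified illustration in Python, not the implementation used in AlgDesign; the candidate set (a $3^3$ full factorial plus intercept), the design size $n = 9$ and the function name `d_efficiency` are assumptions made for the example.

```python
from itertools import product
import numpy as np

def d_efficiency(Z):
    """k-th root of det(Z'Z / n) for an n x k design matrix Z, as in criterion (3)."""
    n, k = Z.shape
    det = np.linalg.det(Z.T @ Z / n)
    return max(det, 0.0) ** (1.0 / k)

# Candidate set X (Q = 27 rows): 3^3 full factorial, orthogonally coded, plus intercept
levels = np.array(list(product([-1, 0, 1], repeat=3)), dtype=float)
X = np.column_stack([np.ones(len(levels)), levels])

rng = np.random.default_rng(0)
n = 9
rows = rng.choice(len(X), size=n, replace=False)     # random starting design
start = d_efficiency(X[rows])

# Simplified row exchange: replace one design row by one candidate row
# whenever this strictly increases the D-efficiency
improved = True
while improved:
    improved = False
    for pos in range(n):
        for cand in range(len(X)):
            trial = rows.copy()
            trial[pos] = cand
            if d_efficiency(X[trial]) > d_efficiency(X[rows]) + 1e-12:
                rows, improved = trial, True

final = d_efficiency(X[rows])
print("start:", round(start, 3), "final:", round(final, 3))
```

In practice, dominant or redundant rows would be removed from `X` before the search, which is how preference constraints enter this approach.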


D-Efficient MNL designs

• Asymptotic variance-covariance (AVC) matrix for discrete choice models depends on the "true" parameter values

• Starting point: Need to make assumptions about the model, utility functions and parameter values

• Design matrices $Z$ are created using a column-based swapping algorithm: Selection of attribute levels over all choice situations for each attribute

• Optimization criterion: Minimize the k-th root of the determinant of the AVC matrix $\Omega$

$$\min: \quad \text{D-Error} = \det\left(\Omega(Z, \beta)\right)^{1/k} \tag{4}$$
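Criterion (4) can be evaluated directly once priors are assumed. The sketch below computes the D-error for an MNL model with generic parameters, using the standard result that the AVC matrix is the inverse of the Fisher information. The function `mnl_d_error`, the two small designs and the prior values are hypothetical and serve only to illustrate why designs with dominant alternatives carry little information.

```python
import numpy as np

def mnl_d_error(design, beta):
    """D-error of a design for an MNL model with generic parameters.

    design: array of shape (S, J, K) = (choice sets, alternatives, attributes)
    beta:   assumed prior parameter values, shape (K,)
    """
    S, J, K = design.shape
    info = np.zeros((K, K))
    for x in design:                          # x has shape (J, K)
        v = x @ beta
        p = np.exp(v - v.max())
        p /= p.sum()                          # MNL choice probabilities
        xc = x - p @ x                        # attributes centered at their expectation
        info += xc.T @ (p[:, None] * xc)      # Fisher information of this choice set
    avc = np.linalg.inv(info)                 # AVC matrix for a single respondent
    return np.linalg.det(avc) ** (1.0 / K)

beta_prior = np.array([-1.0, -1.0])           # hypothetical priors for (time, cost)

# Each alternative is (time level, cost level); lower is better given the priors
tradeoff = np.array([[[1, 3], [3, 1]], [[3, 1], [1, 2]],
                     [[1, 3], [2, 1]], [[2, 1], [1, 3]]], dtype=float)
dominant = np.array([[[1, 1], [3, 3]], [[3, 3], [1, 1]],
                     [[1, 2], [3, 3]], [[2, 3], [1, 1]]], dtype=float)

d_tradeoff = mnl_d_error(tradeoff, beta_prior)
d_dominant = mnl_d_error(dominant, beta_prior)
print("trade-off design:", round(d_tradeoff, 3))
print("dominant design :", round(d_dominant, 3))
```

When one alternative is better on both attributes, the predicted choice probabilities are close to 0 or 1, so each choice adds little information and the D-error is larger.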


Some remarks on D-Efficient designs

• A large number of different algorithms and optimization criteria exist (focus on D-Efficiency as the most common approach in the literature)

• Eliminating "undesirable" choice sets has to be done manually by using preference constraints

• GLM designs: Can be created in the open-source software R. Robust towards misspecification of priors and often as efficient as MNL designs

• MNL designs: Created in the commercial software NGENE. Easier to implement, more assistance and possibilities

• Priors usually come from the literature, intuition and pre-test studies. Misspecification can be minimized by assuming a random distribution of priors (Bayesian approach)


An example of a design strategy

• 9 attributes with 3 levels ($3^9$ full factorial), 2 labeled alternatives, 32 choice sets with 4 blocks, estimation of all linear main effects, quadratic effects and 6 selected two-way interactions (9+9+6+1 degrees of freedom)

• Polynomial and interaction effects have to be specified in the utility function of a design

• No weakly dominant alternatives (i.e. cases where all attribute values of one alternative in choice set $s$ are better than or equal to those of the other: $a_1 \preccurlyeq a_2$ or $a_1 \succcurlyeq a_2$)

• No strongly dominant travel time relative to travel cost alternatives or vice versa (i.e. $a_{1,cost} \prec a_{2,cost}$ and $a_{1,time} \prec a_{2,time}$, or vice versa)

• "Weak" priors to determine the direction of expected effects


Efficient design (example)

• 4 attributes with 3 levels, 3 "unlabelled" choice alternatives, $3^{4 \cdot 3}$ possible attribute level combinations: Minimum of 8 choice sets to estimate all 4 (generic) main effects

• Weak priors, exclusion of all dominant choice sets

• Free choice about the number of choice sets (# choice sets > df)

Set   TT1  TC1  AC1  QU1   TT2  TC2  AC2  QU2   TT3  TC3  AC3  QU3
1      1    3    1    1     3    2    3    2     3    1    3    3
2      1    3    3    3     2    1    1    2     3    3    3    1
3      3    2    1    1     2    3    1    2     1    2    2    2
4      1    1    3    2     3    3    2    1     2    3    1    2
5      2    1    2    2     3    2    1    1     1    3    1    2
..    ..   ..   ..   ..    ..   ..   ..   ..    ..   ..   ..   ..
17     1    3    3    3     2    2    1    1     3    1    2    1
18     3    2    1    2     1    3    1    2     1    3    3    1

→ no more dominant/weakly dominant/redundant choice sets


Some general remarks

• Experimental design creation is a research topic in its own right (Rose and Bliemer, 2009; Quan et al., 2011)

• If priors are misspecified, one might run into trouble. Be careful when using priors!

• Use attributes, values and trade-off variations that are "plausible"

• Make sure that there are some overlapping values of generic attributes between alternatives (pivot designs)

• Order effects: Randomize the order of alternatives across respondents in the questionnaire

• Carefully introduce respondents to the (hypothetical) scenario and explain the attributes you are presenting to them


Pivot designs

• It is preferable to base variations around values for observed behavior (state of the art in transportation research): Calculate the design with placeholder values (e.g. 1, 2, 3) and replace them by relative changes (e.g. 0.7, 1.1, 1.5)

• Combination of RP data with variations given by the design → respondents can better identify with the presented choice scenarios; much more variation in the attribute levels

• Possible to include one reference alternative in the choice sets (e.g. bike travel time, whose value is not varied)

• Problems:
  – If reference values are (highly) dominant, the respondents will more likely choose the respective alternative (only in labeled experiments)
  – Correlation between attributes; skewness
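The pivoting step itself is a simple lookup and multiplication. The sketch below maps placeholder levels to the relative changes mentioned above and applies them to reference values; the design levels and the reference trip (40 min, 12 CHF) are hypothetical values chosen for illustration.

```python
import numpy as np

# Placeholder levels (1, 2, 3) of one choice set from a hypothetical design;
# columns = travel time, travel cost
design_levels = np.array([[1, 3],      # alternative 1
                          [3, 2]])     # alternative 2

# Relative changes: level 1 -> 0.7, level 2 -> 1.1, level 3 -> 1.5
# (index 0 is unused so that the level can be used directly as an index)
factor_by_level = np.array([np.nan, 0.7, 1.1, 1.5])

# Hypothetical reference values of a respondent's observed trip: 40 min, 12 CHF
reference = np.array([40.0, 12.0])

multipliers = factor_by_level[design_levels]
shown = multipliers * reference        # values presented to the respondent

print(shown)
```

Because the reference values differ between respondents, the same design produces much more variation in the presented attribute values than fixed levels would.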

Pivot designs: Trade-off distribution

Example where MIV is often cheaper and faster than PT → modification of reference values needed!


Testing a design: A case study

• Once you have your design, you should test its performance in estimating the coefficients of interest, based on a simulation of a more or less hypothetical population

• Define priors for the attribute weights of the utility function based on recent similar studies

• Simulate the error structure (GEV) for the utility function, taking into account the panel structure of the designs

• Calculate individual utilities and determine the chosen alternatives for each simulated subject

• Estimate the parameters for the simulated data and compare the results with the a-priori assumptions


Testing a design: Pivot approach

Experimental design: 9 attributes with 3 levels ($3^9$ full factorial), 2 labeled alternatives, 32 choice sets (8 per subject)

Reference values taken from a Swiss mode choice experiment:

• Total travel time: For PT alternative = travel time without access and egress time; for MIV alternative = travel time + parking search time

• Total travel cost: For PT alternative = ticket price; for MIV alternative = fuel cost + parking cost

• Number of transfers: PT alternative only

Attribute                                      Effect code:   −1     0     1
Travel time (MIV and PT) [%]                                  70    90   110
Travel cost (MIV and PT) [%]                                  80   110   130
Delay prob. (MIV and PT) [%]                                   5    10    20
Walking / waiting time (MIV and PT) [min.]                     5    10    20
Number of transfers (PT) [#]                                  −1     0     1


Testing a design: A-priori Coefficients

• Prior values for the individual weights of attribute $k$, $\beta_{ik} \sim N(\mu_k, \sigma_k)$, and alternative-specific constants are simulated based on results obtained from the linear model in the BMVI Zeitkostenstudie (Axhausen et al., 2014)

• For each simulated individual $i$, the same $\beta_{ik}$ is used over all 8 choice sets, representing the panel structure of the experiment

Coefficient         Mean     SD      Type
ASC_MIV             0.172    0.5     Alternative-specific
β_time,MIV         -0.022    0.005   Alternative-specific
β_time,PT          -0.021    0.005   Alternative-specific
β_cost             -0.088    0.01    Generic
β_delay,MIV        -0.058    0.01    Alternative-specific
β_delay,PT         -0.027    0.005   Alternative-specific
β_walk             -0.047    0.01    Generic
β_transfers,PT     -0.4      0.1     Alternative-specific

VOT_MIV             15.0 CHF/h
VOT_PT              14.2 CHF/h

Number of simulated coefficient vectors β_ik: 400


Testing a design: Utility Function

The random utility model framework (RUM) assumes that in each choice set $s$, individual $i$ perceives utility $U_{ijs}$ for each alternative $j$ among the full set of alternatives $J$ (MIV and PT), given the attributes $X_{ijs}$, and chooses the one that maximizes utility. $U_{ijs}$ has an observed component $V_{ijs}$ and an unobserved component $\varepsilon_{ijs}$:

$$U_{ijs} = V_{ijs} + \varepsilon_{ijs} \tag{5}$$

where

$$\varepsilon_{ijs} \sim GEV(0, 1, 0) \tag{6}$$

and

$$V_{ijs} = \sum_{k=1}^{K} \beta_{ik} X_{ijsk} \tag{7}$$


Testing a design: Choice simulation

The chosen alternatives $\text{choice}_{is}$ are calculated as follows:

$$\text{choice}_{is} = \begin{cases} \text{MIV} & \text{if } U_{i,\text{MIV},s} > U_{i,\text{PT},s} \\ \text{PT} & \text{else} \end{cases} \tag{8}$$

Snippet of a simulated discrete choice data set:

id   set  alt  choice  block  time    cost   walk    delay   transfers
                              (min.)  (CHF)  (min.)  (min.)  (#)
..   ..   ..   ..      ..     ..      ..     ..      ..      ..
3    8    MIV  1       1      40      12     20      5       -
3    8    PT   0       1      51      16     5       5       3
4    1    MIV  0       1      73      27     10      20      -
4    1    PT   1       1      53      26     20      5       0
4    2    MIV  1       1      54      27     20      20      -
..   ..   ..   ..      ..     ..      ..     ..      ..      ..
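The simulation defined by equations (5) to (8) can be sketched in a few lines. The example below is a simplified illustration: it uses only two generic attributes (with the a-priori means and standard deviations of β_cost and β_walk), randomly drawn hypothetical attribute values instead of the actual pivot design, and omits the alternative-specific constant.

```python
import numpy as np

rng = np.random.default_rng(42)
n_ind, n_sets = 400, 8

# A-priori means and SDs of beta_cost and beta_walk (two attributes only,
# a simplification of the full specification)
mu = np.array([-0.088, -0.047])
sd = np.array([0.01, 0.01])
beta = rng.normal(mu, sd, size=(n_ind, 2))     # one draw per individual (panel structure)

# Hypothetical attribute values, shape (individuals, choice sets, alternatives, attributes);
# alternative 0 = MIV, alternative 1 = PT
X = rng.choice([5.0, 10.0, 20.0], size=(n_ind, n_sets, 2, 2))

V = np.einsum("isjk,ik->isj", X, beta)                     # eq. (7)
eps = rng.gumbel(loc=0.0, scale=1.0, size=V.shape)         # eq. (6): GEV(0, 1, 0) = Gumbel
U = V + eps                                                # eq. (5)
choice = np.where(U[:, :, 0] > U[:, :, 1], "MIV", "PT")    # eq. (8)

print(choice[:3])
print("share MIV:", round(float((choice == "MIV").mean()), 3))
```

Drawing `beta` once per individual and reusing it across that individual's 8 choice sets is what represents the panel structure; the error terms are drawn independently for every alternative and choice set.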


Testing a design: Estimation

• For given randomly drawn subsets of RP data, simulated $\beta_{ik}$ coefficient vectors and simulated error terms $\varepsilon_{ijs}$, the models are estimated for 3 different designs. This process is repeated 2000 times to get insights into the distributions of coefficients (robustness), variances (precision) and values of time

• The between-design differences of $E(\beta_k)$ and $E(SE_k)$ with respect to the a-priori parameters are small

Design approach:    GLM                 MNL: β0             MNL: βk
                    E(βk)    E(SEk)     E(βk)    E(SEk)     E(βk)    E(SEk)
ASC_MIV              0.174   0.124       0.176   0.121*      0.175   0.133
β_time,MIV          -0.020   0.002      -0.021   0.002      -0.020   0.002
β_time,PT           -0.019   0.002      -0.019   0.002      -0.019   0.002
β_cost              -0.081   0.010      -0.082   0.009      -0.082   0.009
β_delay,MIV         -0.054   0.006*     -0.054   0.006      -0.054   0.007
β_delay,PT          -0.025   0.006      -0.025   0.006*     -0.025   0.006
β_walk              -0.044   0.004*     -0.044   0.004      -0.044   0.005
β_transfers,PT      -0.371   0.049      -0.373   0.050      -0.375   0.051

VOT_MIV              16.1                16.4                16.1
VOT_PT               14.9                15.2                15.0
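A single replication of the estimation step can be sketched as follows. This is a strongly simplified stand-in for the procedure described above: it uses fixed (not individual-specific) coefficients for two generic attributes, hypothetical attribute values, and a binary logit estimated by Newton-Raphson with NumPy only. It does not reproduce the figures in the table.

```python
import numpy as np

rng = np.random.default_rng(0)
beta_true = np.array([-0.088, -0.047])         # a-priori beta_cost and beta_walk
n_obs = 400 * 8                                # 400 simulated individuals x 8 choice sets

# Hypothetical attribute values (cost, walking time) for MIV and PT
x_miv = rng.choice([5.0, 10.0, 20.0], size=(n_obs, 2))
x_pt = rng.choice([5.0, 10.0, 20.0], size=(n_obs, 2))
d = x_miv - x_pt                               # only utility differences matter

# Simulate choices with i.i.d. Gumbel errors for both alternatives
u_diff = d @ beta_true + rng.gumbel(size=n_obs) - rng.gumbel(size=n_obs)
y = (u_diff > 0).astype(float)                 # 1 if MIV is chosen, 0 if PT

# Maximum likelihood estimation of the binary logit by Newton-Raphson
beta = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-(d @ beta)))      # P(MIV chosen)
    grad = d.T @ (y - p)                       # score vector
    hess = -(d * (p * (1 - p))[:, None]).T @ d # Hessian of the log-likelihood
    beta = beta - np.linalg.solve(hess, grad)

se = np.sqrt(np.diag(np.linalg.inv(-hess)))    # asymptotic standard errors
print("estimates:", beta.round(3))
print("std. err.:", se.round(4))
```

Repeating such a replication many times with fresh draws, and for each candidate design, yields the distributions of estimates and standard errors that the comparison of designs is based on.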


Conclusions

• No substantial differences between the different design approaches: Designs are robust and reproduce the a-priori values well

• From a behavioral perspective, one should always exclude dominant and weakly dominant alternatives!

• Personal suggestion: Create an efficient design by ...
  – carefully thinking about your research question and aims
  – assigning about 8 choice sets to each respondent and using a block design (total # of choice sets ≈ 1.5 · df)
  – using the MNL approach with zero (or weak) priors, excluding "undesired" choice sets by manually setting preference conditions
  – updating your design after a pre-test study

