SP Experimental Designs - Theoretical Background and Case Study
Basil Schmid
IVT, ETH Zurich
Measurement and Modeling, FS2016
Outline
1. Introduction
2. Orthogonal and fractional factorial designs
3. Efficient designs
4. Pivot designs
5. Testing a design: A case study
6. Conclusions
SP Experimental Designs 2
Introduction
Explain how the variation of certain attributes affects the outcome of interest (causal relationship), applying a statistically efficient and effective framework (maximum amount of information with minimum amount of resources)
Kuhfeld (1994): "The best approach to design creation is to use the computer as a tool along with traditional design skills, not as a substitute for thinking about the problem"
A brief history...
• 1747: While serving as surgeon on HMS Salisbury, James Lind carried out a systematic clinical trial to compare treatments for patients with scurvy (a disease caused by a lack of vitamin C)
• Entry requirements to reduce exogenous variation
• 12 seamen were assigned to 6 treatment groups, each receiving a different diet over a 2-week period
• Other examples: Agriculture, marketing, economics
• Sir Ronald Fisher (1935): Experiments are "experience carefully planned in advance, and designed to form a secure basis of new knowledge":
– manipulation/variation of (existing) attributes
– formation of attribute levels
– observation/measurement of outcomes
Experimental design
• In contrast to revealed preference (RP) data, stated preference (SP) data are generated by some systematic and planned design process → SP data may provide insights into a hypothetical market for which no RP data are available
• Formulation of statistical hypotheses to be tested
• Specification of the number of experimental units (observations) required and the population from which they will be sampled
• Specification of the randomization procedure for assigning the experimental units to the attribute levels: Sources of variation among the units are distributed over the entire experiment
• Determination of the statistical analysis that will be performed (discrete choice, multivariate regression, ...)
Orthogonal designs
• x ⊥ y: Two attribute vectors x and y are said to be (strictly) orthogonal if their inner product is zero; for mean-centered (orthogonally coded) attributes this is equivalent to cov(x, y) = E[(x − E(x)) (y − E(y))] = 0
• Correlations between attributes are zero and attribute levels appear equally often in combination with all other attribute levels (balance) → the effects of interest can be estimated efficiently and stochastically independently of each other
• Full factorial orthogonal design with 2 attributes x and y with 3 levels each (3^2 = 9 possible combinations; orthogonally coded):

Choice set    x    y
1            -1   -1
2            -1    0
3            -1    1
4             0   -1
5             0    0
6             0    1
7             1   -1
8             1    0
9             1    1
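The table above can be reproduced and checked with a few lines of code. The following is a minimal sketch in Python (the language choice and all variable names are assumptions made for illustration): it enumerates the full factorial and verifies orthogonality and balance.

```python
import itertools
import numpy as np

# Orthogonally coded levels of each attribute.
levels = [-1, 0, 1]

# Full factorial: every combination of the levels of x and y (3^2 = 9 rows).
design = np.array(list(itertools.product(levels, repeat=2)))
x, y = design[:, 0], design[:, 1]

# Orthogonality: the inner product of the two zero-mean columns is zero,
# so their correlation is zero as well.
inner_product = int(x @ y)
correlation = float(np.corrcoef(x, y)[0, 1])

# Balance: every (x, y) level combination appears exactly once.
combos, counts = np.unique(design, axis=0, return_counts=True)

print(design)
print("inner product:", inner_product)
print("correlation:", correlation)
```

The same check scales to more attributes by changing `repeat`, although the number of rows then grows as 3 to the power of the number of attributes.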
Fractional factorial designs
• Full factorial design: Experiment size explodes with an increasing number of attributes and levels. E.g. 10 attributes with 3 levels: 3^10 possible attribute combinations (= 59049 degrees of freedom)
• Full factorial designs are, by definition, perfectly orthogonal in all main effects and higher-order interactions
• Use an "optimal" subset of a full factorial
• Orthogonality can be maintained under the assumption that some effects (often higher-order interactions) are zero
• However, interactions might be highly correlated with main effects:

$$ U_i = \alpha X_{tt,i} + \beta X_{tc,i} + \gamma X_{tt,i} X_{tc,i} + \varepsilon_i \tag{1} $$

$$ \frac{\partial U_i}{\partial X_{tc,i}} = \beta + \gamma X_{tt,i} \tag{2} $$
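To illustrate how an interaction can be confounded with a main effect in a fraction, here is a small sketch. The 9-row fraction of a 3^3 factorial used here (third attribute generated from the first two via a Latin square) and the name `x_3` are assumptions for illustration, not a design from the slides.

```python
import itertools
import numpy as np

# Level indices 0, 1, 2 are coded orthogonally as -1, 0, 1.
idx = np.array(list(itertools.product(range(3), repeat=2)))
a, b = idx[:, 0], idx[:, 1]

# Hypothetical fraction: the level of a third attribute is generated from
# the first two (Latin square), giving 9 rows instead of 27.
c = (a + b) % 3

x_tt, x_tc, x_3 = a - 1, b - 1, c - 1
interaction = x_tt * x_tc  # the product term X_tt * X_tc from equation (1)

def corr(u, v):
    """Pearson correlation between two design columns."""
    return float(np.corrcoef(u, v)[0, 1])

main_effect_corrs = [corr(x_tt, x_tc), corr(x_tt, x_3), corr(x_tc, x_3)]
interaction_corr = corr(interaction, x_3)

print("main effects:", main_effect_corrs)
print("interaction vs. third main effect:", interaction_corr)
```

In this sketch all main effects are mutually uncorrelated, while the time-cost interaction has a correlation of roughly -0.61 with the third main effect. If γ in equation (1) is not zero, the estimate of that main effect would therefore be biased.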
Fractional factorial designs
• Assume a full factorial design with 10 attributes with 3 levels each (3^10 combinations): To estimate all 10 main effects, one needs at least 20 choice sets (10 · (3 − 1) degrees of freedom)
• Hence, all 45 two-way interactions ((10 − 1) · 10 / 2 attribute pairs) as well as the many higher-order interactions, which take up the remaining degrees of freedom of the full factorial, are ignored
• Practical considerations: Main effects typically account for 70-90% of explained variance, two-way interactions for 5-15%
– Limit # of attribute levels: Often 2-5 levels
– Limit # of attributes: Often 6-16 attributes
– Only allow some two-way interactions to be different from 0 (e.g. travel time × travel cost)
– Block design: Divide the fractional factorial into groups with the same # of choice sets in a statistically efficient way
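The counting argument above can be written out as a short calculation. This is only a sketch of the arithmetic; the variable names are chosen for illustration.

```python
from math import comb

n_attributes = 10
n_levels = 3

# Size of the full factorial.
full_factorial = n_levels ** n_attributes

# Each attribute with L levels needs L - 1 degrees of freedom for its main effect.
main_effect_df = n_attributes * (n_levels - 1)

# Number of attribute pairs, i.e. of possible two-way interactions.
two_way_interactions = comb(n_attributes, 2)

print(full_factorial, main_effect_df, two_way_interactions)
```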
Block-designs
• Typically, a respondent receives between 6 and 15 choice sets (response burden and cognitive fatigue)
• Even fractional designs often include more choice sets than the researcher wants to assign to each respondent
• Correlation between blocks and attributes should be minimized. Otherwise, one respondent might receive a block that contains all choice sets with e.g. high travel times
• Common mistake: Assign the first x choice sets to block b
• Orthogonal blocking: Block number is uncorrelated with attributes
• Good news: Most software packages automatically assign choice sets to each block specified by the researcher
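The difference between the common mistake and orthogonal blocking can be made concrete on the 9-row full factorial with two attributes. The sketch below is an illustrative assumption: it uses a Latin-square rule for the orthogonal blocks, which is one simple way to obtain them and not necessarily what a design software does internally.

```python
import itertools
import numpy as np

# 3^2 full factorial in level indices (0, 1, 2) and orthogonal coding (-1, 0, 1).
idx = np.array(list(itertools.product(range(3), repeat=2)))
a, b = idx[:, 0], idx[:, 1]
x, y = a - 1, b - 1

# Common mistake: choice sets 1-3 form block 0, sets 4-6 block 1, sets 7-9 block 2.
naive_block = np.arange(9) // 3

# Orthogonal blocking via a Latin square: each block contains every level
# of x and every level of y exactly once.
orth_block = (a + b) % 3

def corr(u, v):
    """Pearson correlation between two columns."""
    return float(np.corrcoef(u, v)[0, 1])

print("naive block vs. x:", corr(naive_block, x))
print("orthogonal block vs. x:", corr(orth_block, x))
print("orthogonal block vs. y:", corr(orth_block, y))
```

With the naive assignment the block number coincides with the level of x, so one respondent would see only the lowest level of x and another only the highest.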
Some important definitions
• Unlabeled experiment: A choice experiment where alternatives have no intrinsic meaning (e.g. route 1 vs. route 2)
• Labeled experiment: A choice experiment where the alternatives are labeled. Model parameters can be estimated for each alternative independently (e.g. car vs. train vs. bus)
• Generic effect: The same model parameter for all alternatives in the utility function (e.g. travel cost)
• Alternative-specific effect: Different model parameters for each alternative in the utility function (e.g. travel time car vs. travel time bus vs. travel time train)
• Own vs. cross effect: If cross effects are present, the IID error assumption is violated
Example of unlabeled experiment
Example of labeled experiment
Orthogonal fraction of full factorial (example)
• 4 attributes with 3 levels, 3 "unlabelled" choice alternatives, 3^(4·3) = 3^12 possible attribute level combinations: Minimum of 8 choice sets to estimate all 4 (generic) main effects
• Smallest orthogonal fraction = 3^(12−9) = 27 choice sets

Set  TT1 TC1 AC1 QU1  TT2 TC2 AC2 QU2  TT3 TC3 AC3 QU3
1     1   1   1   1    1   1   1   1    1   1   1   1
2     2   2   2   2    2   2   1   2    2   2   1   1
3     3   3   3   3    3   3   1   3    3   3   1   1
4     3   3   3   2    2   2   1   1    1   1   2   2
5     1   1   1   3    3   3   1   2    2   2   2   2
..    ..  ..  ..  ..   ..  ..  ..  ..   ..  ..  ..  ..
26    1   2   3   2    3   1   3   1    2   3   2   3
27    2   3   1   3    1   2   3   2    3   1   2   3

→ The design contains redundant alternatives (e.g. choice set 1, in which all three alternatives are identical), weakly dominant alternatives and dominant alternatives
Problems with orthogonal designs
Reasons for moving away from orthogonal designs (OD):
• For some problems, an OD does not exist (e.g. for a limited number of choice sets predefined by the researcher) → in general, ODs require a larger sample size and lead to larger choice sets
• Behaviorally plausible choice scenarios: ODs may include dominant/weakly dominant/redundant choice sets → no information gain
• When working with preference constraints, orthogonality cannot be maintained
• Need for more sophisticated approaches: Efficient experimental designs
Efficient designs: Some basic concepts
• Efficiency: For given design requirements (violating strict orthogonality), minimize the variances of parameter estimates, which are taken from the variance-covariance matrix of a design
• D-Efficient GLM Designs: No prior information about the parameter values (signs, magnitude); efficiency ↔ convergence towards orthogonality
• D-Efficient MNL Designs: Efficiency measures depend on the "unknown" parameter values one wants to estimate
• In many cases, one has some sound knowledge about the sign and relative values of the design attributes (e.g. travel cost and travel time both have a negative effect on utility, leading to a positive value of time)
Example 1
[Figure: plane divided into four quadrants, labeled 2 and 1 (top row) and 3 and 4 (bottom row)]
Orthogonal design with travel time and travel cost (2 alternatives, 3 levels): Quadrants 1 and 3 dominate quadrants 2 and 4
Example 2
[Figure: Cost_MIV - Cost_PT (y-axis, -2 to 2) plotted against Time_MIV - Time_PT (x-axis, -2 to 2), with WLS predictions]
Efficient design with travel time and travel cost (2 alternatives, 3 levels): Elimination of dominant alternatives
Efficient designs: Some basic concepts
• Main question: How can the researcher make use of prior information in order to increase the efficiency (minimize the standard errors of the attribute coefficients, i.e. more robust results) and reduce the sample size requirements?
• Example 1: Orthogonal designs make no use of prior information → time and cost attributes are uncorrelated
• Example 2: An efficient design with no dominant alternatives automatically leads to a negative correlation between time and cost → forces respondents to trade off and increases the amount of preference information for a given sample size
• D-Efficient MNL approach: Use expected parameter distributions with µ_k and σ_k to calculate the "optimal" design
D-Efficient GLM designs
• Find a design matrix Z, with rows selected from a Q × k matrix X where n ≪ Q, that is optimal in some sense. Z is an n × k matrix, where k is the number of parameters and n is the number of choice sets in the actual experiment
• Row-based Fedorov algorithm (R package AlgDesign): Selection from a predefined candidate set (after exclusion of dominant/redundant alternatives, etc.)
• Optimization criterion: Maximize the k-th root of the determinant of the normalized dispersion matrix M ∝ Ω^{-1}
• Assumption: Observations are independent and error terms are normally distributed

$$ \max \; D\text{-Efficiency} = \det\left( \frac{Z'Z}{n} \right)^{1/k} \tag{3} $$
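The criterion in equation (3) is easy to evaluate directly. The sketch below is written in Python rather than with the R package named above, and it replaces the Fedorov exchange algorithm by a plain exhaustive search over all row subsets, which is only feasible for a tiny candidate set. The candidate set (a 3^2 factorial without intercept) and n = 4 are assumptions for illustration.

```python
import itertools
import numpy as np

def d_efficiency(Z):
    """k-th root of det(Z'Z / n) for an n x k design matrix Z, as in equation (3)."""
    n, k = Z.shape
    det = np.linalg.det(Z.T @ Z / n)
    return float(max(det, 0.0) ** (1.0 / k))

# Candidate set X: full factorial of 2 attributes with 3 levels (Q = 9, k = 2).
X = np.array(list(itertools.product([-1, 0, 1], repeat=2)), dtype=float)

# Exhaustive search over all subsets of n = 4 candidate rows.
# This is NOT the Fedorov algorithm, just a brute-force stand-in.
n = 4
best_rows = max(
    itertools.combinations(range(len(X)), n),
    key=lambda rows: d_efficiency(X[list(rows)]),
)
best_eff = d_efficiency(X[list(best_rows)])

# For comparison: four rows clustered in one corner of the design space.
clustered_eff = d_efficiency(X[[0, 1, 3, 4]])

print("best rows:", X[list(best_rows)].tolist(), "D-efficiency:", best_eff)
print("clustered rows D-efficiency:", clustered_eff)
```

In this small case the search selects the four corner points, which form an orthogonal design. This is consistent with the remark that, without priors, efficiency converges towards orthogonality.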
D-Efficient MNL designs
• The asymptotic variance-covariance (AVC) matrix for discrete choice models depends on the "true" parameter values
• Starting point: Need to make assumptions about the model, utility functions and parameter values
• Design matrices Z are created using a column-based swapping algorithm: Selection of attribute levels over all choice situations for each attribute
• Optimization criterion: Minimize the k-th root of the determinant of the AVC matrix Ω

$$ \min \; D\text{-Error} = \det\left( \Omega(Z, \beta) \right)^{1/k} \tag{4} $$
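To make equation (4) concrete, the following sketch computes the D-error of an MNL design for a single respondent, using the standard MNL Fisher information as the inverse of the AVC matrix. The two small designs (4 choice sets, 2 alternatives, attributes time and cost with levels 1 to 3) and the prior values are hypothetical and chosen only to show the effect of priors.

```python
import numpy as np

def mnl_d_error(design, beta):
    """D-error det(Omega)^(1/k) of an MNL design, see equation (4).

    design: array of shape (S, J, K) with S choice sets, J alternatives
            and K generic attributes; beta: prior parameter vector (K,).
    """
    n_sets, n_alts, k = design.shape
    info = np.zeros((k, k))
    for X in design:                       # X has shape (J, K)
        v = X @ beta
        p = np.exp(v - v.max())
        p /= p.sum()                       # MNL choice probabilities
        # Fisher information contribution of one choice set.
        info += X.T @ (np.diag(p) - np.outer(p, p)) @ X
    omega = np.linalg.inv(info)            # AVC matrix
    return float(np.linalg.det(omega) ** (1.0 / k))

# Hypothetical design with trade-offs: the faster alternative is more expensive.
design_tradeoff = np.array([
    [[3, 1], [1, 2]],
    [[1, 2], [3, 1]],
    [[2, 1], [1, 3]],
    [[1, 3], [2, 1]],
], dtype=float)

# Hypothetical design in which one alternative is both faster and cheaper.
design_dominant = np.array([
    [[3, 2], [1, 1]],
    [[1, 1], [3, 2]],
    [[2, 3], [1, 1]],
    [[1, 1], [2, 3]],
], dtype=float)

beta_zero = np.zeros(2)
beta_prior = np.array([-0.5, -0.5])        # assumed negative time and cost effects

for name, d in [("trade-off", design_tradeoff), ("dominant", design_dominant)]:
    print(name, mnl_d_error(d, beta_zero), mnl_d_error(d, beta_prior))
```

With zero priors the two designs have the same D-error, so the criterion cannot tell them apart. With the negative priors the design with trade-offs has the lower D-error.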
Some remarks on D-Efficient designs
• A large number of different algorithms and optimization criteria exist (focus on D-Efficiency as the most common approach in the literature)
• Eliminating "undesirable" choice sets has to be done manually by using preference constraints
• GLM designs: Can be created in the open-source software R. Robust towards misspecification of priors and often as efficient as MNL designs
• MNL designs: Created in the commercial software NGENE. Easier to implement, more assistance and possibilities
• Priors usually come from the literature, intuition and pre-test studies. The risk of misspecification can be reduced by assuming a random distribution of priors (Bayesian approach)
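The Bayesian approach mentioned in the last bullet can be sketched as averaging the D-error over draws from an assumed prior distribution. Everything specific in this example is a hypothetical choice for illustration: the two designs, the normal prior with mean -0.5 and standard deviation 0.2, and the number of draws.

```python
import numpy as np

def mnl_d_error(design, beta):
    """D-error of an MNL design with shape (S, J, K) for a given prior beta."""
    k = design.shape[2]
    info = np.zeros((k, k))
    for X in design:
        v = X @ beta
        p = np.exp(v - v.max())
        p /= p.sum()
        info += X.T @ (np.diag(p) - np.outer(p, p)) @ X
    return float(np.linalg.det(np.linalg.inv(info)) ** (1.0 / k))

def bayesian_d_error(design, mu, sigma, n_draws=500, seed=0):
    """Mean D-error over normally distributed prior draws (simple Monte Carlo)."""
    rng = np.random.default_rng(seed)
    draws = rng.normal(mu, sigma, size=(n_draws, len(mu)))
    return float(np.mean([mnl_d_error(design, b) for b in draws]))

# Hypothetical designs: 4 choice sets, 2 alternatives, attributes (time, cost).
design_tradeoff = np.array(
    [[[3, 1], [1, 2]], [[1, 2], [3, 1]], [[2, 1], [1, 3]], [[1, 3], [2, 1]]],
    dtype=float,
)
design_dominant = np.array(
    [[[3, 2], [1, 1]], [[1, 1], [3, 2]], [[2, 3], [1, 1]], [[1, 1], [2, 3]]],
    dtype=float,
)

mu = np.array([-0.5, -0.5])      # assumed prior means
sigma = np.array([0.2, 0.2])     # assumed prior standard deviations

bayes_tradeoff = bayesian_d_error(design_tradeoff, mu, sigma)
bayes_dominant = bayesian_d_error(design_dominant, mu, sigma)
print(bayes_tradeoff, bayes_dominant)
```

A design search would then minimize this averaged criterion instead of the D-error at a single prior vector, which makes the result less sensitive to any one assumed value.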
An example of a design strategy
• 9 attributes with 3 levels (3^9 full factorial), 2 labeled alternatives, 32 choice sets with 4 blocks, estimation of all linear main effects, quadratic effects and 6 selected two-way interactions (9 + 9 + 6 + 1 degrees of freedom)
• Polynomial and interaction effects have to be specified in the utility function of a design
• No weakly dominant alternatives (i.e. all attribute values of one alternative in choice set s are strictly better or equal: a_1 ≼ a_2 or a_1 < a_2)
• No alternatives that are strongly dominant in both travel time and travel cost (i.e. a_{1,cost} ≺ a_{2,cost} and a_{1,time} ≺ a_{2,time}, or vice versa)
• "Weak" priors to determine the direction of expected effects
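A preference constraint of this kind amounts to a pairwise check of alternatives. The sketch below assumes, purely for illustration, that every attribute is coded so that a lower level is better; the slides do not state the meaning of the levels, so this coding and the function name `classify_pair` are hypothetical.

```python
import numpy as np

def classify_pair(a1, a2):
    """Classify two alternatives, assuming lower is better for every attribute."""
    a1, a2 = np.asarray(a1), np.asarray(a2)
    if np.array_equal(a1, a2):
        return "redundant"
    if np.all(a1 < a2) or np.all(a2 < a1):
        return "dominant"           # strictly better in every attribute
    if np.all(a1 <= a2) or np.all(a2 <= a1):
        return "weakly dominant"    # better or equal in every attribute
    return "trade-off"

# Alternatives given as (TT, TC, AC, QU) level vectors.
print(classify_pair([1, 1, 1, 1], [1, 1, 1, 1]))
print(classify_pair([3, 3, 3, 3], [3, 3, 1, 1]))
print(classify_pair([3, 3, 3, 2], [2, 2, 1, 1]))
print(classify_pair([1, 3, 1, 1], [3, 2, 3, 2]))
```

A candidate choice set would be rejected whenever any pair of its alternatives is classified as anything other than a trade-off.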
Efficient design (example)
• 4 attributes with 3 levels, 3 "unlabelled" choice alternatives, 3^(4·3) = 3^12 possible attribute level combinations: Minimum of 8 choice sets to estimate all 4 (generic) main effects
• Weak priors, exclusion of all dominant choice sets
• Free choice about the number of choice sets (# choice sets > df)

Set  TT1 TC1 AC1 QU1  TT2 TC2 AC2 QU2  TT3 TC3 AC3 QU3
1     1   3   1   1    3   2   3   2    3   1   3   3
2     1   3   3   3    2   1   1   2    3   3   3   1
3     3   2   1   1    2   3   1   2    1   2   2   2
4     1   1   3   2    3   3   2   1    2   3   1   2
5     2   1   2   2    3   2   1   1    1   3   1   2
..    ..  ..  ..  ..   ..  ..  ..  ..   ..  ..  ..  ..
17    1   3   3   3    2   2   1   1    3   1   2   1
18    3   2   1   2    1   3   1   2    1   3   3   1

→ no more dominant/weakly dominant/redundant choice sets
Some general remarks
• Experimental design creation is a research topic in its own right (Rose and Bliemer, 2009; Quan et al., 2011)
• If priors are misspecified, one might run into trouble. Be careful when using priors!
• Use attributes, values and trade-off variations that are "plausible"
• Make sure that there are some overlapping values of generic attributes between alternatives (pivot designs)
• Order effects: Randomize the order of alternatives across respondents in the questionnaire
• Carefully introduce respondents to the (hypothetical) scenario and explain the attributes you are presenting to them
Pivot designs
• It is preferable to base variations around values for observed behavior (state of the art in transportation research): Calculate the design with placeholder values (e.g. 1, 2, 3) and replace them by relative changes (e.g. 0.7, 1.1, 1.5)
• Combination of RP data with variations given by the design → respondents can better identify with the presented choice scenarios; much more variation in the attribute levels
• Possible to include one reference alternative in the choice sets (e.g. bike travel time, whose value is not varied)
• Problems:
– If reference values are (highly) dominant, the respondents will more likely choose the respective alternative (only in labeled experiments)
– Correlation between attributes; skewness
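The replacement of placeholder levels by relative changes around a respondent's reference values can be sketched as follows. The relative changes 0.7, 1.1 and 1.5 are the example values from the first bullet; the reference values and the three design rows are hypothetical.

```python
# Placeholder levels from the design and the relative changes they map to.
relative_change = {1: 0.7, 2: 1.1, 3: 1.5}

# Hypothetical reference values observed for one respondent (assumptions).
reference = {"time_min": 40.0, "cost_chf": 12.0}

# Hypothetical design rows: placeholder levels for (time, cost) in three choice sets.
design_rows = [(1, 3), (2, 1), (3, 2)]

# Pivoting: multiply each reference value by the relative change of its level.
pivoted = [
    {
        "time_min": reference["time_min"] * relative_change[t_level],
        "cost_chf": reference["cost_chf"] * relative_change[c_level],
    }
    for t_level, c_level in design_rows
]

for row in pivoted:
    print(row)
```

Because every respondent has different reference values, the same design yields different absolute attribute values for each person, which is the source of the additional variation mentioned above.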
Pivot designs: Trade-off distribution
Example where MIV is often cheaper and faster than PT → modification of reference values needed!
Testing a design: A case study
• Once you have your design, you should test how well it allows the coefficients of interest to be estimated, based on a simulation of a more or less hypothetical population
• Define priors for the attribute weights of the utility function based on recent similar studies
• Simulate the error structure (GEV) for the utility function, taking into account the panel structure of the designs
• Calculate individual utilities and determine the chosen alternatives for each simulated subject
• Estimate the parameters for the simulated data and compare the results with the a-priori assumptions
Testing a design: Pivot approach
Experimental design: 9 attributes with 3 levels (3^9 full factorial), 2 labeled alternatives, 32 choice sets (8 per subject)
Reference values taken from a Swiss mode choice experiment:
• Total travel time: For PT alternative = travel time without access and egress time; for MIV alternative = travel time + parking search time
• Total travel cost: For PT alternative = ticket price; for MIV alternative = fuel cost + parking cost
• Number of transfers: PT alternative only

Attribute                                      Effect code:  −1     0     1
Travel time (MIV and PT) [%]                                 70    90   110
Travel cost (MIV and PT) [%]                                 80   110   130
Delay prob. (MIV and PT) [%]                                  5    10    20
Walking / waiting time (MIV and PT) [min.]                    5    10    20
Number of transfers (PT) [#]                                 −1     0     1
Testing a design: A-priori Coefficients
• Prior values for the individual weights of attribute k, β_ik ∼ N(µ_k, σ_k), and alternative-specific constants are simulated based on results obtained from the linear model in the BMVI Zeitkostenstudie (Axhausen et al., 2014)
• For each simulated individual i, the same β_ik is used over all 8 choice sets, representing the panel structure of the experiment

Coefficient         Mean     SD      Type
ASC_MIV             0.172    0.5     Alternative-specific
β_time,MIV         -0.022    0.005   Alternative-specific
β_time,PT          -0.021    0.005   Alternative-specific
β_cost             -0.088    0.01    Generic
β_delay,MIV        -0.058    0.01    Alternative-specific
β_delay,PT         -0.027    0.005   Alternative-specific
β_walk             -0.047    0.01    Generic
β_transfers,PT     -0.4      0.1     Alternative-specific

VOT_MIV             15.0 CHF/h
VOT_PT              14.2 CHF/h

Number of simulated coefficient vectors β_ik: 400
Testing a design: Utility Function
The random utility model (RUM) framework assumes that in each choice set s, individual i perceives utility U_ijs for each alternative j among the full set of alternatives J (MIV and PT), given the attributes X_ijs, and chooses the one that maximizes utility. U_ijs has an observed component V_ijs and an unobserved component ε_ijs:

$$ U_{ijs} = V_{ijs} + \varepsilon_{ijs} \tag{5} $$

where

$$ \varepsilon_{ijs} \sim GEV(0, 1, 0) \tag{6} $$

and

$$ V_{ijs} = \sum_{k=1}^{K} \beta_{ik} X_{ijsk} \tag{7} $$
Testing a design: Choice simulation
The chosen alternatives choice_is are calculated as follows:

$$ choice_{is} = \begin{cases} \text{MIV} & \text{if } U_{i,\text{MIV},s} > U_{i,\text{PT},s} \\ \text{PT} & \text{else} \end{cases} \tag{8} $$
Snippet of a simulated discrete choice data set:
id   set  alt  choice  block  time    cost   walk    delay   transfers
                              (min.)  (CHF)  (min.)  (min.)  (#)
...  .    .    .       .      ...     ...    ..      ..      ...
3    8    MIV  1       1      40      12     20      5       -
3    8    PT   0       1      51      16     5       5       3
4    1    MIV  0       1      73      27     10      20      -
4    1    PT   1       1      53      26     20      5       0
4    2    MIV  1       1      54      27     20      20      -
...  .    .    .       .      ...     ...    ..      ..      ...
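The simulation steps behind such a data set can be sketched as follows. This is a simplified illustration and not the original simulation code: only travel time and cost are used, the time coefficient is treated as generic (using the MIV prior), and the attribute values are drawn from assumed uniform ranges instead of a pivoted design.

```python
import numpy as np

rng = np.random.default_rng(42)
n_individuals, n_sets = 400, 8

# Prior means and SDs for (time per min., cost per CHF), taken from the
# table of a-priori coefficients; a generic time coefficient is a simplification.
mu = np.array([-0.022, -0.088])
sigma = np.array([0.005, 0.01])

# One coefficient vector per individual, reused in all 8 choice sets (panel structure).
beta = rng.normal(mu, sigma, size=(n_individuals, 2))

# Hypothetical attribute values with shape (individual, set, alternative, attribute);
# alternative 0 = MIV, 1 = PT; attribute 0 = time [min.], 1 = cost [CHF].
X = np.stack(
    [
        rng.uniform(20, 80, size=(n_individuals, n_sets, 2)),
        rng.uniform(5, 30, size=(n_individuals, n_sets, 2)),
    ],
    axis=-1,
)

# Observed utility V_ijs = sum_k beta_ik * X_ijsk, equation (7).
V = np.einsum("isjk,ik->isj", X, beta)

# GEV(0, 1, 0) errors, i.e. standard Gumbel draws, equations (5) and (6).
U = V + rng.gumbel(loc=0.0, scale=1.0, size=V.shape)

# Choice rule of equation (8).
choice = np.where(U[..., 0] > U[..., 1], "MIV", "PT")

miv_share = float(np.mean(choice == "MIV"))
print("share of MIV choices:", miv_share)
```

Each row of `choice` holds the 8 simulated decisions of one individual, which corresponds to the `id`, `set` and `choice` columns of the snippet above.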
Testing a design: Estimation
• For given randomly drawn subsets of RP data, simulated β_ik coefficient vectors and simulated error terms ε_ijs, the models are estimated for 3 different designs. This process is repeated 2000 times to get insights into the distributions of coefficients (robustness), variances (precision) and values of time
• The between-design differences of E(β_k) and E(SE_k) with respect to the a-priori parameters are small

Design approach:     GLM                 MNL: β0             MNL: βk
                     E(βk)    E(SEk)     E(βk)    E(SEk)     E(βk)    E(SEk)
ASC_MIV              0.174    0.124      0.176    0.121*     0.175    0.133
β_time,MIV          -0.020    0.002     -0.021    0.002     -0.020    0.002
β_time,PT           -0.019    0.002     -0.019    0.002     -0.019    0.002
β_cost              -0.081    0.010     -0.082    0.009     -0.082    0.009
β_delay,MIV         -0.054    0.006*    -0.054    0.006     -0.054    0.007
β_delay,PT          -0.025    0.006     -0.025    0.006*    -0.025    0.006
β_walk              -0.044    0.004*    -0.044    0.004     -0.044    0.005
β_transfers,PT      -0.371    0.049     -0.373    0.050     -0.375    0.051

VOT_MIV              16.1                16.4                16.1
VOT_PT               14.9                15.2                15.0
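A single replication of such a test can be sketched as follows. This is a simplified illustration and not the procedure used for the table: it uses only two generic attributes, assumed uniform attribute differences instead of one of the three designs, a larger simulated sample, and a plain Newton-Raphson binary logit that ignores the panel structure when computing standard errors.

```python
import numpy as np

rng = np.random.default_rng(1)
n_individuals, n_sets = 2000, 8

# A-priori means and SDs for (time, cost); generic time coefficient is an assumption.
mu = np.array([-0.022, -0.088])
sigma = np.array([0.005, 0.01])
beta_i = rng.normal(mu, sigma, size=(n_individuals, 1, 2))

# Hypothetical attribute differences MIV - PT: time [min.] and cost [CHF].
dX = np.stack(
    [
        rng.uniform(-20, 20, size=(n_individuals, n_sets)),
        rng.uniform(-10, 10, size=(n_individuals, n_sets)),
    ],
    axis=-1,
)

# Utility difference with Gumbel errors for both alternatives.
eps = rng.gumbel(size=(n_individuals, n_sets, 2))
dU = (dX * beta_i).sum(axis=-1) + eps[..., 0] - eps[..., 1]
y = (dU > 0).astype(float).ravel()       # 1 if MIV is chosen
D = dX.reshape(-1, 2)

# Maximum likelihood estimation of a binary logit by Newton-Raphson.
b = np.zeros(2)
for _ in range(25):
    p = 1.0 / (1.0 + np.exp(-D @ b))
    grad = D.T @ (y - p)
    hess = (D * (p * (1.0 - p))[:, None]).T @ D
    step = np.linalg.solve(hess, grad)
    b = b + step
    if np.max(np.abs(step)) < 1e-10:
        break

# Standard errors from the inverse information matrix at the estimate.
p = 1.0 / (1.0 + np.exp(-D @ b))
info = (D * (p * (1.0 - p))[:, None]).T @ D
se = np.sqrt(np.diag(np.linalg.inv(info)))

vot = 60.0 * b[0] / b[1]                 # value of time in CHF/h
print("estimates:", b, "standard errors:", se, "VOT:", vot)
```

Repeating this with new random draws and collecting `b`, `se` and `vot` gives the kind of distributions summarized in the table, and the estimates can be compared with the a-priori values.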
Conclusions
• No substantial differences between the different design approaches: Designs are robust and reproduce the a-priori values well
• From a behavioral perspective, one should always exclude dominant and weakly dominant alternatives!
• Personal suggestion: Create an efficient design by ...
– carefully thinking about your research question and aims
– assigning about 8 choice sets to each respondent and using a block design (total # of choice sets ≈ 1.5 · df)
– using the MNL approach with zero (or weak) priors, excluding "undesired" choice sets by manually setting preference conditions
– updating your design after a pre-test study