
Page 1

Assisted specification of discrete choice models

Nicola Ortelli Michel Bierlaire

Transport and Mobility Laboratory
School of Architecture, Civil and Environmental Engineering
École Polytechnique Fédérale de Lausanne

October 24, 2021

Page 2: Introduction

Outline

1 Introduction

2 Assisted Specification of Discrete Choice Models [Ortelli et al., 2021]

3 Experiments

4 Conclusion

Page 3: Introduction

Model development

1 Behavioral hypothesis.

2 Model specification.

3 Model estimation.

4 Test and validation.

5 Go to 1.

Page 4: Introduction

Context

Specifying DCMs is difficult

No “absolute rule” and exhaustive testing is intractable.

Usual approach: expert intuition and trial-and-error.

DCMs & ever-larger datasets

Significant growth of collected data:

“Taller” data — more observations;
“Wider” data — more variables.

Consequently, two major problems for DCMs: estimation & specification.

→ Easy way out: use machine learning instead!

Page 5: Introduction

DCMs vs. ML

ML ≻ DCMs

Scalability — better at handling large datasets.

Data-driven — no need for presumptive structural assumptions.

DCMs ≻ ML

Extrapolation — theory is needed to extrapolate beyond the observed data.

Interpretability — important for trust in the model.

Page 6: Introduction

The Importance of Behavioral Realism

“Predictions made by the model are conditional on the correctness of the behavioral assumptions and, therefore, are no more valid than the behavioral assumptions on which the model is based. A model can duplicate the data perfectly, but may serve no useful purpose for prediction if it represents erroneous behavioral assumptions. For example, consider a policy that will drastically change present conditions. In this case the future may not resemble the present, and simple extrapolation from present data can result in significant errors. However, if the behavioral assumptions of the model are well captured, the model is then valid under radically different conditions.”

[Ben-Akiva, 1973]

Page 7: Introduction

Motivation

Main premise

Human expertise is fundamental to formulate plausible hypotheses and to verify their compliance with behavioral theories. Data — no matter how big — cannot replace such knowledge.

Still, can we use data to mitigate the need for a model specification to be known a priori, without compromising model interpretability?

Page 8: Assisted Specification of DCMs

Outline

1 Introduction

2 Assisted Specification of Discrete Choice Models [Ortelli et al., 2021]

3 Experiments

4 Conclusion

Page 9: Assisted Specification of DCMs

Journal of Choice Modelling 39 (2021) 100285

Available online 29 March 2021. © 2021 The Authors. Published by Elsevier Ltd. Open access under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

Assisted specification of discrete choice models

Nicola Ortelli a,b,*, Tim Hillel b, Francisco C. Pereira c, Matthieu de Lapparent a, Michel Bierlaire b

a School of Management and Engineering Vaud (HEIG-VD), University of Applied Sciences and Arts Western Switzerland (HES-SO), Switzerland
b Transport and Mobility Laboratory (TRANSP-OR), École Polytechnique Fédérale de Lausanne (EPFL), Switzerland
c Machine Learning for Smart Mobility Group (MLSM), Danmarks Tekniske Universitet (DTU), Denmark


Keywords: Discrete choice models; Utility specification; Multi-objective optimization; Combinatorial optimization; Metaheuristics

Abstract

Determining appropriate utility specifications for discrete choice models is time-consuming and prone to errors. With the availability of larger and larger datasets, as the number of possible specifications exponentially grows with the number of variables under consideration, the analysts need to spend increasing amounts of time on searching for good models through trial-and-error, while expert knowledge is required to ensure these models are sound. This paper proposes an algorithm that aims at assisting modelers in their search. Our approach translates the task into a multi-objective combinatorial optimization problem and makes use of a variant of the variable neighborhood search algorithm to generate sets of promising model specifications. We apply the algorithm both to semi-synthetic data and to real mode choice datasets as a proof of concept. The results demonstrate its ability to provide relevant insights in reasonable amounts of time so as to effectively assist the modeler in developing interpretable and powerful models.

1. Introduction

In the last 40 years, discrete choice models (DCMs) have been used to tackle a wide variety of demand modeling problems. This is due to their high interpretability, which allows researchers to verify their compliance with well-established behavioral theories (McFadden, 1974) and to provide support for policy and planning decisions founded on utility theory from microeconomics. However, the development of DCMs through manual specification is laborious. The predominant approach for this task is to a priori include a certain number of variables that are regarded as essential in the model, before testing incremental changes in order to improve its goodness of fit while ensuring its behavioral realism (Koppelman and Bhat, 2006). Because the set of candidate specifications grows beyond manageable even with a moderate number of variables under consideration, hypothesis-driven approaches of this kind can be time-consuming and prone to errors. Modelers tend to rely on common sense or intuition, which may lead to incorrectly specified models. The implications of misspecification include lower predictive power, biased parameter estimates and erroneous interpretations (Bentz and Merunka, 2000; Torres et al., 2011; Van Der Pol et al., 2014).

This issue, together with the advent of big data and the need to analyze ever-larger datasets, has induced an increasing focus on machine learning (ML) and other data-driven methods as a way of relieving the analyst of the burden of model specification. Unlike



https://doi.org/10.1016/j.jocm.2021.100285. Received 8 July 2020; received in revised form 4 December 2020; accepted 16 March 2021.

Page 10: Assisted Specification of DCMs

Disclaimer

What our method does

Assist modelers.

Save them as much time as possible.

Provide a broader understanding of any dataset.

Search more thoroughly than any human modeler would care to.

What our method does not do

Replace the analyst.

Provide the best possible model for a given dataset.

Page 11: Assisted Specification of DCMs / Specification of DCMs as an Optimization Problem

Specification of DCMs...

Data

For each individual n, we have:

a vector sₙ of socioeconomic characteristics;
a vector xᵢₙ of attributes for each alternative i;
the chosen alternative iₙ.

“Information set”

The analyst provides:

a partition of attributes into K groups;
L potential transformations of the attributes;
S segmentations of the population;
M potential models Pₘ(i | V; µ), m = 1, . . . , M.

→ What is the “best” specification we can come up with?

Page 12: Assisted Specification of DCMs / Specification of DCMs as an Optimization Problem

Specification of DCMs... as an Optimization Problem

Decision variables

Five sets of binary variables:

δk = 1 if group k is in the model;
γk = 1 if group k is associated with a generic coefficient;
φkℓ = 1 if group k is associated with nonlinear transform ℓ;
σks = 1 if the coefficients of group k are segmented based on s;
ρm = 1 if model m is used to calculate the choice probabilities.

ω = (δ, γ, φ, σ, ρ) unequivocally describes a model specification.
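As a purely illustrative sketch (not the paper's actual implementation), such a specification could be stored as plain binary arrays. The class and field names below are hypothetical, and transform index 0 is assumed to denote the linear (untransformed) case, consistent with the "Change linearity" operator shown later:

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class Spec:
    """Binary encoding of one candidate specification omega = (delta, gamma, phi, sigma, rho)
    for K attribute groups, L nonlinear transforms, S segmentations and M candidate models."""
    delta: np.ndarray  # shape (K,):     delta[k] = 1 if group k enters the model
    gamma: np.ndarray  # shape (K,):     gamma[k] = 1 if group k gets a generic coefficient
    phi:   np.ndarray  # shape (K, L+1): phi[k, l] = 1 if group k uses transform l (l = 0 assumed linear)
    sigma: np.ndarray  # shape (K, S):   sigma[k, s] = 1 if group k's coefficients are segmented on s
    rho:   np.ndarray  # shape (M,):     rho[m] = 1 if model m computes the choice probabilities
```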

Objective function

Goodness of fit, i.e., log likelihood L(ω)?

Parsimony, i.e., number of parameters Z(ω)?

Why not both?

Page 13: Assisted Specification of DCMs / Specification of DCMs as an Optimization Problem

Multi-Objective Optimization

Two objective functions!

We want the best fit (maximize L) with the fewest parameters (minimize Z).

Conflicting objectives: improving one deteriorates the other.

AIC and BIC quantify the trade-off between L and Z...

... but we don’t need to!
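(For reference, with Z(ω) estimated parameters, N observations and log likelihood L(ω), the usual definitions are AIC(ω) = 2 Z(ω) − 2 L(ω) and BIC(ω) = Z(ω) ln N − 2 L(ω); each collapses the two objectives into a single score, which is exactly the weighting the Pareto approach avoids.)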

How do we rank solutions then?

Page 14: Assisted Specification of DCMs / Specification of DCMs as an Optimization Problem

Pareto Optimization

Dominance

Solution ω1 dominates ω2 (ω1 ≻ ω2) if:

not worse in any objective:

L(ω1) ≥ L(ω2) and Z(ω1) ≤ Z(ω2),

strictly better in at least one objective:

L(ω1) > L(ω2) or Z(ω1) < Z(ω2).

Pareto front P
Set of non-dominated solutions (if ω ∈ P, there is no feasible ω′ such that ω′ ≻ ω).

All solutions in P are considered equally good.

Solution to a multi-objective problem.
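A minimal sketch of this dominance test, assuming every estimated specification is summarized by a (log likelihood, number of parameters) pair; names are hypothetical:

```python
def dominates(a, b):
    """True if solution a dominates b (a > b in the Pareto sense): not worse in
    either objective and strictly better in at least one. Each argument is a
    tuple (loglik, n_params); log likelihood is maximized, parameters minimized."""
    not_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return not_worse and strictly_better
```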

Page 15: Assisted Specification of DCMs / Proposed Algorithm

Multi-Objective Variable Neighborhood Search

Metaheuristics

Too many feasible solutions!

Explore the search space efficiently.

“Sufficiently good solutions in reasonable amounts of time.”

Description

Iteratively improve an approximation of the Pareto front.

Start with one or several user-defined candidates.

Three ingredients:

Exploration — how we move in the search space;
Intensification — how we find local optima;
Diversification — how we escape local optima.

Page 16: Assisted Specification of DCMs / Proposed Algorithm

Operators

Exploration — how we move in the solution space

Generate “neighbors” of a model.

Mimic what an experienced modeler would do.

Each operator modifies a subset of the decision variables.

The complexity of the modification depends on the “size” of the neighborhood.

Operators are invoked randomly.

Page 17: Assisted Specification of DCMs / Proposed Algorithm

Operators

Change variables: δk ← 1 − δk
Change generic *: γk ← 1 − γk
Change non-linearity *: {(k, ℓ) | φkℓ = 1} : φkℓ ← 0, φkℓ′ ← 1
Change linearity *: if φk0 = 1 : φk0 ← 0, φkℓ′ ← 1; if φk0 = 0 : φk0 ← 1, φk1 = · · · = φkL = 0
Change segmentation *: σks ← 1 − σks
Increase segmentation *: {(k, s) | σks = 0} : σks ← 1
Decrease segmentation *: {(k, s) | σks = 1} : σks ← 0
Change model: {m | ρm = 1} : ρm ← 0, ρm′ ← 1
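To illustrate how such operators act on the binary encoding, here is a hypothetical "Change variables" operator written against the Spec sketch above; it is not the authors' code:

```python
import copy
import random

def change_variables(spec, k=None):
    """'Change variables' operator: flip delta[k] for one attribute group,
    i.e. add the group to the utility specification or drop it."""
    new = copy.deepcopy(spec)
    if k is None:
        k = random.randrange(len(new.delta))  # pick a group at random
    new.delta[k] = 1 - new.delta[k]
    return new
```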

Page 18: Assisted Specification of DCMs / Proposed Algorithm

Generic VNS Algorithm

Intensification — how we find local optima

Local search:

Select model ω ∈ P at random;
Generate and estimate neighbor ω′.

Update the Pareto front P when appropriate:

Add ω′ to P if {ω ∈ P | ω ≻ ω′} = Ø;
Remove from P all solutions dominated by ω′.

Diversification — how we escape local optima

Systematic changes of neighborhood size p = 1, . . . ,P.

After Q unsuccessful iterations, p ← p + 1.

After each successful iteration, p ← 1.
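Putting the pieces together, a compact sketch of this loop could look as follows, reusing the dominates function sketched earlier; estimate and neighbor are assumed helpers (estimation of one specification, and application of p randomly drawn operators), and this is only a schematic reading of the slide, not the published implementation:

```python
import random

def mo_vns(initial_specs, estimate, neighbor, p_max, q_max, n_iter):
    """Multi-objective VNS sketch: maintain an approximate Pareto front,
    intensify by local search around it, and diversify by growing the
    neighborhood size p after q_max unsuccessful iterations."""
    front = [(s, estimate(s)) for s in initial_specs]   # (spec, (loglik, n_params)) pairs
    p, fails = 1, 0
    for _ in range(n_iter):
        base, _ = random.choice(front)                  # pick a model on the front at random
        candidate = neighbor(base, p)                   # generate a size-p neighbor
        obj = estimate(candidate)
        if any(dominates(o, obj) for _, o in front):    # dominated: unsuccessful iteration
            fails += 1
            if fails >= q_max:
                p, fails = min(p + 1, p_max), 0         # diversification: enlarge neighborhood (capped here)
        else:                                           # successful: update the front, reset p
            front = [(s, o) for s, o in front if not dominates(obj, o)]
            front.append((candidate, obj))
            p, fails = 1, 0
    return front
```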

Page 19: Assisted Specification of DCMs / Proposed Algorithm

Ensuring Behavioral Realism

Consistency with behavioral intuition is important

User-defined constraints to verify model validity.

Two options:

Hard — reject all models that violate the constraints (post-estimation).
Soft — constrained maximum likelihood estimation [Schoenberg, 1997].

Why constrained maximum likelihood?

Hard constraints may conceal severe specification issues.

Fewer rejected models → more thorough search.

All estimates guaranteed to be “behaviorally valid”.
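To make the “soft” option concrete: behavioral constraints (e.g. non-positive time and cost coefficients) are imposed during estimation rather than checked afterwards. Below is a toy binary-logit illustration with synthetic data, using simple bound constraints; it only conveys the idea and is not the constrained estimator of Schoenberg (1997) used in the paper:

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(beta, X1, X2, y):
    """Negative log likelihood of a binary logit with V_i = X_i @ beta.
    y[n] = 1 if alternative 1 was chosen in observation n."""
    v = X1 @ beta - X2 @ beta                  # utility difference V1 - V2
    p1 = 1.0 / (1.0 + np.exp(-v))              # P(choose alternative 1)
    p = np.where(y == 1, p1, 1.0 - p1)
    return -np.sum(np.log(np.clip(p, 1e-12, None)))

# Hypothetical example: beta = (b_time, b_cost), both required to be non-positive.
rng = np.random.default_rng(0)
X1, X2 = rng.normal(size=(100, 2)), rng.normal(size=(100, 2))
y = rng.integers(0, 2, size=100)
res = minimize(neg_loglik, x0=np.zeros(2), args=(X1, X2, y),
               bounds=[(None, 0.0), (None, 0.0)])  # sign constraints enforced as bounds
print(res.x)  # estimates satisfy the behavioral constraints by construction
```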

Page 20: Experiments

Outline

1 Introduction

2 Assisted Specification of Discrete Choice Models [Ortelli et al., 2021]

3 Experiments

4 Conclusion

Page 21: Experiments / In-Sample/Out-of-Sample Comparison

Data

Swissmetro dataset [Bierlaire et al., 2001]

SP data.

3 alternatives.

10’710 observations, 20% as validation data.

Problem size

K = 3, L = 5, S = 7, M = 3.

≈ 4.6 × 10⁸ possible specifications.

Page 22: Experiments / In-Sample/Out-of-Sample Comparison

In-Sample

[Figure: in-sample negative log likelihood vs. number of estimated parameters, showing the Pareto front, removed and considered specifications, the initial model, and the AIC-, BIC- and OOS-optimal models.]

Page 23: Experiments / In-Sample/Out-of-Sample Comparison

Out-of-Sample

[Figure: normalized negative log likelihood vs. number of estimated parameters, comparing the in-sample and out-of-sample Pareto fronts.]

Page 24: Experiments / Semi-Synthetic Data

Data

Synthetic choices

Based on the Swissmetro data.

We define a “true model”:

Estimated on the original data.
Used to sample new, synthetic choices.
Can be reached by the algorithm!

Problem size (same as previous)

K = 3, L = 5, S = 7, M = 3.

≈ 4.6 × 10⁸ possible specifications.

Page 25: Experiments / Semi-Synthetic Data

True Model Nearly Recovered

[Figure: negative log likelihood vs. number of estimated parameters, showing the Pareto front, removed and considered specifications, the initial model, and the true model.]

Page 26: Experiments / Alternate Initial Models

Data

London Passenger Mode Choice [Hillel et al., 2018]

RP data.

4 alternatives.

54’766 + 26’320 observations.

Problem size

K = 5, L = 3, S = 8, M = 4.

≈ 4.7 × 10¹⁰ possible specifications.

Page 27: Experiments / Alternate Initial Models

Initial Model: ASCs Only

[Figure: negative log likelihood vs. number of estimated parameters, showing the Pareto front, removed and considered specifications, the initial model, the AIC- & BIC-optimal model, and the OOS-optimal model.]

Page 28: Experiments / Alternate Initial Models

Initial Model: Expert-Defined

[Figure: negative log likelihood vs. number of estimated parameters, showing the Pareto front, removed and considered specifications, the initial model, the AIC-optimal model, and the BIC- & OOS-optimal model.]

Page 29: Conclusion

Outline

1 Introduction

2 Assisted Specification of Discrete Choice Models [Ortelli et al., 2021]

3 Experiments

4 Conclusion

Page 30: Conclusion

Conclusion

Summary

Flexible approach for assisted specification of DCMs.

Appropriate use of data can partially relieve the modeler.

Sets of high-quality specifications in reasonable amounts of time.

Interpretable results!

Two major limitations

Computational cost.

Overfitting → This is why we should never replace the analyst!

Page 31: Conclusion

Conclusion

Future work

Try other objectives! Out-of-sample performance?

Leverage the whole Pareto front. Model averaging?

Faster estimation:

Heuristics based on partial estimation.
More complex optimization algorithms: HAMABS [Lederrey et al., 2021].

Page 32: References

Experiments

Ortelli, N., Hillel, T., Pereira, F. C., de Lapparent, M., Bierlaire, M., 2021. Assisted specification of discrete choice models. Journal of Choice Modelling 39, 100285.

Datasets

Bierlaire, M., Axhausen, K., Abay, G., 2001. The acceptance of modal innovation: the case of Swissmetro. In: Proceedings of the 1st Swiss Transportation Research Conference.

Hillel, T., Elshafie, M., Jin, Y., 2018. Recreating passenger mode choice-sets for transport simulation: a case study of London, UK. Proceedings of the Institution of Civil Engineers - Smart Infrastructure and Construction 171 (1), 29-42.

Quotation

Ben-Akiva, M. E., 1973. Structure of Passenger Travel Demand Models (Ph.D. Thesis). Dept. of Civil Engineering, Massachusetts Institute of Technology, Cambridge, MA.

Faster estimation of DCMs

Lederrey, G., Lurkin, V., Hillel, T., Bierlaire, M., 2021. Estimation of discrete choice models with hybrid stochastic adaptive batch size algorithms. Journal of Choice Modelling 38, 100226.

Page 33

Questions?
