Sylvie Huet. Modelling from data: an experience in modelling rural demography. Laboratoire d’Ingénierie pour les Systèmes Complexes. From data to models, Cergy-Pontoise, 27-28 June 2013.
www.irstea.fr
To better affirm its missions, Cemagref becomes Irstea
Modelling from data: an experience in modelling rural demography
From data to models, Cergy-Pontoise, 27-28 June 2013
Sylvie Huet
Laboratoire d’Ingénierie pour les Systèmes Complexes
Context: demography in rural municipalities
Evolution of rural areas in Europe?
Coupling demography and residential mobility of people in order to study their evolution at a very local scale: the municipality level
[Diagram labels: In-migration, Demography, Residential mobility, Out-migration]
Context: demography in rural municipalities
Coupling microsimulation and agent-based modelling
No integrated theory exists, so we extract and use data to build a globally coherent theory through dynamic modelling at the individual level and at the municipality level
Decision support in demography generally uses microsimulation modelling (O'Donoghue 2001; Li and O’Donoghue 2012).
Space and residential mobility
A first application to the population of the Cantal (a French département) (Huet et al. 2012a, 2012b).
Problem
1. An interesting motivation
2. A well-identified overall modelling choice
3. A marvellous applied research question
But data! As a constraint, as theories, as results…
“No fuss, no waffle, just results”
Summary: everything through the prism of data
1. Finding and inventorying data
2. Choosing data for dynamic modelling
Finding data
1. Can’t build a specific survey: the problem is too large
2. Can’t use a reweighted sample of individuals: too few available and too difficult to access
Tenacity… and then
Finding and inventorying data
At first, we had nothing…
and finally we have too much!
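Enquêtes Générations (generation surveys) 1988, 1998, …
Histoires familiales (family histories)
Histoire de vie (life history) 2003
Distribution des salaires (wage distribution, INSEE)
Enquête Emploi (employment survey)
Labour Force Survey
Household Panel
CORINE Land Cover
Recensements agricoles (agricultural censuses) 1988, 1998, 2005
Inventaire Communal (municipal inventory) 1998, 1998
Base permanente des équipements (permanent database of facilities)
Recensements (censuses) 1990, 1999, 2006, …
Tables de mobilité (mobility tables) 1999
Enquête logements (housing survey)
Revenus des ménages (household incomes)
ISSP sens du travail (ISSP, meaning of work)
SIRENE entreprises (SIRENE business register)
Finances Communales (municipal finances)
DGF
Réseau chambre d’hôtes (bed-and-breakfast network)
Taxes de séjour (tourist taxes)
SITADEL (housing)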
Confusion
Turning confusion into results
[Diagram labels: DATA, MODEL, LINKING]
Criteria to choose
Summary
1. Finding and inventorying data
2. Choosing data for dynamic modelling
Criteria for choosing among all the data?
1. Quantity of work
2. People and ideas
3. Building the various dynamics (and their couplings)
4. Calibrating and validating the model
1. Criteria: time and cognitive costs!
The ones we don’t really talk about, linked to the quantity of work
• Cost of investigating the data sources
• Ease of use with statistical tools, and representativity
• Possible reuse of generic objects and dynamics in other countries
What a costly approach!
For every possible source, one has to study:
• List of questions
• List of variables (not necessarily the direct answers to questions)
• List of modalities for each variable
• Representativity at various scales, for various populations…
• Understanding the hidden (upstream) model and theories
Laborious, difficult, not valued… not publishable, not a research problem, too long to explain…
Many people always use the same survey, just as we always use the same tools or the same methods
2. Criteria: working with people and ideas
In interdisciplinary work, the ones you don’t think of a priori:
• Understandable for involved people (and comparable with other models)
• Working with research partners
A compromise to decide on, or choosing who you are not going to understand
Criteria: working with researchers and ideas
• The existing/chosen data were not collected under the hypotheses of their theories: misunderstanding, disagreement
• Some, especially modellers, don’t usually use data
• Some, especially modellers, have difficulty understanding what individual-based modelling means
Why not use the wages?
3. Criteria: building the various dynamics
To build the various dynamics (and their couplings)
• Possible interconnectivity of various sources
Example: using the LFS and the Census jointly. Both give the “same” activity sectors and socio-professional categories, which makes it possible to define the employment offer at the municipality level (Census) and the way an individual chooses a job and changes it (LFS)
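The linking idea in this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layouts, category names and numbers below are invented, and real Census and LFS files look different. The point is that a shared nomenclature lets one source supply the job offer per municipality while the other supplies individual transition probabilities.

```python
from collections import Counter, defaultdict

# Hypothetical toy records; real Census and LFS files have different layouts.
# Census-like rows: (municipality, activity_sector, socio_professional_category, jobs)
census_jobs = [
    ("A", "agriculture", "farmer", 40),
    ("A", "services", "employee", 25),
    ("B", "services", "employee", 60),
]

# LFS-like rows: (category_before, category_after) observed for individuals
lfs_transitions = [
    ("farmer", "farmer"),
    ("farmer", "employee"),
    ("employee", "employee"),
    ("employee", "employee"),
]

def job_offer_by_municipality(rows):
    """Aggregate the number of jobs per municipality and category (Census side)."""
    offer = defaultdict(Counter)
    for municipality, _sector, category, jobs in rows:
        offer[municipality][category] += jobs
    return offer

def transition_probabilities(transitions):
    """Estimate P(category_after | category_before) from individual records (LFS side)."""
    counts = defaultdict(Counter)
    for before, after in transitions:
        counts[before][after] += 1
    return {
        before: {after: n / sum(c.values()) for after, n in c.items()}
        for before, c in counts.items()
    }

offer = job_offer_by_municipality(census_jobs)
probs = transition_probabilities(lfs_transitions)

# The shared nomenclature is what makes the two sources usable together.
shared = set(probs) & {cat for c in offer.values() for cat in c}
```

Categories that appear in only one source would fall outside `shared`, which is one quick way to spot where the two nomenclatures are not really the “same”.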
Criteria: building the various dynamics
• Problem of statistical representation (example of low-density areas representing a small part of the population: 39% rural or peri-urban areas)
Census: few data sources at low level, and few theories and/or little knowledge
Example in Cantal: the number of farmers; no problem accessing housing, but problems accessing services
European Household Panel or National Census?
Criteria: building the various dynamics starting from wrong data
With the wrong data, in the sense of irrelevant or inconvenient, but chosen for their capacity to “reveal” a relevant dynamic
The numbers of in- and out-migrants have this property, since they link every process related to mobility, starting from the decision to move
Choosing a decision to move: “checking model”
[Formulas garbled in extraction: one expression in nbSizes and one exponential variant with a parameter a; the values 17025 and 17075 are shown on the slide]
Old people move too much when the decision is based only on the size of the current housing
Family reasons are the most cited reasons for the decision to move (impact on needed size)
Assessing the chosen decision to move
LITERATURE (statistical analysis from data): Debrand and Taffin (2006) note that moving decreases with age
But also that moving to a larger dwelling is much more common than moving to a smaller one
And finally we can also reproduce the critical values, more simply, by deciding to move with a lower probability when the need is to decrease the residence size
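A minimal Python sketch of such an asymmetric decision rule follows. It is not the model's actual formula: the function names, the one-room-per-member rule, and the probabilities `p_grow` and `p_shrink` are all hypothetical, chosen only to show a lower probability of moving when the household needs a smaller dwelling.

```python
import random

def needed_size(household_members):
    """Hypothetical rule: one room per household member, capped at 6 size classes."""
    return min(max(household_members, 1), 6)

def decides_to_move(current_size, household_members, p_grow, p_shrink, rng):
    """Return True if the household decides to move this time step.

    p_grow applies when the dwelling is too small, p_shrink when it is too large.
    Choosing p_shrink < p_grow encodes the observation that moves to a larger
    dwelling are more common than moves to a smaller one.
    """
    target = needed_size(household_members)
    if target > current_size:
        return rng.random() < p_grow
    if target < current_size:
        return rng.random() < p_shrink
    return False  # dwelling already fits: no size-based reason to move

# Illustrative run with invented probabilities and a fixed seed.
rng = random.Random(0)
grow = sum(decides_to_move(2, 4, 0.3, 0.05, rng) for _ in range(10_000))
shrink = sum(decides_to_move(4, 2, 0.3, 0.05, rng) for _ in range(10_000))
```

In a calibration setting, `p_grow` and `p_shrink` would be the kind of parameters adjusted until the simulated numbers of movers match the observed ones.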
Choosing dynamics to ensure consistency (again in the case of wrong data)
Counterintuitive choices to ensure consistency between endogenous submodels, parameterised by calibration, and exogenous submodels, parameterised from data.
Example: in the residential mobility model, people are liable to migrate out of the region if and only if they have found a new place of residence inside the region!
=> only because we know nothing but the probability of leaving the region versus moving inside the region (i.e. the problem of the unknown decision to move)
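The ordering of the two steps can be sketched as follows. This is an illustrative Python fragment, not the model's code: `resolve_move` and `p_leave_given_move` are hypothetical names, and the housing search is reduced to a boolean. It shows why the data-based split between leaving and moving inside is only applied to households whose search inside the region succeeded.

```python
import random

def resolve_move(found_residence_inside, p_leave_given_move, rng):
    """Outcome of one move attempt for a household.

    found_residence_inside: result of the endogenous housing search in the region.
    p_leave_given_move: share of movers who leave the region rather than move
        inside it (the only quantity assumed known from data).
    """
    if not found_residence_inside:
        return "stay"  # no completed move, so the data-based split is never applied
    if rng.random() < p_leave_given_move:
        return "out_migration"
    return "internal_move"
```

Because the split is conditional on a completed move, the simulated ratio of out-migrants to internal movers stays consistent with the observed one, whatever the calibrated decision-to-move submodel does.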
4. Criteria: calibration and validation
To calibrate (finding out the parameters of the dynamics chosen through the checking-model procedure) and validate the model:
• Temporal continuity of the definitions and availability, including the initial state (e.g. 1990, 1999, 2006, dwelling size…)
• Relevance of the spatial scale at which the data are available
• Critical indicators about the temporal evolution, especially related to “initially” unknown dynamics
Example for Cantal…
The Cantal: data for calibration
[Figure panels: 1999; 2000-2006]
A DECREASING POPULATION BUT AN INCREASING MIGRATORY BALANCE (switching during the period) AND A DECREASING NATURAL BALANCE
The Cantal: data for calibration
[Figure, 2000-2006. Decreasing municipalities: red; increasing municipalities: blue]
WITH A LARGE HETEROGENEITY OF THE TENDENCIES AT LOCAL LEVEL
The Cantal: data for calibration
[Figure panels: 1999; 2000-2006]
WITH A STRONG SPATIAL CONSTRAINT
A LOT OF MOVES DESPITE A WEAK MIGRATORY BALANCE
[Figure values for the periods 1990-1999 and 2000-2006: 133459, 116461, 17025, 17075, 9814, 11905]
An almost impossible calibration despite the data and because of the data
Aim to respect the tendency, not only the absolute distance to the various measures over time. What is a small overall distance worth if the tendency is not the same? A combination of all the tendencies is almost impossible to obtain… This requires a quasi-continuous loop of rebuilding the model
Small distance but bad tendency
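The difference between the two criteria can be made concrete with a small Python sketch. The helper names and all numbers are invented for illustration and are not taken from the Cantal data; the sketch only shows that a run can be close to the observations while going in the wrong direction.

```python
def mean_abs_distance(simulated, observed):
    """Average absolute gap between simulated and observed values."""
    return sum(abs(s - o) for s, o in zip(simulated, observed)) / len(observed)

def same_tendency(simulated, observed):
    """True if both series go up, both go down, or both stay flat
    between the first and the last date."""
    def sign(x):
        return (x > 0) - (x < 0)
    return sign(simulated[-1] - simulated[0]) == sign(observed[-1] - observed[0])

# Invented numbers, for illustration only.
observed = [100, 98, 96]
close_but_wrong = [97, 98, 99]   # small distance, opposite tendency
far_but_right = [110, 107, 104]  # larger distance, same tendency
```

A calibration score that only used `mean_abs_distance` would prefer `close_but_wrong`; requiring `same_tendency` as well rejects it, which is the stricter objective described above.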
A never-ending validation
Too much data, in a way… how to choose to restrict the validation process? I don’t know at this stage.
As with the calibration problem, you can never be satisfied, since you have a lot of data left: almost all the data you did not retain for building the initialisation or the dynamics
Synthesis, at this point of my study, of what data bring to the dynamic modelling of large systems at a low level
In the end, very difficult to use as a predictive tool, even though microsimulations (built from data) are usually built for this purpose and considered reliable since they propose a consistent theory extracted from data
Much more useful (probably even more than classical theoretical approaches or discrete choice models) for learning about composing dynamics, since such models consider many coupled dynamics (instead of hypothesising that they are negligible): the checking-dynamics procedure
Data challenges the interdisciplinary work (instead of simplifying)!
What a richness and a nightmare!