

Demography with Massive Collaborative Data

Arthur Charpentier & Ewen Gallic

CREM UMR CNRS 6211, Université de Rennes 1 & Chaire Actinfo

Source: Wikipedia, https://en.wikipedia.org/wiki/The_Simpsons

https://3wen.github.io/genealogy/

eRum 2018, Budapest, May 2018

Introduction Methodology Representativity Mortality Migration References

Historical Demography: a large literature

• longitudinal data have been used in many projects
  ▸ Matthijs and Moreels (2010) (COR*)
    • Antwerp, Belgium, 1846–1920, ≈ 125k events, ≈ 57k individuals
  ▸ Mandemakers (2000)
    • Netherlands, 1812–1922, ≈ 77k individuals
  ▸ Bouchard et al. (1989) (BALSAC)
    • Québec, Canada, since the 17th century, ≈ 2M events, ≈ 575k individuals
  ▸ Bean et al. (1978)
    • mainly Utah, USA, since the 18th century, ≈ 1.2M individuals

A. Charpentier & E. Gallic Demography with Massive Collaborative Data - 2/36


Big Data and Collaborative Data

• Can we use collaborative data for historical demographic studies?
• A priori, yes, according to studies on longevity:
  ▸ Fire and Elovici (2015): data from WikiTree.com
    • +1M profiles (unknown number of individuals)
  ▸ Cummins (2017): genealogical trees from FamilySearch.org
    • +1.3M individuals
  ▸ Gavrilova and Gavrilov (2007): genealogical data from rootsweb
    • +75M individuals (dead)
  ▸ Gergaud et al. (2016): Wikipedia biographies
    • +1.2M individuals
  ▸ Kaplanis et al. (2018): genealogical trees from Geni.com
    • 13M individuals
• but they do not focus on the representativity of their datasets


Agenda

1 Introduction

2 Methodology

3 Representativity

4 Mortality

5 Migration

2 Methodology

Geneanet Data

• Users can create their genealogical trees
• For each event (birth, marriage, death), they can mention:
  ▸ names
  ▸ dates
  ▸ locations
• Extraction we obtained:
  ▸ individuals born between 1900 and 1910 in France: 238,009 users
  ▸ extraction of complete trees from 238,009 users: +700M records
  ▸ among all records: focus on individuals born between 1800 and 1804 and their descendants
  ▸ cleaning the dataset


Geneanet Dataset

• The sample contains:
  ▸ individuals born in France between 1800 and 1804
    • 1,547,086 individuals
  ▸ their descendants, up to 3 generations
    • 402,190 children
    • 286,071 grandchildren
    • 222,103 great-grandchildren
• Note: we only have access to public trees


Construction of Genealogical Trees from Raw Data

• Each row: events for one individual in a tree (Birth (N), Marriage (M), Death (D))
  ▸ location (name, latitude, longitude)
  ▸ date
• Individuals identified by the pair (ID_user, ID_ns)
• Link to the parents (ID_num_m, ID_num_f)

   ID_user         ID_ns             ID_num  Name     Surname  Sex  date_b    date_m
1  daage           besnard|jean|1       575  BESNARD  Jean     1    18000227  18280121
2  denisgallienne  besnard|louis|1    22771  BESNARD  Louis    1    18040603  18251110
3  domiassi        besnard|jean|       1748  BESNARD  Jean     1    18000227  18280121
4  dutheilfr       besnard|pierre|      729  BESNARD  Pierre   1    18001221  18270703
5  dvivier1        besnard|louis|1    65196  BESNARD  Louis    1    18001215  18291027

   date_d    type  place             Lat       Long      ID_num_m  ID_num_f
1  16810000  NM    Longué, 0180      47.37806  -0.10806      4457       574
2  18831027  ND    Cunault, 49350    47.30833  -0.15389       994      1620
3  18560000  NM    Longué, 49180     47.37806  -0.10806
4            N     Gennes, 49350     47.34083  -0.23278        99        59
5  18490717  N     Pommeraye, 49244  47.35528  -0.86028     43116      4063
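The raw rows above store dates as eight-digit strings, and some of them end in 0000 or are missing altogether. The following is a minimal sketch of how such fields might be read, assuming the format is YYYYMMDD and that a 00 month or day means "unknown"; both points are assumptions, and the names PartialDate, parse_event_date and person_key are hypothetical, not taken from the authors' code.

```python
from typing import NamedTuple, Optional, Tuple


class PartialDate(NamedTuple):
    year: int
    month: Optional[int]  # None when the month is recorded as 00 (assumed unknown)
    day: Optional[int]    # None when the day is recorded as 00 (assumed unknown)


def parse_event_date(raw: str) -> Optional[PartialDate]:
    """Parse an assumed YYYYMMDD string; return None for empty or malformed input."""
    raw = raw.strip()
    if len(raw) != 8 or not raw.isdigit():
        return None
    year, month, day = int(raw[:4]), int(raw[4:6]), int(raw[6:])
    return PartialDate(year, month or None, day or None)


def person_key(id_user: str, id_ns: str) -> Tuple[str, str]:
    """Identify an individual inside one user's tree by the pair (ID_user, ID_ns)."""
    return (id_user, id_ns)
```

A parsed date keeps the year even when month and day are unknown, so a record such as 18560000 can still contribute to year-level statistics.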


Construction of Genealogical Trees

• Raw data are obtained from ascendant genealogy
• We want to study descendants, so we need descendant genealogy

Figure (a) Raw data: two individuals with the same parents. Charles Mélanie Abel Hugo (born 1826-11-03 in Paris, 75; died 1871-03-13 in Bordeaux, 33) and Adèle Hugo (born 1830-07-28 in Paris, 75; died 1915-04-21, place unknown) are each recorded with their parents, Victor Marie Hugo (born 1802-02-26 in Besançon, 25; died 1885-05-22, place unknown) and Adèle Julie Foucher (born 1803-09-27 in Paris, 75; died 1868-08-27, place unknown).

Figure (b) Data we need: all children from a couple. Victor Marie Hugo and Adèle Julie Foucher are each linked to both children, Charles Mélanie Abel Hugo and Adèle Hugo.
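Going from ascendant records (each child pointing to its parents) to descendant records (each couple pointing to its children) amounts to inverting the parent links. The sketch below illustrates the idea on the Hugo example; the record layout (keys "id", "mother", "father") and the function name children_by_couple are hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def children_by_couple(records):
    """Group child identifiers by (mother, father), inverting child-to-parent links."""
    couples = defaultdict(list)
    for rec in records:
        mother, father = rec.get("mother"), rec.get("father")
        if mother is None or father is None:
            continue  # no known couple to attach this individual to
        couples[(mother, father)].append(rec["id"])
    return dict(couples)


# Ascendant view: each child carries a link to its parents.
records = [
    {"id": "charles_hugo", "mother": "adele_foucher", "father": "victor_hugo"},
    {"id": "adele_hugo", "mother": "adele_foucher", "father": "victor_hugo"},
    {"id": "victor_hugo", "mother": None, "father": None},
]

# Descendant view: each couple lists all of its children.
families = children_by_couple(records)
```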


Construction of Genealogical Trees

• Individuals can appear several times in the dataset
  ▸ use a simple algorithm to regroup them in six steps
  ▸ close names (e.g., Jean or Jehan) according to some metric

   ID_user      ID_np                    ID_num  Nom       Prenom          Sexe  Date_N    Date_D
1  ericde78     jolly|pierre|5             9549  JOLLY     Pierre          1     18000406
2  cfph89villy  jolly|pierre|8             5142  JOLLY     Pierre          1     18000406  18640120
3  ericde78     fournier|marie brigitte|   6688  FOURNIER  Marie Brigitte  2     17600000
4  ericde78     jolly|claude|              9487  JOLLY     Claude          1     17570524  18241209
5  cfph89villy  fournier|marie brigitte|   1351  FOURNIER  Marie Brigitte  2     17631005  18271227
6  cfph89villy  jolly|claude|2             1745  JOLLY     Claude          1     17570524  18241209

   Type  Lieu               Long     Lat       ID_num_m  ID_num_p
1  N     Villy,89800        3.75111  47.86778      6688      9487
2  ND    Villy              3.75111  47.86778      1351      1745
3  M     Lignorelles,89800  3.72750  47.86306       495      6713
4  M     Lignorelles,89800  3.72750  47.86306     16871      9547
5  NM    Lignorelles        3.72750  47.86306       167      1360
6  M     Lignorelles        3.72750  47.86306      4236      1906

• We obtain 2,457,450 unique individuals in the sample

ID       Nom    Prenom  Sexe  Date_N    Long_N   Lat_N     Date_D    Long_D   Lat_D
3301509  jolly  pierre  1     18000406  3.75111  47.86778  18640120  3.75111  47.86778
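The slides do not spell out the six steps or the name metric, so the following is only a sketch of the "close names" idea, not the authors' algorithm. It assumes two records refer to the same person when sex and birth date agree and both names are similar enough under Python's difflib ratio; the 0.75 threshold and the record keys are arbitrary choices made for illustration.

```python
from difflib import SequenceMatcher


def name_similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two names, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def likely_same_person(rec_a: dict, rec_b: dict, threshold: float = 0.75) -> bool:
    """Heuristic match: same sex, same birth date, and close names (assumed rule)."""
    if rec_a["sex"] != rec_b["sex"] or rec_a["birth"] != rec_b["birth"]:
        return False
    return (
        name_similarity(rec_a["surname"], rec_b["surname"]) >= threshold
        and name_similarity(rec_a["first_name"], rec_b["first_name"]) >= threshold
    )


# Two rows from different users' trees (values taken from the table above).
pierre_1 = {"surname": "JOLLY", "first_name": "Pierre", "sex": 1, "birth": "18000406"}
pierre_2 = {"surname": "JOLLY", "first_name": "Pierre", "sex": 1, "birth": "18000406"}
```

A rule like this would also pair spelling variants such as Jean and Jehan, while keeping apart two people who share a name but not a birth date.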


Individuals from the Dataset

Figure 1: Distribution of the year of birth, per generation. (x-axis: year of birth, 1800 to 1940; y-axis: number of individuals, log scale; series: ancestors, children, grandchildren, great-grandchildren.)

3 Representativity

Number of Births

Figure 2: Proportion of births per département in the sample, compared with official data (INSEE). (Panels: women, men; scale: 10 to 50%.)


Gender Ratio

• Sex ratio (number of male births / number of female births × 100):
  ▸ Geneanet: 117 (born between 1800 and 1804)
  ▸ Geneanet: 116 (entire sample)
  ▸ INSEE: 105 (in 1801)
  ▸ Blayo and Henry (1967): 105.4 (between 1740 and 1829, in Brittany and Anjou)
• ⇒ gender bias: men are over-represented in the sample
  ▸ Gavrilov and Gavrilova (2001) observed this bias in genealogical data (royal families)
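The ratio above and a proportion of men are two views of the same quantity. As a small worked sketch (function names are hypothetical), a ratio of 117 male births per 100 female births corresponds to a male share of 117 / 217, about 0.54, whereas the official ratio of 105 corresponds to about 0.51.

```python
def sex_ratio(male_births: int, female_births: int) -> float:
    """Number of male births per 100 female births."""
    return 100.0 * male_births / female_births


def share_of_males(ratio: float) -> float:
    """Convert a sex ratio (males per 100 females) into the proportion of males."""
    return ratio / (ratio + 100.0)
```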


Gender Ratio: small spatial heterogeneity

Figure 3: Proportion of men per département. (Scale: 0.52 to 0.55.)

4 Mortality

Mortality Analysis

• Use of dates of birth and death for each individual
• Calculate age at death
• Study mortality of individuals
  ▸ on national scale
  ▸ on regional scale
• Two measures:
  ▸ (conditional) cumulative survival distribution
  ▸ life expectancy (at birth)


Following cohorts

Figure 4: Lexis diagram.

"Life expectancy at birth is equal to the average life span of a notional generation that would know throughout its lifetime the age-specific mortality conditions of the year considered." (INSEE)


Following cohorts

• We study 5 cohorts: born from 1800 to 1804 (813,551 individuals)
• At each age:
  ▸ how many are still alive?
  ▸ how many died? (exposure)
• Comparison with other life tables (Vallin and Meslé, 2001)
  ▸ official life tables start in 1806
  ▸ estimate mortality above 6 years old
  ▸ huge bias on infant mortality
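Counting, at each age, how many members of a cohort are still alive and how many die can be sketched as a simple cohort life table. The code below is an illustration under stated assumptions (integer ages at death, a fully extinct cohort, and an optional lower age bound such as 6); the function name life_table and the output keys are hypothetical, not the authors' implementation.

```python
from collections import Counter


def life_table(ages_at_death, min_age=0):
    """Cohort life table from integer ages at death, conditional on reaching min_age."""
    ages = [a for a in ages_at_death if a >= min_age]
    if not ages:
        return []
    deaths = Counter(ages)
    cohort_size = len(ages)
    alive = cohort_size
    table = []
    for age in range(min_age, max(ages) + 1):
        d = deaths.get(age, 0)
        table.append({
            "age": age,
            "alive": alive,                    # individuals still alive at this age
            "deaths": d,                       # individuals dying at this age
            "survival": alive / cohort_size,   # (conditional) survival probability
            "death_prob": d / alive,           # probability of dying at this age
        })
        alive -= d
    return table
```

With min_age=6 the table describes survival conditional on reaching age 6, which is one way to sidestep the under-recording of infant deaths mentioned above.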


Survival Function

Figure 5: Comparing survival functions (left) and force of mortality (right, log scale) for women and men, with historical data. (x-axis: age, 10 to 110; series: Geneanet, historical.)


Life Expectancy

• Calculations on the entire population
  ▸ no lower bound on age any more
• Computation per gender
• On our data:
  ▸ 41.8 years for women
  ▸ 43.5 years for men
• Comparison with Vallin and Meslé (2001):
  ▸ 38.1 years for women
  ▸ 36.3 years for men
• Bias in the estimation of life expectancy at birth
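For a cohort whose members have all died, life expectancy at birth is simply the mean age at death. The sketch below, with a hypothetical function name and toy ages, shows why missing infant deaths push the estimate upward: dropping the early deaths from the sample raises the mean.

```python
def life_expectancy(ages_at_death, from_age=0.0):
    """Mean remaining years of life at from_age, for an extinct cohort."""
    survivors = [a for a in ages_at_death if a >= from_age]
    if not survivors:
        raise ValueError("no individual reached the requested age")
    return sum(survivors) / len(survivors) - from_age


# Toy cohort: one infant death and two adult deaths.
full_cohort = [0, 40, 80]
# Same cohort when the infant death was never recorded in any tree.
observed_cohort = [40, 80]
```

Here the full cohort gives 40 years and the observed one 60 years, a deliberately exaggerated version of the gap between the Geneanet estimates and the reference life tables.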


Spatial Heterogeneity

• Despite this bias, can we recover known spatial differences?
• Comparison with van de Walle (1973):
  ▸ women only
  ▸ cohorts from 1801 to 1810
• Methodology:
  ▸ life expectancy of women born between 1801 and 1804, per département
  ▸ difference between each département and the nation
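The comparison described above boils down to a per-département mean minus the national mean. A minimal sketch follows, assuming the input is a list of (département, age at death) pairs and that the national figure is the mean over all individuals; both are assumptions, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def deviation_from_national(records):
    """Per-département mean age at death minus the national mean (in years)."""
    by_dep = defaultdict(list)
    for dep, age in records:
        by_dep[dep].append(age)
    national = mean(age for ages in by_dep.values() for age in ages)
    return {dep: mean(ages) - national for dep, ages in by_dep.items()}
```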


Spatial Heterogeneity

Figure 6: Life expectancy at birth of women per département. (Panels: Geneanet, van de Walle; scale: deviation from the mean, per département, from -10 to 15 years.)

5 Migration

Between Generation Migration - département scale

• Exploiting coordinates of the birth places of individuals
• Migration between generations
• Département perspective
• Comparing
  ▸ location of birth of an individual
  ▸ location of birth of descendants
• Calculation of the proportion of migrants (i.e., the proportion of descendants born in a different département)
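The proportion of migrants can be sketched as the share of ancestor/descendant pairs whose birth départements differ. In this illustration the pairs are (ancestor département, descendant département) codes, and pairs with a missing location are dropped; that handling of missing values and the function name are assumptions, not something stated in the slides.

```python
def migrant_share(pairs):
    """Percentage of descendants born in a different département than their ancestor."""
    known = [(a, d) for a, d in pairs if a is not None and d is not None]
    if not known:
        return None  # nothing to compare
    movers = sum(1 for a, d in known if a != d)
    return 100.0 * movers / len(known)
```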


département scale

Figure 7: Percentage of descendants born in a different département than the one of their ancestor. (Panels: children, grandchildren, great-grandchildren; scale: 20 to 60%.)


Trimodality - real distance

• Distance between generations
• Without administrative regions
• As in Bourdieu et al. (2000), we observe tri-modality in the distribution of distances:
  ▸ sedentary / short-distance migration / long-distance migration

Figure 8: Migration between generations (Bourdieu et al., 2000).
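Distances between two birth places can be derived from their latitude and longitude. The slides do not say which formula was used; the sketch below uses the haversine great-circle distance on a spherical Earth of radius 6371 km, which is an assumption, as is the function name.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Coordinates of Longué and Cunault as they appear in the raw data shown earlier.
d = haversine_km(47.37806, -0.10806, 47.30833, -0.15389)
```

For these two neighbouring villages the result is a little under 9 km, which would fall in the short-distance regime.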


Trimodality for the first generation

Figure 9: Migration between generations. (Panels: children, grandchildren, great-grandchildren; x-axis: distance in km, log scale from 1 to 10,000; y-axis: 0 to 60%.)


Trimodality of Migrations

• We cut using:
  ▸ median distance between places of birth (ancestors and children)
  ▸ ≈ 20 kilometers
• This distance corresponds more or less to a proximity circle within which descendants will remain

Table 1: Within generation migrations in France (in percentages).

                 Children  Grandchildren  Great-grandchildren
Sedentary           62.17          38.06                24.17
Short distance      19.43          27.70                27.06
Long distance       18.40          34.24                48.77

Note: Sedentary individuals are born in the same place as their ancestor; short distance means < 20 km, long distance > 20 km.
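The three categories of Table 1 can be expressed as a small classification rule on the distance between birth places. This is a sketch with a hypothetical function name; treating a distance of exactly 20 km as long distance is an assumption, since the note only covers < 20 km and > 20 km.

```python
def migration_category(distance_km: float, threshold_km: float = 20.0) -> str:
    """Classify a birth-place distance as sedentary, short distance or long distance."""
    if distance_km == 0:
        return "sedentary"  # born in the same place as the ancestor
    if distance_km < threshold_km:
        return "short distance"
    return "long distance"  # includes exactly threshold_km (assumed convention)
```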


Migration Between Cities

• Migration as a function of the city / village size:
  ▸ large cities: more than 50,000 inhabitants (9 cities);
  ▸ medium cities: between 10,000 and 50,000 inhabitants (98 cities);
  ▸ small cities: fewer than 10,000 inhabitants (254 cities);
  ▸ villages: places not mentioned in Statistique Générale de la France (2010).
• With 19th-century data, there is no risk of over-weighting large cities:
  ▸ 98% of deliveries still took place at home in 1930 (Fine, 1991, p. 34)


Migration as a function of the city / village size

Table 2: Distribution of place of birth of individuals as a function of the city size.

Type of city   Ancestor  Children  Grandchildren  Great-grandchildren
Large              3.40      3.92           6.31                 7.63
Medium             5.18      5.25           6.55                 7.86
Small              3.43      3.56           3.73                 3.81
Village (XS)      87.99     87.27          83.41                80.70

Note: Each column gives, for one generation, the percentage of individuals born in each category of city.


Transitions among cities (according to their size)

Table 3: Transition probabilities (in %).

     Child                        Grandchild                   Great-grandchild
     L      M      S      XS      L      M      S      XS      L      M      S      XS
L    67.10   4.14   2.42  26.34   52.67   7.47   2.93  36.92   36.73   9.97   3.27  50.03
M     5.75  67.97   1.96  24.32   10.42  47.99   3.23  38.36   14.02  33.85   3.02  49.11
S     4.63   4.11  65.37  25.89   10.21   7.64  37.46  44.69   12.50  11.13  21.38  54.99
XS    1.13   1.52   1.29  96.06    3.62   3.92   2.39  90.07    5.46   5.91   3.22  85.42

Note: Frequency of birth in a large (L), medium (M), small (S) or extra-small (XS) city, per column, according to the place of birth of the ancestor (per row).
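A table of this shape can be obtained by counting (ancestor category, descendant category) pairs and normalising each row to 100%. The sketch below shows that computation on toy data; the function name and the input layout are hypothetical, and rows with no observation are set to zero by assumption.

```python
from collections import Counter

CATEGORIES = ("L", "M", "S", "XS")


def transition_table(pairs):
    """Row-normalised transition percentages from (origin, destination) pairs."""
    counts = Counter(pairs)
    table = {}
    for origin in CATEGORIES:
        total = sum(counts[(origin, dest)] for dest in CATEGORIES)
        table[origin] = {
            dest: (100.0 * counts[(origin, dest)] / total if total else 0.0)
            for dest in CATEGORIES
        }
    return table


# Toy data: (city category of the ancestor, city category of the descendant).
pairs = [("L", "L"), ("L", "XS"), ("XS", "XS"), ("XS", "XS"), ("XS", "M"), ("XS", "XS")]
table = transition_table(pairs)
```

Each non-empty row sums to 100, as the rows of Table 3 do up to rounding.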


Paris

Figure 10: Percentage of migration towards Paris. (Panels: children, grandchildren, great-grandchildren; scale: 0 to 12%.)


Paris

Figure 11: Proportion of descendants born in Paris, per département, as a function of the distance to Paris. (x-axis: distance to Paris, 0 to 750 km; y-axis: percentage, 0 to 9%; series: children, grandchildren, great-grandchildren.)


Bibliography I

Bean, L. L., May, D. L., and Skolnick, M. (1978). The Mormon historical demography project. Historical Methods: A Journal of Quantitative and Interdisciplinary History, 11(1):45–53. doi:10.1080/01615440.1978.9955216.

Blayo, Y. and Henry, L. (1967). Données démographiques sur la Bretagne et l’Anjou de 1740 à 1829. Annales de démographie historique, 1967(1):91–171. doi:10.3406/adh.1967.955.

Bouchard, G., Roy, R., Casgrain, B., and Hubert, M. (1989). Fichier de population et structures de gestion de base de données : le fichier-réseau BALSAC et le système INGRES/INGRID. Histoire & Mesure, 4(1):39–57. doi:10.3406/hism.1989.874.

Bourdieu, J., Postel-Vinay, G., Rosental, P.-A., and Suwa-Eisenmann, A. (2000). Migrations et transmissions inter-générationnelles dans la France du XIXe et du début du XXe siècle. Annales. Histoire, Sciences Sociales, 55(4):749–789. doi:10.3406/ahess.2000.279879.

Cummins, N. (2017). Lifespans of the European elite, 800–1800. The Journal of Economic History, 77(02):406–439. doi:10.1017/s0022050717000468.

Fine, A. (1991). La population française au XIXe siècle, volume 1420. Presses Universitaires de France-PUF.

Fire, M. and Elovici, Y. (2015). Data mining of online genealogy datasets for revealing lifespan patterns in human population. ACM Transactions on Intelligent Systems and Technology, 6(2):1–22. doi:10.1145/2700464.

Gavrilov, L. A. and Gavrilova, N. S. (2001). Etude biodemographique des determinants familiaux de la longevite humaine. Population, 56(1/2):225. doi:10.2307/1534823.

Gavrilova, N. S. and Gavrilov, L. A. (2007). Search for predictors of exceptional human longevity. North American Actuarial Journal, 11(1):49–67. doi:10.1080/10920277.2007.10597437.

Gergaud, O., Laouenan, M., and Wasmer, E. (2016). A Brief History of Human Time. Exploring a database of "notable people".


Bibliography II

Kaplanis, J., Gordon, A., Shor, T., Weissbrod, O., Geiger, D., Wahl, M., Gershovits, M., Markus, B., Sheikh, M., Gymrek, M., Bhatia, G., MacArthur, D. G., Price, A. L., and Erlich, Y. (2018). Quantitative analysis of population-scale family trees with millions of relatives. Science. doi:10.1126/science.aam9309.

Mandemakers, K. (2000). Historical sample of the Netherlands. Handbook of international historical microdata for population research, pages 149–177.

Matthijs, K. and Moreels, S. (2010). The Antwerp COR*-database: A unique Flemish source for historical-demographic research. The History of the Family, 15(1):109–115. doi:10.1016/j.hisfam.2010.01.002.

Statistique Générale de la France (2010). Données sur la démographie, la population et l’enseignement primaire sur la période 1800-1925. https://www.insee.fr/fr/statistiques/2591293?sommaire=2591397. Accessed 9 February 2018.

Vallin, J. and Meslé, F. (2001). Tables de mortalité françaises pour les XIXe et XXe siècles et projections pour le XXIe siècle. Éditions de l’Institut national d’études démographiques.

van de Walle, E. (1973). La mortalité des départements français ruraux au XIXe siècle. Annales de démographie historique, 1973(1):581–589. doi:10.3406/adh.1973.1164.
