Networks, Big Data and Statistical Physics: A killing combination

Statistical Physics, Network theory & Big data: An approach to human mobility. Oleguer Sagarra, Dept. Física Fonamental, University of Barcelona

Upload: oleguer-sagarra

Post on 01-Nov-2014


TRANSCRIPT

Page 1: Networks, Big Data and Statistical Physics: A killing combination

Statistical Physics, Network theory & Big data

An approach to human mobility

Oleguer Sagarra Dept. Física Fonamental,

University of Barcelona

Page 2: Networks, Big Data and Statistical Physics: A killing combination


Statistical Physics &

Big Data

“New Social Sciences”

A killing combination...

Page 3: Networks, Big Data and Statistical Physics: A killing combination

Why?

Mobility has deep implications for many processes (contagion, the spread of ideas...)

The development of GPS/mobile phone technologies makes gathering data cheap and possible at large scale.


We want to study Human Mobility…

Page 4: Networks, Big Data and Statistical Physics: A killing combination

What?

Different scales (Micro/Meso/Macro)

Society is heterogeneous… (Humans are not “monkeys”… in principle!)


(Human) Mobility is a rather complex process…

But we are physicists! So we will try to model it anyway…

Page 5: Networks, Big Data and Statistical Physics: A killing combination

But we don’t need modelling…


“Computers are useless, they can only give you answers…” (P. Picasso)

Rather, this talk is about questions…

“Models push the boundaries of our understanding”

Page 6: Networks, Big Data and Statistical Physics: A killing combination

How?


Real (big) Data

Theoretical Empirical

Physics Mathematics

Network Science

Page 7: Networks, Big Data and Statistical Physics: A killing combination

The data... (has problems)

Citizens

a) How to get it?

Private companies (Social Media)

Page 8: Networks, Big Data and Statistical Physics: A killing combination

Getting the data... Experiments

Smartphones give lots of “sensing opportunities”

Citizen science aims to involve people in data collection, sharing and processing


BeePath: Experiments on human mobility

http://bee-path.net

(By the way: a very interesting project, but we don’t have time for it today)

Page 9: Networks, Big Data and Statistical Physics: A killing combination

Getting the data... Social Media


b) Is it biased? (Big data can also mean big errors)

Page 10: Networks, Big Data and Statistical Physics: A killing combination

Social media data

Social media data is geolocalized, so we can extract trajectories from it.

But first, is the data representative of the population?


We can compare with the census… Analysis must be done at user level!

(We want info about people, not about “some people that tweet a lot”)
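A minimal sketch of what a user-level comparison with the census could look like. The field names (`user_id`, `region`) and the toy numbers are assumptions for illustration, not data from the talk. Each user is counted once per region, so one very active account does not distort the picture.

```python
from collections import defaultdict

def users_per_region(posts):
    """Count distinct users per region, so heavy posters count only once.

    `posts` is an iterable of (user_id, region) pairs (hypothetical fields).
    """
    seen = defaultdict(set)
    for user_id, region in posts:
        seen[region].add(user_id)
    return {region: len(users) for region, users in seen.items()}

def share(counts):
    """Turn absolute counts into fractions of the total."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}

# Toy data: user u1 posts three times from region A.
posts = [("u1", "A"), ("u1", "A"), ("u1", "A"), ("u2", "A"), ("u3", "B")]
census = {"A": 2000, "B": 1000}  # made-up census populations

user_counts = users_per_region(posts)
user_share = share(user_counts)
census_share = share(census)

for region in census:
    print(region, round(user_share[region], 3), round(census_share[region], 3))
```

Counting raw posts instead would give region A a share of 4/5, while counting users gives 2/3, which matches the toy census.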

Page 11: Networks, Big Data and Statistical Physics: A killing combination


From points to a network?

The data... is geolocalized, and (too) big!

c) Continuous vs discrete data

(We want only the flows: From where and to where people go, “on average”)

Page 12: Networks, Big Data and Statistical Physics: A killing combination

The network approach

Network

Data

Filtering

Aggregation (grid)
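The pipeline above (data, filtering, grid aggregation, network) can be sketched as follows. This is an illustration under assumptions, not the author's code: the record layout `(user_id, timestamp, lat, lon)`, the square grid in degrees and the rule "consecutive points of the same user in different cells form one trip" are all hypothetical choices.

```python
import math
from collections import Counter, defaultdict

def cell_of(lat, lon, size_deg):
    """Map a coordinate to the index of the square grid cell containing it."""
    return (math.floor(lat / size_deg), math.floor(lon / size_deg))

def flows_from_points(points, size_deg=0.01):
    """Aggregate geolocated points into origin-destination flows between cells.

    `points` is an iterable of (user_id, timestamp, lat, lon) tuples.
    Returns a Counter mapping (origin_cell, destination_cell) -> number of trips.
    """
    by_user = defaultdict(list)
    for user_id, ts, lat, lon in points:
        by_user[user_id].append((ts, cell_of(lat, lon, size_deg)))

    flows = Counter()
    for visits in by_user.values():
        visits.sort()  # chronological order for each user
        for (_, origin), (_, dest) in zip(visits, visits[1:]):
            if origin != dest:  # filtering: ignore movement inside one cell
                flows[(origin, dest)] += 1
    return flows

# Toy trajectories for two users (made-up coordinates).
points = [
    ("u1", 1, 41.385, 2.173),
    ("u1", 2, 41.3851, 2.1731),  # same cell as the previous point
    ("u1", 3, 41.403, 2.174),
    ("u2", 1, 41.385, 2.173),
    ("u2", 5, 41.403, 2.174),
]
flows = flows_from_points(points)
print(flows)
```

The resulting dictionary is a weighted, directed network: cells are nodes and trip counts are edge weights.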


Page 13: Networks, Big Data and Statistical Physics: A killing combination

Network data


(We can now apply network metrics and… data is normalized!)

Sagarra, O. Master Thesis. http://upcommons.upc.edu/pfc/handle/2099.1/13134

Page 14: Networks, Big Data and Statistical Physics: A killing combination

Now we know how to deal with the data...


We want to detect “abnormal” patterns...

What is chance, what is not?

What is important, what is not?

Page 15: Networks, Big Data and Statistical Physics: A killing combination

Modeling as a physicist…


Take all trivial elements out…

Keep just the “basic” factors in mobility:

- Distance / Cost (a.k.a. laziness)
- Population density (a.k.a. opportunities)

(We look for causality, not correlation)

Page 16: Networks, Big Data and Statistical Physics: A killing combination

Macro/Meso level: (urban/regional/national)


Taking inspiration from Statistical Mechanics and Network Theory, one can define flexible null models.

We need a general model for mobility networks…

Page 17: Networks, Big Data and Statistical Physics: A killing combination


Procedure:

1. Fix some hypothesis: “The population leaving or entering each cell is given”

2. Generate predictions: “How do the flows organize?”

3. Compare Data vs Prediction

We need a null model for the data...

(quite a lot of maths….)*

Sagarra, O. et al. Phys. Rev. E 88, 062806 (2013)
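The three steps of the procedure can be sketched with the simplest null expectation compatible with the hypothesis. Assuming only that the trips leaving each cell (s_out) and entering each cell (s_in) are given, and that trips are otherwise assigned independently, the expected flow from i to j is s_out[i] * s_in[j] / T, with T the total number of trips. This is an illustrative choice, not necessarily the exact model of the talk; the cited paper develops the general framework. The function names and toy flows are hypothetical.

```python
from collections import Counter

def null_model_flows(flows):
    """Expected flows when only the in/out totals of each cell are fixed.

    `flows` maps (origin, destination) -> observed number of trips (positive).
    Returns a dict with the expectation s_out[o] * s_in[d] / T for every pair.
    Pairs with o == d are included; drop them if self-loops are not allowed.
    """
    s_out, s_in = Counter(), Counter()
    for (origin, dest), trips in flows.items():
        s_out[origin] += trips
        s_in[dest] += trips
    total = sum(flows.values())
    return {(o, d): s_out[o] * s_in[d] / total for o in s_out for d in s_in}

def ratios(flows, expected):
    """Observed over expected flow: values far from 1 flag 'abnormal' pairs."""
    return {pair: flows.get(pair, 0) / exp for pair, exp in expected.items()}

# Step 1: the hypothesis is encoded by the totals of these toy flows.
observed = {("A", "B"): 8, ("A", "C"): 2, ("B", "C"): 6, ("C", "A"): 4}
# Step 2: generate predictions.
expected = null_model_flows(observed)
# Step 3: compare data vs prediction.
ratio = ratios(observed, expected)
print(expected[("A", "B")], ratio[("A", "B")])
```

By construction the expected flows reproduce the out-totals and in-totals of the data, so any remaining difference reflects how the flows organize beyond those constraints. Deciding whether a given ratio is statistically significant requires the fluctuations of the model as well, which this sketch does not compute.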

Page 18: Networks, Big Data and Statistical Physics: A killing combination

Roadmap


Hypothesis... Modelling

Raw data Clean data

Data features Prediction

Experiments, Databases...

Data treatment tools

Null Model predictions

Visualizations

Statistical Validation

(We are here) (Product)

Page 19: Networks, Big Data and Statistical Physics: A killing combination

What’s the goal of all this?

Understand what drives human mobility

Discriminate important factors from negligible ones (population density, distance, cost...)

Create tools to study data in an unbiased manner


Page 20: Networks, Big Data and Statistical Physics: A killing combination

Thanks for your attention...

[email protected] @usagarra
