Networks, Big Data and Statistical Physics: A killing combination
Statistical Physics, Network
theory & Big data
An approach to human mobility
Dept. Física Fonamental,
University of Barcelona
A killing combination...
“New Social Sciences”
We want to study Human Mobility…
Mobility has deep implications for many processes...
(contagion, spread of ideas...)
The development of GPS/mobile phone technologies
makes gathering data cheap and possible at large
(Human) Mobility is a rather complex process…
Different scales (Micro/Meso/Macro)
Society is heterogeneous… (Humans are not
“monkeys”… in principle!)
But we are physicists! So we will try to
model it anyway…
But we don’t need
“Computers are useless, they can only
give you answers…” (P. Picasso)
This talk is about questions rather…
“Models push the boundaries of our
Real (big) Data
The data... (has problems)
a) How to get it?
Getting the data... Experiments
Smartphones give lots of “sensing opportunities”
Citizen science aims to involve people in data
collection, sharing and processing
BeePath: Experiments on
(Btw: Very interesting project, but don’t have time for it today)
Getting the data...
b) Is it biased?
(Big data can also mean big errors)
Social media data
Social media data is geolocated, so we can extract
trajectories from it.
But first, is the data representative of the population?
(We want info about people, not about “some people that tweet a lot”)
We can compare with the census…
Analysis must be done at user level!
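A minimal sketch of such a user-level check, assuming each geolocated post is a (user, region) pair and that census population is known per region. The function names, the "most frequent region" rule for placing a user, and the toy numbers are illustrative assumptions, not the procedure used in the talk. Each user is counted once, however often they post, and the resulting user counts are correlated with the census figures.

```python
from collections import Counter, defaultdict
from math import sqrt


def users_per_region(posts):
    """Count users (not posts) per region.

    Each user is assigned to the region where they posted most often,
    so a very active user still counts only once.
    """
    regions_by_user = defaultdict(Counter)
    for user, region in posts:
        regions_by_user[user][region] += 1
    homes = Counter()
    for regions in regions_by_user.values():
        home, _ = regions.most_common(1)[0]
        homes[home] += 1
    return homes


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Toy data (hypothetical): user u1 posts three times but counts once.
posts = [("u1", "A"), ("u1", "A"), ("u1", "B"), ("u2", "A"),
         ("u3", "B"), ("u4", "C"), ("u4", "C"), ("u5", "A")]
census = {"A": 3000, "B": 1000, "C": 1000}

users = users_per_region(posts)
region_names = sorted(census)
r = pearson([users[k] for k in region_names],
            [census[k] for k in region_names])
```

A correlation close to 1 suggests that the sample of users follows the population distribution. A low value would point to the bias the slides warn about.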
The data... is geolocalized,
and (too) big!
c) Continuous vs discrete data
From points to a network?
(We want only the flows: From where and to where people go, “on average”)
The network approach
(We can now apply network metrics
and… data is normalized!)
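One possible way to go from points to a network is sketched below, assuming trajectories are stored per user as (time, x, y) tuples and that space is divided into square cells. The cell size, the data layout and the rule "one trip per change of cell" are assumptions made for illustration, not the method of the cited thesis.

```python
from collections import Counter


def to_cell(x, y, cell_size):
    """Map a continuous position to the index of its grid cell."""
    return (int(x // cell_size), int(y // cell_size))


def flows_from_trajectories(trajectories, cell_size):
    """Build a weighted, directed network of flows between cells.

    trajectories: dict mapping user -> list of (time, x, y) tuples.
    Returns a Counter mapping (origin_cell, destination_cell) -> trips.
    """
    flows = Counter()
    for points in trajectories.values():
        # Sorting the tuples orders the points by time.
        cells = [to_cell(x, y, cell_size) for _, x, y in sorted(points)]
        for origin, destination in zip(cells, cells[1:]):
            if origin != destination:  # ignore moves inside one cell
                flows[(origin, destination)] += 1
    return flows


# Toy data (hypothetical coordinates, arbitrary units).
trajectories = {
    "u1": [(0, 0.2, 0.3), (1, 1.5, 0.1), (2, 1.7, 0.4)],
    "u2": [(5, 1.1, 0.2), (0, 0.9, 0.9)],
}
flows = flows_from_trajectories(trajectories, cell_size=1.0)
```

The result is an edge list with integer weights, which is the form that standard network metrics expect.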
Sagarra, O. Master Thesis. http://upcommons.upc.edu/pfc/handle/
Now we know how to deal
with the data...
We want to detect “abnormal” patterns...
What is chance, what is not?
What is important, what is not?
Modeling as a physicist…
Take all trivial elements out…
Keep just the “basic” factors in mobility
- Distance / Cost (a.k.a. laziness)
- Population density (a.k.a. opportunities)
(We look for causality, not correlation)
We need a general model for mobility networks…
Taking inspiration from Statistical Mechanics
and Network Theory, one can define flexible
We need a null model for the
1. Fix some hypotheses
“The population leaving or entering each cell is given”
(quite a lot of maths….)*
2. Generate predictions
“How do the flows organize?”
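To make the two steps concrete, here is a minimal sketch that is not the derivation from the cited paper. It assumes the simplest case in which fixing the population leaving and entering each cell leads to expected flows that factorize as (trips out of i) × (trips into j) / (total trips), and it allows trips that start and end in the same cell. The function name and the toy flows are hypothetical.

```python
from collections import Counter


def expected_flows(flows):
    """Expected flow between every pair of cells under the null model.

    Assumption: only the number of trips leaving and entering each cell
    is fixed, and the expectation factorizes as s_out[i] * s_in[j] / T.
    """
    s_out, s_in = Counter(), Counter()
    for (origin, destination), trips in flows.items():
        s_out[origin] += trips
        s_in[destination] += trips
    total = sum(flows.values())
    return {(i, j): s_out[i] * s_in[j] / total
            for i in s_out for j in s_in}


# Toy observed flows between three cells (hypothetical numbers).
observed = {("A", "B"): 8, ("A", "C"): 2, ("B", "C"): 6, ("C", "A"): 4}
expected = expected_flows(observed)

# Ratio of data to prediction: values far from 1 mark flows that the
# fixed populations alone do not explain.
ratio = {pair: observed[pair] / expected[pair] for pair in observed}
```

By construction the expected flows leaving each cell add up to the observed number, so any deviation between data and prediction comes from how trips are distributed among destinations.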
Data vs Prediction
Sagarra, O. et al. Phys. Rev. E 88, 062806 (2013)
Data treatment tools
(We are here)
Null Model predictions
What’s the goal of all this?
Understand what drives human mobility
Discriminate important factors from negligible ones
(population density, distance, cost...)
Create tools to study data in an unbiased manner
Thanks for your attention...