Big Data Challenge
COMP 41700
Seminars in Data Science
Summary of the presentation:

Short Introduction of Telecom Italia Big Data Challenge – Donagh

Summary of Paper 1 and Paper 2 – Rajesh

Other interesting insights we can draw from this dataset – Malika
a contest designed to stimulate
the creation and development of
innovative technological ideas in
the Big Data field
History
• Early 2014: Telecom Italia released the first edition, which was closed
• Its success meant that the next iteration was open
• Freely available for anyone to use
• https://dandelion.eu/datamine/open-big-data/
Data sets
• Geo-referenced (Milan and the Autonomous Province of Trento)
• Anonymised
• Millions of records
• November to December 2013
• Extracted from telecom records, energy, weather, public and private transport, social networks
Milano / Trentino
• Grid
Milano datasets
Domain: Datasets
• Telecommunications: SMS, Call, Internet; MI to Provinces; MI to MI
• Weather: Weather Station Data; Precipitation
• Environment: Air Quality
• News: Milano Today
• Social: Tweets
Tweets
• Username (anonymised)
• Entities
• Language
• Municipality
• Tweet time
• Geometry
Paper 1
(Anatomy and efficiency of urban multimodal mobility)
Main goal: to find the optimal time-respecting path between two geographic locations in the multimodal network.
Here, l(a,b) is the length of the quickest (time-respecting and minimal) trip on the network, and d(a,b) is the Euclidean distance from the origin a to the destination b.
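A time-respecting path may only use a connection that departs after the traveller has reached its departure stop. The following is a minimal sketch of that idea, not the paper's implementation: it scans connections in departure order and keeps the earliest known arrival per stop. The `Connection` type, the stop names, the coordinates, and the toy timetable are all hypothetical.

```python
from dataclasses import dataclass
from math import hypot, inf


@dataclass(frozen=True)
class Connection:
    """One timetable hop between two stops (hypothetical toy format)."""
    dep_stop: str
    arr_stop: str
    dep_time: int  # minutes after midnight
    arr_time: int  # minutes after midnight
    mode: str


def earliest_arrival(connections, origin, destination, start_time):
    """Return the earliest arrival time at destination, or inf if unreachable."""
    best = {origin: start_time}
    for c in sorted(connections, key=lambda c: c.dep_time):
        # Usable only if we are already at the departure stop when it leaves.
        reachable = best.get(c.dep_stop, inf) <= c.dep_time
        if reachable and c.arr_time < best.get(c.arr_stop, inf):
            best[c.arr_stop] = c.arr_time
    return best.get(destination, inf)


# Hypothetical timetable mixing two modes (two layers of the network).
timetable = [
    Connection("A", "B", 480, 495, "bus"),
    Connection("A", "C", 485, 530, "bus"),
    Connection("B", "C", 490, 500, "metro"),  # leaves before we can reach B
    Connection("B", "C", 500, 510, "metro"),
]

# Hypothetical stop coordinates in km, used for the Euclidean distance d(a,b).
coords = {"A": (0.0, 0.0), "B": (3.0, 0.0), "C": (3.0, 4.0)}

arrival = earliest_arrival(timetable, "A", "C", 480)
travel_minutes = arrival - 480
d_ac = hypot(coords["C"][0] - coords["A"][0], coords["C"][1] - coords["A"][1])
print(travel_minutes, d_ac)
```

In this toy case the bus-plus-metro trip beats the direct bus, which is the kind of multimodal effect the paper studies; the single sorted scan assumes every connection arrives no earlier than it departs.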
Rail then becomes dominant at 40 km, and air travel is dominant for trips of the order of 700 km. Other transportation modes play a secondary role, with peaks at 22 km for the Metro, 40 km for Ferries and 70 km for Coaches.
The bus system covers most of the short trips, whereas the advantage of using the Metro and Rail systems emerges progressively for longer distances.
The total number of stop events Ω grows proportionally with the urban area population P.
Here, C(α) is the number of stop events in the layer α and Δt is the duration of the time interval.
Paper 2
(High resolution population estimates from telecommunications data)
Data sources: telecommunications data (provided by Telecom Italia), census data, and satellite images (provided by Landsat)
Main goal: create high-resolution (235 m × 235 m) population estimates in time and space
Difficulties: population counts can change rapidly, which makes it hard to acquire local census estimates in a timely and accurate manner. The correlation coefficient between call volume and the underlying population distribution varies with time.
Building map:
41% of the area on the map is generated directly.
To classify the remaining 59%, they train a random forest classifier using OpenStreetMap data as labeled training examples.
Population distribution:
At the low end, population is distributed exponentially:
29% of grid squares have zero population
5% of grid squares have a population of 1
3% of grid squares have a population of 2, and so on.
39% of grid squares have a population over 100.
Beyond the initial range, the population follows a normal distribution with a mean of 400 persons.
Telecommunications activity:
Activity is recorded in 10-minute intervals for each of the 235 m × 235 m grid cells.
Communication activity is approximately log-normal.
There are 5 types of communication activity: SMSIN, SMSOUT, CALLIN, CALLOUT, and INTERNET.
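As a rough illustration of working with the 10-minute records, the sketch below sums them into hourly totals per grid cell and activity type. The record layout used here, `(cell_id, minute_of_day, activity_type, value)`, is a simplifying assumption for the example and not the dataset's actual file schema; the sample values are made up.

```python
from collections import defaultdict

ACTIVITY_TYPES = ("SMSIN", "SMSOUT", "CALLIN", "CALLOUT", "INTERNET")


def hourly_totals(records):
    """Sum 10-minute activity values into (cell_id, hour, activity_type) buckets."""
    totals = defaultdict(float)
    for cell_id, minute_of_day, activity_type, value in records:
        if activity_type not in ACTIVITY_TYPES:
            raise ValueError(f"unknown activity type: {activity_type}")
        # Integer division maps minutes 600..659 to hour 10, and so on.
        totals[(cell_id, minute_of_day // 60, activity_type)] += value
    return dict(totals)


# Hypothetical records: (cell_id, minute_of_day, activity_type, value).
records = [
    (1, 600, "CALLOUT", 2.0),
    (1, 610, "CALLOUT", 3.5),
    (1, 660, "CALLOUT", 1.0),
    (2, 600, "SMSIN", 4.0),
]

totals = hourly_totals(records)
print(totals[(1, 10, "CALLOUT")])
```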
Elementary Model:
Previous research has suggested a relation between population and telecommunication activity at location i:
(w stands for call volume, p stands for population)
Not Perfect:
The relationship between call volume and population in this region is much weaker below a threshold of 351 persons.
The main reason is that densely populated areas tend to have more cell towers, which makes the relationship easier to observe.
Model(1):
Model(2):
Try to find the best hours of call volume data:
Each type correlates most strongly during the hour from 10 am to 11 am and, as with the total call volumes, CALLOUT has the greatest correlation, approximately 0.68. Thus we use CALLOUT from 10 am to 11 am for the w_i in Model(2).
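The hour selection described above can be sketched as follows: compute the Pearson correlation between per-cell activity and per-cell population for each hour, then keep the hour with the highest value. This is an illustrative sketch on made-up numbers, not the paper's code; the function names and the choice to correlate raw rather than transformed values are assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def best_hour(activity_by_hour, population):
    """Return (hour, r) for the hour whose activity correlates best with population."""
    scores = {
        hour: pearson(values, population)
        for hour, values in activity_by_hour.items()
    }
    hour = max(scores, key=scores.get)
    return hour, scores[hour]


# Hypothetical per-cell population and CALLOUT volume for two candidate hours.
population = [10, 20, 30, 40]
callout_by_hour = {
    3: [5, 1, 4, 2],
    10: [1, 2, 3, 4],
}

hour, r = best_hour(callout_by_hour, population)
print(hour, round(r, 2))
```

Repeating this per activity type would give the comparison between SMSIN, SMSOUT, CALLIN, CALLOUT, and INTERNET that the slide summarises.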
Where else can we use the Telecom Italia
Dataset?
Analyzing cities using the space-time structure of the mobile phone network
• Attempts to connect telecom usage data from Telecom Italia mobile to the geography of human activity
• Usage of telecom data to enhance the understanding of cities as spaces of flows
Using the Telecom dataset for social network analysis
• Investigating social structures through the use of network and graph theories
• Anthropology, Biology, Communication Studies, etc.
Traffic monitoring in urban areas
• Use of telecom data to track dense regions
• Rerouting strategies
• Increase public transport in dense areas
• Provide more taxis in dense areas
Other Usages
User localization

Security

Health care: tracking users' exercise
Thank you...
Special Thanks to my team members:
Hao Wu and He Ping

Telecom Italia Big Data Challenge


Editor's Notes

  1. What is the goal? Why are they doing this?
  2. At the beginning of 2014, Telecom Italia, in collaboration with several international partners, launched the Telecom Italia Big Data Challenge. The contest made available to developers, designers and scientists a large dataset of 30+ kinds of data (mobile, weather, energy, etc.). The datasets were released only for use by the participants. After the end of the contest, demand for those datasets rose. They want people to reuse the data.
  3. The data provided in the dataset of the Big Data Challenge is geo-referenced (areas: Milan and the Autonomous Province of Trento – Italy) and anonymized. The dataset contains millions of records of data covering the period from November to December 2013 extracted from telecommunications records, energy, weather, public and private transport, social networks and events.
  4. Some of the datasets referring to the Milano urban area are spatially aggregated using a grid. We refer to this grid as the Milano Grid. The schema of the grid is: cellId, geometry (expressed as GeoJSON).
  5. The grid has the following spatial description: the square id numbering starts from the bottom-left corner of the grid and grows until its top-right corner.
  6. Datasets are divided up into domains. Telecommunications has 3 datasets: SMS, call and Internet activity; call data from Milan to provinces; call data within Milan.
  7. Each row corresponds to a tweet. For privacy reasons the user id has been obfuscated and the text has been replaced with a list of entities extracted by the Entity Extraction API tool. Entities are provided as links to DBpedia. Fields: user, entities, language, municipality, date created, timestamp, geometry.
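The grid numbering described in notes 4 and 5 can be turned into row and column indices with a small helper. This is a sketch under stated assumptions: ids are taken to start at 1 and to increase left to right along each row before moving up to the next row, and the default of 100 columns is an assumed grid width that is not given in these notes.

```python
def cell_to_row_col(cell_id, n_cols=100, first_id=1):
    """Map a square id to (row, col); row 0 is the bottom row, col 0 the leftmost."""
    if cell_id < first_id:
        raise ValueError("cell_id is below the first id")
    # divmod splits the zero-based offset into full rows and the remainder.
    return divmod(cell_id - first_id, n_cols)


print(cell_to_row_col(1))    # bottom-left square
print(cell_to_row_col(101))  # first square of the second row from the bottom
```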