Talk given at Location Based Services conference 2019 in Vienna. Paper available at https://lbsconference.org/wp-content/uploads/2019/11/5_1.pdf
Abstract: The increasing use of geosocial media in research to draw quantitative and qualitative conclusions about urban environments raises questions about the consistency of the data across platforms. This paper therefore presents a comparative analysis of data from six geosocial media platforms (Facebook, Twitter, Google, Foursquare, Flickr, and Instagram) for Washington, D.C., using population and zoning data for reference. We find that there is little consistency between the platforms at small spatial units, and that even semantically rich datasets have severe limitations when predicting functional zones in a city. The results show that researchers need to carefully evaluate which platform they can use for a particular study, and that more work is needed to better understand the differences between platforms.
Consistency across geosocial media platforms
1. Consistency Across Geosocial Media Platforms
Carsten Keßler (1) & Grant D. McKenzie (2)
(1) Department of Planning, Aalborg University, Copenhagen, Denmark
(2) Department of Geography, McGill University, Montreal, Canada
2. Motivation
Geosocial media data are used to study urban structure and functional regions,
to generate gazetteers, to map population, to study population mobility, and
to detect events [...]
3. Motivation
But do datasets from different platforms actually tell a similar story?
4. Study area
All data within the boundary of
Washington, D.C.
Map by Peter Fitzgerald, CC BY 3.0
12. Can we predict the zone group from the POIs present?
- Experiment with 24,428 Foursquare POIs
- 10 top-level and 449 second-level categories*
- Only 126 zones actually have POIs in them
- Mean: 31 POIs per zone
* https://developer.foursquare.com/docs/resources/categories
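The figures on this slide imply a per-zone table of POI categories. A minimal sketch of how such a table could be built, assuming each POI has already been assigned to a zone. The zone IDs and sample records are made up for illustration, and relative shares are used here although the slides do not say whether raw counts or shares were used:

```python
from collections import Counter, defaultdict

# Hypothetical (zone_id, top-level category) pairs. Real input would be the
# Foursquare POIs after a spatial join against the zoning polygons.
pois = [
    ("Z1", "Food"),
    ("Z1", "Food"),
    ("Z1", "Shop & Service"),
    ("Z2", "Residence"),
    ("Z2", "Food"),
    ("Z3", "Outdoors & Recreation"),
]


def category_frequencies(pois):
    """Return {zone_id: {category: share of that zone's POIs}}."""
    counts = defaultdict(Counter)
    for zone_id, category in pois:
        counts[zone_id][category] += 1
    freqs = {}
    for zone_id, counter in counts.items():
        total = sum(counter.values())
        freqs[zone_id] = {cat: n / total for cat, n in counter.items()}
    return freqs


def to_matrix(freqs):
    """Flatten the nested dict into (zone_ids, categories, rows), zero-filling gaps."""
    zone_ids = sorted(freqs)
    categories = sorted({cat for row in freqs.values() for cat in row})
    rows = [[freqs[z].get(cat, 0.0) for cat in categories] for z in zone_ids]
    return zone_ids, categories, rows


freqs = category_frequencies(pois)
zone_ids, categories, rows = to_matrix(freqs)

# Mean POIs per zone, computed only over zones that contain at least one POI.
mean_pois_per_zone = len(pois) / len(zone_ids)
```

Each row of `rows` is one zone and each column one category, which is the shape a classifier would expect as input.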
14. Random forest classifier
Trained on frequencies of Foursquare POI types to predict zone group
Results
Out-of-bag estimate of error rate: 38.1% (first-level categories)
and 36.5% (second-level categories).
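A minimal sketch of this kind of setup, assuming scikit-learn (the slides do not name the software used) and entirely synthetic data. The zone-group labels, the feature values, and the hyperparameters are placeholders, so the resulting error rate says nothing about the study's figures:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in: 126 zones x 10 top-level category shares.
# None of this is the study's data.
n_zones, n_categories = 126, 10
X = rng.dirichlet(np.ones(n_categories), size=n_zones)

# Made-up zone groups, loosely tied to the first three features so that the
# classifier has some signal to learn from.
groups = np.array(["residential", "commercial", "mixed-use"])
y = groups[np.argmax(X[:, :3], axis=1)]

# oob_score=True scores each zone using only the trees whose bootstrap
# sample did not include that zone.
forest = RandomForestClassifier(n_estimators=500, oob_score=True, random_state=0)
forest.fit(X, y)

# oob_score_ is an accuracy, so the out-of-bag error rate is its complement.
oob_error = 1.0 - forest.oob_score_
print(f"Out-of-bag error rate: {oob_error:.1%}")
```

The out-of-bag estimate needs no separate test split, which helps when only 126 zones carry any POIs.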
17. Conclusions
- Consistency between platforms is limited
- We should be skeptical about insights derived from geosocial media if they are based on only a single platform
- Even rich semantic annotations are of limited use when studying city structure
- But: it is also hard to say what is really going on in reality
18. Future work needs to...
- investigate which data source can be used for which kinds of inferences
- study the differences between data sources (user groups, data collection mechanisms, business models etc.)
- study robustness when deriving new insights (without ground truth data) from geosocial media
Carsten Keßler – @carstenkessler
Department of Planning, Aalborg University, Copenhagen, Denmark
Grant D. McKenzie – @grantdmckenzie
Department of Geography, McGill University, Montreal, Canada