HUMAN COMPUTATION
FOR VGI MANAGEMENT
Irene Celino – irene.celino@cefriel.com
Cefriel, Viale Sarca 226, 20126 Milano
Workshop on Volunteered Geographic Information – Milano, April 16th, 2018
AGENDA
1. Introduction
2. Human Computation and Games with a Purpose
3. GWAP examples for VGI Management
4. Indirect people involvement
5. Future perspectives
copyright © 2018 Cefriel – All rights reserved
from ideation to business value
1. INTRODUCTION
Relation between VGI and Human Computation
VGI AND HUMAN COMPUTATION
• VGI is carried out by volunteers, so by definition it implies human intervention
• Still, VGI suffers from all the issues related to that:
• Varying participation → impact on sustainability (long tail effect)
• Reliability of volunteers → impact on information quality
• Uneven distribution of contributions → impact on coverage
• Human Computation is an approach that can bring benefits to VGI…
• …and VGI can reveal more than you could expect!
WISDOM OF CROWDS
• “Why the Many Are Smarter Than the Few and
How Collective Wisdom Shapes Business, Economies, Societies and Nations”
• Criteria for a wise crowd
• Diversity of opinion (importance of interpretation)
• Independence (not a “single mind”)
• Decentralization (importance of local knowledge)
• Aggregation (aim to get a collective decision)
• There are also failures/risks in crowd decisions:
• Homogeneity, centralization, division, imitation, emotionality
James Surowiecki. The Wisdom of Crowds. Anchor, 2005
2. HUMAN COMPUTATION & GAMES WITH A PURPOSE
What is Human Computation? What goals can humans help machines to achieve? How to involve a crowd of people?
What extrinsic rewards (money, prizes, etc.) or intrinsic incentives can we adopt to motivate people?
HUMAN COMPUTATION
• Human Computation is a computer science technique in which a computational process
is performed by outsourcing certain steps to humans. Unlike traditional computation,
in which a human delegates a task to a computer, in Human Computation the computer
asks a person or a large group of people to solve a problem; then it collects, interprets
and integrates their solutions
• The original concept of Human Computation by its inventor Luis von Ahn derived from the
common sense observation that people are intrinsically very good at solving some
kinds of tasks which are, on the other hand, very hard to address for a computer;
this is the case of a number of targets of Artificial Intelligence (like image recognition or
natural language understanding) for which research is still open
Edith Law and Luis von Ahn. Human computation.
Synthesis Lectures on Artificial Intelligence and Machine Learning, 2011
HUMAN COMPUTATION
Problem: an Artificial Intelligence algorithm is unable to achieve an adequate result with a satisfactory level of confidence
Solution: ask people to intervene when the AI system fails, “masking” the task within another human process
Example: https://www.google.com/recaptcha/
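The problem/solution pair above can be sketched as a confidence-based fallback. This is only an illustration under stated assumptions: the function names, the toy model, the toy human and the 0.8 confidence cut-off are all hypothetical and do not come from the talk.

```python
from typing import Callable, Tuple


def classify_with_fallback(
    item: str,
    model: Callable[[str], Tuple[str, float]],
    ask_human: Callable[[str], str],
    min_confidence: float = 0.8,  # assumed cut-off, not from the source
) -> Tuple[str, str]:
    """Return (label, source): trust the model when it is confident, else ask a person."""
    label, confidence = model(item)
    if confidence >= min_confidence:
        return label, "machine"
    return ask_human(item), "human"


# Hypothetical stand-ins for an AI classifier and a human contributor.
def toy_model(item: str) -> Tuple[str, float]:
    return ("cat", 0.95) if item == "clear.jpg" else ("cat", 0.40)


def toy_human(item: str) -> str:
    return "dog"


print(classify_with_fallback("clear.jpg", toy_model, toy_human))   # ('cat', 'machine')
print(classify_with_fallback("blurry.jpg", toy_model, toy_human))  # ('dog', 'human')
```

In a real system the human answer would arrive asynchronously and would typically be aggregated over several contributors before being trusted.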
WHY HUMAN COMPUTATION FOR VGI?
• Collection of new data – as a complement to VGI itself, exploiting redundancy of multiple
contributions
• Validation of collected data or automatic processing – as “third party” to solve
discrepancies
• Completion of data, to fill in “missing pieces”
• Identification of mistakes/outdated information and respective “correction”
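A minimal sketch of how redundancy of multiple contributions might be exploited: several independent answers for the same item are aggregated by majority vote, and the item is left undecided when the evidence is too thin. The function name and both thresholds (`min_votes`, `min_share`) are assumptions made for illustration.

```python
from collections import Counter
from typing import List, Optional


def aggregate(answers: List[str], min_votes: int = 3, min_share: float = 0.6) -> Optional[str]:
    """Return the majority answer, or None if there are too few or too scattered votes."""
    if len(answers) < min_votes:
        return None
    label, count = Counter(answers).most_common(1)[0]
    return label if count / len(answers) >= min_share else None


print(aggregate(["pizza", "pizza", "sushi"]))   # pizza
print(aggregate(["pizza", "sushi"]))            # None (not enough votes)
print(aggregate(["pizza", "sushi", "kebab"]))   # None (no clear majority)
```

Items that come back as None would simply stay in the pool and be shown to more contributors.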
GAMES WITH A PURPOSE
• A GWAP makes it possible to outsource some steps of a computational process to humans in an
entertaining way
• The application has a “collateral effect”, because players’ actions are exploited to
solve a hidden task
• The application *IS* a fully-fledged game (as opposed to gamification, which is the use
of game-like features in non-gaming environments)
• The players are (usually) unaware of the hidden purpose; they simply meet game
challenges
Luis von Ahn. Games with a purpose. Computer, 39(6):92–94, 2006
Luis von Ahn and Laura Dabbish. Designing games with a purpose. Communications of the ACM, 51(8):58–67, 2008
GAMES WITH A PURPOSE (GWAP)
Problem: the same as in Human Computation (ask humans when AI fails)
Solution: hide the task within a game, so that users are motivated by game challenges, often remaining unaware of the hidden purpose; the task solution comes from agreement between players
SOME “VARIATIONS” OF HUMAN COMPUTATION
• Other terms have been used to indicate approaches and methods that are similar to
Human Computation and sometimes mistaken for it
• While there is of course quite a large overlap, it is useful to distinguish them
• Crowdsourcing
• Citizen Science
CROWDSOURCING
• Crowdsourcing is the process of outsourcing tasks to a “crowd” of distributed people.
The possibility of exploiting the Internet as a vehicle to recruit contributors and to assign
tasks led to the rise of micro-work platforms, thus often (but not always) implying a
monetary reward. The term Crowdsourcing, although quite recent, is used to indicate a
wide range of practices; however, the most common meaning of Crowdsourcing implies
that the “crowd” of workers involved in the solution of tasks is different from the traditional
or intended groups of task solvers
Jeff Howe. Crowdsourcing: How the power of the crowd
is driving the future of business. Random House, 2008
CROWDSOURCING
Problem: a company needs to execute a lot of simple tasks, but cannot afford to hire a person to do that job
Solution: pack tasks in batches (human intelligence tasks or HITs) and outsource them to a very cheap workforce through an online platform
Example: https://www.mturk.com/
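Packing tasks into HITs can be sketched as plain batching. The helper below is hypothetical and platform-independent: it does not call any micro-work platform API, and the batch size of 3 is an arbitrary choice for the example.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def pack_into_hits(tasks: Sequence[T], hit_size: int) -> Iterator[List[T]]:
    """Yield consecutive batches of at most `hit_size` tasks; the last batch may be smaller."""
    if hit_size < 1:
        raise ValueError("hit_size must be at least 1")
    for start in range(0, len(tasks), hit_size):
        yield list(tasks[start:start + hit_size])


hits = list(pack_into_hits([f"image_{i}" for i in range(7)], hit_size=3))
for number, hit in enumerate(hits, start=1):
    print(number, hit)
```

Each batch would then be published as one unit of paid work, usually to several workers so that their answers can be compared.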
CITIZEN SCIENCE
• Citizen Science is the involvement of volunteers to collect or process data as part of
a scientific or research experiment; those volunteers can be the scientists and
researchers themselves, but more often the name of this discipline “implies a form of
science developed and enacted by citizens” including those “outside of formal scientific
institutions”, thus representing a form of public participation in science. Formally, Citizen
Science has been defined as “the systematic collection and analysis of data; development
of technology; testing of natural phenomena; and the dissemination of these activities by
researchers on a primarily avocational basis”.
Alan Irwin. Citizen science: A study of people, expertise
and sustainable development. Psychology Press, 1995
CITIZEN SCIENCE
Problem: a scientific experiment requires the execution of a lot of simple tasks, but researchers are busy
Solution: engage the general public in solving those tasks, explaining that they are contributing to science, research and the public good
Example: https://www.zooniverse.org/
SPOT THE DIFFERENCE…
• Similarities:
• Involvement of people
• Aggregation of multiple contributions
• No automatic replacement
• Variations:
• Motivation
• Reward (glory, money, passion/need)
• Hybrids or parallel!
Figure labels: Citizen Science, Crowdsourcing, Human Computation
3. GWAP EXAMPLES FOR VGI MANAGEMENT
Can we embed VGI management tasks within Games with a Purpose?
3 EXAMPLES OF GAMES WITH A PURPOSE FOR VGI
• Collection of missing data: GWAP enabler for OSM Restaurants
• Validation of automatically collected information: LCV game
• Collection, validation and correction of data: Urbanopoly
GWAP ENABLER TUTORIAL FOR OSM RESTAURANTS
• Restaurant POIs (amenity=restaurant) from OSM may lack the cuisine type (cuisine key)
• Input: OSM restaurants in a given area with/without the cuisine tag (those with the tag are used for assessing player reliability)
• Goal: assign a score 𝜎 to each restaurant-cuisine pair to discover the “right” category
• The score 𝜎 of each pair is updated on the basis of players’ choices (incremented if the link is selected)
• When the score reaches the threshold (𝜎 ≥ 𝑡), the restaurant’s category is considered “true” (and removed from the game)
Pure GWAP with double-player game mechanics
Points, badges, leaderboard as intrinsic reward
A player scores if he/she chooses the same cuisine as his/her gameplay “mate”
Data validation is a result of the “agreement” between players
https://github.com/STARS4ALL/gwap-enabler-tutorial
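The scoring rule for restaurant-cuisine pairs could look roughly like the sketch below. It is not the code of the GWAP enabler: the class and function names are hypothetical, the threshold value is arbitrary, and weighting each increment by the player's reliability is an assumption (the slide only says that the score is incremented when a link is selected and that tagged restaurants are used to assess reliability).

```python
from collections import defaultdict
from typing import Dict, Optional, Tuple


def player_reliability(answers_on_tagged: Dict[str, str], known_cuisines: Dict[str, str]) -> float:
    """Share of a player's answers on already-tagged restaurants that match the OSM cuisine tag."""
    if not answers_on_tagged:
        return 0.0
    hits = sum(1 for rid, cuisine in answers_on_tagged.items() if known_cuisines.get(rid) == cuisine)
    return hits / len(answers_on_tagged)


class CuisineScoreboard:
    def __init__(self, threshold: float) -> None:
        self.threshold = threshold
        self.scores: Dict[Tuple[str, str], float] = defaultdict(float)  # (restaurant, cuisine) -> sigma
        self.solved: Dict[str, str] = {}  # restaurants removed from the game

    def record_choice(self, restaurant_id: str, cuisine: str, reliability: float = 1.0) -> Optional[str]:
        """Increment sigma for the selected pair; return the accepted cuisine once sigma >= t."""
        if restaurant_id in self.solved:
            return self.solved[restaurant_id]
        key = (restaurant_id, cuisine)
        self.scores[key] += reliability  # assumed increment: the player's reliability
        if self.scores[key] >= self.threshold:
            self.solved[restaurant_id] = cuisine
        return self.solved.get(restaurant_id)


board = CuisineScoreboard(threshold=2.0)
novice = player_reliability({"r1": "italian", "r2": "japanese"}, {"r1": "italian", "r2": "chinese"})
print(board.record_choice("r9", "pizza", novice))  # None, sigma = 0.5
print(board.record_choice("r9", "pizza", 1.0))     # None, sigma = 1.5
print(board.record_choice("r9", "pizza", 1.0))     # pizza, sigma = 2.5 >= 2.0
```

Weighting by reliability means that a few trusted players can settle a restaurant quickly, while unreliable players need more agreeing votes.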
LAND COVER VALIDATION GAME
• Two automatic land cover classifications in disagreement:
• DUSAF (Lombardy Region) and GlobeLand 30 (Chinese governmental agency)
• Input: set of pixels where the two classifications “disagree”
• Goal: assign a score 𝜎 to each pixel-category pair to discover the “right” land cover class
• The score 𝜎 of each pair is updated on the basis of players’ choices (incremented if selected, decremented if not selected)
• When the score reaches the threshold (𝜎 ≥ 𝑡), the pixel’s category is considered “true” (and removed from the game)
Pure GWAP with not-so-hidden purpose (played by “experts”)
Points, badges, leaderboard as intrinsic reward
A player scores if he/she guesses one of the two disagreeing classifications
Data validation is a result of the “agreement” between players
https://youtu.be/Q0ru1hhDM9Q
http://bit.ly/foss4game
Maria Antonia Brovelli, Irene Celino, Andrea Fiano, Monia Elisa Molinari, Vijaycharan Venkatachalam.
A crowdsourcing-based game for land cover validation. Applied Geomatics, 2017
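The increment/decrement rule for pixel-category pairs can be sketched as follows. The function names, the unit step, the threshold and the class names ("forest", "cropland") are illustrative assumptions; the actual values used in the game are not given in the slides.

```python
from typing import Dict, Optional, Sequence, Tuple

Scores = Dict[Tuple[str, str], float]  # (pixel id, category) -> sigma


def update_pixel_scores(scores: Scores, pixel_id: str, candidates: Sequence[str],
                        chosen: str, step: float = 1.0) -> None:
    """Increment sigma for the chosen category and decrement it for the other candidates."""
    for category in candidates:
        delta = step if category == chosen else -step
        scores[(pixel_id, category)] = scores.get((pixel_id, category), 0.0) + delta


def accepted_category(scores: Scores, pixel_id: str, candidates: Sequence[str],
                      threshold: float) -> Optional[str]:
    """Return the first candidate whose sigma has reached the threshold, if any."""
    for category in candidates:
        if scores.get((pixel_id, category), 0.0) >= threshold:
            return category
    return None


scores: Scores = {}
candidates = ("forest", "cropland")  # the two disagreeing classifications for one pixel
for choice in ["forest", "forest", "cropland", "forest"]:
    update_pixel_scores(scores, "px42", candidates, choice)

print(scores)  # forest ends at +2.0, cropland at -2.0
print(accepted_category(scores, "px42", candidates, threshold=2.0))  # forest
```

Compared with an increment-only rule, the decrement makes a dissenting vote cancel an agreeing one, so contested pixels stay in the game longer.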
URBANOPOLY
• POI information from OSM to be collected or validated/corrected
• Input: data from OSM
• Goal:
• if data doesn’t exist, collect
• if data exists, validate
• if data is wrong, correct
• Complex game embedding “mini-games” for data collection, validation and correction
• Same score mechanisms, with score 𝜎 updated on the basis of players’ choices
• When the score reaches the threshold (𝜎 ≥ 𝑡), data is considered “true” (and can be sent back to OSM)
Monopoly-like game to win venues in the real world
Wheel of fortune and mini-games to acquire venues and become “rich” in the game
Data acquisition challenges as contributions for missing data
Data validation challenges to check pre-existing data
Result from players’ “agreement”
Irene Celino. Geospatial dataset curation through a location-based game. Semantic Web Journal, Volume 6, Number 2, IOS Press, 2015
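The collect/validate/correct goal can be read as a small dispatch rule that picks the mini-game for a POI attribute. The sketch below is an assumption about how such a rule might look, not Urbanopoly's implementation: the enum, the function and the `rejected` flag (standing for "players judged the existing value wrong") are all hypothetical.

```python
from enum import Enum
from typing import Optional


class Challenge(Enum):
    COLLECT = "collect"
    VALIDATE = "validate"
    CORRECT = "correct"


def next_challenge(value: Optional[str], rejected: bool = False) -> Challenge:
    """Pick the mini-game for one POI attribute based on the state of its data."""
    if value is None:
        return Challenge.COLLECT   # data doesn't exist
    if rejected:
        return Challenge.CORRECT   # data exists but was judged wrong
    return Challenge.VALIDATE      # data exists and still needs confirmation


print(next_challenge(None))                    # Challenge.COLLECT
print(next_challenge("pizzeria"))              # Challenge.VALIDATE
print(next_challenge("pizzeria", rejected=True))  # Challenge.CORRECT
```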
LESSONS LEARNED BY DESIGNING AND RUNNING THOSE GAMES
• Designing and developing a full game is expensive
• The simpler the game, the better its acceptance by players and its “throughput”
• Different players are motivated by different incentives
• Fun is not always enough to engage people, especially in the long term
• Data collected via games can be enough to train automatic models
Gloria Re Calegari, Gioele Nasi, Irene Celino. Human Computation vs. Machine Learning:
an Experimental Comparison for Image Classification. Human Computation Journal, 2018.
4. INDIRECT PEOPLE INVOLVEMENT
Are there indirect ways to involve humans in data processing?
HUMANS AS A SOURCE OF INFORMATION
• People are not only task executors, they are also information providers!
• Open content and cooperative knowledge
• Data explicitly provided by people like VGI can “hide” further information
• e.g., logs of wiki editing, statistical distribution of contributions
• Opportunistic sensing
• Voluntary or involuntary digital traces of human-related activities
• e.g., phone call logs, GPS traces, social media activities
FROM SPATIAL ANALYTICS TO GEO-SPATIAL “SEMANTICS”
• Spatial distribution and conglomeration of specific points of interest (POI)
from OpenStreetMap can give hints about the geographical space
• Re-engineering of spatial features through comparison between areas:
same POI type shows different distribution → evidence for different
semantics (e.g. what is a pub in Milano vs. London)
• Semantic specification of spatial neighbourhoods:
• Emerging neighbourhoods from spatial clustering of POIs (as opposed
to administrative divisions)
• Spatial version of tf-idf to compare between different areas (e.g.
central or peripheral areas in different cities) and to characterise
neighbourhoods (e.g. shopping district)
Gloria Re Calegari, Emanuela Carlino, Irene Celino, Diego Peroni. Supporting Geo-Ontology
Engineering through Spatial Data Analytics. 13th Extended Semantic Web Conference, 2016
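One way to read the "spatial version of tf-idf" is to treat each area as a document and each POI type as a term. The sketch below uses the textbook tf-idf formula under that assumption; the exact formulation in the cited paper may differ, and the area names and POI lists are made up for the example.

```python
import math
from collections import Counter
from typing import Dict, List


def spatial_tf_idf(pois_by_area: Dict[str, List[str]]) -> Dict[str, Dict[str, float]]:
    """Weight each POI type per area: frequent in the area, rare across areas scores high."""
    n_areas = len(pois_by_area)
    counts = {area: Counter(types) for area, types in pois_by_area.items()}

    # Number of areas in which each POI type appears at least once.
    area_frequency: Counter = Counter()
    for area_counts in counts.values():
        area_frequency.update(area_counts.keys())

    weights: Dict[str, Dict[str, float]] = {}
    for area, area_counts in counts.items():
        total = sum(area_counts.values())
        weights[area] = {
            poi_type: (n / total) * math.log(n_areas / area_frequency[poi_type])
            for poi_type, n in area_counts.items()
        }
    return weights


areas = {
    "centre": ["shop", "shop", "shop", "bar"],
    "suburb": ["school", "bar", "park"],
    "campus": ["school", "bar", "library"],
}
weights = spatial_tf_idf(areas)
print(max(weights["centre"], key=weights["centre"].get))  # shop
print(weights["centre"]["bar"])  # 0.0, bars appear in every area
```

Here "shop" characterises the centre because it is both frequent there and absent elsewhere, while "bar" carries no weight because every area has one.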
FROM POI INFORMATION AND PHONE CALL LOGS TO LAND USE
• General topic: exploit “low-cost” information about a geographic area as features to
train a predictive model that outputs “expensive” information about the same area
• “Inexpensive” input information:
• Geo-information about points of interest processed to characterize space
(distance from the nearest POI of type X)
• Mobile traffic data processed using different time series techniques
(smoothing, decomposition, filtering, time-windowing)
• “Expensive” output information:
• Land use characterization (usually collected through long and expensive
workflows that mix machine processing and costly human labour)
Gloria Re Calegari, Emanuela Carlino, Diego Peroni, Irene Celino. Extracting Urban Land Use from Linked Open Geospatial Data. IJGI, 2015
Gloria Re Calegari, Emanuela Carlino, Diego Peroni, Irene Celino. Filtering and Windowing Mobile Traffic Time Series for Territorial Land Use Classification. COMCOM, 2016
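The two kinds of "inexpensive" features can be sketched in a few lines: the distance from a grid cell to the nearest POI of a given type, and a smoothed mobile traffic series. All names, the POI record layout and the sample values are assumptions; a simple moving average stands in for the smoothing step, and the predictive model trained on these features is left out.

```python
import math
from typing import Dict, List, Sequence, Tuple


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def nearest_poi_distance_km(cell_center: Tuple[float, float],
                            pois: Sequence[Dict], poi_type: str) -> float:
    """Distance from a cell centre to the nearest POI of the given type (inf if none)."""
    lat, lon = cell_center
    distances = [haversine_km(lat, lon, p["lat"], p["lon"]) for p in pois if p["type"] == poi_type]
    return min(distances) if distances else math.inf


def moving_average(series: Sequence[float], window: int) -> List[float]:
    """Smooth a time series with a sliding mean over `window` consecutive samples."""
    if window < 1 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [sum(series[i:i + window]) / window for i in range(len(series) - window + 1)]


# Made-up sample data: two POIs and five traffic readings.
pois = [
    {"type": "school", "lat": 45.48, "lon": 9.19},
    {"type": "pub", "lat": 45.46, "lon": 9.18},
]
print(nearest_poi_distance_km((45.47, 9.19), pois, "school"))
print(moving_average([1, 2, 3, 4, 5], window=3))  # [2.0, 3.0, 4.0]
```

One feature vector per grid cell (distances to several POI types plus summary statistics of the smoothed series) would then be paired with the known land use of that cell to train a classifier.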
5. FUTURE PERSPECTIVES
Are we there yet?!?
FUTURE PERSPECTIVES
• VGI management is still an open issue
• Human Computation methods (and the like) can be employed to support VGI
management
• Parallel/joint adoption of different methods to get the best out of them
• Research challenges are still the same
• Collection, completion/coverage, quality, (in)homogeneity, update/sustainability, …
• Human-in-the-loop is an emerging trend and paradigm also in Machine Learning
research (e.g. active learning)
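As an illustration of the active learning mention, the sketch below shows least-confidence sampling, one common strategy for deciding which items to send to people for labelling. The function name and the probability values are hypothetical, and the talk does not say which strategy, if any, was used.

```python
from typing import Dict, List, Sequence


def least_confident(probabilities: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Return the k item ids whose top predicted class probability is lowest."""
    ranked = sorted(probabilities, key=lambda item: max(probabilities[item]))
    return ranked[:k]


# Made-up class probabilities predicted by a model for three unlabelled items.
predictions = {
    "a": [0.90, 0.10],
    "b": [0.55, 0.45],
    "c": [0.60, 0.40],
}
print(least_confident(predictions, k=2))  # ['b', 'c']
```

The selected items are the ones for which a human label is expected to help the model most; after labelling, the model is retrained and the selection is repeated.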
MILANO
viale Sarca 226,
20126,
Milano - Italy
LONDON
4th floor
57 Rathbone Place
London W1T 1JU – UK
NEW YORK
One Liberty Plaza,
165 Broadway, 23rd Floor,
New York City, New York, 10006 USA
Cefriel.com
Thanks for your attention!
Any questions?
Irene Celino
Knowledge Technologies
Digital Interaction Division
irene.celino@cefriel.com

Mais conteúdo relacionado

Semelhante a Human Computation for VGI Management

Chapter 7Evaluating and Controlling TechnologyBased.docx
Chapter 7Evaluating and Controlling TechnologyBased.docxChapter 7Evaluating and Controlling TechnologyBased.docx
Chapter 7Evaluating and Controlling TechnologyBased.docx
robertad6
 

Semelhante a Human Computation for VGI Management (20)

Human computation @ Data Semantics
Human computation @ Data SemanticsHuman computation @ Data Semantics
Human computation @ Data Semantics
 
BDVe Webinar Series - QROWD: The Human Factor in Big Data
BDVe Webinar Series - QROWD: The Human Factor in Big DataBDVe Webinar Series - QROWD: The Human Factor in Big Data
BDVe Webinar Series - QROWD: The Human Factor in Big Data
 
BDVe Webinar Series - QROWD: The Human Factor in Big Data
BDVe Webinar Series - QROWD: The Human Factor in Big DataBDVe Webinar Series - QROWD: The Human Factor in Big Data
BDVe Webinar Series - QROWD: The Human Factor in Big Data
 
Human factor in big data qrowd bdve
Human factor in big data qrowd bdveHuman factor in big data qrowd bdve
Human factor in big data qrowd bdve
 
Human Computation
Human ComputationHuman Computation
Human Computation
 
Increasing Perfomance via Gamification in a Volunteer-Based Evolutionary Comp...
Increasing Perfomance via Gamification in a Volunteer-Based Evolutionary Comp...Increasing Perfomance via Gamification in a Volunteer-Based Evolutionary Comp...
Increasing Perfomance via Gamification in a Volunteer-Based Evolutionary Comp...
 
Chapter 7Evaluating and Controlling TechnologyBased.docx
Chapter 7Evaluating and Controlling TechnologyBased.docxChapter 7Evaluating and Controlling TechnologyBased.docx
Chapter 7Evaluating and Controlling TechnologyBased.docx
 
AI Orange Belt - Session 2
AI Orange Belt - Session 2AI Orange Belt - Session 2
AI Orange Belt - Session 2
 
inte
inteinte
inte
 
Crowdsourcing: A Survey
Crowdsourcing: A SurveyCrowdsourcing: A Survey
Crowdsourcing: A Survey
 
Dashboards are Dumb Data - Why Smart Analytics Will Kill Your KPIs
Dashboards are Dumb Data - Why Smart Analytics Will Kill Your KPIsDashboards are Dumb Data - Why Smart Analytics Will Kill Your KPIs
Dashboards are Dumb Data - Why Smart Analytics Will Kill Your KPIs
 
The big data revolution in healthcare
The big data revolution in healthcareThe big data revolution in healthcare
The big data revolution in healthcare
 
Minne analytics presentation 2018 12 03 final compressed
Minne analytics presentation 2018 12 03 final   compressedMinne analytics presentation 2018 12 03 final   compressed
Minne analytics presentation 2018 12 03 final compressed
 
The Unreasonable Effectiveness of Data
The Unreasonable Effectiveness of DataThe Unreasonable Effectiveness of Data
The Unreasonable Effectiveness of Data
 
Scientific revenue unreasonable effectiveness of data
Scientific revenue unreasonable effectiveness of dataScientific revenue unreasonable effectiveness of data
Scientific revenue unreasonable effectiveness of data
 
LT-Innovate Brussels June 26, 2013 Innovation Session III: Cooperation
LT-Innovate Brussels June 26, 2013 Innovation Session III: CooperationLT-Innovate Brussels June 26, 2013 Innovation Session III: Cooperation
LT-Innovate Brussels June 26, 2013 Innovation Session III: Cooperation
 
Clipperton - AI - Deep Learning: From Hype to Maturity?
Clipperton - AI - Deep Learning: From Hype to Maturity?Clipperton - AI - Deep Learning: From Hype to Maturity?
Clipperton - AI - Deep Learning: From Hype to Maturity?
 
Minne analytics presentation 2018 12 03 final compressed
Minne analytics presentation 2018 12 03 final   compressedMinne analytics presentation 2018 12 03 final   compressed
Minne analytics presentation 2018 12 03 final compressed
 
The Purdue IronHacks
The Purdue IronHacksThe Purdue IronHacks
The Purdue IronHacks
 
Counting the World with AI Models
Counting the World with AI ModelsCounting the World with AI Models
Counting the World with AI Models
 

Mais de Irene Celino

Mais de Irene Celino (20)

Knowledge Technologies group at Cefriel
Knowledge Technologies group at CefrielKnowledge Technologies group at Cefriel
Knowledge Technologies group at Cefriel
 
Interplay of Game Incentives, Player Profiles and Task Difficulty in Games with ...
Interplay of Game Incentives, Player Profiles and Task Difficulty in Games with ...Interplay of Game Incentives, Player Profiles and Task Difficulty in Games with ...
Interplay of Game Incentives, Player Profiles and Task Difficulty in Games with ...
 
A Framework to build Games with a Purpose for Linked Data Refinement
A Framework to build Games with a Purpose  for Linked Data RefinementA Framework to build Games with a Purpose  for Linked Data Refinement
A Framework to build Games with a Purpose for Linked Data Refinement
 
Involving people in Citizen Science through game incentives: the case of the ...
Involving people in Citizen Science through game incentives: the case of the ...Involving people in Citizen Science through game incentives: the case of the ...
Involving people in Citizen Science through game incentives: the case of the ...
 
Ninja Riders: sensibilizzare i giovani a una mobilità più sicura attraverso i...
Ninja Riders: sensibilizzare i giovani a una mobilità più sicura attraverso i...Ninja Riders: sensibilizzare i giovani a una mobilità più sicura attraverso i...
Ninja Riders: sensibilizzare i giovani a una mobilità più sicura attraverso i...
 
Ninja Riders - Youth and Road Safety: Discovering Urban Mobility Behaviours
Ninja Riders - Youth and Road Safety: Discovering Urban Mobility BehavioursNinja Riders - Youth and Road Safety: Discovering Urban Mobility Behaviours
Ninja Riders - Youth and Road Safety: Discovering Urban Mobility Behaviours
 
BotDCAT-AP: An Extension of the DCAT Application Profile for Describing Datas...
BotDCAT-AP: An Extension of the DCAT Application Profile for Describing Datas...BotDCAT-AP: An Extension of the DCAT Application Profile for Describing Datas...
BotDCAT-AP: An Extension of the DCAT Application Profile for Describing Datas...
 
Give and Take in Citizen Science
Give and Take in Citizen ScienceGive and Take in Citizen Science
Give and Take in Citizen Science
 
Ninja Riders @ Human Factory Day 2017
Ninja Riders @ Human Factory Day 2017Ninja Riders @ Human Factory Day 2017
Ninja Riders @ Human Factory Day 2017
 
Night Knights: exploiting games to engage people in a citizen science campaign
Night Knights: exploiting games to engage people in a citizen science campaignNight Knights: exploiting games to engage people in a citizen science campaign
Night Knights: exploiting games to engage people in a citizen science campaign
 
STARS4ALL-CAPSSI-Workshop
STARS4ALL-CAPSSI-WorkshopSTARS4ALL-CAPSSI-Workshop
STARS4ALL-CAPSSI-Workshop
 
Towards Talkin'Piazza: Engaging Citizens through Playful Interaction with Urb...
Towards Talkin'Piazza: Engaging Citizens through Playful Interaction with Urb...Towards Talkin'Piazza: Engaging Citizens through Playful Interaction with Urb...
Towards Talkin'Piazza: Engaging Citizens through Playful Interaction with Urb...
 
SSSW 2016 Cognition Tutorial
SSSW 2016 Cognition TutorialSSSW 2016 Cognition Tutorial
SSSW 2016 Cognition Tutorial
 
Analysis of a Cultural Heritage Game with a Purpose with an Educational Incen...
Analysis of a Cultural Heritage Game with a Purpose with an Educational Incen...Analysis of a Cultural Heritage Game with a Purpose with an Educational Incen...
Analysis of a Cultural Heritage Game with a Purpose with an Educational Incen...
 
Supporting Geo-Ontology Engineering through Spatial Data Analytics
Supporting Geo-Ontology Engineering through Spatial Data AnalyticsSupporting Geo-Ontology Engineering through Spatial Data Analytics
Supporting Geo-Ontology Engineering through Spatial Data Analytics
 
Smart City Semantics - Data Analytics and Human Computation to understand the...
Smart City Semantics - Data Analytics and Human Computation to understand the...Smart City Semantics - Data Analytics and Human Computation to understand the...
Smart City Semantics - Data Analytics and Human Computation to understand the...
 
Towards a Semantic City Service Ecosystem
Towards a Semantic City Service EcosystemTowards a Semantic City Service Ecosystem
Towards a Semantic City Service Ecosystem
 
Living Land Use - Telecom Big Data Challenge - Trento ICT Days 2014
Living Land Use - Telecom Big Data Challenge - Trento ICT Days 2014Living Land Use - Telecom Big Data Challenge - Trento ICT Days 2014
Living Land Use - Telecom Big Data Challenge - Trento ICT Days 2014
 
Urbanopoly @ PlanetData review
Urbanopoly @ PlanetData reviewUrbanopoly @ PlanetData review
Urbanopoly @ PlanetData review
 
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
Urbanopoly: Collection and Quality Assessment of Geo-spatial Linked Data via ...
 

Último

EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
Earley Information Science
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 

Último (20)

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 

Human Computation for VGI Management

  • 1. HUMAN COMPUTATION FOR VGI MANAGEMENT Irene Celino – irene.celino@cefriel.com Cefriel, Viale Sarca 226, 20126 Milano Workshop on Volunteered Geographic Information – Milano, April 16th, 2018
  • 2. 1. Introduction 2. Human Computation and Games with a Purpose 3. GWAP examples for VGI Management 4. Indirect people involvement 5. Future perspectives AGENDA 2copyright © 2018 Cefriel – All rights reserved
  • 3. from ideation to business value 3 1. INTRODUCTION Relation between VGI and Human Computation copyright © 2018 Cefriel – All rights reserved
  • 4. VGI AND HUMAN COMPUTATION • VGI is carried out by volunteers, so by definition it implies human intervention • Still VGI suffers of all issues related to that: • Varying participation  impact on sustainability (long tail effect) • Reliability of volunteers  impact of information quality • Uneven distribution of contributions  impact on coverage • Human Computation is an approach that can bring benefits to VGI… • …and VGI can reveal more than you could expect! 4copyright © 2018 Cefriel – All rights reserved
  • 5. WISDOM OF CROWDS • “Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations” • Criteria for a wise crowd • Diversity of opinion (importance of interpretation) • Independence (not a “single mind”) • Decentralization (importance of local knowledge) • Aggregation (aim to get a collective decision) • The are also failures/risks in crowd decisions: • Homogeneity, centralization, division, imitation, emotionality 5copyright © 2018 Cefriel – All rights reserved James Surowiecki The wisdom of crowds Anchor, 2005
  • 6. from ideation to business value 6 2. HUMAN COMPUTATION & GAMES WITH A PURPOSE What is Human Computation? What goals can humans help machines to achieve? How to involve a crowd of persons? What extrinsic rewards (money, prizes, etc.) or intrinsic incentives can we adopt to motivate people? copyright © 2018 Cefriel – All rights reserved
  • 7. HUMAN COMPUTATION • Human Computation is a computer science technique in which a computational process is performed by outsourcing certain steps to humans. Unlike traditional computation, in which a human delegates a task to a computer, in Human Computation the computer asks a person or a large group of people to solve a problem; then it collects, interprets and integrates their solutions • The original concept of Human Computation by its inventor Luis von Ahn derived from the common sense observation that people are intrinsically very good at solving some kinds of tasks which are, on the other hand, very hard to address for a computer; this is the case of a number of targets of Artificial Intelligence (like image recognition or natural language understanding) for which research is still open 7copyright © 2018 Cefriel – All rights reserved Edith Law and Luis von Ahn. Human computation. Synthesis Lectures on Artificial Intelligence and Machine Learning, 2011
  • 8. HUMAN COMPUTATION • Problem: an Artificial Intelligence algorithm is unable to achieve an adequate result with a satisfactory level of confidence • Solution: ask people to intervene when the AI system fails, “masking” the task within another human process • Example: https://www.google.com/recaptcha/
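The problem/solution pair on this slide can be read as a simple routing rule: keep the machine's answer when it is confident enough, otherwise hand the item to a person. The following is a minimal sketch under that reading; `classify_with_fallback`, the `model` and `ask_human` callables and the `min_confidence` value are all hypothetical names chosen for illustration, not part of the talk or of reCAPTCHA.

```python
from typing import Any, Callable, Tuple


def classify_with_fallback(
    item: Any,
    model: Callable[[Any], Tuple[str, float]],
    ask_human: Callable[[Any], str],
    min_confidence: float = 0.8,  # assumed cut-off, to be tuned per task
) -> Tuple[str, str]:
    """Return (label, source), where source is "machine" or "human"."""
    label, confidence = model(item)
    if confidence >= min_confidence:
        # The automatic result is trusted as-is.
        return label, "machine"
    # The AI system "fails": delegate this item to a person.
    return ask_human(item), "human"
```

In a real deployment `ask_human` would typically enqueue the item in a game or task platform and return later; here it is synchronous only to keep the sketch short.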
  • 9. WHY HUMAN COMPUTATION FOR VGI? • Collection of new data – as a complement to VGI itself, exploiting redundancy of multiple contributions • Validation of collected data or automatic processing – as “third party” to solve discrepancies • Completion of data, to fill out “missing pieces” • Identification of mistakes/outdated information and respective “correction”
  • 10. GAMES WITH A PURPOSE • A GWAP makes it possible to outsource some steps of a computational process to humans in an entertaining way • The application has a “collateral effect”, because players’ actions are exploited to solve a hidden task • The application *IS* a fully-fledged game (as opposed to gamification, which is the use of game-like features in non-gaming environments) • The players are (usually) unaware of the hidden purpose; they simply meet game challenges • References: Luis Von Ahn. Games with a purpose. Computer, 39(6):92–94, 2006; Luis Von Ahn and Laura Dabbish. Designing games with a purpose. Communications of the ACM, 51(8):58–67, 2008
  • 11. GAMES WITH A PURPOSE (GWAP) • Problem: the same as in Human Computation (ask humans when AI fails) • Solution: hide the task within a game, so that users are motivated by game challenges and often remain unaware of the hidden purpose; the task solution comes from agreement between players
  • 12. SOME “VARIATIONS” OF HUMAN COMPUTATION • Other terms have been used to indicate approaches and methods that are similar to Human Computation and sometimes mistaken for it • While there is of course quite a large overlap, it is useful to distinguish them • Crowdsourcing • Citizen Science
  • 13. CROWDSOURCING • Crowdsourcing is the process of outsourcing tasks to a “crowd” of distributed people. The possibility of using the Internet to recruit contributors and to assign tasks led to the rise of micro-work platforms, which often (but not always) imply a monetary reward. The term Crowdsourcing, although quite recent, is used to indicate a wide range of practices; however, its most common meaning implies that the “crowd” of workers involved in solving tasks is different from the traditional or intended groups of task solvers • Reference: Jeff Howe. Crowdsourcing: How the power of the crowd is driving the future of business. Random House, 2008
  • 14. CROWDSOURCING • Problem: a company needs to execute a lot of simple tasks, but cannot afford to hire a person to do that job • Solution: pack tasks in bunches (human intelligence tasks or HITs) and outsource them to a very cheap workforce through an online platform • Example: https://www.mturk.com/
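As a rough illustration of the "pack tasks in bunches" step on this slide, the sketch below splits a list of simple tasks into fixed-size batches. The function name `pack_into_batches` and the `batch_size` parameter are assumptions made for this example; it does not call any crowdsourcing platform API.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def pack_into_batches(tasks: Sequence[T], batch_size: int) -> List[List[T]]:
    """Split tasks into consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # The last batch may be shorter when len(tasks) is not a multiple of batch_size.
    return [list(tasks[i:i + batch_size]) for i in range(0, len(tasks), batch_size)]
```

Each resulting batch could then be published as one unit of work; how batches map to paid assignments depends on the platform and is outside the scope of this sketch.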
  • 15. CITIZEN SCIENCE • Citizen Science is the involvement of volunteers to collect or process data as part of a scientific or research experiment; those volunteers can be the scientists and researchers themselves, but more often the name of this discipline “implies a form of science developed and enacted by citizens” including those “outside of formal scientific institutions”, thus representing a form of public participation in science. Formally, Citizen Science has been defined as “the systematic collection and analysis of data; development of technology; testing of natural phenomena; and the dissemination of these activities by researchers on a primarily avocational basis”. • Reference: Alan Irwin. Citizen science: A study of people, expertise and sustainable development. Psychology Press, 1995
  • 16. CITIZEN SCIENCE • Problem: a scientific experiment requires the execution of a lot of simple tasks, but researchers are busy • Solution: engage the general audience in solving those tasks, explaining that they are contributing to science, research and the public good • Example: https://www.zooniverse.org/
  • 17. SPOT THE DIFFERENCE… • Similarities: • Involvement of people • Aggregation of multiple contributions • No automatic replacement • Variations: • Motivation • Reward (glory, money, passion/need) • Hybrids or parallel! • [Figure labels: Citizen Science, Crowdsourcing, Human Computation]
  • 18. 3. GWAP EXAMPLES FOR VGI MANAGEMENT • Can we embed VGI management tasks within Games with a Purpose?
  • 19. 3 EXAMPLES OF GAMES WITH A PURPOSE FOR VGI • Collection of missing data: GWAP enabler for OSM Restaurants • Validation of automatically collected information: LCV game • Collection, validation and correction of data: Urbanopoly
  • 20. GWAP ENABLER TUTORIAL FOR OSM RESTAURANTS • Restaurant POIs (amenity=restaurant) from OSM may lack the cuisine type (cuisine key) • Input: OSM restaurants in a given area, with or without the cuisine tag (those with the tag are used to assess player reliability) • Goal: assign a score 𝜎 to each restaurant-cuisine pair to discover the “right” category • The score 𝜎 of each pair is updated on the basis of players’ choices (incremented if the link is selected) • When the score reaches the threshold (𝜎 ≥ 𝑡), the restaurant’s category is considered “true” (and the restaurant is removed from the game) • Pure GWAP with double-player game mechanics • Points, badges, leaderboard as intrinsic reward • A player scores if he/she chooses the same cuisine as his/her gameplay “mate” • Data validation is a result of the “agreement” between players • https://github.com/STARS4ALL/gwap-enabler-tutorial
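The slide states that already-tagged restaurants are used to assess player reliability and that each restaurant-cuisine score 𝜎 grows with players' choices until 𝜎 ≥ 𝑡. One possible way to combine the two is sketched below; weighting each increment by the player's reliability, the `default` reliability for players with no control answers, and the names `player_reliability` and `update_score` are assumptions of this example, not the logic of the gwap-enabler-tutorial code.

```python
from typing import Dict, Tuple


def player_reliability(
    answers: Dict[str, str],
    known_cuisines: Dict[str, str],
    default: float = 0.5,  # assumed prior for players with no control answers
) -> float:
    """Fraction of control restaurants (already tagged) the player got right."""
    checked = [r for r in answers if r in known_cuisines]
    if not checked:
        return default
    correct = sum(1 for r in checked if answers[r] == known_cuisines[r])
    return correct / len(checked)


def update_score(
    scores: Dict[Tuple[str, str], float],
    pair: Tuple[str, str],
    reliability: float,
    threshold: float,
) -> bool:
    """Increment the score of a (restaurant, cuisine) pair; True once sigma >= t."""
    scores[pair] = scores.get(pair, 0.0) + reliability
    return scores[pair] >= threshold
```

With a threshold of 1.0, two selections of the same pair by players of reliability 0.5 are enough to accept it, while a single such selection is not.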
  • 21. LAND COVER VALIDATION GAME • Two automatic land cover classifications in disagreement: DUSAF (Lombardy Region) and GlobeLand 30 (Chinese governmental agency) • Input: set of pixels where the two classifications “disagree” • Goal: assign a score 𝜎 to each pixel-category pair to discover the “right” land cover class • The score 𝜎 of each pair is updated on the basis of players’ choices (incremented if selected, decremented if not selected) • When the score reaches the threshold (𝜎 ≥ 𝑡), the pixel’s category is considered “true” (and the pixel is removed from the game) • Pure GWAP with not-so-hidden purpose (played by “experts”) • Points, badges, leaderboard as intrinsic reward • A player scores if he/she guesses one of the two disagreeing classifications • Data validation is a result of the “agreement” between players • https://youtu.be/Q0ru1hhDM9Q • http://bit.ly/foss4game • Reference: Maria Antonia Brovelli, Irene Celino, Andrea Fiano, Monia Elisa Molinari, Vijaycharan Venkatachalam. A crowdsourcing-based game for land cover validation. Applied Geomatics, 2017
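The increment/decrement rule described on this slide can be illustrated as follows. This is only a sketch: the unit `step`, the assumption that every candidate category shown in a round is decremented when not chosen, and the function names are all hypothetical and not taken from the game's implementation.

```python
from typing import Dict, Iterable, Optional


def update_pixel_scores(
    scores: Dict[str, float],
    candidates: Iterable[str],
    selected: str,
    step: float = 1.0,  # assumed size of each increment/decrement
) -> Dict[str, float]:
    """Return new scores after one player choice for a single pixel."""
    candidates = list(candidates)
    if selected not in candidates:
        raise ValueError("selected category must be one of the candidates")
    updated = dict(scores)
    for category in candidates:
        delta = step if category == selected else -step
        updated[category] = updated.get(category, 0.0) + delta
    return updated


def accepted_category(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the category whose score satisfies sigma >= t, if any."""
    winners = [c for c, s in scores.items() if s >= threshold]
    return max(winners, key=lambda c: scores[c]) if winners else None
```

Because non-selected categories lose score, a pixel on which players disagree stays in play longer than one on which they consistently agree.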
  • 22. URBANOPOLY • POI information from OSM to be collected or validated/corrected • Input: data from OSM • Goal: if data doesn’t exist, collect it; if data exists, validate it; if data is wrong, correct it • Complex game embedding “mini-games” for data collection, validation and correction • Same score mechanisms, with the score 𝜎 updated on the basis of players’ choices • When the score reaches the threshold (𝜎 ≥ 𝑡), data is considered “true” (and can be sent back to OSM) • Monopoly-like game to win venues in the real world • Wheel of fortune and mini-games to acquire venues and become “rich” in the game • Data acquisition challenges as contributions for missing data • Data validation challenges to check pre-existing data • Results come from players’ “agreement” • Reference: Irene Celino. Geospatial dataset curation through a location-based game. Semantic Web Journal, Volume 6, Number 2, IOS Press, 2015
  • 23. LESSONS LEARNED BY DESIGNING AND RUNNING THOSE GAMES • Designing and developing a full game is expensive • The simpler the game, the better its acceptance by players and its “throughput” • Different players are motivated by different incentives • Fun is not always enough to engage people, especially in the long term • Data collected via games can be enough to train automatic models • Reference: Gloria Re Calegari, Gioele Nasi, Irene Celino. Human Computation vs. Machine Learning: an Experimental Comparison for Image Classification. Human Computation Journal, 2018.
  • 24. 4. INDIRECT PEOPLE INVOLVEMENT • Are there indirect ways to involve humans in data processing?
  • 25. HUMANS AS A SOURCE OF INFORMATION • People are not only task executors, they are also information providers! • Open content and cooperative knowledge • Data explicitly provided by people, like VGI, can “hide” further information • e.g., logs of wiki editing, statistical distribution of contributions • Opportunistic sensing • Voluntary or involuntary digital traces of human-related activities • e.g., phone call logs, GPS traces, social media activities
  • 26. FROM SPATIAL ANALYTICS TO GEO-SPATIAL “SEMANTICS” • Spatial distribution and conglomeration of specific points of interest (POI) from OpenStreetMap can give hints about the geographical space • Re-engineering of spatial features through comparison between areas: the same POI type shows a different distribution → evidence for different semantics (e.g. what is a pub in Milano vs. London) • Semantic specification of spatial neighbourhoods: • Emerging neighbourhoods from spatial clustering of POIs (as opposed to administrative divisions) • Spatial version of tf-idf to compare different areas (e.g. central or peripheral areas in different cities) and to characterise neighbourhoods (e.g. shopping district) • Reference: Gloria Re Calegari, Emanuela Carlino, Irene Celino, Diego Peroni. Supporting Geo-Ontology Engineering through Spatial Data Analytics. 13th Extended Semantic Web Conference, 2016
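To make the "spatial version of tf-idf" more concrete, the sketch below treats each area as a document and each POI type as a term, using the textbook tf-idf weighting. This is an assumed formulation for illustration; the exact weighting used in the cited paper may differ, and `spatial_tf_idf` and its input layout are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def spatial_tf_idf(pois_by_area: Dict[str, List[str]]) -> Dict[str, Dict[str, float]]:
    """Weight each POI type per area: frequent locally, rare across areas."""
    n_areas = len(pois_by_area)
    counts = {area: Counter(types) for area, types in pois_by_area.items()}

    # Number of areas in which each POI type appears at least once.
    areas_with_type: Counter = Counter()
    for area_counts in counts.values():
        areas_with_type.update(area_counts.keys())

    weights: Dict[str, Dict[str, float]] = {}
    for area, area_counts in counts.items():
        total = sum(area_counts.values())
        weights[area] = {
            poi_type: (n / total) * math.log(n_areas / areas_with_type[poi_type])
            for poi_type, n in area_counts.items()
        }
    return weights
```

A POI type that occurs in every area gets weight zero, so the highest-weighted types in an area are those that distinguish it from the others (for example, shops in a shopping district).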
  • 27. FROM POI INFORMATION AND PHONE CALL LOGS TO LAND USE • General topic: exploit “low-cost” information about a geographic area as features to train a predictive model that outputs “expensive” information about the same area • “Inexpensive” input information: • Geo-information about points of interest processed to characterize space (distance from the nearest POI of type X) • Mobile traffic data processed using different time series techniques (smoothing, decomposition, filtering, time-windowing) • “Expensive” output information: • Land use characterization (usually collected through long and expensive workflows that mix machine processing and costly human labour) • References: Gloria Re Calegari, Emanuela Carlino, Diego Peroni, Irene Celino. Extracting Urban Land Use from Linked Open Geospatial Data. IJGI, 2015; Gloria Re Calegari, Emanuela Carlino, Diego Peroni, Irene Celino. Filtering and Windowing Mobile Traffic Time Series for Territorial Land Use Classification. COMCOM, 2016
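The two kinds of “inexpensive” features named on this slide can be sketched as below: the distance from a location to the nearest POI of a given type, and a smoothed version of a traffic time series. The haversine great-circle distance, the trailing moving average and all function names are choices made for this example; the cited papers may compute these features differently.

```python
import math
from typing import List, Sequence, Tuple

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two WGS84 points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_poi_distance_km(
    lat: float, lon: float, pois: Sequence[Tuple[str, float, float]], poi_type: str
) -> float:
    """Distance to the closest POI of poi_type; math.inf if there is none."""
    distances = [haversine_km(lat, lon, p_lat, p_lon)
                 for kind, p_lat, p_lon in pois if kind == poi_type]
    return min(distances, default=math.inf)


def moving_average(series: Sequence[float], window: int) -> List[float]:
    """Trailing moving average; early values average over the samples seen so far."""
    if window <= 0:
        raise ValueError("window must be positive")
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

One feature vector per grid cell could then collect a distance for each POI type of interest plus summary statistics of the smoothed traffic series, with the land use class as the prediction target.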
  • 28. 5. FUTURE PERSPECTIVES • Are we there yet?!?
  • 29. FUTURE PERSPECTIVES • VGI management is still an open issue • Human Computation methods (and the like) can be employed to support VGI management • Parallel/joint adoption of different methods to get the best out of them • Research challenges are still the same • Collection, completion/coverage, quality, (in)homogeneity, update/sustainability, … • Human-in-the-loop is an emerging trend and paradigm also in Machine Learning research (e.g. active learning)
  • 30. Thanks for your attention! Any questions? • Irene Celino, Knowledge Technologies, Digital Interaction Division, irene.celino@cefriel.com • Cefriel.com • MILANO: viale Sarca 226, 20126, Milano - Italy • LONDON: 4th floor, 57 Rathbone Place, London W1T 1JU – UK • NEW YORK: One Liberty Plaza, 165 Broadway, 23rd Floor, New York City, New York, 10006 USA