
One does not simply crowdsource the Semantic Web

Invited talk at the University of Sheffield


  1. ONE DOES NOT SIMPLY CROWDSOURCE THE SEMANTIC WEB: TECHNOLOGY DESIGN AND INCENTIVES. Elena Simperl, e.simperl@soton.ac.uk, @esimperl. January 26th, 2016
  2. CROWDSOURCING: PROBLEM SOLVING VIA OPEN CALLS. “Crowdsourcing represents the act of a company or institution taking a function once performed by employees and outsourcing it to an undefined (and generally large) network of people in the form of an open call.” [Howe, 2006]
  3. THE SEMANTIC WEB: WEB OF DATA THAT CAN BE PROCESSED BY MACHINES. “The Semantic Web provides a common framework that allows data to be shared and reused across application, enterprise, and community boundaries.” [W3C, 2011]
  4. MAKING THE SEMANTIC WEB HUMANLY POSSIBLE. Crowdsourcing is increasingly used to help algorithms solve Semantic Web problems. Great challenges:
       • How to run a crowdsourcing project effectively?
       • Which form of crowdsourcing for which task?
       • How to combine crowd and machine intelligence?
       • How to encourage participation?
  5. DESIGNING CROWDSOURCING PROJECTS
  6. DIFFERENT FORMS AND PLATFORMS TO CHOOSE FROM: macrotasks, microtasks, challenges, self-organized crowds, crowdfunding. Source: [Prpić et al., 2015]
  7. MANY QUESTIONS TO ANSWER: task design; workflow design and execution; task interfaces; quality assurance; task assignment; crowd training and feedback; incentives engineering; collaboration, competition, self-organization; real-time delivery; nichesourcing; extensions to technologies; social machines engineering
  8. SOME ANSWERS
  9. IMPROVING PAID MICROTASKS @WWW15. Compared the effectiveness of microtasks on CrowdFlower vs a self-developed game.
       • Image labelling on the ESP data set as gold standard
       • Evaluated accuracy, #labels, cost per label, avg/max #labels/contributor
       • For three types of tasks: Nano (1 image), Micro (11 images), Small (up to 2000 images)
       • Probabilistic reasoning to personalize furtherance incentives
     Findings:
       • Gamification and payments work well together
       • Furtherance incentives particularly interesting
  10. HYBRID NER ON TWITTER @ESWC15. Identified content and crowd factors that impact effectiveness.
     Findings:
       • Shorter tweets with fewer entities work better
       • Crowd is more familiar with people and places from recent news
       • MISC as a NER category is sometimes confusing, but useful to identify partial and implicitly named entities
     Factors listed on the slide: #entities in post, types of entities, content sentiment, skipped TP posts, avg. time/task, UI interaction
  11. CROWD-EMPOWERED SPARQL QUERIES @KCAP2015. A hybrid machine/human SPARQL query engine that enhances query answers.
       • Uses a novel RDF completeness model to identify portions of a query with missing values
       • Resorts to microtask crowdsourcing to resolve the missing values
       • Evaluated # of answers, delivery time, and accuracy
       • 50 queries against DBpedia in five domains: History, Life Sciences, Movies, Music, and Sports
     Findings:
       • Size of query answer set increased on avg. 3.13 times
       • 12 minutes to get 98% of all answers
       • Accuracy between 84% and 96%
  12. OPEN QUESTIONS
  13. NOT CROWDSOURCING AS USUAL
       • Knowledge-intensive tasks
       • Structured, interlinked content
       • Content meant for machine consumption
       • Scale, shape, and quality of the data
       • Context is critical
       • Open-set answers
  14. FUNDAMENTAL CHALLENGES
       • SCALE: No ‘Big Crowd’
       • TIME: From one-off and short-term to mid and long-term
       • SCOPE: Problems technology cannot solve
  15. PATHWAYS TO SOLUTIONS
       • SCALE: Aligning incentives; Better reuse of crowd outputs
       • TIME: Sustaining engagement; Building relationships; Better integration
       • SCOPE: New problems and problem solving paradigms; Novel human-
  16. THANKS. e.simperl@soton.ac.uk, @esimperl
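
The evaluation measures named on slide 9 (accuracy, #labels, cost per label, avg/max #labels per contributor) can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the study: the `Label` record, the `evaluate` function, and the rule that a label counts as correct when it appears in the gold-standard tag set for its image are all assumptions introduced here.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """One crowd contribution (hypothetical record layout)."""
    contributor: str
    image: str
    tag: str


def evaluate(labels, gold, total_cost):
    """Compute the slide-9 style measures for a batch of labels.

    labels: list of Label
    gold: dict mapping image id -> set of accepted tags (assumed gold standard)
    total_cost: total amount paid for the batch
    """
    if not labels:
        raise ValueError("no labels to evaluate")
    # Assumption: a label is correct if the gold standard lists it for that image.
    correct = sum(1 for lab in labels if lab.tag in gold.get(lab.image, set()))
    per_contributor = Counter(lab.contributor for lab in labels)
    n = len(labels)
    return {
        "accuracy": correct / n,
        "num_labels": n,
        "cost_per_label": total_cost / n,
        "avg_labels_per_contributor": n / len(per_contributor),
        "max_labels_per_contributor": max(per_contributor.values()),
    }
```

Calling `evaluate` once per task type (nano, micro, small) would give one row of measures per condition, which is one plausible way to compare a paid platform against a game.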

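The loop described on slide 11 (find the parts of a query answer with missing values, then resolve them through microtasks) might look roughly like the following. The talk's RDF completeness model is not reproduced here; as a stand-in, this sketch treats any unbound variable in a result row as missing. `find_missing`, `majority`, `fill_with_crowd`, and the `ask_crowd` callback are hypothetical names, result rows are plain dictionaries rather than real SPARQL bindings, and majority voting is an assumed aggregation rule.

```python
from collections import Counter


def find_missing(rows, variables):
    """Return (row_index, variable) pairs where a row has no binding.

    Stand-in for a completeness model: unbound or None means missing.
    """
    return [
        (i, var)
        for i, row in enumerate(rows)
        for var in variables
        if row.get(var) is None
    ]


def majority(answers, min_agreement=0.5):
    """Return the most common answer if its share exceeds min_agreement."""
    if not answers:
        return None
    value, count = Counter(answers).most_common(1)[0]
    return value if count / len(answers) > min_agreement else None


def fill_with_crowd(rows, variables, ask_crowd):
    """Fill missing bindings using crowd answers, leaving the input untouched.

    ask_crowd(row, variable) is a hypothetical callback that posts a
    microtask and returns the list of worker answers.
    """
    filled = [dict(row) for row in rows]
    for i, var in find_missing(rows, variables):
        answer = majority(ask_crowd(filled[i], var))
        if answer is not None:  # keep the gap when workers disagree
            filled[i][var] = answer
    return filled
```

Leaving a value unfilled when workers disagree trades answer-set size for accuracy; a real engine would need to tune that threshold against both of the measures reported on the slide.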