WebSci2013 Harnessing Disagreement in Crowdsourcing

Lora Aroyo

Chris Welty Crowd Truth for Cognitive Computing Lora Aroyo
Gold Standard Assumption
• typically in cognitive systems
• for each annotated instance there is a single right answer
• gold standard quality can be measured by inter-annotator agreement
Let them disagree?

Hypothesis
Annotator disagreement is not noise, but signal.
Not a problem to overcome but a source of information for machines
Artificially restricting humans does not help machines to learn.
They will learn better from diversity

Position
disagreement is a sign of
intrinsic vagueness & ambiguity in human understanding

Approach Principles
1. Tolerate, capture & exploit disagreement
2. Understand it by a space of possibilities (frequencies & similarities)
3. Score the machine output based on where it falls in this space
4. Adapt to new annotation tasks

Relation Extraction
crowdsourcing gold standard data
Relations overlap in meaning
Sentences are vague and ambiguous
Experts have different interpretations

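Sentence: "Feeling the way the CHEST expands (PALPATION), can identify areas of the lung that are full of fluid."
Question: Is PALPATION related to CHEST?
Relation labels shown: diagnose, location, associated with, is_a, part_of, other
[Figure: sentence vector of worker votes per relation: 0 0 02 3 0 0 0 1 0 0 44 1]

Sentence: "Redness (HYPERAEMIA), irritation (chemosis) and watering (epiphora) of the eyes are symptoms common to all forms of CONJUNCTIVITIS."
Question: Is HYPERAEMIA related to CONJUNCTIVITIS?
Relation labels shown: symptom, cause
[Figure: sentence vector of worker votes per relation: 0 0 0 1 0 0 0 013 0 0 0 0 0]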

Harnessing Disagreement
• Sentence-relation score: the core crowd truth metric for relation extraction, measured for each relation on each sentence as the cosine between the unit vector for the relation and the sentence vector
• Sentence clarity: for each sentence, the max sentence-relation score for that sentence. If all the workers selected the same relation for a sentence, the max score is 1, indicating a clear sentence
• Relation similarity: the pairwise conditional probability that if relation Ri is annotated in a sentence, Rj is as well. Indicates how confusable the linguistic expressions of two relations are
• Relation ambiguity: the max relation similarity for a relation. A clear relation has a low score
• Relation clarity: the max sentence-relation score for a relation over all sentences. A high clarity score means that it is at least possible to express the relation clearly
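
The metrics above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the three relation names are borrowed from the example slides, the vote counts in `SENTENCE_VECTORS` are made up, and relation similarity is estimated here at the sentence level (a relation counts as "annotated" when at least one worker chose it), which is an assumption the slides do not spell out.

```python
import math

# Hypothetical data: aggregated worker votes per relation for three sentences.
RELATIONS = ["cause", "symptom", "location"]
SENTENCE_VECTORS = {
    "s1": [3, 0, 0],
    "s2": [3, 4, 0],
    "s3": [0, 0, 2],
}

def sentence_relation_score(vec, r):
    """Cosine between the unit vector for relation r and the sentence vector."""
    norm = math.sqrt(sum(v * v for v in vec))
    return vec[r] / norm if norm else 0.0

def sentence_clarity(vec):
    """Max sentence-relation score for one sentence."""
    return max(sentence_relation_score(vec, r) for r in range(len(vec)))

def relation_similarity(vectors, i, j):
    """P(Rj annotated | Ri annotated), counted over sentences (assumption)."""
    with_i = [v for v in vectors.values() if v[i] > 0]
    if not with_i:
        return 0.0
    return sum(1 for v in with_i if v[j] > 0) / len(with_i)

def relation_ambiguity(vectors, i):
    """Max similarity of relation i to any other relation."""
    n = len(RELATIONS)
    return max(relation_similarity(vectors, i, j) for j in range(n) if j != i)

def relation_clarity(vectors, r):
    """Max sentence-relation score for relation r over all sentences."""
    return max(sentence_relation_score(v, r) for v in vectors.values())

if __name__ == "__main__":
    for sid, vec in SENTENCE_VECTORS.items():
        print(sid, "clarity:", round(sentence_clarity(vec), 3))
    for r, name in enumerate(RELATIONS):
        print(name,
              "ambiguity:", round(relation_ambiguity(SENTENCE_VECTORS, r), 3),
              "clarity:", round(relation_clarity(SENTENCE_VECTORS, r), 3))
```

In this toy data, sentence `s2` splits its votes between two relations, so its clarity falls below 1, while `s1` and `s3` each have a single selected relation and score exactly 1.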

The Dark Side of Crowdsourcing
Disagreement
• spammers generate disagreement for the wrong reasons
• most spam detection requires gold standard
• Worker-sentence disagreement: the average of all the cosines between each worker’s sentence vector and the full sentence vector (minus that worker). Indicates how much a worker disagrees with the crowd on a per-sentence basis
• Worker-worker disagreement: a pairwise confusion matrix between workers and the average agreement across the matrix for each worker. Indicates whether there are consistently like-minded workers
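
A rough sketch of the two worker metrics follows, under stated assumptions: the `annotations` data is invented, each worker's annotation is a 0/1 vector over three hypothetical relations, and pairwise worker agreement is measured as the mean cosine over shared sentences, which is one plausible reading of the "confusion matrix" on the slide rather than the authors' exact procedure.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

# Hypothetical data: annotations[sentence][worker] = 0/1 vector over relations.
annotations = {
    "s1": {"w1": [1, 0, 0], "w2": [1, 0, 0], "w3": [0, 0, 1]},
    "s2": {"w1": [0, 1, 0], "w2": [0, 1, 0], "w3": [0, 0, 1]},
}

def worker_sentence_score(annotations, worker):
    """Average cosine between a worker's vector and the rest of the crowd."""
    cosines = []
    for workers in annotations.values():
        if worker not in workers:
            continue
        others = [v for w, v in workers.items() if w != worker]
        if not others:
            continue
        # Sentence vector with this worker's own votes left out.
        rest = [sum(col) for col in zip(*others)]
        cosines.append(cosine(workers[worker], rest))
    return sum(cosines) / len(cosines) if cosines else 0.0

def worker_worker_matrix(annotations):
    """Mean pairwise cosine over shared sentences (assumed agreement measure)."""
    names = sorted({w for ws in annotations.values() for w in ws})
    matrix = {}
    for a, b in combinations(names, 2):
        shared = [cosine(ws[a], ws[b])
                  for ws in annotations.values() if a in ws and b in ws]
        if shared:
            matrix[(a, b)] = matrix[(b, a)] = sum(shared) / len(shared)
    return matrix

def average_agreement(matrix, worker):
    """Average of one worker's row in the pairwise matrix."""
    row = [v for (a, _), v in matrix.items() if a == worker]
    return sum(row) / len(row) if row else 0.0

if __name__ == "__main__":
    matrix = worker_worker_matrix(annotations)
    for w in ("w1", "w2", "w3"):
        print(w, round(worker_sentence_score(annotations, w), 3),
              round(average_agreement(matrix, w), 3))
```

Because the score is an average cosine, a higher value means the worker sits closer to the crowd. In the toy data, `w3` picks the same relation regardless of the sentence and never overlaps with the other workers, so both of its scores are 0, and no gold standard is needed to see that.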

Questions?

Title slide: Crowd Truth: Harnessing Disagreement in Crowdsourcing (gathering gold standard annotations for relation extraction)
