Crowdsourcing satellite imagery: study of iterative vs. parallel models
Nicolas Maisonneuve, Bastien Chopard

Twitter: nmaisonneuve




Friday, September 21, 12                                                          1
Damage assessment after a humanitarian crisis




Port-au-Prince: 300K buildings assessed in 3 months by 8 UNOSAT experts




Organizational challenges:
How to organize untrained volunteers, especially to enforce quality?

Investigated scope:
• Qualitative and quantitative study of 2 collaborative models inspired by computer science: iterative vs. parallel information processing

• Controlled experiment to isolate quality = F(organisation), removing other parameters, e.g. training and task difficulty

• This research does not study real-world collaborative practices; it studies more extreme/symbolic cases to guide designers of collaborative systems



Tested Collaborative Models (1/2)
                                  iterative model




e.g. Wikipedia, OpenStreetMap, assembly lines
Tested Collaborative Models (2/2)
                                   parallel model




                                                 aggregation




             e.g. voting systems in society, distributed computing
Tested Collaborative Models (2/2)
                                   parallel model




Older version (17th to mid-20th century): when computers were humans, often women
(e.g. the Mathematical Tables Project, 1938-1948)
Qualitative comparison

                        Iterative                     Parallel

problem divisibility    No need to divide the         Complex problem needs to be
                        complex problem               divided into easier pieces

optimization tradeoff   copy, emphasizing             isolation, emphasizing
                        exploitation                  exploration

quality mechanism       sequential improvement        redundancy + diversity of
                                                      opinions

side effect             path dependency effect +      useless redundancy for obvious
                        sensitivity to vandalism      decisions + problem of aggregation




Controlled Experiment: web platform




                           Interface/instruction for the Parallel model

on 3 maps with different topologies
                    (annotated by 1 UNITAR expert)




Participants used for the experiments:
              Mechanical Turk as simulator




Data Quality Metrics

                 Quality of the collective output
                 • type I errors = p(wrong annotation)
                 • type II errors = p(missing a building)
                 • Consistency

                 Analogy with the information retrieval field:
                 • Precision = p(an annotation is a building)
                 • Recall = p(a building is annotated)
                 • F-measure = score mixing recall + precision
                 • (metrics adjusted with tolerance distance)
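
As a rough illustration of the metrics above, the sketch below scores a set of point annotations against gold-standard building locations. The greedy one-to-one matching, the function name `score_annotations`, and the 10-unit tolerance are assumptions made for this example; the slides only state that the metrics are adjusted with a tolerance distance. Under this reading, the type I error rate is 1 - precision and the type II error rate is 1 - recall.

```python
import math

def score_annotations(annotations, buildings, tolerance=10.0):
    """Return (precision, recall, f_measure) for point annotations.

    An annotation counts as correct if it lies within `tolerance` of a
    gold-standard building that has not been matched yet (greedy matching,
    an assumption of this sketch).
    """
    unmatched = list(buildings)
    true_positives = 0
    for ax, ay in annotations:
        # Find the closest still-unmatched building within the tolerance.
        best = None
        best_dist = tolerance
        for b in unmatched:
            d = math.hypot(ax - b[0], ay - b[1])
            if d <= best_dist:
                best, best_dist = b, d
        if best is not None:
            unmatched.remove(best)
            true_positives += 1
    precision = true_positives / len(annotations) if annotations else 0.0
    recall = true_positives / len(buildings) if buildings else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# Made-up coordinates, not experimental data.
gold = [(0, 0), (50, 50), (100, 0)]
crowd = [(2, 1), (48, 53), (200, 200)]
print(score_annotations(crowd, gold))
```

Here two of the three annotations fall near a building and two of the three buildings are found, so precision, recall and F-measure all come out at about 0.67.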



Methodology for parallel model
                     Step 1 - collecting independent contribution:
                     N for (map1, map2, map3) = (121,120,113)




Methodology for parallel model
Step 2 - for each map, generating the set of groups of m = 1 to N participants

[Figure: example groups of size m = 1, m = 2 and m = 3]
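
One possible way to build the groups in Step 2 is sketched below. Enumerating every subset of about 120 participants is infeasible, so this sketch draws random groups for each size m. The sampling strategy, the function name `sample_groups`, and the number of groups per size are assumptions, not details given in the slides.

```python
import random

def sample_groups(participant_ids, sizes, groups_per_size=100, seed=0):
    """Map each group size m to a list of randomly drawn groups of m participants."""
    rng = random.Random(seed)
    groups = {}
    for m in sizes:
        # rng.sample draws m distinct participants without replacement.
        groups[m] = [tuple(rng.sample(participant_ids, m))
                     for _ in range(groups_per_size)]
    return groups

participants = list(range(121))  # e.g. N = 121 for map1
groups = sample_groups(participants, sizes=range(1, 6), groups_per_size=3)
for m, sample in groups.items():
    print(m, sample)
```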


Methodology for parallel model
Step 3 - for each group: aggregating + computing quality
(example shown for groups of m = 2)
• Spatial clustering of points + quorum
• Compute data quality against the gold standard: precision, recall, F-measure
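
The aggregation in Step 3 could look roughly like the sketch below: points from all group members are clustered by distance, and a cluster is kept only if enough distinct participants support it. The single-pass clustering, the `radius` and `quorum` parameters, and the function names are assumptions for illustration; the slides do not specify which clustering algorithm was used.

```python
import math

def centroid(points):
    """Mean position of a non-empty list of (x, y) points."""
    xs, ys = zip(*points)
    return sum(xs) / len(xs), sum(ys) / len(ys)

def aggregate(contributions, radius=10.0, quorum=2):
    """Merge per-participant point lists into consensus points.

    contributions: dict mapping participant id -> list of (x, y) points.
    A cluster is kept only if at least `quorum` distinct participants
    contributed a point to it.
    """
    clusters = []  # each cluster: {"points": [...], "voters": set()}
    for pid, points in contributions.items():
        for x, y in points:
            for c in clusters:
                cx, cy = centroid(c["points"])
                if math.hypot(x - cx, y - cy) <= radius:
                    # Close enough to an existing cluster: join it.
                    c["points"].append((x, y))
                    c["voters"].add(pid)
                    break
            else:
                # No cluster nearby: start a new one.
                clusters.append({"points": [(x, y)], "voters": {pid}})
    return [centroid(c["points"]) for c in clusters if len(c["voters"]) >= quorum]

# Made-up group of m = 2 participants.
group = {"a": [(0, 0), (100, 100)], "b": [(3, 4), (300, 300)]}
print(aggregate(group))
```

In this example only the point near the origin is marked by both participants, so it is the only consensus point returned; the resulting list can then be scored against the gold standard.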

The more = the better?
(parallel model)
[Figure: avg. F-measure]

Yes, but only up to a point:
• (Adding more people won’t change the consensus panel)
• Limitation of Linus’ law (compared to the iterative model, e.g. OpenStreetMap)
• Wisdom != skill: we can’t replace training with more people
Methodology for Iterative model




                           sample of an iterative process for map3




Methodology for Iterative model




n instances of about m iterations each
Collected data for (map1, map2, map3) = (13, 21, 25) instances of about 10 iterations
Methodology for Iterative model
Step 2 - for each iteration, we compute the precision, recall and F-measure of all the instances

[Figure panels: Precision, Recall, F-measure]
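
A minimal sketch of the per-iteration computation follows, assuming each instance is stored as a list of per-iteration scores and that instances may differ in length (the slides mention "about 10 iterations"). The data layout and the function name `average_by_iteration` are hypothetical, and the same function would be applied to precision, recall and F-measure in turn.

```python
from statistics import mean

def average_by_iteration(instances):
    """Mean score at each iteration index across instances.

    instances: list of instances, each a list of per-iteration scores.
    Only instances that reached a given iteration contribute to its mean.
    """
    longest = max(len(inst) for inst in instances)
    averages = []
    for i in range(longest):
        values = [inst[i] for inst in instances if len(inst) > i]
        averages.append(mean(values))
    return averages

runs = [[0.2, 0.4, 0.6], [0.4, 0.6]]  # made-up scores, not experimental data
print(average_by_iteration(runs))
```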

Interpretation of results / Comparison on data quality

                          Parallel                    Iterative

Accuracy -                consensual results (*)      error propagation
wrong annotations

Accuracy -                useless redundancy on       accumulation of knowledge driving
missing buildings         obvious buildings           attention to uncovered areas

Consistency               redundancy                  naive last = best

(*) but parallel < iterative in difficult cases (map 2) (lack of consensus)

Side-objective: Measuring how the crowd spatially agrees
Method: randomly take 2 participants, measure their spatial inter-agreement (e.g. ratio of matching points), and repeat the process N times

A way to measure the intrinsic difficulty of a task (map 1 = easy, map 2 = quite hard)
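
The agreement measure could be sketched as follows. The exact ratio is not defined in the slides, so this example assumes matched points divided by the union of both point sets, with points matching when they lie within a tolerance distance; the function names, the tolerance and the number of draws are likewise assumptions.

```python
import math
import random

def match_count(points_a, points_b, tolerance=10.0):
    """Count points of A that pair with a distinct point of B within tolerance."""
    free = list(points_b)
    matches = 0
    for ax, ay in points_a:
        for b in free:
            if math.hypot(ax - b[0], ay - b[1]) <= tolerance:
                free.remove(b)  # each point of B can be matched only once
                matches += 1
                break
    return matches

def agreement(points_a, points_b, tolerance=10.0):
    """Matched points divided by the union of both point sets (assumed ratio)."""
    matches = match_count(points_a, points_b, tolerance)
    union = len(points_a) + len(points_b) - matches
    return matches / union if union else 1.0

def mean_pairwise_agreement(contributions, n_draws=1000, seed=0):
    """Average agreement over n_draws random pairs of distinct participants."""
    rng = random.Random(seed)
    ids = list(contributions)
    total = 0.0
    for _ in range(n_draws):
        a, b = rng.sample(ids, 2)
        total += agreement(contributions[a], contributions[b])
    return total / n_draws

# Made-up contributions, not experimental data.
crowd = {"a": [(0, 0), (50, 50)], "b": [(1, 1), (51, 49)], "c": [(0, 2)]}
print(mean_pairwise_agreement(crowd, n_draws=200))
```

A low average would suggest a map on which participants rarely mark the same places, which is how the slide reads task difficulty from agreement.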
Future tracks
Impact of the organization beyond data quality:
• Energy / footprint needed to collectively solve a problem
• Participation sustainability
• Individual behavior (skill learning & enjoyment)
Skill complementarity:
Is the best group of 3 people made up of the 3 best individuals? The data says no!
Other symbolic organisations / mechanisms:
• Human cellular automata (cell = 1 person, who resubmits a task at time t after being influenced by peers' results generated at time t-1)
• Integration of game design / gamification
Crowdsourcing satellite imagery (Talk at Giscience2012)
