Role of crowdsourcing

Slides presented by Anna Noel-Storr (UK Cochrane Centre) at the ContentMine workshop on Clinical Trials, 6th March 2015.



  1. Data and text mining workshop: The role of crowdsourcing. Anna Noel-Storr. Wellcome Trust, London, Friday 6th March 2015.
  2. What is crowdsourcing? “…the practice of obtaining needed services, ideas, or content by soliciting contributions from a large group of people, and especially from an online community, rather than from traditional employees…” Image credit: DesignCareer.
  3. What is crowdsourcing? (slides 3–7) Brabham’s problem-focused crowdsourcing typology identifies four types: knowledge discovery and management; broadcast search; peer-vetted creative production; and distributed human intelligence tasking.
  8. Micro-tasking: process. Breaking a large corpus of data down into smaller units and distributing those units to a large online crowd: “the distribution of small parts of a problem” (see the micro-tasking sketch after this list).
  9. Human computation. Humans remain better than machines at certain tasks: e.g. identifying pizza toppings from a picture of a pizza; e.g. the title “preventing obesity without eating like a rabbit”.ti. being machine-autotagged as an animal study.
  10. Tools and platforms. What platforms and tools exist, and how do they work? Image credit: ThinkStock.
  11. The Zooniverse: “each project uses the efforts and ability of volunteers to help scientists and researchers deal with the flood of data that confronts them”.
  12. Classification and annotation: Galaxy Zoo; Operation War Diary.
  13. Health-related evidence production: trial identification. Can we use crowdsourcing to identify the evidence in a more timely way? Trial identification is a known pressure point in review production; a new review typically means screening between 2,000 and 5,000 citations, but it can be many more; and it is not a much-loved task.
  14. The Embase project. Cochrane’s Central Register of Controlled Trials (CENTRAL) is fed via two streams, Embase crowd and Embase auto. Step 1: run a very sensitive search for studies in Embase, the largest biomedical database. Step 2: use a crowd to screen the thousands of search results and feed the identified reports of RCTs into CENTRAL. How will the crowd do this?
  15. The screening tool. Three choices; you are not alone! (and you can’t go back); a progress bar; yellow highlights to indicate a likely RCT; red highlights (a keyword-highlighting sketch follows the list).
  16. The Embase project: recruitment. 900+ people have signed up to screen citations in 12 months; 110,000+ citations have been collectively screened; 4,000 RCTs/q-RCTs identified by the crowd. [Chart: number of participants per month, Feb 2014 to Mar 2015.]
  17. Why do people do it? We made it very easy to participate (and equally easy to stop!); volunteers gain experience (bulk up the CV); we provide feedback, both to the individual and to the community (people are more likely to come back); and people want to do something to contribute (healthcare is a strong hook).
  18. How accurate is the crowd? Each record is classified by multiple screeners: three agreeing “RCT” classifications send it to CENTRAL; three agreeing “Reject” classifications send it to the bin; an “Unsure” vote, or a mixed RCT/Reject decision, sends it to a resolver (about 5% of records; see the routing sketch after this list).
  19. Crowd accuracy. Validation 1 (the crowd as index test vs. dual independent information-specialist screeners as reference standard; enriched sample; blinded to crowd decision): TP 1565, FP 9, FN 2, TN 2888, giving sensitivity 99.9% and specificity 99.7%. Validation 2 (the crowd as index test vs. a single independent expert screener (me!) as reference standard; enriched sample; blinded to crowd decision; possibility of incorporation bias): TP 415, FP 5, FN 1, TN 2649, giving sensitivity 99.8% and specificity 99.8%. Individual screener accuracy is also carefully monitored (a quick arithmetic check follows the list).
  20. How fast is the crowd? Length of time to screen one month’s worth of records: 6 weeks in Jan 2014, 5 weeks in Jul 2014, 2 weeks in Jan 2015. More screeners, and screeners screening more quickly.
  21. More of the same, and more tasks. As the crowd becomes more efficient, we plan to do two things: (1) increase the databases we search and feed in more citations; (2) offer other “micro-tasks” beyond screen/bin, such as annotating and appraising (e.g. is the healthcare condition Alzheimer’s disease? Yes, No, Unsure). And in these tasks the machine plays a vital and complementary role…
  22. Perfect partnership: machine-driven probability + collective human decision-making. It’s not one or the other; the ideal is both (see the final sketch after this list).
  23. In summary, crowdsourcing: is an effective method for large-scale study identification; identifies more studies, more quickly; makes no compromise on quality or accuracy; offers meaningful ways to contribute; it is feasible to recruit a crowd; the tool is highly functional; and it complements data and text mining, enabling the move towards the living review.
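
Below are a few illustrative sketches of the mechanisms described above. First, the micro-tasking split from slide 8: a minimal sketch, assuming citations arrive as a plain list; the batch size and the `PMID-` identifiers are illustrative, not the project's actual format.

```python
from itertools import islice

def micro_tasks(citations, batch_size=25):
    """Break a large corpus of citations into small screening batches."""
    it = iter(citations)
    while batch := list(islice(it, batch_size)):
        yield batch

# Example: 110,000 citations become 4,400 batches of 25 records,
# each small enough for a volunteer to screen in one sitting.
citations = [f"PMID-{i}" for i in range(110_000)]
batches = list(micro_tasks(citations))
print(len(batches), "micro-tasks of", len(batches[0]), "records each")
```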
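
The yellow and red highlights from slide 15 can be approximated with keyword matching. A sketch under assumed term lists: the slides do not state the tool's actual highlighting rules, so both lists here are guesses.

```python
import re

# Illustrative cue lists; the real tool's highlighting rules may differ.
LIKELY_RCT = ["randomised", "randomized", "placebo", "double-blind", "allocated"]
LIKELY_NOT = ["case report", "systematic review", "in vitro", "cohort"]

def highlight(abstract: str) -> str:
    """Mark likely-RCT cues as [Y]...[/Y] (yellow) and counter-cues as [R]...[/R] (red)."""
    for term in LIKELY_RCT:
        abstract = re.sub(rf"(?i)\b({re.escape(term)})\b", r"[Y]\1[/Y]", abstract)
    for term in LIKELY_NOT:
        abstract = re.sub(rf"(?i)\b({re.escape(term)})\b", r"[R]\1[/R]", abstract)
    return abstract

print(highlight("A randomised, double-blind, placebo-controlled trial."))
```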
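
Next, the routing rule implied by the slide-18 diagram: three agreeing classifications decide a record, anything else goes to a resolver. The exact production rule may differ; this sketches the diagram as drawn.

```python
def route(votes):
    """Route one record given three crowd votes of 'RCT', 'Reject', or 'Unsure'."""
    if "Unsure" in votes:
        return "Resolver"          # any uncertainty triggers expert review
    if votes.count("RCT") == 3:
        return "CENTRAL"           # unanimous RCT: into the register
    if votes.count("Reject") == 3:
        return "Bin"               # unanimous reject: discarded
    return "Resolver"              # mixed RCT/Reject (about 5% of records)

assert route(["RCT", "RCT", "RCT"]) == "CENTRAL"
assert route(["Reject", "Reject", "Reject"]) == "Bin"
assert route(["RCT", "Reject", "RCT"]) == "Resolver"
```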
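
The accuracy figures on slide 19 follow directly from the reported confusion matrices, as a quick check shows:

```python
def sens_spec(tp, fp, fn, tn):
    """Sensitivity = TP/(TP+FN); specificity = TN/(TN+FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Validation 1: crowd vs dual independent screeners
print(sens_spec(tp=1565, fp=9, fn=2, tn=2888))  # (0.9987, 0.9969) -> 99.9% / 99.7%
# Validation 2: crowd vs single expert screener
print(sens_spec(tp=415, fp=5, fn=1, tn=2649))   # (0.9976, 0.9981) -> 99.8% / 99.8%
```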
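
Finally, one way to read slide 22's "perfect partnership" is that the machine's probability sets the human workload: confident machine scores need fewer crowd votes, ambiguous ones get full screening. The thresholds and vote counts below are illustrative assumptions, not the project's documented pipeline.

```python
def votes_needed(p_rct: float) -> int:
    """Decide how many crowd votes to collect for a record, given a
    classifier's estimated probability that it is an RCT.
    Thresholds are illustrative assumptions."""
    if p_rct >= 0.99 or p_rct <= 0.01:
        return 1   # machine nearly certain: a single human confirmation
    if p_rct >= 0.90 or p_rct <= 0.10:
        return 2   # fairly confident: two votes suffice
    return 3       # ambiguous: full triple-vote screening

for p in (0.995, 0.40, 0.02):
    print(f"p(RCT)={p}: collect {votes_needed(p)} crowd vote(s)")
```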
