• Introduction
• Crowd Motivation
• Client Motivations and Types of tasks
• Scale up with Machine Learning
• Quality Management
• Workflows for Complex tasks
• Reputation Systems
• Economic shift
PWI - September 29, 2011
corina@waterloohills.com
http://bitsofknowledge.waterloohills.com
Crowdsourcing
[Figure: diagram with numbered parts 1-4; label: Crowd or Community (online audience)]

Ex: “Adult Websites” Classification
• Large number of sites to label
• Get people to look at sites and classify them as:

  – G   (general audience)
  – PG  (parental guidance)
  – R   (restricted)
  – X   (porn)

[Panos Ipeirotis. WWW2011 tutorial]

Ex: “Adult Websites” Classification
• Large number of hand‐labeled sites
• Get people to look at sites and classify them as:

  – G   (general audience)
  – PG  (parental guidance)
  – R   (restricted)
  – X   (porn)
Cost/Speed Statistics:
• Undergrad intern: 200 websites/hr, cost: $15/hr
• MTurk: 2500 websites/hr, cost: $12/hr

[Panos Ipeirotis. WWW2011 tutorial]
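
The two rates above can be turned into a per-site cost and a budget for a whole labeling job. The sketch below is only arithmetic on the slide's figures; the function names and the 100,000-site job size are assumptions made for illustration.

```python
def cost_per_site(hourly_cost, sites_per_hour):
    """Dollars spent per labeled website."""
    return hourly_cost / sites_per_hour

def labeling_budget(n_sites, hourly_cost, sites_per_hour):
    """Return (hours needed, total cost in dollars) for n_sites websites."""
    hours = n_sites / sites_per_hour
    return hours, hours * hourly_cost

# Figures from the slide: intern = $15/hr at 200 sites/hr, MTurk = $12/hr at 2500 sites/hr.
intern_unit = cost_per_site(15, 200)    # dollars per site for the intern
mturk_unit = cost_per_site(12, 2500)    # dollars per site on MTurk

# Hypothetical job size, not from the slides.
N_SITES = 100_000
intern_hours, intern_cost = labeling_budget(N_SITES, 15, 200)
mturk_hours, mturk_cost = labeling_budget(N_SITES, 12, 2500)

print(f"intern: ${intern_unit:.4f}/site, {intern_hours:.0f} h, ${intern_cost:.0f}")
print(f"mturk:  ${mturk_unit:.4f}/site, {mturk_hours:.0f} h, ${mturk_cost:.0f}")
```

On these numbers the crowd is 12.5 times faster and roughly 15 times cheaper per site, before any cost for redundancy or quality control is added.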
Crowd Motivation

• €,$ = Money!
• Self-serving purpose (learning new skills,
  get recognition, avoid boredom, enjoyment,
  create a network with other professionals)
• Socializing, feeling of belonging to a
  community, friendship
• Altruism (public good, help others)


Examples: Altruism




Crowd Demography
  (background defines motivation)
• A 2008 survey of the iStockphoto crowd indicates that
  it is quite homogeneous and elite.

• Amazon’s Mechanical Turk workers come
  mainly from 2 countries:
  a) USA
  b) India


Client motivation
• Need Suppliers:

  – Mass work, distributed work, or just tedious work
  – Creative work
  – Looking for specific talent
  – Testing
  – Support
  – Offloading peak demand
  – Tackling problems that need specific communities
    or human variety
  – Any work that can be done cheaper this way

Client motivation

• Need customers!

• Need Funding

• Need to be Backed up

• Crowdsourcing is your business!



Examples of Funding




Client Tasks Goals
3 main goals for a task to be done:

1. Minimize Cost (cheap)
2. Minimize Completion Time (fast)
3. Maximize Quality (good)

→ Remember Crowd Motivation!
  (e.g., gamify your task,
  explain the final purpose)

Examples: Games




[Panos Ipeirotis. WWW2011 tutorial]
Pros
• Quicker: parallelism reduces time
• Cheap
• Creativity, innovation
• Quality (*depends)
• Access to scarce resources: the ‘long tail’
• Multiple sources of feedback
• Lets you build a community (followers)
• Business agility
• Scales up! (*up to a point)

Cons
• Lack of professionalism: unverified quality
• Too many answers
• No standards
• Not always cheap: added costs to bring a
  project to conclusion
• Too few participants if the task or pay is not
  attractive
• Lower quality of work if workers are not motivated

Scale Up with Machine Learning
    Build an ‘Adult Website’ Classifier

• Crowdsourcing is cheap but not free
  – Workers cannot do more than xx hours/day
  → Cannot scale to the web without help

Build automatic classification models using
 examples from crowdsourced data

Integration with Machine Learning

• Humans label training data
• Use training data to build model
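
As a rough illustration of the two steps above, the sketch below trains a tiny bag-of-words Naive Bayes model on crowd-labeled snippets. The slides do not say which model was used; Naive Bayes is swapped in here only because it fits in a few lines, and the training texts, labels, and function names are all invented for the example.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(examples):
    """examples: list of (text, label) pairs, e.g. labels produced by the crowd."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)   # label -> word -> count
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for word in text.lower().split():
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab

def predict(model, text):
    """Return the label with the highest log-probability (add-one smoothing)."""
    label_counts, word_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in label_counts.items():
        score = math.log(count / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            if word in vocab:            # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical crowd-labeled training data (G = general audience, X = porn).
crowd_labeled = [
    ("family recipes and cooking tips", "G"),
    ("kids homework help and games", "G"),
    ("explicit adult videos", "X"),
    ("adult explicit photos gallery", "X"),
]
model = train_naive_bayes(crowd_labeled)
print(predict(model, "explicit adult gallery"))
print(predict(model, "cooking games for kids"))
```

Once such a model exists, humans only need to label the cases it gets wrong or is unsure about, which is what lets the approach scale beyond the hours workers can put in.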




Quality Management
       Ex: “Adult Website” Classification
• Bad news: Spammers!
• Worker ATAMRO447HWJQ labeled
  X (porn) sites as G (general audience)




[Panos Ipeirotis. WWW2011 tutorial]
Quality Management
   Majority Voting and Label Quality
• Spammers try to go undetected
• Well-meaning workers may be biased
  → difficult to tell the two apart

1. Ask multiple labelers
2. Keep majority label as
   “true” label

Use the probability of
being correct as the
Quality Indicator
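
A minimal sketch of the two steps above, assuming each site is labeled by several workers. Agreement with the majority label is used here as a simple stand-in for "probability of being correct"; the worker IDs, site names, and function names are hypothetical.

```python
from collections import Counter, defaultdict

def majority_labels(votes):
    """votes: {item: {worker: label}} -> {item: majority label}.

    Ties are broken in favor of the label seen first; a real system
    would ask for another vote instead.
    """
    return {item: Counter(by_worker.values()).most_common(1)[0][0]
            for item, by_worker in votes.items()}

def worker_agreement(votes, truth):
    """Fraction of each worker's labels that match the majority label."""
    hits, seen = defaultdict(int), defaultdict(int)
    for item, by_worker in votes.items():
        for worker, label in by_worker.items():
            seen[worker] += 1
            hits[worker] += (label == truth[item])
    return {worker: hits[worker] / seen[worker] for worker in seen}

# Hypothetical votes: w3 labels everything "G", like the spammer described earlier.
votes = {
    "site-a": {"w1": "G", "w2": "G", "w3": "G"},
    "site-b": {"w1": "X", "w2": "X", "w3": "G"},
    "site-c": {"w1": "R", "w2": "R", "w3": "G"},
}
truth = majority_labels(votes)
quality = worker_agreement(votes, truth)
print(truth)
print(quality)
```

Note that this indicator cannot separate a spammer from an honest but biased worker: both simply score low, which is the difficulty the slide points out.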

Complex tasks
 Handle answers through workflow
• Q: “My task does not have discrete answers….”
• A: Break into two Human Intelligence Tasks (HITs):
   – “Create” HIT
   – “Vote” HIT

• Vote controls quality of Creation HIT
• Redundancy controls quality of Voting HIT
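
The Create/Vote split can be sketched with workers simulated as plain Python functions. Nothing below calls Mechanical Turk; `run_create_vote`, the worker functions, and the sample answers are assumptions made for illustration.

```python
from collections import Counter

def run_create_vote(task, creators, voters):
    """Run a 'Create' HIT, then a redundant 'Vote' HIT over the candidates."""
    candidates = [create(task) for create in creators]       # "Create" HIT
    ballots = [vote(task, candidates) for vote in voters]    # "Vote" HIT (index of best)
    winner_index, _ = Counter(ballots).most_common(1)[0]     # redundancy: majority wins
    return candidates[winner_index]

# Simulated creators (hypothetical free-form answers).
def terse_creator(task):
    return "A calculator."

def thorough_creator(task):
    return "A calculator, a pen and some coins on a desk."

# Simulated voters: careful ones prefer the most detailed answer,
# the sloppy one always clicks the first option.
def careful_voter(task, candidates):
    return max(range(len(candidates)), key=lambda i: len(candidates[i]))

def sloppy_voter(task, candidates):
    return 0

best = run_create_vote(
    "Describe the photo",
    creators=[terse_creator, thorough_creator],
    voters=[careful_voter, careful_voter, sloppy_voter],
)
print(best)
```

The sloppy voter is outvoted, which is the sense in which redundancy controls the quality of the Voting HIT.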



Collaboration: Photo description
  But the free-form
  answer can be more
  complex, not just right or
  wrong…




TurKit toolkit [Little et al., UIST 2010]: http://groups.csail.mit.edu/uid/turkit/
Collaboration: Description Versions
1. A partial view of a pocket calculator
   together with some coins and a pen.
2. ...
3. A close‐up photograph of the following
   items: A CASIO multi‐function
   calculator. A ball point pen, uncapped.
   Various coins, apparently European,
   both copper and gold. Seems to be a
   theme illustration for a brochure or
   document cover treating finance,
   probably personal finance.
4. …
8. A close‐up photograph of the following items: A CASIO
   multi‐function, solar powered scientific calculator. A blue ball
   point pen with a blue rubber grip and the tip extended. Six
   British coins; two of £1 value, three of 20p value and one of 1p
   value. Seems to be a theme illustration for a brochure or
   document cover treating finance ‐ probably personal finance.
Collaboration

• Exploration / exploitation tradeoff
  (working independently or not)
– Can accelerate learning, by sharing good
  solutions
– But can lead to premature convergence on
  suboptimal solution

[Mason and Watts, submitted to Science, 2011]


Collaboration: Positive
• Building iteratively allows better outcomes
  for the image description task.
• In the Foldit puzzles, players built on each
  other’s results. They recently determined, in 10
  days, the molecular structure of a protein-cutting
  enzyme from an AIDS-like virus.




Collaboration: Negative
             Groupthink Effect
• Individual search strategies affect group success:

  Players who copy each other explore less
  → lower probability of finding the
    peak in a given round

Workflow Patterns
Creation:
• Generate / Create
• Find
• Improve / Edit / Fix
Quality Control:
• Vote for accept‐reject
• Vote up, vote down, to generate rank
• Vote for best / select top‐k
Flow Control:
• Split task
• Aggregate
• Iterate
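
These patterns compose. As one example, "Improve", "Vote for accept-reject", and "Iterate" can be chained into a loop, sketched below with simulated workers. The function names and sample texts are hypothetical, and this is not the TurKit API.

```python
def iterate_improve(draft, improvers, voters):
    """Each round: one worker edits the draft, then voters accept or reject the edit."""
    for improve in improvers:
        candidate = improve(draft)                                   # Improve / Edit / Fix
        accepts = sum(1 for vote in voters if vote(draft, candidate))
        if accepts * 2 > len(voters):                                # Vote for accept-reject
            draft = candidate                                        # keep only accepted edits
    return draft                                                     # Iterate until workers run out

# Simulated improvers: two add detail, one vandalizes the text.
def add_pen(text):
    return text + " A blue pen."

def vandal(text):
    return ""

def add_coins(text):
    return text + " Six coins."

# Simulated voter: accepts an edit only if it adds text (a crude quality proxy).
def prefers_more_detail(old, new):
    return len(new) > len(old)

final = iterate_improve(
    "A calculator.",
    improvers=[add_pen, vandal, add_coins],
    voters=[prefers_more_detail] * 3,
)
print(final)
```

The vote step is what keeps a bad edit from propagating into later rounds.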
AdSafe Crowdsourcing Experience
• Detect pages that discuss swine flu
  – A pharmaceutical firm had a drug “treating” (off-label) swine flu
  – The FDA prohibited the firm from displaying the drug ad on
    pages about swine flu
  → Two days to comply!

• A big fast-food chain does not want its ad to appear:
  – On pages that discuss the brand (99% negative sentiment)
  – On pages discussing obesity

AdSafe Crowdsourcing Experience
     Workflow to classify URLs
• Find URLs for a given topic (hate speech, gambling, alcohol
  abuse, guns, bombs, celebrity gossip, etc.)
  http://url-collector.appspot.com/allTopics.jsp

• Classify URLs into appropriate categories
  http://url-annotator.appspot.com/AdminFiles/Categories.jsp

• Measure quality of the labelers and remove spammers
  http://qmturk.appspot.com/

• Get humans to “beat” the classifier by providing cases where
  the classifier fails
  http://adsafe-beatthemachine.appspot.com/

Crowdsourcing Aggregators
Act as Portals
• Create a crowd or community
• Create a site to connect a client to the crowd
• Deal with the workflow of complex tasks, such as
  decomposition into simpler tasks and answer
  recomposition
• Work as broker, bank, and mediator

→ Allow anonymity
→ Consumers can benefit from a crowd without
  the need to create it

Market Design:
Crude vs Intelligent Crowdsourcing
• Intelligent Crowdsourcing uses an
  organized workflow to tackle the cons of
  crude crowdsourcing.

  – A complex task is divided by experts
  – Subtasks are given to relevant crowds, not to
    everyone
  – Individual answers are recomposed by
    experts into a general answer

Lack of Reputation and Market for Lemons
“When quality of sold good is uncertain and hidden before
  transaction, price goes to value of lowest valued good”
  [Akerlof, 1970; Nobel prize winner]

• Market evolution steps:
  1. Employer pays $10 to a good worker, $0.1 to a bad worker
  2. 50% good workers, 50% bad; indistinguishable from
     each other
  3. Employer offers a price in the middle: $5
  4. Some good workers leave the market (pay too low)
  5. Employer revises prices downwards as the % of bad
     workers increases
  6. More good workers leave the market… death spiral

http://en.wikipedia.org/wiki/The_Market_for_Lemons
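
The six steps can be played out as a small simulation. This is a toy model under stated assumptions, not Akerlof's formal analysis: the employer offers the average value of the workers still in the market, each good worker has a private minimum acceptable pay (the evenly spread reservation wages are invented), and bad workers never leave.

```python
def lemons_spiral(reservations, n_bad, good_value=10.0, bad_value=0.1):
    """Return the [(good workers remaining, price offered), ...] history, round by round.

    Assumes n_bad > 0 so the market never becomes empty.
    """
    staying = sorted(reservations)
    history = []
    while True:
        n_good = len(staying)
        # Employer cannot tell workers apart, so offers the average value.
        offer = (n_good * good_value + n_bad * bad_value) / (n_good + n_bad)
        history.append((n_good, offer))
        # Good workers whose minimum acceptable pay exceeds the offer leave.
        remaining = [r for r in staying if r <= offer]
        if len(remaining) == n_good:     # nobody left this round: market is stable
            return history
        staying = remaining

# Hypothetical market: 50 good workers with minimum pay spread from $1 to $10, 50 bad workers.
reservations = [1 + 9 * i / 49 for i in range(50)]
history = lemons_spiral(reservations, n_bad=50)
for n_good, offer in history:
    print(f"{n_good:2d} good workers, offer ${offer:.2f}")
```

With these inputs the first offer is $5.05, close to the slide's "price in the middle", and the market only settles once every good worker has left and the price has fallen to the value of a bad worker.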
Reputation systems
• Challenges:
  – Insufficient participation
  – Overwhelmingly positive feedback
    + Raters hope to get a positive rating in return
    + Negative feedback is avoided for fear of retaliation
  – Dishonest reports
    + “Riddle for a PENNY! No shipping - Positive Feedback”
    + “Bad-mouth” reports
• Incentive mechanisms to get honest feedback
  – Pay the rater if the report matches the next report
  – Delay the next transaction over time
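
The "pay the rater if the report matches the next one" idea can be sketched naively as below. The slides do not specify the payment rule, so the flat bonus, its amount, and the function name are assumptions; a plain agreement bonus like this one is a simplification and does not by itself guarantee honest reporting.

```python
def agreement_payments(reports, bonus=0.05):
    """Pay each rater `bonus` if their report equals the next rater's report.

    The last rater has no successor yet, so no payment is computed for them.
    """
    payments = []
    for current, following in zip(reports, reports[1:]):
        payments.append(bonus if current == following else 0.0)
    return payments

# Hypothetical sequence of feedback reports about the same seller.
reports = ["positive", "positive", "negative", "negative", "positive"]
payments = agreement_payments(reports)
print(payments)
```

Because a rater does not know what the next rater will say, the intuition is that reporting what was actually experienced is the best guess at matching it.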
Reputation systems
• “Cheap pseudonyms”: easy to disappear and
  re-register under a new identity at almost no cost
  [Friedman and Resnick 2001]
  → Introduces opportunities to misbehave without
    paying reputational consequences
• Increase the difficulty of online identity changes
  → Impose upfront costs on new entrants: allow new
    identities (forget the past) but make it costly

• 2-sided Reputation Mechanisms
  – Crowd: to ensure worker quality
  – Employer: to ensure their trustworthiness

Economic Shift
• From Social Networking to Social Production
  through Collaborative Innovation

  → Mass collaboration changes how products and
    services are designed, manufactured, and marketed

• Classical geopolitical and economic organisations
  do not correspond to the new economy

  → Realignment of competitive advantages
  → Move towards Collaborative Enterprises based
    on Open Infrastructure

Societal Shift
          Moral values Reinforcement
• Open data access makes actions Transparent
• Transparency makes people Accountable
• Accountability forces/fosters Integrity
• Integrity breeds Community Support

→ Link between Ethical values and ROI

References
• Wikipedia, 2011.
• Dion Hinchcliffe. "Crowdsourcing: 5 Reasons It's Not Just For Start Ups Anymore", 2009.
• Tomoko A. Hosaka, MSNBC. "Facebook asks users to translate for free", 2008.
• Daren C. Brabham. "Moving the Crowd at iStockphoto: The Composition of
  the Crowd and Motivations for Participation in a Crowdsourcing Application",
  First Monday, 13(6), 2008.
• Karim R. Lakhani, Lars Bo Jeppesen, Peter A. Lohse & Jill A. Panetta. "The
  value of openness in scientific problem solving" (Harvard Business School
  Working Paper No. 07-050), 2007.
• Klaus-Peter Speidel. "How to Do Intelligent Crowdsourcing", 2011.
• Panos Ipeirotis. "Managing Crowdsourced Human Computation",
  WWW2011 tutorial, 2011.
• Omar Alonso & Matthew Lease. "Crowdsourcing 101: Putting the WSDM of
  Crowds to Work for You", WSDM Hong Kong, 2011.
• Sanjoy Dasgupta. http://videolectures.net/icml09_dasgupta_langford_actl/, 2009.
• Don Tapscott, Anthony Williams. Macrowikinomics, 2010.

Call For Ideas:

                 If you have a large set of examples
                         or just an idea of application
                 for a program to classify or predict,
                       I would love to hear from you!

Questions?
                       corina@waterloohills.com
        http://bitsofknowledge.waterloohills.com

0183760ssssssssssssssssssssssssssss00101011 (27).pdf
 
Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...
Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...
Enhancing and Restoring Safety & Quality Cultures - Dave Litwiller - May 2024...
 
The Coffee Bean & Tea Leaf(CBTL), Business strategy case study
The Coffee Bean & Tea Leaf(CBTL), Business strategy case studyThe Coffee Bean & Tea Leaf(CBTL), Business strategy case study
The Coffee Bean & Tea Leaf(CBTL), Business strategy case study
 

Crowdsourcing Techniques and Best Practices

  • 1. • Introduction • Crowd Motivation • Client Motivations and Types of tasks • Scale up with Machine Learning • Quality Management • Workflows for Complex tasks • Reputation Systems • Economic shift PWI - September 29, 2011 corina@waterloohills.com http://bitsofknowledge.waterloohills.com
  • 2. Crowdsourcing: Crowd or Community (online audience) [diagram with four numbered elements, 1 to 4]
  • 3. Ex: “Adult Websites” Classification • Large number of sites to label • Get people to look at sites and classify them as: – G (general audience) – PG (parental guidance) – R (restricted) – X (porn) [Panos Ipeirotis. WWW2011 tutorial]
  • 4. Ex: “Adult Websites” Classification • Large number of hand-labeled sites • Get people to look at sites and classify them as: – G (general audience) – PG (parental guidance) – R (restricted) – X (porn) Cost/Speed Statistics: • Undergrad intern: 200 websites/hr, cost: $15/hr • MTurk: 2500 websites/hr, cost: $12/hr [Panos Ipeirotis. WWW2011 tutorial]
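The cost/speed figures on slide 4 can be turned into a per-site comparison. The short sketch below only does that arithmetic; the function name `cost_per_site` is an illustrative choice, and the four numbers are the ones quoted on the slide.

```python
def cost_per_site(hourly_cost, sites_per_hour):
    """Return the labeling cost of a single website, in dollars."""
    return hourly_cost / sites_per_hour

# Figures quoted on the slide.
intern = cost_per_site(15.0, 200)   # undergrad intern
mturk = cost_per_site(12.0, 2500)   # Mechanical Turk

print(f"intern: ${intern:.4f} per site")
print(f"MTurk:  ${mturk:.4f} per site")
print(f"cost ratio (intern / MTurk): {intern / mturk:.1f}")
print(f"speed ratio (MTurk / intern): {2500 / 200:.1f}")
```

On these numbers the intern costs 7.5 cents per site against roughly half a cent on MTurk, which is the gap the slide is pointing at. The comparison ignores quality, which the later Quality Management slides address.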
  • 5. Crowd Motivation • €,$ = Money! • Self-serving purpose (learning new skills, gaining recognition, avoiding boredom, enjoyment, building a network with other professionals) • Socializing, feeling of belonging to a community, friendship • Altruism (public good, helping others)
  • 6. Examples: Altruism
  • 7. Crowd Demography (background defines motivation) • The 2008 survey at iStockphoto indicates that the crowd is quite homogeneous and elite. • Amazon’s Mechanical Turk workers come mainly from 2 countries: a) USA b) India
  • 8. Crowd Demography
  • 9. Client motivation • Need Suppliers: – Mass work, distributed work, or just tedious work – Creative work – Look for specific talent – Testing – Support – To offload peak demands – Tackle problems that need specific communities or human variety – Any work that can be done cheaper this way
  • 10. Client motivation • Need customers! • Need Funding • Need to be Backed up • Crowdsourcing is your business!
  • 11. Examples of Funding
  • 12. Client Tasks Goals 3 main goals for a task to be done: 1. Minimize Cost (cheap) 2. Minimize Completion Time (fast) 3. Maximize Quality (good) → Remember Crowd Motivation! (ex.: game-ify your task, explain the final purpose)
  • 13. Examples: Games
  • 15. Pros • Quicker: parallelism reduces time • Cheap • Creativity, Innovation • Quality (*depends) • Access to scarce resources: the ‘long tail’ • Multiple feedback • Allows creating a community (followers) • Business Agility • Scales up! (*up to a level)
  • 16. Cons • Lack of professionalism: unverified quality • Too many answers • No standards • Not always cheap: added costs to bring a project to conclusion • Too few participants if the task or pay is not attractive • Unmotivated workers deliver lower-quality work
  • 17. Scale Up with Machine Learning Build an ‘Adult Website’ Classifier • Crowdsourcing is cheap but not free – Workers cannot do more than xx hours/day – Cannot scale to the web without help • Build automatic classification models using examples from crowdsourced data
  • 18. Integration with Machine Learning • Humans label training data • Use training data to build model
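As a rough illustration of slide 18, the sketch below trains a tiny multinomial naive Bayes text classifier on crowd-provided labels. The choice of naive Bayes, the function names `train` and `predict`, and the four training snippets are all assumptions made for the example; the slides do not say which model was used.

```python
import math
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (text, label) pairs produced by crowd workers."""
    word_counts = defaultdict(Counter)   # label -> word frequencies
    label_counts = Counter()             # label -> number of examples
    vocab = set()
    for text, label in examples:
        words = text.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocab.update(words)
    return word_counts, label_counts, vocab

def predict(model, text):
    """Return the label with the highest log-probability (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n in label_counts.items():
        score = math.log(n / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in text.lower().split():
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Made-up page snippets standing in for crowd-labeled sites.
crowd_labeled = [
    ("family recipes and cooking tips", "G"),
    ("school homework help for kids", "G"),
    ("explicit adult videos", "X"),
    ("adult explicit content gallery", "X"),
]
model = train(crowd_labeled)
print(predict(model, "explicit adult gallery"))
print(predict(model, "cooking tips for kids"))
```

Once such a model exists, the crowd is only needed for new training examples and for the cases where the model fails, which is how the approach scales beyond what workers can label by hand.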
  • 19. Quality Management Ex: “Adult Website” Classification • Bad news: Spammers! • Worker ATAMRO447HWJQ labeled X (porn) sites as G (general audience) [Panos Ipeirotis. WWW2011 tutorial]
  • 20. Quality Management Majority Voting and Label Quality • Spammers try to go undetected • Well-intentioned workers may be biased → difficult to tell them apart from spammers 1. Ask multiple labelers 2. Keep the majority label as the “true” label • Use each worker's probability of being correct as the Quality Indicator
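The two steps on slide 20 can be sketched as follows: take the majority label per site, then score each worker by how often they agree with it. The function names, worker ids and votes are made up for illustration, and the agreement rate is only a crude stand-in for the "probability of being correct".

```python
from collections import Counter, defaultdict

def majority_labels(votes):
    """votes: dict item -> list of (worker, label). Returns item -> majority label.
    Ties go to the label that was seen first."""
    return {
        item: Counter(label for _, label in item_votes).most_common(1)[0][0]
        for item, item_votes in votes.items()
    }

def worker_agreement(votes, truth):
    """Fraction of each worker's labels that match the majority label."""
    hits = defaultdict(int)
    total = defaultdict(int)
    for item, item_votes in votes.items():
        for worker, label in item_votes:
            total[worker] += 1
            hits[worker] += int(label == truth[item])
    return {worker: hits[worker] / total[worker] for worker in total}

# Made-up votes: w3 labels porn as G and general sites as X.
votes = {
    "site-a": [("w1", "G"), ("w2", "G"), ("w3", "X")],
    "site-b": [("w1", "X"), ("w2", "X"), ("w3", "G")],
    "site-c": [("w1", "PG"), ("w2", "PG"), ("w3", "PG")],
}
truth = majority_labels(votes)
quality = worker_agreement(votes, truth)
print(truth)
print(quality)
```

A plain agreement rate penalizes a consistently biased but honest worker just as much as a spammer, which is exactly the difficulty the slide raises; separating the two needs a per-worker error model rather than a single score.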
  • 21. Complex tasks Handle answers through workflow • Q: “My task does not have discrete answers….” • A: Break it into two Human Intelligence Tasks (HITs): – “Create” HIT – “Vote” HIT • The “Vote” HIT controls the quality of the “Create” HIT • Redundancy controls the quality of the “Vote” HIT
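A minimal simulation of the Create/Vote pattern from slide 21 is sketched below. Workers are represented by plain Python callables, so nothing here talks to Mechanical Turk; `run_create_hit`, `run_vote_hit` and the sample workers are hypothetical names introduced only for this example.

```python
from collections import Counter

def run_create_hit(task, creators):
    """'Create' HIT: collect one free-form answer per creator."""
    return [create(task) for create in creators]

def run_vote_hit(candidates, voters):
    """'Vote' HIT: each voter returns the index of a preferred candidate.
    Asking several voters (redundancy) guards against a single bad vote."""
    ballots = Counter(vote(candidates) for vote in voters)
    winner_index, _ = ballots.most_common(1)[0]
    return candidates[winner_index]

# Hypothetical stand-ins for human workers.
creators = [
    lambda task: "A calculator.",
    lambda task: "A calculator, a pen and some coins on a desk.",
]

def prefers_longest(candidates):
    return max(range(len(candidates)), key=lambda i: len(candidates[i]))

def prefers_first(candidates):
    return 0  # a careless voter who always picks the first option

voters = [prefers_longest, prefers_longest, prefers_first]

candidates = run_create_hit("Describe the photo", creators)
best = run_vote_hit(candidates, voters)
print(best)
```

The careless voter is outvoted two to one, so the more detailed description wins, which is the sense in which redundancy controls the quality of the voting step.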
  • 22. Collaboration: Photo description But the free-form answer can be more complex, not just right or wrong… TurKit toolkit [Little et al., UIST 2010]: http://groups.csail.mit.edu/uid/turkit/
  • 23. Collaboration: Description Versions 1. A partial view of a pocket calculator together with some coins and a pen. 2. ... 3. A close-up photograph of the following items: A CASIO multi-function calculator. A ball point pen, uncapped. Various coins, apparently European, both copper and gold. Seems to be a theme illustration for a brochure or document cover treating finance, probably personal finance. 4. … 8. A close-up photograph of the following items: A CASIO multi-function, solar powered scientific calculator. A blue ball point pen with a blue rubber grip and the tip extended. Six British coins; two of £1 value, three of 20p value and one of 1p value. Seems to be a theme illustration for a brochure or document cover treating finance - probably personal finance.
  • 24. Collaboration • Exploration / exploitation tradeoff (independence or not) – Can accelerate learning, by sharing good solutions – But can lead to premature convergence on a suboptimal solution [Mason and Watts, submitted to Science, 2011]
  • 25. Collaboration: Positive • Building iteratively allows better outcomes for the image description task. • In the FoldIt puzzles, workers built on each other’s results. They recently found in 10 days the molecular structure of a protein-cutting enzyme from an AIDS-like virus.
  • 26. Collaboration: Negative Groupthink Effect • Individual search strategies affect group success: players who copy each other explore less → lower probability of finding the peak in a round
  • 27. Workflow Patterns • Creation: – Generate / Create – Find – Improve / Edit / Fix • Quality Control: – Vote for accept-reject – Vote up, vote down, to generate rank – Vote for best / select top-k • Flow Control: – Split task – Aggregate – Iterate
  • 28. AdSafe Crowdsourcing Experience
  • 30. AdSafe Crowdsourcing Experience • Detect pages that discuss swine flu – Pharmaceutical firm had a drug “treating” (off-label) swine flu – FDA prohibited the firm from displaying the drug ad on pages about swine flu → two days to comply! • Big fast-food chain does not want its ad to appear: – On pages that discuss the brand (99% negative sentiment) – On pages discussing obesity
  • 31. AdSafe Crowdsourcing Experience Workflow to classify URLs • Find URLs for a given topic (hate speech, gambling, alcohol abuse, guns, bombs, celebrity gossip, etc.) http://url-collector.appspot.com/allTopics.jsp • Classify URLs into appropriate categories http://url-annotator.appspot.com/AdminFiles/Categories.jsp • Measure the quality of the labelers and remove spammers http://qmturk.appspot.com/ • Get humans to “beat” the classifier by providing cases where the classifier fails http://adsafe-beatthemachine.appspot.com/
  • 32. Crowdsourcing Aggregators Act as Portals • Create a crowd or community. • Create a site to connect a client to the crowd • Deal with the workflow of complex tasks, like decomposition into simpler tasks and answer recomposition • Work as Broker, Bank and Mediator – Allow anonymity – Consumers can benefit from a crowd without the need to create it.
  • 33. Market Design: Crude vs Intelligent Crowdsourcing • Intelligent Crowdsourcing uses an organized workflow to tackle the CONS of crude crowdsourcing. – Complex task is divided by experts – Subtasks are given to relevant crowds, and not to everyone – Individual answers are recomposed by experts into a general answer
  • 34. Lack of Reputation and Market for Lemons “When quality of sold good is uncertain and hidden before transaction, price goes to value of lowest valued good” [Akerlof, 1970; Nobel prize winner] • Market evolution steps: 1. Employer pays $10 to a good worker, $0.10 to a bad worker 2. 50% good workers, 50% bad; indistinguishable from each other 3. Employer offers a price in the middle: $5 4. Some good workers leave the market (pay too low) 5. Employer revises prices downwards as the share of bad workers increases 6. More good workers leave the market… death spiral http://en.wikipedia.org/wiki/The_Market_for_Lemons
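The death spiral on slide 34 can be reproduced with a toy simulation. The $10 and $0.10 values and the 50/50 split come from the slide; the individual reservation wages of the good workers (the lowest pay each will accept) are an assumption added so that "some good workers leave" has a concrete rule, and `lemons_spiral` is a hypothetical name.

```python
def lemons_spiral(good_reservations, n_bad, good_value=10.0, bad_value=0.10):
    """Simulate rounds in which the employer offers the average worker value
    and good workers whose reservation wage exceeds the offer leave.
    Returns (prices offered per round, number of good workers left)."""
    good = list(good_reservations)
    prices = []
    while True:
        n_good = len(good)
        price = (n_good * good_value + n_bad * bad_value) / (n_good + n_bad)
        prices.append(price)
        staying = [wage for wage in good if wage <= price]
        if len(staying) == n_good:   # nobody left this round: market is stable
            return prices, n_good
        good = staying

# Assumed reservation wages for 7 good workers, facing 7 bad workers (50/50).
prices, remaining_good = lemons_spiral([4, 5, 6, 7, 8, 9, 10], n_bad=7)
print([round(p, 2) for p in prices])
print(remaining_good)
```

With these assumed wages the offer starts near the slide's $5, drops after the first wave of departures, and ends at the bad-worker value with no good workers left. A reputation system breaks the loop by letting the employer tell the two groups apart and price them separately.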
  • 35. Reputation systems • Challenges: – Insufficient participation – Overwhelmingly positive feedback: hoping to get a positive ranking in return; negative feedback avoided for fear of retaliation – Dishonest reports: « Riddle for a PENNY! No shipping-Positive Feedback »; « bad-mouth » reports • Incentive mechanisms to get honest feedback: – Pay the rater if the report matches the next report – Delay the next transaction over time
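One simplified reading of "pay rater if report matches next" is sketched below: each rater earns a small bonus when the following rater of the same seller files the same report. The bonus amount, the function name `matching_bonuses` and the sample reports are assumptions; real mechanisms of this kind use more careful scoring rules.

```python
def matching_bonuses(reports, bonus=0.05):
    """Return one payment per rater that has a successor: `bonus` if the
    rater's report equals the next rater's report, otherwise 0.
    The last rater is not paid until another report arrives."""
    return [
        bonus if current == following else 0.0
        for current, following in zip(reports, reports[1:])
    ]

# Made-up sequence of reports about one seller.
reports = ["good", "good", "bad", "bad", "good"]
payments = matching_bonuses(reports)
print(payments)
```

Because a rater cannot control what the next person reports, the idea is that describing the real experience is the best guess at what will match.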
  • 36. Reputation systems • “Cheap pseudonyms”: it is easy to disappear and re-register under a new identity at almost no cost [Friedman and Resnick 2001] – This introduces opportunities to misbehave without paying reputational consequences – Increase the difficulty of online identity changes – Impose upfront costs on new entrants: allow new identities (forget the past) but make them costly • 2-sided Reputation Mechanisms – Crowd: to ensure worker quality – Employer: to ensure their trustworthiness
  • 37. Economic Shift • From Social Networking to Social Production through Collaborative Innovation → Mass collaboration changes how Products & Services are Designed, Manufactured, Marketed • Classical geo-political and economic organisations do not correspond to the new economy → Realignment of competitive advantages → Move towards Collaborative Enterprises based on Open Infrastructure
  • 38. Societal Shift: Reinforcement of Moral Values • Open data access makes actions Transparent • Transparency makes people Accountable • Accountability forces/fosters Integrity • Integrity breeds Community Support → Link between Ethical values and ROI
  • 39. References • Wikipedia, 2011. • Dion Hinchcliffe. Crowdsourcing: 5 Reasons It's Not Just For Start Ups Anymore, 2009. • Tomoko A. Hosaka, MSNBC. “Facebook asks users to translate for free”, 2008. • Daren C. Brabham. “Moving the Crowd at iStockphoto: The Composition of the Crowd and Motivations for Participation in a Crowdsourcing Application”, First Monday, 13(6), 2008. • Karim R. Lakhani, Lars Bo Jeppesen, Peter A. Lohse & Jill A. Panetta. The value of openness in scientific problem solving (Harvard Business School Working Paper No. 07-050), 2007. • Klaus-Peter Speidel. How to Do Intelligent Crowdsourcing, 2011. • Panos Ipeirotis. Managing Crowdsourced Human Computation, WWW2011 tutorial, 2011. • Omar Alonso & Matthew Lease. Crowdsourcing 101: Putting the WSDM of Crowds to Work for You, WSDM Hong Kong, 2011. • Sanjoy Dasgupta. http://videolectures.net/icml09_dasgupta_langford_actl/, 2009. • Don Tapscott, Anthony Williams. Macrowikinomics, 2010.
  • 40. Call For Ideas: If you have a large set of examples, or just an idea for an application of a program that classifies or predicts, I would love to hear from you! Questions? corina@waterloohills.com