Crowd Sourcing Web Service Annotations




            James Scicluna¹, Christoph Blank¹, Nathalie Steinmetz¹ and Elena Simperl²
                                                ¹seekda GmbH, ²Karlsruhe Institute of Technology




© Copyright 2012 SEEKDA GmbH – www.seekda.com
Outline

       Introduction to seekda Web Service search engine

       Web API crawling & identification

       Amazon Mechanical Turk crowdsourcing

       Web Service Annotation wizard




seekda Web Service Search Engine




Why crawl for Web APIs?

       Significant growth of Web APIs
           > 5,400 Web APIs on ProgrammableWeb (including SOAP and
            REST APIs) [end of 2009: ca. 1,500 Web APIs]
           > 6,500 Mashups on ProgrammableWeb (combining Web APIs
            from one or more sources)
       SOAP services are only a small part of the overall available
        public services




Web API Crawling

       Problem:
           Web APIs are described by regular HTML pages
           No standardized structure that helps with the identification




Web API Identification

       Solution: Crawl for Web APIs
           Approach 1: Manual Feature Identification
              Takes into account HTML structure (e.g., title, mark-up), syntactic
               properties of the language used (e.g., camel-cased words), and link
               properties of pages (ratio of external to internal links)
           Approach 2: Automatic Classification
              Text classification with supervised learning (Support Vector Machine
               model)
              Training set: APIs from ProgrammableWeb


       But: human confirmation was still needed to be sure
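
The three feature groups named under Approach 1 could be extracted roughly as follows. This is only a minimal sketch using the Python standard library: the class and function names, the regular expression, and the exact feature definitions are assumptions made for illustration and are not taken from the seekda crawler.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed definition of "camel-cased word": lowercase start, then one or
# more capitalised parts, e.g. getUserName.
CAMEL_CASE = re.compile(r"\b[a-z]+(?:[A-Z][a-z0-9]+)+\b")


class PageFeatureParser(HTMLParser):
    """Collects the title, body text, and link targets of one HTML page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data
        else:
            self.text_parts.append(data)


def extract_features(html, page_url):
    """Return illustrative HTML, language, and link features for one page."""
    parser = PageFeatureParser()
    parser.feed(html)
    host = urlparse(page_url).netloc
    external = internal = 0
    for href in parser.links:
        link_host = urlparse(href).netloc
        # Relative links have no host and are counted as internal.
        if link_host and link_host != host:
            external += 1
        else:
            internal += 1
    text = " ".join(parser.text_parts)
    return {
        "title_mentions_api": "api" in parser.title.lower(),
        "camel_case_words": len(CAMEL_CASE.findall(text)),
        # max(..., 1) avoids division by zero on pages without internal links.
        "external_internal_ratio": external / max(internal, 1),
    }


sample_html = (
    "<html><head><title>Foo API Reference</title></head><body>"
    "<p>Call getUserName or listItems.</p>"
    '<a href="/docs">Docs</a><a href="http://other.example/x">Partner</a>'
    "</body></html>"
)
features = extract_features(sample_html, "http://foo.example/api")
```

Features like these would still only rank candidate pages; as the slide notes, a human has to confirm whether a page really describes a Web API.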


New Search Engine Prototype




Prototype – User Contributions

       Web API – yes/no: confirmation from a human needed!
       Other annotations that help improve the search for Web Services
             Categories
             Tags
             Natural Language descriptions
             Cost: Free or paid service




Problem - User Contribution

       Problem:
           Users/developers don’t contribute enough
           Hard to motivate them to provide annotations
           Community recognition or peer respect is not enough
       Solution: crowdsource the annotations by paying people to provide them
           Use Amazon Mechanical Turk
           Bootstrap annotations quickly and cheaply




Service Annotation Wizard (1/4)




Service Annotation Wizard (2/4)




Service Annotation Wizard (3/4)




Service Annotation Wizard (4/4)




Amazon Mechanical Turk – Iteration 1

                        Number of Submissions               70
                        Reward per task                    $0.10
                        Restrictions                        none

       Annotation Wizard
             Web API Yes/No
             Assign a category
             Assign tags
             Provide a natural language description
             Determine whether page is documentation, pricing or listing
             Rate the service


Amazon Mechanical Turk – Iteration 1

       Results
             21 APIs correctly identified as APIs
             28 Web documents (non-APIs) correctly identified as non-APIs
             49/70 correctly identified (70% accuracy)
             Average task completion time: 2:20 min
       But only:
           4 well-done, complete annotations
           8 acceptable (incomplete) annotations
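
The accuracy figure follows directly from the two counts of correct answers. The short sketch below reproduces it; the function name is hypothetical, and the share of usable annotations assumes that the 4 and 8 annotations are counted against the same 70 submissions, which the slide does not state explicitly.

```python
def accuracy(correct_apis, correct_non_apis, total):
    """Share of submissions whose yes/no answer matched the true page type."""
    return (correct_apis + correct_non_apis) / total


# Counts from Iteration 1.
iteration_1_accuracy = accuracy(correct_apis=21, correct_non_apis=28, total=70)

# Assumption: complete and acceptable annotations are out of all 70 submissions.
usable_share = (4 + 8) / 70

print(f"accuracy: {iteration_1_accuracy:.0%}")        # accuracy: 70%
print(f"usable annotations: {usable_share:.0%}")      # usable annotations: 17%
```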




Amazon Mechanical Turk – Iterations 2 & 3

                                                Iteration 2   Iteration 3
           Number of Submissions                   100           150
           Reward per task                        $0.20         $0.20
           Restrictions                            yes           yes
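
As a rough guide to what "cheap" means here, the reward budget per iteration can be estimated from the submission counts and rewards reported for the three iterations. This is a sketch under two assumptions that the slides do not confirm: every submission was paid the full reward, and Mechanical Turk fees are left out.

```python
from decimal import Decimal

# (number of submissions, reward per task in USD) as reported on the slides.
ITERATIONS = {
    "Iteration 1": (70, Decimal("0.10")),
    "Iteration 2": (100, Decimal("0.20")),
    "Iteration 3": (150, Decimal("0.20")),
}


def reward_budget(submissions, reward_per_task):
    """Upper bound on rewards paid, assuming every submission was accepted."""
    return submissions * reward_per_task


totals = {name: reward_budget(n, reward) for name, (n, reward) in ITERATIONS.items()}
overall = sum(totals.values(), Decimal("0"))

for name, total in totals.items():
    print(f"{name}: ${total}")
print(f"All iterations: ${overall}")
```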


       Annotation Wizard
           Removed page type identification & service rating
           For a task to be accepted:
              At least one category must be assigned
              At least 2 tags must be provided
              A meaningful description must be provided
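
The acceptance rules above could be checked automatically before a submission is reviewed. The following is a minimal sketch: the `Submission` type, the function name, and in particular the word-count threshold standing in for "meaningful" are assumptions made for illustration, since the slides do not say how meaningfulness was judged.

```python
from dataclasses import dataclass, field

# Hypothetical proxy for "meaningful": a minimum number of words.
MIN_DESCRIPTION_WORDS = 5


@dataclass
class Submission:
    categories: list = field(default_factory=list)
    tags: list = field(default_factory=list)
    description: str = ""


def rejection_reasons(submission):
    """Return the list of unmet acceptance rules (empty if acceptable)."""
    reasons = []
    if len(submission.categories) < 1:
        reasons.append("at least one category must be assigned")
    # Count distinct, non-empty tags so duplicates do not satisfy the rule.
    distinct_tags = {tag.strip().lower() for tag in submission.tags if tag.strip()}
    if len(distinct_tags) < 2:
        reasons.append("at least 2 tags must be provided")
    if len(submission.description.split()) < MIN_DESCRIPTION_WORDS:
        reasons.append("a meaningful description must be provided")
    return reasons


good = Submission(
    categories=["Mapping"],
    tags=["geocoding", "maps"],
    description="Converts street addresses into latitude and longitude pairs.",
)
bad = Submission(categories=[], tags=["maps", "Maps "], description="nice api")
```

A word count is a weak stand-in for quality, so a check like this can only filter out obviously empty submissions; a human still has to judge whether a description is actually meaningful.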


Amazon Mechanical Turk – Iterations 2 & 3

       Results for Iterations 2 & 3:
           Ca. 80% of documents correctly identified
           Very satisfactory annotations
           Average completion time: 2:36 min




Amazon Mechanical Turk – Survey

       48 survey submissions
           Female 18, Male 30
           Most popular origins: India (27) and USA (9)
           Popular age groups:
              15-22 (12)
              23-30 (18)
              31-50 (16)
           Most of them worked in some IT profession
              They provided the best-quality annotations




Amazon Mechanical Turk

       Recommendations for further improvement:
           Improve task description, especially ‘what is a Web API’
           Better examples (e.g., hinting what makes a false page false)
           Allow assignment of multiple categories

       Conclusion:
           Very positive results → good way to get quality annotations
           Results will help provide better search experience to users
           Results can be used as positive set for automatic classification



Questions?



