Crowd Sourcing Web Service Annotations




            James Scicluna¹, Christoph Blank¹, Nathalie Steinmetz¹ and Elena Simperl²
            ¹seekda GmbH, ²Karlsruhe Institute of Technology




© Copyright 2012 SEEKDA GmbH – www.seekda.com
Outline

       Introduction to seekda Web Service search engine

       Web API crawling & identification

       Amazon Mechanical Turk crowdsourcing

       Web Service Annotation wizard




seekda Web Service Search Engine




Why crawl for Web APIs?

       Significant growth of Web APIs
           > 5,400 Web APIs on ProgrammableWeb (including SOAP and
            REST APIs) [end of 2009: ca. 1,500 Web APIs]
           > 6,500 Mashups on ProgrammableWeb (combining Web APIs
            from one or more sources)
       SOAP services make up only a small part of the publicly
        available services




Web API Crawling

       Problem:
           Web APIs are described by regular HTML pages
           No standardized structure that helps with identification




Web API Identification

       Solution: Crawl for Web APIs
           Approach 1: Manual Feature Identification
              Takes into account HTML structure (e.g., title, markup), syntactic
               properties of the language used (e.g., camel-cased words), and link
               properties of pages (ratio of external to internal links)
           Approach 2: Automatic Classification
              Text classification with supervised learning (Support Vector Machine
               model)
              Training set: APIs from ProgrammableWeb


       But: human confirmation is still needed to be sure
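
The three feature families named under Approach 1 can be illustrated with a minimal sketch. This is not seekda's implementation: the class and function names, the camel-case regular expression, and the "title mentions API" check are all assumptions made for illustration, and only the Python standard library is used.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumed definition of a camel-cased word: lowercase start, then capitalised parts.
CAMEL_CASE = re.compile(r"\b[a-z]+(?:[A-Z][a-z0-9]+)+\b")


class PageFeatures(HTMLParser):
    """Collects the title, visible text and link counts of one HTML page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.host = urlparse(page_url).netloc
        self.title = ""
        self.text_parts = []
        self.internal_links = 0
        self.external_links = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if not href:
                return
            target = urlparse(urljoin(self.page_url, href))
            if target.scheme not in ("http", "https"):
                return  # ignore mailto:, javascript:, etc.
            if target.netloc == self.host:
                self.internal_links += 1
            else:
                self.external_links += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        self.text_parts.append(data)


def extract_features(html, page_url):
    """Return hypothetical hand-crafted features for one candidate page."""
    parser = PageFeatures(page_url)
    parser.feed(html)
    parser.close()
    text = " ".join(parser.text_parts)
    return {
        "title_mentions_api": "api" in parser.title.lower(),
        "camel_case_words": len(CAMEL_CASE.findall(text)),
        # Guard against division by zero on pages without internal links.
        "external_to_internal_ratio":
            parser.external_links / max(parser.internal_links, 1),
    }


sample = (
    "<html><head><title>Example API Reference</title></head><body>"
    "<p>Call getUserName or listAllItems.</p>"
    '<a href="/docs">Docs</a>'
    '<a href="https://other.example.org/x">Other</a>'
    '<a href="index.html">Home</a>'
    "</body></html>"
)
print(extract_features(sample, "http://api.example.com/"))
```

How such features would be weighted or thresholded is not stated on the slide, so the sketch stops at extraction.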


New Search Engine Prototype




Prototype – User Contributions

       Web API – yes/no: human confirmation
        needed!
       Other annotations that help improve
        the search for Web Services
             Categories
             Tags
             Natural Language descriptions
             Cost: Free or paid service




Problem - User Contribution

       Problem:
           Users/developers don’t contribute enough
           Hard to motivate them to provide annotations
           Community recognition or peer respect not enough
       Solution: crowdsource the annotations, i.e., pay people to
        provide them
           Use Amazon Mechanical Turk
           Bootstrap annotations quickly and cheaply




Service Annotation Wizard (1/4)




Service Annotation Wizard (2/4)




Service Annotation Wizard (3/4)




Service Annotation Wizard (4/4)




Amazon Mechanical Turk – Iteration 1

                        Number of Submissions               70
                        Reward per task                    $0.10
                        Restrictions                        none

       Annotation Wizard
             Web API Yes/No
             Assign a category
             Assign tags
             Provide a natural language description
             Determine whether page is documentation, pricing or listing
             Rate the service


Amazon Mechanical Turk – Iteration 1

       Results
             21 APIs correctly identified as APIs
             28 Web documents (non-APIs) correctly identified as non-APIs
             49/70 correctly identified (70% accuracy)
             Average task completion time: 2:20 min
       But only:
           4 well-done and complete annotations
           8 acceptable (incomplete) annotations
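
The 70% figure follows directly from the two counts of correct judgements. A small sketch of that arithmetic, with a hypothetical helper name:

```python
def accuracy(correct_apis, correct_non_apis, total):
    """Share of submissions whose Web API yes/no answer was correct."""
    if total <= 0:
        raise ValueError("total must be positive")
    return (correct_apis + correct_non_apis) / total


# Iteration 1 counts taken from the slide: 21 + 28 correct out of 70.
iteration_1 = accuracy(21, 28, 70)
print(f"{iteration_1:.0%}")  # prints "70%"
```

The slide does not say how many of the 70 documents were actual APIs, so per-class measures such as precision and recall cannot be derived from these numbers alone.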




Amazon Mechanical Turk – Iterations 2 & 3

                                                Iteration 2   Iteration 3
           Number of Submissions                   100           150
           Reward per task                        $0.20         $0.20
           Restrictions                            yes           yes
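
For a rough sense of scale, the reward budget per iteration can be computed from the numbers in the tables. This sketch assumes that every submission was accepted and paid, and it ignores any Amazon Mechanical Turk platform fee; neither point is stated on the slides.

```python
from decimal import Decimal

# (number of submissions, reward per task in USD), as given on the slides.
ITERATIONS = {
    1: (70, Decimal("0.10")),
    2: (100, Decimal("0.20")),
    3: (150, Decimal("0.20")),
}


def reward_total(submissions, reward):
    """Upper bound on rewards paid: assumes every submission is accepted."""
    return submissions * reward


totals = {n: reward_total(*params) for n, params in ITERATIONS.items()}
for n, total in totals.items():
    print(f"Iteration {n}: ${total}")
print(f"All iterations: ${sum(totals.values())}")
```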


       Annotation Wizard
           Removed page type identification & service rating
           For a task to be accepted:
              At least one category must be assigned
              At least 2 tags must be provided
              A meaningful description must be provided
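
The acceptance rules above could be checked automatically along the following lines. The `Annotation` type and all names are hypothetical, and "meaningful description" is approximated by a minimum word count, which is an assumption: the slides do not say how meaningfulness was judged.

```python
from dataclasses import dataclass, field

# Assumption: a description with fewer words than this is not "meaningful".
MIN_DESCRIPTION_WORDS = 5


@dataclass
class Annotation:
    is_web_api: bool
    categories: list = field(default_factory=list)
    tags: list = field(default_factory=list)
    description: str = ""


def rejection_reasons(annotation):
    """Return the list of violated acceptance rules (empty means accept)."""
    reasons = []
    if len(annotation.categories) < 1:
        reasons.append("at least one category must be assigned")
    # Count distinct, non-empty tags so duplicates do not satisfy the rule.
    distinct_tags = {t.strip().lower() for t in annotation.tags if t.strip()}
    if len(distinct_tags) < 2:
        reasons.append("at least 2 tags must be provided")
    if len(annotation.description.split()) < MIN_DESCRIPTION_WORDS:
        reasons.append("a meaningful description must be provided")
    return reasons


good = Annotation(
    is_web_api=True,
    categories=["Mapping"],
    tags=["geocoding", "REST"],
    description="Converts street addresses into geographic coordinates.",
)
bad = Annotation(is_web_api=True, tags=["api", "API "], description="nice")
print(rejection_reasons(good))
print(rejection_reasons(bad))
```

A word-count check only filters out the most obviously empty descriptions; a human review step would still be needed to judge quality.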


Amazon Mechanical Turk – Iterations 2 & 3

       Results for Iterations 2 & 3:
           Ca. 80% of documents correctly identified
           Very satisfactory annotations
           Average completion time: 2:36 min




Amazon Mechanical Turk – Survey

       48 survey submissions
           Female 18, Male 30
           Most popular origins: India (27) and USA (9)
           Popular age groups:
              15-22 (12)
              23-30 (18)
              31-50 (16)
           Most of them worked in an IT profession
              They provided the best-quality annotations




Amazon Mechanical Turk

       Recommendations for further improvement:
           Improve task description, especially ‘what is a Web API’
           Better examples (e.g., hinting at why a negative example is not a Web API)
           Allow assignment of multiple categories

       Conclusion:
           Very positive results: a good way to get quality annotations
           Results will help provide a better search experience to users
           Results can be used as a positive set for automatic classification



Questions?





Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 

Crowd Sourcing Web Service Annotations

  • 1. Crowd Sourcing Web Service Annotations. James Scicluna¹, Christoph Blank¹, Nathalie Steinmetz¹ and Elena Simperl² (¹seekda GmbH, ²Karlsruhe Institute of Technology). © Copyright 2012 SEEKDA GmbH – www.seekda.com
  • 2. Outline: Introduction to seekda Web Service search engine; Web API crawling & identification; Amazon Mechanical Turk crowdsourcing; Web Service Annotation wizard.
  • 3. seekda Web Service Search Engine
  • 4. seekda Web Service Search Engine
  • 5. Why crawl for Web APIs? Significant growth of Web APIs: more than 5,400 Web APIs on ProgrammableWeb (including SOAP and REST APIs), compared with ca. 1,500 at the end of 2009; more than 6,500 mashups on ProgrammableWeb (combining Web APIs from one or more sources). SOAP services are only a small part of the overall available public services.
  • 6. Web API Crawling. Problem: Web APIs are described by regular HTML pages; there is no standardized structure that helps with the identification.
  • 7. Web API Identification. Solution: crawl for Web APIs. Approach 1, Manual Feature Identification: taking into account HTML structure (e.g., title, mark-up), syntactical properties of the language used (e.g., camel-cased words), and link properties of pages (ratio of external links to internal links). Approach 2, Automatic Classification: text classification with supervised learning (Support Vector Machine model); training set: APIs from ProgrammableWeb. But: human confirmation was still needed to be sure.
  • 8. New Search Engine Prototype
  • 9. Prototype – User Contributions. Web API yes/no: confirmation from a human needed! Other annotations that help improve the search for Web Services: categories; tags; natural language descriptions; cost (free or paid service).
  • 10. Problem – User Contribution. Problem: users/developers don't contribute enough; it is hard to motivate them to provide annotations; community recognition or peer respect is not enough. Solution: crowdsource the annotations, i.e. pay people to provide them; use Amazon Mechanical Turk; bootstrap annotations quickly and cheaply.
  • 11. Service Annotation Wizard (1/4)
  • 12. Service Annotation Wizard (2/4)
  • 13. Service Annotation Wizard (3/4)
  • 14. Service Annotation Wizard (4/4)
  • 15. Amazon Mechanical Turk – Iteration 1. Number of submissions: 70; reward per task: $0.10; restrictions: none. Annotation Wizard tasks: Web API yes/no; assign a category; assign tags; provide a natural language description; determine whether the page is documentation, pricing or listing; rate the service.
  • 16. Amazon Mechanical Turk – Iteration 1. Results: 21 APIs correctly identified as APIs; 28 Web documents (non-APIs) correctly identified as non-APIs; 49/70 correctly identified (70% accuracy); average task completion time: 2:20 min. But only 4 annotations were well done and complete, and 8 were acceptable (incomplete).
  • 17. Amazon Mechanical Turk – Iterations 2 & 3. Iteration 2: 100 submissions, $0.20 reward per task, restrictions: yes. Iteration 3: 150 submissions, $0.20 reward per task, restrictions: yes. Annotation Wizard: removed page type identification & service rating. For a task to be accepted: at least one category must be assigned; at least 2 tags must be provided; a meaningful description must be provided.
  • 18. Amazon Mechanical Turk – Iterations 2 & 3. Results: ca. 80% of documents correctly identified; very satisfying annotations; average completion time: 2:36 min.
  • 19. Amazon Mechanical Turk – Survey. 48 survey submissions: female 18, male 30. Most popular origins: India (27) and USA (9). Popular age groups: 15-22 (12), 23-30 (18), 31-50 (16). Most of them worked in some IT profession and provided the best-quality annotations.
  • 20. Amazon Mechanical Turk. Recommendations for further improvement: improve the task description, especially 'what is a Web API'; better examples (e.g., hinting what makes a false page false); allow assignment of multiple categories. Conclusion: very positive results, a good way to get quality annotations; results will help provide a better search experience to users; results can be used as a positive set for automatic classification.
  • 21. Questions?
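
The manual feature identification approach on slide 7 names three kinds of signals: HTML structure (such as the title), camel-cased words, and the ratio of external to internal links. A minimal sketch of how such features might be extracted is given below, using only the Python standard library. This is not seekda's implementation: the class `PageFeatures`, the function `extract_features`, the regular expression (which only catches lowerCamelCase words), the sample page, and the choice to divide by at least one internal link are all illustrative assumptions.

```python
import re
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumption: a "camel-cased word" starts lowercase and has at least one
# inner capital, e.g. getUserInfo. UpperCamelCase words are not counted.
CAMEL_CASE = re.compile(r"\b[a-z]+(?:[A-Z][a-z0-9]+)+\b")


class PageFeatures(HTMLParser):
    """Collects the title, visible words, and link counts of one HTML page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.page_host = urlparse(page_url).netloc
        self.title = ""
        self.words = []
        self.internal_links = 0
        self.external_links = 0
        self._in_title = False
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links, then compare hosts.
                host = urlparse(urljoin(self.page_url, href)).netloc
                if host == self.page_host:
                    self.internal_links += 1
                else:
                    self.external_links += 1

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return  # ignore script and stylesheet text
        if self._in_title:
            self.title += data
        self.words.extend(data.split())


def extract_features(html, page_url):
    """Return the three illustrative features for one page."""
    parser = PageFeatures(page_url)
    parser.feed(html)
    parser.close()
    camel = sum(1 for word in parser.words if CAMEL_CASE.search(word))
    total = len(parser.words)
    return {
        "title": parser.title.strip(),
        "camel_case_ratio": camel / total if total else 0.0,
        # Assumption: avoid division by zero by treating 0 internal links as 1.
        "link_ratio": parser.external_links / max(parser.internal_links, 1),
    }


# Hypothetical documentation page used only for illustration.
SAMPLE = """<html><head><title>Example API Reference</title></head>
<body><p>Call getUserInfo or listOrders to fetch data.</p>
<a href="/docs/auth">Auth</a>
<a href="https://example.com/docs/errors">Errors</a>
<a href="https://other.example.org/">Partner</a>
</body></html>"""

if __name__ == "__main__":
    print(extract_features(SAMPLE, "https://example.com/docs/"))
```

Feature values like these could feed hand-written rules or serve as inputs to a classifier. Either way, as the slides note, human confirmation was still needed to decide whether a page really describes a Web API.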