When is a Clone not a Clone?
(and vice-versa)

Contextualized Analysis of Web Services
Douglas Martin
Scott Grant

James R. Cordy
David B. Skillicorn

School of Computing

Kingston, Canada
Motivation
—  The Personal Web
—  Rapidly growing number of web services makes it
increasingly difficult to find and choose the right ones

—  Need a quick and convenient
way to find alternatives

—  Hand tagging impractical –
automation is needed!
Motivation
—  Automation
—  Similarity detection techniques offer solutions!
—  Code clone detection from software engineering research
can find similar code fragments –
why not similar services?

—  Topic models from data mining research
can find text documents with similar semantics –
why not similar services?
Web Service Similarity
—  Web services are stored in
service registries, containing
WSDL service description files

—  Could apply clone detection to
entire service descriptions

—  But what we really want are
similar service operations
Let’s try it!
<operation name="GetStock">
  <input message="tns:GetStockRequest" />
  <output message="tns:GetStockResponse" />
</operation>
<complexType name="Stock">
  <sequence>
    <element name="Supplier" type="xsd:string"/>
    <element name="Warehouse" type="xsd:string"/>
    <element name="OnHand" type="xsd:string"/>
    <element name="OnOrder" type="xsd:string"/>
    <element name="Demand" type="xsd:string"/>
  </sequence>
</complexType>

<operation name="GetStock">
  <input message="tns:GetStockRequest" />
  <output message="tns:GetStockResponse" />
</operation>
<complexType name="Stock">
  <sequence>
    <element name="date" type="xsd:string"/>
    <element name="open" type="xsd:float"/>
    <element name="high" type="xsd:float"/>
    <element name="low" type="xsd:float"/>
    <element name="close" type="xsd:float"/>
    <element name="volume" type="xsd:float"/>
  </sequence>
</complexType>
How about these?
<operation name="DrawRateChartCustom">
<input message="DrawRateChartCustomIn"/>
<output message="DrawRateChartCustomOut"/>
</operation>

<operation name="GetTopicBinaryChartCustom">
<input message="GetTopicBinaryChartCustomSoapIn"/>
<output message="GetTopicBinaryChartCustomSoapOut"/>
</operation>
So what went wrong?
—  At this point we thought maybe our idea wasn’t
going to work

—  Maybe clone detection can’t help with
web service discovery?

—  But why? What’s so special about WSDL?
Web Service Description
Language (WSDL)
—  A WSDL service description has
3 main parts:

—  a <portType> element where the
operations are declared;
—  <message> elements
corresponding to inputs, outputs
and faults of the operations;
—  and a <types> element
containing an XML Schema that
defines the data and structure
types used in the messages
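The three parts can be seen by loading a description with a standard XML parser. The sketch below is only an illustration: the miniature hotel service, its namespace URI and every name in it are invented for this example, and it assumes WSDL 1.1 namespaces.

```python
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical miniature service description, written for this sketch only.
SAMPLE_WSDL = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:tns="urn:example:hotel">
  <types>
    <xsd:schema>
      <xsd:element name="ReserveRoomRequest" type="xsd:string"/>
      <xsd:element name="ReserveRoomResponse" type="xsd:string"/>
    </xsd:schema>
  </types>
  <message name="ReserveRoomIn">
    <part name="body" element="tns:ReserveRoomRequest"/>
  </message>
  <message name="ReserveRoomOut">
    <part name="body" element="tns:ReserveRoomResponse"/>
  </message>
  <portType name="HotelPort">
    <operation name="ReserveRoom">
      <input message="tns:ReserveRoomIn"/>
      <output message="tns:ReserveRoomOut"/>
    </operation>
  </portType>
</definitions>
"""


def summarize_wsdl(text):
    """List the names found in each of the three main parts of a WSDL file."""
    root = ET.fromstring(text)
    operations = root.findall(f"{{{WSDL_NS}}}portType/{{{WSDL_NS}}}operation")
    messages = root.findall(f"{{{WSDL_NS}}}message")
    elements = root.findall(
        f"{{{WSDL_NS}}}types/{{{XSD_NS}}}schema/{{{XSD_NS}}}element"
    )
    return {
        "operations": [op.get("name") for op in operations],
        "messages": [m.get("name") for m in messages],
        "types": [e.get("name") for e in elements],
    }


summary = summarize_wsdl(SAMPLE_WSDL)
print(summary)
```

Note that the operation itself only carries references (`tns:ReserveRoomIn`, `tns:ReserveRoomOut`); what those names mean is defined elsewhere in the file.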
Web Service Description
Language (WSDL)
—  This simple example service
has two operations:
—  ReserveRoom
—  GetAvailableRooms
Web Service Description
Language (WSDL)
—  WSDL service description files contain descriptions
of the operations that a web service has to offer

—  But the pieces of each operation’s own description
are scattered over different parts of the WSDL file

—  Difficult to identify complete units to analyze and
compare
The Problem
—  This poses a problem for analysis techniques:
—  Operations cannot easily be compared for similarity
using clone detectors, because there are no
contiguous fragments to compare

—  And they cannot be analyzed using data mining topic
models, because there are no separate complete
documents to generate a model from
Our Solution
—  Our solution is to contextualize the original
<operation> elements, to create self-contained
operation descriptions
—  We use source transformation to inline remote
information from the context into the elements
that reference or depend on it

—  We call these contextualized WSDL operations
Web Service Cells, or WSCells
—  The first example of a new kind of clone detection:
contextual clones
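The inlining idea can be sketched in a few lines. This is not the authors' source transformation, only a minimal approximation that assumes WSDL 1.1 namespaces, message parts that refer to top-level schema elements, and a made-up stock service; the function name `contextualize` and the sample document are hypothetical.

```python
import copy
import xml.etree.ElementTree as ET

WSDL = "http://schemas.xmlsoap.org/wsdl/"
XSD = "http://www.w3.org/2001/XMLSchema"

# Hypothetical service description, invented for this sketch.
SAMPLE = """
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:tns="urn:example:stock">
  <types>
    <xsd:schema>
      <xsd:element name="Symbol" type="xsd:string"/>
      <xsd:element name="Stock">
        <xsd:complexType>
          <xsd:sequence>
            <xsd:element name="open" type="xsd:float"/>
            <xsd:element name="close" type="xsd:float"/>
          </xsd:sequence>
        </xsd:complexType>
      </xsd:element>
    </xsd:schema>
  </types>
  <message name="GetStockRequest"><part name="in" element="tns:Symbol"/></message>
  <message name="GetStockResponse"><part name="out" element="tns:Stock"/></message>
  <portType name="StockPort">
    <operation name="GetStock">
      <input message="tns:GetStockRequest"/>
      <output message="tns:GetStockResponse"/>
    </operation>
  </portType>
</definitions>
"""


def local_name(qname):
    """Drop a namespace prefix such as 'tns:' from a reference."""
    return qname.split(":")[-1]


def contextualize(wsdl_text):
    """Return {operation name: operation element with its context inlined}."""
    root = ET.fromstring(wsdl_text)
    messages = {m.get("name"): m for m in root.findall(f"{{{WSDL}}}message")}
    elements = {
        e.get("name"): e
        for e in root.findall(f"{{{WSDL}}}types/{{{XSD}}}schema/{{{XSD}}}element")
    }
    cells = {}
    for op in root.findall(f"{{{WSDL}}}portType/{{{WSDL}}}operation"):
        cell = copy.deepcopy(op)  # leave the original document untouched
        for io in list(cell):  # <input>, <output> and <fault> children
            message = messages.get(local_name(io.get("message", "")))
            if message is None:
                continue
            for part in message.findall(f"{{{WSDL}}}part"):
                inlined_part = copy.deepcopy(part)
                declaration = elements.get(local_name(part.get("element", "")))
                if declaration is not None:
                    # Pull the referenced schema declaration into the part.
                    inlined_part.append(copy.deepcopy(declaration))
                io.append(inlined_part)
        cells[op.get("name")] = cell
    return cells


cells = contextualize(SAMPLE)
inlined_names = [e.get("name") for e in cells["GetStock"].iter(f"{{{XSD}}}element")]
print(ET.tostring(cells["GetStock"], encoding="unicode"))
```

After this step each operation is one contiguous fragment that carries its message parts and data types with it, which is what a clone detector or topic model needs as input.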
Contextualizing WSDL
Operations
Contextual Clone Detection
An Experiment
—  We have run an experiment to investigate the
difference between clone detection on WSCells
and original raw operations

—  Two sets of WSDL service description files:
1,100 operations and 7,500 operations

—  Compared NICAD clone detector results for each
set at various near-miss difference thresholds:
0% = exact clone,
10% = 1 line in 10 different, and so on
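To make the threshold concrete, here is a rough line-based difference measure. It is an assumption-laden stand-in, not NICAD's actual comparison: it uses Python's `difflib` to count matching lines and divides the unmatched lines by the length of the longer fragment. The function names are hypothetical.

```python
from difflib import SequenceMatcher


def line_difference(fragment_a, fragment_b):
    """Fraction of lines (of the longer fragment) that have no match."""
    a = [line.strip() for line in fragment_a.splitlines() if line.strip()]
    b = [line.strip() for line in fragment_b.splitlines() if line.strip()]
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return (longest - matched) / longest


def is_clone_pair(fragment_a, fragment_b, threshold):
    """True when the two fragments differ by at most `threshold`."""
    return line_difference(fragment_a, fragment_b) <= threshold


# Ten lines, one of which differs: a 10% near-miss.
original = "\n".join(f"line {i}" for i in range(10))
edited = original.replace("line 4", "changed line")

print(line_difference(original, edited))
```

With these two fragments the pair is rejected at a threshold of 0.0 and accepted at 0.1.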
An Experiment
—  Number of clones decreases with WSCells
Clone pairs

Difference   Set 1       Set 1     Set 2       Set 2
threshold    Originals   WSCells   Originals   WSCells
0.0          852         705       1434        1066
0.1          852         734       1434        1228
0.2          879         775       1438        1637
0.3          884         813       1469        1637

<operation name="GetStock">
  <input message="tns:GetStockRequest" />
  <output message="tns:GetStockResponse" />
</operation>
<complexType name="Stock">
  <sequence>
    <element name="Supplier" type="xsd:string"/>
    <element name="Warehouse" type="xsd:string"/>
    <element name="OnHand" type="xsd:string"/>
    <element name="OnOrder" type="xsd:string"/>
    <element name="Demand" type="xsd:string"/>
  </sequence>
</complexType>

<operation name="GetStock">
  <input message="tns:GetStockRequest" />
  <output message="tns:GetStockResponse" />
</operation>
<complexType name="Stock">
  <sequence>
    <element name="date" type="xsd:string"/>
    <element name="open" type="xsd:float"/>
    <element name="high" type="xsd:float"/>
    <element name="low" type="xsd:float"/>
    <element name="close" type="xsd:float"/>
    <element name="volume" type="xsd:float"/>
  </sequence>
</complexType>

—  Reduction in false positives
An Experiment
—  Number of clone classes can increase with WSCells
Clone classes

Difference   Set 1       Set 1     Set 2       Set 2
threshold    Originals   WSCells   Originals   WSCells
0.0          169         187       587         433
0.1          169         139       587         499
0.2          172         142       589         631
0.3          171         136       591         631


<operation name="GetStock">
  <input message="tns:GetStockRequest" />
  <output message="tns:GetStockResponse" />
</operation>
<complexType name="Stock">
  <sequence>
    <element name="Supplier" type="xsd:string"/>
    <element name="Warehouse" type="xsd:string"/>
    <element name="OnHand" type="xsd:string"/>
    <element name="OnOrder" type="xsd:string"/>
    <element name="Demand" type="xsd:string"/>
  </sequence>
</complexType>

<operation name="GetStock">
  <input message="tns:GetStockRequest" />
  <output message="tns:GetStockResponse" />
</operation>
<complexType name="Stock">
  <sequence>
    <element name="date" type="xsd:string"/>
    <element name="open" type="xsd:float"/>
    <element name="high" type="xsd:float"/>
    <element name="low" type="xsd:float"/>
    <element name="close" type="xsd:float"/>
    <element name="volume" type="xsd:float"/>
  </sequence>
</complexType>

—  Splits by deeper differences –
more precision
Clone Detection for
Web Services
—  Contextual clone detection with WSCells works!
—  Not only finds similar web service operations,
but uncovers similar operations we could not find
in any other way
<operation name="DrawRateChartCustom">
<input message="DrawRateChartCustomIn"/>
<output message="DrawRateChartCustomOut"/>
</operation>
<operation name="GetRealChartCustom">
<input message="GetRealChartCustomSoapIn"/>
<output message="GetRealChartCustomSoapOut"/>
</operation>
<operation name="GetLastSaleChartCustom">
<input message="GetLastSaleChartCustomSoapIn"/>
<output message="GetLastSaleChartCustomSoapOut"/>
</operation>
<operation name="DrawYieldCurveCustom">
<input message="DrawYieldCurveCustomIn"/>
<output message="DrawYieldCurveCustomOut"/>
</operation>
<operation name="GetTopicChartCustom">
<input message="GetTopicChartCustomSoapIn" />
<output message="GetTopicChartCustomSoapOut" />
</operation>
<operation name="GetTopicBinaryChartCustom">
<input message="GetTopicBinaryChartCustomSoapIn"/>
<output message="GetTopicBinaryChartCustomSoapOut"/>
</operation>
Semantic Analysis of
Web Services
—  Contextualized WSCells also make it possible to use
data mining topic models to do semantic analysis
of web services
—  Because they provide self-contained documents of
significant size

—  Might topic models provide a different view
of web service similarity?
Latent Dirichlet Allocation
—  Latent Dirichlet Allocation (LDA):
—  A statistical model to uncover latent topics
—  Identifies the correlation between documents in
terms of shared latent topics (sets of tokens)

—  Accepts a set of documents (e.g., source files) as
input, returns probability distributions over inferred
topics (a topic model) as output
—  Each document has some probability of being related
to topic 1, another probability for topic 2, and so on
—  Similar documents should be related to similar topics
Latent Dirichlet Allocation
—  Documents are represented in the model in terms
of probability distributions over topics

—  Similarity between documents is found using the
Hellinger Distance
—  A measure of how much agreement there is between
the shared topics of two documents
—  Almost identical documents have a small Hellinger
Distance since they will be related to the same topics
—  In terms of web services, small Hellinger Distances
indicate highly related operations
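For two discrete topic distributions P and Q, the Hellinger distance is the Euclidean distance between their element-wise square roots, scaled by 1/sqrt(2) so that it lies between 0 (identical) and 1 (no shared topics). A minimal sketch follows; the three-topic distributions are invented for illustration and do not come from the experiment.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two discrete probability distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same topics")
    total = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(total) / math.sqrt(2)


# Hypothetical document-topic distributions over three topics.
weather_a = [0.80, 0.10, 0.10]
weather_b = [0.70, 0.20, 0.10]
rates = [0.05, 0.05, 0.90]

print(hellinger_distance(weather_a, weather_b))
print(hellinger_distance(weather_a, rates))
```

The two weather-like distributions are much closer to each other than either is to the rates-like one, which is the behaviour the slide describes.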
Evaluating WSCells
—  To evaluate the use of WSCells with LDA, we:
—  Generate an LDA model for the original <operation>
elements, and another for the contextualized WSCells
—  Explore the Global and Local Similarity between each
pair of operations in the models

—  Global Similarity: an overall view of the most closely
related web service operations in the service set

—  Local Similarity: a per-operation view of the other
most related web service operations for each
operation
Global Similarity
—  We look at Global Similarity using a visualization
called Bluevis

—  Bluevis shows the global conceptual structure of a
system by highlighting similar operations using an
illuminated line from left-to-right
—  Plot some top fraction of similar operations
(top 25,000 in our examples)
—  Use a consistently ordered list of web service
operations for the LDA model to view the differences
—  If a display is noisy, it is often an indication that the
model is not identifying meaningful data
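Selecting the "top fraction of similar operations" amounts to ranking all operation pairs by distance and keeping the closest ones. The sketch below assumes Hellinger distance as the score and uses invented operation names and topic distributions; it says nothing about how Bluevis itself is implemented.

```python
import heapq
import math
from itertools import combinations


def hellinger(p, q):
    """Hellinger distance between two discrete distributions."""
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / 2)


def most_similar_pairs(topics_by_operation, k):
    """Return the k closest (distance, name_a, name_b) triples."""
    names = sorted(topics_by_operation)  # consistent ordering of operations
    scored = (
        (hellinger(topics_by_operation[a], topics_by_operation[b]), a, b)
        for a, b in combinations(names, 2)
    )
    return heapq.nsmallest(k, scored)


# Hypothetical operations and topic distributions, for illustration only.
topics = {
    "WeatherNow": [0.80, 0.10, 0.10],
    "WeatherForecast": [0.75, 0.15, 0.10],
    "RateLookup": [0.05, 0.05, 0.90],
    "RateHistory": [0.10, 0.05, 0.85],
}

top_pairs = most_similar_pairs(topics, k=2)
for distance, a, b in top_pairs:
    print(f"{a} ~ {b}: {distance:.3f}")
```

In the talk the cut-off is the top 25,000 pairs; here `k=2` keeps the two within-domain pairs and drops the four cross-domain ones.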
Global Similarity
—  For original raw operations:
—  Bluevis highlights the LDA
most similar operations
—  Some clear structure
—  However, most of this is
due to shared keywords,
like get and SOAP

—  This uncontextualized

model has very little value
Global Similarity
—  For contextualized WSCells:
—  A clearer semantic
structure, less noise overall
—  Operation similarity
becomes meaningful

—  Services with semantic
similarity discovered
—  E.g., operations with
similar parameters or
faults, such as those that
manipulate holiday dates
or financial rates
Local Similarity
—  We can also examine the local similarity for each
individual operation
—  Identify the complete ordered list of similarity scores
for an operation in the data set

—  Using the top similarity scores, evaluate how
meaningful the data is from a user's perspective
—  For example, how can I find the most similar web
service operations to the one I am using now?

—  We use a tool called POCO (Pairwise Observation of
Concepts) to examine the most similar operations
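The per-operation view is a nearest-neighbour query: fix one operation, score every other operation against it, and sort. The sketch below is not POCO; it assumes Hellinger distance over topic distributions, and all names and numbers are hypothetical.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete distributions."""
    total = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(total / 2)


def rank_neighbours(target, topics_by_operation, top_n=3):
    """Names of the operations closest to `target`, nearest first."""
    target_topics = topics_by_operation[target]
    scored = [
        (hellinger(target_topics, other_topics), name)
        for name, other_topics in topics_by_operation.items()
        if name != target
    ]
    return [name for _, name in sorted(scored)[:top_n]]


# Hypothetical operations and topic distributions, for illustration only.
topics = {
    "WeatherNow": [0.80, 0.10, 0.10],
    "WeatherForecast": [0.75, 0.15, 0.10],
    "RateLookup": [0.05, 0.05, 0.90],
    "RateHistory": [0.10, 0.05, 0.85],
}

neighbours = rank_neighbours("WeatherNow", topics)
print(neighbours)
```

The first entry of the returned list answers the user's question on the slide: the operation most similar to the one currently in use.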
Local Similarity
Operation                  Most similar WSCell             Most similar original raw WSDL operation
ListFinancials             GetFinancialServicesFromList    LanguagesList
ExportShipsAndCategories   ExportIteneraryAndSteps         Search
GetIssueData               GetFlightData                   word_cloud
GetWeatherReport           GetWeather                      GetIndices
GetAIDIBOR                 GetTRLIBOR                      GetCarriers
searchByIdentifier         searchByNameAndAddress          GetLastSecurityHeadlines
ToolsAndHardwareBox        KitchenAndHousewareBox          ListRenditions
GetReservations            GetRoomAvailabilityForDay       GetSOFIBOR
GetOtherProductInfo        NextOtherProductPortion         GetParkingInfo
GetAllSplitsByExchange     GetAllCashDividendsByExchange   GetTeamLoyalties2
Summary
—  Very-high-level domain-specific languages such as
WSDL make poor targets for similarity analysis
using clone detection and topic models
—  Lack of local context prevents meaningful results

—  Contextualizing using WSCells exposes both cloning
and semantic relationships between web operations
—  Clone detection of WSCells identifies similar web
service operations
—  Topic models of WSCells expose both global
system-wide semantic relationships and local
individual relationships between operations
Current & Future
—  Continue analysis of web services for the Personal
Web using our results

—  Apply contextualization to similarity analysis of
other modeling and specification languages
(currently Simulink, Stateflow and UML sequence
diagrams)

—  Experiment with effect of contextualization on
clone and topic model analysis of
traditional languages such as Java and C
(“contextual clones”)
Questions?

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdfRising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
Rising Above_ Dubai Floods and the Fortitude of Dubai International Airport.pdf
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
ICT role in 21st century education and its challenges
ICT role in 21st century education and its challengesICT role in 21st century education and its challenges
ICT role in 21st century education and its challenges
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 

130919 jim cordy - when is a clone not a clone

  • 1. When is a Clone not a Clone? (and vice-versa)
      Contextualized Analysis of Web Services
      Douglas Martin, Scott Grant, James R. Cordy, David B. Skillicorn
      School of Computing, Kingston, Canada
  • 2. Motivation
      - The Personal Web
      - The rapidly growing number of web services makes it increasingly difficult to find and choose the right ones
      - Need a quick and convenient way to find alternatives
      - Hand tagging is impractical; automation is needed!
  • 3. Motivation
      - Automation
      - Similarity detection techniques offer solutions!
      - Code clone detection from software engineering research can find similar code fragments. Why not similar services?
      - Topic models from data mining research can find text documents with similar semantics. Why not similar services?
  • 4. Web Service Similarity
      - Web services are stored in service registries, containing WSDL service description files
      - Could apply clone detection to entire service descriptions
      - But what we really want are similar service operations
  • 5. Let’s try it!
      First example:
        <operation name="GetStock">
          <input message="tns:GetStockRequest"/>
          <output message="tns:GetStockResponse"/>
        </operation>
        <complexType name="Stock">
          <sequence>
            <element name="Supplier" type="xsd:string"/>
            <element name="Warehouse" type="xsd:string"/>
            <element name="OnHand" type="xsd:string"/>
            <element name="OnOrder" type="xsd:string"/>
            <element name="Demand" type="xsd:string"/>
          </sequence>
        </complexType>
      Second example:
        <operation name="GetStock">
          <input message="tns:GetStockRequest"/>
          <output message="tns:GetStockResponse"/>
        </operation>
        <complexType name="Stock">
          <sequence>
            <element name="date" type="xsd:string"/>
            <element name="open" type="xsd:float"/>
            <element name="high" type="xsd:float"/>
            <element name="low" type="xsd:float"/>
            <element name="close" type="xsd:float"/>
            <element name="volume" type="xsd:float"/>
          </sequence>
        </complexType>
  • 6. How about these?
        <operation name="DrawRateChartCustom">
          <input message="DrawRateChartCustomIn"/>
          <output message="DrawRateChartCustomOut"/>
        </operation>
        <operation name="GetTopicBinaryChartCustom">
          <input message="GetTopicBinaryChartCustomSoapIn"/>
          <output message="GetTopicBinaryChartCustomSoapOut"/>
        </operation>
  • 7. So what went wrong?
      - At this point we thought maybe our idea wasn’t going to work
      - Maybe clone detection can’t help with web service discovery?
      - But why? What’s so special about WSDL?
  • 11. Web Service Description Language (WSDL)
      - A WSDL service description has 3 main parts:
      - a <portType> element where the operations are declared;
      - <message> elements corresponding to inputs, outputs and faults of the operations;
      - and a <types> element containing an XML Schema that defines the data and structure types used in the messages
  • 14. Web Service Description Language (WSDL)
      - This simple example service has two operations:
      - ReserveRoom
      - GetAvailableRooms
  • 15. Web Service Description Language (WSDL)
      - WSDL service description files contain descriptions of the operations that a web service has to offer
      - But the pieces of each operation’s own description are scattered over different parts of the WSDL file
      - Difficult to identify complete units to analyze and compare
  • 16. The Problem
      - This poses a problem for analysis techniques:
      - Operations cannot easily be compared for similarity using clone detectors, because there are no contiguous fragments to compare
      - And they cannot be analyzed using data mining topic models, because there are no separate complete documents to generate a model from
  • 17. Our Solution
      - Our solution is to contextualize the original <operation> elements, to create self-contained operation descriptions
      - We use source transformation to inline remote information from the context into the elements that reference or depend on it
      - We call these contextualized WSDL operations Web Service Cells, or WSCells
      - The first example of a new kind of clone detection: contextual clones
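
The inlining idea on slide 17 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' source transformation: the sample WSDL, the `urn:example:stock` namespace, and the function names `contextualize` and `local_name` are all hypothetical, and the sketch inlines only one level of message and schema references.

```python
import copy
import xml.etree.ElementTree as ET

WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"
XSD_NS = "http://www.w3.org/2001/XMLSchema"

# Hypothetical miniature WSDL 1.1 document, loosely based on the GetStock example.
SAMPLE_WSDL = """\
<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema"
             xmlns:tns="urn:example:stock"
             targetNamespace="urn:example:stock">
  <types>
    <xsd:schema targetNamespace="urn:example:stock">
      <xsd:element name="GetStockRequest" type="xsd:string"/>
      <xsd:complexType name="Stock">
        <xsd:sequence>
          <xsd:element name="Supplier" type="xsd:string"/>
          <xsd:element name="OnHand" type="xsd:string"/>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:schema>
  </types>
  <message name="GetStockRequest">
    <part name="body" element="tns:GetStockRequest"/>
  </message>
  <message name="GetStockResponse">
    <part name="body" type="tns:Stock"/>
  </message>
  <portType name="StockPort">
    <operation name="GetStock">
      <input message="tns:GetStockRequest"/>
      <output message="tns:GetStockResponse"/>
    </operation>
  </portType>
</definitions>
"""


def local_name(qname):
    """Strip a namespace prefix such as 'tns:' from a QName."""
    return qname.split(":")[-1]


def contextualize(wsdl_text):
    """Map each portType operation name to a self-contained copy of its element."""
    root = ET.fromstring(wsdl_text)
    messages = {m.get("name"): m for m in root.iter(f"{{{WSDL_NS}}}message")}
    # Index top-level schema declarations by name (simplification: elements
    # and types share one table, so same-named declarations would collide).
    schema_decls = {}
    for schema in root.iter(f"{{{XSD_NS}}}schema"):
        for decl in schema:
            if decl.get("name"):
                schema_decls[decl.get("name")] = decl

    cells = {}
    for port in root.iter(f"{{{WSDL_NS}}}portType"):
        for op in port.findall(f"{{{WSDL_NS}}}operation"):
            cell = copy.deepcopy(op)
            for io in cell:  # <input>, <output> and <fault> children
                msg = messages.get(local_name(io.get("message", "")))
                if msg is None:
                    continue
                msg_copy = copy.deepcopy(msg)
                for part in msg_copy:
                    ref = part.get("element") or part.get("type")
                    decl = schema_decls.get(local_name(ref)) if ref else None
                    if decl is not None:
                        # Inline the referenced schema declaration into the part.
                        part.append(copy.deepcopy(decl))
                # Inline the referenced message into the input/output element.
                io.append(msg_copy)
            cells[op.get("name")] = cell
    return cells


if __name__ == "__main__":
    for name, cell in contextualize(SAMPLE_WSDL).items():
        print(name)
        print(ET.tostring(cell, encoding="unicode"))
```

Each resulting element carries its messages and type definitions with it, so it can be handed to a clone detector or a topic model as a single contiguous unit.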
  • 20. An Experiment
      - We ran an experiment to investigate the difference between clone detection on WSCells and on the original raw operations
      - Two sets of WSDL service description files: 1,100 operations and 7,500 operations
      - Compared NICAD clone detector results for each set at various near-miss difference thresholds (0% = exact clone, 10% = 1 line in 10 different, and so on)
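
To make the threshold on slide 20 concrete, here is a rough sketch of a line-based difference measure. It is not NICAD's implementation: it assumes the two fragments are already normalized to one item per line, and it uses Python's `difflib.SequenceMatcher` as a stand-in line matcher. The names `line_difference` and `is_near_miss_clone` are hypothetical.

```python
from difflib import SequenceMatcher


def line_difference(a, b):
    """Fraction of lines that differ between two fragments (0.0 = identical)."""
    lines_a = [ln.strip() for ln in a.splitlines() if ln.strip()]
    lines_b = [ln.strip() for ln in b.splitlines() if ln.strip()]
    matcher = SequenceMatcher(None, lines_a, lines_b, autojunk=False)
    # Count the lines that the matcher aligned between the two fragments.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    longest = max(len(lines_a), len(lines_b))
    return 0.0 if longest == 0 else 1.0 - matched / longest


def is_near_miss_clone(a, b, threshold):
    """True when the fragments differ by no more than the given fraction."""
    return line_difference(a, b) <= threshold


if __name__ == "__main__":
    original = "\n".join(f"line {i}" for i in range(10))
    changed = original.replace("line 4", "changed")
    # One line in ten differs, so the pair passes at 0.1 but not at 0.0.
    print(is_near_miss_clone(original, changed, 0.1))
    print(is_near_miss_clone(original, changed, 0.0))
```

With ten-line fragments that differ in one line, the pair counts as a clone at the 0.1 threshold but not at 0.0, which matches the reading "10% = 1 line in 10 different".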
  • 21. An Experiment
      - Number of clones decreases with WSCells

        Difference    Clone pairs in Set 1      Clone pairs in Set 2
        threshold     Originals    WSCells      Originals    WSCells
        0.0           852          705          1434         1066
        0.1           852          734          1434         1228
        0.2           879          775          1438         1637
        0.3           884          813          1469         1637

      (The slide repeats the two GetStock operations and Stock types from slide 5 as an example.)
      - Reduction in false positives
  • 22. An Experiment
      - Number of clone classes can increase with WSCells

        Difference    Clone classes in Set 1    Clone classes in Set 2
        threshold     Originals    WSCells      Originals    WSCells
        0.0           169          187          587          433
        0.1           169          139          587          499
        0.2           172          142          589          631
        0.3           171          136          591          631

      (The slide repeats the two GetStock operations and Stock types from slide 5 as an example.)
      - Splits by deeper differences: more precision
  • 23. Clone Detection for Web Services
      - Contextual clone detection with WSCells works!
      - Not only finds similar web service operations, but uncovers similar operations we could not find in any other way
        <operation name="DrawRateChartCustom">
          <input message="DrawRateChartCustomIn"/>
          <output message="DrawRateChartCustomOut"/>
        </operation>
        <operation name="GetRealChartCustom">
          <input message="GetRealChartCustomSoapIn"/>
          <output message="GetRealChartCustomSoapOut"/>
        </operation>
        <operation name="GetLastSaleChartCustom">
          <input message="GetLastSaleChartCustomSoapIn"/>
          <output message="GetLastSaleChartCustomSoapOut"/>
        </operation>
        <operation name="DrawYieldCurveCustom">
          <input message="DrawYieldCurveCustomIn"/>
          <output message="DrawYieldCurveCustomOut"/>
        </operation>
        <operation name="GetTopicChartCustom">
          <input message="GetTopicChartCustomSoapIn"/>
          <output message="GetTopicChartCustomSoapOut"/>
        </operation>
        <operation name="GetTopicBinaryChartCustom">
          <input message="GetTopicBinaryChartCustomSoapIn"/>
          <output message="GetTopicBinaryChartCustomSoapOut"/>
        </operation>
  • 24. Semantic Analysis of Web Services
      - Contextualized WSCells also make it possible to use data mining topic models to do semantic analysis of web services
      - Because they provide self-contained documents of significant size
      - Might topic models provide a different view of web service similarity?
  • 25. Latent Dirichlet Allocation
      - Latent Dirichlet Allocation (LDA):
      - A statistical model to uncover latent topics
      - Identifies the correlation between documents in terms of shared latent topics (sets of tokens)
      - Accepts a set of documents (e.g., source files) as input, returns probability distributions over inferred topics (a topic model) as output
      - Each document has some probability of being related to topic 1, another probability for topic 2, and so on
      - Similar documents should be related to similar topics
  • 26. Latent Dirichlet Allocation
      - Documents are represented in the model in terms of probability distributions over topics
      - Similarity between documents is found using the Hellinger Distance
      - A measure of how much agreement there is between the shared topics of two documents
      - Almost identical documents have a small Hellinger Distance since they will be related to the same topics
      - In terms of web services, small Hellinger Distances indicate highly related operations
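
For reference, the Hellinger distance between two discrete distributions P and Q over the same topics is (1/√2) · sqrt(Σ (√p_i − √q_i)²), which ranges from 0 (identical) to 1 (no shared topics). A minimal Python sketch follows; the function name `hellinger` and the example topic distributions are made up for illustration and are not taken from the experiment.

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions."""
    if len(p) != len(q):
        raise ValueError("distributions must cover the same topics")
    total = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(total / 2.0)


if __name__ == "__main__":
    # Hypothetical topic distributions for three operations.
    rates_a = [0.8, 0.1, 0.1]
    rates_b = [0.7, 0.2, 0.1]
    weather = [0.1, 0.1, 0.8]
    print(hellinger(rates_a, rates_b))  # small: mostly the same topics
    print(hellinger(rates_a, weather))  # large: mostly different topics
```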
  • 27. Evaluating WSCells
      - To evaluate the use of WSCells with LDA, we:
      - Generate an LDA model for the original <operation> elements, and another for the contextualized WSCells
      - Explore the Global and Local Similarity between each pair of operations in the models
      - Global Similarity: an overall view of the most closely related web service operations in the service set
      - Local Similarity: a per-operation view of the other most related web service operations for each operation
  • 28. Global Similarity
      - We look at Global Similarity using a visualization called Bluevis
      - Bluevis shows the global conceptual structure of a system by highlighting similar operations using an illuminated line from left to right
      - Plot some top fraction of similar operations (top 25,000 in our examples)
      - Use a consistently ordered list of web service operations for the LDA model to view the differences
      - If a display is noisy, it is often an indication that the model is not identifying meaningful data
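
Selecting the "top fraction of similar operations" amounts to ranking all operation pairs by distance and keeping the closest ones. The sketch below shows one way to do that; it does not reproduce Bluevis, the function name `top_pairs` is hypothetical, and the topic distributions are invented for illustration (only the operation names appear in the slides).

```python
import heapq
import math
from itertools import combinations


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions."""
    total = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(total / 2.0)


def top_pairs(topic_mix, k):
    """Return the k closest operation pairs as (distance, name_a, name_b)."""
    # Sorting the names gives a consistent ordering of operations across runs.
    scored = (
        (hellinger(topic_mix[a], topic_mix[b]), a, b)
        for a, b in combinations(sorted(topic_mix), 2)
    )
    return heapq.nsmallest(k, scored)


if __name__ == "__main__":
    # Hypothetical per-operation topic distributions.
    topic_mix = {
        "GetAIDIBOR": [0.8, 0.1, 0.1],
        "GetTRLIBOR": [0.7, 0.2, 0.1],
        "GetWeather": [0.1, 0.1, 0.8],
    }
    for distance, a, b in top_pairs(topic_mix, 2):
        print(f"{a} ~ {b}: {distance:.3f}")
```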
  • 30. Global Similarity
      - For original raw operations:
      - Bluevis highlights the operations that LDA rates most similar
      - Some clear structure
      - However, most of this is due to shared keywords, like get and SOAP
      - This uncontextualized model has very little value
  • 32. Global Similarity
      - For contextualized WSCells:
      - A clearer semantic structure, less noise overall
      - Operation similarity becomes meaningful
      - Services with semantic similarity discovered
      - E.g., operations with similar parameters or faults, such as those that manipulate holiday dates or financial rates
  • 33. Local Similarity
      - We can also examine the local similarity for each individual operation
      - Identify the complete ordered list of similarity scores for an operation in the data set
      - Using the top similarity scores, evaluate how meaningful the data is from a user's perspective
      - For example, how can I find the most similar web service operations to the one I am using now?
      - We use a tool called POCO (Pairwise Observation of Concepts) to examine the most similar operations
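
The per-operation view can be sketched as a nearest-neighbour lookup over topic distributions. This is not the POCO tool; the function name `most_similar` is hypothetical and the distributions are invented for illustration (only the operation names appear in the slides).

```python
import math


def hellinger(p, q):
    """Hellinger distance between two discrete probability distributions."""
    total = sum((math.sqrt(pi) - math.sqrt(qi)) ** 2 for pi, qi in zip(p, q))
    return math.sqrt(total / 2.0)


def most_similar(query, topic_mix, n=3):
    """Names of the n operations closest to `query`, nearest first."""
    others = [
        (hellinger(topic_mix[query], dist), name)
        for name, dist in topic_mix.items()
        if name != query
    ]
    return [name for _, name in sorted(others)[:n]]


if __name__ == "__main__":
    # Hypothetical per-operation topic distributions.
    topic_mix = {
        "GetAIDIBOR": [0.8, 0.1, 0.1],
        "GetTRLIBOR": [0.7, 0.2, 0.1],
        "GetSOFIBOR": [0.6, 0.2, 0.2],
        "GetWeather": [0.1, 0.1, 0.8],
    }
    print(most_similar("GetAIDIBOR", topic_mix))
```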
  • 35. Local Similarity

        Operation                   Most similar WSCell              Most similar original raw WSDL operation
        ListFinancials              GetFinancialServicesFromList     LanguagesList
        ExportShipsAndCategories    ExportIteneraryAndSteps          Search
        GetIssueData                GetFlightData                    word_cloud
        GetWeatherReport            GetWeather                       GetIndices
        GetAIDIBOR                  GetTRLIBOR                       GetCarriers
        searchByIdentifier          searchByNameAndAddress           GetLastSecurityHeadlines
        ToolsAndHardwareBox         KitchenAndHousewareBox           ListRenditions
        GetReservations             GetRoomAvailabilityForDay        GetSOFIBOR
        GetOtherProductInfo         NextOtherProductPortion          GetParkingInfo
        GetAllSplitsByExchange      GetAllCashDividendsByExchange    GetTeamLoyalties2
  • 36. Summary
      - Very-high-level domain-specific languages such as WSDL make poor targets for similarity analysis using clone detection and topic models
      - Lack of local context prevents meaningful results
      - Contextualizing using WSCells exposes both cloning and semantic relationships between web operations
      - Clone detection of WSCells identifies similar web service operations
      - Topic models of WSCells expose both global system-wide semantic relationships and local individual relationships between operations
  • 37. Current & Future
      - Continue analysis of web services for the Personal Web using our results
      - Apply contextualization to similarity analysis of other modeling and specification languages (currently Simulink, Stateflow and UML sequence diagrams)
      - Experiment with the effect of contextualization on clone and topic model analysis of traditional languages such as Java and C (“contextual clones”)
  • 38. When is a Clone not a Clone? (and vice-versa)
      Contextualized Analysis of Web Services
      Douglas Martin, Scott Grant, James R. Cordy, David B. Skillicorn
      Questions?