Class Level Fault Prediction
using Software Clustering

Giuseppe Scanniello (1), Carmine Gravino (2), Andrian Marcus (3), Tim Menzies (4)

giuseppe.scanniello@unibas.it, gravino@unisa.it, amarcus@wayne.edu, tim@menzies.us

1 University of Basilicata, Italy
2 University of Salerno, Italy
3 Wayne State University, USA
4 West Virginia University, USA

This talk: BorderFlow clustering for defect prediction
•  Defect prediction
    –  Sort modules by odds of having defects
    –  Used to prioritize subsequent work

•  BorderFlow
    –  Finds code clusters with
        •  High cohesion
        •  Low coupling

•  Produces better defect predictors.

Q: Why?
A: Too much blah, blah
•  Software is being written by
    –  More people
    –  For more tasks
    –  Using changing tools
    –  On ever-changing platforms

•  Any claim that X is always THE prime determiner of defects, efforts, livelocks, etc. is ...
    –  Trite simplifications of a more complex issue

Local lessons > Trite global claims
•  Cluster data
•  Learn 1 model per cluster
    –  Context-specific solutions
    –  Better predictions,
    –  lower variance,
    –  lower false alarm,
    –  faster runtimes
    –  better explanation
    –  etc.

•  As recommended by any number of papers
    –  Turhan:ESEj’09;
    –  Menzies:ASE’11,TSE’13;
    –  Bettenburg:MSR’12;
    –  Yang:IST’13
    –  etc, etc, etc.

Related Work
•  How to cluster?
    –  By intra-module features?
        •  Menzies:ASE’11,TSE’13; Bettenburg:MSR’12
    –  By performance deltas of models learned from different stratifications?
        •  Yang:IST’13; He:ASEj’12; He:ESEM’13

•  And what features to use?
    –  Just static code measures:
        •  Menzies:ASE’11,TSE’13; Bettenburg:MSR’12
    –  Using design + code artifacts?
        •  Schroter:ESEM’06
    –  Using software process + product measures?
        •  Kamei:ICSM’10
    –  Using synthesized attributes using PCA or LSI?
        •  Nagappan:ICSE’06, Tan:WCRE’11

•  Need principles for reducing option space. Hence, this paper.

•  Premise of this paper:
    –  These How and What are related

New idea
•  Software has natural clusters
    –  Regions of high cohesion and low coupling

•  What if we exploited that naturally occurring structure?

•  So cluster not by intra-module features;
    –  e.g. as done by Menzies:ASE’11,TSE’13; Bettenburg:MSR’12; etc.
    –  But by inter-module features

•  Use a clusterer that understands cohesion and coupling
    –  This talk: BorderFlow clustering for defect prediction

Target domain
•  Defect prediction from static code features
•  Easy to use:
    –  Scalable feature extractors + logs of defects found

•  Widely used:
    –  Prioritize inspections: find 20% of code with 80% of errors
        •  Ostrand:ISSTA’04; Nagappan:ICSE’06; Menzies:ASEj’10; Tosun:IAAI’10; etc.

•  Useful to use:
    –  Compared to (some) samples of industrial practices...
        •  Finds more defects: Menzies:TSE’07

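The "find 20% of code with 80% of errors" goal can be checked with a few lines of Python. The sketch below is illustrative only: the module names, sizes, scores and defect counts are made-up data, and `recall_at_budget` is a hypothetical helper, not something from the talk. It ranks modules by predicted score, inspects them until a lines-of-code budget is spent, and reports the share of known defects found.

```python
def recall_at_budget(modules, budget=0.2):
    """Share of all defects found when inspecting the top-ranked modules
    until `budget` (a fraction of total LOC) would be exceeded."""
    total_loc = sum(m["loc"] for m in modules)
    total_defects = sum(m["defects"] for m in modules)
    # Highest predicted odds of defects first.
    ranked = sorted(modules, key=lambda m: m["score"], reverse=True)
    inspected_loc = 0
    found = 0
    for m in ranked:
        if inspected_loc + m["loc"] > budget * total_loc:
            break  # the next module would not fit in the inspection budget
        inspected_loc += m["loc"]
        found += m["defects"]
    return found / total_defects if total_defects else 0.0


# Hypothetical data: score = predictor output, defects = logged faults.
MODULES = [
    {"name": "A", "loc": 100, "score": 0.9, "defects": 5},
    {"name": "B", "loc": 80, "score": 0.8, "defects": 3},
    {"name": "C", "loc": 320, "score": 0.3, "defects": 1},
    {"name": "D", "loc": 500, "score": 0.1, "defects": 1},
]

print(recall_at_budget(MODULES))  # 0.8
```

With this toy data, inspecting modules A and B (180 of 1000 lines) finds 8 of the 10 defects.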
BorderFlow
Ngomo:CICLing’09
•  Graph G = (V, E); V are classes

•  e(ci, cj) ∈ E if cj references ci
    –  In class instantiation, method invocation, or field access
    –  JRIPPLES: http://jripples.sourceforge.net/

•  A cluster X is a subset of V that maximizes:
    –  F(X) = Ω(b(X), X) / Ω(b(X), n(X))
    –  b(X) = border nodes inside X
    –  n(X) = direct neighbors of b(X), outside of X
    –  Ω = number of edges between subsets of V:
        Ω(X, Y) = Σ e(ci, cj) | ci ∈ X and cj ∈ Y

•  Iteratively inserts nodes in n(X) till F(X) is maximized.
    –  1) Candidates: find C(X) = nodes not in X where F(X + C(X)) > F(X)
    –  2) Prune: subset Y in C(X) that maximizes Ω(Y, n(X))
    –  3) Test: if F(X + Y) >= F(X), then X = X + Y

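The definitions of b(X), n(X), Ω and F(X), and the three-step growth loop, can be sketched in Python as follows. This is a minimal reading of the slide, not the published BorderFlow implementation. It makes several assumptions: edges are stored as directed pairs, the toy graph lists every reference in both directions, candidates are tested one node at a time, and F is treated as infinite when no edge leaves X. All function names and the example graph are hypothetical.

```python
def omega(edges, xs, ys):
    """Ω(X, Y): number of edges (u, v) with u in X and v in Y."""
    return sum(1 for u, v in edges if u in xs and v in ys)


def border(edges, x):
    """b(X): nodes inside X with at least one edge leaving X."""
    return {u for u, v in edges if u in x and v not in x}


def neighbors(edges, x):
    """n(X): nodes outside X that are direct neighbors of b(X)."""
    return {v for u, v in edges if u in x and v not in x}


def fitness(edges, x):
    """F(X) = Ω(b(X), X) / Ω(b(X), n(X)); infinite if nothing leaves X."""
    b = border(edges, x)
    outflow = omega(edges, b, neighbors(edges, x))
    return float("inf") if outflow == 0 else omega(edges, b, x) / outflow


def grow_cluster(edges, seed):
    """Grow a cluster from one seed node using the three steps above."""
    x = {seed}
    while True:
        n = neighbors(edges, x)
        current = fitness(edges, x)
        # 1) Candidates: neighbors whose insertion improves F.
        candidates = {v for v in n if fitness(edges, x | {v}) > current}
        if not candidates:
            return x
        # 2) Prune: keep the candidates with the most edges into n(X).
        flow = {v: omega(edges, {v}, n) for v in candidates}
        y = {v for v in candidates if flow[v] == max(flow.values())}
        # 3) Test: accept Y only if F does not get worse.
        if fitness(edges, x | y) < current:
            return x
        x |= y


# Toy graph: two triangles (A, B, C) and (D, E, F) joined by one edge C-D.
UNDIRECTED = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D"),
              ("D", "E"), ("D", "F"), ("E", "F")]
EDGES = {(u, v) for u, v in UNDIRECTED} | {(v, u) for u, v in UNDIRECTED}

print(sorted(grow_cluster(EDGES, "A")))  # ['A', 'B', 'C']
print(sorted(grow_cluster(EDGES, "E")))  # ['D', 'E', 'F']
```

On this graph each triangle is recovered as one cluster: adding the node across the bridge would drop F from 2.0 to 0.5, so growth stops.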
Experiment:
leave-one-out, Java classes,
learn from clusters vs learn from all

Dependent variable:
•  ClassFault

Independent variables:
•  WMC: Weighted Methods per Class
•  DIT: Depth of Inheritance Tree
•  NOC: Number Of Children
•  CBO: Coupling Between Object classes
•  RFC: Response For Class
•  LCOM: Lack of Cohesion in Methods
•  NPM: Number of Public Methods
•  LOC: Lines Of Code

Hypothesis test: Mann-Whitney (5%)

Java systems from promisedata.googlecode.com
X = one of Ant, JEdit, Lucene, POI, Synapse, Velocity, Xalan, Xerces

Version  = one version of X
Clusters = BorderFlow(Version)

For Cluster in Clusters
    For Class in Cluster
        Test   = Class
        a      = Class.Faults
        Train  = Version – Test
        Model0 = SwLSR(Train)             # baseline: global model
        p0     = Model0(Test)
        Model1 = SwLSR(Cluster – Class)   # local model
        p1     = Model1(Class)

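The leave-one-out loop above can be sketched in Python. The sketch swaps in a one-variable ordinary least squares fit (faults against LOC) as a stand-in for SwLSR, which the slides do not define further; the class names, metric values and cluster assignment are invented for illustration.

```python
def fit_line(rows):
    """Least-squares fit of faults ~ LOC on (loc, faults) pairs.
    Stand-in for SwLSR; returns a function that predicts faults from LOC."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    if sxx == 0:  # all LOC values equal: fall back to the mean
        return lambda x: mean_y
    slope = sum((x - mean_x) * (y - mean_y) for x, y in rows) / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def leave_one_out(version, clusters):
    """Return (class, a, p0, p1) per class: actual faults, global
    prediction, and local (same-cluster) prediction."""
    results = []
    for cluster in clusters:
        for name in cluster:
            loc, actual = version[name]
            train = [v for k, v in version.items() if k != name]
            local = [version[k] for k in cluster if k != name]
            if not local:
                continue  # a one-class cluster has no local training data
            p0 = fit_line(train)(loc)   # baseline: global model
            p1 = fit_line(local)(loc)   # local model
            results.append((name, actual, p0, p1))
    return results


# Hypothetical version: class -> (LOC, faults), with two clusters that
# follow different trends.
VERSION = {"A1": (100, 1), "A2": (200, 2), "A3": (300, 3),
           "B1": (100, 0), "B2": (200, 0), "B3": (300, 0)}
CLUSTERS = [["A1", "A2", "A3"], ["B1", "B2", "B3"]]

for name, a, p0, p1 in leave_one_out(VERSION, CLUSTERS):
    print(name, a, round(p0, 2), round(p1, 2))
```

In this constructed data each cluster is perfectly linear on its own, so the local model has near-zero residuals while the global model does not. Real data will be far less clean.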
Results:
Error = mean absolute residuals = mean of |p - a|
(more = worse)

•  Usually, local is best
    –  Has less error
    –  Sometimes, much better

•  Exceptions:
    –  See Velocity, JEdit (but only some versions)

[Figure: global error, local error, and delta for JEdit 4.0, Velocity 1.6.1, Velocity 1.5, JEdit 4.3, JEdit 4.2, Velocity 1.4, JEdit 3.2.1, JEdit 4.1; axis ticks from -1 to 1.5]

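The error measure can be written out directly. In the sketch below the predictions are made-up numbers, and taking delta as global error minus local error is an assumption about the plot legend (positive means the local model did better).

```python
def mean_absolute_residual(predicted, actual):
    """Mean of |p - a| over all classes; larger values are worse."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical fault counts and predictions for four classes.
actual = [0, 1, 2, 0]
global_pred = [1, 1, 1, 1]   # p0, from the global model
local_pred = [0, 1, 1, 0]    # p1, from the local model

global_error = mean_absolute_residual(global_pred, actual)
local_error = mean_absolute_residual(local_pred, actual)
delta = global_error - local_error   # assumed sign convention

print(global_error, local_error, delta)  # 0.75 0.25 0.5
```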
Summary
•  Too many options
    –  Need principles to design data miners for software engineering

•  We applied a core SE principle
    –  Coupling and cohesion

•  Used it to select both
    –  A data miner: BorderFlow
    –  And the attributes it explores
        •  Inter-module features

•  Obtained better results

Future Work
•  Repeat on more data sets
•  Compare with other local learners
•  Test if inter-module always best


Ase2013