ASE 2013
2. This talk
BorderFlow clustering for defect prediction
• Defect prediction
  – Sort modules by odds of having defects
  – Used to prioritize subsequent work

• BorderFlow
  – Finds code clusters with
    • High cohesion
    • Low coupling

• Produces better defect predictors.
5. Related Work
• How to cluster?
  – By intra-module features?
    • Menzies:ASE’11,TSE’13; Bettenburg:MSR’12
  – By performance deltas of models learned from different stratifications?
    • Yang:IST’13; He:ASEj’12; He:ESEM’13
• And what features to use?
  – Just static code measures:
    • Menzies:ASE’11,TSE’13; Bettenburg:MSR’12
  – Using design + code artifacts?
    • Schroter:ESEM’06
  – Using software process + product measures?
    • Kamei:ICSM’10
  – Using synthesized attributes using PCA or LSI?
    • Nagappan:ICSE’06, Tan:WCRE’11
• Need principles for reducing option space
• Hence, this paper
• Premise of this paper:
  – These How and What are related
7. Target domain
• Defect prediction from static code features
• Easy to use:
  – Scalable feature extractors + logs of defects found
• Widely used:
  – Prioritize inspections: find 20% of code with 80% of errors
    • Ostrand:ISSTA’04; Nagappan:ICSE’06; Menzies:ASEj’10; Tosun:IAAI’10; etc.
• Useful to use:
  – Compared to (some) samples of industrial practices…
    • Finds more defects: Menzies:TSE’07
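
The "20% of code with 80% of errors" goal can be checked with a small effort-aware measure. The sketch below is illustrative only: the function name, the tuple layout, and the sample numbers are assumptions, not data from the talk.

```python
def defects_found_in_top_fraction(modules, fraction=0.2):
    """Share of all defects found when inspecting the highest-scored
    modules until `fraction` of the total lines of code is used up.

    modules: list of (name, loc, predicted_score, actual_defects).
    """
    # Inspect modules in order of predicted defect-proneness.
    ranked = sorted(modules, key=lambda m: m[2], reverse=True)
    total_loc = sum(m[1] for m in modules)
    total_defects = sum(m[3] for m in modules)
    budget = fraction * total_loc

    inspected_loc = 0
    found = 0
    for _name, loc, _score, defects in ranked:
        if inspected_loc + loc > budget:
            break  # the next module would exceed the inspection budget
        inspected_loc += loc
        found += defects
    return found / total_defects if total_defects else 0.0


# Hypothetical modules: (name, LOC, predicted score, actual defects)
sample = [
    ("A", 100, 0.9, 8),
    ("B", 100, 0.7, 1),
    ("C", 300, 0.2, 1),
    ("D", 500, 0.1, 0),
]
share = defects_found_in_top_fraction(sample, fraction=0.2)
```

With this made-up sample, inspecting the top-ranked 20% of the code covers modules A and B.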
8. BorderFlow
Ngomo:CICLing’09
• Graph G = (V, E); V are classes

• e(c_i, c_j) ∈ E if c_j references c_i
  – In class instantiation, method invocation, or field access
  – JRIPPLES: http://jripples.sourceforge.net/

• A cluster X is a subset of V that maximizes:
  – F(X) = Ω(b(X), X) / Ω(b(X), n(X))
  – b(X) = border nodes inside X
  – n(X) = direct neighbors of b(X), outside of X
  – Ω = number of edges between subsets of V:
    Ω(X, Y) = Σ e(c_i, c_j) | c_i ∈ X and c_j ∈ Y

• Iteratively inserts nodes in n(X) till F(X) is maximized.
  – 1) Candidates: find C(X) = nodes not in X where F(X + C(X)) > F(X)
  – 2) Prune: subset Y in C(X) that maximizes Ω(Y, n(X))
  – 3) Test: if F(X + Y) >= F(X), then X = X + Y
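
The three steps above can be sketched in a few lines. This is a simplified reading of the slide, not the reference BorderFlow implementation. Assumptions: the graph is stored as a symmetric adjacency dict, candidates are scored one node at a time, and F is treated as infinite when a cluster has no outgoing edges. All function names are illustrative.

```python
import math


def omega(adj, xs, ys):
    """Ω(X, Y): number of edges (u, v) with u in xs and v in ys."""
    return sum(1 for u in xs for v in adj[u] if v in ys)


def border(adj, x):
    """b(X): nodes in X with at least one neighbor outside X."""
    return {u for u in x if any(v not in x for v in adj[u])}


def neighbours(adj, x):
    """n(X): nodes outside X that are adjacent to a border node."""
    return {v for u in border(adj, x) for v in adj[u] if v not in x}


def flow_ratio(adj, x):
    """F(X) = Ω(b(X), X) / Ω(b(X), n(X))."""
    b = border(adj, x)
    outward = omega(adj, b, neighbours(adj, x))
    if outward == 0:
        return math.inf  # assumption: no outgoing flow means a closed cluster
    return omega(adj, b, x) / outward


def borderflow_cluster(adj, seed):
    """Grow a cluster from `seed` using the candidate/prune/test steps."""
    x = {seed}
    while True:
        current = flow_ratio(adj, x)
        n_x = neighbours(adj, x)
        # 1) Candidates: neighbors whose insertion raises F.
        candidates = [v for v in n_x if flow_ratio(adj, x | {v}) > current]
        if not candidates:
            return x
        # 2) Prune: keep the candidates with the most edges into n(X).
        best = max(omega(adj, {v}, n_x) for v in candidates)
        y = {v for v in candidates if omega(adj, {v}, n_x) == best}
        # 3) Test: accept Y only if F does not drop.
        if flow_ratio(adj, x | y) < current:
            return x
        x |= y


# Hypothetical class graph: two triangles joined by the edge c-d.
graph = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
left = borderflow_cluster(graph, "a")
right = borderflow_cluster(graph, "d")
```

On this toy graph the growth stops at each triangle, because adding the node across the bridge would lower F.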
9. Experiment:
leave-one-out, Java classes, learn from clusters vs learn from all

Dependent variable:
• ClassFault

Independent variables:
• WMC (Weighted Methods per Class)
• DIT (Depth of Inheritance Tree)
• NOC (Number Of Children)
• CBO (Coupling Between Object classes)
• RFC (Response For Class)
• LCOM (Lack of Cohesion in Methods)
• NPM (Number of Public Methods)
• LOC (Lines Of Code)

Hypothesis test: Mann-Whitney (5%)
Java systems from promisedata.googlecode.com
X = one of Ant, Jedit, Lucene, POI, Synapse, Velocity, Xalan, Xerces
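
A Mann-Whitney test at the 5% level can be sketched with the standard library alone. This version uses average ranks for ties and a normal approximation without tie or continuity corrections, which is a simplification. The function name and sample values are illustrative. SciPy's `scipy.stats.mannwhitneyu` is the usual choice in practice.

```python
import math


def mann_whitney_u(xs, ys):
    """Return (U, two-sided p-value) via the normal approximation."""
    pooled = sorted((v, i) for i, v in enumerate(list(xs) + list(ys)))
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[pooled[k][1]] = average_rank
        i = j + 1

    n1, n2 = len(xs), len(ys)
    rank_sum_x = sum(ranks[:n1])
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)

    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u, p


# Hypothetical absolute errors of two models on the same classes.
errors_local = [1, 2, 3, 4, 5]
errors_global = [6, 7, 8, 9, 10]
u_stat, p_value = mann_whitney_u(errors_local, errors_global)
significant = p_value < 0.05
```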

Version = one version of X
Clusters = BorderFlow(Version)

For Cluster in Clusters
    For Class in Version
        Test   = Class
        a      = Class.Faults
        Train  = Version – Test
        Model0 = SwLSR(Train)            # baseline: global model
        p0     = Model0(Test)
        Model1 = SwLSR(Cluster – Class)  # local model
        p1     = Model1(Class)
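
The loop above can be sketched as runnable code. Two assumptions are made here. A one-feature least-squares line (faults against LOC) stands in for SwLSR, which the slide does not define. Each held-out class is taken from the cluster itself, so that the local model is always tested on a member of its own cluster. All names and numbers are hypothetical.

```python
def fit_line(rows):
    """Least-squares fit of faults on LOC; rows are (loc, faults) pairs.
    Stand-in learner, not the SwLSR used in the talk."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    if sxx == 0:
        return lambda x: mean_y  # no spread in LOC: predict the mean
    slope = sum((x - mean_x) * (y - mean_y) for x, y in rows) / sxx
    return lambda x: mean_y + slope * (x - mean_x)


def leave_one_out(version, clusters):
    """version: dict class name -> (loc, faults); clusters: list of name sets.
    Returns (name, actual, p0, p1) per held-out class."""
    results = []
    for cluster in clusters:
        for name in sorted(cluster):
            loc, actual = version[name]
            global_train = [version[c] for c in version if c != name]
            local_train = [version[c] for c in cluster if c != name]
            if not local_train:
                continue  # a singleton cluster has nothing to learn from
            p0 = fit_line(global_train)(loc)  # baseline: global model
            p1 = fit_line(local_train)(loc)   # local model
            results.append((name, actual, p0, p1))
    return results


# Hypothetical version: two clusters with different fault behavior.
version = {
    "a": (100, 1), "b": (200, 2), "c": (300, 3),
    "d": (100, 0), "e": (200, 0), "f": (300, 0),
}
clusters = [{"a", "b", "c"}, {"d", "e", "f"}]
results = leave_one_out(version, clusters)
global_error = sum(abs(actual - p0) for _, actual, p0, _ in results)
local_error = sum(abs(actual - p1) for _, actual, _, p1 in results)
```

The two error lists produced this way are the kind of paired samples that the hypothesis test would then compare.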
11. Summary
• Too many options
  – Need principles to design data miners for software engineering

• We applied a core SE principle
  – Coupling and cohesion

• Used it to select both
  – A data miner: BorderFlow
  – And the attributes it explores
    • Inter-module features

• Obtained better results

Future Work
• Repeat on more data sets
• Compare with other local learners
• Test if inter-module always best