Predicting Defects
using Network Analysis
on Dependency Graphs
Thomas Zimmermann, University of Calgary, Canada
Nachiappan Nagappan, Microsoft Research, USA
Bugs are everywhere
Quality assurance is limited...

   ...by time...   ...and by money.
Spend resources on the
components that need it most,
  i.e., are most likely to fail.
Meet Jacob

• Your QA manager
• Ten years' knowledge
  of your project
• Aware of its history
  and the hot spots
But then Jacob left...
Meet Emily

  • Your new QA manager
    (replaces Jacob)
  • Not much experience
    with your project yet
  • How can she allocate
    resources effectively?
Indicators of defects
•   Code complexity
    -   Basili et al. 1996, Subramanyam and Krishnan 2003,
    -   Binkley and Schach 1998, Ohlsson and Alberg 1996, Nagappan et al. 2006

•   Code churn
    -   Nagappan and Ball 2005

•   Historical data
    -   Khoshgoftaar et al. 1996, Graves et al. 2000, Kim et al. 2007,
    -   Ostrand et al. 2005, Mockus et al. 2005

•   Code dependencies
    -   Nagappan and Ball 2007, Schröter et al. 2006
    -   Zimmermann and Nagappan 2007
Centrality
Hypothesis


Network measures on dependency graphs
 - correlate with the number of post-release defects (H1)
 - can predict the number of post-release defects (H2)
 - can indicate critical “escrow” binaries (H3)
DATA
2252 Binaries
28.3 MLOC
Windows Server layout
Data collection

• Release point for Windows Server 2003: Complexity Metrics, Dependencies, Network Measures
• Six months to collect defects (after the release point): Defects
Dependencies
• Directed relationship between two pieces
  of code (here: binaries)
• MaX dependency analysis framework
  - Caller-callee dependencies
  - Imports and exports
  - RPC, COM
  - Runtime dependencies (such as LoadLibrary)
  - Registry access
  - etc.
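A minimal sketch of how such directed dependencies could be stored, assuming a plain adjacency-set representation. The binary names and the `DependencyGraph` class are hypothetical illustrations and are not part of the MaX framework.

```python
from collections import defaultdict


class DependencyGraph:
    """Directed graph: an edge (a, b) means binary a depends on binary b."""

    def __init__(self):
        self.nodes = set()
        self.out_edges = defaultdict(set)  # binary -> binaries it depends on
        self.in_edges = defaultdict(set)   # binary -> binaries that depend on it

    def add_dependency(self, source, target):
        self.nodes.update((source, target))
        self.out_edges[source].add(target)
        self.in_edges[target].add(source)


# Hypothetical binaries, for illustration only.
graph = DependencyGraph()
graph.add_dependency("app.exe", "ui.dll")
graph.add_dependency("app.exe", "core.dll")
graph.add_dependency("ui.dll", "core.dll")

print(sorted(graph.out_edges["app.exe"]))  # what app.exe depends on
print(sorted(graph.in_edges["core.dll"]))  # what depends on core.dll
```

Keeping both directions makes incoming and outgoing dependencies equally cheap to look up, which matters because the network measures later in the deck distinguish between the two.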
Centrality
• Degree centrality
   - counts the number of dependencies
• Closeness centrality
   - takes distance to all other binaries into account
   - Closeness: How close are the other binaries?
   - Reach: How many binaries can be reached (weighted)?
   - Eigenvector: similar to PageRank
• Betweenness centrality
   - counts the number of shortest paths through a binary
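The centrality measures above can be sketched in plain Python on a toy graph. This is an illustration under assumptions, not the tooling used in the study: the graph is hypothetical, closeness uses one common variant (reachable binaries divided by total distance), and betweenness is the unnormalized count computed with Brandes' algorithm.

```python
from collections import deque

# Hypothetical dependency graph: edges[a] lists the binaries that a depends on.
edges = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}


def out_degree(graph, node):
    return len(graph[node])


def in_degree(graph, node):
    return sum(node in targets for targets in graph.values())


def distances_from(graph, source):
    """Breadth-first search: shortest path length to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for nxt in graph[current]:
            if nxt not in dist:
                dist[nxt] = dist[current] + 1
                queue.append(nxt)
    return dist


def out_closeness(graph, node):
    """Number of reachable nodes divided by their total distance (assumed variant)."""
    dist = distances_from(graph, node)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


def betweenness(graph):
    """Brandes' algorithm for unweighted directed graphs (unnormalized)."""
    score = {v: 0.0 for v in graph}
    for s in graph:
        order = []                      # nodes in order of non-decreasing distance
        preds = {v: [] for v in graph}  # predecessors on shortest paths from s
        sigma = {v: 0 for v in graph}   # number of shortest paths from s
        dist = {v: -1 for v in graph}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


print(out_degree(edges, "a"), in_degree(edges, "d"))
print(out_closeness(edges, "a"))
print(betweenness(edges))
```

In this toy graph only `b` and `c` lie on shortest paths between other binaries (from `a` to `d`), so they are the only nodes with non-zero betweenness.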
Structural holes
[Figure: two graphs over the nodes A, B and C. Captions: "No structural hole" (left); "No structural hole between B and C" (right).]
Ego networks
[Figure: a node labeled EGO with its neighborhoods labeled IN, OUT and INOUT.]
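The slide only labels the three neighborhoods, so the following is a sketch under an assumed reading: IN holds the binaries that depend on the ego, OUT the binaries the ego depends on, and INOUT both. The function name and the sample edges are hypothetical.

```python
def ego_network(edges, ego, kind="INOUT"):
    """Return (nodes, ties) of the ego network of `ego`.

    edges: set of (source, target) pairs, meaning source depends on target.
    kind:  "IN" (binaries that depend on ego), "OUT" (binaries ego depends on),
           or "INOUT" (both).
    """
    incoming = {s for s, t in edges if t == ego}
    outgoing = {t for s, t in edges if s == ego}
    if kind == "IN":
        nodes = incoming | {ego}
    elif kind == "OUT":
        nodes = outgoing | {ego}
    elif kind == "INOUT":
        nodes = incoming | outgoing | {ego}
    else:
        raise ValueError(f"unknown kind: {kind}")
    # Keep only the dependencies whose endpoints both lie inside the ego network.
    ties = {(s, t) for s, t in edges if s in nodes and t in nodes}
    return nodes, ties


# Hypothetical dependencies, for illustration only.
edges = {("a", "ego"), ("b", "ego"), ("ego", "c"), ("a", "b"), ("c", "d")}
for kind in ("IN", "OUT", "INOUT"):
    nodes, ties = ego_network(edges, "ego", kind)
    print(kind, sorted(nodes), len(ties))
```

Sizes and tie counts of such subgraphs are the kind of quantity that measure names like `EgoInSize` and `EgoInTies` in the escrow results suggest, although the exact definitions are not given in the slides.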
Complexity metrics
Group                                     Metrics                                  Aggregation
Module metrics for a binary B             # functions in B
                                          # global variables in B
Per-function metrics for a function f()   # executable lines in f()                Total, Max
                                          # parameters in f()
                                          # functions calling f()
                                          # functions called by f()
                                          McCabe’s cyclomatic complexity of f()
OO metrics for a class C                  # methods in C                           Total, Max
                                          # subclasses of C
                                          Depth of C in the inheritance tree
                                          Coupling between classes
                                          Cyclic coupling between classes
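The Total and Max aggregation in the table can be illustrated with a small sketch. The function records and the `aggregate` helper are hypothetical; only the idea of rolling per-function values up to one value per binary comes from the slide.

```python
# Hypothetical per-function measurements for one binary:
# (function name, executable lines, parameters).
functions = [("parse", 40, 2), ("render", 120, 5), ("main", 15, 0)]


def aggregate(values):
    """Roll per-function values up to binary level as Total and Max."""
    values = list(values)
    return {"Total": sum(values), "Max": max(values, default=0)}


lines = aggregate(f[1] for f in functions)
params = aggregate(f[2] for f in functions)
print(lines)
print(params)
```

Total captures the overall size of a binary, while Max highlights a single unusually complex function that a sum would hide.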
RESULTS
1 PATTERNS
Star pattern
[Figure: star-shaped dependency pattern. Legend: With defects, No defects.]
Undirected cliques

Average number of defects is higher for binaries in large cliques.
2 PREDICTION
Prediction

Input metrics and measures -> Model (PCA, Regression) -> Prediction
• Input: Metrics, SNA, Metrics+SNA
• Prediction: Classification, Ranking
Classification

Does a binary have a defect or not?
Ranking

Which binaries have the most defects?
Random splits

4×50×
Classification (logistic regression)

SNA increases the recall by 0.10 (at p=0.01)
  while precision remains comparable.
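For readers less familiar with the two measures, here is a minimal sketch of how precision and recall could be computed for a defect classifier. The labels are hypothetical and unrelated to the Windows Server 2003 data.

```python
def precision_recall(actual, predicted):
    """actual, predicted: lists of booleans (True = binary has a defect)."""
    pairs = list(zip(actual, predicted))
    tp = sum(a and p for a, p in pairs)          # defect-prone, flagged
    fp = sum((not a) and p for a, p in pairs)    # defect-free, flagged
    fn = sum(a and (not p) for a, p in pairs)    # defect-prone, missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical labels for six binaries.
actual = [True, True, True, False, False, True]
predicted = [True, False, True, True, False, True]
print(precision_recall(actual, predicted))
```

Higher recall at comparable precision means more of the defect-prone binaries are found without a matching increase in false alarms.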
Ranking (linear regression)

SNA+METRICS increases the correlation
    by 0.10 (significant at p=0.01)
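The slide does not say which correlation coefficient is meant. As one plausible reading, the sketch below computes the Spearman rank correlation between predicted and actual defect counts; that choice, the helper names and the sample numbers are all assumptions.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if an input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical predicted and actual defect counts for five binaries.
predicted_defects = [0.5, 2.1, 3.7, 1.2, 9.0]
actual_defects = [0, 3, 2, 1, 12]
print(round(spearman(predicted_defects, actual_defects), 3))
```

A rank correlation fits the ranking task because only the order of the binaries matters, not the exact predicted defect counts.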
3 ESCROW
Escrow binaries

• Escrow binaries
   - list of critical binaries for Windows Server 2003
   - development teams select binaries for escrow based
       on (past) experience

• Special protocol for escrow binaries
   - involves more testing, code reviews
Predicting escrow binaries
 Network measures           Recall
 GlobalInClosenessFreeman   0.60
 GlobalIndwReach            0.60
 EgoInSize                  0.55
 EgoInPairs                 0.55
 EgoInBroker                0.55
 EgoInTies                  0.50
 GlobalInDegree             0.50
 GlobalBetweenness          0.50
 ...                         ...
 Complexity metrics         Recall
 TotalParameters            0.30
 TotalComplexity            0.30
 TotalLines                 0.30
 TotalFanIn                 0.30
 TotalFanOut                0.30
 ...                         ...

Network measures predict twice as many escrow binaries as complexity metrics do.
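The slides do not state how the per-measure recall is obtained. One plausible procedure, shown here purely as an assumption, is to flag the k binaries with the highest value of a measure, with k equal to the number of escrow binaries, and count how many escrow binaries are among them. The names and values below are hypothetical.

```python
def recall_at_top(measure, escrow, k=None):
    """measure: dict binary -> value; escrow: set of escrow binaries.

    Flags the k binaries with the highest value and returns the share of
    escrow binaries among them. k defaults to the number of escrow binaries
    (an assumption, not taken from the slides).
    """
    if k is None:
        k = len(escrow)
    ranked = sorted(measure, key=measure.get, reverse=True)
    flagged = set(ranked[:k])
    return len(flagged & escrow) / len(escrow)


# Hypothetical values of one network measure for five binaries.
in_degree = {"a.dll": 40, "b.dll": 3, "c.dll": 25, "d.dll": 1, "e.dll": 12}
escrow = {"a.dll", "e.dll"}
print(recall_at_top(in_degree, escrow))
print(recall_at_top(in_degree, escrow, k=3))
```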
CONCLUSION
• Classification
  - Recall for network measures is 0.10 higher than for
    complexity metrics.
  - The precision remains comparable.
• Ranking
  - Combining network measures with complexity metrics
    increases the correlation by 0.10.

• Escrow
  - Complexity metrics fail to predict escrow binaries.
  - Network measures predict 60% of escrow binaries.
