Predicting Defects in
SAP Java Code
An Experience Report
                       by Tilman Holschuh
                         (SQS AG)
                         Markus Päuser
                         (SAP AG)
                         Kim Herzig
                         (Saarland University)
                         Thomas Zimmermann
                         (Microsoft Research)
                         Rahul Premraj
                         (Vrije University Amsterdam)
                         Andreas Zeller
                         (Saarland University)
Motivation

Quality Manager
Problems: Resources, Time, Knowledge

Where do we put the most effort?
Replicated 2 Studies

Data sources: Source code, Version archive, Bug database
Study 1: McCabe, FanOut, LoC, Coupling → Predictor → Component Quality
Study 2: Dependencies → Predictor → Component Quality
The Product

‣   SAP Standard Software
‣   Large-scale Java software system ( > 10M LoC )
‣   Separated into projects
‣   Service pack release cycles
Defect Distribution

20% of the code contains ~75% of defects
Upper bound for prediction

graphic created with TreeMap (University of Maryland)
see http://www.cs.umd.edu/hcil/treemap
Basics

Input → Predictor Model → Output
How to collect Input Data?

1: McCabe, FanOut, LoC, Coupling
2: Dependencies
Collecting Metric Data

1: McCabe, FanOut, LoC, Coupling

‣ Metric tools: ckjm, JDepend, ephyra
‣ Static code checkers: PMD, FindBugs
‣ Change frequency
Collecting Dependency Data

2: Dependencies

‣ extracting package import relations
‣ Tool: JDepend
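
The import-relation extraction named on this slide can be approximated without JDepend. The sketch below is not JDepend's algorithm or API; it is a simplified stand-in that scans Java source text with regular expressions. The function name `package_dependencies` and the sample sources are hypothetical.

```python
import re
from collections import defaultdict

# Simplified patterns (assumption of this sketch): comments, string literals,
# static imports and fully qualified names used without an import are ignored.
PACKAGE_RE = re.compile(r"^\s*package\s+([\w.]+)\s*;", re.MULTILINE)
IMPORT_RE = re.compile(r"^\s*import\s+([\w.]+)(\.\*)?\s*;", re.MULTILINE)


def package_dependencies(sources):
    """Map each declaring package to the sorted list of packages it imports from."""
    deps = defaultdict(set)
    for text in sources:
        declared = PACKAGE_RE.search(text)
        if declared is None:
            continue  # files in the default package are skipped
        own = declared.group(1)
        for name, wildcard in IMPORT_RE.findall(text):
            # "import a.b.*;" names the package itself; otherwise drop the class name.
            target = name if wildcard else name.rpartition(".")[0]
            if target and target != own:
                deps[own].add(target)
    return {pkg: sorted(targets) for pkg, targets in deps.items()}


# Hypothetical input: two tiny Java files given as strings.
example_sources = [
    "package com.example.app;\nimport java.util.List;\nimport com.example.core.*;\nclass A {}",
    "package com.example.core;\nimport java.util.Map;\nclass B {}",
]
result = package_dependencies(example_sources)
```

Counting the entries per package gives a rough efferent-coupling figure, which is one way the dependency data could feed a predictor.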
How to measure Component Quality?

Input ✔ → Predictor Model → Output
Component Quality

Bug database: Bug 42233, FileSystemPreferences lockFile() should close ...
Version archive: v1.17, v1.18, v1.19 (main line and maintenance branch)

A change marked "Fixed Bug 42233" in the version archive is linked to the bug report: #defects + 1
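
The counting step on this slide (a fix that references bug 42233 adds one defect) might be sketched as follows. The commit-message pattern, the `Commit` record, and the component names are assumptions made for illustration; the slides do not describe how fixes were matched to bug reports in the SAP archives.

```python
import re
from collections import Counter
from typing import NamedTuple


class Commit(NamedTuple):
    message: str
    components: tuple  # components (e.g. packages) touched by the change


# Hypothetical convention: fix commits mention "bug <id>" in their message.
BUG_REF = re.compile(r"\bbug\s*#?(\d+)\b", re.IGNORECASE)


def count_defects(commits, known_bug_ids):
    """Count distinct linked bugs per component.

    A bug counts once per component, even if several commits reference it,
    and only if its ID exists in the bug database (known_bug_ids).
    """
    seen = set()
    for commit in commits:
        for raw_id in BUG_REF.findall(commit.message):
            bug_id = int(raw_id)
            if bug_id in known_bug_ids:
                for component in commit.components:
                    seen.add((bug_id, component))
    return Counter(component for _, component in seen)


# Illustrative history, not taken from the study.
commits = [
    Commit("Fixed Bug 42233: lockFile() should close", ("java.util.prefs",)),
    Commit("Refactor logging", ("java.util.logging",)),
    Commit("Follow-up for bug 42233", ("java.util.prefs",)),
    Commit("Fixed bug 99999", ("java.util.logging",)),  # ID not in the bug database
]
defects = count_defects(commits, known_bug_ids={42233})
```

Checking each referenced ID against the bug database keeps unrelated numbers in commit messages from inflating the defect count.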
How to build Predictor Models?

Linear Regression (Y = Xβ + ε): McCabe, FanOut, LoC, Coupling
Support Vector Machine: McCabe, FanOut, LoC, Coupling; Dependencies
Forward Prediction

[Timeline t with versions V1 and V2; legend: static analysis, training bug data, test bug data]
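
One reading of this timeline, assumed here, is that a model is trained on metrics and bug data from version V1 and then applied to metrics from V2, whose bug data serves as the test set. The single-metric least-squares fit and all numbers below are made up for illustration; they are not the study's model or data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one predictor: y ≈ slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def forward_predict(v1_metric, v1_defects, v2_metric):
    """Train on V1 (metric, defects) and predict defect counts for V2 components."""
    slope, intercept = fit_line(v1_metric, v1_defects)
    return [slope * x + intercept for x in v2_metric]


# Hypothetical data: one metric value (e.g. LoC) and a defect count per component.
v1_loc = [100, 200, 300, 400]
v1_defects = [1, 2, 3, 4]
v2_loc = [150, 500, 250]

predicted = forward_predict(v1_loc, v1_defects, v2_loc)
# Indices of V2 components, most defect-prone first according to the model.
ranking = sorted(range(len(v2_loc)), key=lambda i: predicted[i], reverse=True)
```

The resulting ranking would then be compared against the defects actually reported for V2.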
Results
Metric Correlations

Metric              Aggregation   Package level (Project 2)   Class level (Project 4)
LoC                 Sum           0.583                       0.377
LoC                 Max           0.587                       n/a
McCabe              Sum           0.583                       0.299
McCabe              Max           0.588                       0.261
Efferent Coupling                 0.608                       n/a
Design Rules        Sum           0.557                       0.264
Design Rules        Max           0.578                       n/a
Changes             Sum           0.308                       0.403
Changes             Max           0.240                       n/a

Prediction is more precise at higher granularity levels
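
The slide does not say which correlation coefficient was used. The sketch below assumes a Spearman rank correlation between a per-component metric value and the defect count, implemented as a Pearson correlation over average ranks. The data are invented.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def spearman(xs, ys):
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical per-component values: summed LoC and defect counts.
loc = [120, 450, 300, 80, 900]
defects = [1, 4, 2, 0, 9]
rho = spearman(loc, defects)
```

A rank-based coefficient is a common choice for defect counts because they are skewed, but a Pearson coefficient could be substituted by calling `pearson` directly.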
Hit Rate
          actual   predicted
             1         4
             2         9    Hit rate = 50%
             3         2
Top 20%      4        11
             5         6
             6         1
             7         3
             8         5
             9        10
            10         8
            11         7
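
The example on this slide can be reproduced if the two columns are read as component IDs ordered by actual and by predicted defect count, with the "Top 20%" bracket covering the first four rows. That reading is an assumption, as is the function name `hit_rate`.

```python
def hit_rate(actual_order, predicted_order, top_k):
    """Share of the actual top-k components that also appear in the predicted top-k."""
    actual_top = set(actual_order[:top_k])
    predicted_top = set(predicted_order[:top_k])
    return len(actual_top & predicted_top) / top_k


# Component IDs from the slide, most defect-prone first.
actual = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
predicted = [4, 9, 2, 11, 6, 1, 3, 5, 10, 8, 7]

# Components 2 and 4 are in both top-4 sets, so two of four are hits.
rate = hit_rate(actual, predicted, top_k=4)
```

Under this reading the result is 0.5, which agrees with the "Hit rate = 50%" shown on the slide.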
Predictions using Linear Regression

Input: McCabe, FanOut, LoC, Coupling

                Top 5%   Top 20%
All projects     46%      55%
Group 1          47%      63%
Project 1        21%      43%
Project 2        42%      64%
Project 3        41%      55%
Predicting from Dependencies

Input: Dependencies
Model: Support Vector Machine

                Top 5%   Top 20%
Group 1          26%      43%
Project 1        38%      50%
Project 2        36%      46%
Project 3        46%      49%

Stable prediction results across projects
Compare Results

[Bar chart: hit rate (0% to 80%) for Dependencies vs. Metrics, shown for Group 1, Project 1, Project 2, Project 3]

Complexity metrics have higher predictive power
Lessons Learned

                             Nagappan et al.   Schröter et al.   our study
metrics defect correlation         ✔                n/a              ✔
prediction possible                ✔                ✔                ✔
forward prediction                 ✘                ✘                ✔
universal predictor                ✘                ✘                ✘

Predictions based on static code features provide limited results and depend on the project context

Software archives are a reliable and easily accessible source of defect data

Defects have many sources, and code is just one of them
SQS Software Quality Systems AG

Stollwerckstraße 11
51149 Cologne, Germany
Phone: + 49 22 03 91 54 - 7149
Fax: + 49 22 03 91 54 - 15
Email: tilman.holschuh@sqs.de

Internet: www.sqs-group.com
Thank you!
Predicting Defects in SAP Java Code: An Experience Report

  • 1. Predicting Defects in SAP Java Code: An Experience Report, by Tilman Holschuh (SQS AG), Markus Päuser (SAP AG), Kim Herzig (Saarland University), Thomas Zimmermann (Microsoft Research), Rahul Premraj (Vrije University Amsterdam), Andreas Zeller (Saarland University)
  • 7. Motivation. Quality Manager. Problems: Resources, Time, Knowledge
  • 8. Motivation. Quality Manager. Problems: Resources, Time, Knowledge. Where do we put the most effort?
  • 11. Replicated 2 Studies (study 1). Data sources: source code, version archive, bug database
  • 12. Replicated 2 Studies (study 1). Data sources: source code, version archive, bug database. Metrics: McCabe, FanOut, LoC, Coupling
  • 13. Replicated 2 Studies (study 1). Data sources: source code, version archive, bug database. Metrics: McCabe, FanOut, LoC, Coupling. Component Quality
  • 14. Replicated 2 Studies (study 1). Data sources: source code, version archive, bug database. Metrics: McCabe, FanOut, LoC, Coupling. Component Quality. Predictor
  • 15. Replicated 2 Studies (study 2). Data sources: source code, version archive, bug database. Metrics: McCabe, FanOut, LoC, Coupling. Component Quality. Predictor
  • 16. Replicated 2 Studies (study 2). Data sources: source code, version archive, bug database. Metrics: McCabe, FanOut, LoC, Coupling, plus Dependencies. Component Quality. Predictor
  • 17. The Product ‣ SAP Standard Software ‣ Large-scale Java software system (> 10M LoC) ‣ Separated into projects ‣ Service pack release cycles
  • 18. Defect Distribution (graphic created with TreeMap, University of Maryland; see http://www.cs.umd.edu/hcil/treemap)
  • 20. Defect Distribution: 20% of the code contains ~75% of defects
  • 21. Defect Distribution: 20% of the code contains ~75% of defects. Upper bound for prediction
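
The concentration figure on the Defect Distribution slides can be illustrated with a small calculation. The following is a minimal sketch, not the authors' procedure: it weights every component equally (the slide speaks of 20% of the code, which may be measured by size instead), and the defect counts are hypothetical, chosen only to give a round number.

```python
def defect_share_of_top(defects_per_component, fraction=0.2):
    """Share of all defects found in the most defect-prone `fraction` of components.

    Assumption: components are weighted equally, not by lines of code.
    """
    if not defects_per_component:
        raise ValueError("need at least one component")
    ranked = sorted(defects_per_component, reverse=True)
    top_n = max(1, round(len(ranked) * fraction))
    total = sum(ranked)
    return sum(ranked[:top_n]) / total if total else 0.0


# Hypothetical defect counts for ten components (not SAP data).
counts = [40, 35, 8, 5, 4, 3, 2, 2, 1, 0]
print(f"{defect_share_of_top(counts, 0.2):.0%}")  # prints 75%
```

Read this way, a predictor that selects 20% of the components cannot capture more than the share held by the true top 20%, which is one way to understand the "upper bound" remark.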
  • 22. Basics: Input, Predictor Model, Output
  • 23. How to collect Input Data? (1) McCabe, FanOut, LoC, Coupling; (2) Dependencies
  • 24. Collecting Metric Data (1: McCabe, FanOut, LoC, Coupling)
  • 25. Collecting Metric Data (1: McCabe, FanOut, LoC, Coupling) ‣ Metric tools: ckjm, JDepend, ephyra
  • 26. Collecting Metric Data (1: McCabe, FanOut, LoC, Coupling) ‣ Metric tools: ckjm, JDepend, ephyra ‣ Static code checkers: PMD, FindBugs
  • 27. Collecting Metric Data (1: McCabe, FanOut, LoC, Coupling) ‣ Metric tools: ckjm, JDepend, ephyra ‣ Static code checkers: PMD, FindBugs ‣ Change frequency
  • 28. Collecting Dependency Data (2: Dependencies)
  • 29. Collecting Dependency Data (2: Dependencies) ‣ Extracting package import relations
  • 30. Collecting Dependency Data (2: Dependencies) ‣ Extracting package import relations ‣ Tool: JDepend
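
To make "extracting package import relations" concrete, here is a rough sketch that reads import statements from Java source text with regular expressions. It is an approximation for illustration only: the study used JDepend, which this code does not call or imitate, and the function name `import_dependencies` and the sample source are hypothetical. Static imports and the default package are skipped for simplicity.

```python
import re
from collections import defaultdict

# Matches "package a.b.c;" and non-static "import a.b.C;" / "import a.b.*;".
PACKAGE_RE = re.compile(r"^\s*package\s+(\w+(?:\.\w+)*)\s*;", re.MULTILINE)
IMPORT_RE = re.compile(
    r"^\s*import\s+(?!static\b)(\w+(?:\.\w+)*)(\.\*)?\s*;", re.MULTILINE
)


def import_dependencies(sources):
    """Map each declaring package to the set of packages it imports from."""
    deps = defaultdict(set)
    for text in sources:
        declared = PACKAGE_RE.search(text)
        if not declared:
            continue  # default package: skipped in this sketch
        pkg = declared.group(1)
        for name, wildcard in IMPORT_RE.findall(text):
            # "a.b.*" imports from package a.b; "a.b.C" also imports from a.b.
            target = name if wildcard else name.rpartition(".")[0]
            if target and target != pkg:
                deps[pkg].add(target)
    return dict(deps)


example = """
package com.example.app;

import java.util.List;
import java.io.*;
import static java.lang.Math.max;
import com.example.util.Strings;

class Main {}
"""
print(import_dependencies([example]))
```

The resulting package-to-package sets could serve as the dependency features of the second study; how the authors encoded them for the model is not stated on the slides.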
  • 31. How to measure Component Quality? Input ✔, Predictor Model, Output
  • 33. Component Quality. Bug database; version archive
  • 34. Component Quality. Bug database: Bug 42233, FileSystemPreferences lockFile() should close ... Version archive: v1.17, v1.18, v1.19
  • 35. Component Quality. Bug database: Bug 42233, FileSystemPreferences lockFile() should close ... Version archive: v1.17, v1.18, v1.19, with the change "Fixed Bug 42233"
  • 38. Component Quality. "Fixed Bug 42233" on the maintenance branch (v1.17, v1.18, v1.19). Version archive: v1.17, v1.18, v1.19
  • 39. Component Quality. "Fixed Bug 42233" on the maintenance branch (v1.17, v1.18, v1.19): #defects + 1. Version archive: v1.17, v1.18, v1.19
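
The Component Quality slides link a fix recorded in the version archive ("Fixed Bug 42233") to an entry in the bug database and then increment the defect count. A minimal sketch of that linkage follows. The commit representation, the regular expression, and the function name `count_defects` are assumptions made for illustration; the mapping to releases via the maintenance branch is not modeled.

```python
import re
from collections import Counter

# Assumed message convention: the word "bug" followed by a numeric id.
BUG_ID_RE = re.compile(r"\bbug\s*#?\s*(\d+)", re.IGNORECASE)


def count_defects(commits, known_bug_ids):
    """Count defects per file.

    commits: iterable of (message, changed_files) pairs (hypothetical format).
    known_bug_ids: ids that really exist in the bug database.
    """
    defects = Counter()
    for message, changed_files in commits:
        referenced = {int(n) for n in BUG_ID_RE.findall(message)}
        confirmed = referenced & known_bug_ids
        for path in changed_files:
            defects[path] += len(confirmed)  # "#defects + 1" per confirmed fix
    return +defects  # unary plus drops zero counts


commits = [
    ("Fixed Bug 42233", ["FileSystemPreferences.java"]),
    ("Refactoring, no bug", ["FileSystemPreferences.java"]),
    ("fix bug 99999", ["Other.java"]),  # id not in the bug database
]
print(count_defects(commits, {42233}))
# Counter({'FileSystemPreferences.java': 1})
```

Checking each referenced id against the bug database keeps unrelated numbers in commit messages from being counted as defects.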
  • 40. How to build Predictor Models? Linear Regression (Y = Xβ + ε) on McCabe, FanOut, LoC, Coupling. Support Vector Machine on McCabe, FanOut, LoC, Coupling, Dependencies
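
The slide states the linear model Y = Xβ + ε but not how β is estimated. Assuming ordinary least squares (the usual choice, though the slides do not confirm it), with one row of X per component and one column per metric, the estimate and the resulting predictions are:

```latex
\hat{\beta} = (X^{\top} X)^{-1} X^{\top} Y,
\qquad
\hat{Y} = X \hat{\beta}
```

This assumes that X^T X is invertible, that is, that no metric column is an exact linear combination of the others.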
  • 41. Forward Prediction [timeline t with releases V1 and V2: static analysis, training bug data, test bug data]
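
One reading of the Forward Prediction timeline is: fit the model on metrics and bug data of release V1, then apply it to the metrics of V2 and compare against V2's bug data. The sketch below follows that reading with a single metric and a hand-written least-squares fit. The single-metric simplification, the function names, and all numbers are illustrative assumptions, not the authors' setup.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x for one metric."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("metric is constant; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def forward_predict(v1_metric, v1_defects, v2_metric):
    """Train on release V1, then rank V2 components by predicted defects."""
    intercept, slope = fit_line(v1_metric, v1_defects)
    predicted = {name: intercept + slope * x for name, x in v2_metric.items()}
    return sorted(predicted, key=predicted.get, reverse=True)


# Hypothetical data: LoC and defect counts in V1, LoC per package in V2.
v1_loc = [100, 200, 300, 400]
v1_defects = [1, 2, 3, 4]
v2_loc = {"pkg.a": 150, "pkg.b": 500, "pkg.c": 50}
print(forward_predict(v1_loc, v1_defects, v2_loc))
# ['pkg.b', 'pkg.a', 'pkg.c']
```

The ranking produced for V2 is what would then be scored against the test bug data.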
  • 43. Metric Correlations
        Metric                    Level: package (Project 2)   Level: class (Project 4)
        LoC                Sum    0.583                        0.377
        LoC                Max    0.587                        n/a
        McCabe             Sum    0.583                        0.299
        McCabe             Max    0.588                        0.261
        Efferent Coupling         0.608                        n/a
        Design Rules       Sum    0.557                        0.264
        Design Rules       Max    0.578                        n/a
        Changes            Sum    0.308                        0.403
        Changes            Max    0.240                        n/a
  • 44. Metric Correlations: Prediction is more precise at higher granularity levels
  • 45. Hit Rate. Actual ranking: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11. Predicted ranking: 4, 9, 2, 11, 6, 1, 3, 5, 10, 8, 7. Top 20%: hit rate = 50%
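
The hit rate on this slide compares the top of the actual ranking with the top of the predicted ranking. A minimal sketch, assuming both columns list component ids in rank order and that the hit rate is the overlap of the two top-k sets divided by k. Taking k = 4 is an assumption that reproduces the 50% shown; the slide labels the segment "Top 20%".

```python
def hit_rate(actual_ranking, predicted_ranking, k):
    """Fraction of the actual top-k components that are also in the predicted top-k."""
    if k <= 0:
        raise ValueError("k must be positive")
    actual_top = set(actual_ranking[:k])
    predicted_top = set(predicted_ranking[:k])
    return len(actual_top & predicted_top) / k


# Rankings as they appear on the slide.
actual = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
predicted = [4, 9, 2, 11, 6, 1, 3, 5, 10, 8, 7]
print(hit_rate(actual, predicted, 4))  # 0.5: components 2 and 4 are in both top-4 sets
```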
  • 46. Predictions using Linear Regression (inputs: McCabe, FanOut, LoC, Coupling)
                        Top 5%   Top 20%
        All projects    46%      55%
        Group 1         47%      63%
        Project 1       21%      43%
        Project 2       42%      64%
        Project 3       41%      55%
  • 47. Predicting from Dependencies (Support Vector Machine; input: Dependencies)
                        Top 5%   Top 20%
        Group 1         26%      43%
        Project 1       38%      50%
        Project 2       36%      46%
        Project 3       46%      49%
  • 48. Predicting from Dependencies: Stable prediction results across projects
  • 49. Compare Results [bar chart: hit rate (0% to 80%) for Dependencies vs. Metrics, shown for Group 1, Project 1, Project 2, Project 3]
  • 50. Compare Results: Complexity metrics have higher predictive power
  • 51. Lessons Learned
                                      Nagappan et al.   Schröter et al.   our study
        metrics defect correlation    ✔                 n/a               ✔
        prediction possible           ✔                 ✔                 ✔
        forward prediction            ✘                 ✘                 ✔
        universal predictor           ✘                 ✘                 ✘
  • 53. Lessons Learned: Predictions based on static code features provide limited results and depend on the project context.
  • 54. Lessons Learned: Software archives are a reliable and easily accessible source of defect data.
  • 55. Lessons Learned: Defects have many sources, and code is just one of them.
  • 56. SQS Software Quality Systems AG Stollwerckstraße 11 51149 Cologne, Germany Phone: + 49 22 03 91 54 - 7149 Fax: + 49 22 03 91 54 - 15 Email: tilman.holschuh@sqs.de Internet: www.sqs-group.com
  • 57. Thank you!