Software metrics are usually right-skewed


[Figure: Histogram of SLOC(org.argouml.ui). x-axis: SLOC for classes in org.argouml.ui (0 to 500); y-axis: Frequency (0 to 25).]
2/11




Aggregation of software metrics using the
          “softnometric” index

               Bogdan Vasilescu
         b.n.vasilescu@student.tue.nl

          Eindhoven University of Technology
                  The Netherlands


                 March 9, 2011
Aggregation techniques                                          3/11

Classical:
    Mean
    Sum
    Cardinality

Distribution fitting:
    Log-normal
    Exponential
    Negative binomial

Inequality indices:
    Theil
    Gini
    Kolm
    Atkinson
Gini index                                                            4/11

The Gini index is based on the Lorenz curve:
     proportion of the total income of the population (y-axis)
     cumulatively earned by the bottom x% of the people.
     0     perfect equality: every person receives the same income.
     1     perfect inequality: one person receives all the income.
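$I_{Gini}(X) = \frac{A}{A + B}$
     where A is the area between the line of perfect equality and the Lorenz curve,
     and B is the area under the Lorenz curve.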
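
The ratio of areas can be sketched numerically as follows. This is a minimal illustration, not the author's tooling: the function names `lorenz_curve` and `gini_from_lorenz` are hypothetical, the inputs are assumed to be non-negative with a positive sum, and the area under the Lorenz curve is approximated with trapezoids.

```python
def lorenz_curve(values):
    """Cumulative share of the total held by the bottom k items, k = 0..n."""
    xs = sorted(values)
    total = sum(xs)
    points = [0.0]
    running = 0
    for x in xs:
        running += x
        points.append(running / total)
    return points


def gini_from_lorenz(values):
    """Gini index as A / (A + B).

    B is the area under the Lorenz curve (trapezoid rule);
    A is the area between the diagonal of perfect equality and the curve.
    """
    curve = lorenz_curve(values)
    n = len(curve) - 1
    area_b = sum((curve[k - 1] + curve[k]) / (2 * n) for k in range(1, n + 1))
    area_a = 0.5 - area_b
    return area_a / (area_a + area_b)


# Hypothetical SLOC values for the classes of one package.
print(gini_from_lorenz([5, 5, 5, 5]))   # all classes equally large
print(gini_from_lorenz([0, 0, 0, 8]))   # one class holds all the code
```

With a finite population of n items this estimate tops out at 1 − 1/n rather than exactly 1, so the second call approaches the "perfect inequality" end of the scale only as n grows.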
Theoretical comparison                                                     5/11




Criteria:
     Domain → determines applicability

     Range → determines interpretation
     Invariance
        •   w.r.t. addition → LOC, ignore headers
        •   w.r.t. multiplication → LOC, percentages vs. absolute values

     Decomposability → explain inequality by partitioning the
     population into groups
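
The two invariance criteria can be checked on a small example. The sketch below is an illustration under stated assumptions: the SLOC values are made up, the Gini index uses the standard sorted-rank form, and the Kolm index uses the standard definition with a hypothetical parameter `alpha=0.01`; none of this is taken from the slides. Adding 20 lines to every class models a fixed header per file, and multiplying by 3 models a change of unit.

```python
from math import exp, log, isclose


def gini(values):
    """Gini index of non-negative values with a positive sum (sorted-rank form)."""
    xs = sorted(values)
    n = len(xs)
    return sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1)) / (n * sum(xs))


def kolm(values, alpha=0.01):
    """Kolm index: (1/alpha) * ln(mean of exp(alpha * (mu - x)))."""
    n = len(values)
    mu = sum(values) / n
    return log(sum(exp(alpha * (mu - x)) for x in values) / n) / alpha


sloc = [12, 30, 45, 80, 400]          # hypothetical class sizes
scaled = [3 * x for x in sloc]        # multiplication by a constant
shifted = [x + 20 for x in sloc]      # addition of a constant (e.g. a header)

# Gini is unchanged by scaling but changes when a constant is added.
print(isclose(gini(scaled), gini(sloc)), gini(shifted) < gini(sloc))
# Kolm is unchanged by adding a constant but changes under scaling.
print(isclose(kolm(shifted), kolm(sloc)), isclose(kolm(scaled), kolm(sloc)))
```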
Theoretical comparison                                             6/11




Agg. technique   Domain   Range          Invariance   Decomposability
Mean             R        R              -            N/A
Sum              R        R              -            N/A
Cardinality      R        N              -            N/A
Gini Index       R+       [0, 1]         mult.        -
                 R        R              mult.        -
Theil Index      R+       [0, log n]     mult.        yes
Kolm Index       R        R+             add.         yes
Atkinson Index   R+       [0, 1 − 1/n]   mult.        -
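
The range and decomposability entries for the Theil index can be illustrated with a short sketch. It assumes the standard Theil T definition, T = (1/n) Σ (x_i/μ) ln(x_i/μ), which the slides do not spell out; the function names and the sample values are hypothetical. Zero values are treated as contributing nothing, following the usual limit convention.

```python
from math import log, isclose


def theil(values):
    """Theil T index; zero values contribute 0 by the limit r*ln(r) -> 0."""
    n = len(values)
    mu = sum(values) / n
    return sum((x / mu) * log(x / mu) for x in values if x > 0) / n


def theil_decomposition(groups):
    """Split the overall Theil index into within-group and between-group parts."""
    everything = [x for group in groups for x in group]
    n = len(everything)
    mu = sum(everything) / n
    within = 0.0
    between = 0.0
    for group in groups:
        share = sum(group) / (n * mu)        # group's share of the total
        mu_g = sum(group) / len(group)       # group mean
        within += share * theil(group)
        between += share * log(mu_g / mu)
    return within, between


# Range: 0 for perfect equality, ln(n) when one item holds everything.
print(theil([5, 5, 5]), isclose(theil([0, 0, 0, 8]), log(4)))

# Decomposability: within + between reproduces the overall index.
groups = [[10, 20, 30], [100, 400]]        # e.g. classes split into two packages
within, between = theil_decomposition(groups)
print(isclose(within + between, theil([x for g in groups for x in g])))
```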
Empirical comparison                                                  7/11




Research questions:

    Does LOC relate to bugs?

    Do the aggregation techniques influence the presence/strength of
    this relation?

    Is there any difference between the aggregation techniques?
    Do they express the same thing?
Empirical comparison                                    8/11




Case study: ArgoUML
    Open-source, ∼ 1200 Java classes, ∼ 100 packages.

Methodology:
    Tool chain to automatically process issue tracker and version
    control system data.
    Mapped defects to Java classes and then packages.
    Measured SLOC of each class, aggregated to package level.
    For each aggregation technique, statistically studied correlation
    with bugs.
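
The last two steps of the methodology can be sketched as below. Everything here is an assumption made for illustration: the class names, SLOC values and defect counts are invented (not ArgoUML data), the helper names are hypothetical, and the Pearson coefficient is used only because the slides do not state which correlation statistic was applied.

```python
from collections import defaultdict
from math import sqrt


def gini(values):
    """Gini index of non-negative values with a positive sum (sorted-rank form)."""
    xs = sorted(values)
    n = len(xs)
    return sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1)) / (n * sum(xs))


def sloc_by_package(class_sloc):
    """Group class-level SLOC by package (the part before the last dot)."""
    packages = defaultdict(list)
    for class_name, sloc in class_sloc.items():
        packages[class_name.rpartition(".")[0]].append(sloc)
    return dict(packages)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical measurements, NOT taken from ArgoUML.
class_sloc = {
    "org.example.ui.A": 10, "org.example.ui.B": 10, "org.example.ui.C": 400,
    "org.example.model.D": 50, "org.example.model.E": 60,
    "org.example.util.F": 20, "org.example.util.G": 100,
}
defects_per_package = {"org.example.ui": 7, "org.example.model": 1, "org.example.util": 3}

packages = sloc_by_package(class_sloc)
names = sorted(packages)
defects = [defects_per_package[p] for p in names]

# Aggregate SLOC per package with each technique, then correlate with defects.
for label, aggregate in (("mean", lambda v: sum(v) / len(v)), ("gini", gini)):
    scores = [aggregate(packages[p]) for p in names]
    print(label, round(pearson(scores, defects), 3))
```

Swapping in another aggregation technique only requires adding one more `(label, function)` pair to the loop.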
Results                                                                                                         9/11




             I_Gini   I_Theil   I_Kolm    I_Atkinson   defects
mean         0.170    0.192     0.676 ¹   0.203        0.0096
I_Gini                0.908     0.467     0.903        0.27
I_Theil                         0.488     0.918        0.273
I_Kolm                                    0.501        0.119
I_Atkinson                                             0.229

     I_Gini, I_Theil and I_Atkinson show the strongest correlation with the
     number of defects, and these correlations are statistically significant.
     However, these three indices are also highly and statistically
     significantly correlated with each other.
     Mean shows the lowest correlation with the number of defects.

 ¹ Statistically significant correlations, with two-sided p-values not exceeding 0.01, are typeset in boldface on the original slide.
Threats to validity                                                  10/11




No control over the issue tracker → mapping of defects to classes.
    bugs missing from the issue tracker.
    bug fixes not showing up in the commit log.

How representative is the case? How about the version?
    replicate on more systems and more versions.

Is LOC the most suitable metric?
    replicate with more metrics.
Conclusions                                                                                                                                              11/11


Software metrics are not distributed normally.

[Figure: Histogram of SLOC(org.argouml.ui). x-axis: SLOC for classes in org.argouml.ui (0 to 500); y-axis: Frequency (0 to 25).]

Theoretical comparison.

Agg. technique   Domain   Range          Invariance   Decomposability
Mean             R        R              -            N/A
Sum              R        R              -            N/A
Cardinality      R        N              -            N/A
Gini Index       R+       [0, 1]         mult.        -
                 R        R              mult.        -
Theil Index      R+       [0, log n]     mult.        yes
Kolm Index       R        R+             add.         yes
Atkinson Index   R+       [0, 1 − 1/n]   mult.        -

Empirical comparison.

             Gini     Theil    Kolm     Atkinson   defects
mean         0.170    0.192    0.676    0.203      0.0096
Gini                  0.908    0.467    0.903      0.27
Theil                          0.488    0.918      0.273
Kolm                                    0.501      0.119
Atkinson                                           0.229

Classical aggregation techniques have problems when distributions are
skewed. Inequality indices look more promising.

Benevol 2010
