Mining Cause-Effect-Chains from Version Histories

Kim Herzig & Andreas Zeller, Saarland University, Germany
Funded by a Faculty Grant
Cause Effect Chain




    Initial Change



What is the long-term impact of the initial change, and can we predict it?
What does this have to do with reliability?
Web-Application Development
• Changing live systems
• One large repository
• Direct impact on functionality, stability,
Cause Effect chain

    Initial Change

[Figure: changes C1 ... Cn connected by a dependency edge]
Change Genealogies

Change C2 depends on change C1: change C2 can only be applied after applying C1.

[Figure: two diagrams over T1 and T2, one marked ✓ and one marked ✗]

1. Analyzing source code changes (not every revision can be compiled)
2. Extracting method definitions and method calls from changes (added, modified, deleted)
3. Computing dependencies between method definition and call changes (e.g. call depends on previous definition)

[1] Capturing the long-term impact of changes, Herzig, ICSE '10 Ph.D. Symposium
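
Step 3 above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `Change` record and `compute_dependencies` function are hypothetical names, changes are assumed to arrive in commit order, and only added definitions and added calls are considered (modifications and deletions are ignored).

```python
from dataclasses import dataclass, field

@dataclass
class Change:
    """One change with the method definitions and calls it adds (hypothetical record)."""
    name: str
    added_defs: set = field(default_factory=set)
    added_calls: set = field(default_factory=set)

def compute_dependencies(changes):
    """Return edges (later, earlier): an added call depends on the latest earlier definition."""
    last_def = {}   # method signature -> name of the change that most recently defined it
    edges = set()
    for change in changes:              # assumed to be in commit order
        for method in change.added_calls:
            if method in last_def:
                edges.add((change.name, last_def[method]))
        # Register definitions after the calls so a change never depends on itself.
        for method in change.added_defs:
            last_def[method] = change.name
    return edges

history = [
    Change("C1", added_defs={"A.foo(int)"}),
    Change("C2", added_calls={"A.foo(int)"}),
    Change("C3", added_defs={"B.bar(int)"}, added_calls={"A.foo(int)"}),
]
dependencies = compute_dependencies(history)
```

Here both C2 and C3 end up depending on C1, because each adds a call to a method that C1 defined.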
Change Genealogies

[Figure: example change genealogy over changes C1 to C5, ordered by time]
File 1:  + int A.foo(int)  |  - int A.foo(int), + float A.foo(float)
File 2:  + int B.bar(int)  |  + d = A.foo(5d)
File 3:  + B.bar(5)  |  - x = B.bar(5), + x = A.foo(5f)
File 4:  + d = A.foo(d), + e = A.foo(-1f)
Change Genealogies

 • Graph structure
     ‣ models structural dependencies
 • Directed & acyclic
     ‣ future cannot influence past
 • 2-dimensional (time & space)
     ‣ vertex annotation: changed files, bug

Allows the use of formal methods!
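
As a rough illustration of these properties, a genealogy can be held as annotated vertices plus dependency edges. The dictionary layout and the `respects_time` helper below are assumptions made for this sketch, not the authors' data model. If every edge points from a later change to a strictly earlier one, the graph cannot contain a cycle.

```python
# Vertices carry annotations (time, changed files, bug-fix flag); all values are made up.
genealogy = {
    "C1": {"time": 1, "files": {"A.java"}, "bug_fix": False},
    "C2": {"time": 2, "files": {"B.java"}, "bug_fix": True},
    "C3": {"time": 3, "files": {"A.java", "C.java"}, "bug_fix": False},
}
# Each edge (later, earlier) says: `later` depends on `earlier`.
edges = {("C2", "C1"), ("C3", "C2")}

def respects_time(vertices, dependency_edges):
    """True if every dependency points strictly back in time (which implies acyclicity)."""
    return all(
        vertices[later]["time"] > vertices[earlier]["time"]
        for later, earlier in dependency_edges
    )

valid = respects_time(genealogy, edges)
# An edge from the past to the future violates "future cannot influence past".
invalid = respects_time(genealogy, edges | {("C1", "C3")})
```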
Long Term Couplings
example on genealogy usage

• "developers who changed this artifact also changed ..." (known from Amazon): limited in time!
  [2] Mining Version Histories to Guide Software Changes, Zimmermann, Weißgerber, Diehl, Zeller, ICSE '04
• "this file got frequently changed by ...": limited in space!
  [3] Codebook: Discovering and Exploiting Relationships in Software Repositories, Begel, Phang, Zimmermann, FSE '10
• "changing this artifact always eventually causes ...": does not consider structural dependencies!
  [4] Using multivariate time series and association rules to detect logical change coupling: an empirical study, G. Canfora, M. Ceccarelli, L. Cerulo, and M. Di Penta, ICSM 2010
Long Term Couplings
example on genealogy usage

changing this artifact always eventually causes ... (within time window)

    $a_i \Rightarrow \mathrm{EF}\ a_j$
    Computation Tree Logic (CTL)
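
To make the EF operator concrete, here is a minimal sketch of checking it on a small graph by depth-first search. It assumes the standard CTL reading, in which EF p holds in a state if some path from that state (including the state itself) reaches a state labeled p. The function name `holds_ef` and the example graph are hypothetical.

```python
def holds_ef(successors, labels, start, target):
    """CTL EF: some path from `start` reaches a state whose label set contains `target`."""
    stack, seen = [start], set()
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        if target in labels[state]:
            return True
        stack.extend(successors.get(state, ()))
    return False

# Edges point forward in time: from a change to the changes that depend on it.
successors = {"C1": ["C2"], "C2": ["C3"], "C3": []}
labels = {"C1": {"a_i"}, "C2": {"other"}, "C3": {"a_j"}}

forward = holds_ef(successors, labels, "C1", "a_j")
backward = holds_ef(successors, labels, "C3", "a_i")
```

Starting at the change that touches a_i, a change touching a_j is reachable, so the rule holds for this graph; the reverse direction does not.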
Model Checking Genealogies

[Figure: change genealogy graph with vertices A, B, C, D, E]

1. Extract change genealogy from version archive
2. Extract valid CTL rules using model checking
3. Transform frequently occurring rules into recommendations

Recommendations:
  1. $a_1 \Rightarrow \mathrm{EF}\ a_2$
  2. $a_1 \Rightarrow \mathrm{EF}\ (a_2 \wedge a_3)$
  3. $a_1 \Rightarrow \mathrm{AG}\ (a_2 \Rightarrow \mathrm{EF}\ a_3)$
  ...
Recommendation Generation

[Figure: changes C1, C2, C3 on a time axis, with a time window marked]

    C1       C2        C3

Extract subgraph and add final state:

    S        C2        C3     F

Change labels with names of corresponding changed files:

    F1,F2    F2        F1,F3  F
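
The construction above might look roughly like the following. This is a sketch under assumptions: `build_structure` and its parameters are hypothetical, times are plain numbers, the root change plays the role of the start state S, and the final state is given a self-loop so that every state has a successor, as a Kripke structure requires.

```python
def build_structure(root, successors, times, files, window):
    """Return (transitions, labels) for the subgraph reachable from `root` within `window`."""
    deadline = times[root] + window
    transitions, labels = {}, {}
    stack = [root]
    while stack:
        change = stack.pop()
        if change in transitions:
            continue
        inside = [c for c in successors.get(change, ()) if times[c] <= deadline]
        # Changes with no successor inside the window lead to the final state "F".
        transitions[change] = inside or ["F"]
        # States are labeled with the names of the files the change touched.
        labels[change] = set(files[change])
        stack.extend(inside)
    transitions["F"] = ["F"]    # self-loop keeps the transition relation total
    labels["F"] = set()
    return transitions, labels

successors = {"C1": ["C2"], "C2": ["C3"], "C3": ["C4"]}
times = {"C1": 0, "C2": 3, "C3": 6, "C4": 30}
files = {"C1": ["F1", "F2"], "C2": ["F2"], "C3": ["F1", "F3"], "C4": ["F4"]}
transitions, labels = build_structure("C1", successors, times, files, window=10)
```

With a window of 10, C4 falls outside and C3 is connected to the final state instead.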
Using CTL Templates

    F1,F2   F2   F1,F3   F

CTL Templates:
  $\mathrm{EF}\ F_x$  ✔
  $\mathrm{EF}\ (F_x \wedge F_y)$
  $(\mathrm{EF}\ F_x) \wedge (\mathrm{EF}\ F_y)$
  $\mathrm{AG}\ (F_x \Rightarrow \mathrm{EF}\ F_y)$

Recommendations:
  ...
  $F_1 \Rightarrow \mathrm{EF}\ F_3$
  $F_2 \Rightarrow \mathrm{EF}\ F_3$
  ...
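
A hedged sketch of how the simplest template could be instantiated on the labeled structure shown above. Only the EF F_x template is covered, with the files of the start state as premises. The helper names are hypothetical, and the sketch uses standard EF semantics, which also counts the start state itself; how the authors treat such trivial instantiations is not stated on the slides.

```python
from itertools import product

def holds_ef(transitions, labels, start, target):
    """CTL EF: some path from `start` reaches a state labeled with `target`."""
    stack, seen = [start], set()
    while stack:
        state = stack.pop()
        if state in seen:
            continue
        seen.add(state)
        if target in labels[state]:
            return True
        stack.extend(transitions[state])
    return False

def mine_ef_rules(transitions, labels, start):
    """Instantiate `F_x => EF F_y` for every file x of the start state and every other file y."""
    all_files = set().union(*labels.values())
    return {
        (x, y)
        for x, y in product(labels[start], all_files)
        if x != y and holds_ef(transitions, labels, start, y)
    }

transitions = {"S": ["C2"], "C2": ["C3"], "C3": ["F"], "F": ["F"]}
labels = {"S": {"F1", "F2"}, "C2": {"F2"}, "C3": {"F1", "F3"}, "F": set()}
rules = mine_ef_rules(transitions, labels, "S")
```

The result contains the pairs (F1, F3) and (F2, F3), which correspond to the two recommendations listed above.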
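Recommendation Ranking

    $\langle\text{premise}\rangle \Rightarrow \langle\text{implication}\rangle$

    $\mathrm{support}(F) = \#\,\text{Kripke structures in which formula } F \text{ evaluated true}$

    $\mathrm{confidence}(F) = \dfrac{\mathrm{support}(F)}{\#\,\text{times premise true}}$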
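
The two measures can be computed as in the sketch below. One assumption is made explicit here: support counts only structures in which the premise holds and the implication is satisfied, so that confidence stays between 0 and 1. The observation list is invented for illustration.

```python
def support(observations):
    """Number of structures where the premise holds and the implication is satisfied."""
    return sum(1 for premise, implication in observations if premise and implication)

def confidence(observations):
    """Support divided by the number of structures in which the premise holds."""
    premise_true = sum(1 for premise, _ in observations if premise)
    return support(observations) / premise_true if premise_true else 0.0

# One (premise_true, implication_true) pair per Kripke structure; made-up data.
observations = [(True, True)] * 20 + [(True, False)] * 5 + [(False, False)] * 5

rule_support = support(observations)
rule_confidence = confidence(observations)
```

For this data the premise holds 25 times and the rule is satisfied 20 times, giving a support of 20 and a confidence of 0.8.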
Conditional Rules
• Genealogy vertices annotated with change properties
  ‣ bug fix, big change, modified definition, authors, dependency types, ...

    $a_i \wedge \text{"is fix"} \Rightarrow \mathrm{EF}\ a_j$

• Certain rules only become important under certain conditions!
Recommendations
Examples from JRuby

  $\mathit{VariableCompiler} \Rightarrow \mathrm{EF}\ \mathit{StandardASMCompiler}$
• Never changed together, support=20, confidence=0.8

  $\mathit{MainTestSuite} \Rightarrow (\mathrm{EF}\ \mathit{RubyObject})$

  $\mathit{RubyIO} \Rightarrow ((\mathrm{EF}\ \mathit{RubyStructure}) \wedge (\mathrm{EF}\ \mathit{Visibility}))$
• support=20, confidence=0.5 / if bug fix: confidence=0.8
Experimental Setup

[Figure: project history timeline with an initial 10% training phase]

• Use the top 3 ranked recommendations (ranked by confidence, support) to predict files that will change in time window.
• Thresholds: support > 2, confidence ≥ 0.5
• Use formulas for further training
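
The selection step could be sketched as follows. The rule records and the `top_recommendations` function are hypothetical, and the sketch assumes that "ranked by confidence, support" means confidence first with support as the tie-breaker.

```python
def top_recommendations(rules, k=3, min_support=2, min_confidence=0.5):
    """Keep rules with support > min_support and confidence >= min_confidence, return top k files."""
    eligible = [
        r for r in rules
        if r["support"] > min_support and r["confidence"] >= min_confidence
    ]
    # Assumed ordering: higher confidence first, then higher support.
    eligible.sort(key=lambda r: (r["confidence"], r["support"]), reverse=True)
    return [r["file"] for r in eligible[:k]]

# Made-up rules, each predicting that `file` will change within the time window.
rules = [
    {"file": "A.java", "support": 20, "confidence": 0.8},
    {"file": "B.java", "support": 2, "confidence": 0.9},   # dropped: support not > 2
    {"file": "C.java", "support": 5, "confidence": 0.8},
    {"file": "D.java", "support": 10, "confidence": 0.5},
    {"file": "E.java", "support": 4, "confidence": 0.4},   # dropped: confidence < 0.5
    {"file": "G.java", "support": 3, "confidence": 0.6},
]
predicted = top_recommendations(rules)
```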
Benchmark Model

Constantly predicts the three top most changed files to change again!
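
A minimal sketch of such a baseline, assuming each past commit is available as a list of changed file names. The function name and the sample history are hypothetical.

```python
from collections import Counter

def benchmark_prediction(past_commits, k=3):
    """Always predict the k files that were changed most often so far."""
    counts = Counter(name for commit in past_commits for name in commit)
    return [name for name, _ in counts.most_common(k)]

# Made-up history: a changes 4 times, b 3 times, c twice, d once.
past_commits = [["a", "b"], ["a", "c"], ["a", "b", "d"], ["b", "c"], ["a"]]
baseline = benchmark_prediction(past_commits)
```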
Precision of Recommendations

    $\text{precision} = \dfrac{\#\text{true positives}}{\#\text{true positives} + \#\text{false positives}}$

    true positives: file changes predicted and applied within time window
    false positives: file changes predicted but not applied within time window

    $\text{recall} = \dfrac{\#\text{true positives}}{\#\text{true positives} + \#\text{false negatives}}$

    false negatives: file changes not predicted but applied within time window

Recall is very low by nature! This measure does not make sense here!

How much of a system's future evolution can be predicted from the past?
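
Both measures follow directly from the sets of predicted and actually changed files. The sketch below uses invented file names and a hypothetical helper, and shows why recall stays low when only three files are predicted but many files change within the window.

```python
def precision_recall(predicted, changed):
    """Compute precision and recall from the predicted and the actually changed files."""
    true_positives = len(predicted & changed)
    false_positives = len(predicted - changed)
    false_negatives = len(changed - predicted)
    precision = true_positives / (true_positives + false_positives) if predicted else 0.0
    recall = true_positives / (true_positives + false_negatives) if changed else 0.0
    return precision, recall

predicted = {"A", "B", "C"}                          # top 3 recommendations
changed = {"A", "B", "D", "E", "F", "G", "H", "I"}   # files changed within the window
precision, recall = precision_recall(predicted, changed)
```

Two of the three predictions are correct, so precision is 2/3, while recall is only 2/8 because six changed files were never predicted.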
Project Details

                 ArgoUML     Jaxen      JRuby      XStream
history          12 years    9 years    9 years    7 years
#transactions    16,481      1,353      11,060     1,683
#authors         50          20         66         11
#files           16,658      9,831      15,029     1,188
#LTC >0.7        94          10         99         19
#vertices        8,716       1,330      11,055     1,680
∅ out degree     8           7          10         5
time window      16 days     9 days     8 days     4 days

time window = median gap between vertex and youngest child
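
The time window definition can be sketched as below. The sketch assumes that "youngest child" means the most recent change that depends on the vertex and that vertices without children are skipped; the function name and the timestamps (in days) are made up.

```python
from statistics import median

def time_window(times, children):
    """Median gap between each vertex and its youngest (latest) child."""
    gaps = [
        max(times[child] for child in kids) - times[vertex]
        for vertex, kids in children.items()
        if kids   # vertices without children contribute no gap
    ]
    return median(gaps)

times = {"C1": 0, "C2": 4, "C3": 10, "C4": 12, "C5": 30}
children = {"C1": ["C2", "C3"], "C2": ["C4"], "C3": ["C5"], "C4": []}
window = time_window(times, children)
```

The gaps here are 10, 8, and 20 days, so the window is 10 days.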
How Precise are our Predictions?

[Figure: two bar charts comparing CTL and Benchmark for ArgoUML, Jaxen, JRuby, and XStream. Left panel: Precision (axis 0 to 0.8, bar labels between 0.3 and 0.7). Right panel: Avg. rank of highest hit (axis 0 to 2.4, bar labels between 1.8 and 2.4).]
Including Inner-Commit Rules

[Figure: two bar charts comparing "with inner rules", "w/o inner rules", and Benchmark for ArgoUML, Jaxen, JRuby, and XStream. Left panel: Precision (axis 0 to 0.8, bar labels between 0.3 and 0.7). Right panel: Avg. rank of highest hit (axis 0 to 2.4, bar labels between 1.8 and 2.4).]
% Commits all Predictions True

[Figure: bar chart comparing "w/o inner-commit rules" and "with inner-commit rules" for ArgoUML, Jaxen, JRuby, and XStream (axis 0 to 70, bar labels between 43.8 and 68.8).]
Mining Cause Effect Chains from Version Archives - ISSRE 2011
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 

Mining Cause Effect Chains from Version Archives - ISSRE 2011

  • 1. Mining Cause-Effect-Chains from Version Histories. Kim Herzig & Andreas Zeller, from Saarland University, Germany. Funded by Faculty Grant.
  • 2-5. Cause Effect Chain: Initial Change. What is the long-term impact of the initial change, and can we predict it?
  • 6-7. What does this have to do with reliability?
  • 8-9. Web-Application Development • Changing live systems • One large repository • Direct impact on functionality, stability,
  • 10-11. Cause Effect Chain: Initial Change, followed by changes C1 to Cn, linked by dependency edges.
  • 12. Change Genealogies. Change C2 depends on change C1: change C2 can only be applied after applying C1. Steps: (1) analyzing source code changes (not every revision can be compiled); (2) extracting method definitions and method calls from changes (added, modified, deleted); (3) computing dependencies between method definition and call changes (e.g. a call depends on a previous definition). [1] Capturing the long-term impact of changes, Herzig, ICSE '10 Ph.D. Symposium
  • 13-14. Change Genealogies. [Figure: four files (File 1 to File 4) changed by transactions C1 to C5 along a time axis. Changes shown: + int A.foo(int); - int A.foo(int); + float A.foo(float); + int B.bar(int); + d = A.foo(5d); - x = B.bar(5); + B.bar(5); + x = A.foo(5f); + d = A.foo(d); + e = A.foo(-1f)]
  • 15-16. Change Genealogies: allows the use of formal methods! • Graph structure ‣ models structural dependencies • Directed & acyclic ‣ future cannot influence past • 2-dimensional (time & space) ‣ vertex annotation: changed files, bug
  • 17-23. Long Term Couplings (example on genealogy usage), known from Amazon:
      ‣ "developers who changed this artifact also changed ..." [2] Mining Version Histories to Guide Software Changes, Zimmermann, Weißgerber, Diehl, Zeller, ICSE '04. Limited in time!
      ‣ "this file got frequently changed by ..." [3] Codebook: Discovering and Exploiting Relationships in Software Repositories, Begel, Phang, Zimmermann, FSE '10. Limited in space!
      ‣ "changing this artifact always eventually causes ..." (within time window) [4] Using multivariate time series and association rules to detect logical change coupling: an empirical study, G. Canfora, M. Ceccarelli, L. Cerulo, and M. Di Penta, ICSM 2010. Does not consider structural dependencies!
  • 24-25. Long Term Couplings (example on genealogy usage): "changing this artifact always eventually causes ..." within a time window, written in Computation Tree Logic (CTL) as a_i ⇒ EF a_j
  • 26-28. Model Checking Genealogies: (1) extract the change genealogy from the version archive; (2) extract valid CTL rules using model checking; (3) transform frequently occurring rules into recommendations. Example recommendations: 1. a1 ⇒ EF a2; 2. a1 ⇒ EF (a2 ∧ a3); 3. a1 ⇒ AG (a2 ⇒ EF a3); ...
  • 29-32. Recommendation Generation: changes C1, C2, C3 on a time axis within a time window. Extract the subgraph and add a final state (figure: S, C2, C3, F). Change labels to the names of the corresponding changed files (figure: {F1,F2}, {F2}, {F1,F3}, F).
  • 33-36. Using CTL Templates on the labeled structure ({F1,F2}, {F2}, {F1,F3}, F). CTL templates: EF Fx ✔; EF (Fx ∧ Fy); (EF Fx) ∧ (EF Fy); AG (Fx ⇒ EF Fy). Resulting recommendations: ..., F1 ⇒ EF F3, F2 ⇒ EF F3, ...
  • 37. Recommendation Ranking: rules have the form <premise> ⇒ <implication>. confidence(F) = support(F) / (# times premise true); support(F) = # Kripke structures in which formula F evaluated true.
  • 38-40. Conditional Rules • Genealogy vertices annotated with change properties ‣ bug fix, big change, modified definition, authors, dependency types, ... Example: a_i ∧ "is fix" ⇒ EF a_j • Certain rules become important only under certain conditions!
  • 41-43. Recommendations: Examples from JRuby. VariableCompiler ⇒ EF StandardASMCompiler (never changed together, support=20, confidence=0.8); MainTestSuite ⇒ (EF RubyObject); RubyIO ⇒ ((EF RubyStructure) ∧ (EF Visibility)) (support=20, confidence=0.5; if bug fix: confidence=0.8)
  • 44-47. Experimental Setup: training phase on 10% of the project history; use the top 3 ranked recommendations (ranked by confidence, support; support > 2, confidence ≥ 0.5) to predict files that will change in the time window; use formulas for further training.
  • 48. Benchmark Model: constantly predicts the three topmost changed files to change again!
  • 49-52. Precision of Recommendations: precision = #true positives / (#true positives + #false positives); recall = #true positives / (#true positives + #false negatives). True positives: file changes predicted and applied within time window. False positives: file changes predicted but not applied within time window. False negatives: file changes not predicted but applied within time window. How much of a system's future evolution can be predicted from the past? Recall is very low by nature! This measure does not make sense here!
  • 53. Project Details (time window = median gap between vertex and youngest child)
                        ArgoUML    Jaxen      JRuby      XStream
      history           12 years   9 years    9 years    7 years
      #transactions     16,481     1,353      11,060     1,683
      #authors          50         20         66         11
      #files            16,658     9,831      15,029     1,188
      #LTC >0.7         94         10         99         19
      #vertices         8,716      1,330      11,055     1,680
      ∅ out degree      8          7          10         5
      time window       16 days    9 days     8 days     4 days
  • 54-55. How Precise are our Predictions? [Bar charts: precision, and average rank of highest hit, CTL vs. benchmark, for ArgoUML, Jaxen, JRuby, XStream]
  • 56. Including Inner-Commit Rules. [Bar charts: precision, and average rank of highest hit, with inner rules vs. w/o inner rules vs. benchmark, for ArgoUML, Jaxen, JRuby, XStream]
  • 57. % of commits for which all predictions are true. [Bar chart: w/o inner-commit rules vs. with inner-commit rules, for ArgoUML, Jaxen, JRuby, XStream]
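
The rule mining and ranking described on slides 24 to 37 can be illustrated with a minimal Python sketch. It is not the authors' implementation. The dictionary encoding of a Kripke structure, the names `holds_ef` and `mine_rules`, the chain-shaped example structure, and the choice to count support only when the premise holds in the start state are all assumptions made for illustration. Only the template a ⇒ EF b is covered, with standard CTL semantics (EF also holds if the label is already present in the current state).

```python
from collections import Counter

# Assumed encoding: a Kripke structure is a dict mapping
# state name -> (set of file labels, list of successor states).
# "S" is the start state (the initial change), "F" the added final state.


def holds_ef(structure, state, label, _seen=None):
    """CTL 'EF label': some path from `state` reaches a state carrying `label`."""
    seen = set() if _seen is None else _seen
    if state in seen:
        return False
    seen.add(state)
    labels, successors = structure[state]
    if label in labels:
        return True
    return any(holds_ef(structure, nxt, label, seen) for nxt in successors)


def mine_rules(structures, start="S"):
    """Instantiate the template a => EF b and rank the resulting rules.

    support(a => EF b): number of structures where a labels the start
                        state and EF b holds there (assumption of this sketch).
    confidence:         support / number of structures where the premise holds.
    """
    premise_count = Counter()
    support = Counter()
    for ks in structures:
        start_labels, _ = ks[start]
        all_labels = set().union(*(labels for labels, _ in ks.values()))
        for a in start_labels:
            premise_count[a] += 1
            for b in all_labels - {a}:
                if holds_ef(ks, start, b):
                    support[(a, b)] += 1
    rules = [(a, b, s, s / premise_count[a]) for (a, b), s in support.items()]
    # Rank by confidence first, then by support, as on the slides.
    rules.sort(key=lambda r: (-r[3], -r[2], r[0], r[1]))
    return rules


# Labels taken from slide 32; the chain-shaped edges are an assumption.
example = {
    "S": ({"F1", "F2"}, ["C2"]),
    "C2": ({"F2"}, ["C3"]),
    "C3": ({"F1", "F3"}, ["F"]),
    "F": (set(), []),
}
# A second, hypothetical structure in which only F1 changes.
second = {
    "S": ({"F1"}, ["F"]),
    "F": (set(), []),
}

rules = mine_rules([example, second])
confidence = {(a, b): c for a, b, _, c in rules}
```

In this toy input, F2 ⇒ EF F3 holds in every structure where F2 is changed, while F1 ⇒ EF F3 holds in only one of the two structures where F1 is changed, so the former ranks higher. The thresholds from the experimental setup (support > 2, confidence ≥ 0.5) would be applied as a filter on the returned list.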

Editor's Notes

  19-23. Imagine the following project history: 5 transactions, 4 files (simple). [changes] Each transaction gets a node. Dependencies between transactions get edges. [explain edges]
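
The construction described in the note (each transaction gets a node, dependencies between transactions get edges) can be sketched in Python as follows. This is a simplified illustration, not the authors' tool. The function name `build_genealogy`, the tuple layout of a transaction, and the five-transaction history are hypothetical, and the only dependency type modeled is "a call depends on the most recent earlier change to the called method's definition".

```python
def build_genealogy(transactions):
    """Build a change genealogy from transactions given in commit order.

    transactions: list of (name, defined_methods, called_methods), where
    defined_methods are method definitions added or modified by the
    transaction and called_methods are method calls it adds.
    Returns a dict: transaction name -> set of earlier transactions it
    depends on.
    """
    last_definition = {}  # method signature -> transaction that last changed it
    depends_on = {}
    for name, defined, called in transactions:
        deps = set()
        for method in called:
            origin = last_definition.get(method)
            if origin is not None:
                deps.add(origin)  # edge: this transaction depends on `origin`
        depends_on[name] = deps
        # Record definitions only after resolving calls, so every edge
        # points to a strictly earlier transaction.
        for method in defined:
            last_definition[method] = name
    return depends_on


# Hypothetical history with 5 transactions (not the exact figure from the slides).
history = [
    ("C1", ["A.foo(int)"], []),
    ("C2", ["B.bar(int)"], ["A.foo(int)"]),
    ("C3", ["A.foo(float)"], ["B.bar(int)"]),
    ("C4", [], ["A.foo(float)", "B.bar(int)"]),
    ("C5", [], ["A.foo(float)"]),
]

genealogy = build_genealogy(history)
for change, deps in genealogy.items():
    print(change, "depends on", sorted(deps))
```

Because edges only ever point from a later transaction to an earlier one, the resulting graph is directed and acyclic, which matches the slide's remark that the future cannot influence the past.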