Change-Based Test Selection in the Presence of Developer Tests

Quinten David Soetens
Serge Demeyer
Andy Zaidman

Buongiorno, my name is Quinten, and I’m from the University of Antwerp.

I will be showing a technique that we investigated to reduce the size of a test suite.
Test Suites Grow

As a Software System grows, so does its Test Suite. And they can grow very large indeed!

We talked to a couple of companies in industry and they confirmed that this is indeed a relevant problem for them. ... Why?
Takes Hours to Run

Because large test suites lead to inefficient testing: it takes too long to run all the tests.

One company we talked to mentioned that their tests take up to 13 hours to run. They start the tests in the evening and when
they come back in the morning for their daily standup scrum meeting the testing is still going on.

This in turn leads to delays: delays in executing the tests as well as delays in updating the test cases.

This leads to reduced test coverage and longer feedback cycles.
It takes a lot longer for developers to find out whether their code is good or not.
Run Tests in Parallel

One solution to this problem could be to run the tests in parallel to save time.

For instance, another company that we talked to had tests that ran for 8 hours. Their solution was to run the tests in parallel on 16 different machines, effectively reducing the runtime of their test suite from 8 hours to half an hour. In my opinion that is still a long time to wait, especially for a developer who just wants to check whether their code is OK.
Which tests should I run when changing this method?

In light of this, developers are faced with a problem:

      Which test(s) should they run when changing a particular part of a system?

Currently developers use their own gut feeling, common knowledge in the company, or the expert knowledge of a colleague to select a subset of tests that could be relevant for the code they are working on. However, tool support to aid in this task is desirable.

We therefore need to find which tests are relevant for that particular change. We can do this when we have recorded the fine-grained changes made during development.
ChEOPSJ

[Architecture diagram. Components shown: Applications, TestSelection, Model, ChangeRecorders, Logger, Distiller, ChangeDistiller, SVNKit]

ChEOPSJ: Change-Based Test Optimization
Quinten David Soetens and Serge Demeyer
In Proceedings of the 16th European Conference on Software Maintenance and Reengineering, CSMR 2012

This approach was implemented in a tool called ChEOPSJ, which I presented at last year's CSMR.

We consider changes made to the source code as first-class objects: tangible entities that we can analyze and manipulate.

Basically, it’s a tool that can record changes in the background while you are programming.
To work with real-world cases, we also have the capability of recovering changes from source code repositories.

Once a change model is instantiated for a system, we can analyze it and run different applications (for now only the test selection application).
First Class Change Objects

Changes act on Source Code (FAMIX) Entities
(e.g. AddClassChange, AddMethodChange, etc.)

For instance, adding a new class will result in an AddClassChange.
First Class Change Objects

Changes have Structural Dependencies
(e.g. AddMethod ---> AddClass ---> AddPackage etc.)

We can also define dependencies between these changes. For instance, adding a method to a class requires the class to be added first.
Therefore there is a dependency between the AddMethodChange and the AddClassChange.
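
To make the idea of changes as objects with structural dependencies concrete, here is a minimal Java sketch. It is only an illustration: the change names follow the slides, but the base class, its fields and the constructors are assumptions made for this example and are not taken from the ChEOPSJ code base.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified change model; the real ChEOPSJ classes may look different.
abstract class Change {
    private final String entityName;                         // entity the change acts on
    private final List<Change> dependencies = new ArrayList<>();

    Change(String entityName) { this.entityName = entityName; }

    // Record that this change can only exist after 'other' has been applied.
    void dependsOn(Change other) { dependencies.add(other); }

    List<Change> getDependencies() { return dependencies; }
    String getEntityName() { return entityName; }
}

class AddPackageChange extends Change {
    AddPackageChange(String name) { super(name); }
}

class AddClassChange extends Change {
    // A class can only be added to a package that has been added first.
    AddClassChange(String name, AddPackageChange pkg) { super(name); dependsOn(pkg); }
}

class AddMethodChange extends Change {
    // A method can only be added to a class that has been added first.
    AddMethodChange(String name, AddClassChange owner) { super(name); dependsOn(owner); }
}

class ChangeModelSketch {
    public static void main(String[] args) {
        AddPackageChange pkg = new AddPackageChange("collections");
        AddClassChange cls = new AddClassChange("Stack", pkg);
        AddMethodChange method = new AddMethodChange("push", cls);
        // Walk the chain AddMethod ---> AddClass ---> AddPackage.
        Change current = method;
        while (current != null) {
            System.out.println(current.getClass().getSimpleName() + " " + current.getEntityName());
            current = current.getDependencies().isEmpty() ? null : current.getDependencies().get(0);
        }
    }
}
```

Passing the owning change to the constructor means a method change cannot be created without the class change it depends on, which mirrors the structural dependency described above.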
First Class Change Objects

Traceability via Dependencies between Test and Program Code

[Diagram: Changes to Test Code, Changes to Program Code]

It’s these dependencies that we can use to find relevant tests. Since tests are also source code, we can find a series of dependencies between the test code and the source code.

These dependencies form a live traceability link between the test code and the source code. Using these links, we can select the relevant tests for a particular change.
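
The selection step can be sketched as a graph traversal. The Java example below is a simplified illustration, not the ChEOPSJ implementation: changes are represented by plain strings, the dependency edges are collapsed to "X depends on Y", and the class and method names (TestSelectionSketch, addDependency, markAsTest, selectTests) are hypothetical. Starting from a changed entity, it follows the dependencies backwards and collects every dependent change that belongs to test code.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class TestSelectionSketch {
    // dependents.get(x) holds the changes that depend directly on change x.
    private final Map<String, Set<String>> dependents = new HashMap<>();
    private final Set<String> testChanges = new HashSet<>();

    // Record that change 'from' depends on change 'to'.
    void addDependency(String from, String to) {
        dependents.computeIfAbsent(to, k -> new LinkedHashSet<>()).add(from);
    }

    // Mark a change as belonging to test code.
    void markAsTest(String change) { testChanges.add(change); }

    // Collect all test changes that depend, directly or transitively, on 'changed'.
    Set<String> selectTests(String changed) {
        Set<String> selected = new TreeSet<>();
        Set<String> visited = new HashSet<>();
        Deque<String> work = new ArrayDeque<>();
        work.push(changed);
        while (!work.isEmpty()) {
            String current = work.pop();
            if (!visited.add(current)) continue;              // already handled
            if (testChanges.contains(current)) selected.add(current);
            for (String d : dependents.getOrDefault(current, Collections.emptySet())) {
                work.push(d);
            }
        }
        return selected;
    }

    // Small hypothetical example system used for illustration.
    static TestSelectionSketch example() {
        TestSelectionSketch s = new TestSelectionSketch();
        s.addDependency("StackTest.testPush", "Stack.push");       // test calls push
        s.addDependency("Stack.pushAll", "Stack.push");            // pushAll calls push
        s.addDependency("StackTest.testPushAll", "Stack.pushAll"); // test calls pushAll
        s.addDependency("StackTest.testPop", "Stack.pop");         // test calls pop
        s.markAsTest("StackTest.testPush");
        s.markAsTest("StackTest.testPushAll");
        s.markAsTest("StackTest.testPop");
        return s;
    }

    public static void main(String[] args) {
        System.out.println(example().selectTests("Stack.push"));
    }
}
```

In this example a change to Stack.push selects testPush and testPushAll but not testPop, because nothing on the dependency path from testPop leads to Stack.push.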
Research Questions
Compare test subset against “retest all”

Size Reduction?
Quality?
Accuracy?

We evaluated our approach on two open source cases: Cruisecontrol and PMD.
For each class we searched for the relevant test classes, using the changes in the class.
We could then compare the resulting subset(s) of tests against the entire (larger) test suite.

And we compared this on three criteria.

    How much did we actually reduce the test suite?

    What was the quality of the reduced test suite? Is it the same or worse? We used a metric called mutation coverage to gauge the quality of a set of tests.

    And finally we also looked at the accuracy of our approach, which means we looked at precision and recall.
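
As a reminder of what precision and recall mean for a selected test subset, here is a small Java sketch using the standard definitions. How the set of truly relevant tests is established is not described at this point in the talk, so the `relevant` set below is simply an input, and the class name and the handling of empty sets are assumptions made for this example.

```java
import java.util.HashSet;
import java.util.Set;

class AccuracySketch {
    // Number of tests that are both selected and relevant.
    static int truePositives(Set<String> selected, Set<String> relevant) {
        Set<String> both = new HashSet<>(selected);
        both.retainAll(relevant);
        return both.size();
    }

    // Precision: which fraction of the selected tests is actually relevant?
    static double precision(Set<String> selected, Set<String> relevant) {
        if (selected.isEmpty()) return 0.0;   // convention chosen for this sketch
        return (double) truePositives(selected, relevant) / selected.size();
    }

    // Recall: which fraction of the relevant tests was actually selected?
    static double recall(Set<String> selected, Set<String> relevant) {
        if (relevant.isEmpty()) return 0.0;   // convention chosen for this sketch
        return (double) truePositives(selected, relevant) / relevant.size();
    }
}
```

For example, if four tests are selected and two of them are among three relevant tests, precision is 2/4 and recall is 2/3.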

The first question is: when we reduce the test suite to a subset of tests, how much did we actually reduce it?
Size Reduction?

Cruisecontrol: 295 Tests Reduced to 1 Test
PMD: 215 Tests Reduced to 1 Test

For 54% (Cruisecontrol) and 44% (PMD) of the classes we found that there was only 1 relevant test.
For 11% and 20% of the classes we found there were 2 relevant tests.
For 13% and 10% of the classes there were 3 relevant tests.
For 21% and 26% of the classes there were 4 or more relevant tests.

Relevant tests per class    Cruisecontrol    PMD
1                           54.5%            44.0%
2                           11.4%            19.9%
3                           13.1%             9.9%
>=4                         21.0%            26.2%
Size Reduction?

Cruisecontrol: 295 Tests Reduced to 2 Tests
PMD: 215 Tests Reduced to 2 Tests

Size Reduction?

Cruisecontrol: 295 Tests Reduced to 3 Tests
PMD: 215 Tests Reduced to 3 Tests

Size Reduction?

Cruisecontrol: 295 Tests Reduced to 4 or more Tests (max = 22)
PMD: 215 Tests Reduced to 4 or more Tests (max = 37)

As such we can say that we can reduce ALL the tests to a handful of tests: 80 to 90% of the classes had up to 5 relevant tests!

The next question was: does the quality of the reduced test sets remain the same, or is it worse than with retest all?
Mutation Testing (Quality?)

[Slide: example SOURCE CODE, shown below, overlaid with the label "All Tests Pass". © http://pitest.org]

    package engine;
    import java.util.*;

    public class SuffixTree {
        int[] hdlabel = new int[10000];
        int[] ithSuf;
        int ithSufLength;
        int ithSufBegin;
        int[] firstSuf;
        public Vertex root= null;
        private Vector pStringV;
        private int[] a;
        public Vector pMatches = new Vector();
        Vector inputFiles;
        int in;//the in-th suffix
        public SuffixTree(Vector symbolen, Vector files) {
            pStringV = symbolen;
            inputFiles = files;
            ithSuf = new int[pStringV.size()];
            firstSuf = new int[pStringV.size()];
        }
        private void ithSuffix1(){//new version, now for i=1
            for(int j=0;j<=pStringV.size()-1;j++){
                Symbool s = (Symbool)pStringV.elementAt(j);
                if(s.parameter==false){
                    firstSuf[j]=s.symbool;
                    ithSuf[j]=s.symbool;
                }
                else{
                    firstSuf[j]=s.dTotVorige;
                    ithSuf[j]=s.dTotVorige;
                }
            }
            ithSufLength = pStringV.size();
            ithSufBegin=0;
        }
        private void ithSuffix(int i){//newest version, not for i=1
            ithSufBegin = i-1;
            Symbool sym = (Symbool)pStringV.elementAt(i-2);
            if(sym.parameter==true && sym.dTotVolgende!=0){
                ithSuf[i-2+sym.dTotVolgende] = 0;
            }
            ithSufLength = pStringV.size()-i+1;
            //     return ithSufClone;//needed?
        }
        public void berekenDTotVorige(){
            for(int j=0;j<=pStringV.size()-1;j++){
                Symbool s = (Symbool)pStringV.elementAt(j);
                if(s.parameter==true){
                    int vorigePos=-1;
                    for(int k=j-1;k>=0;k--){//check whether the parameter occurred earlier
                        Symbool sym = (Symbool)pStringV.elementAt(k);
                        if(sym.symbool==s.symbool) {vorigePos=k;break;}
                    }//is there a problem if a non-parameter and a parameter have the same int?
                    if(vorigePos==-1) s.dTotVorige=0;
                    else s.dTotVorige = j-vorigePos;
                }
            }
        }
        public void berekenDTotVorige2(){
            Hashtable ht = new Hashtable();
            for(int j=0;j<=pStringV.size()-1;j++){
                Symbool s = (Symbool)pStringV.elementAt(j);
                if(s.parameter==true){
                    Integer i = new Integer(s.symbool);
                    if(!ht.containsKey(i)){
                        s.dTotVorige=0;
                        ht.put(i,new Integer(j));
                    }
                    else{
                        int vorigeIndex = ((Integer)ht.get(i)).intValue();
                        ht.put(i,new Integer(j));
                        s.dTotVorige = j-vorigeIndex;
                    }
                }
            }
        }
        public void berekenDTotVolgende(){
            for(int j=0;j<=pStringV.size()-1;j++){
                Symbool s = (Symbool)pStringV.elementAt(j);
                if(s.parameter==true){
                    int volgendePos=-1;
                    for(int k=j+1;k<pStringV.size();k++){//check whether the parameter occurred earlier
                        Symbool sym = (Symbool)pStringV.elementAt(k);
                        if(sym.symbool==s.symbool) {volgendePos=k;break;}
                    }
                    if(volgendePos==-1) s.dTotVolgende=0;
                    else s.dTotVolgende = volgendePos-j;
                }
                // (the listing is cut off here on the slide)

To assess the quality of a set of tests, we used mutation testing.

In short, this means inserting a fault into the code and checking whether your test set fails (mutation killed) or not (mutation survived).
We used PIT as a tool to do this automatically for us.

We start with a green test suite (i.e. all tests pass).
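
The killed/survived distinction can be illustrated with a tiny hand-made example in Java. This sketch does not use PIT or its API: the original function, the mutant and the two test suites are all written by hand for illustration, and every name in it is hypothetical. PIT generates and runs such mutants automatically.

```java
import java.util.List;
import java.util.function.IntBinaryOperator;
import java.util.function.Predicate;

class MutationSketch {
    // Original code under test: returns the larger of two ints.
    static final IntBinaryOperator ORIGINAL = (a, b) -> a > b ? a : b;
    // Hand-written mutant: the relational operator is flipped from > to <.
    static final IntBinaryOperator MUTANT = (a, b) -> a < b ? a : b;

    // A test is modelled as a predicate over the implementation it exercises.
    static final Predicate<IntBinaryOperator> EQUAL_ARGS = max -> max.applyAsInt(3, 3) == 3;
    static final Predicate<IntBinaryOperator> DISTINCT_ARGS = max -> max.applyAsInt(1, 2) == 2;

    static final List<Predicate<IntBinaryOperator>> WEAK_SUITE = List.of(EQUAL_ARGS);
    static final List<Predicate<IntBinaryOperator>> STRONG_SUITE = List.of(EQUAL_ARGS, DISTINCT_ARGS);

    // True if every test in the suite passes against the given implementation.
    static boolean allPass(List<Predicate<IntBinaryOperator>> suite, IntBinaryOperator impl) {
        return suite.stream().allMatch(test -> test.test(impl));
    }

    // A mutant is killed when at least one test fails on it; otherwise it survived.
    static boolean isKilled(List<Predicate<IntBinaryOperator>> suite,
                            IntBinaryOperator original, IntBinaryOperator mutant) {
        if (!allPass(suite, original)) {
            throw new IllegalStateException("the test suite must be green before mutating");
        }
        return !allPass(suite, mutant);
    }

    public static void main(String[] args) {
        System.out.println("weak suite kills mutant:   " + isKilled(WEAK_SUITE, ORIGINAL, MUTANT));
        System.out.println("strong suite kills mutant: " + isKilled(STRONG_SUITE, ORIGINAL, MUTANT));
    }
}
```

The weak suite only calls the function with equal arguments, so the mutant survives; the strong suite adds a test with distinct arguments and kills it. A reduced test subset that kills the same mutants as the full suite can be considered to have kept its quality.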
Mutation Testing (Quality?)

[Slide: the same source code, overlaid with "Introduce Mutant + Rerun Tests". © http://pitest.org]

After inserting a mutation we run the tests. If the tests still pass, we say that the mutation survived (which is BAD, because you introduced a bug in your system and the tests did not catch it).
Mutation Testing (Quality?)

[Slide: the same source code, overlaid with "All Tests Pass" and "Mutation Survived". © http://pitest.org]

Mutation Testing (Quality?)

                package engine;
                import java.util.*;

                public class SuffixTree {
                   int[] hdlabel = new int[10000];
                   int[] ithSuf;
                   int ithSufLength;
                   int ithSufBegin;
                   int[] firstSuf;
                   public Vertex root= null;
                   private Vector pStringV;
                   private int[] a;
                   public Vector pMatches = new Vector();
                   Vector inputFiles;
                   int in;//de in-de suffix
                   public SuffixTree(Vector symbolen, Vector files) {
                     pStringV = symbolen;
                     inputFiles = files;
                     ithSuf = new int[pStringV.size()];
                     firstSuf = new int[pStringV.size()];
                   }




                             SOURCE
                   private void ithSuffix1(){//nieuwe versie, nu voor i=1
                     for(int j=0;j<=pStringV.size()-1;j++){
                       Symbool s = (Symbool)pStringV.elementAt(j);
                       if(s.parameter==false){
                           firstSuf[j]=s.symbool;
                           ithSuf[j]=s.symbool;
                       }
                       else{
                           firstSuf[j]=s.dTotVorige;
                           ithSuf[j]=s.dTotVorige;
                       }
                     }
                     ithSufLength = pStringV.size();
                     ithSufBegin=0;




                              CODE
                   }
                   private void ithSuffix(int i){//nieuwste versie, niet voor i=1
                     ithSufBegin = i-1;
                     Symbool sym = (Symbool)pStringV.elementAt(i-2);
                     if(sym.parameter==true && sym.dTotVolgende!=0){
                       ithSuf[i-2+sym.dTotVolgende] = 0;
                     }
                     ithSufLength = pStringV.size()-i+1;
                //     return ithSufClone;//nodig?
                   }




                                                                                                                             ≈ pitest.org
                   public void berekenDTotVorige(){
                     for(int j=0;j<=pStringV.size()-1;j++){
                       Symbool s = (Symbool)pStringV.elementAt(j);
                       if(s.parameter==true){
                         int vorigePos=-1;


                                                                                                          Introduce Mutant
                         for(int k=j-1;k>=0;k--){//zoek of de parameter al eerder voorkwam
                            Symbool sym = (Symbool)pStringV.elementAt(k);
                            if(sym.symbool==s.symbool) {vorigePos=k;break;}
                         }//is er een probleem als een n-par en een par dezelfde int hebben?
                         if(vorigePos==-1) s.dTotVorige=0;
                         else s.dTotVorige = j-vorigePos;
                       }


                                                                                                            + Rerun Tests
                     }
                   }
                   public void berekenDTotVorige2(){
                     Hashtable ht = new Hashtable();
                     for(int j=0;j<=pStringV.size()-1;j++){
                       Symbool s = (Symbool)pStringV.elementAt(j);
                       if(s.parameter==true){
                         Integer i = new Integer(s.symbool);
                         if(!ht.containsKey(i)){
                            s.dTotVorige=0;
                            ht.put(i,new Integer(j));
                         }
                         else{
                            int vorigeIndex = ((Integer)ht.get(i)).intValue();
                            ht.put(i,new Integer(j));
                            s.dTotVorige = j-vorigeIndex;
                         }
                       }
                     }
                   }
                   public void berekenDTotVolgende(){
                     for(int j=0;j<=pStringV.size()-1;j++){
                       Symbool s = (Symbool)pStringV.elementAt(j);
                       if(s.parameter==true){
                         int volgendePos=-1;
                         for(int k=j+1;k<pStringV.size();k++){//zoek of de parameter al eerder voorkwam
                            Symbool sym = (Symbool)pStringV.elementAt(k);
                            if(sym.symbool==s.symbool) {volgendePos=k;break;}
                         }
                         if(volgendePos==-1) s.dTotVolgende=0;
                         else s.dTotVolgende = volgendePos-j;
                       }




                             SOURCE
                              CODE




 © ≈ http://pitest.org ≈
                                                                                                                                               23

After inserting another mutation we run the tests again. Now some of the tests fail, so we say that this mutation was killed (which
is GOOD).
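
A killed mutant can be sketched in the same way. The example below is hypothetical and only loosely inspired by the expression j - vorigePos in the backdrop code. The helper names and the test values are assumptions made for illustration. The mutant replaces the subtraction by an addition, and the test suite notices the difference.

```java
import java.util.function.IntBinaryOperator;

// Hypothetical example: names and values are illustrative only.
final class KilledMutantDemo {

    // Original: distance from position j back to the previous occurrence.
    static int distanceToPrevious(int j, int previousPos) {
        return j - previousPos;
    }

    // Mutant: the arithmetic operator - was replaced by +.
    static int distanceToPreviousMutant(int j, int previousPos) {
        return j + previousPos;
    }

    // A small test suite that checks two concrete distances.
    static boolean suitePasses(IntBinaryOperator impl) {
        return impl.applyAsInt(5, 2) == 3 && impl.applyAsInt(4, 4) == 0;
    }

    public static void main(String[] args) {
        boolean passesOnOriginal = suitePasses(KilledMutantDemo::distanceToPrevious);
        boolean passesOnMutant = suitePasses(KilledMutantDemo::distanceToPreviousMutant);
        System.out.println("suite passes on original: " + passesOnOriginal);
        // The mutant is killed when at least one test fails after the mutation.
        System.out.println("mutant killed: " + (passesOnOriginal && !passesOnMutant));
    }
}
```

Note that a test which only used previousPos = 0 would not kill this mutant, because j - 0 and j + 0 are equal.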
Quality? Mutation Testing
[Slide backdrop: Java source code of the SuffixTree class, labelled SOURCE CODE (image © http://pitest.org). Overlay: Some Tests Fail, Mutation Killed]
24

Quality? Mutation Testing
[Slide backdrop: Java source code of the SuffixTree class, labelled SOURCE CODE (image © http://pitest.org). Overlay: Repeat For All Possible Mutations]
25

We do this for all mutations and we get a metric: Mutation Coverage, which is the percentage of mutants killed out of the total
number of mutants introduced.

We can use this metric to gauge the quality of a set of tests. We now want to see whether, for a particular class, the quality
remains the same when only a reduced set of tests is used.
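
The metric itself is simple arithmetic. The sketch below shows it in Java. The class name, the method name and the mutant counts are made up for illustration and are not results from the study.

```java
// Hypothetical helper: names and numbers are illustrative only.
final class MutationCoverageDemo {

    // Mutation coverage = mutants killed / mutants introduced, as a percentage.
    static double mutationCoverage(int killed, int introduced) {
        if (introduced <= 0) {
            throw new IllegalArgumentException("at least one mutant must be introduced");
        }
        if (killed < 0 || killed > introduced) {
            throw new IllegalArgumentException("killed must be between 0 and introduced");
        }
        return 100.0 * killed / introduced;
    }

    public static void main(String[] args) {
        // Made-up counts for one class: the full suite and a reduced test set.
        double full = mutationCoverage(45, 50);
        double reduced = mutationCoverage(45, 50);
        System.out.println("full suite:  " + full + "%");
        System.out.println("reduced set: " + reduced + "%");
        // Equal coverage means the reduced set is as good as the full suite for this class.
        System.out.println("equal mutation coverage: " + (full == reduced));
    }
}
```

If the reduced set kills fewer mutants than the full suite, its mutation coverage for that class drops, and that is the case we want to detect.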
Quality? Mutation Testing
[Slide backdrop: Java source code of the SuffixTree class, labelled SOURCE CODE (image © http://pitest.org). Overlay: Repeat For All Possible Mutations]
Mutation Coverage = # Mutants Killed / # Mutants Introduced
25

Quality?

Cruisecontrol: 88% equal Mutation Coverage
PMD: 50% equal Mutation Coverage
26

In 88% (Cruisecontrol) and 50% (PMD) of the inspected classes the mutation coverage remained the same (i.e. the quality of the
reduced test set is equal to that of the full test suite).

In the remaining 12% (Cruisecontrol) and 50% (PMD), however, the Mutation Coverage is worse, which raises the question:
Quality?

Cruisecontrol: 88% equal Mutation Coverage
PMD: 50% equal Mutation Coverage
Overlay: How much worse is the Mutation Coverage?
27

How much worse is the mutation coverage in these cases?
Quality?
[Figure: two scatter plots with the same axes. Vertical axis: percentage of more surviving mutants (0 to 100). Horizontal axis: total number of mutants (up to 180).]
28

So we looked at those test subsets where more mutants survived than with the retest-all approach.
We see that it varies from a couple of percent to a hundred percent more mutants surviving.
However, we need to take into account the total number of mutants introduced.

So that is what is shown here.
On the vertical axis we show the percentage of more surviving mutants, meaning the lower the better.

On the horizontal axis we show the total number of mutants introduced, which puts some of the data points in perspective.
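
As a sketch of how one data point on such a plot could be computed, the Java snippet below takes the extra survivors of the reduced test set relative to the total number of mutants. This formula is our reading of the vertical axis and should be treated as an assumption. The names and the second set of numbers are made up.

```java
// Hypothetical helper: the formula is an assumed reading of the plot's vertical axis.
final class MoreSurvivingMutantsDemo {

    // Percentage of mutants that survive the reduced test set but not the full suite.
    static double percentMoreSurviving(int survivedSubset, int survivedFull, int totalMutants) {
        if (totalMutants <= 0) {
            throw new IllegalArgumentException("totalMutants must be positive");
        }
        // The reduced set is part of the full suite, so it cannot kill more mutants.
        if (survivedFull < 0 || survivedSubset < survivedFull || survivedSubset > totalMutants) {
            throw new IllegalArgumentException("expected 0 <= survivedFull <= survivedSubset <= totalMutants");
        }
        return 100.0 * (survivedSubset - survivedFull) / totalMutants;
    }

    public static void main(String[] args) {
        // Tiny class: 3 mutants, all caught by the full suite, none by the subset.
        System.out.println(percentMoreSurviving(3, 0, 3) + "% more surviving, out of 3 mutants");
        // Larger class (made-up numbers): 25 survive the subset, 10 survive the full suite.
        System.out.println(percentMoreSurviving(25, 10, 100) + "% more surviving, out of 100 mutants");
    }
}
```

The first call shows why the horizontal axis matters: a value of 100% looks alarming, but with only 3 mutants it carries far less weight than 15% out of 100 mutants.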
Quality?
[Figure: two scatter plots with the same axes. Vertical axis: percentage of more surviving mutants (0 to 100). Horizontal axis: total number of mutants (up to 180).]
29

For Cruisecontrol, for instance, there is one point where 100% of the introduced mutants survived the subset but were caught by the
retest-all run. Put in perspective, however, this is out of a total of only 3 mutants!
Quality?
[Figure: two scatter plots with the same axes. Vertical axis: percentage of more surviving mutants (0 to 100). Horizontal axis: total number of mutants (up to 180).]
30

The data points that are more worrisome in Cruisecontrol are the two in the middle, because here a relatively high number of
mutants is introduced and quite a few of them survived the subset of tests whereas they did not survive the full test set.
Quality?
[Figure: two scatter plots with the same axes. Vertical axis: percentage of more surviving mutants (0 to 100). Horizontal axis: total number of mutants (up to 180).]
31

PMD performs a lot worse. We can see many data points with a high number of mutants that survive the subset but not the
full set.
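
To make the y-axis of these plots concrete, here is a minimal sketch of how one data point could be computed. The slides do not spell out the exact formula, so the denominator (the total number of mutants introduced for that change) and all names and data below are assumptions for illustration.

```python
def extra_survivor_percentage(killed_by_full, killed_by_subset, total_mutants):
    """Percentage of mutants that survive the test subset although the
    full test suite kills them (assumed definition, see lead-in)."""
    if total_mutants == 0:
        return 0.0
    # Mutants the full suite detects but the selected subset misses.
    missed = set(killed_by_full) - set(killed_by_subset)
    return 100.0 * len(missed) / total_mutants


# Hypothetical data for one change: 5 mutants were introduced in total.
killed_by_full = {"m1", "m2", "m3", "m4"}
killed_by_subset = {"m1", "m2", "m3"}

pct = extra_survivor_percentage(killed_by_full, killed_by_subset, total_mutants=5)
print(pct)
```

A value of 0 means the subset is as good as "retest all" for that change; higher values mean the subset misses faults that the full suite would have caught.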
Quality?
[Figure: the same two scatter plots. x-axis: total number of mutants; y-axis: percentage of more surviving mutants (0 to 100).
Left panel: on average 12% more mutants survive (weighted average).
Right panel: on average 24% more mutants survive (weighted average).]

                                                                                                                                                       32

Still, on average 12% and 24% more mutants survive. This is a weighted average, where we took the total
number of mutants as weights.
In short, the closer the data points are to the axes, the better.

So our approach up to now is good, but it’s not perfect. We do miss some relevant tests.
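
The weighted average mentioned above can be sketched as follows. The data points are made up for illustration; only the idea of using the total number of mutants as the weight comes from the talk.

```python
def weighted_average(points):
    """points: iterable of (total_mutants, extra_survivor_percentage) pairs.
    Each percentage is weighted by the number of mutants behind it."""
    points = list(points)
    total_weight = sum(n for n, _ in points)
    if total_weight == 0:
        raise ValueError("no mutants, weighted average is undefined")
    return sum(n * pct for n, pct in points) / total_weight


# Hypothetical data points: (total number of mutants, % more surviving mutants)
points = [(10, 0.0), (30, 10.0), (60, 20.0)]

weighted = weighted_average(points)
unweighted = sum(pct for _, pct in points) / len(points)
print(weighted, unweighted)
```

Weighting by the number of mutants means that a change with many mutants influences the average more than a change with only a few, which is why the weighted and unweighted values differ here.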
Research Questions
Compare test subset against “retest all”
    Size Reduction?
    Quality?
    Accuracy?

                                                                                                                    33

This leads us to the next question: what are our precision and recall?
That is:

    How many of the selected tests are really relevant tests (precision)?

    How many of the really relevant tests are selected (recall)?

To measure precision and recall we need some kind of oracle to tell us which tests actually are the relevant ones for each class.
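
As a small worked example of these two definitions, the sketch below compares a selected subset against an oracle. The test names are hypothetical, and returning 1.0 for an empty set is a convention chosen here, not something stated in the talk.

```python
def precision_recall(selected, relevant):
    """selected: tests chosen by the tool; relevant: tests the oracle marks as relevant."""
    selected, relevant = set(selected), set(relevant)
    true_positives = selected & relevant
    # Convention assumed here: an empty set gives a score of 1.0.
    precision = len(true_positives) / len(selected) if selected else 1.0
    recall = len(true_positives) / len(relevant) if relevant else 1.0
    return precision, recall


# Hypothetical example: 4 tests selected, 5 tests actually relevant, 3 in common.
selected = {"FooTest", "BarTest", "BazTest", "QuxTest"}
relevant = {"FooTest", "BarTest", "BazTest", "QuuxTest", "CorgeTest"}

precision, recall = precision_recall(selected, relevant)
print(precision, recall)
```

Here three of the four selected tests are relevant (precision 0.75), while three of the five relevant tests were selected (recall 0.6).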
Accuracy?
Dynamic Analysis
    ∀ t ∈ Tests: execute t
    ∀ m : Method invoked during run of t
    t is a relevant test for m

                                                                                                                 34

We used a dynamic analysis to tell us.
In short, we wrote a simple aspect in AspectJ that, during the execution of a test, notes which methods were invoked.
We can then say that the test is relevant for those methods.

Using these results, we could compare against our static analysis of the changes...
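
The idea behind this oracle can be sketched in a few lines. This is not the AspectJ aspect from the talk: it is a Python stand-in that uses the interpreter's profiling hook (`sys.setprofile`) to note which functions run during each test. The functions and tests below are hypothetical.

```python
import sys
from collections import defaultdict


def invoked_functions(test):
    """Run one test and return the names of the Python functions it invoked."""
    invoked = set()

    def profiler(frame, event, arg):
        if event == "call":  # a Python-level function is being entered
            invoked.add(frame.f_code.co_name)

    previous = sys.getprofile()
    sys.setprofile(profiler)
    try:
        test()
    finally:
        sys.setprofile(previous)  # always restore the previous hook
    invoked.discard(test.__name__)  # the test itself is not code under test
    return invoked


def relevant_tests(tests):
    """Invert the recordings: function name -> set of tests that exercised it."""
    relevant = defaultdict(set)
    for test in tests:
        for name in invoked_functions(test):
            relevant[name].add(test.__name__)
    return dict(relevant)


# Hypothetical code under test and its tests.
def add(a, b):
    return a + b


def mul(a, b):
    result = 0
    for _ in range(b):
        result = add(result, a)
    return result


def test_add():
    assert add(1, 2) == 3


def test_mul():
    assert mul(2, 3) == 6


oracle = relevant_tests([test_add, test_mul])
print(oracle)
```

Note that `test_mul` is recorded as relevant for `add` as well, because `mul` calls `add` indirectly. A dynamic analysis picks up such transitive invocations automatically.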
Accuracy?
[Figure: four charts showing the distribution of precision (top row) and recall (bottom row) over the bins [0,0.25[, [0.25,0.5[, [0.5,0.75[, [0.75,1[ and [1].
Precision: Avg: 0.88 (left), Avg: 0.83 (right).
Recall: Avg: 0.77 (left), Avg: 0.58 (right).]

                                                                                                                                              35

For both Cruisecontrol and PMD we find high precision values (on average 0.88 and 0.83).
This means that most of the tests that we selected in the subsets were in fact relevant tests!

The recall values are a bit lower, especially in the case of PMD, with an average recall of 0.77 and 0.58.
This means that some of the actually relevant tests were not selected in the subsets by our tool.
This was also apparent in the mutation testing approach.

But is this really bad?
36

Let's look back at our individual developer. He is making changes to a software system and wants to test his code.

When he gets tool support saying "these are the relevant tests for your changes", he gets more confident about his code.
He will test more often. He gets shorter feedback cycles.

The selected subset is not safe, as it occasionally misses a few relevant tests. However, it is adequate, especially since the complete
test suite will be executed as part of the integration build anyway.
37

What’s next after this?
We need to do some more work on this, basically polishing the approach (trying to improve recall, probably at the cost of precision)
and seeing how this approach performs on industrial cases.
On the other hand, we also want to have a look at other applications of Change Centric Software Development.
One thing that we are currently investigating is whether we can detect patterns in the set of changes.

     -- Either predefined patterns like refactorings, and checking if we can identify those.

     -- Or just frequent pattern mining on a set of changes, not knowing in advance what kind of patterns we might uncover.

Another application is that successful changes on one branch of a piece of software might be reapplied on other branches of
that system (bug fixes?).
Future Directions
              • Reducing Test Runtime
                  •   Polishing of the Approach (& Implementation)
                  •   More (Industrial) Cases

              • Detect Change Patterns
                  •   Identify Refactorings
                  •   Recurring sequences of changes

              • Reapplying changes
                  •   bug fixes
                  •   design improvements
                  •   API evolution



                                                                                                                  37

38

To wrap up...
We were looking for a way to find relevant tests for small changes to the software.
We found that our technique could reduce the test suite to a handful of tests (5 tests in 80-90% of the cases).
We found that in 50-80% of the cases those reduced test suites had the same mutation coverage (quality) as the full test set.
The test sets that had a worse mutation coverage were actually not that bad.
And we found that we had really good precision, but lower recall, meaning that we did in fact miss some relevant tests.
However, as we mentioned, this is not a very big problem, since the full test suite will in the end be run anyway.

Csmr2013 presentation

  • 1. Change-Based Test Selection in the Presence of Developer Tests Quinten David Soetens Serge Demeyer Andy Zaidman Bon Giorno, my name is Quinten, I’m from the University of Antwerp. I will be showing a technique that we investigated to reduce the size of a test suite.
  • 2. Test Suites Grow 2 As a Software System grows, so does its Test Suite. And they can grow very large indeed! We talked to a couple of companies in industry and they confirmed that this is indeed a relevant problem for them. ... Why?
  • 3. un o R rs t ou e H Ta k 3 Because large test suites lead to inefficient testing -- it takes too long to run all the tests One company we talked to mentioned that their tests take up to 13 hours to run. They start the tests in the evening and when they come back in the morning for their daily standup scrum meeting the testing is still going on.
  • 4. 4 This leads in turn leads to delays -- delays in executing the test as well as delays in the updating of the test cases. This leads to to a reduced test coverage and larger feedback cycles. It takes a lot longer for a developer to know when his code was good or not.
  • 5. R un T ests i n Par allel 5 One solution to this problem could be to run the tests in parallel to save time. For instance another company that we talked to had tests that run 8 hours. Their solution was to run the tests in parallel in 16 different machines effectively reducing the runtime of their testsuite from 8 hours to half an hour. Which in my opinion is still a long time to wait. Especially as a developer who just wants to check if his code is OK.
  • 6. Which tests should I run when changing this method? 6 In light of this, developers are faced with a problem: Which test(s) should they run when changing a particular part of a system? Currently developers use their own gut feeling, common knowledge in the company or expert knowledge of a collegue to select a subset of tests that could be relevant for the code he is working on. However tool support to aid in this task is desirable. We therefore need to find which tests are relevant for that particular change. We can do this when we have recorded the fine grained changes made during the development.
  • 7. ChEOPSJ. [Architecture diagram; labels: Applications, TestSelection, Model, ChangeRecorders, Change Distiller, Logger, Distiller, SVNKit.] Reference: Quinten David Soetens and Serge Demeyer, “ChEOPSJ: Change-Based Test Optimization”, in Proceedings of the 16th European Conference on Software Maintenance and Reengineering, CSMR 2012. This approach was implemented in a tool called ChEOPSJ, which I presented at last year’s CSMR.
  • 8. ChEOPSJ. We consider changes made to the source code as first-class objects: tangible entities that we can analyze and manipulate. Basically, it’s a tool that can record changes in the background while you are programming. In order to work with real-world cases, we also have the capability of recovering changes from source code repositories. Once a change model is instantiated for a system, we can analyze the change model and run different applications on it (for now, only the test selection application).
  • 9. First Class Change Objects. Changes act on source code (FAMIX) entities (e.g. AddClassChange, AddMethodChange, etc.). For instance, adding a new class will result in an AddClassChange.
  • 10. First Class Change Objects. Changes have structural dependencies (e.g. AddMethod ---> AddClass ---> AddPackage, etc.). We can also define dependencies between these changes. For instance, adding a method to a class requires the class to be added first. Therefore, there is a dependency between the AddMethodChange and the AddClassChange.
  • 11. First Class Change Objects. Traceability via dependencies between test and program code (changes to program code, changes to test code). It’s these dependencies that we can use to find relevant tests. Since tests are also source code, we can find a series of dependencies between the test code and the source code. These dependencies form a live traceability link between the test code and the source code. Using these links, we can select relevant tests for a particular change.
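As a rough illustration of the selection idea described above, the sketch below models changes as nodes in a dependency graph and selects every test change from which the changed entity can be reached by following dependencies. This is not the ChEOPSJ implementation: the class `ChangeGraph`, its methods, and the string identifiers are hypothetical, and it assumes that a change to a test method depends on the change that added the program method it exercises.

```java
import java.util.*;

/** Hypothetical, simplified change model; names are illustrative, not the ChEOPSJ API. */
final class ChangeGraph {
    // Maps a change id to the ids of the changes it directly depends on.
    private final Map<String, Set<String>> dependsOn = new HashMap<>();
    // Ids of changes that were made to test code.
    private final Set<String> testChanges = new HashSet<>();

    void addChange(String id, boolean inTestCode, String... dependencies) {
        dependsOn.put(id, new LinkedHashSet<>(Arrays.asList(dependencies)));
        if (inTestCode) {
            testChanges.add(id);
        }
    }

    /** Returns the test changes that depend, directly or transitively, on the given change. */
    Set<String> relevantTests(String changed) {
        Set<String> result = new TreeSet<>();
        for (String test : testChanges) {
            if (reaches(test, changed, new HashSet<>())) {
                result.add(test);
            }
        }
        return result;
    }

    // Depth-first search along dependency edges; 'seen' guards against revisiting nodes.
    private boolean reaches(String from, String target, Set<String> seen) {
        if (from.equals(target)) {
            return true;
        }
        if (!seen.add(from)) {
            return false;
        }
        for (String next : dependsOn.getOrDefault(from, Collections.emptySet())) {
            if (reaches(next, target, seen)) {
                return true;
            }
        }
        return false;
    }
}
```

For example, if a change `AddMethod(FooTest.testBar)` depends on `AddMethod(Foo.bar)`, then asking for the relevant tests of `AddMethod(Foo.bar)` returns only that test change, while asking for `AddClass(Foo)` returns every test change that reaches the class.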
  • 12. Research Questions. Compare test subset against “retest all”: Size Reduction? Quality? Accuracy? We evaluated our approach on two open source cases: Cruisecontrol and PMD. For each class, we searched for the relevant test classes (using the changes in the class). We could then compare the selected subsets of tests against the entire (larger) test suite on three criteria. How much did we actually reduce the test suite? What was the quality of the reduced test suite: is it the same or worse? We used a metric called mutation coverage to gauge the quality of a set of tests. Finally, we looked at the accuracy of our approach, meaning precision and recall. The first question is: when we reduce the test suite to a subset of tests, how much did we actually reduce it?
  • 13. Size Reduction? Cruisecontrol: 295 tests reduced to 1 test. PMD: 215 tests reduced to 1 test. For 54% (Cruisecontrol) and 44% (PMD) of the classes, we found that there was only 1 relevant test. For 11% and 20% of the classes, there were 2 relevant tests. For 13% and 10% of the classes, there were 3 relevant tests. For 21% and 26%, there were 4 or more relevant tests. Cruisecontrol: 1 (54.5%), 2 (11.4%), 3 (13.1%), >=4 (21.0%). PMD: 1 (44.0%), 2 (19.9%), 3 (9.9%), >=4 (26.2%).
  • 14. Size Reduction? Cruisecontrol: 295 tests reduced to 2 tests. PMD: 215 tests reduced to 2 tests. This was the case for 11.4% (Cruisecontrol) and 19.9% (PMD) of the classes.
  • 15. Size Reduction? Cruisecontrol: 295 tests reduced to 3 tests. PMD: 215 tests reduced to 3 tests. This was the case for 13.1% (Cruisecontrol) and 9.9% (PMD) of the classes.
  • 16. Size Reduction? Cruisecontrol: 295 tests reduced to 4 or more tests (max = 22). PMD: 215 tests reduced to 4 or more tests (max = 37). This was the case for 21.0% (Cruisecontrol) and 26.2% (PMD) of the classes.
  • 17. Test Suites Grow. As such, we can say that we can reduce ALL the tests...
  • 18. Size Reduction? ...to a handful of tests: 80 to 90% of the classes had up to 5 relevant tests!
  • 19. Research Questions. Compare test subset against “retest all”: Size Reduction? Quality? Accuracy? The next question was: does the quality of the reduced test sets remain the same, or is it worse than with retest all?
  • 20. Quality? Mutation Testing. [Slide background: Java source code of a SuffixTree class, labeled “SOURCE CODE”; overlay: “All Tests Pass”; credit: http://pitest.org] To assess the quality of a set of tests, we used mutation testing. In short, this means inserting a fault into the code and checking whether your test set fails (mutation killed) or not (mutation survived). We used PIT as a tool to do this automatically for us. We start with a green test suite (i.e. all tests pass).
  • 21. Quality? Mutation Testing. [Source code background; overlay: “Introduce Mutant + Rerun Tests”; credit: http://pitest.org] After inserting a mutation, we run the tests. If the tests still pass, we say that the mutation survived (which is BAD, because you introduced a bug in your system and the tests did not catch it).
  • 22. Quality? Mutation Testing. [Source code background; overlay: “All Tests Pass”, “Mutation Survived”; credit: http://pitest.org]
  • 23. Quality? Mutation Testing. [Source code background; overlay: “Introduce Mutant + Rerun Tests”; credit: http://pitest.org] After inserting another mutation, we run the tests again. Now some of the tests fail, so we say that this mutation was killed (this is GOOD).
  • 24. Quality? Mutation Testing. [Source code background; overlay: “Some Tests Fail”, “Mutation Killed”; credit: http://pitest.org]
  • 25. Quality? Mutation Testing. [Source code background; overlay: “Repeat For All Possible Mutations”; credit: http://pitest.org] We do this for all mutations and we get a metric, mutation coverage, which is the percentage of mutants killed out of the total number of mutants introduced. We can use this metric to gauge the quality of a set of tests. We now want to see whether, for a particular class, the quality remains the same when using only a reduced set of tests.
  • 26. Quality? Mutation Testing. [Source code background; overlay: “Repeat For All Possible Mutations”; credit: http://pitest.org] Mutation Coverage = # Mutants Killed / # Mutants Introduced.
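The metric on this slide can be written down in a few lines. The sketch below is only an illustration of the formula; the class and method names are hypothetical and are not part of PIT, which reports this figure itself.

```java
/** Hypothetical helper illustrating the mutation coverage formula; not part of PIT. */
final class MutationCoverage {
    /** Returns the percentage of introduced mutants that the test set killed. */
    static double percentKilled(int killed, int introduced) {
        if (introduced <= 0 || killed < 0 || killed > introduced) {
            throw new IllegalArgumentException("expected introduced > 0 and 0 <= killed <= introduced");
        }
        return 100.0 * killed / introduced;
    }
}
```

To compare a reduced test set with retest all for one class, one would compute this value twice on the same set of mutants, once per test set, and check whether the two percentages are equal.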
  • 27. Quality? Cruisecontrol: 88% equal mutation coverage. PMD: 50% equal mutation coverage. For 88% (Cruisecontrol) and 50% (PMD) of the inspected classes, the mutation coverage remained the same (i.e. the quality of the reduced test set is equal to that of the full test suite). For the remaining 12% (Cruisecontrol) and 50% (PMD), however, the mutation coverage is worse, but the question then arises:
  • 28. Quality? Cruisecontrol: 88% equal mutation coverage. PMD: 50% equal mutation coverage. How much worse is the mutation coverage in these cases?
  • 29. Quality? [Two scatter plots: percentage of more surviving mutants (vertical axis, 0 to 100) against total number of mutants (horizontal axis).] So we looked at those test subsets where more mutants survived than with retest all. We see that it varies from a couple of percent to a hundred percent more mutants surviving. However, we need to take into account the total number of mutants introduced, so that is what is shown here. On the vertical axis we show the percentage of more surviving mutants, meaning the lower the better. On the horizontal axis we show the total number of mutants introduced, which puts some of the data points in perspective.
  • 30. Quality? [Scatter plots: percentage of more surviving mutants against total number of mutants.] For Cruisecontrol, for instance, there is one point where 100% of the introduced mutants survived the subset but were caught by retest all. However, when put in perspective, this is out of a total of only 3 mutants!
  • 31. Quality? [Scatter plots: percentage of more surviving mutants against total number of mutants.] The data points that are more worrisome in Cruisecontrol are the two in the middle, because here a relatively high number of mutants is introduced and quite a few of them survived the subset of tests where they did not survive the full test set.
  • 32. Quality? [Scatter plots: percentage of more surviving mutants against total number of mutants.] PMD performs a lot worse: we see many data points with high numbers of mutants surviving the subset but not the full set.
  • 33. Quality? [Scatter plots: percentage of more surviving mutants against total number of mutants; captions: “On average 12% more mutants survive (weighted average)” and “On average 24% more mutants survive (weighted average)”.] Still, on average we can say that 12% and 24% more mutants survive; this is a weighted average where we took the total number of mutants as weights. In short, the closer the data points are to the axes, the better. So our approach up to now is good, but it’s not perfect. We do miss some relevant tests.
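To make the weighting concrete, here is a minimal sketch of how such a weighted average could be computed, assuming that for each class we know the number of extra surviving mutants (those that survive the subset but not the full suite) and the total number of mutants introduced. The class name, method name, and sample numbers are hypothetical and are not taken from the study data.

```java
/** Hypothetical helper: weighted average of per-class "more surviving mutants" percentages. */
final class ExtraSurvivors {
    /**
     * extraSurvivors[i]: mutants in class i that survive the subset but not the full suite.
     * totalMutants[i]:   mutants introduced in class i (used as the weight).
     */
    static double weightedAveragePercent(int[] extraSurvivors, int[] totalMutants) {
        if (extraSurvivors.length != totalMutants.length || totalMutants.length == 0) {
            throw new IllegalArgumentException("need one entry per class in both arrays");
        }
        double weightedSum = 0.0;
        long totalWeight = 0;
        for (int i = 0; i < totalMutants.length; i++) {
            if (totalMutants[i] <= 0) {
                throw new IllegalArgumentException("each class needs at least one mutant");
            }
            // Per-class percentage of extra survivors, weighted by the class's mutant count.
            double percent = 100.0 * extraSurvivors[i] / totalMutants[i];
            weightedSum += percent * totalMutants[i];
            totalWeight += totalMutants[i];
        }
        return weightedSum / totalWeight;
    }
}
```

With hypothetical inputs of 3 extra survivors out of 3 mutants for one class and 10 out of 97 for another, the plain mean of the two percentages is about 55%, while the weighted average is 13%. That is the effect of putting a data point with only 3 mutants in perspective.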
  • 34. Research Questions. Compare test subset against “retest all”: Size Reduction? Quality? Accuracy? This leads us automatically to the next question: what are our precision and recall? That is, how many of the selected tests are really relevant tests (precision), and how many of the really relevant tests are selected (recall)? To measure precision and recall, we need some kind of oracle to tell us which tests actually are the relevant tests for each class.
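The two measures can be sketched as set operations over test names, given the selected subset and the oracle's set of relevant tests. The class and method names below are hypothetical, and the sketch simply rejects empty sets, for which the measures are undefined.

```java
import java.util.*;

/** Hypothetical helper computing precision and recall of a selected test subset. */
final class SelectionAccuracy {
    /** Fraction of the selected tests that the oracle considers relevant. */
    static double precision(Set<String> selected, Set<String> relevant) {
        if (selected.isEmpty()) {
            throw new IllegalArgumentException("precision is undefined for an empty selection");
        }
        return (double) overlap(selected, relevant) / selected.size();
    }

    /** Fraction of the oracle's relevant tests that were selected. */
    static double recall(Set<String> selected, Set<String> relevant) {
        if (relevant.isEmpty()) {
            throw new IllegalArgumentException("recall is undefined when no test is relevant");
        }
        return (double) overlap(selected, relevant) / relevant.size();
    }

    // Number of tests that are both selected and relevant.
    private static int overlap(Set<String> selected, Set<String> relevant) {
        Set<String> common = new HashSet<>(selected);
        common.retainAll(relevant);
        return common.size();
    }
}
```

For instance, selecting 2 tests that are both relevant, out of 4 relevant tests in total, gives a precision of 1.0 and a recall of 0.5: nothing irrelevant was selected, but half of the relevant tests were missed.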
  • 35. Accuracy? Dynamic Analysis. ∀ t ∈ Tests: execute t; ∀ m: method invoked during the run of t ⇒ t is a relevant test for m. We used a dynamic analysis as our oracle. In short, we wrote a simple aspect in AspectJ that, during the execution of a test, notes which methods were invoked. We can then say that this test is relevant for those methods. We could then compare these results to our static analysis of the changes.
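The rule on this slide amounts to inverting an execution trace. The sketch below assumes the trace has already been collected as a map from each test to the methods invoked while it ran (in the talk this was done with an AspectJ aspect, which is not shown here) and builds the oracle that maps each method to its relevant tests. All names are hypothetical.

```java
import java.util.*;

/** Hypothetical oracle builder: inverts "test -> invoked methods" into "method -> relevant tests". */
final class DynamicOracle {
    static Map<String, Set<String>> relevantTestsPerMethod(Map<String, Set<String>> invokedByTest) {
        Map<String, Set<String>> oracle = new TreeMap<>();
        for (Map.Entry<String, Set<String>> entry : invokedByTest.entrySet()) {
            String test = entry.getKey();
            for (String method : entry.getValue()) {
                // Every test that invoked the method counts as relevant for it.
                oracle.computeIfAbsent(method, k -> new TreeSet<>()).add(test);
            }
        }
        return oracle;
    }
}
```

The resulting sets can then serve as the “really relevant” tests against which the statically selected subsets are compared.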
  • 36. Accuracy? [Pie charts of the distribution of precision and recall values over the bins [0,0.25[, [0.25,0.5[, [0.5,0.75[, [0.75,1[ and [1]. Precision: avg 0.88 and avg 0.83. Recall: avg 0.77 and avg 0.58.] We find high precision values for both Cruisecontrol and PMD (on average 0.88 and 0.83), which means that most of the tests that we selected in the subsets were in fact relevant tests! The recall values are a bit lower, especially in the case of PMD, with an average recall of 0.77 and 0.58. This means that some of the actually relevant tests were not selected in the subsets by our tool. This was also apparent in the mutation testing results. But is this really bad?
  • 37. Let’s look back at our individual developer. He is making changes to a software system and wants to test his code. When he gets tool support that says “these are the relevant tests for your changes”, he gets more confident about his code. He will test more often, and he gets shorter feedback cycles. The selected subset is not safe, as it occasionally misses a few relevant tests; however, it is adequate, especially since the complete test suite will be executed as part of the integration build anyway.
  • 38. What’s next after this? We need to do some more work on this, basically polishing the approach (trying to improve recall, probably at the cost of precision) and seeing how it performs on industrial cases. On the other hand, we also want to have a look at other applications of Change-Centric Software Development. One thing that we are currently looking at is whether we can detect patterns in the set of changes: either predefined patterns like refactorings, checking if we can identify those, or frequent pattern mining on a set of changes without knowing in advance what kind of patterns we might uncover. Another application is that successful changes on one branch of a piece of software might be reapplied on other branches of that system (bug fixes?).
  • 39. Future Directions. Reducing Test Runtime: polishing of the approach (& implementation); more (industrial) cases. Detect Change Patterns: identify refactorings; recurring sequences of changes. Reapplying Changes: bug fixes; design improvements; API evolution.
  • 40. To wrap up: we were looking for a way to find relevant tests for small changes to the software. We found that our technique could reduce the test suite to a handful of tests (up to 5 tests in 80-90% of the cases). We found that in 50-88% of the cases those reduced test suites had the same mutation coverage (quality) as the full test set, and the test sets that had a worse mutation coverage were actually not that bad. We also found that we had really good precision but lower recall, meaning that we did in fact miss some relevant tests. However, as we mentioned, this is not a very big problem, since the full test suite will be run in the end anyway.