Code Coverage for MSR Researchers [Work in Progress]
1. Code Coverage for MSR Researchers
Mauricio Aniche
aniche@ime.usp.br
Monday, November 18, 13
2. What is Code Coverage?
• Describes how much of the production code is
exercised by the test suite.
• It basically counts the number of lines executed
when running the test suite, divided
by the total number of lines.
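The ratio above can be written down directly. This is a minimal sketch; the class and method names are hypothetical and not taken from any coverage tool.

```java
final class LineCoverage {
    // Line coverage = executed lines / total lines.
    // Returns 0.0 when there are no lines to cover (an assumption of this sketch).
    static double ratio(int executedLines, int totalLines) {
        if (totalLines <= 0) {
            return 0.0;
        }
        return (double) executedLines / totalLines;
    }
}
```

For example, a class with 4 lines of which 3 run during the test suite would get a ratio of 3 / 4.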
3. Why Do We Need This?
• It is hard to calculate code coverage when
studying a large quantity of repositories.
• Compiled code needed
• Test suite execution needed
• As we know, every project has its own
way to compile and run.
4. Static Analysis
• Static analysis would solve this problem.
• But static analysis cannot execute the
code.
• We need heuristics!
5. Our idea
• A production method has a certain
level of complexity (which can be measured
by McCabe’s number).
• public int a() {
    if (x) return 1; else return 2;
}
• If a method contains 2 different paths, then
it probably needs 2 different tests.
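To make the idea concrete, here is a rough sketch of estimating McCabe’s number from source text alone. It is not the tool’s implementation: `McCabeEstimate` is a hypothetical name, and counting keywords with a regular expression is an assumption of this sketch (a real tool would walk a parsed syntax tree, and this version is fooled by keywords inside comments or string literals).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class McCabeEstimate {
    // Branching keywords and short-circuit/ternary operators.
    // Each occurrence is assumed to add one independent path.
    private static final Pattern DECISION =
            Pattern.compile("\\b(if|for|while|case|catch)\\b|&&|\\|\\||\\?");

    static int of(String methodSource) {
        Matcher m = DECISION.matcher(methodSource);
        int complexity = 1; // a method without branches has exactly one path
        while (m.find()) {
            complexity++;
        }
        return complexity;
    }
}
```

Applied to the method `a()` above, the single `if` yields an estimate of 2, which matches the two paths through the method.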
6. Our formula
• Method-level: Qty of tests / McCabe’s number
• Class-level: Sum(Qty of tests per method) / Sum(McCabe’s number per method)
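The two formulas can be sketched as follows. The names `StaticCoverage`, `methodLevel`, and `classLevel` are hypothetical, and the parallel-array representation is only an assumption made to keep the example short. The result is not capped at 1, since the slides do not say whether it should be.

```java
final class StaticCoverage {
    // Method-level: number of tests attributed to the method / its McCabe number.
    // McCabe's number is at least 1 for any method, so the divisor is never zero.
    static double methodLevel(int tests, int mccabe) {
        return (double) tests / mccabe;
    }

    // Class-level: sum of tests per method / sum of McCabe numbers per method.
    // Both arrays are indexed by method and must have the same length.
    static double classLevel(int[] testsPerMethod, int[] mccabePerMethod) {
        if (testsPerMethod.length != mccabePerMethod.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        int tests = 0;
        int mccabe = 0;
        for (int i = 0; i < testsPerMethod.length; i++) {
            tests += testsPerMethod[i];
            mccabe += mccabePerMethod[i];
        }
        return mccabe == 0 ? 0.0 : (double) tests / mccabe;
    }
}
```

For instance, a class with two methods of McCabe number 2 each, tested by 1 and 2 tests respectively, gets (1 + 2) / (2 + 2) at the class level.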
7. Identifying test methods
@Test
public void testaOMetodo2() {
    A a = new A();
    int resultado = a.fazAlgo();
    Assert.assertEquals(1, resultado);
}
@Test
public void testaOMetodo() {
    A a = new A();
    int resultado = a.getB().fazAlgo();
    Assert.assertEquals(1, resultado);
}
1st impl.: tests fazAlgo()
2nd impl.: tests getB(), fazAlgo()
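One naive way to decide which production methods a test touches is to collect the method names it invokes. The sketch below is only an illustration under strong assumptions, not the tool’s implementation: `InvokedMethods` is a hypothetical name, calls are found with a regular expression rather than a parser, receiver types are not resolved, and any method whose name starts with `assert` is treated as a JUnit assertion and skipped.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class InvokedMethods {
    // Matches ".name(", i.e. a method invoked on some receiver.
    private static final Pattern CALL = Pattern.compile("\\.\\s*(\\w+)\\s*\\(");

    // Returns the invoked method names in order of first appearance.
    static Set<String> calledIn(String testBody) {
        Set<String> names = new LinkedHashSet<>();
        Matcher m = CALL.matcher(testBody);
        while (m.find()) {
            String name = m.group(1);
            if (!name.startsWith("assert")) { // skip JUnit assertions
                names.add(name);
            }
        }
        return names;
    }
}
```

On the body of `testaOMetodo()` this collects `getB` and `fazAlgo`; on `testaOMetodo2()` it collects only `fazAlgo`. Because types are not resolved, two classes that both declare `fazAlgo()` would be indistinguishable, which is one reason inheritance and polymorphism are hard for this kind of analysis.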
8. Comparing the solution
• I want to compare our numbers to Emma (a tool that
measures code coverage through dynamic analysis).
• I don’t want to replace the tool (it does
not make sense).
• I want to find out the average error.
• If it is small, then we can use our approach.
9. Calculating the difference
• All charts are based on the difference:
our calculated number minus
Emma’s number.
• A “0” means that the two
numbers were the same.
• A negative number indicates that our tool
calculates a smaller code coverage than
Emma.
• A positive number, the other way around.
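The sign convention can be sketched as below. `CoverageDiff` and its methods are hypothetical names. The slides only speak of an average error; using the mean of the absolute differences is an assumption of this sketch, chosen so that positive and negative differences do not cancel out.

```java
final class CoverageDiff {
    // Signed difference: our static estimate minus Emma's measured coverage.
    // Negative means our tool reports less coverage than Emma.
    static double signed(double staticEstimate, double emmaCoverage) {
        return staticEstimate - emmaCoverage;
    }

    // Mean absolute difference over a set of classes (assumed error measure).
    static double meanAbsolute(double[] staticEstimates, double[] emmaCoverages) {
        if (staticEstimates.length != emmaCoverages.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        if (staticEstimates.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (int i = 0; i < staticEstimates.length; i++) {
            sum += Math.abs(signed(staticEstimates[i], emmaCoverages[i]));
        }
        return sum / staticEstimates.length;
    }
}
```

For example, a static estimate of 0.5 against an Emma value of 0.75 gives a signed difference of -0.25, meaning the static estimate is lower than the measured coverage.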
15. Discussion
• It looks like the tool can differ from dynamic
analysis by 25% to 30%.
• Questions:
• How can I eliminate big mistakes?
• How can I determine if the tool is valid
or not?
16. Advantages
• Really fast. It does not need to compile the code or
run the tests.
• If a test fails, dynamic analysis may fail.
Static analysis does not.
17. Disadvantages
• It is a heuristic.
• The implementation is very complicated.
• There might be bugs in the
implementation.
• There are a few things that are pretty hard
to identify, mainly inheritance and
polymorphism.
• AOP code.