CodeFest 2012. A. Ilyin: Code coverage metrics. The pragmatic approach


Code coverage metrics.
The pragmatic approach.
Alexander Ilyin
Oracle


Preface

What is the code coverage data for

To measure the extent to which source code is covered
during testing.
consequently …
Code coverage is
A measure of how much source code is covered
during testing.
finally …
Testing is
A set of activities aimed to prove that the system under
test behaves as expected.


CC – how to get

Create a template
Template is a description of all the code there is to cover
“Instrument” the source/compiled code/bytecode
Insert instructions for dropping data into a file/network, etc.
Run testing, collect data
May need to change environment
Generate report
HTML, DB, etc
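
As a rough illustration of the steps above, here is a hand-instrumented method. It is only a sketch: real coverage tools insert equivalent probes automatically, and the class name `InstrumentedAbs`, the block numbering, and the in-memory `BitSet` store are assumptions made for this example.

```java
import java.util.BitSet;

// Hypothetical hand-instrumented class; a coverage tool would generate the probes.
class InstrumentedAbs {
    // "Template": the method below has three blocks that could be covered.
    static final int BLOCK_COUNT = 3;
    // Collected data; a real tool would drop it into a file or send it over the network.
    static final BitSet HITS = new BitSet(BLOCK_COUNT);

    static int abs(int x) {
        HITS.set(0);        // probe: method entry block
        if (x < 0) {
            HITS.set(1);    // probe: negative branch
            return -x;
        }
        HITS.set(2);        // probe: non-negative branch
        return x;
    }

    // "Report": percentage of template blocks hit so far.
    static double blockCoverage() {
        return 100.0 * HITS.cardinality() / BLOCK_COUNT;
    }

    public static void main(String[] args) {
        abs(5);             // the "test run": only the non-negative path is exercised
        System.out.println(blockCoverage() + "% block coverage");
    }
}
```

Calling `abs` with a negative argument as well would mark the remaining block as hit.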


CC – kinds of

Block / primitive block
Line
Condition/branch/predicate
Entry/exit
Method
Path/sequence


CC – how to use
for test base improvement

1: Measure (prev. slide)
Good reporting is really important
Perform analysis
Find what code you need to cover.
Find what tests you need to develop.
Develop more tests

Find dead code

GOTO 1



Mis-usages

CC – how not to use
mis-usages
Must get to 100%
Maybe not.
100% means no more testing
No it does not.
CC does not mean a thing
It does mean a fair amount if it is used properly.
There is that tool which would generate tests
for us and we're done
Nope.



Mis-usages
Test generation

Test generation

“We present a new symbolic execution tool, ####,
capable of automatically generating tests that
achieve high coverage on a diverse set of complex
and environmentally-intensive programs.”
#### tool documentation

Test generation cont.

if ( b != 3 ) {
    double a = 1 / ( b - 2 ); // the guard checks 3, so ( b - 3 ) was presumably intended
} else {
    …
}

Reminder: testing is ...
A set of activities aimed to prove that the system under
test behaves as expected.

Test generation - conclusion

Generated tests cannot verify that the code works
as expected, because they only know how the
code works and not how it is expected to work.
The only thing they possess is the code,
which may already not be working as expected. :)

Hence …

Code coverage from generated tests should not be
mixed with code coverage from regular functional
tests.
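
A minimal sketch of the point made on this slide, modelled on the earlier fragment. The specification `1 / (b - 3)`, the value returned from the `else` branch, and all names are assumptions for illustration only.

```java
class GeneratedTestDemo {
    // Code under test. Assumed (hypothetical) specification: return 1 / (b - 3).
    static double compute(double b) {
        if (b != 3) {
            return 1 / (b - 2);   // deviates from the assumed specification
        } else {
            return 0;             // placeholder for the elided else branch
        }
    }

    // What a generator can derive from the code alone: inputs reaching both
    // branches, with "expected" values recorded from the current behaviour.
    static boolean generatedTestPasses() {
        return compute(4) == 0.5 && compute(3) == 0;
    }

    // A functional test written from the assumed specification.
    static boolean functionalTestPasses() {
        return compute(4) == 1.0;
    }

    public static void main(String[] args) {
        System.out.println("generated test passes:  " + generatedTestPasses());
        System.out.println("functional test passes: " + functionalTestPasses());
    }
}
```

Both tests reach the same code, so they report the same coverage, yet only the functional one can notice the defect.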



Mis-usages
What does 100% coverage mean?

100% block/line coverage

number value
1 true

100% branch coverage

number value
1 true
-1 false

100% domain coverage

number value
0 0
.1 0.316227766016838
-1 exception

100% sequence coverage

a b result
-1 -1 1
-1 1 -1
1 -1 -1
1 1 1
0 1 NaN
1 0 NaN
0 0 NaN
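
The code behind the tables above is not part of this text. As a separate, hypothetical illustration of the same idea, the sketch below shows a method for which the inputs 1 and -1 give full line and branch coverage with passing tests, while a boundary defect stays unnoticed. The method name and its intended behaviour are assumptions.

```java
class FullCoverageDemo {
    // Hypothetical method: intended to return true for every non-negative number.
    static boolean isNonNegative(int number) {
        if (number > 0) {      // defect: the intended comparison is >= 0
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // These two calls execute every line and both branch outcomes.
        System.out.println(isNonNegative(1));
        System.out.println(isNonNegative(-1));
        // The boundary value is still handled wrongly.
        System.out.println(isNonNegative(0));
    }
}
```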

100% coverage - conclusion

100% block/line/branch/path coverage, even if
reachable, does not prove much.

Hence …

No need to try to get there unless ...



Mis-usages
Target value

CC target value - cost

[Chart: Test Dev. Effort by Code Block Coverage. X axis: Code Block Coverage (%), 0 to 100. Y axis: Relative Test Dev. Effort (1 at 50% code block coverage), 0 to 90.]

… effort increases exponentially with coverage.
Effort relative to the effort of getting 50% coverage:
$$f(x) = k e^{r x}, \qquad k = e^{-50 r} \Rightarrow f(50) = 1$$
… coverage is proportional to the total effort needed to get current coverage:
$$\frac{df}{dx} = r f(x)$$
… below 50% coverage, except maybe very big projects.

CC target value - effectiveness

[Chart: Defect Coverage by Code Block Coverage. X axis: Code Block Coverage (%), 0 to 100. Y axis: Defect Coverage (%), 0 to 120.]

… coverage by code block coverage...
… effort per code coverage and defect coverage by effort:
$$H(x) = h(f(x)), \qquad f(x) = k e^{r x}, \qquad h(y) = B \left( 1 - e^{-\frac{s}{B} y} \right)$$
… the percentage of bugs remaining and the effort needed to get current coverage:
$$\frac{dH}{dx} = s \left( 1 - \frac{H(x)}{B} \right) \frac{df}{dx}$$
… below 50% coverage except maybe very big projects.

CC target value - ROI

[Chart: Cost-Benefit Analysis. X axis: Code Block Coverage (%), 0 to 100. Y axis: Benefit ($/size), Cost ($/size), ROI (%), -200 to 1200.]

Benefit(c) = DC(c) * DD * COD, where
DC(c): Defect Coverage
DD: Defect Density. Example: 50bug/kloc
COD: Cost Of Defect. Example: $20k/bug

Cost(c) = F + V * RE(c), where
RE(c): Relative Effort, RE(50%) = 1
F: Fixed cost of test. Example: $50k/kloc
V: Variable cost of test. Example: $5k/kl

ROI = Benefit(c) / Cost(c) - 1
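
The cost, effectiveness and ROI formulas from the last three slides can be combined into a small calculation. This is a sketch only: the slides give no numeric values for r, s or B, so the constants `R`, `S` and `B` below are assumptions picked for illustration, and the unit of V is truncated in the source. The example figures for DD, COD, F and V are the ones quoted on the slide.

```java
class CoverageRoi {
    // Assumed shape parameters (not given in the slides).
    static final double R = 0.09;   // growth rate of effort per percent of coverage
    static final double S = 50.0;   // initial slope of defect coverage by effort
    static final double B = 100.0;  // upper bound of defect coverage, in percent

    // Example figures quoted on the ROI slide.
    static final double DD = 50;        // defect density, bugs per kloc
    static final double COD = 20_000;   // cost of a defect, $ per bug
    static final double F = 50_000;     // fixed cost of test, $ per kloc
    static final double V = 5_000;      // variable cost of test, $ (unit truncated in the slide)

    // RE(c) = f(c) = k * e^(r c) with k = e^(-50 r), so that RE(50) = 1.
    static double relativeEffort(double c) {
        return Math.exp(R * (c - 50));
    }

    // DC(c) = H(c) = h(f(c)) with h(y) = B * (1 - e^(-s y / B)), in percent.
    static double defectCoverage(double c) {
        return B * (1 - Math.exp(-S * relativeEffort(c) / B));
    }

    static double benefit(double c) {
        return defectCoverage(c) / 100 * DD * COD;
    }

    static double cost(double c) {
        return F + V * relativeEffort(c);
    }

    static double roi(double c) {
        return benefit(c) / cost(c) - 1;
    }

    public static void main(String[] args) {
        for (int c = 50; c <= 100; c += 10) {
            System.out.printf("coverage %3d%%: ROI %.0f%%%n", c, 100 * roi(c));
        }
    }
}
```

With other assumed parameters the maximum of the ROI curve moves, which is exactly why the target value is hard to justify.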

100% coverage - conclusion

100% block/line/branch/path coverage, even if
reachable, does not prove much.

Hence …

No need to try to get there unless 100% is the
target value.
Which could happen if the cost of a bug is really high
and/or the product is really small.


Target value - conclusion

True target value for block/line/branch/path comes
from ROI, which is really hard to calculate and
justify.



Usages

CC – how to use

Test base improvement.
Right. How to select which tests to develop first
Dead code.
Barely more than a by-product
Metric
Better have a good metric.
Control over code development

Deep analysis



CC as a metric

What makes a good metric

Simple to explain
So that you can explain to your boss why it is important
to spend resources on it
Simple to work towards
So that you know what to do to improve

Has a clear goal
So that you can tell how far you are from it.

Is CC a good metric?

Simple to explain +
Is a metric of quality of testing.

Simple to work towards +
Relatively easy to map uncovered code to missed tests

Has a clear goal -
Nope. ROI – too complicated.

Need to filter the CC data
so that only the code which must be covered is left

Public API*

Is a set of program elements suggested for usage by
public documentation.

For example: all functions and variables which are
described in documentation.

For a Java library: all public and protected methods and
fields mentioned in the library javadoc.

For Java SDK: … of all public classes in java and javax
packages.
(*) Only applicable for a library or an SDK

True Public API (c)

Is a set of program elements which could be accessed
directly by a library user

Public API
+
all extensions of public API in non-public classes

True public API example

My code

ArrayList.java

True Public API how to get

Get public API with interfaces
Filter CC data so that it only contains implementations
and extensions of the public API (*)

(*) This assumes that you either
Use a tool which allows such kind of filtering
or
Have the data in a parse-able format and develop the
filtering on your own
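
If you develop the filtering on your own, the membership check could look roughly like the sketch below. It uses plain reflection and treats a method as part of the true public API when it is public or protected and is declared by some public class or interface in its type hierarchy. The class names `TruePublicApi` and `HiddenTask` are hypothetical, and the sketch ignores generics-related corner cases such as bridge methods.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;

// Hypothetical non-public class: run() extends public API, helper() does not.
class HiddenTask implements Runnable {
    public void run() {}
    public void helper() {}
}

class TruePublicApi {
    static boolean isTruePublicApi(Method m) {
        int mods = m.getModifiers();
        if (!Modifier.isPublic(mods) && !Modifier.isProtected(mods)) {
            return false;
        }
        return declaredInPublicType(m.getDeclaringClass(), m);
    }

    // Walks superclasses and interfaces looking for a public type declaring the method.
    private static boolean declaredInPublicType(Class<?> type, Method m) {
        if (type == null) {
            return false;
        }
        if (Modifier.isPublic(type.getModifiers()) && declares(type, m)) {
            return true;
        }
        if (declaredInPublicType(type.getSuperclass(), m)) {
            return true;
        }
        for (Class<?> itf : type.getInterfaces()) {
            if (declaredInPublicType(itf, m)) {
                return true;
            }
        }
        return false;
    }

    private static boolean declares(Class<?> type, Method m) {
        try {
            int mods = type.getDeclaredMethod(m.getName(), m.getParameterTypes()).getModifiers();
            return Modifier.isPublic(mods) || Modifier.isProtected(mods);
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    // Lookup helper that turns the checked exception into an unchecked one.
    static Method method(Class<?> type, String name, Class<?>... params) {
        try {
            return type.getDeclaredMethod(name, params);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // The list returned by Arrays.asList is an instance of a non-public class.
        Class<?> hiddenList = java.util.Arrays.asList(1).getClass();
        System.out.println(isTruePublicApi(method(hiddenList, "size")));
        System.out.println(isTruePublicApi(method(HiddenTask.class, "run")));
        System.out.println(isTruePublicApi(method(HiddenTask.class, "helper")));
    }
}
```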

UI coverage

In a way, equivalent to public API but for a UI product

% of UI elements shown – display coverage
% of user actions performed – action coverage

Only “action coverage” could be obtained from CC data
(*).

(*) For UI toolkits which the presenter is familiar with.

Action coverage – how to get

Collect CC
Extract all implementations of
javax.swing.Action.actionPerformed(ActionEvent)
or
javafx.event.EventHandler.handle(Event)
Inspect all the implementations
org.myorg.NodeAction.actionPerformed(ActionEvent)
Add to the filter:
org.myorg.NodeAction.nodeActionPerformed(Node myNode)
Extract, repeat
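
The extract, inspect and extend loop above can be approximated with a simple filter over method-level coverage data. The sketch assumes the data is available as a map from method signature to hit count; that format, the class `ActionCoverage`, and the sample entries other than `org.myorg.NodeAction` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ActionCoverage {
    // Percentage of handler methods (matched by signature suffix) with at least one hit.
    static double actionCoverage(Map<String, Long> hitsByMethod, List<String> handlerSuffixes) {
        int total = 0;
        int covered = 0;
        for (Map.Entry<String, Long> e : hitsByMethod.entrySet()) {
            boolean isHandler = false;
            for (String suffix : handlerSuffixes) {
                if (e.getKey().endsWith(suffix)) {
                    isHandler = true;
                    break;
                }
            }
            if (!isHandler) {
                continue;          // not a user action, filtered out
            }
            total++;
            if (e.getValue() > 0) {
                covered++;
            }
        }
        return total == 0 ? 0.0 : 100.0 * covered / total;
    }

    // Hypothetical method-level coverage data.
    static Map<String, Long> sample() {
        Map<String, Long> hits = new LinkedHashMap<>();
        hits.put("org.myorg.OpenAction.actionPerformed(ActionEvent)", 3L);
        hits.put("org.myorg.SaveAction.actionPerformed(ActionEvent)", 0L);
        hits.put("org.myorg.NodeAction.actionPerformed(ActionEvent)", 5L);
        hits.put("org.myorg.DeleteNodeAction.nodeActionPerformed(Node)", 0L);
        hits.put("org.myorg.Util.format(String)", 9L);
        return hits;
    }

    public static void main(String[] args) {
        // First pass: only the toolkit-level handler signature.
        List<String> filter = List.of(".actionPerformed(ActionEvent)");
        System.out.println(actionCoverage(sample(), filter));
        // After inspection: the product-specific delegate is added to the filter.
        filter = List.of(".actionPerformed(ActionEvent)", ".nodeActionPerformed(Node)");
        System.out.println(actionCoverage(sample(), filter));
    }
}
```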

“Controller” code coverage

Model
Contains the domain logic
View
Implements user interaction
Controller
Maps the two. Only contains code which is called as a
result of view actions and model feedback.

Controller has very little boilerplate code. A good
candidate for 100% block coverage.

“Important” code

Development/SQE marks class/method as important
We use an annotation @CriticalForCoverage
A list of the methods marked as important is obtained
We do that with an annotation processor during the main
compilation
CC data is filtered by the method list
Goal is 100%
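
A minimal sketch of such a marker. Only the annotation name comes from the slide; its retention and targets are assumptions. The slide collects the list with an annotation processor at compile time, whereas this sketch swaps in runtime reflection to stay short, and `Payments` is a hypothetical example class.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Assumed declaration; runtime retention is needed only for the reflective lookup below.
@Retention(RetentionPolicy.RUNTIME)
@Target({ElementType.TYPE, ElementType.METHOD})
@interface CriticalForCoverage {}

// Hypothetical product class.
class Payments {
    @CriticalForCoverage
    void charge() {}

    void log() {}
}

class CriticalMethods {
    // Names of all methods marked directly or through their class.
    static List<String> list(Class<?>... classes) {
        List<String> result = new ArrayList<>();
        for (Class<?> c : classes) {
            boolean wholeClass = c.isAnnotationPresent(CriticalForCoverage.class);
            for (Method m : c.getDeclaredMethods()) {
                if (m.isSynthetic()) {
                    continue;   // skip compiler- or tool-generated methods
                }
                if (wholeClass || m.isAnnotationPresent(CriticalForCoverage.class)) {
                    result.add(c.getName() + "." + m.getName());
                }
            }
        }
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        System.out.println(list(Payments.class));
    }
}
```

The resulting list is what the CC data would be filtered by.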

Examples of non-generic metrics

SOA elements
JavaFX properties
A property in JavaFX is something you could set, get and bind
Insert your own.

CC as a metric - conclusion

There are multiple ways to filter CC data to a set of
code which needs to be covered in full.

There are generic metrics and there is a possibility
to introduce product-specific metrics.

Such metrics are easy to use, although not always
so straightforward to obtain.



Test prioritization

Test prioritization

100500 uncovered lines of code!

“OMG! Where do I start?”

Metric
Develop tests to close the metric
Pick another metric

“Metrics for managers. Me no manager! Me write code!”

Consider mapping CC data to a few other source code
characteristics.

Age of the code

New code had better be tested before it gets to the customer.
Improves bug escape rate, BTW

Old code is more likely to be tested by users
or
Not used by users.

What's a bug escape metric?

Ratio of defects that sneaked out unnoticed

In theory:
(# defects not found before release) / (# defects in the product)

Practical:
(# defects found after release) / (# defects found after + # defects found before)

Number of changes

The more times a piece of code was changed, the more atomic
improvements/bugfixes were implemented in it.

Hence …

Higher risk of introducing a regression.

Number of lines changed

More lines changed – more testing it needs.

Best of all – the number of uncovered lines which were
changed in the last release.

Bug density

Assuming all the pieces were tested equally well …

Many bugs means there are, probably, even more
Hidden behind the known ones
Fixing existing ones may introduce yet more as regressions

Code complexity

Assuming the same engineering talent and the same
technology …

The more complex the code is, the more bugs are likely to be there.

Any complexity metric would work: from class size to
cyclomatic complexity

Putting it together

A formula
(1 – cc) * (a1*x1 + a2*x2 + a3*x3 + ...)
Where
cc – code coverage (0 - 1)
xi – a risk of bug discovery in a piece of code
ai – a coefficient

Putting it together

(1 – cc) * (a1*x1 + a2*x2 + a3*x3 + ...)
The ones with higher value are first to cover

Fix the coefficients
Develop tests
Collect statistics on bug escape
Adjust the coefficients
Continue
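
The formula can be applied per piece of code and the results sorted. In this sketch the coefficients, the choice of three characteristics (newness, number of changes, bug density, each normalised to 0..1) and the component names are all hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class TestPriority {
    // Hypothetical coefficients a1..a3; they are meant to be tuned from bug escape statistics.
    static final double[] A = {0.5, 0.25, 0.25};

    // (1 - cc) * (a1*x1 + a2*x2 + a3*x3 + ...)
    static double priority(double cc, double[] a, double[] x) {
        if (a.length != x.length) {
            throw new IllegalArgumentException("one coefficient per characteristic is required");
        }
        double risk = 0;
        for (int i = 0; i < a.length; i++) {
            risk += a[i] * x[i];
        }
        return (1 - cc) * risk;
    }

    // Names ordered from the highest score to the lowest.
    static List<String> ranked(Map<String, Double> scores) {
        List<String> names = new ArrayList<>(scores.keySet());
        names.sort((p, q) -> Double.compare(scores.get(q), scores.get(p)));
        return names;
    }

    // Hypothetical components: coverage plus {newness, changes, bug density}.
    static Map<String, Double> sampleScores() {
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("Parser", priority(0.9, A, new double[] {1.0, 1.0, 1.0}));
        scores.put("Cache", priority(0.2, A, new double[] {0.5, 0.5, 0.5}));
        scores.put("Logger", priority(0.0, A, new double[] {0.0, 0.0, 0.5}));
        return scores;
    }

    public static void main(String[] args) {
        System.out.println(ranked(sampleScores()));
    }
}
```

A well covered but risky piece (Parser) ends up below a poorly covered, moderately risky one (Cache), which is the intended effect of the (1 – cc) factor.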

Test prioritization - conclusion

CC information alone may not give enough
information.

Need to accompany it with other characteristics of
the code under test to make a decision.

Could use a few other characteristics
simultaneously.



Test prioritization
Execution

Decrease test execution time

Exclude tests which do not add coverage (*).

But, be careful! Remember that CC is not all and even
100% coverage does not mean a lot.

While excluding tests, get some orthogonal measurement
as well, such as specification coverage.

(*) Requires “test scales”
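
One possible way to pick the tests to keep is a greedy selection over per-test coverage data, which is presumably what "test scales" provide. The sketch assumes that data is available as a map from test name to the set of covered block ids; the format and the names are hypothetical, and the greedy strategy is a common heuristic rather than the presenter's stated method.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

class CoverageBasedSelection {
    // Repeatedly keeps the test that adds the most not yet covered blocks.
    static List<String> select(Map<String, Set<Integer>> coveredByTest) {
        Map<String, Set<Integer>> remaining = new LinkedHashMap<>(coveredByTest);
        Set<Integer> covered = new HashSet<>();
        List<String> kept = new ArrayList<>();
        while (true) {
            String best = null;
            int bestGain = 0;
            for (Map.Entry<String, Set<Integer>> e : remaining.entrySet()) {
                Set<Integer> gain = new HashSet<>(e.getValue());
                gain.removeAll(covered);
                if (gain.size() > bestGain) {
                    best = e.getKey();
                    bestGain = gain.size();
                }
            }
            if (best == null) {
                break;              // no remaining test adds coverage
            }
            kept.add(best);
            covered.addAll(remaining.remove(best));
        }
        return kept;
    }

    // Hypothetical per-test coverage: t2 covers nothing that t1 does not.
    static Map<String, Set<Integer>> sample() {
        Map<String, Set<Integer>> data = new LinkedHashMap<>();
        data.put("t1", Set.of(1, 2, 3));
        data.put("t2", Set.of(2, 3));
        data.put("t3", Set.of(4));
        return data;
    }

    public static void main(String[] args) {
        System.out.println(select(sample()));
    }
}
```

A test dropped this way may still check something the others do not, hence the warning on the slide.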

Deep analysis

Study the coverage report, see which test code exercises
which code (*).

Recommended for developers.

(*) Also requires “test scales”

Controlled code changes

Do not allow commits unless all the new/changed code is
covered.

Requires simultaneous commits of tests and the
changes.
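
A commit gate of this kind could be sketched as a comparison of changed lines against covered lines. The input format (file name mapped to line numbers) and all names are assumptions; obtaining the two maps from a version control system and a coverage tool is outside the sketch.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

class CommitGate {
    // Changed lines which no test has covered, grouped by file.
    static Map<String, Set<Integer>> uncoveredChanges(Map<String, Set<Integer>> changed,
                                                      Map<String, Set<Integer>> covered) {
        Map<String, Set<Integer>> result = new TreeMap<>();
        for (Map.Entry<String, Set<Integer>> e : changed.entrySet()) {
            Set<Integer> missing = new TreeSet<>(e.getValue());
            missing.removeAll(covered.getOrDefault(e.getKey(), Set.of()));
            if (!missing.isEmpty()) {
                result.put(e.getKey(), missing);
            }
        }
        return result;
    }

    static boolean commitAllowed(Map<String, Set<Integer>> changed,
                                 Map<String, Set<Integer>> covered) {
        return uncoveredChanges(changed, covered).isEmpty();
    }

    public static void main(String[] args) {
        // Hypothetical commit touching two lines, one of which is covered.
        Map<String, Set<Integer>> changed = Map.of("A.java", Set.of(10, 11));
        Map<String, Set<Integer>> covered = Map.of("A.java", Set.of(10));
        System.out.println(uncoveredChanges(changed, covered));
        System.out.println(commitAllowed(changed, covered));
    }
}
```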

Code coverage - conclusion

100% CC does not guarantee that the code is working right

100% CC may not be needed

It is possible to build good metrics with CC

CC helps with prioritization of test development

Other source code characteristics could be used with CC

