Mutation Testing has been around for almost 20 years. Having originated in academic research, it has found its way into the developer’s toolbox: it is easy to set up and use, and it produces valuable results. But what is mutation testing? It is a practice for determining the actual value of an automated test suite and for automatically exploring parts of the code that have not yet been tested, unveiling surprises even to experienced test automation developers. Given a test suite that runs successfully, mutation testing injects changes into the production code based on a set of rules and reruns the tests to determine whether they fail. Depending on the size of the code base, the execution time increases exponentially due to the sheer number of permutations, which requires thorough planning, focus and prioritization.
3. About me
• Gerald Mücke
• Founder & CEO of DevCon5 GmbH
• Passionate Software Developer
• Focal Points
• Performance Analysis
• Test Automation
• Mutation Testing
• DevOps
• Using mutation testing for > 2 years
5. «Quality Assurance»
“is a way of preventing mistakes or defects in
manufactured products and avoiding problems when
delivering solutions or services to customers”
(Wikipedia)
6. «manufactured» products
• «The process of converting raw materials,
components, or parts into finished goods that meet a
customer's expectations or specifications.»
• Most of the critical code is written manually
• Raw Materials
• existing software or parts of it
• brain, ideas, knowledge, experience, requirements
• Every product is unique
(may look similar, though)
7. «Preventing» defects
• Defects are «created» in development
• Can not be prevented,
it’s human to make mistakes
• Could be detected:
the earlier, the better
• Defects manifest in production
• Or during test
• Can be prevented:
the earlier, the better
8. Sources of a Product
• Internal Development
• QA embeddable
• QA along the pipeline
• Quality is a shared effort
• Easier to change or influence
• External Development
• Software Vendors
• more effort required for
dedicated QA
• Less easy to change
• handoff «Waterfall» style
10. Real-Life Bugs
if( isThreadSafe() ) {
    computeSingleThreaded();
} else {
    computeMultiThreaded();
}
Made it to Production,
Performance Impact:
500% Duration of Day End Processing
11. Real-Life Bugs
if( ! isDevelopmentMode() ){
collectProfileDataAndSendDeveloperReport();
}
In Production,
Impact:
20% Performance loss
Compliance Violation
12. Real-Life Bugs
void function(LocalDate begin, LocalDate end, LocalDate minFrom, ...) {
//...
outerLoop:
while( it.hasNext() ) {
Object current = it.next();
Local from = funcA(current);
Local upto = funcB(current);
while(true){
if( ! isBeforeOrEqual( from , upto ) ) {
continue outerLoop;
}
if( condY(from, minFrom) ) {
from = DateUtil.addDaysToDate(upto, 1);
upto = DateUtil.getLastOfMonth(from);
from = DateUtil.min(new LocalDate[]{ end, from});
upto = DateUtil.min(new LocalDate[]{ end, upto});
17. Good Decisions are based on Information
Simple
Metrics
Number of Unit Tests
Line Coverage
Branch Coverage
Complex
Test Results
Code Review
Static Code Analysis
…
18. Code Coverage
Information about what elements of a
product have been touched by a test.
Common Coverage Metrics
Line Coverage
Condition Coverage
Branch Coverage
Semantics ?
Code
Test
Test Oracle
21. Arcane Arts
To most of the Non-Developers
Software Development
seems increasingly like an
arcane art
Languages, Paradigms,
Frameworks
Algorithms & Data Structures
O(n), ByteCode, Lambdas...
23. Quality Gates
Decision Point
List of Checks when the Product is
ready to be released
Based on information
Based on agreement between
stakeholders
Part of Definition of Done
Evolves over time
Should not replace human judgement
25. Perspectives
Programmers
• Implement the Solution
• Provide indication the solution is
working
• claim they did it «right»
Testers
• Show if and how the solution will fail
• have to provide information for
stakeholders to make informed
decisions
• usually don’t understand arcane arts
26. Checking vs. Testing
Things we are aware of but don’t understand
Things we are aware of and understand
Things we are neither aware of nor understand
Things we understand but are not aware of
[Figure: quadrant chart. Axes: Understanding and Awareness, each ranging from Unknown to Known. Labels: Testing, Checking, Automated Checking]
27. The Testing Pyramid of Functional Tests
UI Tests
Integration Tests
Unit Tests
Degree of Automation
36. Mutation Testing – History
Mutation testing injects faults,
based on rules, into a product
to verify whether the test suite
is capable of finding them.
Fault injection technique
Concept has been known since ~1970
First implementation of a mutation testing tool in 1980
For most of that time it was a subject of academic research only
Recently, with increasing processing power, there is a growing interest
More academic research ongoing
Practical tooling available
37. Mutation Testing – Some Theory
Mutation testing is a special form of Fault Injection
Based on two hypotheses
1: Most of the software faults are due to small syntactic errors
2: Simple faults can cascade to more emergent faults
Assumption:
“if a mutant was introduced without the behavior of the test suite being affected, this indicated either
that the code that had been mutated was never executed (dead code) or that the test suite was unable
to locate the faults represented by the mutant” (Wikipedia)
38. Mutation Testing - Definitions
Mutant
a variation P’ of the product P created by
applying a mutation operator m
P’ = m(P)
Killed Mutant
a variation P’ in which a test has found at
least ONE error
Live Mutant
a variation P’ in which a test has found NO
errors
Mutation Operators
A function m() that creates a variation of the
Product P by applying a set of modification
rules
Inject Faults into the Product
Based on Bug Taxonomies
Mutation Score
Number of Killed Mutants / Total number of
Mutants
Also Called Mutation Coverage
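The definitions above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the product P is a toy add function, the mutants are written by hand instead of being generated by a tool, and all names (MutationScoreDemo, ORIGINAL, MUTANTS, WEAK_SUITE, STRONG_SUITE) are made up for this example.

```java
import java.util.List;
import java.util.function.IntBinaryOperator;
import java.util.function.Predicate;

class MutationScoreDemo {

    /** The product P: adds two numbers. */
    static final IntBinaryOperator ORIGINAL = (a, b) -> a + b;

    /** Hand-written mutants P' = m(P); a real tool would generate these. */
    static final List<IntBinaryOperator> MUTANTS = List.of(
            (a, b) -> a - b,   // "+" replaced by "-"
            (a, b) -> a * b,   // "+" replaced by "*"
            (a, b) -> a        // second operand dropped
    );

    /** Each test returns true when it passes for the given implementation. */
    static final List<Predicate<IntBinaryOperator>> WEAK_SUITE = List.of(
            add -> add.applyAsInt(2, 2) == 4,
            add -> add.applyAsInt(0, 0) == 0
    );

    static final List<Predicate<IntBinaryOperator>> STRONG_SUITE = List.of(
            add -> add.applyAsInt(2, 2) == 4,
            add -> add.applyAsInt(0, 0) == 0,
            add -> add.applyAsInt(1, 2) == 3
    );

    /** A mutant is killed if at least one test fails against it. */
    static boolean isKilled(IntBinaryOperator mutant,
                            List<Predicate<IntBinaryOperator>> suite) {
        return suite.stream().anyMatch(test -> !test.test(mutant));
    }

    /** Mutation score = killed mutants / total mutants. */
    static double mutationScore(List<IntBinaryOperator> mutants,
                                List<Predicate<IntBinaryOperator>> suite) {
        long killed = mutants.stream().filter(m -> isKilled(m, suite)).count();
        return (double) killed / mutants.size();
    }

    public static void main(String[] args) {
        System.out.println("weak suite:   " + mutationScore(MUTANTS, WEAK_SUITE));
        System.out.println("strong suite: " + mutationScore(MUTANTS, STRONG_SUITE));
    }
}
```

With the weak suite, the multiplication mutant stays alive because 2 * 2 and 0 * 0 happen to equal the sums; the additional test in the strong suite kills it.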
39. Some more definitions
Equivalent Mutation
a variation P’ that is semantically
identical to P
Duplicate Mutation
a variation P’ that is equivalent to another
variation P’’
Weak Mutation
Fault does not lead to incorrect output
Strong Mutation
Fault propagates to incorrect output
Unstable Mutation
Any test can find the mutations generated
by it
High-Order Mutants
Mutants that are defined by a set of Low-
Level Mutants
Subsumed Mutants
One mutant subsumes another if at least
one test kills the first and every test that
kills the first also kills the second.
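An equivalent mutation is easiest to see in code. The following sketch assumes a hypothetical operator that replaces "<" with "!=" in a loop condition; the class and method names are made up for illustration. Because the counter starts at 0 and is incremented by exactly 1, both loops stop at the same point, so no test can tell the two versions apart.

```java
class EquivalentMutantDemo {

    /** Original: sums all values of the array. */
    static int sumOriginal(int[] values) {
        int sum = 0;
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
        }
        return sum;
    }

    /** Mutant: "<" replaced by "!=". Semantically identical to the original. */
    static int sumMutant(int[] values) {
        int sum = 0;
        for (int i = 0; i != values.length; i++) {
            sum += values[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[] sample = {1, 2, 3};
        System.out.println(sumOriginal(sample) == sumMutant(sample));
    }
}
```

Such a mutant can never be killed, so it lowers the mutation score without pointing to a gap in the test suite.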
42. Approaches to Mutation Testing
Byte Code Mutation
Can be done on-the-fly
Faster to apply and execute
Might be affected by compiler optimizations
Source Code Mutation
Requires recompilation after every change
Takes very long
Is not affected by compiler optimizations
Higher Level Mutations
Configuration, Architecture, Specification,
Use/Business Case, ...
No Tooling Support (yet?)
43. Mutation Testing Phases
Mutant generation
analyzing classes and generating mutations for them
Test selection
selecting the tests to run against the mutations
Mutant insertion
loading the mutations into a JVM / Runtime Environment
Mutant detection
executing tests against the loaded mutants
44. Mutation Testing 101
Modify your code
(Mutant generation)
Re-Run the Test
(Test selection + Loading)
Check if test is failing
(Detection)
class Builder {
    String value;
    Builder withValue(String in) {
        this.value = in; // mutant: this assignment is removed
        return this;
    }
}
@Test
public void testLeft() {
    Builder b = new Builder().withValue("one");
    assertNotNull(b); // still passes against the mutant
}
If the test is still green, it’s a fail!!!
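The Builder example can be played through without a test framework. The sketch below is an assumption-laden illustration: the mutant is modeled as a hand-written subclass (MutantBuilder) that drops the assignment, and the two "tests" are plain methods returning true when they pass. None of these names come from a real tool.

```java
class BuilderMutationDemo {

    static class Builder {
        String value;

        Builder withValue(String in) {
            this.value = in;
            return this;
        }
    }

    /** Hand-written mutant: the assignment in withValue is removed. */
    static class MutantBuilder extends Builder {
        @Override
        Builder withValue(String in) {
            return this;
        }
    }

    /** Weak test: only checks that something is returned. */
    static boolean weakTest(Builder builder) {
        return builder.withValue("one") != null;
    }

    /** Stronger test: checks the observable effect of the call. */
    static boolean strongTest(Builder builder) {
        return "one".equals(builder.withValue("one").value);
    }

    public static void main(String[] args) {
        System.out.println("weak test passes on mutant:   " + weakTest(new MutantBuilder()));
        System.out.println("strong test passes on mutant: " + strongTest(new MutantBuilder()));
    }
}
```

The weak test stays green for the mutant, so the mutant survives; the stronger test fails for the mutant and therefore kills it.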
45. Related Techniques
Bebugging / Fault Seeding
randomly adding bugs, programmers are tasked to find them
Fuzzing
Injecting Faults into Test Data
For Operations:
Chaos Monkey (Simian Army, Netflix)
Randomly terminating running processes or servers to test operational procedures or fitness
47. Tool: PIT
Mutation Testing for Java / JVM
Operates on bytecode modification
easy to use - works with Ant, Maven, Gradle and others
~ 20 mutation operators for altering your code
Parallel execution
fast - can analyze in minutes what would take earlier systems days
Active Community
actively developed & supported
Mature Tooling
Good Documentation
HTML & XML Reports
49. Interpreting Results
Live Mutants
Reflects unspecified behavior
superfluous code / unrequired semantics
Could be an actual bug that is not covered by the test suite
Could be equivalent mutation
Killed by TimeOut or MemError
Could be “real kill” (i.e. endless loop)
Could be still alive
Mutation Score
Gives an indication of the overall quality of your test suite
50. Unit Test Maturity Model
Level  Description
0      No test
1      We have a test
2      Level 1 + we have > 0% line coverage
3      Level 2 + we have > 50% branch coverage
4      Level 3 + we have at least 1 effective assertion per test
5      Level 4 + we have > 80% mutation coverage
52. Timeouts
Mutating abort-conditions of loops can cause timeouts
Loop runs endlessly
Mutation is effectively killed
Mutation might not be killed
Loop runs longer (i.e. counter underrun / overrun) -> Mutation might eventually survive
Your system is just too slow / the tests take too long
When to stop the test?
Will the test fail?
If a loop runs longer, the machine performance is important for choosing the timeout.
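How a timeout turns an endless loop into a detected mutant can be sketched as follows. This is a simplified illustration, not how any particular tool works: the mutant is written by hand, the test runs on a worker thread of the same JVM, and the interrupt check inside the mutated loop exists only so that this sketch can stop the worker again. All names are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.IntUnaryOperator;

class TimeoutDemo {

    enum Outcome { SURVIVED, KILLED, TIMED_OUT }

    /** Original: counts the iterations needed to count n down to zero. */
    static int original(int n) {
        int steps = 0;
        for (int i = n; i > 0; i--) {
            steps++;
        }
        return steps;
    }

    /** Mutant: the decrement was removed, so the loop never ends for n > 0. */
    static int mutant(int n) {
        int steps = 0;
        // Assumption of this sketch: the interrupt check lets the harness stop the thread.
        for (int i = n; i > 0 && !Thread.currentThread().isInterrupted(); ) {
            steps++;
        }
        return steps;
    }

    /** Runs one test (expects 3 steps for n = 3) with a wall-clock timeout. */
    static Outcome runTest(IntUnaryOperator code, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        Future<Boolean> passed = executor.submit(() -> code.applyAsInt(3) == 3);
        try {
            return passed.get(timeoutMillis, TimeUnit.MILLISECONDS)
                    ? Outcome.SURVIVED
                    : Outcome.KILLED;
        } catch (TimeoutException e) {
            return Outcome.TIMED_OUT;
        } catch (InterruptedException | ExecutionException e) {
            return Outcome.KILLED;
        } finally {
            executor.shutdownNow(); // interrupts the worker thread
        }
    }

    public static void main(String[] args) {
        System.out.println("original: " + runTest(TimeoutDemo::original, 2000));
        System.out.println("mutant:   " + runTest(TimeoutDemo::mutant, 200));
    }
}
```

The timeout value is the critical choice: if it is too short for a slow machine, a correct but slow run is reported as a timeout; if a mutated loop merely runs longer instead of endlessly, a generous timeout lets it finish and the mutant may survive.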
53. Limitations
Fault Coverage
~¼ of real faults are not coverable by mutation testing
Mutation Score
PIT does not recognize subsumed or equivalent mutations
mutation score may not be “academically” precise – context matters!
Mutation Operators
PIT has no Java concurrency mutation operators
PIT has no high-order mutation operators
PIT has no Java-language specific mutation operators
Techniques
PIT does not support sampling
54. Value has its cost
Mutation Testing is computationally expensive
Duration of a mutation test depends on
number of tests
test suite execution time
number of mutation operators
Processing Power
Basically:
D = x^n
n = number of mutation operators
x = number of tests
55. Deviation in Mutation Score
Impact of Mutation Operator Selection
[Figure: computational effort and mutations found over size of codebase, comparing more operators with fewer operators]
56. Mutation Analysis of Large Codebases
Break into Chunks
[Figure: computational effort over size of codebase with a time cap; weekday labels Mo Tu We Th Fr]
57. Other Techniques
Incremental Analysis
Based on historical data
Only test code that has changed
Increases deviation
Sampling
Good for Mutation Scoring
Increases Deviation
No support for sampling in PIT
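Sampling can be approximated outside of a tool by selecting a random subset of the generated mutants before running any tests. The sketch below assumes mutants are available as a plain list; MutantSamplingDemo and sample are hypothetical names, and the seed parameter only exists to make runs repeatable.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class MutantSamplingDemo {

    /**
     * Returns a random sample of the mutants.
     * The sample size is the given fraction of the list, rounded up.
     */
    static <T> List<T> sample(List<T> mutants, double fraction, long seed) {
        if (fraction <= 0.0 || fraction > 1.0) {
            throw new IllegalArgumentException("fraction must be in (0, 1]");
        }
        List<T> shuffled = new ArrayList<>(mutants);
        Collections.shuffle(shuffled, new Random(seed));
        int size = (int) Math.ceil(mutants.size() * fraction);
        return new ArrayList<>(shuffled.subList(0, size));
    }

    public static void main(String[] args) {
        List<String> mutants = List.of("m1", "m2", "m3", "m4", "m5", "m6", "m7", "m8");
        System.out.println(sample(mutants, 0.5, 42L));
    }
}
```

Only the sampled mutants are then executed against the test suite. The resulting score is an estimate, which is why sampling increases the deviation of the mutation score.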
58. Challenges of Mutation Testing
Redundant Mutants
Subsuming
Duplicates
Equivalent
Equivalent Detection
Current Algorithm achieves 50% detection rate in
Research
High Order Mutations
Computational Cost
Mutation and Test Selection
[Figure labels: Equivalent Mutants; Subsuming/Duplicate Mutants]
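Whether one mutant subsumes another can be checked once it is known which tests kill which mutant. The following sketch assumes that this information is available as sets of test names; the class name, method name and test names are made up for illustration. It implements the usual definition: at least one test kills the first mutant, and every test that kills the first also kills the second.

```java
import java.util.Set;

class SubsumptionDemo {

    /**
     * killsFirst:  names of the tests that kill the first mutant
     * killsSecond: names of the tests that kill the second mutant
     */
    static boolean subsumes(Set<String> killsFirst, Set<String> killsSecond) {
        return !killsFirst.isEmpty() && killsSecond.containsAll(killsFirst);
    }

    public static void main(String[] args) {
        Set<String> hardToKill = Set.of("testBoundary");
        Set<String> easyToKill = Set.of("testBoundary", "testHappyPath");
        System.out.println(subsumes(hardToKill, easyToKill));
        System.out.println(subsumes(easyToKill, hardToKill));
    }
}
```

A subsumed mutant adds no information: any suite that kills the subsuming mutant kills it as well, so it can be dropped to save computational cost.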
60. Some Advice
Unit Tests are usually owned by development
challenge them with Mutation Testing!
It’s NOT unit tested until mutation tested.
Don’t go on a killing spree
Set achievable goals for mutation score
Triage surviving mutants
A mutation score > 0.8 is considered good (it depends…)
Determine mutation score regularly at a sensible interval
Every build vs. Every release
Use historical data & SCM support
Find concrete mutants as needed
Adjust mutators & scope
61. Use Cases
Finding Gaps in Test Suite
Testing Highly Exposed Code
Algorithms and Calculations
Security-related code
Transaction-related code
Assessing Test Suites / Testing Strategies / Methodologies
By comparing the mutation scores, i.e.
Developing Test Suites for Legacy Code
Finding semantic hotspots
Finding gaps in Test Suite
Forced to break Code Base into more manageable pieces
Minimizing Test Suites
Reduce number of Tests while keeping mutation score stable
! Reduces the effectiveness of the suite for detecting real faults
63. Takeaways
Don’t trust your Unit Tests unless you have mutation-tested them.
Mutation Testing is the practice of finding bugs in your test suite
Forget about other coverage metrics
Cheap to get, but next to no value
Include Mutation Testing in your project.
Always.
Use it with common sense
don’t go on a killing spree.
For Java
PIT is the tool to use.