14. What is mutation testing? 3
■ It is the technique of running your tests over deliberately broken
versions (mutants) of your code to make sure the tests fail
■ Failing is the new passing—the mutant should break the tests
■ The tests collectively should fail, not every individual test
■ It indicates how well your code is covered by your suite of tests
■ Mutation testing doesn’t test your code, it tests your tests
■ It is not new—the concept was introduced by Richard Lipton in 1971
20. How do we “break” the code? 4
■ Example: swap arithmetic operators +, –, ×, ÷, modulo
■ Errors can be introduced, for example division by zero—but this
should just produce a test failure, which is fine
■ Example: swap comparison operators ==, !=, <=, <, >=, >
■ Can introduce equivalent mutants: for example, == and >= can behave
identically when checking the upper limit of a loop counter, so no test
can tell them apart; handle with care
■ Example: delete “statements” (e.g. .NET IL sequence points)
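To make the loop-counter case concrete, here is a minimal sketch of an equivalent mutant. The class and method names are hypothetical and are not taken from any tool. Because the counter starts at 0, rises by 1, and an array length is never negative, the counter can never exceed the limit, so `==` and `>=` exit the loop at exactly the same moment.

```csharp
using System;

public static class EquivalentMutantDemo
{
    // Original: stop when the counter equals the upper limit.
    public static int SumOriginal(int[] values)
    {
        int total = 0;
        int i = 0;
        while (true)
        {
            if (i == values.Length) break;
            total += values[i];
            i++;
        }
        return total;
    }

    // Mutant: == swapped for >=. The counter never exceeds the limit,
    // so this behaves identically to the original for every input.
    public static int SumMutant(int[] values)
    {
        int total = 0;
        int i = 0;
        while (true)
        {
            if (i >= values.Length) break;
            total += values[i];
            i++;
        }
        return total;
    }

    public static void Main()
    {
        int[][] inputs = { new int[0], new[] { 7 }, new[] { 1, 2, 3 } };
        foreach (int[] input in inputs)
        {
            // Both columns always match, so no assertion can kill the mutant.
            Console.WriteLine(SumOriginal(input) + " " + SumMutant(input));
        }
    }
}
```

A surviving mutant of this kind does not point to a missing test, which is why such mutants are usually excluded from the results rather than chased with new tests.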
27. Then what? 5
■ We save the mutated code and run the test suite against it
■ We hope the test suite fails and the mutant is “killed”
■ If the mutant survives, then further investigation is necessary
■ We may have to add tests to kill the mutant
■ We may have to refactor the code to kill the mutant
■ Both are related to TDD: red, green, refactor
32. Can I see an example? 6
■ Here’s a simple method with a test
public int Subtract(int a, int b)
{
    return a - b;
}
[Test]
public void TestSubtract()
{
    Assert.AreEqual(1, Subtract(3, 2));
}
Test passes
■ We’re going to mutate the arithmetic operator, replacing it with
addition and division
■ Let’s try addition: 3 + 2 = 5, so the test fails, which is good
return a + b;
Test fails
■ Let’s try division: 3 / 2 = 1 (remember, this is integer division), so
the test passes, which is bad
return a / b;
Test passes
38. What does that show us? 7
■ The division mutant survived
■ The existing test (suite) is inadequate
■ We need to improve the test(s): the current test is far too simplistic
with just one simple assertion
public int Subtract(int a, int b)
{
    return a - b;
}
[Test]
public void TestSubtract()
{
    Assert.AreEqual(1, Subtract(3, 2));
}
Unit test passes
Mutation testing fails
■ We add a second test assertion
■ Here is the fixed test that now kills the division mutant
[Test]
public void TestSubtract()
{
    Assert.AreEqual(1, Subtract(3, 2));
    Assert.AreEqual(5, Subtract(3, -2));
}
Unit test passes
Mutation testing passes
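The Subtract example can be replayed by hand without a mutation testing tool. The sketch below is an illustration only: the mutants are hand-written lambdas standing in for what a tool would generate, and `SubtractMutants`, `WeakTest`, `ImprovedTest` and `Survivors` are hypothetical names. Any exception thrown by a mutant is treated as a test failure, in line with the division-by-zero remark earlier in the deck.

```csharp
using System;
using System.Collections.Generic;

public static class SubtractMutants
{
    // Hand-written stand-ins for the arithmetic mutants of "a - b".
    public static readonly (string Name, Func<int, int, int> Code)[] Mutants =
    {
        ("a + b", (a, b) => a + b),
        ("a * b", (a, b) => a * b),
        ("a / b", (a, b) => a / b),
        ("a % b", (a, b) => a % b),
    };

    // The original test: one assertion.
    public static bool WeakTest(Func<int, int, int> subtract) =>
        subtract(3, 2) == 1;

    // The improved test: the second assertion uses a negative operand.
    public static bool ImprovedTest(Func<int, int, int> subtract) =>
        subtract(3, 2) == 1 && subtract(3, -2) == 5;

    // Returns the names of the mutants that the given test fails to kill.
    public static List<string> Survivors(Func<Func<int, int, int>, bool> test)
    {
        var survivors = new List<string>();
        foreach (var mutant in Mutants)
        {
            bool passed;
            try { passed = test(mutant.Code); }
            catch (Exception) { passed = false; } // an error counts as a kill
            if (passed) survivors.Add(mutant.Name);
        }
        return survivors;
    }

    public static void Main()
    {
        Console.WriteLine("Weak test: " + Survivors(WeakTest).Count + " survivor(s)");
        Console.WriteLine("Improved test: " + Survivors(ImprovedTest).Count + " survivor(s)");
    }
}
```

With the single assertion, the division mutant survives because 3 / 2 is 1 in integer arithmetic. In this sketch the modulo mutant survives too, since 3 % 2 is also 1. The second assertion kills both: in C#, 3 / -2 is -1 and 3 % -2 is 1, and neither equals 5.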
42. How do we do mutation testing? 8
■ Can be applied at source code, intermediate code (Java ByteCode,
Microsoft IL) or compiled code level
■ For Java and .NET, intermediate code is easiest to target in an
automated solution as it has a relatively small set of well
understood instructions and doesn’t require code parsing
■ Can be automated using a mutation testing tool
47. What tools are available? 9
■ There are a number of abortive open source attempts, or skeleton
projects; only a few projects are actively maintained and developed
■ The following show signs of meaningful activity within the last six
months at the time of writing (August 2012)
■ Java ByteCode mutation: PIT, Javalanche
■ .NET IL: NinjaTurtles
53. Why do we do mutation testing? 10
■ I like to think of it as a way of measuring a team’s test-drivenness
■ Test-driven development (TDD): write test, write code to make it
pass, refactor (red, green, refactor)
■ No code should be written that isn’t there to make a test pass
■ Therefore if you mutate any code you should make a test fail
■ So perfect TDD implies 100% of (meaningful) mutants killed
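One way to put a number on "100% of (meaningful) mutants killed" is a mutation score. The sketch below assumes that "meaningful" means every generated mutant except the equivalent ones. That reading is an assumption, as are the name `MutationScore.Percent` and the figures in `Main`, which are made up for illustration.

```csharp
using System;

public static class MutationScore
{
    // Percentage of meaningful mutants killed, where
    // meaningful = total generated - known equivalent mutants (assumption).
    public static double Percent(int killed, int total, int equivalent)
    {
        int meaningful = total - equivalent;
        if (meaningful <= 0)
        {
            throw new ArgumentException("No meaningful mutants to score.");
        }
        return 100.0 * killed / meaningful;
    }

    public static void Main()
    {
        // Illustrative figures only: 21 mutants, 1 equivalent, 18 killed.
        Console.WriteLine(Percent(killed: 18, total: 21, equivalent: 1));
    }
}
```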
60. YAMM—yet another maturity model 11
■ Level 0: code is tested when it is released, by the users
■ Level 1: code is tested by dedicated testers before release
■ Level 2: code is also tested by the developers; some automation
■ Level 3: tests are written first and automated as far as is possible
■ Level 4: mutation testing is applied to verify efficacy of test suites
■ This could be a tree, with other ideas like BDD introduced, but it
illustrates the position of mutation testing
67. What are the disadvantages? 12
■ Good tooling is only just emerging
■ It’s relatively slow and computationally expensive
■ Each line of code can produce many mutants
■ Each mutant requires a suite of tests to be run
■ But the suite of tests can be narrowed down, and it only needs to
be run until the first failure is encountered
■ And it’s eminently parallelisable for multicore PCs
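The two mitigations above, stopping at the first failing test and spreading mutants across cores, can be sketched in a few lines. This is an in-process simplification under stated assumptions: mutants and tests are plain delegates, and `ParallelMutationRun` and `FindSurvivors` are hypothetical names. A real tool would run the test suite against mutated assemblies, possibly in separate processes.

```csharp
using System;
using System.Collections.Concurrent;
using System.Linq;
using System.Threading.Tasks;

public static class ParallelMutationRun
{
    public static string[] FindSurvivors(
        (string Name, Func<int, int, int> Code)[] mutants,
        Func<Func<int, int, int>, bool>[] tests)
    {
        var survivors = new ConcurrentBag<string>();

        // Each mutant is independent of the others, so they can run in parallel.
        Parallel.ForEach(mutants, mutant =>
        {
            // Any() stops at the first failing test (fail fast);
            // only a surviving mutant forces every test to run.
            bool killed = tests.Any(test => !Passes(test, mutant.Code));
            if (!killed) survivors.Add(mutant.Name);
        });

        return survivors.OrderBy(name => name, StringComparer.Ordinal).ToArray();
    }

    // A test that throws is treated as a failed test.
    private static bool Passes(
        Func<Func<int, int, int>, bool> test, Func<int, int, int> code)
    {
        try { return test(code); }
        catch (Exception) { return false; }
    }

    public static void Main()
    {
        var mutants = new (string Name, Func<int, int, int> Code)[]
        {
            ("a + b", (a, b) => a + b),
            ("a / b", (a, b) => a / b),
        };
        var tests = new Func<Func<int, int, int>, bool>[]
        {
            subtract => subtract(3, 2) == 1,
            subtract => subtract(3, -2) == 5,
        };
        Console.WriteLine("Survivors: " + FindSurvivors(mutants, tests).Length);
    }
}
```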
71. What are the disadvantages? 13
■ A passing mutation test (mutant killed) can fail fast, stopping at the
first failing unit test, but a failing mutation test (mutant survived) is
likely to take significantly longer as it runs the entire set of relevant
tests
■ So it should be applied continuously to keep quality high
■ Can only realistically be applied in this way to code bases and test
suites that are already in fairly good shape
78. A shameless plug for NinjaTurtles 14
■ Disclaimer: I’m the lead developer on this so I’m biased
■ Open source tool working at a .NET IL level
■ Includes seven different types of mutation at time of writing
■ Compatible with existing .NET languages and unit test frameworks
■ Optimisations include parallelised test runs, reduced test suites, fail
fast (where the underlying test runner supports it)
■ Console runner gives immediate results
83. Sample NinjaTurtles output 15
■ Simple console command applied to one class within the NinjaTurtles
code base
■ Over 1,000 runs of the test suite, each run as a separate process
■ Test results returned in under 90 seconds
■ Also XML or HTML output
89. Conclusions 16
■ Mutation testing has a lot to offer teams looking to constantly improve
the way they do things
■ Test code is a first class citizen—it deserves testing too
■ Computing power and quality of tooling are improving so mutation
testing is a realistic tool to use day to day
■ Passing mutation tests run a lot quicker than failing ones, so this
should only be applied by relatively “mature” development teams
■ The barrier to entry is low, so you can start doing this today