Describes how to use combinatorial testing to reduce the number of Selenium test cases while still maintaining the desired code coverage, as well as the benefits of keeping the inputs, metadata, and outputs relating to the tests and their runs in a database.
Seconf2011 database driven combinatorial testing
1. Get More For Less!
Database Driven Combinatorial Testing
Aaron Silverman
Lead Engineer
Applied Predictive Technologies
www.predictivetechnologies.com
2. Background
APT produces business analytic software used by large
organizations to make decisions worth millions of dollars
It is critically important that:
• Our software behave as expected
• The numbers we display are correct
page 2
3. Agenda
We aim to share the lessons we have learned
• Front-end testing can only be made so fast
• An alternate way to reduce testing time is to reduce the
number of tests—but you don’t want to reduce testing coverage
• Combinatorial test selection techniques allow selective test case
creation and intelligent test case reduction
• Moving test case inputs and expected outputs out of the code
and into a testing framework database makes leveraging
combinatorial testing easy
• The database also makes test case maintenance and debugging
transparent and easy
page 3
4. Front End Testing Is Slow
End-to-end Selenium tests most accurately simulate an actual user
using the product, making them slower than other automated tests
March 2011 APT Continuous Integration Automated Testing Stats
Test Type     Test Cases Run    Average Time Per Test Case
Unit          434,016           6 seconds
Functional    91,912            21 seconds
Selenium      13,652            983 seconds
page 4
5. Test Cases Cannot Always Be Shortened
Despite extensive parallelization, some test cases can only be so
fast without deviating from real world use cases
Large Datasets + Complex Computations = Slow Tests
page 5
6. Number of Tests Can Often Be Reduced
However, testing time can still be reduced by running fewer tests
Hey! Are you sure
about this?
Note: No turtles were harmed in the making of this presentation
page 6
7. Number of Tests Can Often Be Reduced
Run fewer tests?
Doesn’t that mean
we are testing
less?
page 7
8. Number of Tests Can Often Be Reduced
Not if we pick our
test cases using
combinatorial
testing!
page 8
10. Intelligently Pick Test Cases
Combinatorial testing focuses on testing specific relevant
combinations of inputs
There is a plethora of research and
resources about combinatorial testing,
especially 2-way, or pairwise testing.
• Hundreds of Papers
• Dozens of Tools
APT uses Microsoft’s freely available Pairwise Independent
Combinatorial Testing tool, also known as PICT, as well as some tools we
developed ourselves
Download PICT at: http://msdn.microsoft.com/en-us/testing/bb980925.aspx
page 10
11. Simplified Example
Suppose we had an attendee survey for this conference…
Attendee Survey
Name:
Do you like Selenium?
Yes
No
Rate the conference: (Scale 1-10)
page 11
12. Simplified Example
…and if you do not like Selenium, it asks why
Attendee Survey
Name:
Do you like Selenium?
Yes
No
because (check all that apply):
I have too many slow tests
The devs keep changing the UI
My favorite browser is Lynx
I like to complain a lot
Rate the conference: (Scale 1-10)
page 12
13. Simplified Example
There is lots to test!
Attendee Survey
Name:
• No Name Entered
• Name Entered
Do you like Selenium?
• Yes
• No, because (check all that apply):
    I have too many slow tests
    The devs keep changing the UI
    My favorite browser is Lynx
    I like to complain a lot
  16 combinations of checked selections when “No” is selected
Rate the conference: (Scale 1-10)
• Valid and favorable
• Valid and unfavorable
• Invalid
• Non-integer
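To put a number on "lots", the exhaustive input space can be counted directly. The sketch below is only an illustration; the class and method names are made up for this example, and it assumes that a "Yes" answer never has reasons attached.

```java
/** Counts every distinct input combination for the survey example. */
final class SurveyInputSpace {
    static long exhaustiveCount() {
        long nameOptions = 2;               // no name entered, name entered
        long reasonSubsets = 1L << 4;       // 16 ways to tick four checkboxes
        long likeOptions = 1 + reasonSubsets; // "Yes", or "No" with any subset of reasons
        long ratingOptions = 4;             // favorable, unfavorable, invalid, non-integer
        return nameOptions * likeOptions * ratingOptions;
    }

    public static void main(String[] args) {
        System.out.println(exhaustiveCount() + " exhaustive test cases");
    }
}
```

That is 2 × 17 × 4 = 136 cases for a three-question form, which is the number that pairwise selection tries to shrink.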
page 13
15. Test Case Creation Tools
Using pairwise tools we can reduce our test cases to a much
smaller set that maintains coverage quality
pict_inputs.txt
# specify input names and possibilities
name: blank, filled
likeSelenium: yes, no
slowTests: checked, unchecked
changingUI: checked, unchecked
usesLynx: checked, unchecked
complainsALot: checked, unchecked
rating: favorable, unfavorable, invalid, non-integer
# exclude impossible combinations
# if user likes selenium, no reasons can be checked
IF [likeSelenium] = "yes" THEN [slowTests] = "unchecked"
AND [changingUI] = "unchecked" AND [usesLynx] =
"unchecked" AND [complainsALot] = "unchecked";
Note: PICT is far more powerful than the options shown here
page 15
16. Test Case Creation Tools
PICT will then generate a compact set of test cases that covers
all possible input pairs meeting our specified conditions
C:\Program Files (x86)\PICT>pict.exe pict_inputs.txt > pict_outputs.xls
pict_outputs.xls
name    likeSelenium  slowTests  changingUI  usesLynx   complainsALot  rating
blank   no            checked    checked     checked    checked        favorable
blank   no            checked    checked     unchecked  checked        unfavorable
blank   no            unchecked  checked     unchecked  checked        non-integer
blank   yes           unchecked  unchecked   unchecked  unchecked      favorable
blank   yes           unchecked  unchecked   unchecked  unchecked      invalid
filled  no            checked    checked     checked    checked        invalid
filled  no            checked    unchecked   checked    unchecked      non-integer
filled  no            unchecked  checked     unchecked  unchecked      favorable
filled  no            unchecked  unchecked   checked    checked        favorable
filled  no            unchecked  unchecked   checked    checked        unfavorable
filled  yes           unchecked  unchecked   unchecked  unchecked      non-integer
filled  yes           unchecked  unchecked   unchecked  unchecked      unfavorable
Where did all my friends go?
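To make the idea of "covering all input pairs" concrete, here is a minimal sketch of a naive greedy selection over the same survey model. It is not PICT's algorithm and will not necessarily return the same rows or the same number of rows; the class name `GreedyPairwise` and its methods are hypothetical, and the approach (enumerate every valid combination, then repeatedly keep the one that covers the most uncovered pairs) only works for small models like this one.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Naive greedy pairwise selection; an illustration only, not PICT's algorithm. */
final class GreedyPairwise {
    // Parameter values copied from pict_inputs.txt, in the same order.
    static final String[][] VALUES = {
        {"blank", "filled"},                                    // name
        {"yes", "no"},                                          // likeSelenium
        {"checked", "unchecked"},                               // slowTests
        {"checked", "unchecked"},                               // changingUI
        {"checked", "unchecked"},                               // usesLynx
        {"checked", "unchecked"},                               // complainsALot
        {"favorable", "unfavorable", "invalid", "non-integer"}  // rating
    };

    /** The constraint from the model: liking Selenium means no reason is checked. */
    static boolean isValid(String[] row) {
        if (!row[1].equals("yes")) return true;
        for (int i = 2; i <= 5; i++) {
            if (row[i].equals("checked")) return false;
        }
        return true;
    }

    /** Every combination of values that satisfies the constraint. */
    static List<String[]> validRows() {
        List<String[]> rows = new ArrayList<>();
        build(new String[VALUES.length], 0, rows);
        return rows;
    }

    private static void build(String[] partial, int index, List<String[]> out) {
        if (index == VALUES.length) {
            if (isValid(partial)) out.add(partial.clone());
            return;
        }
        for (String value : VALUES[index]) {
            partial[index] = value;
            build(partial, index + 1, out);
        }
    }

    /** All value pairs in a row, encoded as "i=a|j=b" with i < j. */
    static Set<String> pairsOf(String[] row) {
        Set<String> pairs = new HashSet<>();
        for (int i = 0; i < row.length; i++) {
            for (int j = i + 1; j < row.length; j++) {
                pairs.add(i + "=" + row[i] + "|" + j + "=" + row[j]);
            }
        }
        return pairs;
    }

    /** Repeatedly keep the valid row that covers the most still-uncovered pairs. */
    static List<String[]> select() {
        List<String[]> candidates = validRows();
        Set<String> uncovered = new HashSet<>();
        for (String[] row : candidates) uncovered.addAll(pairsOf(row));

        List<String[]> suite = new ArrayList<>();
        while (!uncovered.isEmpty()) {
            String[] best = null;
            int bestGain = 0;
            for (String[] row : candidates) {
                Set<String> gain = pairsOf(row);
                gain.retainAll(uncovered);
                if (gain.size() > bestGain) {
                    bestGain = gain.size();
                    best = row;
                }
            }
            suite.add(best);
            uncovered.removeAll(pairsOf(best));
        }
        return suite;
    }

    public static void main(String[] args) {
        List<String[]> suite = select();
        System.out.println(validRows().size() + " valid combinations, "
                + suite.size() + " selected");
        for (String[] row : suite) System.out.println(String.join("\t", row));
    }
}
```

Because only pairs that occur in some valid combination are tracked, the loop always finds a row with a positive gain and terminates once every feasible pair is covered.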
page 16
17. We should test outputs this way too
Our survey also has a few different outputs to validate:
• Any invalid input: “You seem to be having trouble following directions, please try again!”
• Rating >= 5: “Glad you liked the conference!”
• Rating < 5: “Sorry you did not like the conference.”
• “No” selected, a rating of 1, and all checkboxes selected: “You should consider a career change.”
page 17
18. Complete Coverage Is Not Always Complete
The outputs are related to the inputs but are not inputs
themselves, so we treat them as “metadata”
pict_outputs.xls
name    likeSelenium  slowTests  changingUI  usesLynx   complainsALot  rating       metadata - result
blank   no            checked    checked     checked    checked        favorable    try again
blank   no            checked    checked     unchecked  checked        unfavorable  try again
blank   no            unchecked  checked     unchecked  checked        non-integer  try again
blank   yes           unchecked  unchecked   unchecked  unchecked      favorable    try again
blank   yes           unchecked  unchecked   unchecked  unchecked      invalid      try again
filled  no            checked    checked     checked    checked        invalid      try again
filled  no            checked    unchecked   checked    unchecked      non-integer  try again
filled  no            unchecked  checked     unchecked  unchecked      favorable    glad
filled  no            unchecked  unchecked   checked    checked        favorable    glad
filled  no            unchecked  unchecked   checked    checked        unfavorable  sorry
filled  yes           unchecked  unchecked   unchecked  unchecked      non-integer  try again
filled  yes           unchecked  unchecked   unchecked  unchecked      unfavorable  sorry
Even if the unfavorable rating is made a 1, we are still not testing the
“career change” result screen
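The metadata column can be derived from the inputs in code, which makes gaps like this one easy to spot. The following is a minimal sketch, not APT's implementation: the class `SurveyMetadata` is hypothetical, it treats a blank name as invalid input (as the table does), and it assumes every "unfavorable" rating is entered as 1.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Derives the expected result screen ("metadata") from a row of inputs. */
final class SurveyMetadata {
    // Rows copied from the unseeded pict_outputs.xls table, in column order:
    // name likeSelenium slowTests changingUI usesLynx complainsALot rating
    static final String[] UNSEEDED_ROWS = {
        "blank no checked checked checked checked favorable",
        "blank no checked checked unchecked checked unfavorable",
        "blank no unchecked checked unchecked checked non-integer",
        "blank yes unchecked unchecked unchecked unchecked favorable",
        "blank yes unchecked unchecked unchecked unchecked invalid",
        "filled no checked checked checked checked invalid",
        "filled no checked unchecked checked unchecked non-integer",
        "filled no unchecked checked unchecked unchecked favorable",
        "filled no unchecked unchecked checked checked favorable",
        "filled no unchecked unchecked checked checked unfavorable",
        "filled yes unchecked unchecked unchecked unchecked non-integer",
        "filled yes unchecked unchecked unchecked unchecked unfavorable"
    };

    static String expectedResult(String[] row) {
        String rating = row[6];
        if (row[0].equals("blank") || rating.equals("invalid") || rating.equals("non-integer")) {
            return "try again";
        }
        boolean allReasonsChecked = true;
        for (int i = 2; i <= 5; i++) {
            allReasonsChecked &= row[i].equals("checked");
        }
        // Assumption: "unfavorable" is always entered as a rating of 1.
        if (row[1].equals("no") && allReasonsChecked && rating.equals("unfavorable")) {
            return "career change";
        }
        return rating.equals("favorable") ? "glad" : "sorry";
    }

    /** The distinct result screens reached by a set of space-separated rows. */
    static Set<String> resultsReached(String[] rows) {
        Set<String> results = new LinkedHashSet<>();
        for (String row : rows) results.add(expectedResult(row.split(" ")));
        return results;
    }

    public static void main(String[] args) {
        System.out.println(resultsReached(UNSEEDED_ROWS));
    }
}
```

Run against the twelve rows above, the set of results contains "try again", "glad", and "sorry" but never "career change", which is exactly the gap described on this slide.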
page 18
19. Seeding Helps Achieve Desired Coverage
Using seeding we can intelligently adjust our set of inputs
We know we want at least one test that gets the “career
change” result. One way to do this is through seeding:
pict_seeds.txt
name    likeSelenium  slowTests  changingUI  usesLynx   complainsALot  rating
filled  no            checked    checked     checked    checked        unfavorable
C:\Program Files (x86)\PICT>pict.exe pict_inputs.txt /e:pict_seeds.txt > pict_outputs.xls
Now we are forcing PICT to include the set of combinations that will
generate the career change result:
Attendee Survey: “You should consider a career change.”
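Seed files can also be produced by a script rather than by hand. The sketch below assumes the seed file uses the same tab-separated layout as PICT's output (a header row of parameter names followed by one row per seed); the class `PictSeedFile` and its methods are hypothetical helpers, not part of PICT.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

/** Renders seed rows as tab-separated text for use with pict.exe /e:. */
final class PictSeedFile {
    static String render(List<String> parameterNames, List<List<String>> seedRows) {
        StringBuilder text = new StringBuilder();
        text.append(String.join("\t", parameterNames)).append('\n');
        for (List<String> row : seedRows) {
            // Every seed row needs exactly one value per parameter.
            if (row.size() != parameterNames.size()) {
                throw new IllegalArgumentException("seed row has " + row.size()
                        + " values, expected " + parameterNames.size());
            }
            text.append(String.join("\t", row)).append('\n');
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        List<String> names = Arrays.asList("name", "likeSelenium", "slowTests",
                "changingUI", "usesLynx", "complainsALot", "rating");
        List<List<String>> seeds = Arrays.asList(
                Arrays.asList("filled", "no", "checked", "checked",
                        "checked", "checked", "unfavorable"));
        Files.write(Paths.get("pict_seeds.txt"),
                render(names, seeds).getBytes(StandardCharsets.UTF_8));
    }
}
```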
page 19
20. Seeding Helps Achieve Desired Coverage
With our seed in place, PICT generates a new set of combinations
pict_outputs.xls
name    likeSelenium  slowTests  changingUI  usesLynx   complainsALot  rating       metadata - result
blank   no            checked    checked     unchecked  checked        unfavorable  try again
blank   no            checked    checked     unchecked  unchecked      favorable    try again
blank   no            checked    unchecked   checked    checked        favorable    try again
blank   no            unchecked  checked     unchecked  checked        non-integer  try again
blank   no            unchecked  unchecked   checked    unchecked      unfavorable  try again
blank   yes           unchecked  unchecked   unchecked  unchecked      invalid      try again
blank   yes           unchecked  unchecked   unchecked  unchecked      non-integer  try again
filled  no            checked    checked     checked    checked        invalid      try again
filled  no            checked    checked     checked    checked        unfavorable  career change
filled  no            checked    unchecked   checked    unchecked      non-integer  try again
filled  yes           unchecked  unchecked   unchecked  unchecked      favorable    glad
filled  yes           unchecked  unchecked   unchecked  unchecked      unfavorable  sorry
Excellent! We can test everything!
Too bad the real world isn’t this simple…
page 20
21. Real World Products Are Incredibly Complex
In the real world, test cases are extremely complicated!
• Store location with custom drawn trade area
• Map modification
• Competition layering
• Boundary highlighting
• Enabling heat map
• Map Detail
• Shading attribute selection
• Map controls
• Product navigation
page 21
22. Practical Test Case Selection
The approach we’ve found that works best revolves around both
coverage of inputs and metadata
• When there is not much metadata that needs to be considered, the first
pass using combinatorial tools works great at test case generation
• When there is lots of metadata, we generally intentionally start with too
many tests and then trim down using our coverage tools
• Using the code to programmatically populate metadata values during
test case execution helps keep inputs and metadata in sync as we
adjust our test cases
• Seeding is a great way to force in the test cases to ensure the coverage
we want
• Evaluation of testing coverage is an iterative process during test case
development
page 22
23. Using The Test Cases
Writing many test cases, where only the inputs and outputs
change, results in a lot of code duplication
@Test
public void test_likesSelenium_likesConference() throws Exception {
    surveyPage.enterName("Aaron");
    surveyPage.selectLikesSelenium(true);
    surveyPage.enterRating("5");
    surveyPage.submit();
    resultPage.synchronize();
    Assert.assertEquals("Glad you liked the conference!", resultPage.getResponseMessage());
}

@Test
public void test_dislikes_Selenium_hatesConference() throws Exception {
    surveyPage.enterName("Aaron");
    surveyPage.selectLikesSelenium(false);
    surveyPage.checkDislikeReason(1);
    surveyPage.checkDislikeReason(3);
    surveyPage.enterRating("1");
    surveyPage.submit();
    resultPage.synchronize();
    Assert.assertEquals("Sorry you did not like the conference.", resultPage.getResponseMessage());
}
DRY!
Note: DRY == Do not Repeat Yourself
page 23
24. Database Backed Frameworks Avoid Duplication
At APT, we store inputs and outputs in the database, and access
them by simple helper methods
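@Test
public void test_conferenceSurvey(int testCaseId) throws Exception {
    // Select which database-stored test case the helper methods read from
    setTestCaseId(testCaseId);
    surveyPage.enterName(getInputValue("name"));
    surveyPage.selectLikesSelenium(getInputValue("likesSelenium").equals("yes"));
    // Variable-length input: assumed to be comma-separated checkbox numbers, empty when none apply
    String reasonList = getInputValue("reasonList");
    if (!reasonList.isEmpty()) {
        for (String s : reasonList.split(",")) {
            surveyPage.checkDislikeReason(Integer.parseInt(s.trim()));
        }
    }
    // Assumes the rating is stored under the input name "rating"
    surveyPage.enterRating(getInputValue("rating"));
    surveyPage.submit();
    resultPage.synchronize();
    // Compares against the expected output stored for this test case
    compareAgainstExpectedOutput("Result Page Message", resultPage.getResponseMessage());
}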
Like me, this function can handle everything!
• Variable length inputs stored as lists
• Comparisons usually non-fatal
• Caching used to avoid continuously querying the database
• Test cases have names and comments associated with them
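The helper methods themselves are not shown on the slides. As a rough sketch of how caching and non-fatal comparisons could fit together, the class below stands in for them; `TestCaseData`, its constructor, and the two loader functions (which take the place of real database queries) are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;

/** Caches per-test-case inputs and expected outputs, and records mismatches. */
final class TestCaseData {
    // Stand-ins for database queries: test case id -> (name -> value).
    private final Function<Integer, Map<String, String>> loadInputs;
    private final Function<Integer, Map<String, String>> loadExpected;
    private final Map<Integer, Map<String, String>> inputCache = new HashMap<>();
    private final Map<Integer, Map<String, String>> expectedCache = new HashMap<>();
    private final List<String> mismatches = new ArrayList<>();
    private int testCaseId;

    TestCaseData(Function<Integer, Map<String, String>> loadInputs,
                 Function<Integer, Map<String, String>> loadExpected) {
        this.loadInputs = loadInputs;
        this.loadExpected = loadExpected;
    }

    void setTestCaseId(int testCaseId) {
        this.testCaseId = testCaseId;
    }

    /** Loads the inputs once per test case, then serves them from the cache. */
    String getInputValue(String inputName) {
        return inputCache.computeIfAbsent(testCaseId, loadInputs).get(inputName);
    }

    /** Non-fatal comparison: a mismatch is recorded instead of thrown. */
    boolean compareAgainstExpectedOutput(String outputName, String actual) {
        String expected = expectedCache.computeIfAbsent(testCaseId, loadExpected).get(outputName);
        boolean matches = Objects.equals(expected, actual);
        if (!matches) {
            mismatches.add("test case " + testCaseId + ", " + outputName
                    + ": expected <" + expected + "> but was <" + actual + ">");
        }
        return matches;
    }

    /** Everything that failed so far, for reporting at the end of the run. */
    List<String> getMismatches() {
        return new ArrayList<>(mismatches);
    }
}
```

Recording mismatches instead of throwing lets one test case check many outputs in a single run and report all of the failures together.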
page 24
25. The Database Also Improves Testing Overall
Storing inputs and outputs in the database opens up many
possibilities for test case creation, evaluation and debugging
• Creating new test cases is easier
    • Database-backed web-based test case editor often easier to use than
      having inputs in code, Excel, wiki, etc.
• Developing tools to evaluate coverage is easier
    • No need to parse files, just query the database
    • Easy to specify impossible pairs
    • Easy to specify values not tested
• Debugging is easier for engineers
    • History of inputs and outputs (linked to screenshots) can be tracked
    • Easy to store reasons for change of input or expected output
page 25
26. Creating Test Cases Is Transparent and Easy
Following import from pairwise selection tools, database driven
test cases can now be easily edited and configured by anyone
Test cases easily evaluated and adjusted by:
• Testers
• Engineers
• Product Management
Note: Yes, we know our internal tools look like they are from 1990, but they work great!
page 26
27. Evaluating Coverage is Transparent and Easy
With complex tests, coverage reports make identifying both
extraneous and missing test cases easy
For this one test:
• 38 test cases currently cover 3132 of 7917 pairs
• 9 test cases are not adding any additional coverage
• It is still easy to identify gaps in testing: look for the red!
These test cases need some tuning!
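The numbers in a report like this can be computed from the stored inputs alone. The sketch below is one possible way to do it, not APT's tool: `PairCoverageReport` is a hypothetical class, test cases are plain arrays of input values, and a test case counts as "adding no coverage" when every pair it exercises already appeared in an earlier test case.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Counts covered input pairs and flags test cases that add none. */
final class PairCoverageReport {
    /** All value pairs in one test case, encoded as "i=a|j=b" with i < j. */
    static Set<String> pairsOf(String[] row) {
        Set<String> pairs = new HashSet<>();
        for (int i = 0; i < row.length; i++) {
            for (int j = i + 1; j < row.length; j++) {
                pairs.add(i + "=" + row[i] + "|" + j + "=" + row[j]);
            }
        }
        return pairs;
    }

    /** Number of distinct pairs exercised by the whole suite. */
    static int coveredPairCount(List<String[]> suite) {
        Set<String> covered = new HashSet<>();
        for (String[] row : suite) covered.addAll(pairsOf(row));
        return covered.size();
    }

    /** Indexes of test cases that add no pair beyond those of earlier test cases. */
    static List<Integer> addsNoCoverage(List<String[]> suite) {
        Set<String> seen = new HashSet<>();
        List<Integer> redundant = new ArrayList<>();
        for (int i = 0; i < suite.size(); i++) {
            // addAll returns false when the row contributed nothing new.
            if (!seen.addAll(pairsOf(suite.get(i)))) redundant.add(i);
        }
        return redundant;
    }
}
```

Because each test case is judged only against the ones before it, removing every flagged test case leaves the covered-pair count unchanged. Comparing that count with the number of feasible pairs gives the "3132 of 7917" style figure.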
page 27
28. Debugging Results Is Transparent and Easy
Simple reporting pages can make debugging test failures and
updating test cases much easier
This test case checked 657 outputs; 255 of them failed
• Ability to accept new actual outputs as the expected outputs going forward
• Ability to see history of output changes
• Identifiers easily recognized by engineers
• Actual value with link to screenshot of the failure
• Expected value with link to screenshot of when it was set
• Date output was set and link to information about the test where it was set from
page 28
29. Debugging Results Is Transparent and Easy
Screenshots can be stored in the database and associated with
specific test case executions
It is easy to view all screenshots for this run of the test case, the last
time this test case passed, etc.
page 29
30. Easy To Integrate With Continuous Integration
With lots of information in the database, it is also easy to
aggregate up to the level desired for continuous integration
At APT we aggregate the results and then format them into JUnit-style XML
files to allow easy integration with Jenkins
However each test case still provides a link to drill into the reports shown
in the previous slides that contain screenshots, inputs/outputs, logs, etc.
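As an illustration of the aggregation step, the sketch below turns a list of per-test-case results into a minimal JUnit-style `<testsuite>` document and puts the link to the detailed report in the failure message. The class names, the `reportUrl` field, and the choice of attributes are assumptions for this example; APT's actual format is not shown on the slides.

```java
import java.util.List;

/** Renders aggregated results as a minimal JUnit-style XML report. */
final class JUnitXmlSketch {
    /** One aggregated test case result; reportUrl is a hypothetical drill-down link. */
    static final class Result {
        final String name;
        final boolean passed;
        final String reportUrl;

        Result(String name, boolean passed, String reportUrl) {
            this.name = name;
            this.passed = passed;
            this.reportUrl = reportUrl;
        }
    }

    /** Escapes the characters that are unsafe inside an XML attribute value. */
    static String escape(String text) {
        return text.replace("&", "&amp;").replace("<", "&lt;")
                   .replace(">", "&gt;").replace("\"", "&quot;");
    }

    static String toXml(String suiteName, List<Result> results) {
        int failures = 0;
        for (Result r : results) {
            if (!r.passed) failures++;
        }
        StringBuilder xml = new StringBuilder();
        xml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        xml.append("<testsuite name=\"").append(escape(suiteName))
           .append("\" tests=\"").append(results.size())
           .append("\" failures=\"").append(failures).append("\">\n");
        for (Result r : results) {
            xml.append("  <testcase classname=\"").append(escape(suiteName))
               .append("\" name=\"").append(escape(r.name)).append("\"");
            if (r.passed) {
                xml.append("/>\n");
            } else {
                // The failure message carries the link back to the detailed report.
                xml.append(">\n    <failure message=\"See ")
                   .append(escape(r.reportUrl)).append("\"/>\n  </testcase>\n");
            }
        }
        xml.append("</testsuite>\n");
        return xml.toString();
    }
}
```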
Note: Jenkins was formerly known as Hudson
page 30
31. The Big Picture
Combinatorial testing is just one part of a well-balanced
testing diet
• Unit Tests
• Functional Tests
• Selenium Tests
• Manual Tests
Code coverage tools let us know which lines of code we are and are not testing
Combinatorial coverage tools let us know which input combinations we are and are
not testing
Exploratory testing alerts us to new important inputs and interactions to
consider for our automated test cases
page 31
32. The Big Picture
Database driven combinatorial testing lets you get more for less!
• End-to-end Selenium tests can be complicated
and slow
• Combinatorial testing is a great way to reduce
the number of these slow tests while maximizing
testing coverage
• Test metadata must be considered along with
inputs when evaluating coverage
• For complex tests, tools for test case creation,
evaluating coverage, and debugging failures are
essential
• Storing inputs, outputs, and metadata in a
database makes developing test case
management and coverage reporting tools easy
page 32
33. Questions
Whew that was
a lot! Do you
think the
audience has
any questions?
page 33
34. Questions
Only one way to
find out! Look for
hands!
page 34