1. Search-Based Software Testing in Industry
---
Research collaborations and Lessons Learned
Lionel Briand
Interdisciplinary Centre for ICT Security, Reliability, and Trust (SnT)
University of Luxembourg, Luxembourg
SBST, Hyderabad, 2014
2. SnT Software Verification and Validation Lab
• SnT centre, est. 2009: interdisciplinary, ICT security-reliability-trust
• 200 scientists and Ph.D. candidates, 20 industry partners
• SVV Lab: established January 2012, www.svv.lu
• 25 scientists (research scientists, associates, and Ph.D. candidates)
• Industry-relevant research on system dependability: security, safety, reliability
• Six partners: Cetrel, CTIE, Delphi, SES, IEE, Hitec …
• And we are always hiring!
3. An Effective, Collaborative Model of Research
and Innovation
[Diagram: a cycle connecting Basic Research, Applied Research, and Innovation & Development]
• Basic and applied research take place in a rich context
• Basic research is also driven by problems raised by applied research, which is itself fed by innovation and development
• Publishable research results and focused practical solutions that serve an existing market
Schneiderman, 2013
4. Collaboration in Practice
• Well-defined problems in context
• Realistic evaluation
• Long term industrial collaborations
[Diagram: a collaboration cycle between Industry Partners and Research Groups with eight steps: Problem Identification, Problem Formulation, State of the Art Review, Candidate Solution(s), Initial Validation, Training, Realistic Validation, and Solution Release]
5. Outline
• Four projects:
– Testing PID controllers in the automotive industry (Delphi)
– Robustness testing of a video conference system (Cisco)
– Environment-based testing of a seismic acquisition system
(WesternGeco)
– Schedulability analysis and stress testing of safety-critical
drivers in the oil&gas industry (Kongsberg)
• Lessons learned, patterns, discussions
• Meant to be an interactive talk – I am also here to learn
6. Acknowledgements
Ph.D. students:
• Marwa Shousha
• Shaukat Ali
• Zohaib Iqbal
• Hadi Hemmati
• Reza Matinnejad
• Stefano Di Alesio
Research Associates/Scientists, former colleagues:
• Shiva Nejati
• Andrea Arcuri
• Arnaud Gotlieb
• Yvan Labiche
7. Testing PID Controllers (Delphi)
References:
• R. Matinnejad, S. Nejati, L. Briand, T. Bruckmann, “MiL Testing of Highly Configurable Continuous Controllers: Scalable Search Using Surrogate Models”, Submitted (2014)
• R. Matinnejad, S. Nejati, L. Briand, T. Bruckmann, C. Poull, “Search-Based Automated Testing of Continuous Controllers: Framework, Tool Support, and Case Studies”, forthcoming in Information and Software Technology (2014)
• R. Matinnejad, S. Nejati, L. Briand, T. Bruckmann, C. Poull, “Automated Model-in-the-Loop Testing of Continuous Controllers using Search”, in 5th Symposium on Search-Based Software Engineering (SSBSE 2013), Springer Lecture Notes in Computer Science (August 2013)
10. Controllers at MIL
[Diagram: closed-loop control at MiL. The plant model output actual(t) is compared with desired(t); the error e(t) = desired(t) − actual(t) drives the P, I, and D terms, which are summed into the controller output:]
output(t) = KP·e(t) + KI·∫e(t)dt + KD·de(t)/dt
Inputs: time-dependent variables and configuration parameters
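The control law above can be sketched as a discrete-time loop. The snippet below is a minimal illustration: the first-order plant equation, the gains, and the time constants are made-up stand-ins, not the Delphi Simulink models.

```python
def simulate_pid(desired, kp, ki, kd, steps=2000, dt=0.01, tau=0.5):
    """Discrete-time PID loop: u = kp*e + ki*integral(e) + kd*de/dt.
    The plant is an assumed first-order lag with time constant tau."""
    actual, integral, prev_error = 0.0, 0.0, desired
    trace = []
    for _ in range(steps):
        error = desired - actual                 # e(t) = desired(t) - actual(t)
        integral += error * dt                   # I term: running integral of e
        derivative = (error - prev_error) / dt   # D term: finite difference
        u = kp * error + ki * integral + kd * derivative
        actual += (u - actual) * dt / tau        # plant: tau * a'(t) = u - a
        prev_error = error
        trace.append(actual)
    return trace

# step response: actual(t) should settle near the desired value
trace = simulate_pid(desired=1.0, kp=2.0, ki=1.0, kd=0.1)
```

Objective functions such as smoothness or responsiveness would then be computed over `trace`, which is what makes each fitness evaluation a (costly) simulation.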
11. Inputs, Outputs, Test Objectives
[Figure: a step input where the Desired Value (input) moves from InitialDesired (ID) to FinalDesired (FD) at time T/2, and the Actual Value (output) follows it over [0, T]. Test objectives: Smoothness, Responsiveness, Stability]
12. Process and Technology
[Diagram: the Continuous Controller Tester. Step 1, Exploration, takes the controller-plant model and objective functions based on requirements and produces a HeatMap diagram; a domain expert derives a list of critical regions from it. Step 2, Single-State Search, finds worst-case scenarios in those regions. Example heatmaps: (a) liveness, (b) smoothness]
13. Testing in the Configuration Space
• MIL testing for all feasible configurations
• The search space is much larger
• The search is much slower (Simulations of Simulink models are
expensive)
• Not all configuration parameters matter for all objective functions
• Results are harder to visualize
14. Modified Process and Technology
[Diagram: the modified process. Step 1, Exploration with Dimensionality Reduction, takes the controller model (Simulink) and the objective functions; the 8-dimension space is visualized using regression trees, dimensionality reduction identifies the significant variables, and a domain expert derives a list of critical partitions from the regression tree. Step 2, Search with Surrogate Modeling, uses surrogate modeling to predict the objective function and speed up the search, and outputs worst-case scenarios]
15. Dimensionality Reduction
• Sensitivity analysis: Elementary Effect Analysis (EEA)
• Identifies non-influential inputs in computationally costly mathematical models
• Requires fewer data points than other techniques
• Observations are simulations generated during the Exploration step
• Compute the sample mean and standard deviation for each dimension of the distribution of elementary effects
[Scatter plot: sample standard deviation vs. sample mean (both ×10⁻²) of the elementary effects for the inputs ID and FD and the configuration parameters Cal1–Cal6]
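The core of the technique can be sketched as follows. This is a simplified radial one-at-a-time design rather than the full Morris trajectory design, and the model `f` is a made-up stand-in for the simulations: the elementary effect of input i is the normalized output change when only input i is perturbed, and influential inputs show a large mean or standard deviation.

```python
import random

def elementary_effects(f, dim, r=50, delta=0.1):
    """Sample mean and std dev of the elementary effect of each input,
    estimated over r random base points in the unit hypercube."""
    stats = []
    for i in range(dim):
        effects = []
        for _ in range(r):
            x = [random.uniform(0.0, 1.0 - delta) for _ in range(dim)]
            x_step = list(x)
            x_step[i] += delta                    # perturb only input i
            effects.append((f(x_step) - f(x)) / delta)
        mu = sum(effects) / r
        var = sum((e - mu) ** 2 for e in effects) / (r - 1)
        stats.append((mu, var ** 0.5))
    return stats

# toy model: input 0 is influential, input 1 is nearly inert
random.seed(1)
stats = elementary_effects(lambda x: 3.0 * x[0] + 0.01 * x[1], dim=2)
```

In the study, each evaluation of `f` is a Simulink simulation already produced during the Exploration step, so the screening comes almost for free.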
16. Visualization in Inputs & Configuration Space
[Regression tree over the inputs and configuration space: the root holds all 1000 observations (mean 0.007822, std dev 0.0049497) and is split on FD at 0.43306, then on ID at 0.64679 and on Cal5 at 0.020847 and 0.014827; each node reports its count, mean, and standard deviation, so the partitions with the highest objective-function values (e.g., a 182-point node with mean 0.0134555) stand out]
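The per-node count/mean/std bookkeeping in such a tree can be sketched with a greedy CART-style split. This is a minimal stand-in for the regression-tree tool actually used, and the data points are made up.

```python
import statistics

def node_stats(ys):
    """The count / mean / std dev reported at each tree node."""
    return {"count": len(ys),
            "mean": statistics.mean(ys),
            "std": statistics.stdev(ys) if len(ys) > 1 else 0.0}

def best_split(points, ys, dim):
    """Greedy CART-style split: the (variable, threshold) pair that
    minimizes the summed squared error of the two child nodes."""
    best = None
    for d in range(dim):
        for t in sorted(set(p[d] for p in points)):
            left = [y for p, y in zip(points, ys) if p[d] < t]
            right = [y for p, y in zip(points, ys) if p[d] >= t]
            if not left or not right:
                continue
            sse = (sum((y - statistics.mean(left)) ** 2 for y in left)
                   + sum((y - statistics.mean(right)) ** 2 for y in right))
            if best is None or sse < best[0]:
                best = (sse, d, t)
    return best

# made-up observations: the objective mostly depends on variable 0 ("FD")
points = [(0.1, 0.5), (0.2, 0.9), (0.6, 0.4), (0.7, 0.8)]
ys = [0.004, 0.005, 0.011, 0.012]
root = node_stats(ys)
_, d, t = best_split(points, ys, dim=2)  # splits on variable 0 at 0.6
```

Recursing on each child and reading off the leaves with the highest means is exactly how the critical partitions are presented to the domain expert.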
17. Surrogate Modeling
• Any supervised learning or statistical technique providing fitness predictions with confidence intervals
1. Predict higher fitness with high confidence: move to the new position, no simulation
2. Predict lower fitness with high confidence: do not move to the new position, no simulation
3. Low confidence in the prediction: simulation
[Plot: fitness vs. x, comparing the surrogate model’s predictions with the real function]
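The three-way rule above can be sketched directly, assuming the surrogate returns a prediction with a symmetric confidence half-width (the surrogates in the study were polynomial regressions, but any model with confidence intervals fits this interface):

```python
def surrogate_decision(predicted, half_width, current_fitness):
    """Decide whether a candidate move needs a real (expensive) simulation.
    [predicted - half_width, predicted + half_width] is the surrogate's
    confidence interval for the candidate's fitness."""
    if predicted - half_width > current_fitness:
        return "move"       # 1. confidently better: accept, no simulation
    if predicted + half_width < current_fitness:
        return "stay"       # 2. confidently worse: reject, no simulation
    return "simulate"       # 3. low confidence: run the real simulation

decisions = [surrogate_decision(0.9, 0.05, 0.5),
             surrogate_decision(0.2, 0.05, 0.5),
             surrogate_decision(0.5, 0.2, 0.45)]
```

Only the third case pays for a simulation, which is where the reported speed-up comes from.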
18. Results
• Search yielded worst-case scenarios that were much worse than
known and expected scenarios
• Surrogate modeling: Polynomial regression yielded best fit and
predictive power so far
• Dimensionality reduction helps generate better surrogate models
• Surrogate modeling can yield up to an eight-fold increase in search
speed
• Surrogate modeling can help find more critical requirements violations
• By accounting for variations in configurations, we found more critical
requirements violations than just with the HIL configuration
19. Robustness Testing of a Video Conference System
(Cisco)
References:
• S. Ali, L. Briand, H. Hemmati, “Modeling Robustness Behavior Using Aspect-Oriented Modeling to Support Robustness Testing of Industrial Systems”, Journal of Software and Systems Modeling (Springer), 2011
• S. Ali, M. Z. Iqbal, A. Arcuri, L. Briand, “Generating Test Data from OCL Constraints
with Search Techniques”, IEEE Transactions on Software Engineering, 2013
22. Robustness
• Robustness is the degree to which a software
component functions correctly in the presence of
exceptional inputs or stressful environmental
conditions (IEEE Std 610.12-1990)
• Significant additional complexity lies in handling the robustness properties
– Network communication faults
– Media quality faults in media streams
– Faults in the endpoints
24. Model-Based Testing (MBT)
• Goals: Scalability, complete automation
• Model-based Testing (MBT) uses models of the system for test case
and oracle generation
– The models typically describe some aspects of system under test
– Increasingly used for complete test automation, e.g., aerospace,
automotive, banking
• Often using well-established standards for modeling and their
extensions: UML (profiles), OCL, etc.
• Requirements:
– Test-ready models
– Appropriate test strategies, e.g., path selection
– Test data generation
– Oracles
26. Test Data Generation for MBT
• Test data is needed to execute program paths as required by a
coverage criterion during testing
• For MBT, test data is typically an instance of a class diagram
• Instances must fulfill invariants
• Paths in state machines carry constraints (guards) on conditions
• To generate test data for UML/OCL models, we need to solve
OCL constraints written on the models
context Student inv ageConstraint:
self.age > 15 and self.age < 80
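For the `ageConstraint` above, the search view can be sketched as follows: a branch distance of 0 means the constraint holds, and a simple hill climber follows the distance downhill. This is a minimal illustration using the common k = 1 offset for unsatisfied relational clauses, not the exact fitness rules of the tool.

```python
def distance(age):
    """Branch distance for: self.age > 15 and self.age < 80 (0 = satisfied)."""
    k = 1  # usual offset so an unsatisfied clause never has distance 0
    d_gt = 0 if age > 15 else (15 - age) + k
    d_lt = 0 if age < 80 else (age - 80) + k
    return d_gt + d_lt     # 'and' adds the clause distances

def hill_climb(age, max_steps=1000):
    """Follow the distance downhill until the constraint is satisfied."""
    for _ in range(max_steps):
        if distance(age) == 0:
            return age
        age = min(age - 1, age + 1, key=distance)  # best neighbour
    return age

solution = hill_climb(200)   # starts far outside the valid range
```

The same distance-as-fitness idea scales to the much richer OCL expressions shown on the next slides.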
27. Example OCL expression in VC Model
context Saturn inv synchronizationConstraint:
  self.systemUnit.NumberOfActiveCalls > 1 and
  self.systemUnit.NumberOfActiveCalls <=
    self.systemUnit.MaximumNumberOfActiveCalls and
  self.media.synchronizationMismatch.unit = TimeUnitKind::s and
  (
    self.media.synchronizationMismatch.value >= 0 and
    self.media.synchronizationMismatch.value <=
      self.media.synchronizationMismatchThreshold.value
  ) and
  self.conference.PresentationMode = Mode::Off and
  self.conference.call→select(call |
    call.incomingPresentationChannel.Protocol <> VideoProtocol::Off)→size() = 2 and
  self.conference.call→select(call |
    call.outgoingPresentationChannel.Protocol <> VideoProtocol::Off)→size() = 2
28. OCL Constraint Solvers
• A number of approaches exist for OCL constraint solving
• Not complete
– Support subset of OCL
• Lack of proper tool support
– A number of approaches are not automated
• Not scalable
– Often based on translation (e.g., to CSP)
– Combinatorial explosion
29. A Search Problem
• We used an alternative approach, applying search-based testing (SBT) concepts to solve OCL constraints
• The process of generating test data can be seen as a search process
– There is a huge number of possible instances that can be
generated for a particular model
– We need to select instances that solve the constraint
• Fitness defined as a distance function d()
– d() returns 0 if the constraint is solved
– otherwise a value that heuristically estimates how far the constraint
was from being evaluated as true
30. Challenges
• Primitive types, Boolean operators
• Operations on collections, iterators
• Fine-grained fitness functions for iterators using size, oclInState
• Consider a collection C = {1, 2, 3} and a constraint C→forAll(x | x = 0):
  d(C→forAll(x | x = 0)) = Σᵢ d(C.at(i) = 0) / C→size()
                         = (d(1 = 0) + d(2 = 0) + d(3 = 0)) / 3
                         = (2 + 3 + 4) / 3
                         = 3
• Many complex rules for the computation of fitness functions based on OCL expressions
• Fine-grained heuristics → maximum guidance
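The forAll computation above, sketched directly in code. The element distance d(a = b) = |a − b| + k with k = 1 matches the slide’s numbers (d(1 = 0) = 2, and so on).

```python
def eq_distance(a, b, k=1):
    """d(a = b): 0 when equal, otherwise |a - b| + k."""
    return 0 if a == b else abs(a - b) + k

def forall_distance(collection, target):
    """d(C->forAll(x | x = target)): average of the element distances,
    so the fitness rewards making *each* element closer to satisfying it."""
    return sum(eq_distance(x, target) for x in collection) / len(collection)

d = forall_distance([1, 2, 3], 0)   # (2 + 3 + 4) / 3 = 3.0
```

Averaging rather than taking a single 0/1 verdict is what gives the search the fine-grained guidance mentioned above.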
31. VC Model and Results
• UML Class diagram, state machines, OCL
• 20 subsystems, on average 5 states and 11
transitions (largest: 22 states – 63 transitions)
• OCL: 144 constraints as guards, 100 invariants, and
57 change events
• Results:
– All constraints were resolved
– Maximum time: ~2 minutes on a laptop
32. Environment-Based Testing of a Seismic Acquisition
System (WesternGeco)
References:
• Z. Iqbal, A. Arcuri, L. Briand, “Empirical Investigation of Search Algorithms for Environment
Model-Based Testing of Real-Time Embedded Software”, ACM ISSTA, 2012
• Z. Iqbal, A. Arcuri, L. Briand, “Environment Modeling and Simulation for Automated Testing
of Soft Real-Time Embedded Software”, Software and System Modeling (Springer), 2014
33. Objectives
• Model-based System testing
– Black-box
– Environment models
[Diagram: environment models are used to derive the environment simulator, the test cases, and the test oracle]
36. Test Case Generation
• Test objectives: Reach “error” states (critical environment states)
• Test Case: (1) Environment and (2) Simulation Configuration
– (1) Number of instances for each component in domain model,
e.g., number of items on conveying belt
– (2) Setting non-deterministic properties of the environment, e.g.,
speed of sorter’s left and right arms
• Oracle: Reaching an “error” state
• SBST: Heuristics
– Distance from error state
– Distance from satisfying OCL guards
– Time distance
– Time in “risky” states
– …
37. Schedulability Analysis and Stress Testing of Safety-
Critical Drivers (Kongsberg Maritime)
References:
• L. Briand, Y. Labiche, and M. Shousha, “Using genetic algorithms for early schedulability analysis and stress testing in real-time systems”, Genetic Programming and Evolvable Machines, vol. 7, no. 2, pp. 145-170, 2006
• S. Nejati, S. Di Alesio, M. Sabetzadeh, and L. Briand, “Modeling and analysis of CPU usage in safety-critical embedded systems to support stress testing”, in Model Driven Engineering Languages and Systems, Springer, 2012, pp. 759-775
• S. Di Alesio, S. Nejati, L. Briand, A. Gotlieb, “Stress Testing of Task Deadlines: A Constraint Programming Approach”, ISSRE 2013, San Jose, USA
• S. Di Alesio, S. Nejati, L. Briand, A. Gotlieb, “Worst-Case Scheduling of Software Tasks – A Constraint Optimization Model to Support Performance Testing”, Constraint Programming (CP), 2014
38. Fire/Gas Detection and Emergency Shutdown
[Diagram: drivers (the software-hardware interface) connect the control modules to the alarm devices (hardware), running on a real-time operating system over a multicore architecture. The system monitors gas leaks and fire on oil extraction platforms]
39. Performance Requirements are Hard to Verify
• They constrain the entire system’s behavior and thus can’t be checked locally
• They depend on the environment the software interacts with (hardware devices)
• They depend on the computing platform on which the software runs
40. Schedulability Analysis and Testing
• RTES have concurrent interdependent tasks which have to
finish before their deadlines
• Each task has a deadline (i.e., latest finishing time) w.r.t. its
arrival time
• Some task properties depend on the environment, some are
design choices
• Tasks can trigger other tasks, and can share computational
resources with other tasks
• Schedulability analysis encompasses techniques that try to
predict whether all (critical) tasks are schedulable, i.e., meet
their deadlines
• Stress testing runs carefully selected test cases that have a high
probability of leading to deadline misses
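A minimal illustration of why stress testing targets arrival times: the toy scheduler below (discrete time, single core, preemptive, fixed priorities; all task values made up) shows the same task set meeting or missing a deadline depending only on when the lower-priority task arrives.

```python
def max_lateness(tasks):
    """Simulate a preemptive fixed-priority schedule on one core.
    Each task: arrival, duration, deadline (relative to arrival), priority
    (lower number = higher priority). Returns max(finish - absolute
    deadline): a positive value means some task missed its deadline."""
    remaining = [t["duration"] for t in tasks]
    finish = [0] * len(tasks)
    time = 0
    while any(r > 0 for r in remaining):
        ready = [i for i, t in enumerate(tasks)
                 if t["arrival"] <= time and remaining[i] > 0]
        if ready:
            i = min(ready, key=lambda j: tasks[j]["priority"])
            remaining[i] -= 1                 # run the chosen task one tick
            if remaining[i] == 0:
                finish[i] = time + 1
        time += 1
    return max(finish[i] - (t["arrival"] + t["deadline"])
               for i, t in enumerate(tasks))

high = {"arrival": 0, "duration": 2, "deadline": 3, "priority": 0}
low_early = {"arrival": 1, "duration": 2, "deadline": 2, "priority": 1}
low_late = dict(low_early, arrival=2)

miss = max_lateness([high, low_early])  # low task delayed by the high one
ok = max_lateness([high, low_late])     # shifted arrival: no deadline miss
```

A stress-testing search explores exactly this space of arrival times, looking for assignments that maximize the lateness.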
41. Arrival Times Determine Deadline Misses
j0, j1, j2 arrive at at0, at1, at2 and must finish before dl0, dl1, dl2.
j1 can miss its deadline dl1 depending on when at2 occurs!
[Figure: two timelines over 0-9 time units showing tasks j0, j1, j2 with arrival times at0, at1, at2, deadlines dl0, dl1, dl2, and period T; shifting at2 changes whether j1 finishes before dl1]
42. Search-Based Approaches
• This problem can be tackled as a search problem in the space
of arrival times for aperiodic tasks
• Identify worst-case scenarios for testing
• No assumptions
• Genetic algorithms: Briand et al., 2003-2006
• Constraint Programming (e.g., OPL, ILOG CP Optimizer)
– Nejati et al., 2012
– Di Alesio et al., 2013-2014
43. Constraint Optimization
Constraint optimization problem:
• Static properties of tasks → constants
• Dynamic properties of tasks → variables
• Performance requirement → objective function
• OS scheduler behaviour → constraints
44. Process and Technologies
[Diagram: INPUT: the system design and platform are captured via UML modeling in a design model (time and concurrency information). This defines the optimization problem: find arrival times that maximize the chance of deadline misses. An automated search, using Genetic Algorithms (GA) or Constraint Programming (CP), produces the OUTPUT: solutions (task arrival times likely to lead to deadline misses, e.g., at0 = 1, at1 = 3, at2 = 4), which feed deadline miss analysis and stress test cases]
45. Results and Current Work
• GA tends to be more efficient but less effective than CP
– More efficient: Find deadline misses quicker
– More effective: Find worse deadline misses
• CP is deterministic, evolutionary search is randomized
• For testing we want a diverse set of stress test cases
• Combining GA and CP (Di Alesio’s dissertation):
– Achieve an efficiency close to GA and an effectiveness close to
CP
– Use GA first and improve worst solutions found by GA by
performing a CP complete search in the neighborhood of
solutions
– Results on five case studies are very encouraging
46. SBST in Industry: Discussion
• Scalability
• Applicability
• Variety of heuristics as a function of test objectives, available
information, assumptions, etc.
• Search as a piece of the solution: multidisciplinarity
• Combining search with other techniques: Likely candidates
47. Scalability
• Search spaces are huge in practice
• Fitness computation is often computationally-intensive
• Test execution can be expensive
– Web applications or phone apps versus embedded systems
with HIL
– Models, simulation to guide the search
• Simulation is always expensive
– Simulink models, e.g., 31s for a 2s simulation
– Surrogate modeling?
• In many situations, models of the system can help guide the search
48. Applicability
• Many academic solutions are not applicable in practice
• Context matters
• Scalability -> applicability
• But also inputs required for guiding the search
• Integration with the rest of the development process
– E.g., design models, WCET analysis, Simulink development
49. A Large Variety of Heuristics
• Test objectives differ a great deal depending on context
– Performance, robustness, critical environment states …
• Available information also differs, both for guiding test generation
and oracles
– Purely black-box testing
– Design information, e.g., through models
• Working assumptions
– About process, technology, …
– E.g., availability of plant/environment models in Simulink
• In a given context, some degree of tailoring is usually required for
applying SBST
50. Multidisciplinarity
• Typically, meta-heuristic search is only part of a solution to a
testing problem
• Dedicated system or environment modeling, e.g., in Cisco and
WesternGeco studies
• Machine learning, e.g., regression trees in Delphi study
• Statistical analysis, e.g., EEA and non-linear regression in
Delphi study
• Constraint programming, e.g., in Kongsberg study
51. Search-Based Software Testing in Industry
SVV lab: svv.lu
SnT: www.securityandtrust.lu