This document discusses running acceptance tests as monitors in production environments. It describes why this can be useful for failure detection, capacity planning, and gaining business insights. However, care must be taken to avoid polluting data or sending unnecessary alerts. The author demonstrates a tool called atam4j that treats acceptance tests as a microservice, similar to other applications, making them easier to deploy and monitor in production.
6. Signing off your build
Monitoring your environment
CC BY 2.0, https://commons.wikimedia.org/w/index.php?curid=587238
7. Monitoring: “Standard” health checks
[Diagram: Monitoring/Alerting (Prometheus, Dashing, Nagios etc.) runs standard checks against Your App, Upstream App A, Upstream App B and Your App’s DB. Checks shown: GET /healthcheck, GET /ping, GET /ping, select * from dual;]
9. Monitor User/Client Activity
This is one of the best things you can do.
● If you can see actual events - you know it’s ok.
● Low cost - High Benefit
But… can you alert?
● Pretty Graph - Does it alert?
10. Low traffic apps
Is it broken - or is it Christmas Day?
Anomaly Detection can’t be responsive with low volumes
Maybe you can’t Alert?
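A rough back-of-the-envelope sketch of why low volumes make alerting on silence slow. It assumes events arrive as a Poisson process, which is an assumption for illustration and not something the slides state; the class and method names are hypothetical.

```java
class QuietPeriodSketch {

    /**
     * Minutes of silence needed before "no events" is unlikely to be chance,
     * ASSUMING events arrive as a Poisson process at the given average rate.
     * P(no events in t minutes) = exp(-rate * t), so t = ln(1 / p) / rate.
     */
    static double minutesOfSilenceBeforeAlert(double eventsPerMinute, double falseAlarmProbability) {
        return Math.log(1.0 / falseAlarmProbability) / eventsPerMinute;
    }

    public static void main(String[] args) {
        // Busy app: 100 events per minute. Quiet app: 1 event per hour.
        System.out.printf("busy:  %.2f minutes%n", minutesOfSilenceBeforeAlert(100.0, 0.01));
        System.out.printf("quiet: %.2f minutes%n", minutesOfSilenceBeforeAlert(1.0 / 60.0, 0.01));
    }
}
```

Under this assumption, and accepting a 1% chance of a false alarm, the busy app can alert after under three seconds of silence, while the quiet app needs more than four and a half hours.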
11. Instead of alerting that the build is broken, alert that prod is broken
Build time tests can be used as monitoring
Tests can lend themselves well to asserting the health of a system.
They:
● pass or fail - simple!
● have descriptive names - helpful!
● can have helpful errors - helps incident response
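A minimal sketch of how the three properties on this slide (pass or fail, descriptive names, helpful errors) map onto something a monitor can consume. The class, record and check names are hypothetical, and the checks are placeholders, not real tests from the talk.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class MonitorChecks {

    /** Outcome of one check: its descriptive name, pass/fail, and a message for responders. */
    record Result(String name, boolean passed, String message) {}

    /** Runs every check; a thrown exception or AssertionError marks that check as failed. */
    static List<Result> runAll(Map<String, Runnable> checks) {
        List<Result> results = new ArrayList<>();
        for (Map.Entry<String, Runnable> check : checks.entrySet()) {
            try {
                check.getValue().run();
                results.add(new Result(check.getKey(), true, "OK"));
            } catch (RuntimeException | AssertionError e) {
                results.add(new Result(check.getKey(), false, String.valueOf(e.getMessage())));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Placeholder checks: one passes, one fails with a helpful error.
        Map<String, Runnable> checks = new LinkedHashMap<>();
        checks.put("searchReturnsResultsForKnownProduct", () -> {});
        checks.put("productDetailsPageShowsPrice", () -> {
            throw new AssertionError("expected a price on the details page but found none");
        });
        for (Result r : runAll(checks)) {
            System.out.println((r.passed() ? "PASS " : "FAIL ") + r.name() + ": " + r.message());
        }
    }
}
```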
12. REDUCE
By Juergen Rosskamp, wiki+spam@eindruckschinderdomain.de - digital still picture, CC BY-SA 2.5, https://commons.wikimedia.org/w/index.php?curid=1877742
13. Know the Impact of an issue
[Diagram: Monitoring/Alerting checks Your App, App 1, App 2 and Your App’s DB (GET /healthcheck, GET /ping, select * from dual;), with three tests added: Search Results Test, Product Summary Test, Product Details Test]
14. Supplementing a Canary release
[Diagram: Real Clients/Users reach the Old Version and the New Version through a Load Balancer, with Acceptance Tests as Monitors shown alongside]
17. Background
Website for a media company with login. Java code and Java tests.
Very concerned about site availability
We were sold. How do we do it?
18. What to test?
1. Login with correct username/password creates a session
2. Login with incorrect username/password gives error
user.for.test.monitoring.membership-team@thecompany.com
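The two login checks on this slide might be sketched as below. This is only an illustration: the `LoginClient` interface, the placeholder password and the in-memory stub are assumptions so that the example runs on its own; a real monitor would call the production site with the dedicated test user.

```java
import java.util.Optional;

class LoginMonitorSketch {

    /** Hypothetical client for the site's login; a real one would call production over HTTP. */
    interface LoginClient {
        /** Returns a session id on success, or empty when the credentials are rejected. */
        Optional<String> login(String username, String password);
    }

    // Dedicated monitoring user from the slide; passwords in this sketch are placeholders.
    static final String TEST_USER = "user.for.test.monitoring.membership-team@thecompany.com";

    static void loginWithCorrectPasswordCreatesSession(LoginClient client, String password) {
        if (client.login(TEST_USER, password).isEmpty()) {
            throw new AssertionError("login with correct credentials did not create a session");
        }
    }

    static void loginWithIncorrectPasswordGivesError(LoginClient client) {
        if (client.login(TEST_USER, "definitely-wrong-password").isPresent()) {
            throw new AssertionError("login with incorrect credentials unexpectedly succeeded");
        }
    }

    public static void main(String[] args) {
        // In-memory stand-in so the sketch runs without a real site.
        LoginClient stub = (user, pass) ->
                TEST_USER.equals(user) && "placeholder-secret".equals(pass)
                        ? Optional.of("session-123")
                        : Optional.empty();
        loginWithCorrectPasswordCreatesSession(stub, "placeholder-secret");
        loginWithIncorrectPasswordGivesError(stub);
        System.out.println("both login checks passed");
    }
}
```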
19. First thought
“Let’s run (some of) our acceptance tests from Jenkins against production”
Not good:
● Does your CI Server get as much love as production in your org?
How do we do it?
20. Second thought
“Let’s run our tests from Nagios (monitoring server) in production”
Not good:
● Nagios (and its many forks/variants) relies on quick, simple checks.
○ We ended up with chaos!
● Harder to deploy
How do we do it?
21. Third thought
“Let’s just treat our acceptance tests as another microservice in production”
Great:
● Treated like any other service.
○ Development
○ Deployment
○ Monitoring
How do we do it?
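To make the "tests as another microservice" idea concrete, here is a minimal sketch using only the JDK: a scheduler reruns the checks and an HTTP endpoint reports the latest outcome, so the service can be deployed and monitored like any other. This is not atam4j's API; the `/status` path, the status strings, the port and the 60 second interval are all assumptions for illustration.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

class TestsAsServiceSketch {

    // Latest outcome of the scheduled test run, read by the HTTP endpoint.
    static final AtomicReference<String> status = new AtomicReference<>("TOO_EARLY");

    /** Runs the given check once and records the outcome; exceptions count as failures. */
    static void runOnce(BooleanSupplier acceptanceTests) {
        boolean passed;
        try {
            passed = acceptanceTests.getAsBoolean();
        } catch (RuntimeException | AssertionError e) {
            passed = false;
        }
        status.set(passed ? "ALL_OK" : "FAILURES");
    }

    /** Starts an HTTP server whose /status endpoint reports the latest outcome. */
    static HttpServer startServer(int port) throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(port), 0);
        server.createContext("/status", exchange -> {
            String current = status.get();
            byte[] body = current.getBytes(StandardCharsets.UTF_8);
            // 200 only when the last run passed, so a plain HTTP check can alert on it.
            exchange.sendResponseHeaders("ALL_OK".equals(current) ? 200 : 503, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    public static void main(String[] args) throws IOException {
        startServer(8080);
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Placeholder check; a real service would run the acceptance test suite here.
        scheduler.scheduleAtFixedRate(() -> runOnce(() -> true), 0, 60, TimeUnit.SECONDS);
    }
}
```

Returning a non-200 code on failure is one possible design choice: it lets existing monitoring poll the endpoint with the same quick, simple check it already uses for other services.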
22. atam4j
Acceptance Tests As Monitors 4 Java
Anurag Kapur (@anuragkapur) and I created atam4j
https://github.com/atam4j
26. Solving the Test Data & Event issues
Data
● Can you hide it?
● Mark it as test data
○ Sometimes you might need to add extra fields, e.g. “testData”: true
Events
● How far should events propagate?
○ You might want to cut them short
○ Talk to your downstream systems consuming the data
● Always exclude test events from the monitoring of real transactions
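One way marking test data could work in practice, as a small sketch: events carry a flag mirroring the “testData”: true field, and anything that counts real activity filters on it. The `LoginEvent` record and its fields are hypothetical; the real event shape depends on your domain.

```java
import java.util.List;

class TestDataFilterSketch {

    /** Hypothetical event shape; the testData flag mirrors the "testData": true field. */
    record LoginEvent(String username, boolean testData) {}

    /** Counts only real user logins, so monitor traffic does not inflate business metrics. */
    static long countRealLogins(List<LoginEvent> events) {
        return events.stream().filter(e -> !e.testData()).count();
    }

    public static void main(String[] args) {
        List<LoginEvent> events = List.of(
                new LoginEvent("alice@example.com", false),
                new LoginEvent("user.for.test.monitoring.membership-team@thecompany.com", true),
                new LoginEvent("bob@example.com", false));
        System.out.println("real logins: " + countRealLogins(events)); // prints "real logins: 2"
    }
}
```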
27. Atam4j - New Features
● Prometheus Support? - Not yet, but it is on a branch… WIP
● Atam4Node? - Considering it!
28. Conclusion
Running acceptance tests as monitors can have benefits
The cost/benefit will depend on your domain
If you do it:
● treat the tests as similar as possible to other tests
● treat the deployment and monitoring as similar as possible to other apps
Tools will help, but...
The hardest problems are specific to your domain