#atlassian
Heavenly Hell 
Automated Tests at Scale 
WOJCIECH SELIGA • SENIOR DEV MANAGER • ATLASSIAN • @WSELIGA
About me 
• Coding since 6 yo 
• Agile Practices (inc. TDD) since 2003 
• Dev Nerd, Tech Leader, Agile Coach, 
Speaker, PHB 
• 7 years with Atlassian 
(JIRA Senior Dev Manager) 
• Spartez Co-founder & CEO
XP Promise 
Cost of Change 
Waterfall 
XP 
Time
The Story
About 2.5 years ago
Almost 10 years 
of accumulating 
legacy automatic tests
About 20 000 tests 
on all levels of abstraction 
*just in core JIRA
Very slow (even hours) 
and fragile feedback loop
Serious performance and 
reliability issues
Dispirited devs 
accepting RED as a norm
Feedback 
Speed 
Test 
Quality
Design 
Restructure 
Share 
Respect 
Prune 
Test Code is Not Trash 
Refactor Maintain 
Review 
Discuss 
Rewrite
Test Pyramid 
Selenium 
REST / HTML Tests 
Unit Tests (including QUnit) 
Fastest, lowest overall confidence 
Slowest, highest overall confidence
Selenium 
REST / HTML Tests 
Unit Tests (including QUnit) 
Test Pyramid 
90% 
9% 
1%
Optimum Balance 
Isolation Speed Coverage Level Access Effort
Dangerous to tamper with 
Quality / Determinism Maintainability
Almost two years later…
People
People - Motivation 
Making GREEN the norm
Shades of Red
Build Tiers and Policy 
Tier A1 - green soon after all commits 
unit tests and functional* tests 
Tier A2 - green at the end of the day 
WebDriver and bundled plugins tests 
Tier A3 - green at the end of the iteration 
supported platforms tests, compatibility tests
Wallboards: 
Constant 
Awareness
Training 
• Favouring assertThat over assertTrue/False and assertEquals 
• Avoiding races - Atlassian Selenium with its TimedElement 
• Favouring unit tests over functional tests (including QUnit 
over WebDriver) 
• Promoting Page Objects 
• Brownbags, blog posts, code reviews
Quality
Re-run failed tests and see if they pass 
Automatic Flakiness Detection 
Quarantine
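The re-run idea can be sketched in a few lines. This is only an illustration, not the tooling used on the JIRA builds: the class name, the Verdict values and the retry limit are all assumptions. A test that fails and then passes on a re-run is reported as flaky, which makes it a candidate for quarantine.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical classifier; names and retry policy are illustrative assumptions. */
final class FlakinessDetector {
    enum Verdict { PASSED, FLAKY, FAILED }

    /**
     * Runs the test once; on failure, re-runs it up to maxReruns times.
     * A test that fails first and then passes on a re-run is reported as FLAKY.
     */
    static Verdict classify(BooleanSupplier test, int maxReruns) {
        if (test.getAsBoolean()) {
            return Verdict.PASSED;
        }
        for (int i = 0; i < maxReruns; i++) {
            if (test.getAsBoolean()) {
                return Verdict.FLAKY; // candidate for quarantine
            }
        }
        return Verdict.FAILED; // consistently red: treat as a real failure
    }
}
```

A FLAKY verdict should not be read as green: the test still needs healing before it leaves quarantine.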
Quarantine - Healing
SlowMo - expose races
Selenium 1
Selenium ditching 
Sky did not fall in
Ditching - benefits 
• Freed build agents - better system throughput 
• Boosted morale 
• Gazillions of developer hours saved 
• Money saved on infrastructure
Ditching - due diligence 
• conducting the audit - analysis of the coverage we lost 
• determining which tests need to be rewritten (e.g. security related) 
• rewriting the tests (good job for new hires + a senior mentor)
Flaky Browser-based Tests 
Races between test code and asynchronous page logic 
Playing with "loading" CSS class does not really help
Races Removal with Tracing 
// In the browser:
function mySearchClickHandler() {
    doSomeXhr().always(function() {
        // This executes when the XHR has completed (either success or failure)
        JIRA.trace("search.completed");
    });
}
// In production code JIRA.trace is a no-op

// In my page object:
@Inject
TraceContext traceContext;

public SearchResults doASearch() {
    Tracer snapshot = traceContext.checkpoint();
    getSearchButton().click(); // causes mySearchClickHandler to be invoked
    // This waits until the "search.completed"
    // event has been emitted, *after* the previous snapshot
    traceContext.waitFor(snapshot, "search.completed");
    return pageBinder.bind(SearchResults.class);
}
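Why the checkpoint matters can be shown with a simplified in-memory model. This is a sketch under assumptions, not the Atlassian TraceContext implementation: SimpleTraceLog and its method signatures are hypothetical. The point is that waitFor only looks at events emitted after the checkpoint, so a stale "search.completed" from an earlier search cannot satisfy the wait.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for a trace context; not the Atlassian implementation. */
final class SimpleTraceLog {
    private final List<String> events = new ArrayList<>();

    /** Records a trace key emitted by the page and wakes up any waiting test code. */
    synchronized void trace(String key) {
        events.add(key);
        notifyAll();
    }

    /** Remembers how many events existed before the action under test. */
    synchronized int checkpoint() {
        return events.size();
    }

    /** Blocks until key is emitted after the checkpoint; returns false on timeout. */
    synchronized boolean waitFor(int checkpoint, String key, long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (!events.subList(checkpoint, events.size()).contains(key)) {
            long remaining = deadline - System.currentTimeMillis();
            if (remaining <= 0) {
                return false;
            }
            try {
                wait(remaining);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }
}
```

Unlike polling for a "loading" CSS class, the test waits for an explicit signal that the asynchronous work triggered by this particular click has finished.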
Speed
Can we halve our build times?
Parallel Execution - Theory 
End of Build 
Batches 
Start of Build
Parallel Execution 
End of Build 
Batches 
Start of Build
Parallel Execution - Reality Bites 
End of Build 
Agent 
availability 
Batches 
Start of Build
Dynamic Test Execution 
Dispatch - Hallelujah
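A toy model shows why dispatching tests dynamically can beat batches fixed up front. Everything here is an assumption made for illustration (the class name, the contiguous equal-size batching, and the idea that all agents are available at time zero); it is not how the real dispatcher works.

```java
import java.util.PriorityQueue;

/** Toy model; durations, agent counts and method names are illustrative assumptions. */
final class DispatchSimulation {

    /** Static batching: tests are cut into equally sized contiguous batches up front. */
    static long staticMakespan(long[] durations, int agents) {
        int batchSize = (durations.length + agents - 1) / agents;
        long worst = 0;
        for (int start = 0; start < durations.length; start += batchSize) {
            long batchTotal = 0;
            int end = Math.min(start + batchSize, durations.length);
            for (int i = start; i < end; i++) {
                batchTotal += durations[i];
            }
            worst = Math.max(worst, batchTotal); // the build ends with the slowest batch
        }
        return worst;
    }

    /** Dynamic dispatch: whichever agent becomes free first pulls the next test. */
    static long dynamicMakespan(long[] durations, int agents) {
        PriorityQueue<Long> freeAt = new PriorityQueue<>();
        for (int i = 0; i < agents; i++) {
            freeAt.add(0L);
        }
        long makespan = 0;
        for (long duration : durations) {
            long finish = freeAt.poll() + duration;
            makespan = Math.max(makespan, finish);
            freeAt.add(finish);
        }
        return makespan;
    }
}
```

With durations {9, 8, 1, 1, 1, 1, 1, 1, 1} and 3 agents, the static split ends after 18 time units because one batch holds both slow tests, while dynamic dispatch ends after 9.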
"You can't manage what 
you can't measure." 
not by W. Edwards Deming
If you believe only in that, you are doomed.
You can't improve the system 
if you can't measure it 
Profiler, Build statistics, Logs, statsd → Graphite
Agent Availability/Setup 
Fetching Dependencies 
Compilation 
Packaging 
Publishing Results 
Executing Tests 
SCM Update 
Anatomy of Build* 
*Any resemblance to maven build is entirely accidental
Agent Availability/Setup (mean 10min) 
Compilation (7min) 
Publishing Results (1min) 
Packaging (0min) 
Executing Tests (7min) 
SCM Update (2min) 
Fetching Dependencies (1.5min) 
JIRA Unit Tests Build
Decreasing test 
execution time to 
ZERO 
alone would not let 
us achieve our goal!
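The arithmetic behind that claim can be checked from the phase durations listed for the JIRA unit tests build. The sketch assumes the phases run one after another and simply add up; the class and method names are hypothetical.

```java
/** Phase durations in minutes, taken from the JIRA Unit Tests Build breakdown. */
final class BuildAnatomy {
    static final double AGENT_SETUP = 10.0;        // mean wait for an agent
    static final double SCM_UPDATE = 2.0;
    static final double FETCHING_DEPENDENCIES = 1.5;
    static final double COMPILATION = 7.0;
    static final double PACKAGING = 0.0;
    static final double EXECUTING_TESTS = 7.0;
    static final double PUBLISHING_RESULTS = 1.0;

    /** Total build time, assuming the phases are strictly sequential. */
    static double total() {
        return AGENT_SETUP + SCM_UPDATE + FETCHING_DEPENDENCIES + COMPILATION
                + PACKAGING + EXECUTING_TESTS + PUBLISHING_RESULTS;
    }

    /** Build time if test execution took no time at all. */
    static double totalWithoutTests() {
        return total() - EXECUTING_TESTS;
    }

    /** The "halve our build times" goal. */
    static double halvingTarget() {
        return total() / 2;
    }
}
```

Under these assumptions the build takes 28.5 min, the halving target is 14.25 min, and a build with zero-cost tests would still take 21.5 min, so the other phases have to shrink too.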
Agent Availability/Setup 
• starved builds due to 
busy agents building 
very long builds 
• time synchronization 
issue - NTPD problem
SCM Update - Checkout time 
• Proximity of SCM repo 
• Shallow git clones are not that fast or lightweight, and they generate extra 
CPU load on the git server 
• git clone per agent/plan + git pull + git clone per build (hard links!) 
• Much less load on Stash server (no need to queue up) 
2 min → 5 seconds
Fetching Dependencies 
• Fix Predator 
• Sandboxing/isolation agent trade-off: 
rm -rf $HOME/.m2/repository/com/atlassian/* 
into 
find $HOME/.m2/repository/com/atlassian/ -name "*SNAPSHOT*" | xargs rm 
• Network hardware failure found 
(dropping packets) 
1.5 min → 10 seconds
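The same trade-off can be sketched in Java for a build step that cannot shell out. This is an illustration only: SnapshotCleaner is a hypothetical name, and matching on regular files whose name contains "SNAPSHOT" is an assumption about what the find command is meant to remove. Released artifacts stay cached, so only snapshots have to be fetched again.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

/** Hypothetical helper; removes cached snapshot artifacts but keeps releases. */
final class SnapshotCleaner {

    /** Deletes regular files whose name contains "SNAPSHOT"; returns the deleted paths. */
    static List<Path> deleteSnapshots(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            List<Path> snapshots = paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().contains("SNAPSHOT"))
                    .collect(Collectors.toList());
            for (Path snapshot : snapshots) {
                Files.delete(snapshot);
            }
            return snapshots;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Directories are left in place, which mirrors the fact that plain rm does not remove directories.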
Compilation 
• Restructuring multi-pom maven project and dependencies 
• Maven 3 parallel compilation FTW! 
-T 1.5C 
*optimal factor thanks to scientific trial and error research 
7 min → 1 min
Unit Test Execution 
• Splitting unit tests into 2 buckets: good and legacy (much longer) 
• Maven 3 parallel test execution (-T 1.5C) 
3000 poor tests 
(5min) 
11000 good tests 
(1.5min) 
7 min → 5 min 
Rewritten entirely 
over next year
Functional Tests 
• Selenium 1 removal did help 
• Faster reset/restore (avoid unnecessary stuff, intercepting SQL 
operations for debug purposes - building stacktraces is costly) 
• Restoring via Backdoor REST API (JIRA TestKit) 
• Using REST API for common setup/teardown operations
Publishing Results 
• Server log allocation per test → now using the Backdoor 
REST API (was Selenium) 
• Bamboo DB performance degradation for rich build 
history 
1 min → 40 s
Unexpected Problem 
• Stability Issues with our CI server (hardware) 
• The bottleneck changed from I/O to CPU 
• Too many agents per physical machine
Agent Availability/Setup (3min)* 
Compilation (1min) 
Packaging (0min) 
Publishing Results (40sec) 
Executing Tests (5min) 
SCM Update (5sec) 
Fetching Dependencies (10sec) 
JIRA Unit Tests Build Improved
Improvements Summary 
Tests              Before    After    Improvement 
Unit tests         29 min    17 min   41% 
Functional tests   56 min    34 min   39% 
WebDriver tests    39 min    21 min   46% 
Overall            124 min   72 min   42% 
* Additional ca. 5% improvement expected once new git clone 
strategy is consistently rolled-out everywhere
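The improvement column follows from the before and after times as (before - after) / before. A minimal sketch of that calculation, with a hypothetical class name and rounding to whole percent assumed:

```java
/** Hypothetical helper reproducing the improvement column of the summary. */
final class ImprovementSummary {

    /** Percentage reduction from before to after, rounded to the nearest whole percent. */
    static long improvementPercent(double beforeMinutes, double afterMinutes) {
        return Math.round(100.0 * (beforeMinutes - afterMinutes) / beforeMinutes);
    }
}
```

For example, improvementPercent(124, 72) gives 42, matching the overall row; the three suites also sum to the overall figures (29 + 56 + 39 = 124 and 17 + 34 + 21 = 72).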
Better speed increases 
responsibility 
Fewer commits (authors) per single build 
vs.
The Quality Follows
But that's still bad 
We want CI feedback loop in a few minutes maximum
Splitting The Codebase
Inevitable Split - Fears 
• Organizational concerns - understanding, managing, 
integrating, releasing, coordinating 
• Mindset change - if something has worked for 10+ years, why 
change it? 
• Trust - does this library still work? 
• We damned ourselves with big buckets for all tests - 
where do they belong?
Splitting code base 
• Step 0 - JIRA Importers Plugin (3.5 years ago) 
• Step 1 - New Issue View and Navigator 
• Step 2 - now everything else follows (e.g. Workflow Designer) 
JIRA 6.0
Getting back from hell to heaven is difficult. 
Hell sucks in your soul.
Key takeaways: 
• Visibility and problem awareness help 
• Maintaining a huge test bed is difficult and costly 
• Measure the problem - to baseline 
• No prejudice - no sacred cows 
• Automated tests are not a one-off investment; they are a continuous journey 
• Performance is a damn important feature 
Test performance 
is a damn 
important feature!
XP vs Sad Reality 
Cost of Change 
Waterfall 
XP - ideal 
Sad Reality 
Time
Images - Credits 
• Green Traffic Light - by flrnt, CC-BY-SA-2.0 
• Turtle - by Jonathan Zander, CC-BY-SA-3.0 
• Loading - by MatthewJ13, CC-SA-3.0 
• Merlin Tool - by L. Mahin, CC-BY-SA-3.0 
• Flashing Red Light - by Chris Phan, CC BY 2.0 
• In Heaven - by Daniel Pascoal, CC BY-NC-ND 2.0
Thank you! 
WOJCIECH SELIGA • SENIOR DEV MANAGER • ATLASSIAN • @WSELIGA

Heavenly hell – automated tests at scale wojciech seliga

  • 2. Heavenly Hell Automated Tests at Scale WOJCIECH SELIGA • SENIOR DEV MANAGER • ATLASSIAN • @WSELIGA
  • 3. About me • Coding since 6 yo • Agile Practices (inc. TDD) since 2003 • Dev Nerd, Tech Leader, Agile Coach, Speaker, PHB • 7 years with Atlassian (JIRA Senior Dev Manager) • Spartez Co-founder & CEO
  • 4. XP Promise Cost of Change Waterfall XP Time
  • 5. XP Promise Cost of Change Waterfall XP Time
  • 8. Almost 10 years of accumulating legacy automatic tests
  • 9. About 20 000 tests on all levels of abstraction *just in core JIRA
  • 10. Very slow (even hours) and fragile feedback loop
  • 11. Serious performance and reliability issues
  • 12. Dispirited devs accepting RED as a norm
  • 13. Feedback Speed ` Test Quality
  • 14. Design Restructure Share Respect Prune Test Code is Not Trash Refactor Maintain Review Discuss Rewrite
  • 15. Test Pyramid Selenium REST / HTML Tests Unit Tests (including QUnit) Fastest, lowest overall confidence Slowest, highest overall confidence
  • 16. Selenium REST / HTML Tests Unit Tests (including QUnit) Test Pyramid 90% 9% 1%
  • 20. Optimum Balance Isolation Speed Coverage
  • 21. Optimum Balance Isolation Speed Coverage Level
  • 22. Optimum Balance Isolation Speed Coverage Level Access
  • 23. Optimum Balance Isolation Speed Coverage Level Access Effort
  • 25. Dangerous to temper with Quality / Determinism
  • 26. Dangerous to temper with Quality / Determinism Maintainability
  • 27. Almost two years later…
  • 29. People - Motivation Making GREEN the norm
  • 31. Build Tiers and Policy Tier A1 - green soon after all commits unit tests and functional* tests Tier A2 - green at the end of the day WebDriver and bundled plugins tests Tier A3 - green at the end of the iteration supported platforms tests, compatibility tests
  • 33. Training • Favouring assertThat over assertTrue/False and assertEquals • Avoiding races - Atlassian Selenium with its TimedElement • Favouring unit tests over functional tests (including QUnit over WebDriver) • Promoting Page Objects • Brownbags, blog posts, code reviews
  • 35. Re-run failed tests and see if they pass Automatic Flakiness Detection Quarantine
  • 40. Selenium ditching Sky did not fall in
  • 41. Ditching - benefits • Freed build agents - better system throughput • Boosted morale • Gazillion of developer hours saved • Money saved on infrastructure
  • 42. Ditching - due diligence • conducting the audit - analysis of the coverage we lost • determining which tests needs to rewritten (e.g. security related) • rewriting the tests (good job for new hires + a senior mentor)
  • 43. Flaky Browser-based Tests Races between test code and asynchronous page logic Playing with "loading" CSS class does not really help
  • 44. Races Removal with Tracing // in the browser:! function mySearchClickHandler() {! doSomeXhr().always(function() {! // This executes when the XHR has completed (either success or failure)! JIRA.trace("search.completed");" });! }! // In production code JIRA.trace is a no-op // in my page object:! @Inject! TraceContext traceContext;! ! public SearchResults doASearch() {! Tracer snapshot = traceContext.checkpoint();! getSearchButton().click(); // causes mySearchClickHandler to be invoked! // This waits until the "search.completed" // event has been emitted, *after* previous snapshot ! traceContext.waitFor(snapshot, "search.completed"); ! return pageBinder.bind(SearchResults.class);! }!
  • 45. Speed
  • 46. Can we halve our build times?
  • 47. Parallel Execution - Theory End of Build Batches Start of Build
  • 48. Parallel Execution End of Build Batches Start of Build
  • 49. Parallel Execution - Reality Bites End of Build Agent availability Batches Start of Build
  • 50. Dynamic Test Execution Dispatch - Hallelujah
  • 51. Dynamic Test Execution Dispatch - Hallelujah
  • 52. "You can't manage what you can't measure." not by W. Edwards Deming
  • 53. If you believe just in it you are doomed. "You can't manage what you can't measure." not by W. Edwards Deming
  • 54. You can't improve the system if you can't measure it
  • 55. You can't improve the system if you can't measure it Profiler, Build statistics, Logs, statsd → Graphite
  • 56. Compilation Packaging Executing Tests Anatomy of Build*
  • 57. Fetching Dependencies Compilation Packaging Executing Tests Anatomy of Build*
  • 58. Fetching Dependencies Compilation Packaging Executing Tests Anatomy of Build* *Any resemblance to maven build is entirely accidental
  • 59. Fetching Dependencies Compilation Packaging Executing Tests SCM Update Anatomy of Build* *Any resemblance to maven build is entirely accidental
  • 60. Agent Availability/Setup Fetching Dependencies Compilation Packaging Executing Tests SCM Update Anatomy of Build* *Any resemblance to maven build is entirely accidental
  • 61. Agent Availability/Setup Fetching Dependencies Compilation Packaging Publishing Results Executing Tests SCM Update Anatomy of Build* *Any resemblance to maven build is entirely accidental
  • 62. Compilation (7min) JIRA Unit Tests Build
  • 63. Compilation (7min) Packaging (0min) JIRA Unit Tests Build
  • 64. Compilation (7min) Packaging (0min) Executing Tests (7min) JIRA Unit Tests Build
  • 65. Compilation (7min) Publishing Results (1min) Packaging (0min) Executing Tests (7min) JIRA Unit Tests Build
  • 66. Compilation (7min) Publishing Results (1min) Packaging (0min) Executing Tests (7min) Fetching Dependencies (1.5min) JIRA Unit Tests Build
  • 67. Compilation (7min) Publishing Results (1min) Packaging (0min) Executing Tests (7min) SCM Update (2min) Fetching Dependencies (1.5min) JIRA Unit Tests Build
  • 68. Agent Availability/Setup (mean 10min) Compilation (7min) Publishing Results (1min) Packaging (0min) Executing Tests (7min) SCM Update (2min) Fetching Dependencies (1.5min) JIRA Unit Tests Build
  • 69. Decreasing test execution time to ZERO alone would not let us achieve our goal!
  • 70. Agent Availability/Setup • starved builds due to busy agents building very long builds • time synchronization issue - NTPD problem
  • 71. SCM Update - Checkout time • Proximity of SCM repo • shallow git clones are not so fast and lightweight + generating extra git server CPU load • git clone per agent/plan + git pull + git clone per build (hard links!) • Much less load on Stash server (no need to queue up)
  • 72. SCM Update - Checkout time • Proximity of SCM repo • shallow git clones are not so fast and lightweight + generating extra git server CPU load • git clone per agent/plan + git pull + git clone per build (hard links!) • Much less load on Stash server (no need to queue up) 2 min → 5 seconds
  • 73.
  • 74. Fetching Dependencies • Fix Predator • Sandboxing/isolation agent trade-off: rm -rf $HOME/.m2/repository/com/atlassian/* into find $HOME/.m2/repository/com/atlassian/ -name “*SNAPSHOT*” | xargs rm • Network hardware failure found (dropping packets)
  • 75. Fetching Dependencies • Fix Predator • Sandboxing/isolation agent trade-off: rm -rf $HOME/.m2/repository/com/atlassian/* into find $HOME/.m2/repository/com/atlassian/ -name “*SNAPSHOT*” | xargs rm • Network hardware failure found (dropping packets) 1.5 min → 10 seconds
  • 76-77. Compilation
      • Restructuring multi-pom Maven project and dependencies
      • Maven 3 parallel compilation FTW! -T 1.5C*
      • Result: 7 min → 1 min
      * Optimal factor found through scientific trial-and-error research
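The C suffix in -T 1.5C makes the thread count relative to the number of CPU cores. A small sketch of what the factor means in practice; it assumes Maven multiplies the factor by the core count and truncates the result to a whole number, and the helper name maven_threads is illustrative:

```shell
#!/bin/sh
# Sketch: how many build threads a core-based -T factor yields.
# Assumption: Maven multiplies the factor by the number of cores and
# truncates the result to a whole number.
maven_threads() {
    factor=$1
    cores=$2
    awk -v f="$factor" -v c="$cores" 'BEGIN { printf "%d\n", f * c }'
}

maven_threads 1.5 8   # prints 12

# A matching build invocation might look like this (the goals are an
# assumption; only the -T 1.5C flag comes from the slide):
# mvn -T 1.5C clean install
```

Because the factor scales with the hardware, the same setting can be reused on agents with different core counts.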
  • 78-79. Unit Test Execution
      • Splitting unit tests into 2 buckets: good and legacy (much longer)
      • Maven 3 parallel test execution (-T 1.5C)
      • 3000 poor tests (5 min), 11000 good tests (1.5 min)
      • Rewritten entirely over next year
      • Result: 7 min → 5 min
  • 80. Functional Tests
      • Selenium 1 removal did help
      • Faster reset/restore: skip unnecessary work such as intercepting SQL operations for debug purposes (building stack traces is costly)
      • Restoring via Backdoor REST API (JIRA TestKit)
      • Using REST API for common setup/teardown operations
  • 82-83. Publishing Results
      • Server log allocation per test → now uses the Backdoor REST API (previously Selenium)
      • Bamboo DB performance degrades with a rich build history
      • Result: 1 min → 40 s
  • 84. Unexpected Problem
      • Stability issues with our CI server (hardware)
      • The bottleneck changed from I/O to CPU
      • Too many agents per physical machine
  • 85-91. JIRA Unit Tests Build Improved (time per stage)
      • Agent Availability/Setup (3 min)*
      • SCM Update (5 sec)
      • Fetching Dependencies (10 sec)
      • Compilation (1 min)
      • Packaging (0 min)
      • Executing Tests (5 min)
      • Publishing Results (40 sec)
  • 92. Improvements Summary
        Tests               Before     After     Improvement
        Unit tests          29 min     17 min    41%
        Functional tests    56 min     34 min    39%
        WebDriver tests     39 min     21 min    46%
        Overall             124 min    72 min    42%
      * Additional ca. 5% improvement expected once the new git clone strategy is consistently rolled out everywhere
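The percentages in the summary follow from (before - after) / before. A short sketch that recomputes them from the durations in the table; the helper name improvement is illustrative:

```shell
#!/bin/sh
# Relative improvement in percent, rounded to a whole number.
# improvement is an illustrative helper; the inputs come from the table.
improvement() {
    before=$1
    after=$2
    awk -v b="$before" -v a="$after" 'BEGIN { printf "%.0f\n", (b - a) / b * 100 }'
}

echo "Unit tests:       $(improvement 29 17)%"
echo "Functional tests: $(improvement 56 34)%"
echo "WebDriver tests:  $(improvement 39 21)%"
echo "Overall:          $(improvement 124 72)%"
```

The overall row is the sum of the three rows above it: 29 + 56 + 39 = 124 minutes before and 17 + 34 + 21 = 72 minutes after.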
  • 93. Better speed increases responsibility
      • Fewer commits (authors) per single build vs.
  • 97. But that's still bad: we want the CI feedback loop to take a few minutes at most
  • 99. Inevitable Split - Fears
      • Organizational concerns - understanding, managing, integrating, releasing, coordinating
      • Mindset change - if something worked for 10+ years, why change it?
      • Trust - does this library still work?
      • We damned ourselves with big buckets for all tests - where do they belong?
  • 100. Splitting code base
      • Step 0 - JIRA Importers Plugin (3.5 years ago)
      • Step 1 - New Issue View and Navigator
      • Step 2 - now everything else follows (e.g. Workflow Designer)
      JIRA 6.0
  • 101. Getting back from hell to heaven is difficult. Hell sucks in your soul.
  • 102. Key takeaways:
      • Visibility and problem awareness help
      • Maintaining a huge testbed is difficult and costly
      • Measure the problem - to baseline
      • No prejudice - no sacred cows
      • Automated tests are not a one-off investment, it's a continuous journey
      • Performance is a damn important feature
  • 103. Test performance is a damn important feature!
  • 104. XP vs Sad Reality (chart: Cost of Change over Time; curves: Waterfall, XP - ideal, Sad Reality)
  • 105. Images - Credits
      • Green Traffic Light - by flrnt, CC-BY-SA-2.0
      • Turtle - by Jonathan Zander, CC-BY-SA-3.0
      • Loading - by MatthewJ13, CC-SA-3.0
      • Merlin Tool - by L. Mahin, CC-BY-SA-3.0
      • Flashing Red Light - by Chris Phan, CC BY 2.0
      • In Heaven - by Daniel Pascoal, CC BY-NC-ND 2.0
  • 106. Thank you! WOJCIECH SELIGA • SENIOR DEV MANAGER • ATLASSIAN • @WSELIGA