Watch here: https://alexwilson.tech/content/770c26b5-f551-4795-90fa-0f24627ec510
Software quality is tricky: we all want to build the best thing for our users, but we also want to ship fast, without process overhead holding us back.
Join Alex in discussing what Software Quality actually is, how we can define and measure it, and how we can work towards it. And yes, there is room for a QA on your team.
8. The difficulty in defining quality is to translate future needs of the user into measurable characteristics, so that a
product can be designed and turned out to give satisfaction at a price that the user will pay.
— William E. Deming
Let’s ask the experts!
@antoligy
The word quality has multiple meanings. Two of these meanings dominate the use of the word: 1. Quality consists of
those product features which meet the need of customers and thereby provide product satisfaction. 2. Quality
consists of freedom from deficiencies. Nevertheless, in a handbook such as this it is convenient to standardize on a
short definition of the word quality as "fitness for use".
— Joseph M. Juran
Quality is a customer determination, not an engineer's determination, not a marketing determination, nor a general
management determination. It is based on the customer's actual experience with the product or service, measured
against his or her requirements — stated or unstated, conscious or merely sensed, technically operational or entirely
subjective — and always representing a moving target in a competitive market.
— Armand V. Feigenbaum
9. ● Being “fit for purpose”, and serving a user need.
● Free from deficiencies.
● Being able to easily meet the needs of the future.
In simpler terms...
11. What is it?
A system provides functions that meet stated and implied needs when used under specified
conditions
How do you measure it?
● Functional completeness — CSAT, traceability matrix (requirements vs test coverage)
Functional Suitability
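To make the traceability-matrix measurement concrete, here is a minimal sketch; the requirement IDs and test names are invented for illustration:

```python
# Sketch of a requirements-to-tests traceability matrix.
# The requirement IDs and test names are illustrative only.
coverage = {
    "REQ-001 user can register":       ["test_register_happy_path", "test_register_duplicate_email"],
    "REQ-002 user can reset password": ["test_password_reset_email"],
    "REQ-003 user can export data":    [],  # no tests yet -> a completeness gap
}

def functional_completeness(matrix):
    """Fraction of stated requirements with at least one covering test."""
    covered = sum(1 for tests in matrix.values() if tests)
    return covered / len(matrix)

print(f"Functional completeness: {functional_completeness(coverage):.0%}")
```

Even a table this small makes gaps like REQ-003 visible, which is exactly the mapping-out-what-we-knew exercise described in the next slide's example.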
12. How do you improve it?
● Hire a Business Analyst and QA to define complete functional requirements
● Implement user-testing
Example
The time we reverse-engineered a product and its functional requirements while changing its
technology stack: it would have been far easier if we had mapped out what we knew.
Functional Suitability
13. What is it?
The performance relative to the amount of resources used under stated conditions
How do you measure it?
● “Core Metrics” — # Latency, # Request-Rate, # Errors
● User-centric metrics — # Time to interactive, # Time to first meaningful interaction
● Capacity — Load Testing
● Resource Utilisation — Tracing
Performance Efficiency
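The "core metrics" above can be derived from a window of request records. A minimal sketch, with invented sample data:

```python
# Sketch: deriving the "core metrics" (request rate, error rate, latency)
# from a window of request records. The sample data is illustrative only.
request_log = [
    # (duration in ms, HTTP status)
    (120, 200), (95, 200), (400, 500), (80, 200), (1500, 200),
    (110, 200), (130, 404), (90, 200), (105, 200), (3000, 503),
]
window_seconds = 60

def core_metrics(log, window):
    durations = sorted(d for d, _ in log)
    errors = sum(1 for _, status in log if status >= 500)
    p95 = durations[int(0.95 * (len(durations) - 1))]  # nearest-rank approximation
    return {
        "request_rate": len(log) / window,  # requests per second
        "error_rate": errors / len(log),    # fraction of 5xx responses
        "latency_p95_ms": p95,
    }

print(core_metrics(request_log, window_seconds))
```

In practice you would let your instrumentation stack aggregate these, but the shape of the calculation is the same.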
14. Performance Efficiency
How do you improve it?
● Choose the right technologies for the problems you are trying to solve
● Instrument “core metrics” across your stack.
● Load-test under realistic scenarios.
● Link performance improvements to business metrics (e.g. conversion)
Example
The time a website that had suffered performance problems for years turned out to have an
unnoticed, unnecessary synchronous API callout blocking every request.
15. Usability
What is it?
A system can be used by specified users to achieve specific goals with effectiveness,
efficiency and satisfaction in a specified context of use
How do you measure it?
● Accessibility — WCAG 2.1 score
● User-friendliness — # User error rates, # L1 support-tickets, CSAT
● Operability — # Production incidents
16. Usability
How do you improve it?
● Automate accessibility testing
● Regularly conduct user interviews and perform user research
● Test on real user devices
● Invest in onboarding every type of user a system or product might have
Example
The time I built multi-language support into an events website to improve accessibility for
its non-English-speaking users, but didn't train administrators or make it easy for them to
add new locales.
17. Reliability
What is it?
A component functions consistently under specified conditions for a specified period of time
How do you measure it?
● Stability — Uptime, # of reported outages
● Availability — MTBF (mean-time-between-failures: mean-time-to-failure + mean-time-to-repair)
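Using the slide's definition (MTBF = mean-time-to-failure + mean-time-to-repair), availability falls out directly. A sketch with invented figures:

```python
# Sketch: availability from the slide's definition of MTBF.
# MTBF = MTTF (mean time to failure) + MTTR (mean time to repair).
# The figures below are illustrative only.
mttf_hours = 720.0  # on average the service runs ~30 days between failures
mttr_hours = 2.0    # and takes ~2 hours to recover

mtbf_hours = mttf_hours + mttr_hours
availability = mttf_hours / mtbf_hours  # fraction of time the service is up

print(f"MTBF: {mtbf_hours}h, availability: {availability:.3%}")
```

Note that shrinking MTTR improves availability just as effectively as stretching MTTF, which is why the recovery-focused practices on the next slide matter.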
18. Reliability
How do you improve it?
● Chaos-suite: Simulate disasters and failures before they hit you
● Knowledge-sharing exercises in the technology stack
● Apply SLAs and circuit-breakers to dependencies
● Instrument applications and services
● Invest in Observability tooling
Example
The time Puppet accidentally deleted a production database and server; but because every
aspect of our AWS estate was orchestrated through Puppet, we were able to recover within
two hours.
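The circuit-breakers suggested above can be sketched roughly like this; the failure threshold and reset timeout are arbitrary illustrative values, not a production-ready implementation:

```python
import time

class CircuitBreaker:
    """Minimal circuit-breaker sketch: after `max_failures` consecutive
    failures the circuit opens and calls fail fast until `reset_after`
    seconds have passed, when one trial call is allowed through."""

    def __init__(self, max_failures=3, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None  # half-open: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0  # a success closes the circuit again
        return result
```

Wrapping each outbound dependency call in a breaker like this stops one slow or failing dependency from dragging the whole system down with it.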
19. Security
What is it?
A system protects information and data so that persons or other products or systems have
the degree of data access appropriate to their types and levels of authorization
How do you measure it?
● Risk — OWASP Risk Rating (Likelihood * Impact)
● Auditability — # of user actions not recorded/attributed to a user
● Integrity — # of areas where PII may be found
● Compliance with standards such as PCI-DSS
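The OWASP risk rating mentioned above scores likelihood and impact on 0-9 scales and combines their levels into an overall severity. A sketch following the OWASP Risk Rating Methodology; the factor scores are invented:

```python
# Sketch of the slide's "Likelihood * Impact" risk score, using the
# 0-9 scales and LOW/MEDIUM/HIGH bands from the OWASP Risk Rating
# Methodology. The factor scores below are illustrative only.
def level(score):
    """Map a 0-9 OWASP factor score to a qualitative level."""
    if score < 3:
        return "LOW"
    if score < 6:
        return "MEDIUM"
    return "HIGH"

# OWASP combines the likelihood level and impact level into an
# overall severity via a small matrix.
SEVERITY = {
    ("LOW", "LOW"): "Note",       ("LOW", "MEDIUM"): "Low",       ("LOW", "HIGH"): "Medium",
    ("MEDIUM", "LOW"): "Low",     ("MEDIUM", "MEDIUM"): "Medium", ("MEDIUM", "HIGH"): "High",
    ("HIGH", "LOW"): "Medium",    ("HIGH", "MEDIUM"): "High",     ("HIGH", "HIGH"): "Critical",
}

likelihood = 6.5  # e.g. an easy-to-exploit, widely known flaw
impact = 4.0      # e.g. limited data disclosure

print(SEVERITY[(level(likelihood), level(impact))])
```

The value of the exercise is less the number itself and more that it forces likelihood and impact to be estimated separately.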
20. How do you improve it?
● Scan dependencies for known vulnerabilities (e.g. Snyk)
● Vulnerability scanning of the entire system (e.g. automated tooling such as Qualys)
● Bug-Bounties
● Role-Based Access Control (RBAC) and Attribute-Based Access Control (ABAC)
Example
The time a Drupal website was hacked and would randomly serve Cialis adverts, but only to
search bots.
Security
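Role-Based Access Control can be sketched as little more than a mapping from roles to permissions; the role and permission names here are invented for illustration:

```python
# Minimal role-based access control sketch. The role and permission
# names are illustrative only.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "editor": {"read", "write"},
    "admin":  {"read", "write", "delete", "manage_users"},
}

def is_allowed(user_roles, permission):
    """True if any of the user's roles grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set())
               for role in user_roles)

print(is_allowed({"editor"}, "write"))   # True
print(is_allowed({"viewer"}, "delete"))  # False
```

Centralising the check in one function like this also helps the auditability measurement above: there is a single place to record who was granted what.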
21. What is it?
A system can be efficiently and effectively modified to improve it, correct it, or adapt it to
changes in environment and requirements
How do you measure it?
● Testability — % unit code coverage, % scenario coverage, % feature coverage
● Understandability — # of lines of code, average file length, adherence to conventions
● Modifiability — Cyclomatic complexity, Cohesion
Maintainability
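Cyclomatic complexity can be approximated by counting decision points. A rough sketch using Python's ast module (a real tool such as radon or lizard does this properly):

```python
import ast

# Rough sketch of measuring cyclomatic complexity: count decision
# points in a piece of code's AST. This is an approximation, not a
# replacement for a dedicated tool such as radon or lizard.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler,
                ast.BoolOp, ast.IfExp)

def cyclomatic_complexity(source):
    tree = ast.parse(source)
    return 1 + sum(isinstance(node, BRANCH_NODES)
                   for node in ast.walk(tree))

snippet = """
def classify(n):
    if n < 0:
        return "negative"
    for d in range(2, n):
        if n % d == 0:
            return "composite"
    return "prime-ish"
"""
print(cyclomatic_complexity(snippet))  # 4: one base path + three branches
```

The absolute number matters less than the trend: files whose complexity keeps climbing are the ones that become hard to modify.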
22. How do you improve it?
● Invest heavily in developer experience
● Lean on unit-testing to improve developer confidence
● Invest in code-review automation (e.g. Codacy)
● Reduce the number of caching layers and improve their utilisation
● Document your systems inside-out
● Invest in Observability tooling
Example
The time when revisiting our architectural documentation and increasing the verbosity of our
logging helped us diagnose an intermittent problem as being Redis replicating data
asynchronously.
Maintainability
23. Portability
What is it?
A system, product or component can be transferred from
one hardware, software or other operational or usage
environment to another.
How do you improve it?
Containerise
Portability & Compatibility
I’ll be honest: These are not things I think I have ever successfully solved. They are included for “completeness”.
Compatibility
What is it?
A component can exchange information with other
components, and/or perform its required functions, while
sharing the same hardware or software environment.
How do you improve it?
Namespacing, containerise
24. Practical ways of improving quality
There are also some general principles we can implement.
● Defining “Done”
● Experiment and iterate
● Test in production
25. Practical Suggestions: Define “Done”
● Being dogmatic about every principle of quality will get us nowhere fast, so
prioritise based on what is important for our users.
● Periodically revisit this definition as a team, asking:
○ Do our stakeholders and users understand how this works?
○ Do we think this has helped improve our quality?
○ Do our users feel the same way?
● Work with “quality champions” early on to define how new bodies of work will
be tested.
https://www.scrum.org/resources/blog/walking-through-definition-done
26. Practical Suggestions: Experiment and iterate!
● Get feedback from stakeholders and users early and often.
○ As quality improves, it will be easier to manage further changes upwards.
○ Team-members and developers in other areas of the business are stakeholders
too.
● Use a model of quality to measure and experiment with process and tooling,
and don’t be afraid to make mistakes.
● Link chosen metrics to tangible, easy-to-understand goals, and evaluate
success based on trends.
27. Practical Suggestions: Test in production
● Production is the only environment where we can prove things work.
● As well as testing safely out of production, make it safe to test in production.
○ Feature-flagging to decouple roll-out of features from roll-out of code, so manual
tests are possible in prod before users see a broken experience.
○ Canary releases to test with a small percentage of the user-base first.
○ Synthetic monitoring of critical features and journeys.
○ Especially on the web: enhance progressively and degrade gracefully (i.e. if
JavaScript fails, keep serving critical functionality).
● We’re already doing this every time we push to production!
https://martinfowler.com/articles/qa-in-production.html
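Feature-flagging and canary releases both reduce to deciding, per user, whether a feature is on. A minimal sketch using stable hash bucketing; the flag names and percentages are invented:

```python
import hashlib

# Sketch of percentage-based canary roll-out: hash each user id into a
# stable bucket (0-99) so the same user always sees the same variant.
# Flag names and percentages are illustrative only.
FLAGS = {
    "new-checkout": 5,  # visible to 5% of users
    "dark-mode": 100,   # fully rolled out
}

def bucket(user_id, flag):
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).hexdigest()
    return int(digest, 16) % 100

def is_enabled(flag, user_id):
    return bucket(user_id, flag) < FLAGS.get(flag, 0)

enabled = sum(is_enabled("new-checkout", f"user-{i}") for i in range(10_000))
print(f"new-checkout enabled for roughly {enabled / 100:.1f}% of 10,000 users")
```

Hashing the flag name together with the user id keeps roll-outs of different flags independent, so the same 5% of users don't receive every experiment at once.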
Hey — anyone familiar with this? One of my favourite games, Super Metroid, came out 25 years ago today. For those of you who aren't familiar, it's a side-scrolling adventure game in which you explore an alien world to stop space pirates from developing and using bio-weaponry. If that doesn't sound awesome, it's probably because I've undersold it: if you're able to, definitely give it a try.
Anyway: the game was built as a partnership between Nintendo and another company called Intelligent Systems, by a team of seventeen people called Team Deer Force. Anyone who has worked cross-organisation will know there's an awful lot that tends to go wrong, and this was no exception: there were many logistical problems along the way, but team cohesion was never one of them, and over two years they managed to build something which still holds its own today.
Why this game still stands up so well is very relevant to this topic, because it's the same reason many other Nintendo titles have generally done well over time: Nintendo championed extensive software testing as part of a thorough quality-control process.
It was actually this kind of process which helped them claw back the video-games market after an oversaturation of low-quality software for the Atari 2600 platform nearly killed the entire industry in 1983.
Specifically, remember E.T.? The game Atari published, which sold so terribly that Atari then found themselves paying for their stock to be buried in landfill? I'm not going to show you what that game played like, but it isn't pretty, and it's a good example of what happens when we skip any semblance of a quality process.
At the other extreme is the quality process which led to titles like Super Metroid. I've found it rather hard to find out what their quality-control process actually looked like (it turns out this is a closely guarded secret, like KFC's blend of herbs and spices), but it is possible to make guesses by looking at the general state of software testing and quality control in Japan at the time: a blend of Japanese "kaizen" (the process of continuous improvement) and something called Total Quality Management, which I'm not going to talk about today as it's very heavy. It would have been rigorous to the point where everybody in this room would be complaining about it holding us back!
I love this example because on the one hand there are these exceptional games which captured audiences, and on the other there are these truly awful experiences which scared away their users from ever using a games console again. Kind of like what happens when a user tries using a buggy cart to buy a book, and ends up abandoning that particular store never to return.
So with that out of the way, who am I and why should you listen to me talk about this weird thing called quality?
My name is Alex, and I'm a self-diagnosed full-stack software person. Primarily I have either built, or have led teams building, consumer-facing software, typically on the safe, easy-to-support land of the web, where nothing ever randomly goes wrong.
I’m just about getting a grip on this funny little notion of quality and wanted to share some thoughts.
After this …
Hopefully you'll take away why quality problems can't be pigeon-holed, and why they can only be owned by an entire team.
You'll also have a few distinct ways of looking at and measuring quality throughout your entire delivery pipeline.
And, once you're able to measure quality throughout your pipeline, I hope to leave you with a few ideas about how to improve it.
Starting with basics, though: What is Quality?
There have been a few!
I've picked out a few quotes here from William E. Deming, Armand V. Feigenbaum and Joseph M. Juran, three of the "founding fathers" of quality. Or at least three of the people who are credited as such by the Six Sigma school of quality techniques developed at Motorola in the 1980s.
William E Deming says: The difficulty in defining quality is to translate future needs of the user into measurable characteristics, so that a product can be designed and turned out to give satisfaction at a price that the user will pay. This is not easy, and as soon as one feels fairly successful in the endeavor, he finds that the needs of the consumer have changed, competitors have moved in, etc.
Armand V Feigenbaum, who is credited with TQM, says: Quality is a customer determination, not an engineer's determination, not a marketing determination, nor a general management determination. It is based on the customer's actual experience with the product or service, measured against his or her requirements — stated or unstated, conscious or merely sensed, technically operational or entirely subjective — and always representing a moving target in a competitive market.
Joseph M Juran says: The word quality has multiple meanings. Two of these meanings dominate the use of the word: 1. Quality consists of those product features which meet the need of customers and thereby provide product satisfaction.
2. Quality consists of freedom from deficiencies. Nevertheless, in a handbook such as this it is convenient to standardize on a short definition of the word quality as "fitness for use".
I’ve often been told off for being too reductive: Unfortunately that’s a lot of this talk.
The shorter version of what these people have said is that a product can be said to be of high quality when it is fit for purpose and serves a user need, when it is free from deficiencies, and when it can easily meet the needs of the future.
But this is still a little bit abstract. I think we can do a bit better with our definition of quality.
Turns out that there is a well-crafted standard addressing this: the ISO 25000 series, known as SQuaRE (Systems and software Quality Requirements and Evaluation).
- Its model of quality (defined in ISO 25010) is a set of high-level characteristics, each broken down into measurable sub-characteristics.
- Functional Suitability
- Performance efficiency
- Usability
- Reliability
- Security
- Maintainability
- Portability
- Compatibility
As you can see, there’s a lot here.
There are two extremes that people tend to drift towards: either having a pool of QA who act as a gate for all work, and who are given sole responsibility for making sure that a system actually works well, usually using a mixture of automated and manual testing.
Or, not having a pool of QA at all and sort of winging it, because automated testing owned by the engineering team is considered to be enough.
I think both methodologies have merits and a legitimate place!
As the OWASP Risk Rating Methodology puts it: the tester needs to gather information about the threat agent involved, the attack that will be used, the vulnerability involved, and the impact of a successful exploit on the business.
Portability and compatibility are two quality concerns that I don't feel I've successfully addressed. There's a lot to them, but unless you are extremely risk-averse the overall payoff doesn't seem high.
In the past I have definitely over-prioritised these areas, and it resulted in a lot of wasted time, although I'm definitely interested in hearing other use-cases people are seeing for this.
“Portability” definitely makes it easier to test and reproduce problems and other error states.
Largely I've taken the same solution to both, and that is to focus on enforcing Heroku's twelve-factor app principles. Containerising using a technology like rkt or Docker will generally achieve this.
Now unfortunately this isn't an option that was available to people making video-games 20 years ago, but companies today do still practise feature-flagging and data-driven decision-making in prod.