DockerCon SF 2019 - TDD is Dead

KEVIN CRAWLEY
DevRel, Instana
TDD is Dead

There are no wrong answers
Audience Polling

I trust my staging
environment?

Integration tests
are too slow?

Testing is the
responsibility of
the QA team?

Monitoring is the
responsibility of
the Ops team?

Our applications
are fully
instrumented?

Just kidding, there are wrong answers
• Why obsessing over testing is an anti-pattern
• How observability (o11y) can empower your
organization
• Observability Case Study: Single Music

Code / Test
Coverage
Dashboard
(Sonar)

How much comfort
does test coverage
give you once your
code hits production?
Shouldn’t 97% coverage help us sleep at night?

This is your code running in staging

…and this is your code running in prod

Studies have analyzed the effects of TDD
http://softwareprocess.es/pubs/borle2017EMSE-TDD.pdf
http://www.sserg.org/publications/uploads/04b700e040d0cac8681ba3d039be87a56020dd41.pdf

• You’re abstracting /
coupling / adding more
logic / code
• It slows down velocity
• It reduces productivity
Unit testing is like putting your code in a tar pit

• If the temperature reaches critical, we should
insert these rods into the Nuclear Reactor
• If the sensors on this aircraft don’t match, our
software shouldn’t crash the plane
• Open Source libraries which other
organizations rely upon
What should we be testing?
Critical Systems Code / Pathways

• Is the user able to log in / log out
• Can the user heart their friends'
avocado toast
• Fence-Post errors, CRUD
actions, etc.
What shouldn’t we be testing?
Instead, use observability to ensure functionality

• How often you break production and how long
it takes you to fix it (MTTR)
• The responsiveness of your system and its
endpoints
• How long it takes to put a change request into
production
Test coverage is a vanity metric
Instead consider tracking these metrics:
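As a rough illustration of the metrics listed on this slide, MTTR and change lead time are both just the average gap between two timestamps. The sketch below assumes incidents and changes are available as (start, end) pairs; the function name and the sample timestamps are made up for the example.

```python
from datetime import datetime, timedelta
from statistics import mean


def mean_duration(pairs):
    """Average gap between the start and end timestamp of each pair."""
    seconds = [(end - start).total_seconds() for start, end in pairs]
    return timedelta(seconds=mean(seconds))


# Hypothetical sample data: (detected, resolved) for two production incidents.
incidents = [
    (datetime(2019, 4, 1, 10, 0), datetime(2019, 4, 1, 10, 30)),
    (datetime(2019, 4, 8, 14, 0), datetime(2019, 4, 8, 15, 30)),
]

# Hypothetical sample data: (change requested, change live in production).
changes = [
    (datetime(2019, 4, 2, 9, 0), datetime(2019, 4, 2, 13, 0)),
    (datetime(2019, 4, 3, 9, 0), datetime(2019, 4, 3, 11, 0)),
]

mttr = mean_duration(incidents)
lead_time = mean_duration(changes)
print("MTTR:", mttr)
print("Lead time:", lead_time)
```

The same helper covers both metrics because only the meaning of the two timestamps differs.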

What about Integration Testing?

“Everybody has a testing environment.
Some people are lucky enough
to have a totally separate environment to
run production in.” – ???
When production is REALLY broken
How many of you wait for integration tests to
pass before pushing to production?

There is no such thing as a bug-free
system, choose your adventure:
• Your users see the bugs and you
already know about it
• You wait until they tell you about the
bugs (on Twitter)
You’re already testing in production
(Whether you like it or not)

• Slow Rollouts / Deployments
• Observe performance / error rates on
a small number of deployments and
increase over time (5% -> 10% -> 25%
-> 50% -> 100%)
Utilize Canary Deployments
They will enable you to effectively “test” in prod
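The staged rollout described on this slide can be sketched as a loop that widens traffic only while the observed error rate stays under a threshold. This is a minimal sketch, not a real deployment tool: `error_rate_at` is a hypothetical callback standing in for "shift this share of traffic, wait, then query your monitoring system", and the 1% threshold is an assumed value.

```python
STAGES = [5, 10, 25, 50, 100]  # percentages taken from the slide


def run_canary(error_rate_at, max_error_rate=0.01, stages=STAGES):
    """Walk through the rollout stages.

    Returns (last_healthy_percent, completed). If a stage exceeds the
    error threshold, the rollout stops and reports the last stage that
    was still healthy so the caller can roll back to it.
    """
    last_healthy = 0
    for percent in stages:
        observed = error_rate_at(percent)  # hypothetical monitoring query
        if observed > max_error_rate:
            return last_healthy, False
        last_healthy = percent
    return last_healthy, True


# Hypothetical scenario: errors spike once 25% of traffic hits the new version.
def spike_at_25(percent):
    return 0.002 if percent < 25 else 0.08


print(run_canary(lambda percent: 0.002))
print(run_canary(spike_at_25))
```

The decision to continue is driven by production signals rather than by a pre-release test suite, which is the sense in which a canary lets you "test" in prod.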

Behaviors & I/O
• Number of retries / back-offs
• Request Parameters / Query Statements / Response
• Falling back to a default
• Top-Level Exceptions
How do we measure the internals
of an application service?
We must ask questions and emit signals from
within our applications – control theory
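To make "emit signals from within our applications" concrete, here is a minimal sketch of a call wrapper that reports the behaviors listed on this slide: retries, back-offs, failures, and falling back to a default. The event names, the JSON-lines output, and the function names are assumptions for illustration, not the API of any particular vendor or library.

```python
import json
import time


def emit(event, **fields):
    """Write one structured event as a JSON line (stand-in for a real exporter)."""
    print(json.dumps({"event": event, "ts": time.time(), **fields}))


def call_with_retries(operation, default, attempts=3, base_delay=0.1,
                      sleep=time.sleep, emit=emit):
    """Run operation, emitting a signal for every retry, back-off and fallback."""
    for attempt in range(1, attempts + 1):
        try:
            result = operation()
            emit("call.succeeded", attempt=attempt)
            return result
        except Exception as exc:  # top-level exception: record it, don't hide it
            emit("call.failed", attempt=attempt, error=repr(exc))
            if attempt < attempts:
                delay = base_delay * 2 ** (attempt - 1)  # exponential back-off
                emit("call.backoff", attempt=attempt, delay_s=delay)
                sleep(delay)
    emit("call.fallback", attempts=attempts)
    return default


# Hypothetical flaky dependency: fails once, then succeeds.
calls = {"count": 0}


def flaky():
    calls["count"] += 1
    if calls["count"] < 2:
        raise TimeoutError("upstream timed out")
    return "fresh value"


print(call_with_retries(flaky, default="cached value", sleep=lambda s: None))
```

Passing `sleep` and `emit` as parameters keeps the sketch easy to exercise without real delays or a real telemetry backend.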

“A system is observable if the behavior of the
entire system can be determined by only looking
at its inputs and outputs.”
Lesson: control theory is a well-documented
approach that we can learn from rather than
trying to reinvent it
What is Observability?
Kálmán, 1961 paper
on the general theory of control systems

• Not just tooling
• Similar to how DevOps is a
mindset
• No longer treating services
like Schrödinger's cat
• Rich context around events
Observability
What does that word mean?
• Monitoring
• Instrumentation
• Structured Logging (tracing)
• Alerting
• Dashboards

• Also known as Distributed Structured Logging
• Much Larger Payloads
• Rich Context (Parameters, Query Strings, Response
Codes, etc)
https://w3c.github.io/trace-context
Distributed Tracing
It’s not as complicated as you think
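The W3C Trace Context draft linked on this slide propagates a trace between services in a `traceparent` header of the form `version-traceid-parentid-flags`. The sketch below handles only the version `00` layout and skips the companion `tracestate` header; the function names are made up for the example, and marking a brand-new trace as sampled is an assumption.

```python
import re
import secrets

# version "00", 32 hex chars of trace id, 16 hex chars of parent id, 2 hex chars of flags
TRACEPARENT_RE = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(header):
    """Return the header's fields as a dict, or None if it is not valid."""
    match = TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    trace_id, parent_id, flags = match.groups()
    if trace_id == "0" * 32 or parent_id == "0" * 16:
        return None  # all-zero ids are invalid
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "flags": flags,
        "sampled": bool(int(flags, 16) & 0x01),
    }


def child_traceparent(incoming):
    """Header for an outgoing call: same trace id, freshly generated span id."""
    parent = parse_traceparent(incoming) if incoming else None
    if parent is None:
        trace_id, flags = secrets.token_hex(16), "01"  # start a new trace (assumed sampled)
    else:
        trace_id, flags = parent["trace_id"], parent["flags"]
    return f"00-{trace_id}-{secrets.token_hex(8)}-{flags}"


incoming = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
print(parse_traceparent(incoming))
print(child_traceparent(incoming))
```

Because every service forwards the same trace id while minting its own span id, a backend can stitch the individual records into one end-to-end trace.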

Let’s take a few minutes to see
some of the problems we’ve solved
with Distributed Tracing at a
company I helped build called Single
Music
How does distributed tracing give
me more observability?

• Operated by 3 engineers (1 FE/1 BE/1 SRE)
• Over 20k transactions / hour, 20+ integrations,
50k LOC, with less than 15% test coverage
• Launched in 2018 with 15 microservices on
Docker Swarm – has since expanded to over 28
microservices with zero additional engineering
personnel

• We enable powerful insights into our
production applications
• Dependency mapping becomes trivial
• SREs and Engineers can track golden signals
for EVERY operational perspective on their
apps
When we begin emitting signals
from every transaction
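Once every transaction emits a record, the golden signals fall out of simple aggregation. The sketch below derives traffic, error rate and 95th-percentile latency from a list of per-request records; the field names (`duration_ms`, `error`) and the sample data are assumptions, and saturation is left out because it comes from resource metrics rather than from traces.

```python
def golden_signals(spans, window_seconds):
    """Summarize traffic, errors and latency for spans seen in one time window."""
    if not spans:
        return {"traffic_rps": 0.0, "error_rate": 0.0, "latency_p95_ms": None}
    durations = sorted(span["duration_ms"] for span in spans)
    errors = sum(1 for span in spans if span["error"])
    # Nearest-rank percentile: smallest value with at least 95% of samples at or below it.
    rank = (95 * len(durations) + 99) // 100
    return {
        "traffic_rps": len(spans) / window_seconds,
        "error_rate": errors / len(spans),
        "latency_p95_ms": durations[rank - 1],
    }


# Hypothetical sample: 20 requests in a 10 second window, two of them failed.
sample_spans = [
    {"duration_ms": 10 * (i + 1), "error": i in (3, 17)}
    for i in range(20)
]

print(golden_signals(sample_spans, window_seconds=10))
```

Grouping the same records by service, endpoint or customer before aggregating gives the per-perspective view the slide refers to.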

Don’t like the dashboards your
vendor provides? Then you should be
able to use the leading open-source
solution for building your own.
You should be able to build your
own dashboards

Ed Keyes
Site Reliability Engineer – Google // 2008
“Sufficiently advanced
monitoring is indistinguishable
from testing …”

“I think we’ll stick with the old way
of doing things, we need more
test coverage.”

“I think our organization could
really benefit from more
observability”

Observability Workshop w/ Jaeger and Prometheus
Located in Workshop Room 2018
Today, 2-4pm
Tomorrow, 11:30-1:30pm
Want to learn more?
Instana Booth #S23

Rate & Share
Rate this session in the DockerCon App
Follow me @notsureifkevin
Win this droid! You can find the link on
my most recent Tweet!
