2. Usefully Wrong
“All models are wrong. Some models are useful.”
“... the practical question is: How wrong do they
have to be to not be useful?”
George E. P. Box (statistician) “Empirical Model-Building”
4. Path Dependence
● “good enough to be useful” -> ship it
● The decisions we make leave their mark on
the software we ship
● These marks remain long after the scope of
the software expands to other use cases
5. What is “Good Enough”?
● Depends on your priorities and resources
– What are you building?
– Why are you building it?
– Who are you building it for?
– Who is building it?
– What are you building it with?
– How much risk can you tolerate?
6. Context Matters
● Building an intranet web service
– Trusted network
– Enforced user base
● Building a web startup
– Hostile network
– Business lives or dies by user choice
● Building hardware control and management systems
– Usage driven by hardware
– Software as a necessary evil
8. Functionality
● Doing one (or a few) things well is often better
than doing a lot of things badly
● Adding functionality later is usually easier to
sell than taking it away (no matter how broken
it turns out to be)
9. Flexibility
● Don't make things configurable
● Configurability = testing and maintenance pain
● Do separate concerns (if you make it configurable
later, only one place needs to change)
● Do use flexible support tools
– SQLAlchemy makes it easy to change database
– Django locks in some major decisions (like ORM and
templating language) but provides a rich ecosystem of
prebuilt components that work well together
10. Security
● A lot of software is still insecure by default
– Unhashed (or poorly hashed) passwords
– Unencrypted communications channels
● Multiple layers of defence can hide this
● Try to make the “easy option” and the “secure
option” one and the same
● Can be very hard to fix poor security choices
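As one illustration of making the easy option the secure option for the "unhashed passwords" bullet, here is a minimal sketch using only the Python standard library. The function names, the iteration count and the salt size are assumptions chosen for illustration, not recommendations from the talk.

```python
import hashlib
import hmac
import os

# Assumed parameters for illustration only; tune them for your own hardware.
ITERATIONS = 100_000
SALT_BYTES = 16

def hash_password(password):
    """Return (salt, derived_key) to store in place of the raw password."""
    salt = os.urandom(SALT_BYTES)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key

def verify_password(password, salt, expected_key):
    """Re-derive the key from the candidate password and compare in constant time."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(key, expected_key)
```

If every caller goes through a helper like `hash_password`, storing a raw password takes more effort than storing a salted, stretched one.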
11. Reinventing Wheels
● Reuse means dependency management
● Often simpler to roll your own to start
● With good modularity, easy to replace later
● Watch for increasing complexity
12. Documentation
● How sophisticated are users expected to be?
– Installed by developers? Admins? End users?
– Intended for domain experts only?
● Is it stable enough to document?
● Documentation can highlight design flaws
13. Test Quality
● Fine grained tests pinpoint failures easily
● Coarse grained tests are often easier to write
● Can easily start with coarse grained tests, then add more
fine grained tests to narrow down failures
● Slow tests are better than no tests
● External dependencies are better than no tests
● Regression tests are great, but don't let them block fixes
for problems that can't be reproduced reliably
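A small sketch of the coarse grained versus fine grained distinction, using `unittest`. The helpers `normalise_path` and `build_rsync_url` are hypothetical and exist only to give the tests something to exercise.

```python
import unittest

def normalise_path(path):
    """Hypothetical helper: make a path start and end with exactly one slash."""
    stripped = path.strip("/")
    return "/" + stripped + "/" if stripped else "/"

def build_rsync_url(server, path):
    """Hypothetical helper: combine a server name and a path into an rsync URL."""
    return "rsync://" + server + normalise_path(path)

class CoarseGrainedTest(unittest.TestCase):
    # Quick to write and covers both helpers, but a failure here
    # does not say which helper is at fault.
    def test_url_end_to_end(self):
        self.assertEqual(
            build_rsync_url("localhost", "demo/simple"),
            "rsync://localhost/demo/simple/",
        )

class FineGrainedTests(unittest.TestCase):
    # Added later to narrow a failure down to one specific behaviour.
    def test_adds_missing_slashes(self):
        self.assertEqual(normalise_path("demo/simple"), "/demo/simple/")

    def test_collapses_extra_slashes(self):
        self.assertEqual(normalise_path("//demo/simple//"), "/demo/simple/")

    def test_empty_path_is_root(self):
        self.assertEqual(normalise_path(""), "/")
```

Saved as a module, these can be run with `python -m unittest`. The coarse test could ship on day one, with the fine grained ones added the first time it fails for an unclear reason.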
14. Code Reviews
● Code is written to:
– Tell the computer what to do
– Tell future maintainers what it does
● Tests cover the first, reviews the second
● Debatable value for small teams
● Highly valuable for large teams
● Needs appropriate tools
15. Performance & Scalability
● Don't stress about it if you don't need to
● Start with measurement infrastructure
● If simple is fast enough, stick with simple
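"Measurement infrastructure" can start very small. The sketch below is one possible shape for it: a decorator that records wall-clock timings per function. The names `timed`, `timings`, `report` and `load_jobs` are hypothetical.

```python
import time
from collections import defaultdict
from functools import wraps

# Function name -> list of elapsed times in seconds
timings = defaultdict(list)

def timed(func):
    """Record how long each call to func takes, even if it raises."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            timings[func.__name__].append(time.perf_counter() - start)
    return wrapper

def report():
    """Return {function name: (call count, mean seconds per call)}."""
    return {
        name: (len(samples), sum(samples) / len(samples))
        for name, samples in timings.items()
    }

@timed
def load_jobs(count):
    # Stand-in for real work
    return [{"job": n} for n in range(count)]
```

With something like this in place from the start, "is simple fast enough?" can be answered from recorded numbers rather than a guess.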
16. Reliability
● Not all software is mission critical
● Pay attention to failure modes
● Error quality matters
17. Usability
● Humans are still a lot smarter than computers
● If users have no choice, they'll usually cope
● Hence, awful UX in most “enterprise” software
18. Maintainability & Business Risks
● The Bus Factor
– Most startups = 1
– Large companies want it to be higher
● Developer docs (including comments)
● Legal risks (copyrights, patents, trademarks)
19. Automation
● Critical to speeding up release cycles
● Is a process stable enough to automate?
21. Exit Strategies
● Know what you're not doing
● Have a vague idea how to fix it when needed
● Actual fixes will depend on future needs
● Sometimes, the only right answer is “No”
22. Patterns and Processes
● Keep your options open
● Minimise current complexity
● This is not easy
– Software architecture and design patterns
– Software processes and methodologies
● “interim” solutions may last a long time
● If you don't have a test suite, start there
23. Prototyping vs Implementation
● Two very different modes of development
● Prototyping
– Exploration
– Trying to figure out what is feasible
● Implementation
– Already known to be feasible
– Making it happen to a known specification
● Big difference in priorities!
24. Social Implications
● Design decisions are context dependent
● Easy to criticise in hindsight
● Design trade-offs can influence community
● Actually getting better at building software
● Ambitions are (more than?) keeping pace
26. An Innocent Start
● PulpDist: Mirroring network based on rsync
● Simple job definitions
{
"remote_server": "localhost",
"remote_path": "/demo/simple/",
"local_path": "/var/www/pub/sync_demo_raw/",
...
}
● Simple custom validator for JSON data
– Checks on individual values
– Overall sanity checks on full jobs
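A minimal sketch of what such a validator might look like, written as generators that yield readable error messages. Only the three fields shown in the job definition above are checked. The specific rules (paths must start and end with a slash, a job must not sync a directory onto itself) are assumptions for illustration, not PulpDist's actual rules.

```python
def check_fields(job):
    """Yield an error message for each individual value that looks wrong."""
    for field in ("remote_server", "remote_path", "local_path"):
        value = job.get(field)
        if not isinstance(value, str) or not value:
            yield f"{field}: expected a non-empty string, got {value!r}"
        elif field.endswith("_path") and not (
            value.startswith("/") and value.endswith("/")
        ):
            # Assumed rule, for illustration
            yield f"{field}: {value!r} must start and end with '/'"

def check_job(job):
    """Yield error messages from sanity checks on the job as a whole."""
    # Assumed rule, for illustration
    if (
        job.get("remote_server") == "localhost"
        and job.get("remote_path") == job.get("local_path")
    ):
        yield "remote_path and local_path are the same directory on localhost"

def validate(job):
    """Return every problem found; an empty list means the job passed."""
    return list(check_fields(job)) + list(check_job(job))
```

Each check looks at one job in isolation, which is what stops working once jobs start sharing identifiers.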
27. Don't Repeat Yourself
● Simple format turned out to be too simple
– Hard to modify given multiple jobs from same source
● Enhanced format with reusable elements
{
"mirror_id": "local_copy",
"tree_id": "simple_sync",
"site_id": "bne",
...
}
● Simple validator was no longer adequate
28. What To Do?
● Upgrade the existing validator
– Possible, but tedious to test properly
– Not a good wheel to reinvent
● JSON validation library
– Research would be starting from scratch
– Hard to assess quality quickly
● Relational database
– Enforces the constraints by its very nature
– Error quality would likely be poor
29. Two Birds...
● For validation, I needed to:
– Ensure identifiers were unique
– Ensure cross references were valid
● For UI purposes I also needed:
– To filter by component identifiers
– To sort by various fields
● Sound familiar?
31. ...One Stone
● An in-memory SQLite database was perfect
● But writing SQL by hand is still horrible
● SQLAlchemy was already in the target environment
● Problem solved!
– Config loaded into DB after simple field validation
– If the DB accepts it, references are also valid
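The idea can be sketched as follows. This version uses the standard library `sqlite3` module with hand-written SQL rather than SQLAlchemy, purely so the example has no dependencies. The table layout and the shape of the `config` dictionary are assumptions, loosely based on the identifiers in the enhanced format.

```python
import sqlite3

# Hypothetical schema: primary keys give uniqueness checks,
# REFERENCES clauses give cross-reference checks.
SCHEMA = """
CREATE TABLE sites (site_id TEXT PRIMARY KEY);
CREATE TABLE trees (tree_id TEXT PRIMARY KEY);
CREATE TABLE mirrors (
    mirror_id TEXT NOT NULL,
    site_id   TEXT NOT NULL REFERENCES sites(site_id),
    tree_id   TEXT NOT NULL REFERENCES trees(tree_id),
    PRIMARY KEY (mirror_id, site_id)
);
"""

def load_config(config):
    """Load a config dict into a fresh in-memory database.

    Raises sqlite3.IntegrityError if an identifier is duplicated or a
    cross reference points at something that does not exist.
    """
    db = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when asked to
    db.execute("PRAGMA foreign_keys = ON")
    db.executescript(SCHEMA)
    with db:  # commit on success, roll back on error
        db.executemany("INSERT INTO sites VALUES (?)", [(s,) for s in config["sites"]])
        db.executemany("INSERT INTO trees VALUES (?)", [(t,) for t in config["trees"]])
        db.executemany(
            "INSERT INTO mirrors VALUES (:mirror_id, :site_id, :tree_id)",
            config["mirrors"],
        )
    return db
```

Once loaded, the same database serves the UI needs: filtering is a `WHERE` clause and sorting is an `ORDER BY`, for example `db.execute("SELECT mirror_id FROM mirrors WHERE site_id = ? ORDER BY mirror_id", ("bne",))`.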
32. How Does The Story End?
● Still some very rough edges
– SQLite error messages are quite user hostile
– Schema changes are triple-keyed
● Future changes?
– Master in database, JSON only as export?
– Improved error messages?
– Switch to an actual schema engine?
● Other priorities!

General Philosophy
==================

The statistician George E.P. Box once wrote: "All models are wrong. Some models are useful."

Earlier in the same book (Empirical Model-Building) he wrote: "... the practical question is how wrong do they have to be to not be useful."

Path Dependence
===============

The need to ship means that software is never perfect, only "good enough to be useful". "Path dependence" is the history of how developers have chosen to be wrong (and right!) as embodied in the interface and implementation of their software. Any long lived piece of software will always show symptoms of its origins, even when its scope expands beyond the original use case.

When our goal is "release early, release often", we need some idea of how wrong our software can be while still remaining useful.
- know your priorities
  - what you are building
  - why you are building it
- what you can get away with skipping
  - user sophistication
  - target environment
  - development team size and trajectory
  - developer skill level and experience
- what you can do easily
  - default language and framework choices
  - existing continuous integration infrastructure
  - existing documentation tools
  - existing UI resources
- risk management
- the trade-offs change depending on what you're building
  - intranet web service
    - no risk of explosive user growth
    - usage mandated by corporate policy
    - will likely need to integrate with existing infrastructure
  - web-based startup
    - may need to cope with explosive user growth
    - usability is critical to business success
    - may choose to support existing identity platforms, or create their own

A lightning tour of ways to be wrong
====================================

Humans are actually pretty bad at developing software. Yet, the internet works, most planes don't fall out of the sky and NASA can land a rover weighing nearly a ton on Mars.

It turns out there are plenty of ways to be wrong and still ship useful software. Many of these topics have entire conferences devoted to them, so this really is a lightning tour.
- functionality
  - better to do one thing really well than multiple things badly
  - implementing a quarter of your desired features is more useful than half-implementing everything
- flexibility
  - it's much easier to *add* to software than it is to take things away
  - every configurable setting makes your software harder to maintain and harder to test
  - if you make the wrong things flexible, you're stuck with maintaining both that *and* the flexibility you add later when you need it
  - make things inflexible unless there is a compelling reason to make them flexible
    - you know you need the flexibility (e.g. cross platform development)
    - the easy way is also the flexible way (e.g. SQLAlchemy, web frameworks)
  - "make this configurable" is a *wonderful* thing to postpone to future iterations
  - even with flexible infrastructure, don't feel compelled to expose that flexibility immediately
  - limiting flexibility can pay off in other ways
    - targeting a specific platform lets you rely on features of that platform
    - using implementation specific features can avoid rough edges in various standards and protocols
- security
  - many tools are still insecure by default (e.g. authenticated web apps that allow connections over HTTP)
  - security *can* be added later, but it's difficult if you don't code with security in mind from the start
  - it is better to use tools that are "secure by default" (e.g. languages with automatic memory management, web frameworks that avoid common attacks in their default configuration, asynchronous web servers that resist slow loris attacks, SELinux)
- reinventing wheels
  - reuse isn't free, as it brings with it implications of dependency management
    - Do you monitor your upstream dependencies for security bugs?
    - What do you do if upstream releases a backwards incompatible update?
    - Do you maintain patches against upstream? Monkeypatch? Submit bug fixes or feature requests?
- documentation
  - installation docs? user docs? configuration docs?
  - stability of the thing being documented matters
  - documentation can highlight broken designs
  - make sure those responsible for making changes are also responsible for ensuring docs are updated appropriately
  - decide on a level of assumed knowledge (especially domain knowledge)
- test quality
  - unit tests pinpoint failures directly. Great for getting coverage of low level code with more features than are exposed by higher layers, but can be time consuming to write (especially for APIs that will be changing soon)
  - scenario tests are easier to write, but harder to debug when they fail. Great for when internal details are still in flux.
  - being able to select specific tests is better than having to run large batches at once
  - slow tests are better than no tests
  - manual infrastructure setup (e.g. starting a server)
  - developers running tests vs continuous integration
  - there are *many* nice things to have in a test suite, but you can get by without most of them for quite a while
  - sometimes you know how to fix a problem, but can't create a reliable automated test.
- code reviews
  - like automated tests, code reviews are a technique that helps with a *lot* of things
    - code quality
    - knowledge sharing
    - picking up on code that needs more comments
  - pre-commit reviews (review tools)
  - post-commit reviews (commit notifications)
  - with small development teams, skipping reviews is very common.
  - as a team gets larger, reviews become essential for knowledge sharing and maintaining readability
- performance and scalability
  - if it runs acceptably on available hardware, don't stress about it.
  - if performance becomes an issue, measure rather than guess
  - if your internal app maxes out at 5 users, supporting 100 might be worth it, supporting thousands probably means something somewhere is overengineered
  - good tools can let you fix this later (e.g. use SQLAlchemy to swap SQLite for PostgreSQL, or Django's caching framework to add caching)
- reliability
  - not all software is mission critical
    - "if it crashes, just restart it" is actually tolerable for some scripts and services
  - failure modes
  - quality of error messages
- usability
  - humans are significantly more adaptable than computers
  - users may not have any alternative with comparable features
  - yes, this is why "enterprise" software is almost universally awful from a UI point of view: functionality trumps usability when the people doing the evaluation are looking for ticks in a feature matrix rather than using your software themselves
- maintainability and business risks
  - the bus factor: how many individuals have to get hit by a bus before your business is going to take a serious financial hit?
    - Startups will often have a bus factor of 1.
    - larger organisations understandably try to avoid that through documentation and knowledge sharing.
  - Are your developer "Getting Started" docs up to scratch?
  - how well commented is the code (especially any evil hacks!)
  - Are there tracker issues for known hacks that need to be replaced with proper solutions?
  - how are your licensing review processes?
- automation
  - good automation is critical to speeding up release cycles
  - speeding up release cycles allows you to make software useful faster
  - a process that is still in flux may be better handled manually until it stabilises

Managing Path Dependence
========================

Those are just a few of the many ways we can make our software and processes worse (or avoid making them better) in order to ship products. So what can be done to avoid the software deteriorating into an unmaintainable mess over time?
- have exit strategies in mind
  - when you make a conscious decision to defer dealing with a particular problem you can at least try to make sure you keep your options open
  - documenting known problems helps
  - keep an eye on your priorities, and adapt your plans accordingly
  - sometimes you just have to say "No, adapting this software to handle that use case isn't the cost effective option"
- general techniques
  - much of what you will hear about design patterns and recommended development processes is about tools for keeping your options open without making current software too complex.
    - suitable complexity (i.e. the simplest thing that could possibly work)
    - layered architectures
    - low coupling between components (ideally, be able to swap out an entire component without the rest of the system noticing)
    - using standard protocols even between your own components is a great way to reduce coupling
    - comprehensive automated testing greatly increases the scope of problems that can be fixed, sometimes extending to fundamental architectural changes
  - Excessive Design Up Front can lead to unwieldy and hard to maintain software, but No Design Up Front can mean complete inability to meet changing requirements without rewriting from scratch
- prototyping vs implementation
  - prototyping is about finding out what is possible
  - implementation is about achieving specific functionality that is known to be possible
  - knowing which mode you're in is critical, as it fundamentally changes many of the trade-offs (e.g. test driven development is great for implementation, less useful for prototyping when you *don't know yet* how the API is going to work or the scope of what the software is going to do)
- social consequences
  - don't be too harsh on design decisions that were made years ago
    - what mattered then was probably very different from what matters now
  - all appearances to the contrary, we humans actually *are* getting better at developing software, as more of our collected wisdom starts moving into the tools we use, including:
    - programming languages and software libraries
    - source control systems
    - continuous integration systems
    - document publishing systems
    - automated deployment systems
  - different pieces of software will make different trade-offs, yet may all still qualify as "good enough" for many different purposes

Now, about that subtitle... how on Earth did I end up using SQLAlchemy as a JSON validator?

- the status quo
  - PulpDist has a "JSON validation engine" that works by reading in the JSON file and using SQLAlchemy to dump the contents into an in-memory SQLite database
  - if that violates a uniqueness constraint or a foreign key constraint, the file fails validation
- the starting point
  - a very basic config format for rsync jobs
    - source url
    - dest url
    - filters
    - a few boolean and numeric options
    - a few descriptive text fields
  - easy to validate just by checking each field in isolation, with a couple of consistency checks for the job as a whole
  - handled with a few iterators that threw helpful error messages when they spotted a violation
- trouble in paradise!
  - incredibly tedious to hand derive the details for each job
  - revised JSON configuration format
    - richer data model
    - builds job definitions up from shared components
    - much easier to work with
    - *but* required robust uniqueness and cross-reference validation for IDs that the simple validator can't handle
- considered options
  - upgrade existing validator
    - validation isn't *that* hard
    - testing it thoroughly is time consuming
  - adopt a third party JSON validation library
    - would be starting research from scratch
    - probably a good option longer term
  - dump the config into an in-memory SQLite database
    - definitely robust
    - also provides efficient sorting and filtering
    - not good from a usability point of view (foreign key constraint error messages are fairly horrible)
    - I'd forgotten what a pain it was to write my own SQL
- realised SQLAlchemy was already available in the target environment
  - no new dependency
  - less painful than writing SQL by hand
  - was planning to add it eventually anyway, since the config file format is intended to one day be an export format, with the master configuration in the database
- status quo
  - config files are run through the basic validator and then dumped into the in-memory database for constraint checking, filtering and sorting
  - however:
    - any config which makes it past initial validation is known to be good
    - I didn't spend a lot of time reinventing (and testing!) a custom solution to a previously solved problem
  - changes to config file format require entry in multiple places
    - JSON validator
    - SQLAlchemy schema definition
    - test scenarios
  - error messages on failed validation are hard to interpret
- the future
  - not sure, it depends on the requirements placed on future versions
  - may improve the error messages
  - may switch to generated config files so hand editing becomes the exception rather than the norm
  - may adopt an external JSON validator
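
One possible direction for better error messages, sketched here with the standard library `sqlite3` module rather than SQLAlchemy: load the data with foreign key enforcement switched off, then ask SQLite which rows are broken and describe each one. The function name and the two-table schema are hypothetical, and the sketch assumes single-column foreign keys on ordinary rowid tables.

```python
import sqlite3

def dangling_references(db):
    """Describe every broken foreign key in an already-loaded database."""
    messages = []
    # Each row: (child table, rowid of offending row, parent table, FK index)
    violations = db.execute("PRAGMA foreign_key_check").fetchall()
    for table, rowid, parent, fk_index in violations:
        # foreign_key_list rows start with (id, seq, parent table, child column)
        fk_rows = db.execute(f"PRAGMA foreign_key_list({table})").fetchall()
        column = next(row[3] for row in fk_rows if row[0] == fk_index)
        value = db.execute(
            f"SELECT {column} FROM {table} WHERE rowid = ?", (rowid,)
        ).fetchone()[0]
        messages.append(
            f"{table}.{column} = {value!r} does not match any entry in {parent}"
        )
    return messages

# Hypothetical schema and data. Enforcement is off so that bad rows
# load successfully and can all be reported in one pass.
db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = OFF")
db.executescript("""
    CREATE TABLE trees (tree_id TEXT PRIMARY KEY);
    CREATE TABLE mirrors (
        mirror_id TEXT PRIMARY KEY,
        tree_id TEXT NOT NULL REFERENCES trees(tree_id)
    );
    INSERT INTO trees VALUES ('simple_sync');
    INSERT INTO mirrors VALUES ('local_copy', 'simple_sync');
    INSERT INTO mirrors VALUES ('broken_copy', 'missing_tree');
""")
problems = dangling_references(db)
```

Compared with a bare constraint failure, each message names the table, the column and the offending value, which is the information someone hand editing the config needs.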