Thanks for the introduction.
As you heard, I’m Charlie Owen.
I’ve previously worked on sites such as the History Channel, purveyor of fine shows such as “Duck Dynasty” and “Storage Wars”
and on the site for the AAT,
a massive paid membership site for accountants.
Actually exciting - one of the largest examples of Drupal in the world.
I currently work at the BBC, where I’m building the site for the 2015 General Election.
Believe me, that’s one deadline that we’re not allowed to ask for an extension on.
Furthermore….
…I’m a mess.
I leave clothes on the floor, I leave dishes in the sink, and there are dust bunnies under my sofa.
But when it comes to my workspace, I’m the polar opposite.
Anyone who has ever worked with me for some length of time will come to know my love of tidiness.
I’m the one who keeps their desk perfectly clean, the one who lines up their pens in order, who strives for some utterly inhuman, uncluttered, ordered existence.
And it isn’t just my desk. My code is indented, separated away into folders, with modularised components. I love tests and order. My eye twitches when something breaks.
Something like this would bug the hell out of me.
Yes, I’m that person. It’s why I ended up learning about visual regression testing.
Perhaps this should be: Testing things because things getting messy makes me twitchy and people don’t like it when I get twitchy.
Let’s back up, because I can sense that some of you are a bit confused.
Because if you’re wondering what that is, don’t worry. Regression testing, let alone visual reg testing, isn’t something that we’re all familiar with.
It’s probably a safe rule of thumb to suggest that the nearer you are to backend programming, the more likely you are to have heard of it. And as this is Frontend London, not Backend London, I guess I should explain what it is.
So, what’s a regression?
I’m not talking about regression into a past life via hypnosis.
It’s not Childhood regression. It’s not some Freudian desire to escape back to childhood.
Nope, a regression is when there’s an unexpected fuckup.
- Let’s take an example.
- Rabbits.
- Cute little things.
The Europeans took them over to Australia, they loved them so much.
Wanted to keep them for egg-laying, or whatever it is that rabbits are kept for.
But they started breeding like, well, rabbits.
- Before long they’d eaten everything in Australia and had forced an evacuation of the continent.
- That’s the law of unintended consequences.
- In tech terms, a regression.
- You see regressions in tech all the time. Like when you get a software update for your computer, only for the wifi to stop working.
Or when someone in your team commits a piece of code that stops the entire site from rendering.
But you also see them in the frontend. Like when you change a piece of CSS in order to achieve an effect on a page, not realising that the same code is used on the about us page, and you’ve just made the CEO’s face 700% taller than it should be.
That’s a visual regression.
“Okay”, you’re thinking, “big deal. I check over all my pages when I make a change. I’ll see anything as stupid as that”.
But as your site grows, it gets more and more difficult to do stuff like that. It becomes a chore to check over 10 pages. Do it for 50 pages and you’re wasting half your day. Do it for a thousand pages, and well, that’s probably considered a cruel and unusual punishment by the UN.
So what are you going to do?
Time to make the machines work for you. Because, frankly, they’re getting a bit uppity.
Make them pay their way.
Get them to start scanning for changes. Because machines are very good at doing small repetitive tasks like this.
As well as throwing breeze blocks down the room.
The principle is this:
1. Give a machine a chunk of HTML, styled up with CSS.
2. Let it take a screenshot.
3. Make a change to your CSS.
4. Get the machine to screenshot it again.
5. Let the machine compare the two images.
6. If the images match, great. If they differ, then it’ll let you know.
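To make those six steps concrete, here’s a toy sketch of the compare step. It assumes a “screenshot” is just a grid of RGB tuples rather than a real image file, and the function names are mine, not from any particular tool. Real suites do the same thing on actual PNGs.

```python
# A toy version of the compare step. A "screenshot" here is a list of rows,
# each row a list of (r, g, b) tuples, standing in for a real image file.

def diff_pixels(before, after):
    """Return the (x, y) coordinates where two same-sized screenshots differ."""
    same_height = len(before) == len(after)
    same_widths = all(len(a) == len(b) for a, b in zip(before, after))
    if not (same_height and same_widths):
        raise ValueError("screenshots must have the same dimensions")
    return [
        (x, y)
        for y, (row_before, row_after) in enumerate(zip(before, after))
        for x, (pixel_before, pixel_after) in enumerate(zip(row_before, row_after))
        if pixel_before != pixel_after
    ]


def bounding_box(points):
    """Smallest (left, top, right, bottom) rectangle around the changed pixels."""
    if not points:
        return None
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


WHITE = (255, 255, 255)
BLUE = (0, 0, 255)

# Steps 1-2: the baseline screenshot, a blank 6x4 page.
baseline = [[WHITE] * 6 for _ in range(4)]

# Steps 3-4: the screenshot after a CSS change nudged two pixels.
changed = [row[:] for row in baseline]
changed[1][2] = BLUE
changed[2][4] = BLUE

# Steps 5-6: compare, and report where the difference is.
changed_points = diff_pixels(baseline, changed)
if changed_points:
    print("Visual regression inside", bounding_box(changed_points))
else:
    print("Screenshots match")
```

The bounding box is the bit a tool would sketch in red for you.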
This actually works really well.
Let’s have a simple test. Here’s a bootstrap demo page.
I’m going to flick to the next slide and I want you to see if you can tell me where the difference is.
Thing is, I’m not going to let you see that flick of difference.
Let’s wait here a few seconds while you forget all the edges and boundaries of that previous slide.
Ready?
And here’s that image again.
Who can tell me what the difference was?
None of us. And that’s the problem.
But the machine CAN see the difference.
Here it is, sketched in red.
All it was was 20px of padding added to the .btn-lg class.
It’s a classic visual regression.
It’s something that an inexperienced developer (or tired, because I’ve done it) might do, in order to fix a problem on their page.
Adding a bit of padding to a button, not realising that their CSS is unscoped and that the rules are bleeding into other selectors.
And this is what I used to figure out the difference.
The ImageMagick suite comes with a tool called Compare that… well, compares.
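If you want to script that yourself, here’s a minimal sketch that wraps the compare command from Python. The file names are placeholders, and I’m assuming ImageMagick is installed with `compare` on your PATH (on ImageMagick 7 you may need `magick compare` instead). The `-metric AE` option asks for a count of differing pixels, and the exit status is 0 when the images match and 1 when they differ.

```python
import subprocess


def build_compare_command(baseline, current, diff_out):
    """Build the ImageMagick command line; the paths are whatever you pass in."""
    # -metric AE reports the absolute number of pixels that differ.
    return ["compare", "-metric", "AE", baseline, current, diff_out]


def images_match(baseline, current, diff_out):
    """Run compare and return True if the two images are the same.

    Exit status 0 means similar, 1 means dissimilar; anything else is
    treated here as an error (a missing file, for example).
    """
    result = subprocess.run(
        build_compare_command(baseline, current, diff_out),
        capture_output=True,
        text=True,
    )
    if result.returncode not in (0, 1):
        raise RuntimeError(result.stderr.strip())
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical file names; diff.png is where the highlighted image lands.
    if images_match("before.png", "after.png", "diff.png"):
        print("No visual change")
    else:
        print("Images differ, see diff.png")
```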
But that’s just the start. Because there are suites of tools out there that will automate this entire process for you.
You can script in JavaScript, or use Selenium. You can list out all the pages that you want to test.
PhantomCSS and Wraith are both ace.
The fail popped up really quickly there, so let’s have a closer look at it. This, along with the command line warnings, makes it really easy to see where something went wrong.
This historical snapshot system isn’t the only way to do things, although many suites work on this basis.
The BBC’s Wraith system, for example, can work by comparing two versions of a site, each running on a different git branch. Both techniques have their merit.
So, all sorted? I can get off stage now?
Well, maybe I made it seem easier than it actually is.
Because this works if your pages are really simple. And you only have a few of them. And if your content never changes. Eh.
You see, the more complex your pages become, the more chance that any two screenshots of them will differ, *even if you’ve not changed any CSS*.
Because content changes. Navigation systems change. 3rd party widgets can change. If any of them differ, your screenshots no longer match.
Take a look at the eBay homepage here. If we take a screenshot now, and another one in five seconds, they’re not going to match. The carousel on this page would cause the tests to fail.
Similarly, a Facebook widget embedded on your homepage would cause the visual tests to fail.
Because the content is constantly changing, page-based screenshot tests will fail.
As would this loading spinner.
You could ensure that nobody ever updates content on your site.
Or never use animations.
In fact just never change anything ever again, just so your test system works.
Luckily we don’t have to be that stroppy. The answer is already there.
Because…. Brad Frost.
Mr Atomic Design.
He laid down the answer for us when he came up with component-based design.
Rather than testing entire pages - volatile, mutable, changeable, pages - we start testing the *components* of a page.
Brad Frost calls them atoms, molecules, organisms.
I guess I’m talking about the equivalent of organism-level components here.
Whatever you want to call them, you start testing those fundamental elements, rather than entire complex pages.
Instead of testing this….
test this…
and this…
and this…
and this…
Fill these building block components with static dummy test data, and they shouldn’t change at all to your testing tool.
The only time that they will is when you deliberately change it, or you break something.
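Here’s a rough sketch of why the static dummy data matters. It’s a stand-in, not a real screenshot test: a hash of the rendered markup plays the part of the screenshot, and the card template and fixture are made up for illustration.

```python
import hashlib
from string import Template

# A hypothetical component and its fixed test data.
CARD = Template('<div class="card"><h2>$title</h2><p>$summary</p></div>')
FIXTURE = {"title": "Test headline", "summary": "Static dummy copy."}


def render(template, data):
    """Fill the component template with the given data."""
    return template.substitute(data)


def fingerprint(markup):
    """Stand-in for a screenshot: a hash that changes when the output changes."""
    return hashlib.sha256(markup.encode("utf-8")).hexdigest()


# Same component, same fixture: the fingerprint never moves between runs.
first_run = fingerprint(render(CARD, FIXTURE))
second_run = fingerprint(render(CARD, FIXTURE))
print("stable with fixture:", first_run == second_run)

# Feed the component live content instead and the "screenshot" changes,
# even though nobody touched the CSS or the template.
live = {"title": "Breaking news", "summary": "Updated five seconds ago."}
print("stable with live data:", fingerprint(render(CARD, live)) == first_run)
```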
Having these components living by themselves is a bit fiddly. And, tbh, they get a bit lonely. So start grouping them together into a component library.
Makes it easier to test each individual component if they’re all in one place.
And it’s not just useful for testing.
Putting all your components in one place is mega mega mega useful for other developers on your team.
A site builder can look through here, see that there’s something useful, and plug it in to a data renderer of some kind.
A quick side note:
component libraries shouldn’t be confused with a visual style guide.
A visual style guide is where your brand, UX and developers meet. It talks about how a user interacts with your site, rather than how your site is built.
A visual style guide is built up from components taken from the component library. But it is something separate.
A component library is for developers and testing.
No, this is a component library. Great example here from Twitter’s Bootstrap framework.
You can build a component library manually. It’s pretty easy to do.
Lay down an example of a styled component, and then add alongside a copy of the code used to generate it.
Jekyll is pretty nifty for this. Or you can use a tool like Hologram to search through your code and build up components automatically.
Any of the major tools out there will let you loop through the components in your library and test them.
I like to keep components inside wrappers like this, so that they’re easily identifiable, both visually and in code.
Keep the baseline snapshot for all your components in your git repo.
Your team members can then run tests on their machines and compare against these definitive versions.
If they altered it *deliberately*, then they generate a new definitive version and commit it to the repo, along with their changes.
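As a sketch of that workflow, assuming one PNG per component in a `baseline` folder (the one you commit) and a `current` folder (what your machine just rendered): the folder layout and the function names are my own invention, and a byte-for-byte hash is stricter than the pixel diff a real tool would use.

```python
import hashlib
import tempfile
from pathlib import Path


def digest(path):
    """Hash a snapshot file so two snapshots can be compared cheaply."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def check(baseline_dir, current_dir):
    """Return the names of components that are new or differ from the baseline."""
    failures = []
    for current in sorted(current_dir.glob("*.png")):
        baseline = baseline_dir / current.name
        if not baseline.exists() or digest(baseline) != digest(current):
            failures.append(current.stem)
    return failures


def approve(baseline_dir, current_dir, name):
    """Promote a deliberate change to the new baseline, ready to be committed."""
    source = current_dir / f"{name}.png"
    (baseline_dir / f"{name}.png").write_bytes(source.read_bytes())


# Demo with throwaway folders and fake image bytes.
with tempfile.TemporaryDirectory() as tmp:
    baseline_dir = Path(tmp) / "baseline"
    current_dir = Path(tmp) / "current"
    baseline_dir.mkdir()
    current_dir.mkdir()

    (baseline_dir / "nav.png").write_bytes(b"nav pixels")
    (current_dir / "nav.png").write_bytes(b"nav pixels")
    (baseline_dir / "button.png").write_bytes(b"old button pixels")
    (current_dir / "button.png").write_bytes(b"new button pixels")

    print("failing:", check(baseline_dir, current_dir))
    approve(baseline_dir, current_dir, "button")  # the change was deliberate
    print("failing after approval:", check(baseline_dir, current_dir))
```

If the button change wasn’t deliberate, you skip the approve step and go and fix your CSS instead.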
As always in these talks, I’ve got some caveats for you to bear in mind. Sometimes reality isn’t as great as the spiel.
First of all, this simply isn’t cost effective for small projects.
If you’re building a one off brochure-ware site for a company, this won’t help you.
But if you’re building a medium to large-scale project, then this technique will pay dividends.
Secondly, you ABSOLUTELY need to invest in component-driven design.
If you design at the page-level, this just won’t work for you, for the reasons I mentioned.
Again, component-driven development is a solution to a medium and large-scale problem.
Lastly, be aware of what you’re testing.
Many of these regression testing systems only work on one browser.
It’s possible to introduce a browser-specific regression that will get past your tests. The more advanced Selenium-based systems will let you do multibrowser testing, if you care to go for that.
But despite these warnings, I think this is a technique that it’s massively worth investing time in.
I know that the least fun part of my job is chasing these regressions, or hearing about them in tickets. Imagine how many hours are lost each week chasing these kinds of errors in large sites.
Frankly, I think this technique is *indispensable*.
Please talk to me on Twitter or email me if you’ve got further questions, and of course, come talk face to face.
Thank you.
Have we got time for questions?