How Reddit uses Helm in local dev, staging, and production. An overview of the primary pieces (Helm and Docker repos, CI), supporting tooling, and some best practices we've identified.
Recording: https://www.youtube.com/watch?v=7Qxuo9W5SlY
6. Today
● Working on a local dev story
● Mature branch-based staging system
● Production has a few early adopters, with more rapid expansion in Q2
● Helm drives all three of the above
11. WIP: Local development overview
● Kubernetes + Local Dev = kev
● The kev CLI calls out to Helm for heavy lifting
● Uses our Builder AWS Account’s Helm Chart and Docker Image repositories (like prod and staging)
● Developers should not need to be familiar with Kubernetes to work on their applications
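To make that delegation concrete, here is a hypothetical sketch of the kind of Helm invocation a kev command might run on a developer's behalf. The chart repository URL, service name, and release naming are all invented for illustration; the sketch only echoes the command rather than executing it.

```shell
# Hypothetical sketch of what a `kev up myservice` might delegate to Helm.
# All names and URLs below are invented for illustration.
CHART_REPO_URL="https://helm.builder.example.com"   # Builder Account chart repo
SERVICE="myservice"
RELEASE="${SERVICE}-local"

# kev would first ensure the Builder chart repo is registered:
#   helm repo add builder "$CHART_REPO_URL"

# Then hand the heavy lifting to Helm; echoing keeps the sketch inert.
HELM_CMD="helm upgrade --install ${RELEASE} builder/${SERVICE} --set image.tag=local"
echo "$HELM_CMD"
```

The point of the wrapper is that developers type a kev command and never see the Helm invocation underneath.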
15. Kage overview
● CI-Driven (Drone CI + drone-helm)
● git pushes to canonical repo trigger builds+deploys
● Separate stateful services (DBs/caches/etc) per-branch
● Our full tapestry of services can be staged and swapped
● No Kubernetes experience is needed to use Kage
23. Greg’s soapbox
● Be boring, especially in the early going
● Cookiecutter / prompt + generate what you can
● If you have non-Infra-oriented “customers”, don’t force them to learn Kubernetes at any deep level
● Use Helm to declare your cluster’s workload
● Code review for Helm Charts is crucial!
I’m Greg Taylor from Reddit Infrastructure, and I’m here to share how we use Helm in local development, staging, and production.
We’ll start with some historical context.
Reddit has been a small engineering org for most of its history.
In January of 2016, there were fewer than two dozen engineers at Reddit.
On the systems side, there was just the monolith, which you can see pictured here.
Over the last two years, the engineering org grew rapidly, quadrupling in size.
As Reddit’s ambitions grew, so too did its org chart. New teams were built out and staffed.
As teams blinked into existence, they started standing up their own services.
We now have over a hundred engineers and around two dozen services.
There are far more gaudy growth stories out there, but our expansion has been rapid enough to cause some pain.
Looking for a better path forward, we embarked on a search for the pieces that we’d use as the foundation for Reddit services of the future.
Before we get into details on our local dev, staging, and production flows, let’s talk about tests, builds, and artifact distribution.
The central, default place for tests and builds
The central, default place for hosting and serving artifacts
The consumers of this account’s resources are other AWS accounts and Reddit Engineers
Write access is highly restricted
Read access is widely granted
To illustrate the Builder Account concept, let’s go over some of the interactions
As I mentioned on the previous slide, each of our AWS accounts gets read-only access to our Docker and Helm Chart repositories
Reddit Engineers get a similar level of access
This is an important part of how we’re able to re-use Charts and Images throughout the develop + deploy cycle
With this in mind, let’s talk about how this factors into the first phase of the cycle
As a disclaimer: this is still early days for us
Now, on to something that is a bit more fleshed out!
Our local development system is called Kev, with the etymology being Kubernetes + Local Dev
Any guesses as to what our Kubernetes Staging system is called?
Let’s walk through what the Kage flow looks like
Developer pushes to a branch on the project’s canonical repo (not the dev’s fork).
Within our Builder AWS Account, the Builder Drone CI cluster runs tests and publishes Docker Images.
Docker Images are tagged by commit sha and branch name.
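As a concrete sketch of the tagging step, the registry and repository names below are hypothetical, and the commit/branch values would come from CI environment variables; the actual build and push commands are shown as comments so the sketch stays inert.

```shell
# Sketch of the Builder CI tagging step; registry/repo names are hypothetical.
REGISTRY="docker.builder.example.com"
REPO="listing-service"
COMMIT_SHA="a1b2c3d"       # in Drone CI this would come from the commit metadata
BRANCH="add-widgets"       # likewise, the branch being built

SHA_TAG="${REGISTRY}/${REPO}:${COMMIT_SHA}"
BRANCH_TAG="${REGISTRY}/${REPO}:${BRANCH}"

# The CI job would build once and push both tags:
#   docker build -t "$SHA_TAG" -t "$BRANCH_TAG" .
#   docker push "$SHA_TAG" && docker push "$BRANCH_TAG"
echo "$SHA_TAG" "$BRANCH_TAG"
```

Tagging by sha gives an immutable reference for deploys, while the branch tag always points at that branch's latest build.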
After that, the Builder Drone CI cluster emits a deploy signal.
Drone CI in the staging cluster hears the signal, uses the drone-helm plugin to install/upgrade a release of the project (and its dependencies).
The Drone worker mounts a ServiceAccount token to make access to Tiller available to the drone-helm plugin.
Standard labels and annotations are used by supporting systems and our helper CLI to poke at things being staged.
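Putting the deploy step together, a hypothetical sketch of what drone-helm performs for a branch release might look like the following. The per-branch release naming, chart name, and label keys are assumptions for illustration, not Reddit's actual conventions.

```shell
# Hypothetical sketch of the staging deploy performed via drone-helm.
SERVICE="listing-service"
BRANCH="add-widgets"
COMMIT_SHA="a1b2c3d"
RELEASE="${SERVICE}-${BRANCH}"   # one Helm release per branch (assumed convention)

# drone-helm would run something like this; its access to Tiller comes from
# the ServiceAccount token mounted into the Drone worker. Echoed, not run.
DEPLOY_CMD="helm upgrade --install ${RELEASE} builder/${SERVICE} --set image.tag=${COMMIT_SHA}"
echo "$DEPLOY_CMD"

# Supporting systems and the helper CLI could then find everything staged
# for the branch via label selectors, e.g. (label keys are hypothetical):
#   kubectl get pods -l app=${SERVICE},branch=${BRANCH}
```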
We have taken a very measured approach towards moving to production.
We have been content to grow and mature alongside Kubernetes.
In addition to our build, test, and staging infrastructure, we have early adopters in production from a couple of other divisions.
This is exactly the same flow as staging, which we saw a few slides back.
Reminder: Docker images are tagged by commit sha and branch name.
This is where our deploy flow stands right now.
The helmfile repo contains sub-directories for each cluster, each containing helmfiles.
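A hypothetical layout (the cluster and file names are invented for illustration) might look like:

```
helmfile-repo/
├── staging/
│   └── helmfile.yaml
└── production/
    └── helmfile.yaml
```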
Secrets are kept in Vault, not in this repository.
This could be automated, but we prefer something more semi-automated as we build operational competency with Kubernetes and Helm.
Let’s take a look at an example helmfile.
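Since the slide itself isn't reproduced here, the following is a minimal stand-in written in helmfile's standard format; the repository URL, chart, release, and values-file names are all invented:

```yaml
# Minimal example helmfile (all names hypothetical).
repositories:
  - name: builder
    url: https://helm.builder.example.com

releases:
  - name: listing-service
    namespace: listing
    chart: builder/listing-service
    values:
      - values/listing-service.yaml
```

Each cluster's helmfiles declare the full set of releases that should be running there, which is what lets Helm "declare your cluster's workload" as the soapbox slide urges.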