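Airbnb's firsthand story of building a Kubernetes environment
Melanie Cebula, Software Engineer, Airbnb
If you are a developer at a startup, you have probably considered adopting Kubernetes at least once. It is also true that many developers find Kubernetes difficult. In this session, Airbnb, where more than a thousand developers work with hundreds of Kubernetes services running on Amazon Web Services, presents its own case study. Melanie Cebula, an infrastructure engineer at Airbnb headquarters, shares the trial and error, strategies, and solutions that can make Kubernetes easier to approach.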
17. Challenges with kubernetes?
• complex configuration
• complex tooling
• integrating with your current infrastructure
• open issues
• scaling
• … and more!
@MELANIECEBULA
18. Challenges with kubernetes?
solvable problems!
26. Reducing k8s boilerplate
WHAT WE WENT WITH
[Diagram labels: Project, Apps, Containers, Files, Volumes, Dockerfile; "sets params per environment"; "other files access params"]
35. Configuration LIVES IN ONE PLACE
Everything about a service is in one place in git
• All configuration lives in _infra alongside project code
• Edit code and configuration with one pull request
• Easy to add new configuration
• Statically validated in CI/CD
36. Configuration LIVES IN ONE PLACE
What we support:
• kube-gen files
• framework boilerplate
• API boilerplate
• CI/CD
• docs
• AWS IAM roles
• project ownership
• storage
• … and more!
39. Configuration LIVES IN ONE PLACE
collection of framework-specific generators (ex: Rails, Dropwizard)
40. Configuration CAN BE GENERATED
• make best practices the default (ex: deploy pipeline, autoscaling, docs)
• run generators individually or as a group
• support for review, update, commit
43. Why do we version our kube configuration?
• add support for something new (ex: k8s version)
• want to change something (ex: deployment strategy)
• want to drop support for something (breaking change)
• know which versions are bad when we make a regression 😅
• support release cycle and cadence
[Diagram: kube.yml v1 is deployed by kube v1 and kube.yml v2 by kube v2, both to the kubernetes cluster]
44. How do we version our kube configuration?
1. version field
2. publish binaries for each version
3. channels point to binaries (ex: stable)
4. generate and apply using the appropriate binary
[Diagram label: bonk kube-gen.yml]
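The version field, channel, and binary steps above can be sketched as follows. This is only an illustration under assumptions: the channel table, the set of published versions, and the `kube-gen-<version>` binary naming are hypothetical, not Airbnb's actual tooling.

```python
# Hypothetical mapping from channel names to published binary versions.
CHANNELS = {"stable": "v2", "next": "v3"}
# Hypothetical set of versions that have a published binary.
PUBLISHED = {"v1", "v2", "v3"}


def resolve_version(version_field: str) -> str:
    """Turn the version field of a config file into a concrete version."""
    # A channel name resolves to the version it points at;
    # anything else is treated as an explicit version.
    version = CHANNELS.get(version_field, version_field)
    if version not in PUBLISHED:
        raise ValueError(f"no published binary for version {version_field!r}")
    return version


def binary_for(version_field: str) -> str:
    """Name of the binary used to generate and apply this configuration."""
    return f"kube-gen-{resolve_version(version_field)}"
```

With this shape, bumping a channel is a one-line change to the table, while services pinned to an explicit version keep using their own binary.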
45. Why do we version our generated configuration?
• what our project generators generate changes over time
• best practices change
• and bugs in the generators are found! 😅
[Diagram: kube.yml generated by sha1 and kube.yml generated by sha2 are both deployed by kube v1 to the kubernetes cluster; the generator at sha2 has a bug]
46. How do we version our generated configuration?
generator tags generated files with version, sha, and timestamp
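One way such tagging could look is a comment header prepended to each generated file. The header field names and the function `tag_generated_file` are assumptions for illustration; the slides do not show the real format.

```python
from datetime import datetime, timezone


def tag_generated_file(body: str, version: str, sha: str, now: datetime) -> str:
    """Prepend a comment header recording version, sha, and timestamp."""
    header = (
        f"# generated-by-version: {version}\n"
        f"# generated-by-sha: {sha}\n"
        f"# generated-at: {now.isoformat()}\n"
    )
    return header + body


# Example: tag a (hypothetical) generated manifest.
tagged = tag_generated_file(
    "kind: Deployment\n", "v2", "abc123", datetime(2019, 1, 1, tzinfo=timezone.utc)
)
```

Recording the generator sha makes it possible to find every file produced by a generator revision that later turns out to have a bug.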
48. Why do we refactor configuration?
FOR HUNDREDS OF SERVICES
• services should be up-to-date with latest best practices
• update configuration to the latest supported versions
• apply security patches to images
• configuration migrations should be automated
50. How do we refactor configuration?
• collection of general purpose scripts
• scripts are modular
• scripts cover the lifecycle of a refactor
[Diagram: refactorator, made up of list-pr-urls.py, get-repos.py, update-prs.py, refactor.py, close.py, status.py]
51. The lifecycle of a refactor
Run Refactor: Checks out repo, finds project, runs refactor job, tags owners, creates PR
Update: Comments on the PR, reminding owners to verify, edit, and merge the PR
Merge: Merges the PR with different levels of force
52. How do we refactor configuration?
• refactorator will run a refactor for all services given a refactor job
• refactor job updates _infra file(s)
• ex: upgrade kube version to stable
[Diagram: refactorator runs the upgrade-kube.py refactor job]
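A rough sketch of the split between a refactor job and the runner that applies it to every service. The `version:` field name, the in-memory `services` mapping, and both function names are assumptions; the real refactorator works on git repos and opens PRs.

```python
import re

# Matches a top-level "version: <value>" line; the field name is an assumption.
VERSION_LINE = re.compile(r"^version:[ \t]*\S+[ \t]*$", re.MULTILINE)


def upgrade_kube(config_text: str, target: str = "stable") -> str:
    """Refactor job: point the version field at the target channel."""
    return VERSION_LINE.sub(f"version: {target}", config_text, count=1)


def run_refactor(services: dict, job) -> dict:
    """Apply a refactor job to every service; return only the changed files."""
    changed = {}
    for name, text in services.items():
        new_text = job(text)
        if new_text != text:
            changed[name] = new_text  # a real run would create a PR here
    return changed
```

Keeping the job a plain text-to-text function means the same runner can drive any refactor, and services that are already up to date produce no PR.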
53. Bumping stable version
• bump stable version
• cron job calls refactorator with the upgrade-kube.py refactor job to create PRs
• another cron job handles updating and merging PRs
[Diagram: a k8s cron job, which runs daily on weekdays, calls refactorator with the upgrade-kube.py refactor job]
60. ktool
USES ENV VARS
• Runs in the project home directory:
$ cd /path/to/bonk
• Environment variables for arguments:
$ k status ENV=staging
• Prints the command that it will execute:
$ k status ENV=staging
kubectl get pods --namespace=bonk-staging
standardized
namespaces!
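The idea of deriving the kubectl command from the project directory and an `ENV` variable can be sketched like this. The function names are hypothetical, and reading `ENV` from a mapping is an assumption about how k passes arguments; the `<project>-<env>` namespace follows the `bonk-staging` example on the slide.

```python
import os


def namespace(project: str, env: str) -> str:
    """Standardized namespace: <project>-<env>."""
    return f"{project}-{env}"


def status_command(project_dir: str, environ=os.environ) -> list:
    """Build the kubectl command for `k status`, reading ENV from env vars."""
    # The project name is taken from the project home directory.
    project = os.path.basename(os.path.normpath(project_dir))
    env = environ["ENV"]
    return ["kubectl", "get", "pods", f"--namespace={namespace(project, env)}"]


cmd = status_command("/path/to/bonk", {"ENV": "staging"})
print(" ".join(cmd))  # the command is printed before it would be executed
```

Because namespaces are standardized, the tool never has to ask the engineer which namespace a project runs in.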
61. ktool
SIMPLIFIES BUILDS AND DEPLOYS
• k generate generates kubernetes files
• k build performs project build, docker build and docker push with tags
• k deploy creates namespace, applies/replaces kubernetes files, sleeps and checks deployment status
• can chain commands; ex: k all
63. ktool
A DEBUGGING TOOL
• defaults to random pod, main container:
$ k ssh ENV=staging
• specify particular pod, specific container:
$ k logs ENV=staging POD=… CONTAINER=bonk
• automates debugging with k diagnose ENV=staging (call kubectl diagnose)
81. A single deploy process for every change
Develop: Write code and config under your project
Merge: Open a PR and merge your code to master
Deploy: Deploy all code and config changes
82. A single deploy process for every change
[Diagram: Deployment, ConfigMap, Service, AWS, Alerts, Dashboards, Project ownership, Docs, Secrets, Storage, Service Discovery, and API Gateway Routes all go through kubectl apply to the kubernetes cluster]
83. How do we apply k8s configuration?
• kubectl apply all files
• in some cases where apply fails, replace files without force
• always restart pods on deploy to pick up changes
• return atomic success or failure state by sleeping and checking status
[Diagram: Deployment, ConfigMap, and Service go through “kubectl apply” to the kubernetes cluster]
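The "sleep and check status" step that yields a single success or failure can be sketched as a polling loop. The three state strings, the default attempt count and delay, and the injected `check_status` function are all assumptions made for this sketch.

```python
import time


def wait_for_deploy(check_status, attempts: int = 30, delay: float = 10.0,
                    sleep=time.sleep) -> bool:
    """Poll until the deploy succeeds or fails; return one final verdict.

    check_status is a caller-supplied function returning "success",
    "failure", or "pending" (hypothetical states for this sketch).
    """
    for _ in range(attempts):
        state = check_status()
        if state == "success":
            return True
        if state == "failure":
            return False
        sleep(delay)
    return False  # still pending after all attempts counts as a failure
```

Treating a timeout as a failure is what makes the result atomic from the caller's point of view: a deploy either finished successfully or it did not.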
84. How do you always restart pods on deploy?
We add a date label to the pod spec, which convinces k8s to relaunch all pods
[Diagram: Deployment, ConfigMap, and Service go through kubectl apply to the kubernetes cluster]
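A minimal sketch of the date label trick on a Deployment represented as a plain dict. The label key `deploy-date`, the timestamp format, and placing the label in the pod template metadata are assumptions; the slide only says a date label is added to the pod spec.

```python
import copy
from datetime import datetime, timezone


def add_deploy_date_label(deployment: dict, now: datetime) -> dict:
    """Return a copy of a Deployment with a date label on its pod template.

    Changing the pod template changes the Deployment spec, so Kubernetes
    rolls out new pods even if nothing else in the template changed.
    """
    result = copy.deepcopy(deployment)
    template = result["spec"]["template"]
    labels = template.setdefault("metadata", {}).setdefault("labels", {})
    # Label values allow only alphanumerics, '-', '_' and '.', so no colons.
    labels["deploy-date"] = now.strftime("%Y%m%dT%H%M%SZ")
    return result
```

Usage would be to pass the generated Deployment through this function right before it is applied, for example with `datetime.now(timezone.utc)` as the timestamp.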
85. How do we apply custom configuration?
86. How do we apply custom configuration?
[Diagram: aws.yml goes through kubectl apply to the kubernetes cluster, which holds an AWS CRD, an AWS Controller, and an AWS webhook]
87. How do we apply custom configuration?
1. Create a custom resource definition for aws.yml
88. How do we apply custom configuration?
2. Create a controller that calls a web hook when aws.yml is applied
89. How do we apply custom configuration?
3. Create a web hook that updates a custom resource
90. How do we apply custom configuration?
4. AWS lambda exposes web hook to be called
93. Configuration SHOULD BE VALIDATED
• enforce best practices
• at build time with validation scripts
• at deploy time with admission controller
94. How do we validate configuration at build time?
95. How do we validate configuration at build time?
[Diagram: the global jobs repo holds the kube, project.yml, and aws.yml validation scripts; the job dispatcher connects it to the bonk CI jobs: project build, global validation jobs, docs build]
96. How do we validate configuration at build time?
1. Define global job in global jobs repo
97. How do we validate configuration at build time?
2. job dispatcher always dispatches global jobs to projects
98. How do we validate configuration at build time?
3. global job runs alongside project jobs
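The three steps can be pictured as a dispatcher that always adds the global jobs to whatever a project defines. The job names and the `dispatch` function are hypothetical placeholders for this sketch.

```python
# Hypothetical names for jobs defined once in the global jobs repo.
GLOBAL_JOBS = ["kube-validation", "project-yml-validation", "aws-yml-validation"]


def dispatch(project_jobs: list) -> list:
    """Return the CI jobs for a project: its own jobs plus every global job."""
    extra = [job for job in GLOBAL_JOBS if job not in project_jobs]
    return list(project_jobs) + extra


# Example: the bonk project only defines its own build jobs.
bonk_ci_jobs = dispatch(["project-build", "docs-build"])
```

Since projects cannot opt out of the global list, a new validation added to the global jobs repo reaches every service on its next build.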
99. What do we validate at build time?
• invalid yaml
• invalid k8s configuration
• bad configuration versions
• max namespace length (63 chars)
• valid project name
• valid team owner in project.yml
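Two of these checks, namespace length and project name, might look like the sketch below. Kubernetes does require namespace names to be DNS labels of at most 63 characters; applying the same pattern to project names and the `<project>-<env>` namespace format are assumptions of this sketch.

```python
import re

MAX_NAMESPACE_LENGTH = 63
# Kubernetes namespace names must be RFC 1123 DNS labels.
DNS_LABEL = re.compile(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?")


def validate_namespace(project: str, env: str) -> list:
    """Return a list of error messages; an empty list means valid."""
    errors = []
    ns = f"{project}-{env}"
    if len(ns) > MAX_NAMESPACE_LENGTH:
        errors.append(f"namespace {ns!r} is longer than {MAX_NAMESPACE_LENGTH} chars")
    # Assumption: a valid project name is one that is itself a DNS label.
    if not DNS_LABEL.fullmatch(project):
        errors.append(f"project name {project!r} is not a valid DNS label")
    return errors
```

Catching an over-long name in CI is cheaper than discovering it when the namespace is rejected at deploy time.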
100. How do we validate configuration at deploy time?
admission controller intercepts requests to the k8s api server prior to persistence of the object
[Diagram: project.yml goes through kubectl apply to the kubernetes cluster, where the admission controller sits]
101. How do we validate configuration at deploy time?
• metadata is encoded as annotations at generate time
• admission controller checks for required annotations
• reject any update to resources that are missing required annotations
• reject any update that violates specified conditions
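The annotation check could be sketched as the decision function of a validating admission webhook. That the check runs as a webhook is an assumption, as are the annotation keys; the response shape follows the Kubernetes `admission.k8s.io/v1` AdmissionReview format, and the sketch only handles create and update requests that carry an object.

```python
# Hypothetical annotation keys; the slides do not name the real ones.
REQUIRED_ANNOTATIONS = ("example.com/owner", "example.com/git-repo")


def review(admission_review: dict) -> dict:
    """Build an AdmissionReview response for a validating webhook."""
    request = admission_review["request"]
    metadata = request["object"].get("metadata", {})
    annotations = metadata.get("annotations") or {}
    missing = [key for key in REQUIRED_ANNOTATIONS if key not in annotations]
    # Allow the request only when every required annotation is present.
    response = {"uid": request["uid"], "allowed": not missing}
    if missing:
        response["status"] = {"message": "missing annotations: " + ", ".join(missing)}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }
```

Because the annotations are written at generate time, a resource that lacks them was most likely created outside the standard tooling, which is the case this check is meant to reject.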
102. What do we validate with admission controller?
• project ownership annotations
• configuration stored in git
• configuration uses minimally supported version
104. What do we validate with admission controller?
• production images must be uploaded to production ECR
• prevent deployment of unsafe workloads
• prevent deployment of development namespaces to production clusters
standardized namespaces!
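Standardized namespaces make the namespace and image rules easy to express. The sketch below is illustrative only: the registry host, the `-development` suffix, and the cluster name `production` are placeholder assumptions.

```python
# Hypothetical values: the registry host and suffix are placeholders.
PRODUCTION_REGISTRY = "123456789012.dkr.ecr.us-east-1.amazonaws.com/"
DEVELOPMENT_SUFFIX = "-development"


def violations(namespace: str, images: list, cluster: str) -> list:
    """List policy violations for a workload headed to the given cluster."""
    if cluster != "production":
        return []
    found = []
    # With <project>-<env> namespaces, the environment is the name's suffix.
    if namespace.endswith(DEVELOPMENT_SUFFIX):
        found.append(f"development namespace {namespace!r} on production cluster")
    for image in images:
        if not image.startswith(PRODUCTION_REGISTRY):
            found.append(f"image {image!r} is not from the production ECR")
    return found
```

An admission controller could reject the request whenever this list is non-empty and return the messages to the engineer.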
106. 10 Takeaways
1. Abstract away complex kubernetes configuration
2. Standardize on environments and namespaces
3. Everything about a service should be in one place in git
4. Make best practices the default by generating configuration
5. Configuration should be versioned and refactored automatically
6. Create an opinionated kubectl wrapper that automates common workflows and distribute custom commands as kubectl plugins
7. CI/CD should run the same commands that engineers run locally, in a containerized environment
8. Validate configuration as part of CI/CD
9. Code and configuration should be deployed with the same process
10. Use custom resources and custom controllers to integrate with your infrastructure
107. 2019 Challenges
• thousands of services running in k8s
• moving all configuration to gitops workflow w/ custom controllers (dashboards, alerts, etc)
• scaling the cluster / scaling etcd / multi cluster support
• stateful services / high memory requirements
• tighter integration with kubectl plugins
• better paved road language support
• better k8s developer environment
• envoy migration
• better security defaults
• better support for other workloads
• … and more!
108. Thanks!
• learn more @ medium.com/airbnb-engineering
• jobs @ airbnb.com/careers
• reach me @melaniecebula