This document discusses managing multiple Kubernetes clusters without using federation. It notes that most organizations have 5-10 clusters on average spread across different regions, providers, and environments. While federation provides a way to manage multiple clusters, it has limitations around security and operations. The document proposes breaking the problem up between app owners and infra admins and leveraging existing Kubernetes concepts like RBAC, namespaces and cluster registry. It also suggests using specialized software that is more Kubernetes native, such as specialized CLIs, controllers, operators or the federation v2 project. The overall message is that managing multiple clusters requires specialized software that provides a good user experience while using Kubernetes primitives and tools.
18. BREAKING UP THE PROBLEM
APP OWNER
● Hook up to CI/CD
● Cluster discovery
● Failover between clusters
● Credential management
INFRA ADMIN
● Connect and track clusters
● Ensure overall security
● Lock down as much as possible
● Resource limits
23. CLUSTER REGISTRY
Upstream Kubernetes project for tracking clusters
$ kubectl get clusters
NAME                    AGE
west-coast-production   5d
east-coast-production   24d
PROBLEM
Lives on a single cluster, but is useful to all clusters
25. ACCESS CONTROL
Build on Kubernetes RBAC as much as possible
[Diagram labels: Staging, Production; App Owner, Engineer, SRE, Robots]
PROBLEM
What transforms this?
28. BRING-YOUR-OWN WORKFLOWS
Customization is key to make all users successful
[Diagram labels: Federation API server; two clusters, each with an API server and three Nodes; App Owner, Engineer, SRE, Robots]
29. BRING-YOUR-OWN WORKFLOWS
Customization is key to make all users successful
kind: Deployment
apiVersion: apps/v1
metadata:
  annotations:
    federation.kubernetes.io/deployment-preferences: |
      {
        "rebalance": true,
        "clusters": {
          "clusterA": {
            "minReplicas": 1,
            "maxReplicas": 3,
            "weight": 1
          },
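The preferences above bound and weight the replicas each cluster receives. As a rough illustration of what those fields mean, the sketch below gives every cluster its minReplicas and then hands out the remainder one replica at a time to the cluster furthest below its weighted share, never exceeding maxReplicas. It is a simplified model, not the Federation planner; the function name `distribute_replicas` is hypothetical.

```python
def distribute_replicas(total, prefs):
    """Spread `total` replicas over clusters by weight, within min/max bounds.

    `prefs` maps cluster name -> dict with minReplicas, maxReplicas, weight.
    Simplified illustration only; not the actual Federation planner.
    """
    # Start every cluster at its minimum.
    plan = {name: p.get("minReplicas", 0) for name, p in prefs.items()}
    remaining = total - sum(plan.values())
    if remaining < 0:
        raise ValueError("total is smaller than the sum of minReplicas")

    while remaining > 0:
        # Clusters with a positive weight that still have room under maxReplicas.
        open_clusters = [
            name for name, p in prefs.items()
            if p.get("weight", 0) > 0 and plan[name] < p.get("maxReplicas", total)
        ]
        if not open_clusters:
            break  # nothing can take more; leftover replicas stay unplaced
        # Pick the cluster with the fewest replicas per unit of weight
        # (ties broken by name so the result is deterministic).
        target = min(
            open_clusters,
            key=lambda name: (plan[name] / prefs[name]["weight"], name),
        )
        plan[target] += 1
        remaining -= 1
    return plan
```

With a second, hypothetical `clusterB` of weight 2 and a higher ceiling, `clusterA` fills up to its maxReplicas of 3 and the rest lands on `clusterB`.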
33. BRING-YOUR-OWN WORKFLOWS
Customization is key to make all users successful
[Diagram labels: Europe, North America]
Engineer, App Owner: New versions, Config, Secrets
SRE, Security: RBAC Roles/Bindings, Quota, Namespaces
35. TECTONIC MULTI-CLUSTER
CLUSTER REGISTRY
● Sync this to all clusters instead of living on one
● Use registry as a selector for policy
SYNC POLICY
● Agent running on the cluster
● Focused on RBAC-only
● CRUD namespaces, roles, bindings
● Updated immediately across clusters
kind: ClusterPolicy
apiVersion: multicluster.coreos.com/v1
spec:
  selector:
    cloud: aws
  namespaces:
  - name: "api-prod"
    authorization:
      bindings:
      - clusterRole: view
        users: ["random-user"]
        groups: ["SupportTeam"]
      - clusterRole: edit
        groups: ["APIDevelopers"]
      - clusterRole: admin
        users: ["joe-team-lead"]
  - name: "api-test"
    authorization:
      bindings:
      - clusterRole: admin
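To make the sync idea concrete, here is a minimal sketch of how an agent on one cluster might turn a ClusterPolicy like the one above into ordinary RBAC RoleBindings. It assumes the policy has already been parsed into a dict and that the cluster's registry labels are available as a plain dict; the function names and the binding naming scheme are hypothetical, and this is not the Tectonic implementation.

```python
RBAC_GROUP = "rbac.authorization.k8s.io"


def matches(selector, labels):
    """True when every key/value in the selector is present in the cluster labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def role_bindings(policy, cluster_labels):
    """Return RoleBinding manifests for one cluster, or [] if it is not selected."""
    spec = policy["spec"]
    if not matches(spec.get("selector", {}), cluster_labels):
        return []

    manifests = []
    for ns in spec.get("namespaces", []):
        for binding in ns.get("authorization", {}).get("bindings", []):
            # Users and groups both become RBAC subjects.
            subjects = [
                {"kind": "User", "name": user, "apiGroup": RBAC_GROUP}
                for user in binding.get("users", [])
            ] + [
                {"kind": "Group", "name": group, "apiGroup": RBAC_GROUP}
                for group in binding.get("groups", [])
            ]
            manifests.append({
                "apiVersion": RBAC_GROUP + "/v1",
                "kind": "RoleBinding",
                "metadata": {
                    # Hypothetical naming scheme: <namespace>-<clusterRole>.
                    "name": f"{ns['name']}-{binding['clusterRole']}",
                    "namespace": ns["name"],
                },
                "roleRef": {
                    "apiGroup": RBAC_GROUP,
                    "kind": "ClusterRole",
                    "name": binding["clusterRole"],
                },
                "subjects": subjects,
            })
    return manifests
```

A cluster labeled `cloud: aws` would receive bindings for each listed namespace, while a cluster with any other `cloud` label would receive none.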
36. TECTONIC MULTI-CLUSTER
[Diagram labels: Cluster Registry, Common Roles and Bindings; Production and Stage clusters across Europe, US East, US West]
37. TECTONIC MULTI-CLUSTER
[Diagram labels: Cluster Registry, Common Roles and Bindings; Production and Stage clusters across Europe, US East, US West]
Engineer, App Owner: Deploy to production
38. TECTONIC MULTI-CLUSTER
[Diagram labels: Cluster Registry, Common Roles and Bindings; Production and Stage clusters across Europe, US East, US West]
SRE, Security: Change staging resource quota
39. TECTONIC MULTI-CLUSTER
[Diagram labels: Cluster Registry, Common Roles and Bindings; Prod and Stage clusters across Europe, US East, US West; a New Production cluster]
SRE, Security: Register a new cluster
40. TECTONIC MULTI-CLUSTER
[Diagram labels: Cluster Registry, Common Roles and Bindings; Prod and Stage clusters across Europe, US East, US West; a Prod cluster in Singapore]
SRE, Security: Remove the staging cluster selector
41. TECTONIC MULTI-CLUSTER
[Diagram labels: Cluster Registry; Europe, US East, US West]
BETTER SECURITY
No cluster needs to access all clusters
ServiceAccounts can be audited/revoked
ServiceAccounts only read Clusters and ClusterPolicies
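One way to express "only read Clusters and ClusterPolicies" is a read-only ClusterRole bound to the agent's ServiceAccount. The sketch below builds such a manifest as a Python dict. The `multicluster.coreos.com` group comes from the ClusterPolicy example earlier in the deck; the `clusterregistry.k8s.io` group for Cluster objects, the plural resource names, and the role name are assumptions made for illustration.

```python
READ_ONLY_VERBS = ["get", "list", "watch"]


def registry_reader_role(name="multicluster-agent-reader"):
    """Build a ClusterRole that can only read Clusters and ClusterPolicies.

    The role name and resource plurals are hypothetical; adjust them to
    whatever the installed CRDs actually register.
    """
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRole",
        "metadata": {"name": name},
        "rules": [
            {
                "apiGroups": ["clusterregistry.k8s.io"],   # assumed group for Cluster objects
                "resources": ["clusters"],
                "verbs": list(READ_ONLY_VERBS),
            },
            {
                "apiGroups": ["multicluster.coreos.com"],  # group used by ClusterPolicy above
                "resources": ["clusterpolicies"],           # assumed plural name
                "verbs": list(READ_ONLY_VERBS),
            },
        ],
    }
```

Because the role carries no write verbs, revoking the ServiceAccount's token is enough to cut a cluster off from the registry without touching any other cluster.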
44. Least Complex
BASH SCRIPT
● Simple loop through clusters
● Human in the loop
● Error handling?
● No declarative state that we love
JENKINS
● Loop through clusters
● Human out of the loop
● Error handling?
● Jobs might need to be long running
PROBLEM
Kubernetes is smart...these aren’t
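The "Error handling?" question is where simple loops tend to fall short. As a minimal sketch, the loop below keeps going when one cluster fails and reports which ones succeeded and which did not. It assumes `kubectl` is on the PATH and that each cluster is reachable through a kubeconfig context; the function names are hypothetical, and the `apply` parameter exists only so the loop can be exercised without a real cluster.

```python
import subprocess


def kubectl_apply(context, manifest_dir):
    """Apply manifests to one cluster; raises CalledProcessError on failure."""
    subprocess.run(
        ["kubectl", "--context", context, "apply", "-f", manifest_dir],
        check=True,
    )


def apply_everywhere(contexts, manifest_dir, apply=kubectl_apply):
    """Try every cluster, recording failures instead of stopping at the first."""
    succeeded, failed = [], {}
    for ctx in contexts:
        try:
            apply(ctx, manifest_dir)
            succeeded.append(ctx)
        except (subprocess.CalledProcessError, OSError) as err:
            # Keep the error text so a human (or a CI job) can act on it later.
            failed[ctx] = str(err)
    return succeeded, failed
```

Even with this bookkeeping, the loop still holds no declarative state: nothing retries a failed cluster or notices drift afterwards, which is the gap the later approaches in this deck aim to close.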
45. More Complex
MULTI-CLUSTER INGRESS
● Specialized CLI tool
● Using Kubernetes-native tooling
● GCP beta feature only
● Smarts outside of the cluster
kubemci create zone-printer \
    --ingress=ingress/ingress.yaml \
    --gcp-project=$PROJECT \
    --kubeconfig=clusters.yaml
46. More Complex
MULTI-CLUSTER INGRESS
for ctx in $(kubectl config get-contexts -o=name --kubeconfig clusters.yaml); do
  kubectl --kubeconfig clusters.yaml \
    --context="${ctx}" \
    create -f manifests/
done
47. More Complex
OPERATOR
● Specialized controller for an application
● Model apps using CRDs
● Use cluster registry, other cluster state and RBAC
● Requires a Kubernetes-native app
kind: CompliantDatabase
metadata:
  name: example-db
spec:
  replicationFactor: 2
  autoscale: true
  backup: hourly
  geography:
    restriction: EU
    preference: Germany
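As an illustration of the kind of decision such an operator could make, the sketch below picks target clusters for a CompliantDatabase-style spec from a list of registered clusters. The `name`, `region`, and `country` fields on each cluster are hypothetical stand-ins for whatever metadata a registry would carry, and the selection rule (restriction is mandatory, preference only affects ordering) is one possible reading of the spec, not a documented behavior.

```python
def pick_clusters(spec, clusters):
    """Choose clusters for a CompliantDatabase-style spec.

    `clusters` is a list of dicts with hypothetical `name`, `region`,
    and `country` fields, e.g. as an operator might read from a registry.
    """
    geo = spec.get("geography", {})
    restriction = geo.get("restriction")

    # The restriction is a hard filter; with no restriction, every cluster qualifies.
    allowed = [c for c in clusters if restriction is None or c["region"] == restriction]

    # Preferred country first. sorted() is stable, so registry order breaks ties.
    allowed = sorted(allowed, key=lambda c: c["country"] != geo.get("preference"))

    wanted = spec.get("replicationFactor", 1)
    if len(allowed) < wanted:
        raise ValueError("not enough clusters satisfy the geography restriction")
    return [c["name"] for c in allowed[:wanted]]
```

For the spec above, two EU clusters would be chosen, with a cluster in Germany placed first if one is registered.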
49. Even Smarter
FEDERATION V2
● Accept that this is a vast problem space
● Don’t have to use all parts
● Policies modeled as CRDs
  ○ Add as many as you’d like
● Use existing RBAC
● Can plug into policy engine
[Diagram labels: Your-Custom-Object; Template, Placement, Override]
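The Template, Placement, and Override split can be pictured as a small rendering step: start from the template, emit one copy per placed cluster, and patch each copy with that cluster's overrides. The sketch below models this with plain dicts. The dotted-path override format and the function name `render` are simplifications chosen for this example and do not reflect the actual Federation v2 API schema.

```python
import copy


def render(template, placement, overrides):
    """Return {cluster_name: object} for every cluster named in the placement.

    `overrides` maps cluster name -> {dotted.path: value}; the dotted-path
    format is a simplification chosen for this sketch.
    """
    rendered = {}
    for cluster in placement:
        # Each cluster gets its own copy so overrides never leak between clusters.
        obj = copy.deepcopy(template)
        for path, value in overrides.get(cluster, {}).items():
            *parents, leaf = path.split(".")
            node = obj
            for key in parents:
                node = node.setdefault(key, {})
            node[leaf] = value
        rendered[cluster] = obj
    return rendered
```

Keeping the three inputs separate means a cluster can be added or removed by editing only the placement, while the template stays identical for every cluster.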
50. Even Smarter
FEDERATION V2
Owned by SIG Multi-cluster
● UX matters!
● Implemented as an aggregated API server
  ○ Can use kubectl and existing tools/libraries
● Secured with ServiceAccounts
● Possible to implement a custom scheduler