Kubernetes Internals

  1. Kubernetes
  2. Kubernetes • Kubernetes (K8S) provides a logical abstraction of treating your data-center as a single machine. • It allows for deploying, provisioning and self-healing of container groups (aka pods) across your cluster.
  3. Main Concepts • Pods – the basic building-block in K8S. A logical group of containers. • Controllers – responsible for bringing the actual state to the desired state. • Services – an abstraction over pods.
  4. Pods • A pod is a group of co-located containers. • They run on the same node and share Linux namespaces (such as the network namespace). • The life-cycle of a pod consists of the following phases: • Pending. • Running. • Succeeded. • Failed. • Unknown. Copyright 2017 Trainologic LTD
  5. Pod Spec • An example of a spec:
       ---
       apiVersion: v1
       kind: Pod
       metadata:
         name: nginx
         labels:
           app: web
       spec:
         containers:
         - name: front-end
           image: nginx
           ports:
           - containerPort: 80
  6. Specs • Specs are saved in persistent storage: etcd. • The controllers are responsible for moving the cluster toward the desired state.
  7. Probes • You can define diagnostic probes for the containers of a pod. • Liveness probe – determines whether the container is alive. On failure the container is killed and then handled according to the RestartPolicy. • Readiness probe – determines whether the container is ready to serve requests. The initial state (before the first probe runs) is Failure.
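A minimal sketch of how the two probes might be declared on a container. The pod name, probe paths and timings are illustrative assumptions, not values from the slides.

```yaml
# Sketch only: name, path and timings are assumed for illustration.
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo              # hypothetical name
spec:
  containers:
  - name: web
    image: nginx
    ports:
    - containerPort: 80
    livenessProbe:              # on failure the container is killed; RestartPolicy then applies
      httpGet:
        path: /
        port: 80
      initialDelaySeconds: 5
      periodSeconds: 10
    readinessProbe:             # while failing, the pod receives no Service traffic
      tcpSocket:
        port: 80
      periodSeconds: 5
```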
  8. RestartPolicy • You can define a RestartPolicy for every pod. • Can be set to: • Always – the default value. • OnFailure – only if the container exited with a failure status. • Never. • Restarts are delayed with an exponential backoff that is capped at 5 minutes.
  9. ImagePullPolicy • When a container is started in a pod, this property controls whether to check a remote registry for a newer image. • Possible values: Always, IfNotPresent and Never. • The default is “IfNotPresent”, except for the “:latest” tag (or no tag at all), where it is “Always”. • It is strongly advised that you don’t use “latest”.
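The two policies above sit at different levels of the spec: restartPolicy belongs to the pod, imagePullPolicy to each container. A small sketch, with a hypothetical pod name and an assumed pinned image tag:

```yaml
# Sketch only: the pod name and image tag are assumptions.
apiVersion: v1
kind: Pod
metadata:
  name: batch-task              # hypothetical name
spec:
  restartPolicy: OnFailure      # pod-level: applies to every container in the pod
  containers:
  - name: task
    image: busybox:1.36         # pinned tag (assumed) instead of :latest
    imagePullPolicy: IfNotPresent
    # Exits with a failure status, so the kubelet restarts it with backoff.
    command: ["sh", "-c", "echo working; exit 1"]
```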
  10. Controllers • The most common one is ReplicaSet. • Intended for your immortal containers. • Let’s see a spec example…
  11. Spec Example
        apiVersion: apps/v1
        kind: ReplicaSet
        metadata:
          name: nginx
          labels:
            tier: frontend
        spec:
          # replicas defaults to 1 when omitted
          # modify it according to your case
          replicas: 6
          selector:
            matchLabels:
              tier: frontend
          template:
            metadata:
              labels:
                tier: frontend
            spec:
              containers:
              - name: nginx
                image: nginx
                ports:
                - containerPort: 80
  12. Controllers • For “it should someday die” containers, you have the Job controller. • Kubernetes also provides: • DaemonSet. • StatefulSet.
  13. Deployments • Deployments allow you to manage pods and Replica-Sets. • In a declarative way! • You describe the desired state of your Replica-Sets and pods, and the Deployment controller moves the cluster toward it. • The name of the Replica-Set is deploymentName-podTemplateHash.
  14. Example • Let’s consider deploying a new version. • How many ReplicaSets do we need? • How about the scaling?
  15. Deployments • By default, a deployment limits how many pods may be unavailable during an update (maxUnavailable). Current releases default to 25% of the desired count; early releases defaulted to 1. • It also limits how many pods can be created above the desired amount (maxSurge), again 25% by default (formerly 1). • When you update a deployment, a new Replica-Set is created to scale up the new pods, and the old Replica-Set (matched by the same deployment selector) is scaled down.
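To make the two limits concrete, here is a sketch of a Deployment that sets both explicitly to 1. The image tag is an assumption; the labels reuse the ReplicaSet example from the earlier slide.

```yaml
# Sketch only: the image tag is assumed.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 6
  selector:
    matchLabels:
      tier: frontend
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1         # at least 5 of the 6 pods stay available
      maxSurge: 1               # at most 7 pods exist at any moment
  template:
    metadata:
      labels:
        tier: frontend
    spec:
      containers:
      - name: nginx
        image: nginx:1.25       # changing the pod template triggers a rollout
        ports:
        - containerPort: 80
```

With these values the controller alternates between adding one new pod and removing one old pod until the old Replica-Set reaches zero.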
  16. Rollback • A rollback will affect only the pod’s template. • You can list the deployment’s revisions with: kubectl rollout history deployment/NAME • You can add --revision=N to see a revision’s details. • Rollback is done with: kubectl rollout undo deployment/NAME --to-revision=N
  17. Pause & Resume • You can pause a deployment with: kubectl rollout pause deployment/NAME • Resume it with: kubectl rollout resume deployment/NAME • You can use it for canary deployments.
  18. Services • A Service provides an abstraction over a logical set of pods. • Somewhat analogous to a micro-service. • Usually targets a set of pods chosen by a label selector. • Can expose a port mapped to the pods’ targetPort, which may even be a string (the name of a port declared in the container). • Allows for great flexibility. • Supports both UDP and TCP.
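A sketch of a selector-based Service whose targetPort is a name rather than a number. It assumes the selected pods carry the label app: web and declare a container port named http; the Service name and port numbers are illustrative.

```yaml
# Sketch only: assumes the pods declare a container port with `name: http`.
apiVersion: v1
kind: Service
metadata:
  name: web                     # hypothetical name
spec:
  selector:
    app: web                    # pods with this label back the Service
  ports:
  - name: http
    protocol: TCP
    port: 8080                  # port exposed on the Service's cluster IP
    targetPort: http            # resolved per pod to the container port of that name
```

Using a name lets different pod versions listen on different port numbers without changing the Service.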
  19. Virtual IPs • In “iptables” proxy mode, if the chosen pod has failed, the client is not automatically retried against another pod (unlike in “userspace” mode). • It therefore relies on well-defined readiness probes. • If you want, you can specify a clusterIP address for your service (it must be inside your service-cluster-ip-range).
  20. DNS • A cluster add-on that creates DNS records for each service. • Doesn’t suffer from the ordering problem of environment-variable discovery (where a service must exist before the pods that use it). • You can define headless services (specify ‘None’ in clusterIP) and you’ll get only DNS (discovery) support (without proxy and load-balancing). • Headless services are currently a requirement for StatefulSet.
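A headless Service could look like the sketch below. The name, label and port are assumptions. With clusterIP set to None, the DNS name resolves to the pod addresses directly instead of to a virtual IP.

```yaml
# Sketch only: name, selector and port are assumed.
apiVersion: v1
kind: Service
metadata:
  name: db                      # hypothetical name
spec:
  clusterIP: None               # headless: no virtual IP, no kube-proxy load-balancing
  selector:
    app: db
  ports:
  - port: 5432
```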
  21. Service Types • The following ServiceType values are supported: • ClusterIP – cluster-only internal IP (the default). • ExternalName – maps to an external host without proxying. • NodePort – in addition to internal cluster IP, expose the service on each node in the cluster (same port for all nodes). • LoadBalancer – in addition to NodePort, asks the cloud provider for a load-balancing service.
  22. Architecture source: X
  23. Components Overview • etcd – The cluster’s backing data store. A reliable, distributed key-value store. Written in Go and uses the Raft protocol for consensus. • kubelet – The node agent. Responsible for running pods and monitoring their health. • kube-apiserver – The main API for K8s. Responsible for validation and configuration of the cluster state. • kube-controller-manager – Manages the controllers, which are responsible for moving the cluster toward the desired state.
  24. Components Overview • kube-proxy – Runs on every node. Manages basic load-balancing and TCP/UDP forwarding. • kube-scheduler – Responsible for workload distribution: assigns pods to nodes according to capacity and constraints.
  25. Kubelet • Can actually be used without other K8s components. • Creates the containers according to the pod specs. • Can listen on a directory for manifests (specs). • Can also receive requests through an internal HTTP server.
  26. apiserver • A simple REST server. • Performs validation. • Updates etcd with the changes to K8s objects.
  27. kube-proxy • Performs load-balancing on the node. • Provides a virtual IP to which clients can send requests transparently. • Responsible for updating the iptables rules for the Services. • Service names are resolved to these virtual IPs through the cluster DNS.
  28. Volumes • Volumes are tied to the lifecycle of a Pod. • Not to its container/s. • In the pod-spec you declare the volume and specify, for each container that uses it, where to mount it.
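As a sketch of a pod-scoped volume, the pod below shares one emptyDir volume between two containers. All names, the image tag and the commands are illustrative assumptions.

```yaml
# Sketch only: names, image tag and commands are assumed.
apiVersion: v1
kind: Pod
metadata:
  name: shared-volume-demo      # hypothetical name
spec:
  volumes:
  - name: scratch
    emptyDir: {}                # lives as long as the pod, survives container restarts
  containers:
  - name: writer
    image: busybox:1.36
    command: ["sh", "-c", "while true; do date > /data/now; sleep 5; done"]
    volumeMounts:
    - name: scratch
      mountPath: /data
  - name: reader
    image: busybox:1.36
    command: ["sh", "-c", "while true; do cat /in/now; sleep 5; done"]
    volumeMounts:
    - name: scratch
      mountPath: /in            # same volume, different mount point
      readOnly: true
```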
  29. Persistent Volumes • Persistent Volumes (PV) are not tied to the lifecycle of a pod. • They are cluster resources (like nodes). • Persistent volumes are consumed through persistent volume claims (PVC). • A PV abstracts the underlying specifics of the storage. • The user needs to deal only with PVCs. • A PVC maps one-to-one to a PV.
  30. PVC • Before a pod can use a PV, the user must create a PVC. • A namespaced resource in K8S. • The control plane then binds the PVC to a matching PV. • The pod can now use the claim as a volume.
  31. Lifecycle • When the user deletes a PVC, the PV is treated according to one of the following reclaim policies: • Retain – reclamation is manual. • Delete – the volume is deleted. • Recycle – ‘rm -rf …’ on the volume (deprecated).
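The claim-then-mount flow can be sketched as follows. The claim name, size and mount path are assumptions, and the cluster is assumed to have a PV (or a default storage class) able to satisfy the claim.

```yaml
# Sketch only: names, size and mount path are assumed.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim              # hypothetical name
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi
---
apiVersion: v1
kind: Pod
metadata:
  name: claim-user              # hypothetical name
spec:
  containers:
  - name: app
    image: nginx
    volumeMounts:
    - name: data
      mountPath: /usr/share/nginx/html
  volumes:
  - name: data
    persistentVolumeClaim:
      claimName: data-claim     # the pod refers to the claim, never to the PV itself
```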
  32. ConfigMap • Lets you decouple a container from the source of its configuration. • First you need to define a configMap: kubectl create configmap name source. • The source is either a file or a literal.
  33. ConfigMap • You can specify literal values with --from-literal=key=value. • You can specify a file with --from-file=path. • When using the file version, the key is by default the file name and its contents are the value. • You can inspect the configMap with kubectl get.
  34. Using ConfigMap • You can use a configMap value as the value of an env variable in a container. • You can also expose all values of a configMap as env variables. • You can also use configMap variables as values in a container command. • You can also use a volume of type configMap to mount files whose names are the keys in the configMap and whose contents are the values. • Note that when a configMap is updated, the change is eventually reflected in pods that mount it as a volume; env variables are not refreshed until the container restarts.
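A sketch combining three of the usages listed above: an env variable, a command argument and a mounted volume. The ConfigMap name, keys and values are illustrative assumptions.

```yaml
# Sketch only: names, keys and values are assumed.
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config              # hypothetical name
data:
  LOG_LEVEL: debug
  app.properties: |
    greeting=hello
---
apiVersion: v1
kind: Pod
metadata:
  name: configmap-demo          # hypothetical name
spec:
  containers:
  - name: app
    image: busybox:1.36
    # $(LOG_LEVEL) is substituted by Kubernetes from the env entry below.
    command: ["sh", "-c", "echo level=$(LOG_LEVEL); cat /etc/config/app.properties; sleep 3600"]
    env:
    - name: LOG_LEVEL
      valueFrom:
        configMapKeyRef:
          name: app-config
          key: LOG_LEVEL
    volumeMounts:
    - name: config
      mountPath: /etc/config    # one file per key: LOG_LEVEL and app.properties
  volumes:
  - name: config
    configMap:
      name: app-config
```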
  35. Secrets • Secrets hold sensitive data like keys, passwords and tokens. • Just like with configMaps, you can create secrets based on files. • Like with configMaps, you can use secrets as environment variables and mount them as files.
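The same two consumption styles, sketched for a Secret. The secret name, key and placeholder value are assumptions.

```yaml
# Sketch only: names and the placeholder value are assumed.
apiVersion: v1
kind: Secret
metadata:
  name: db-credentials          # hypothetical name
type: Opaque
stringData:                     # plain text here; stored base64-encoded under `data`
  password: change-me
---
apiVersion: v1
kind: Pod
metadata:
  name: secret-demo             # hypothetical name
spec:
  containers:
  - name: app
    image: busybox:1.36
    command: ["sh", "-c", "ls /etc/creds; sleep 3600"]
    env:
    - name: DB_PASSWORD
      valueFrom:
        secretKeyRef:
          name: db-credentials
          key: password
    volumeMounts:
    - name: creds
      mountPath: /etc/creds     # contains a file named `password`
      readOnly: true
  volumes:
  - name: creds
    secret:
      secretName: db-credentials
```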
  36. Zones • Kubernetes supports multiple availability zones. • However, only in a single region. • A cluster can’t span across regions. • Kubernetes automatically attaches zone labels to nodes and PVs. • Note that a PV can’t be attached in a different zone than the one it was created in. • K8S takes care of that. • You need to specify MULTIZONE=true when starting the cluster.
  37. Ingress • Services, by default, are accessible only from inside the cluster. • Also, they work at the TCP/UDP level. • Ingress is a set of rules directing incoming traffic to service endpoints. • It works at the HTTP level.
  38. Ingress Types • You can map HTTP URLs to services (fanout). • You can also include the “Host” header in your rules (for virtual hosts).
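A sketch of a fanout rule combined with a virtual host. The host name and the two backend Services (cart and catalog) are hypothetical. The manifest uses the networking.k8s.io/v1 API of current releases, which is newer than the API available when these slides were written.

```yaml
# Sketch only: host and Service names are assumed.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: fanout-demo             # hypothetical name
spec:
  rules:
  - host: shop.example.com      # matched against the HTTP Host header
    http:
      paths:
      - path: /cart
        pathType: Prefix
        backend:
          service:
            name: cart
            port:
              number: 80
      - path: /catalog
        pathType: Prefix
        backend:
          service:
            name: catalog
            port:
              number: 80
```

An Ingress object only takes effect when an ingress controller is running in the cluster.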
  39. Helm • Helm streamlines installing and managing K8S applications. • Packages in Helm are called charts. • You can use available charts for popular software. • You can use charts to template your K8S specifications. • Composed of a server part (tiller) that runs inside the K8S cluster and the client (helm). Tiller was removed in Helm 3.
  40. Installation • Download and install the helm client. • Invoke “helm init” to install the tiller in your K8S cluster. • Execute “helm repo update” to fetch the latest chart versions. • You can check the repositories with “helm repo list”. • Install a chart with “helm install repo/chart”. • You can see deployments with “helm ls”.
  41. Charts • Charts have a very specific directory structure. • The name of the root directory is the basic name of the chart (without the version part). • In the root directory there must be a Chart.yaml, the base descriptor of the chart. • Helm also looks for the “templates” and “charts” sub-folders.
  42. Chart.yaml
        apiVersion: The chart API version, always "v1" (required)
        name: The name of the chart (required)
        version: A SemVer 2 version (required)
        kubeVersion: A SemVer range of compatible Kubernetes versions (optional)
        description: A single-sentence description of this project (optional)
        keywords:
          - A list of keywords about this project (optional)
        home: The URL of this project's home page (optional)
        sources:
          - A list of URLs to source code for this project (optional)
        maintainers: # (optional)
          - name: The maintainer's name (required for each maintainer)
            email: The maintainer's email (optional for each maintainer)
            url: A URL for the maintainer (optional for each maintainer)
        engine: gotpl # The name of the template engine (optional, defaults to gotpl)
        icon: A URL to an SVG or PNG image to be used as an icon (optional).
        appVersion: The version of the app that this contains (optional).
        deprecated: Whether this chart is deprecated (optional, boolean)
        tillerVersion: The version of Tiller that this chart requires.
  43. Dependencies • The chart’s root directory can have a requirements.yaml file specifying charts that the current one depends on. • Execute “helm dependency update” to download the dependency archives into the “charts” directory.
  44. Templates • Chart templates are written in the Go template language. • They are stored under the “templates” directory. • Every file in this directory is passed through the template engine at rendering time. • You can specify values for the templates with: • A default values file: values.yaml at the root directory. • A yaml file with values passed on “helm install”. • Take a look at what is created with “helm create”.
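A requirements.yaml could look like the sketch below, in the Helm 2 format. The chart name, version and repository URL are hypothetical placeholders.

```yaml
# requirements.yaml (sketch only: every value below is a placeholder)
dependencies:
- name: mysql                             # chart this one depends on
  version: "1.2.3"                        # hypothetical version
  repository: https://charts.example.com  # hypothetical chart repository
```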
  45. Extensibility • Kubernetes is highly extensible. There are many extension points that you can use, depending on the use-case. • Let’s start with terminology.
  46. Extension Patterns • If your extension is a client of K8s, then your extension is called a controller. • When K8S is the client, we have two flavors: • Remote service accepting a network request: Webhook Backend. • A binary executed by K8S: Binary Plugin.
  47. api-server • At the heart of K8S sits the api-server. • Provides REST endpoints for the cluster-state. • All communications between K8S components go through the api-server. • Can be extended, of course…
  48. api-server flow • An incoming request goes through 3 stages: • Authentication. • Authorization. • Admission Control.
  49. Authentication • At the authentication phase, we differentiate between 2 types of users: • Normal user accounts. • Service accounts. • K8s doesn’t manage normal user accounts and doesn’t have an object representation for them. • It does so for service accounts.
  50. Authentication • Each request is coupled with one of: • A normal user. • A service account. • No identity (an anonymous request). • Built-in authentication can be either: • HTTP basic authentication. • Client certificate (supports user groups as of version 1.4). • Bearer-token. • Authentication Proxy.
  51. Webhook • You can configure a Webhook to handle bearer-tokens. • The configuration file is passed through the --authentication-token-webhook-config-file flag. • The webhook will receive a POST request with the token and should return a status field with the authentication result.
  52. Authentication Proxy • If there is an authentication proxy in your organization, you can configure K8s to acknowledge the specific headers set by the authentication proxy. • For example: --requestheader-username-headers=X-Original-User --requestheader-group-headers=X-Original-Group --requestheader-extra-headers-prefix=X-Original-Attribute-
  53. Authorization • There are several authorization modules that are shipped with K8s. • Configured through the --authorization-mode flag. • When multiple modules are specified, they are invoked serially, in the order given. • If a module rejects or accepts the request, no further module will be executed. • If no module has an opinion, the request is rejected.
  54. Request Attributes • A request can either be a resource-API request, or a non-resource request. • For non-resource requests, authorization concerns the HTTP verb and Request-path fields. • For resource requests, authorization concerns the API, API request verb, resource, namespace and API group fields. • Common fields to both are: user, group and extra (authentication provided attributes).
  55. Authorization Modules • K8s provides the following authorization modules: • ABAC. • RBAC. • Node. • Webhook.
  56. ABAC • Stands for: Attribute-Based Access Control. • The policy file should be specified with the flag: --authorization-policy-file. • The policy file holds one JSON object per line. • Changing the policy file requires a restart of the api-server. • Note that an unspecified property defaults to its zero value. • Let’s see a policy example…
  57. RBAC • Stands for: Role-Based Access Control. • When enabled, defines 4 object types: Role, ClusterRole, RoleBinding and ClusterRoleBinding. • Users can interact with these types just like any other K8s types (e.g., pods). • Let’s review them…
  58. Roles & ClusterRoles • Role objects specify permissions in a single namespace. • ClusterRole objects specify permissions cluster-wide (across all namespaces). • Example:
        kind: Role
        apiVersion:
        metadata:
          namespace: app1
          name: default-deployment
        rules:
        - apiGroups: ["apps"] # "" indicates the core API group
          resources: ["deployments"]
          verbs: ["get", "watch", "list", "create"]
  59. RoleBinding • Binds a role to one or more users. • As before, RoleBinding refers to a single namespace, whereas ClusterRoleBinding refers to all namespaces. • Think: what does it mean to bind a ClusterRole using a RoleBinding?
  60. RoleBinding • Example:
        kind: RoleBinding
        apiVersion:
        metadata:
          name: update-configmaps
          namespace: app1
        subjects:
        - kind: User
          name: shimi
          apiGroup:
        roleRef:
          kind: ClusterRole
          name: update-configmaps
  61. Subjects • Note that subjects in a Binding can be one of: users, groups or service accounts. • Groups are provided by the authentication methods. • For both users and groups, the prefix “system:” is reserved for K8s system use and should not be used for your own names.
  62. Authorization – Webhook • As mentioned before, a webhook is a REST extension to which K8s sends requests. • In this case, authorization requests. • Adding a webhook requires: • A configuration file for the webhook. • A service responding to SubjectAccessReview POST requests.
  63. Admission Control • Admission control components execute after the request has been authenticated and authorized. • They are built-in components that are compiled into the api-server. • They are specified as a flag to the api-server (--admission-control; newer releases use --enable-admission-plugins). • With --admission-control, order is important.
  64. Admission Control • Each admission-control component can operate in either (or both) of two phases: mutation and validation. • Mutating components can modify the object the request operates on. • For example: AlwaysPullImages (very useful in multi-tenancy scenarios).
  65. Extension Points • MutatingAdmissionWebhook – runs in the mutation phase. Invokes webhooks defined as MutatingWebhookConfiguration objects. Matching webhooks are invoked serially. • ValidatingAdmissionWebhook – runs in the validation phase. Can’t mutate the state. Matching webhooks are invoked in parallel. Configuration is based on ValidatingWebhookConfiguration objects. • ImagePolicyWebhook – Allows for reviewing container images that are requested to be used.
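A sketch of a ValidatingWebhookConfiguration object, using the admissionregistration.k8s.io/v1 API of current releases. The webhook name, the backing Service and its path are hypothetical, and the caBundle value is a placeholder to fill in.

```yaml
# Sketch only: names, namespace and path are assumed; caBundle is a placeholder.
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingWebhookConfiguration
metadata:
  name: pod-policy.example.com
webhooks:
- name: pod-policy.example.com
  admissionReviewVersions: ["v1"]
  sideEffects: None
  failurePolicy: Fail           # reject the request if the webhook cannot be reached
  rules:                        # which requests are sent to the webhook
  - apiGroups: [""]
    apiVersions: ["v1"]
    operations: ["CREATE"]
    resources: ["pods"]
  clientConfig:
    service:                    # hypothetical in-cluster Service running the webhook
      namespace: default
      name: pod-policy
      path: /validate
    caBundle: <base64-encoded CA certificate>
```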
  66. ImagePolicyWebhook • Requires a configuration file specified with the flag: --admission-control-config-file. • json or yaml format. Example:
        imagePolicy:
          kubeConfigFile: /Users/shimi/k8s/reviewer.yml
          # time in s to cache approval
          allowTTL: 50
          # time in s to cache denial
          denyTTL: 50
          # time in ms to wait between retries
          retryBackoff: 500
          # determines behavior if the webhook backend fails
          defaultAllow: false
  67. ImagePolicyWebhook • The admission-control configuration file points to the webhook configuration, a file in kubeconfig format:
        clusters:
        - name: image-review-server
          cluster:
            server: https://host1:9090/reviewer
        # users refers to the API server's webhook configuration.
        users:
        - name: kube-apiserver
          user:
            token: blue-token
        current-context: webhook
        contexts:
        - context:
            cluster: image-review-server
            user: kube-apiserver
          name: webhook
  68. ImageReview • Your webhook will receive a POST request with an ImageReview JSON document. • It must fill the status field with an allowed subfield set to either true or false.
  69. Objects & Resources • Objects are persisted entities in K8S. • Each object is composed of a spec and a status. • A spec is a “record of intent” (e.g., a pod spec). • A resource is a K8S endpoint that represents a collection of objects.
  70. Custom Resources • Custom resources allow you to incorporate third-party resources into K8S management. • The easiest way (albeit the less flexible one) of defining a custom resource is a CustomResourceDefinition (CRD). • Let’s see an example…
  71. CRD
        apiVersion:
        kind: CustomResourceDefinition
        metadata:
          # name must match the spec fields below, and be in the form: <plural>.<group>
          name:
        spec:
          group:
          version: v1
          # can be either Namespaced or Cluster
          scope: Namespaced
          names:
            plural: databases
            singular: database
            kind: DataBase
            shortNames:
            - db
            - dbs
  72. CRD • Now, kubectl “understands” databases. • E.g.: kubectl get dbs • And we can create resources of this custom type. • However, this only allows us to manage simple CRUD operations.
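An object of the custom type might then be created as sketched below. The slide leaves the API group blank, so example.com is a purely hypothetical stand-in, and the spec fields are invented for illustration.

```yaml
# Sketch only: the group "example.com" and all spec fields are hypothetical.
apiVersion: example.com/v1      # <group>/<version> of the custom resource
kind: DataBase                  # the kind declared under names in the definition
metadata:
  name: orders-db               # hypothetical name
spec:
  engine: postgres              # invented field
  storage: 10Gi                 # invented field
```

K8s only stores and serves such an object; acting on it requires a custom controller that watches the resource.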
