2. Plan
• Docker: what next?
• Clusters and cluster management: why?
• Open-source orchestration solutions: Mesos
• Kubernetes; Kubernetes vs Mesos vs Swarm
• Kubernetes users, features, architecture and design
• Demo
• Conclusion
214/12/2017
3. Docker: what next?
We need more than just packing and isolation
Scheduling: where should my containers run?
Lifecycle and health check: keep my containers running despite
failures
Discovery: where are my containers now?
Monitoring: what's happening with my containers?
Aggregates: compose sets of containers into jobs
Scaling: making jobs bigger or smaller
Running a server cluster as a set of Docker containers
on a single Docker host leaves a single point of
failure!
4. Cluster ?
A cluster is a group of
servers and other
resources that act like
a single system and
enable high availability
and, in some cases, load
balancing and parallel
processing
5. Why a cluster ?!
Storage
High availability (active-passive cluster)
Load balancing (active-active cluster)
High performance
6. Cluster manager
A cluster manager is usually a
backend graphical user interface (GUI) or
command-line software that runs on one
or all cluster nodes (in some cases it runs
on a different server or on a cluster of
management servers).
7. Open source clustering solutions
• Apache Mesos, from the Apache Software
Foundation
• Kubernetes, started by Google, now hosted by the
Cloud Native Computing Foundation
• Linux Cluster Manager (LCM)
• Heartbeat, of Linux-HA
• OpenHPC
8. Mesos
Apache Mesos is a cluster manager that provides
efficient resource isolation and sharing across
distributed applications or frameworks. Mesos is
open-source software originally developed at the
University of California, Berkeley. It sits between
the application layer and the operating system and
makes it easier to deploy and manage applications
efficiently in large-scale clustered environments.
Prominent users of Mesos
include Twitter, Airbnb, MediaCrossing, Xogito and
Categorize.
9. Kubernetes
• container orchestrator
• supports multiple container technologies
• supports multiple cloud and bare-metal
environments
• supports existing OSS apps
• 100% open source
• written in Go
• provides load-balancing, auto-healing and
scaling features
• started by Google in 2014; contributors now include Google,
CoreOS, Red Hat, Mesosphere and Microsoft
11. Kubernetes vs Mesos vs Swarm
Swarm: Easy to integrate and set up, flexible API, but limited customization.
Up to 1,000 nodes and 50,000 containers. Great value and easily scalable for small to
medium systems.
Kubernetes: Highly versatile, large open-source dev community, but more
expensive. Up to 1,000 nodes and 50,000 containers. Best for medium-scale, highly
redundant systems, but requires a larger IT staff.
Mesos: Best for large systems and designed for maximum
redundancy. Up to 50,000 nodes. The most stable platform, but overly complex
for small-scale systems under 10-20 nodes.
14. Kubernetes Vocabulary
● Pod - A group of containers
● Labels - Key/value pairs for identifying pods
● Kubelet - Container Agent
● Proxy - A load balancer for Pods
● etcd - A metadata service
● cAdvisor - Container Advisor provides resource
usage/performance statistics
● Replication Controller - Manages replication
of pods
● Scheduler - Schedules pods in worker nodes
● API Server - Kubernetes API server
18. Kubernetes POD
Group of one or more containers that are
always co-located,
co-scheduled, and run in a shared context
● Containers in the same pod have the same
hostname
● Each pod is isolated by
○ Process ID (PID) namespace
○ Network namespace
○ Interprocess Communication (IPC)
namespace
○ Unix Time Sharing (UTS) namespace
● Alternative to a VM with multiple processes
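As a rough illustration of "one or more containers in a shared context", a pod with two containers can be written as a manifest. The sketch below builds one as a Python dictionary and prints it as JSON. The pod name, labels, container names and images are hypothetical; only the top-level fields (`apiVersion`, `kind`, `metadata`, `spec.containers`) follow the Kubernetes Pod object.

```python
import json

# Hypothetical pod: names, labels and images are illustrative assumptions.
pod_manifest = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web", "labels": {"app": "web"}},
    "spec": {
        # Both containers are scheduled together and share the pod's context.
        "containers": [
            {"name": "nginx", "image": "nginx"},
            {"name": "log-shipper", "image": "busybox"},
        ]
    },
}


def container_names(manifest):
    """Return the names of the containers declared in a pod manifest."""
    return [c["name"] for c in manifest["spec"]["containers"]]


print(json.dumps(pod_manifest, indent=2))
print(container_names(pod_manifest))
```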
19. Labels / selectors
Labels are key/value pairs attached to objects
Queryable by selectors
• think SQL ‘select ... where ...’
Used to determine which objects to
apply an operation to
• pods under a
ReplicationController
• pods in a Service
• capabilities of a node (scheduling
constraints)
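The "select ... where ..." analogy can be sketched in a few lines of Python. This is a toy equality-based matcher, not the Kubernetes implementation; the pod names and label values are made up for the example.

```python
def matches(selector, labels):
    """True if every key/value pair in the selector appears in the labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def select(objects, selector):
    """Return the names of the objects whose labels satisfy the selector."""
    return [obj["name"] for obj in objects if matches(selector, obj["labels"])]


# Hypothetical pods with their labels.
pods = [
    {"name": "web-1", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "web-2", "labels": {"app": "web", "tier": "frontend"}},
    {"name": "db-1", "labels": {"app": "db", "tier": "backend"}},
]

print(select(pods, {"app": "web"}))
print(select(pods, {"app": "db", "tier": "backend"}))
```

A Service or a ReplicationController would use the same kind of query to decide which pods belong to it.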
20. Services
An abstraction that defines a logical set of Pods
and a policy by which to access them
21. Controllers
• A controller is a reconciliation loop that drives actual cluster
state toward the desired cluster state
• A ReplicationController ensures that a specified number of pod
replicas are running at any one time. In other words, a
ReplicationController makes sure that a pod or a homogeneous
set of pods is always up and available.
• A DaemonSet controller runs exactly one pod on every
machine (or on some subset of machines)
• A Job creates one or more pods and ensures that a specified
number of them terminate successfully
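The reconciliation idea behind a ReplicationController can be sketched as a function that compares the desired replica count with what is running and closes the gap. This is a simplified model under stated assumptions: pods are plain strings, and the `pod-N` naming scheme is invented for the example.

```python
def reconcile(desired_replicas, running_pods, next_id):
    """One reconciliation pass: start or stop pods to reach the desired count.

    Returns the new pod list and the next unused pod id.
    """
    pods = list(running_pods)
    while len(pods) < desired_replicas:
        pods.append(f"pod-{next_id}")  # start a replacement pod
        next_id += 1
    while len(pods) > desired_replicas:
        pods.pop()  # stop a surplus pod
    return pods, next_id


# Start from nothing: three pods are created.
pods, next_id = reconcile(3, [], 0)
print(pods)

# Simulate a failure, then reconcile again: a replacement appears.
pods.remove("pod-1")
pods, next_id = reconcile(3, pods, next_id)
print(pods)
```

A real controller runs this comparison continuously against the state stored in the cluster rather than on demand.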
Storage – Storage clustering is the use of two or more storage servers working together to increase performance, capacity or reliability.
High availability (active-passive cluster) – A high availability cluster is a group of hosts that act like a single system and provide continuous uptime. High availability clusters are often used for load balancing, backup and failover purposes.
Load balancing (active-active cluster) – Load balancing scales the performance of server-based programs, such as a web server, by distributing client requests across multiple servers.
High performance – As the name says, these clusters are used to achieve high performance.
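To make the load-balancing note concrete, here is a minimal round-robin sketch in Python. Round-robin is only one possible distribution policy, chosen here for simplicity, and the server names are hypothetical.

```python
import itertools
from collections import Counter


class RoundRobinBalancer:
    """Hands out backends in a fixed rotating order."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(list(backends))

    def pick(self):
        """Return the backend that should serve the next request."""
        return next(self._cycle)


balancer = RoundRobinBalancer(["server-a", "server-b", "server-c"])
assignments = [balancer.pick() for _ in range(6)]
counts = Counter(assignments)

print(assignments)
print(counts)
```

With six requests and three servers, each server receives two requests.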
Clusters are usually deployed to improve performance and availability over that of a single computer, while typically being much more cost-effective than single computers of comparable speed or availability.[3]
Computer clusters emerged as a result of convergence of a number of computing trends including the availability of low-cost microprocessors, high-speed networks, and software for high-performance distributed computing. They have a wide range of applicability and deployment, ranging from small business clusters with a handful of nodes to some of the fastest supercomputers in the world such as IBM's Sequoia.[4]
The cluster manager works together with a cluster management agent. These agents run on each node of the cluster to manage and configure services, a set of services, or to manage and configure the complete cluster server itself (see super computing.) In some cases the cluster manager is mostly used to dispatch work for the cluster (or cloud) to perform. In this last case a subset of the cluster manager can be a remote desktop application that is used not for configuration but just to send work and get back work results from a cluster. In other cases the cluster is more related to availability and load balancing than to computational or specific service clusters.
Box: "Kubernetes has the opportunity to be the new cloud platform. The amount of innovation that's going to come from being able to standardize on Kubernetes as a platform is incredibly exciting - more exciting than anything I've seen in the last 10 years of working on the cloud."
Box (formerly Box.net), based in Redwood City, California, is a cloud content management and file sharing service for businesses. The company uses a freemium business model to provide cloud storage and file hosting for personal accounts and businesses.[6] Official clients and apps are available for Windows, macOS, and several mobile platforms. Box was founded in 2005.
Automatic binpacking
Automatically places containers based on their resource requirements and other constraints, while not sacrificing availability. Mix critical and best-effort workloads in order to drive up utilization and save even more resources.
Self-healing
Restarts containers that fail, replaces and reschedules containers when nodes die, kills containers that don't respond to your user-defined health check, and doesn't advertise them to clients until they are ready to serve.
Horizontal scaling
Scale your application up and down with a simple command, with a UI, or automatically based on CPU usage.
Service discovery and load balancing
No need to modify your application to use an unfamiliar service discovery mechanism. Kubernetes gives containers their own IP addresses and a single DNS name for a set of containers, and can load-balance across them.
Batch execution
In addition to services, Kubernetes can manage your batch and CI workloads, replacing containers that fail, if desired.
Secret and configuration management
Deploy and update secrets and application configuration without rebuilding your image and without exposing secrets in your stack configuration.
Automated rollouts and rollbacks
Kubernetes progressively rolls out changes to your application or its configuration, while monitoring application health to ensure it doesn't kill all your instances at the same time. If something goes wrong, Kubernetes will roll back the change for you. Take advantage of a growing ecosystem of deployment solutions.
Storage orchestration
Automatically mount the storage system of your choice, whether from local storage, a public cloud provider such as GCP or AWS, or a network storage system such as NFS, iSCSI, Gluster, Ceph, Cinder, or Flocker.
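Of the features listed above, automated rollouts and rollbacks lend themselves to a small sketch. The function below is a toy model, not how Kubernetes implements rollouts: instances are version strings, they are replaced one at a time, and the `is_healthy` callback stands in for a real health check.

```python
def rolling_update(instances, new_version, is_healthy):
    """Replace instances one at a time; restore the old list on failure.

    Returns (resulting_instances, succeeded).
    """
    original = list(instances)
    current = list(instances)
    for i in range(len(current)):
        current[i] = new_version
        if not is_healthy(current[i]):
            return original, False  # roll back to the previous state
    return current, True


# A healthy release replaces every instance.
ok_result = rolling_update(["v1", "v1", "v1"], "v2", lambda v: True)
print(ok_result)

# A release that fails its health check is rolled back.
bad_result = rolling_update(["v1", "v1", "v1"], "v3", lambda v: v != "v3")
print(bad_result)
```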
To be explained on the board
The Kubernetes Master is the main controlling unit of the cluster: it manages the cluster's workload and directs communication across the system. The Kubernetes control plane consists of various components, each its own process, that can run either on a single master node or on multiple masters supporting high-availability clusters.[30] The various components of the Kubernetes control plane are as follows:
API server
The API server is a key component and serves the Kubernetes API using JSON over HTTP, which provides both the internal and external interface to Kubernetes.[22][31] The API server processes and validates REST requests and updates state of the API objects in etcd, thereby allowing clients to configure workloads and containers across Worker nodes.
Controller manager
The controller manager is the process that the core Kubernetes controllers like DaemonSet Controller and Replication Controller run in. The controllers communicate with the API server to create, update and delete the resources they manage (pods, service endpoints, etc.).[31]
Scheduler
The scheduler is the pluggable component that selects which node an unscheduled pod (the basic entity managed by the scheduler) should run on, based on resource availability. The scheduler tracks resource utilization on each node to ensure that workload is not scheduled in excess of the available resources. For this purpose, the scheduler must know the resource requirements, resource availability and a variety of other user-provided constraints and policy directives such as quality-of-service, affinity/anti-affinity requirements, data locality and so on. In essence, the scheduler’s role is to match resource "supply" to workload "demand".[32]
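The "supply versus demand" matching can be sketched as a filter-then-pick function. This is a deliberately simplified model: it considers only CPU (in millicores) and memory (in MiB), and the rule of preferring the node with the most free CPU is an assumption made for the example, not the Kubernetes scheduling policy. Node names and capacities are hypothetical.

```python
def schedule(pod, nodes):
    """Place a pod on a node with enough free resources, or return None."""
    # Filter: keep only nodes that can satisfy the pod's requests.
    feasible = [
        node for node in nodes
        if node["free_cpu"] >= pod["cpu"] and node["free_mem"] >= pod["mem"]
    ]
    if not feasible:
        return None
    # Pick: prefer the node with the most free CPU (illustrative policy).
    best = max(feasible, key=lambda node: node["free_cpu"])
    # Reserve the resources so later pods see the reduced supply.
    best["free_cpu"] -= pod["cpu"]
    best["free_mem"] -= pod["mem"]
    return best["name"]


nodes = [
    {"name": "node-1", "free_cpu": 2000, "free_mem": 4096},
    {"name": "node-2", "free_cpu": 4000, "free_mem": 2048},
]

first = schedule({"cpu": 1000, "mem": 3000}, nodes)   # only node-1 has the memory
second = schedule({"cpu": 3000, "mem": 1024}, nodes)  # only node-2 has the CPU
third = schedule({"cpu": 2000, "mem": 1024}, nodes)   # nothing fits any more
print(first, second, third)
```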
etcd
etcd is a persistent, lightweight, distributed, key-value data store developed by CoreOS that reliably stores the configuration data of the cluster, representing the overall state of the cluster at any given point of time. Other components watch for changes to this store to bring themselves into the desired state.[30]
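The "watch for changes" pattern can be illustrated with a toy in-memory store. This is not the etcd API, and it is neither distributed nor persistent; the class name `WatchableStore`, its methods and the key layout are all invented for the example.

```python
class WatchableStore:
    """In-memory key-value store that notifies watchers on every write."""

    def __init__(self):
        self._data = {}
        self._watchers = []

    def watch(self, prefix, callback):
        """Call callback(key, value) whenever a key under prefix is written."""
        self._watchers.append((prefix, callback))

    def put(self, key, value):
        """Store a value and notify every watcher whose prefix matches."""
        self._data[key] = value
        for prefix, callback in self._watchers:
            if key.startswith(prefix):
                callback(key, value)

    def get(self, key):
        """Return the stored value, or None if the key is absent."""
        return self._data.get(key)


store = WatchableStore()
events = []
store.watch("/pods/", lambda key, value: events.append((key, value)))

store.put("/pods/web-1", "Running")
store.put("/nodes/node-1", "Ready")  # not under /pods/, so no event

print(events)
```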
Kubernetes node
The Node, also known as Worker or Minion, is the single machine (or virtual machine) where containers (workloads) are deployed. Every node in the cluster must run the container runtime (such as Docker), as well as the components described below, which communicate with the master and handle the network configuration of these containers.
Kubelet
The kubelet is responsible for the running state of each node (that is, ensuring that all containers on the node are healthy). It takes care of starting, stopping, and maintaining application containers (organized into pods) as directed by the control plane.[22][33]
The kubelet monitors the state of each pod; if a pod is not in the desired state, it is redeployed to the same node. The node status is relayed every few seconds via heartbeat messages to the master. Once the master detects a node failure, the Replication Controller observes this state change and launches pods on other healthy nodes.
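Heartbeat-based failure detection can be sketched as a timeout check on the last message received from each node. The timestamps, node names and the 40-second timeout below are assumptions for illustration, not Kubernetes defaults.

```python
def failed_nodes(last_heartbeat, now, timeout):
    """Return the nodes whose last heartbeat is older than the timeout."""
    return sorted(
        node for node, seen in last_heartbeat.items() if now - seen > timeout
    )


# Hypothetical timestamps (in seconds) of the last heartbeat per node.
last_heartbeat = {"node-1": 100.0, "node-2": 58.0, "node-3": 95.0}

down = failed_nodes(last_heartbeat, now=102.0, timeout=40.0)
print(down)
```

Pods that were running on the nodes reported here are the ones a controller would then recreate elsewhere.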
Kube-proxy
The kube-proxy is an implementation of a network proxy and a load balancer, and it supports the service abstraction along with other networking operations.[22] It is responsible for routing traffic to the appropriate container based on the IP and port number of the incoming request.
cAdvisor
cAdvisor is an agent that monitors and gathers resource usage and performance metrics such as CPU, memory, file and network usage of containers on each node.
Replication controller: it also handles creating replacement pods if the underlying node fails.
The set of pods that a controller manages is determined by label selectors that are part of the controller’s definition.