
Solve the colocation conundrum: Performance and density at scale with Kubernetes

As we move from monolithic applications to microservices, the ability to colocate workloads offers a tremendous opportunity to realize greater development velocity, robustness, and resource utilization. But workload colocation can also introduce performance variability and affect service levels. Google describes the problem as the “tail at scale”—the amplification of negative results observed at the tail of the latency curve when many systems are involved.

Intel has built an experiments framework to quantify the trade-offs between low latency and higher density. Niklas Nielsen discusses the challenges and complexities of workload colocation, why solving them matters to your business no matter its size, and how Intel intends to enable smarter resource allocation with its latest tooling capabilities and Kubernetes.


Solve the colocation conundrum: Performance and density at scale with Kubernetes

  1. Solve the colocation conundrum: Performance and density at scale with Kubernetes – Niklas Nielsen, Intel Corp
  2. Legal Notices and Disclaimers • Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Learn more at intel.com, or from the OEM or retailer. • No computer system can be absolutely secure. • Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance. • This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps. • The products described may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. • No license (express or implied, by estoppel or otherwise) to any intellectual property rights is granted by this document. • Intel does not control or audit third-party benchmark data or the web sites referenced in this document. You should visit the referenced web site and confirm whether referenced data are accurate. • Intel, Xeon, Atom, Core, and the Intel logo are trademarks of Intel Corporation in the United States and other countries. • *Other names and brands may be claimed as the property of others. • © 2017 Intel Corporation.
  3. Two Google searches
  4. Notice a difference?
  5. [Chart: breakdown of the two searches’ response times, from typing the first ‘O’ to the OSCON result being found: roughly 2 seconds versus more than 5 seconds]
  6. Let’s talk about microservices
  7. Everyone is pursuing microservice architectures
  8. Single outliers have a big impact at scale
  9. [Diagram: a monolithic service decomposed into µService A through µService E]
  10. Developer velocity, resiliency, scale
  11. The number of components increases linearly [chart: component count vs. number of services]
  12. The number of internal requests grows super-linearly
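  One hedged way to read slide 12’s claim: if each of n services may call any of the others, the number of distinct service-to-service connections is n(n−1)/2, so going from 5 to 10 services doubles the component count but grows the possible internal call paths from 10 to 45, roughly 4.5×.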
  13. A short experiment…
  14. With one hundred services involved…
  15. …one out of a hundred requests takes over one second… (1/100)
  16. It takes just one late response for the entire request to be slow. Come on, hurry up!
  17. How many users overall will experience a latency above one second? A: <30%, B: 30–60%, C: 60–100%
  18. C: 63% experience one second or worse! 28% of customers will not return to a slow site [1]. [1] 2016 Holiday Retail Insights Report
  19. P(>1s) = 1 – (1 – R)^N. With R = 1/100 and N = 3, P(>1s) = 2.9701%; with R = 1/100 and N = 100, P(>1s) = 63.3%.
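  To make slide 19 concrete, here is a minimal sketch in plain Go (no assumptions beyond the slide’s own numbers) that evaluates P(>1s) = 1 – (1 – R)^N for the two cases shown:

      package main

      import (
      	"fmt"
      	"math"
      )

      // Probability that at least one of n calls exceeds the latency threshold,
      // assuming each call does so independently with probability r.
      func tailProbability(r float64, n int) float64 {
      	return 1 - math.Pow(1-r, float64(n))
      }

      func main() {
      	fmt.Printf("N=3:   %.4f%%\n", 100*tailProbability(0.01, 3))   // 2.9701%
      	fmt.Printf("N=100: %.1f%%\n", 100*tailProbability(0.01, 100)) // ~63.4%, rounded to 63% on the slides
      }

  The jump from about 3% to about 63% is exactly the amplification that slide 20’s reference, “The tail at scale”, describes.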
  20. Jeffrey Dean and Luiz André Barroso. 2013. The tail at scale. Commun. ACM 56, 2 (February 2013), 74–80.
  21. Variability accumulates when more than one system serves a request
  22. [Chart: latency frequency distribution; 99% of requests sit at low latency, while the slowest 1% form the “tail”]
  23. With microservices, scaling is easy, but tail latency is hard to control
  24. You will have to deal with this
  25. What causes variability?
  26. Resource sharing
  27. Global / Local
  28. Aggressors (also called antagonizers or noisy neighbors) in best-effort tasks cause interference and contention, which in turn causes variability for high-priority tasks
  29. How have large infrastructure operators dealt with variability? Hedge your bets
  30.–34. [Diagram sequence across five slides: Server 1, Server 2, Server 3, Server 4]
  35. We built a tool to help you gain insight into the causes of variability: Swan
  36. [Chart: latency vs. load up to 100%, from the best case to the worst, measured against a latency objective]
  37. [Charts: latency at 10% and 100% load for the best case and under interference #1 and #2]
  38. experiment.go (pseudocode):
          for load := 10% 20% ... 100%
            for aggressor := A ... C
              for repetition := 1 ... 3
                start_kubernetes()
                start_memcached()
                sustain_QPS(load)
                record_metrics()
                start(aggressor)
          import swan
          experiment = Experiment(‘9F2DE9AF-177E-4E6F-A994-2FF59075448B’)
          experiment.profile()
      [Diagram: results flow through Snap into Cassandra]
  39. Why didn’t Kubernetes’ usual performance isolation protect the workload? It is not (only) a Kubernetes issue
  40. [Diagram: processes 1–3 time-sharing logical cores 1 and 2]
  41. cgroups CPU shares are the de facto CPU isolation in container schedulers [diagram: example cpu.shares weights such as 1024 and 2048]
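  A minimal sketch of the mechanism behind slide 41, using hypothetical share values rather than anything taken from the deck: under contention, the Linux CFS gives each runnable cgroup a slice of CPU time proportional to its cpu.shares weight, and that is all the isolation shares provide.

      package main

      import "fmt"

      // Under CPU contention, each runnable cgroup receives CPU time in
      // proportion to its cpu.shares weight relative to the sum of all weights.
      func cpuFractions(shares map[string]int) map[string]float64 {
      	total := 0
      	for _, s := range shares {
      		total += s
      	}
      	fractions := make(map[string]float64)
      	for name, s := range shares {
      		fractions[name] = float64(s) / float64(total)
      	}
      	return fractions
      }

      func main() {
      	// Hypothetical weights: one high-priority pod and two best-effort pods.
      	weights := map[string]int{"high-priority": 2048, "best-effort-1": 1024, "best-effort-2": 1024}
      	for name, f := range cpuFractions(weights) {
      		fmt.Printf("%-14s ~%2.0f%% of CPU time under contention\n", name, 100*f)
      	}
      }

  Note that shares only divide CPU time; they say nothing about the shared last-level cache or memory bandwidth, which is why slide 42 still holds: even a tiny slice of CPU time lets a best-effort task interfere with a latency-sensitive neighbor.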
  42. A tiny fraction of CPU time is enough to cause severe performance issues
  43. Modern CPUs are helping reduce the causes of this interference
  44. [Diagram: cores sharing the interconnect, the last-level cache, and memory bandwidth]
  45. Intel® Resource Director Technology is an umbrella: cache occupancy, memory bandwidth, cache allocation, code and data prioritization
  46. Results by scenario (baseline vs. experiment) at 10–100% load:
      Kubernetes QoS – Baseline: 49%, 46%, 53%, 48%, 64%, 73%, 98%, 108%, 131%, 113%; Experiment: 876%, 945%, 946%, 893%, 953%, 898%, 887%, 921%, 851%, 901%
      Core isolation – Baseline: 52%, 51%, 45%, 54%, 60%, 69%, 89%, 100%, 101%, 111%; Experiment: 167%, 504%, 458%, 521%, 545%, 917%, 948%, 878%, 886%, 971%
      Intel RDT – Baseline: 36%, 34%, 29%, 40%, 34%, 42%, 50%, 67%, 77%, 98%; Experiment: 31%, 31%, 30%, 37%, 47%, 50%, 65%, 84%, 346%, 353%
  47. Cache Allocation / Code Data Prioritization:
          # mount -t resctrl resctrl /sys/fs/resctrl
          # cd /sys/fs/resctrl
          # mkdir p0 p1
          # echo "L3:0=3" > /sys/fs/resctrl/p0/schemata
          # echo "L3:0=c" > /sys/fs/resctrl/p1/schemata
      [Diagram: the full L3 cache mask 0xf split into 0x3 for P0 and 0xc for P1]
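  To unpack the bitmasks on slide 47: each bit in the schemata capacity mask covers a portion (roughly a group of ways) of the L3 cache on cache ID 0, so 0x3 (binary 0011) gives resource group p0 the low portion of the cache and 0xc (binary 1100) gives p1 the high portion. Because the masks do not overlap, the two groups can no longer evict each other’s cache lines; together they cover 0x3 | 0xc = 0xf, the full cache in this simplified 4-bit example (real CPUs usually expose a wider mask).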
  48. Cache Allocation / Code Data Prioritization:
          # echo 1234 > /sys/fs/resctrl/p0/tasks
          # echo C0 > /sys/fs/resctrl/p1/cpus
      [Diagram: Core 0 through Core 3, partitions P0 and P1]
  49. Cache Allocation / Code Data Prioritization [diagram: a process’s code (fetched via the program counter) and data (heap, stack) flowing through the core’s L1 instruction/data caches and L2 into the shared L3]
  50. Cache Allocation / Code Data Prioritization:
          # mount -t resctrl resctrl -o cdp /sys/fs/resctrl
          # mkdir -p /sys/fs/resctrl/p0
          # echo "L3data:0=3" >> /sys/fs/resctrl/p0/schemata
          # echo "L3code:0=c" >> /sys/fs/resctrl/p0/schemata
      [Diagram: cores with private L1/L2 caches sharing the L3]
  51. Available in Linux 4.10: Cache Allocation and Code/Data Prioritization
  52. Cache occupancy / Memory bandwidth:
          # perf stat -e intel_cqm/llc_occupancy/ -I 1000 dd if=/dev/zero of=/dev/null
          #        time    counts  unit   events
           1.000128952   229,376  Bytes  intel_cqm/llc_occupancy/
           2.000280860   327,680  Bytes  intel_cqm/llc_occupancy/
           3.000444894   360,448  Bytes  intel_cqm/llc_occupancy/
           4.000580058   360,448  Bytes  intel_cqm/llc_occupancy/
  53. How do you use this number?
          $ lscpu
          ...
          L1d cache: 32K
          L1i cache: 32K
          L2 cache:  256K
          L3 cache:  12288K
      [Diagram: a process’s occupancy within the last-level cache]
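  Putting slides 52 and 53 together: the dd process occupies roughly 360,448 bytes of a 12288K (12,582,912-byte) L3 cache, about 3% of the LLC, so comparing llc_occupancy against the L3 size reported by lscpu tells you how much of the shared cache a workload is actually claiming and how much is left for its neighbors.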
  54. Cache occupancy / Memory bandwidth:
          # perf stat -e intel_cqm/local_bytes/ -I 1000 dd if=/dev/zero of=/dev/null
          #        time  counts  unit  events
           1.000129604    0.20    MB   intel_cqm/local_bytes/
           2.000284311    0.00    MB   intel_cqm/local_bytes/
           3.000426805    0.00    MB   intel_cqm/local_bytes/
           4.000560934    0.07    MB   intel_cqm/local_bytes/
  55. How do you use this number? [Diagram: a process’s share of memory bandwidth through the interconnect and last-level cache]
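  A hedged rule of thumb for slide 55, which the deck itself does not spell out: local_bytes per sampling interval is the workload’s DRAM traffic, and you can compare it against the socket’s theoretical peak memory bandwidth, roughly populated channels × transfer rate (MT/s) × 8 bytes per transfer (for example, two DDR4-2400 channels give about 2 × 2400 × 8 MB/s ≈ 38 GB/s), to judge how close the machine is to saturation and how much headroom remains for colocated tasks.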
  56. Cache occupancy / Memory bandwidth: Cache Monitoring Technology (CMT) is available in Linux 4.1; Memory Bandwidth Monitoring in Linux 4.6
  57. What’s next?
  58. Leaving you with four points
  59. The number of services involved in a request is increasing super-linearly
  60. The largest cluster users have dealt with accumulated variability for years
  61. Intel® helps by using priority to reduce the sources of variability through Intel® RDT
  62. Swan is a tool to understand the effects of interference and how to avoid it
  63. Swan is under the Apache 2.0 License and available for download today: https://github.com/intelsdi-x/swan. Read more about how to use Intel® RDT: https://github.com/01org/intel-cmt-cat/
  64. Thanks to all involved in this project: Maciej Iwanowski, Pawel Palucki, Szymon Konefal, Maciej Patelczyk, Michal Stachowski, Arek Chylinski and the rest of the Swan team; Andrew Herdich and the Intel RDT teams; Tony Luck, Fenghua Yu and the Intel Linux kernel teams
  65. Thank you!

Editor's Notes

  • How is everyone feeling?
    Seen some good talks by now?
    Just getting started?

    This is a not-so-gentle introduction to Kubernetes performance.
    The most important thing for me is that you understand and that I don’t lose you midway.
    So as we all have different levels of experience, feel free to shout out if something doesn’t make sense.
  • First off, since I am an Intel employee and this is a sponsor talk slot, I have to remind you of our legal notice.
    Mentions of our brand and legal protection, in general and for the contents of this talk
  • That aside, I want to conduct a small experiment
    I’m going to show you two Google searches and see if you can tell the difference
    Be aware, each one is only a few seconds. So I need you to pay close attention
  • An artificial 100ms delay per connection raised the response time from 2 to 5 seconds.

    I’ve tried to break down the response time here.

    Few seconds at each graph to slowly explain what the axes mean before diving into interpretation.
  • It might seem surprising, but 2.4 seconds is the sweet spot for users
    Another way to interpret this: in online retail, customers start to turn away after this amount of time
    User patience is steadily decreasing
    They expect an instantaneous response to even the most complicated queries
  • Consider graphic here
  • Consider graphic here
  • Maybe get some numbers
  • To give you an example of the interconnectedness, Netflix built a tool called Vizceral which samples network requests
  • Give options
  • Need to tie back to initial experiment
  • Every request is like flipping a coin

    Too information dense
    Include highlight

    Don’t explain the equation. Hard to talk to.
  • Insert reference

    Few seconds at each graph to slowly explain what the axes mean before diving into interpretation.

    Highlights?
    At Google scale this matters.
  • The reason this is called the tail at scale
  • Not only a problem for the largest companies in the world.
  • Similar to how these fellas are probably dragging their owner in different directions, each user and system is competing for access to resources in modern data centers.



  • Global
    Network oversubscription
    Queueing in leaf and spine switches

    Local
    Issue slots, L1 and L2, power budgets per core during SMT
    L3, memory bandwidth and power budget per socket
    I/O bandwidth
    Network links
    Kernel caches
  • Talk about what makes an application perform as desired and when it isn’t performing like we expect
  • Few seconds at each graph to slowly explain what the axes mean before diving into interpretation.

  • Sensitivity profiles have been used in academia to show how sensitive a workload is to co-location.
    Used to demonstrate performance isolation in research from Stanford and Google[2]

    Greener profiles indicate more resilience to interference
  • Networks in data centers have become so fast that memory access over the network can outperform disk access

    Have ‘cache clusters’ either of spare capacity or, more likely, dedicated to speed up the requests

    Normal pattern used by the largest sites
    Twitter, Facebook, Wikipedia

    We chose memcached as a high priority workload as it is notoriously hard to place anything next to.
  • Kubernetes co-location
  • Now, why is that?
  • Compute the fractions
    What the process scheduler does is find the process that is furthest away from its fair share and schedule it next.
  • What we call interference
  • Explain caches in a modern server CPU
  • These are done on a Xeon D 1541 platform with a single socket
    Linux is the operating system

    Highlights
    Core isolation alone is not enough
    CAT reduces the interference and keeps the SLA up to 80% load
  • Explain axis
  • Some applications are extremely sensitive to these kinds of workloads
    Online web search is one
  • Why does CDP matter?
  • Maybe a more realistic example
    Show what contention looks like
  • Maybe a more realistic example
    Show what contention looks like
  • TODO Split into 4 slides
  • TODO Split into 4 slides
  • TODO Split into 4 slides
  • How do you know how much to give to each partition?
  • Tying things together
  • Tying things together
  • Tying things together
  • Besides this
