1. Analysis of Performance of Docker for Varying I/O Intensive Workloads
Kaushik Padmanabhan
Chander Mohan
David Stewart
Under the supervision of Professor Dr. Ningfang Mi
2. Docker
An alternative to virtual machines.
Provides a way to run applications securely, isolated in a container, packaged
with all their dependencies and libraries.
Docker provides Docker Hub (a Docker registry), which hosts the images of the
applications we want to run.
Each image can have numerous containers holding homogeneous instances of
that particular application.
6. Abstract
In this project, we consider homogeneous instances of a particular
application, running in Docker containers created from that application's image,
each of which can hold numerous records of data.
Using this setup, we evaluate the optimal number of containers running the
same application image, and how performance is affected by the number of
records held by the application instances within those containers.
7. Performance analysis tools
blktrace is a block-layer I/O tracing mechanism that provides detailed
information about request queue operations up to user space.
blktrace
A utility that transfers event traces from the kernel either into long-term
on-disk storage or directly to formatted output (via blkparse).
blkparse
A utility that formats events stored in files or, when run in live mode,
directly outputs the data collected by blktrace.
iostat
Reports Central Processing Unit (CPU) statistics and input/output statistics for
devices and partitions.
tpmc
Reports the number of transactions completed in a database per minute.
9. Our Method
We deploy MySQL images in Docker containers and load them with the TPC-C
benchmark.
The goals of our project are:
To collect blktrace output and study the performance correlation of some
features extracted from it.
To observe the optimal number of containers in terms of performance metrics.
We captured the following features of logical block addresses from blktrace
for performance measurement:
Frequency.
Sequentiality.
Block Size.
Read/Write.
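Three of the features listed above (frequency, block size and read/write) can be pulled out of blkparse's text output with a short script. The sketch below is only an illustration, not the project's actual tooling. It assumes blkparse's default per-event line layout (device, CPU, sequence, timestamp, PID, action, RWBS, start sector, "+", block count), assumes 512-byte sectors, and counts only queue (Q) events. The names IoEvent, parse_line and summarize are hypothetical.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Optional

SECTOR_BYTES = 512  # assumption: blkparse sectors are 512 bytes


class IoEvent(NamedTuple):
    action: str   # blktrace action code, e.g. Q (queued), D (issued), C (completed)
    rwbs: str     # operation flags, e.g. R, W, WS, RA
    sector: int   # starting sector (the logical block address)
    nblocks: int  # request length in sectors


def parse_line(line: str) -> Optional[IoEvent]:
    """Parse one default-format blkparse event line; return None for other lines."""
    parts = line.split()
    # Expected fields: dev cpu seq time pid action rwbs sector + nblocks [process]
    if len(parts) < 10 or parts[8] != "+":
        return None
    try:
        return IoEvent(parts[5], parts[6], int(parts[7]), int(parts[9]))
    except ValueError:
        return None


def summarize(lines: Iterable[str], action: str = "Q") -> dict:
    """Collect frequency, block size and read/write features for one action code."""
    events = [e for e in map(parse_line, lines)
              if e is not None and e.action == action]
    reads = sum(1 for e in events if "R" in e.rwbs)
    writes = sum(1 for e in events if "W" in e.rwbs)
    return {
        "entries": len(events),
        "total_bytes": sum(e.nblocks for e in events) * SECTOR_BYTES,
        "reads": reads,
        "writes": writes,
        "read_write_ratio": reads / writes if writes else None,
        # How often each starting LBA is accessed.
        "lba_frequency": Counter(e.sector for e in events),
        # Distribution of request sizes in bytes.
        "block_sizes": Counter(e.nblocks * SECTOR_BYTES for e in events),
    }
```

Counting Q events is one possible choice; counting issued (D) or completed (C) events instead only requires passing a different action code.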
10. Benchmark Used
TPC-C Benchmark
TPC-C is a write-intensive test.
We downloaded the TPC-C benchmark (tpcc_mysql_master) to load MySQL and
measure its performance.
tpcc_load – Loads the records into the tables, so no manual loading is
needed.
tpcc_start – Starts the benchmark run against the database and measures the
transactions executed over the specified connections.
11. Considerations
6 containers were used, all created from the mysql:5.7 image.
The TPC-C benchmark ran with:
Container1: ramp up time – 900 secs and measure time 3600 secs
Container2: ramp up time – 840 secs and measure time 3600 secs
Container3: ramp up time – 780 secs and measure time 3600 secs
Container4: ramp up time – 720 secs and measure time 3600 secs
Container5: ramp up time – 660 secs and measure time 3600 secs
Container6: ramp up time – 600 secs and measure time 3600 secs
blktrace was collected for 1000 secs and iostat for 100 secs.
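The ramp-up times above follow a simple pattern: each container ramps up 60 secs less than the previous one, starting from 900 secs. The sketch below reproduces that schedule and builds matching tracing command lines. Several parts are assumptions rather than details from the slides: the device name /dev/sda, the output basename "trace", the 1-second iostat interval, and the helper names. The -r (ramp-up) and -l (measure time) options are those of tpcc-mysql's tpcc_start and should be checked against the version in use.

```python
MEASURE_SECS = 3600        # measure time for every container (from the slide)
FIRST_RAMP_UP_SECS = 900   # ramp-up time of Container1 (from the slide)
RAMP_UP_STEP_SECS = 60     # each later container ramps up 60 secs less
BLKTRACE_SECS = 1000       # blktrace collection window (from the slide)
IOSTAT_SECS = 100          # iostat collection window (from the slide)


def ramp_up_secs(container: int) -> int:
    """Ramp-up time for container 1..6, following the pattern on the slide."""
    return FIRST_RAMP_UP_SECS - RAMP_UP_STEP_SECS * (container - 1)


def run_plan(n_containers: int = 6) -> list:
    """One entry per container with its timing arguments for tpcc_start."""
    plan = []
    for c in range(1, n_containers + 1):
        ramp = ramp_up_secs(c)
        plan.append({
            "container": c,
            "ramp_up_secs": ramp,
            "measure_secs": MEASURE_SECS,
            # Only the timing options are shown; connection options are omitted.
            "tpcc_start_timing_args": ["-r", str(ramp), "-l", str(MEASURE_SECS)],
        })
    return plan


def tracing_commands(device: str = "/dev/sda", iostat_interval_secs: int = 1):
    """Build argv lists for blktrace and iostat; the device is a placeholder."""
    blktrace = ["blktrace", "-d", device, "-w", str(BLKTRACE_SECS), "-o", "trace"]
    reports = IOSTAT_SECS // iostat_interval_secs
    iostat = ["iostat", "-x", str(iostat_interval_secs), str(reports)]
    return blktrace, iostat
```

The command lists are only built here, not executed; they could be passed to subprocess.run on a host where blktrace and sysstat are installed.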
12. Process Flow
We pulled the MySQL image and created the containers from it.
We created a database and imported ‘create_table.sql’ and ‘add_foreign_key.sql’
into the database.
We ran the tpcc_load command to load the data into the tables in each container.
We then ran tpcc_start to start executing transactions against the tables.
Simultaneously, we measured the performance of the containers using blktrace,
blkparse and iostat commands.
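The container and schema setup steps above might look roughly like the following. This is a sketch under stated assumptions, not the commands actually used: the container names, host ports, root password and database name are all hypothetical, and only the image tag and the two SQL file names come from the slides. tpcc_load and tpcc_start would then be run against each container.

```python
IMAGE = "mysql:5.7"  # image tag from the slides


def setup_commands(index: int,
                   root_password: str = "example",  # hypothetical
                   database: str = "tpcc",          # hypothetical
                   base_port: int = 3306) -> list:
    """Shell commands to create container `index` and import the schema."""
    name = f"mysql-tpcc-{index}"  # hypothetical naming scheme
    port = base_port + index      # hypothetical: one host port per container
    mysql = f"mysql -h 127.0.0.1 -P {port} -u root -p{root_password}"
    return [
        # Start a MySQL server container from the image, in the background.
        f"docker run --name {name} -e MYSQL_ROOT_PASSWORD={root_password} "
        f"-p {port}:3306 -d {IMAGE}",
        # The server needs some time to start before it accepts connections.
        f'{mysql} -e "CREATE DATABASE {database}"',
        f"{mysql} {database} < create_table.sql",
        f"{mysql} {database} < add_foreign_key.sql",
    ]


if __name__ == "__main__":
    for i in range(1, 7):
        print("\n".join(setup_commands(i)))
```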
13. Performance Metrics Considered
Variation of total I/O entries and size with an increasing number of
containers (blktrace, blkparse and iostat).
CPU utilization with an increasing number of containers (iostat).
Frequency of logical block address (LBA) accesses with an increasing number of
containers (blktrace and blkparse).
Variation of the percentage of sequential and random reads/writes with an
increasing number of containers (blktrace and blkparse).
Variation of the read-to-write ratio with an increasing number of containers
(blktrace and blkparse).
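The slides do not define how sequential and random accesses are told apart. One common convention, assumed here, is to call a request sequential when it starts exactly where the previous request in the trace ended, and random otherwise. The sketch below applies that rule to a list of (start sector, length in sectors) pairs; the function name and input format are hypothetical.

```python
def sequential_random_percent(requests):
    """Return (sequential %, random %) for requests given in trace order.

    `requests` is a list of (start_sector, nblocks) pairs. A request counts as
    sequential when its start sector equals the previous request's end sector.
    The first request has no predecessor and is not classified.
    """
    if len(requests) < 2:
        return 0.0, 0.0
    sequential = 0
    prev_end = requests[0][0] + requests[0][1]
    for start, nblocks in requests[1:]:
        if start == prev_end:
            sequential += 1
        prev_end = start + nblocks
    classified = len(requests) - 1
    seq_pct = 100.0 * sequential / classified
    return seq_pct, 100.0 - seq_pct
```

For example, the trace [(100, 8), (108, 8), (116, 8), (500, 8), (508, 8)] has three sequential requests out of four classified ones, so the function returns (75.0, 25.0).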
14. Total I/O Entries and Size comparison - blktrace
Total I/O size (chart: size in bytes vs. no. of containers)
No. of Containers   1         2         3         4         5         6
Size in Bytes       21291760  22986336  23130872  22681392  22737824  22608464
Total number of I/O Entries (chart: number of I/O entries vs. no. of containers)
No. of Containers      1       2       3       4       5       6
Number of I/O Entries  412859  440947  447577  431115  432504  431306
15. Total I/O Size and Service Time – iostat for 100sec
Service time (chart: time in seconds vs. no. of containers)
No. of Containers  1     2     3     4    5     6
Time in Seconds    2.36  2.15  2.06  2.4  2.39  2.41
Total I/O Size (chart: size in bytes vs. no. of containers)
No. of Containers  1        2         3       4         5         6
Size in Bytes      89964.4  97728.44  102075  99165.93  97461.64  97059.68
17. Inferences Made
Using blktrace
The number of I/O entries and the total I/O size both saturate.
The I/O entries made (reads and writes) stop growing once 3 containers run
simultaneously.
Using iostat
The total I/O size saturates in the same way.
The change in service time (svctm), the time an I/O takes to perform a
read/write as reported by iostat, helps explain the saturation of the I/O
size and entries.
Using tpmc
We analyzed the tpmc metric with an increasing number of containers.
The analysis shows that tpmc saturates after running 3 containers, which is
consistent with the saturation of I/O entries and size.
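The saturation point can be checked directly against the blktrace values reported on slide 14. The short sketch below computes the percentage change from each container count to the next and finds where each series peaks; the function names are hypothetical, and the numbers are the ones shown in the charts.

```python
# Values read from the blktrace charts on slide 14, keyed by number of containers.
IO_ENTRIES = {1: 412859, 2: 440947, 3: 447577, 4: 431115, 5: 432504, 6: 431306}
IO_SIZE_BYTES = {1: 21291760, 2: 22986336, 3: 23130872,
                 4: 22681392, 5: 22737824, 6: 22608464}


def percent_changes(series: dict) -> dict:
    """Percentage change from each container count to the next one."""
    keys = sorted(series)
    return {b: 100.0 * (series[b] - series[a]) / series[a]
            for a, b in zip(keys, keys[1:])}


def peak_container(series: dict) -> int:
    """Number of containers at which the series reaches its maximum."""
    return max(series, key=series.get)


if __name__ == "__main__":
    for name, series in (("entries", IO_ENTRIES), ("bytes", IO_SIZE_BYTES)):
        print(name, "peak at", peak_container(series), "containers")
        for n, change in percent_changes(series).items():
            print(f"  {n - 1} -> {n} containers: {change:+.2f}%")
```

Both series peak at 3 containers; going from 1 to 2 containers adds roughly 7% more I/O entries, while from 4 containers onward the change between consecutive container counts stays below 1%.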
21. Inferences Made
The frequency of LBA reuse on the secondary storage device decreases.
This means fewer LBA hits on the secondary storage device,
which suggests that the cache is being utilized effectively.
A decrease in frequency corresponds to a rise in cache hits when more
containers use redundant data.
24. Inferences Made
Due to the prefetch policy in cache management, the number of sequential
accesses to the secondary storage device decreases.
This leads to an increase in random accesses to the secondary storage
device at the same point.
25. Read to Write Ratio
Read/Write (chart: read/write ratio vs. no. of containers)
No. of Containers  1         2          3        4           5        6
Read/Write ratio   0.000475  0.0007309  0.00147  0.00204322  0.00212  0.003018
28. Inferences Made
▶ Since the data has already been written for the previous containers, adding
more containers increases reads relative to writes.
▶ As the number of containers increases, the cache between the MySQL
containers and the hard drive is used more, which decreases the actual
writes to secondary storage.
▶ Reads still increase, though, because of cache misses, which lead to more
random accesses on the secondary storage device.
29. Conclusion
The default system cache does a good job.
The performance of Docker containers was measured for different metrics with
an increasing number of containers.
We found that the different metrics are affected differently.
A cache designed specifically for containers, as in the Slacker paper, may
boost performance further.
It may be especially helpful with a large number of containers of the same image.
Future Work