The distribution of Docker images can be done in a couple
of ways. One is through the docker-cli: the image is saved into a
tarball, which can then be untarred and registered on the
destination host. The other is by using an independent
application called Docker Registry. Docker Registry is a
centralized store that keeps Docker images. Some well-known
Docker registries are Docker Hub, Quay.io and Google
Container Registry. All of these registries host popular Docker
images such as Alpine, Ubuntu, Redis, Mongo and more. Users
can create a repository and push their images to the registry,
where images are separated by repository_name, image_name and
tags. The repository_name is an arbitrary name given by
the user, while a tag is an identifier that differentiates images
sharing the same image name. If an image push request has the
same name and tag as an existing image, the existing image is
replaced by the newly pushed one. Hence, tags are crucial for
telling apart different versions of the same image.
Behind the Registry are two main API operations that are used to
push and pull Docker images. We describe what each of these API
calls does:
PULL - When a pull command is issued from the Docker
daemon, a manifest file is fetched using the GET method. This
manifest file is then used to determine whether a layer is available
locally or needs to be pulled from the Registry. A HEAD
method is used to check for available layers in the Registry.
When a layer needs to be pulled from the Registry, a GET
method is issued to the Registry blob (a compressed
layer in the Registry), which is then downloaded and extracted
on the host [8].
PUSH - In order to upload an image from the local host to the
Registry, a push command is issued from the Docker daemon.
This command is the reverse of the PULL command. Using
the manifest created during image build, a HEAD method is
issued to the Registry to check which layers are already
available there. If a layer is not found in the Registry, a
POST method is issued to obtain a uuid (unique upload
identifier), and the upload is then performed using chunked
transfers with the PUT method. After all the layers are uploaded,
the manifest is uploaded to the Registry [8].
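As a rough illustration of the layer checks described above, the following Python sketch models only the decision logic of PULL and PUSH, without any HTTP traffic. The manifest is reduced to a list of layer digests, and the function names and digests are hypothetical; a set lookup stands in for the HEAD request.

```python
from typing import Iterable, List, Set


def layers_to_pull(manifest_layers: Iterable[str], local_layers: Set[str]) -> List[str]:
    """Return the layer digests listed in the manifest but missing locally."""
    return [digest for digest in manifest_layers if digest not in local_layers]


def layers_to_push(manifest_layers: Iterable[str], registry_layers: Set[str]) -> List[str]:
    """Return the layer digests that the Registry does not hold yet.

    In the real protocol this question is answered by one HEAD request per
    layer; here a set lookup stands in for that request.
    """
    return [digest for digest in manifest_layers if digest not in registry_layers]


# Hypothetical digests, shortened for readability.
manifest = ["sha256:aaa", "sha256:bbb", "sha256:ccc"]

# PULL: the host already holds the first layer, so only the rest is fetched.
print(layers_to_pull(manifest, local_layers={"sha256:aaa"}))

# PUSH: the Registry already holds two layers, so only one upload is started.
print(layers_to_push(manifest, registry_layers={"sha256:aaa", "sha256:bbb"}))
```

In both directions, the manifest drives the transfer and only the missing layers move over the network.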
III. RELATED WORK
Improvements to Docker Registry have been proposed
ever since its inception. Some of this work has been in the
network and security areas. Whilst most of what we suggest
in this paper revolves around operational features, we also
list several previous works that touch on Docker Registry
implementations that are on the edge.
A method for risk assessment of container-based cloud
platforms suggests three stages for quantitative risk
assessment: Image Assessment, Configuration
Assessment and Service Assessment. The measurements from all
three assessments are compiled into a quantitative measurement
that evaluates the risk of a container during runtime. Although
the three stages give a higher coverage of risk assessment, we
propose an earlier assessment of the Docker image, namely right
after an image upload, so that the assessment can be made before
a container is set to run [9] [10].
CoMICon, a co-operative management system for Docker
container images, proposes co-operative Registry nodes connected
through a peer-to-peer (P2P) protocol. In this implementation,
a node pulls a missing layer from the closest node (if
available) before eventually pulling it from the main Registry.
This is achieved by storing images in the form of layers
and sharing them between registries through the P2P method.
In contrast to our method, where we use a shared storage,
CoMICon can utilize the speed of local storage while
maintaining image distribution across all Registry nodes [11].
Bolt, a hyperconvergence design for container registries,
proposes tightly connected clusters of registries with the
same consolidated roles. The design uses consistent hashing
and Zookeeper, making the nodes storage-aware and allowing for
efficient caching strategies. The registries form a
consistent hashing ring in which Zookeeper is used to identify
them from each other. A modified Docker daemon (the
daemon needs to be aware of which registries store which
layers) is then used on the client side to directly query the
Registry nodes for the requested layers. Compared with Bolt,
our proposal only looks at the scaling portion [12].
IV. PROPOSED METHOD
In this section, we discuss the implementation of three
new features for Docker Registry: (1) designing the Registry to
scale to more than one instance, (2) adding self-authentication and
authorization methods for accessing the Registry and (3)
adding an image vulnerability scanning service to the
Registry.
A. Scalable Docker Registry
Docker Registry in itself has the capability to scale
thanks to its stateless nature. Stateless here means that the
Registry does not keep track of any communication states or sessions.
Taking advantage of this, we implement a proxy
(NGINX) [13] in front of the Registry. Through this proxy, we
can intercept requests made to the Registry, re-route
them and load-balance incoming requests across the
Registry servers.
We configure NGINX with two different upstreams:
1) PUSH upstream with ip_hash balancing
2) PULL upstream with least_conn balancing
The PUSH upstream is designed to handle PUT, POST and
PATCH requests. This is because we want Registry upload
requests to stick to a single Registry server while uploading.
The objective is to maintain image layer consistency during
the upload.
The PULL upstream, on the other hand, is configured with
least_conn balancing for HEAD and GET requests. Least
connection selects whichever server has the fewest
active connections at the time of the request. With this
method, each image layer can be pulled from a different
Registry server, thus contributing to faster image download.
By implementing these two methods, Docker Registry can
now be scaled out to a number of instances. A single Registry
server is no longer a single point of failure, and the performance
of uploading and downloading Docker images to and from the node
increases.
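The routing policy of this subsection can be sketched in Python as a small simulation. This is not the NGINX configuration itself: the server names and connection counts are hypothetical, and the IP hash below (SHA-256 over the whole address) is a simplification rather than the hash NGINX actually uses for ip_hash.

```python
import hashlib
from typing import Dict, List

PUSH_METHODS = {"PUT", "POST", "PATCH"}
PULL_METHODS = {"GET", "HEAD"}


def pick_by_ip_hash(client_ip: str, servers: List[str]) -> str:
    """Map a client IP to a fixed server so that its uploads stick together.

    Simplification: SHA-256 of the whole address, not NGINX's own hash.
    """
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


def pick_by_least_conn(active: Dict[str, int]) -> str:
    """Return the server with the fewest active connections (ties: first listed)."""
    return min(active, key=active.get)


def route(method: str, client_ip: str, active: Dict[str, int]) -> str:
    """Choose a Registry server for one request based on its HTTP method."""
    servers = list(active)
    if method in PUSH_METHODS:
        return pick_by_ip_hash(client_ip, servers)
    if method in PULL_METHODS:
        return pick_by_least_conn(active)
    raise ValueError(f"unsupported method: {method}")


# Hypothetical Registry servers with their current connection counts.
active_connections = {"registry-1": 4, "registry-2": 1, "registry-3": 7}

print(route("GET", "10.0.0.5", active_connections))    # least loaded server
print(route("PATCH", "10.0.0.5", active_connections))  # same server for this client
print(route("PUT", "10.0.0.5", active_connections))    # on every upload request
```

Splitting by HTTP method keeps all requests of one upload on a single server, while downloads are free to spread across whichever servers are least busy.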
B. User management
The user management takes advantage of Docker
Registry's built-in token authentication feature. The default
Registry handles only validation. We utilize this feature to
incorporate our own user management.
The user management consists of two parts. The
first part is authentication. Following the flow in
Fig. 1, the authentication process is as follows:
1. The Docker client initiates a request, either for pulling,
pushing or login; the type does not matter, as each request is
checked for a valid token.
2. Docker Registry checks for a valid token. If no token is
found, the request is answered with the address
where we can authenticate and request a token.
3. The Docker daemon now forwards our request to the user
management server (this is the address given to
us by the Registry server) with our authentication
values for validation.
4. Using the authentication values from the request, the user
management checks their validity against the user
database.
5. The user validation result is then returned to the user
management.
6. User management then returns the response to the
Docker daemon. This response contains a token if
the user is valid and an error message if the user is
not valid.
7. A second request is then made to the Docker Registry,
this time with the token.
Since we are now a valid user, the request to the Docker
Registry is now valid. The user can now pull or push
images into the Docker Registry.
Fig. 1: Authentication flow
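The authentication steps above can be sketched as an in-memory Python simulation. Everything here is illustrative: the user database, the signing key and the token endpoint URL are hypothetical, and an HMAC-signed string stands in for the signed bearer token that a real deployment would use.

```python
import hashlib
import hmac
from typing import Dict, Optional, Tuple

SECRET = b"demo-signing-key"                    # hypothetical signing key
AUTH_URL = "https://auth.example.org/token"     # hypothetical token endpoint
USER_DB: Dict[str, str] = {"alice": "s3cret"}   # hypothetical user database


def sign(username: str) -> str:
    """Return the hex signature that binds a token to a username."""
    return hmac.new(SECRET, username.encode("utf-8"), hashlib.sha256).hexdigest()


def issue_token(username: str, password: str) -> Optional[str]:
    """User management: validate credentials, return a token or None (steps 4-6)."""
    if USER_DB.get(username) != password:
        return None
    return f"{username}.{sign(username)}"


def registry_request(token: Optional[str]) -> Tuple[int, str]:
    """Registry: only validate the token; it never sees passwords (step 2)."""
    if token is None:
        return 401, AUTH_URL  # tell the client where to authenticate
    username, _, signature = token.rpartition(".")
    if hmac.compare_digest(signature.encode("utf-8"), sign(username).encode("utf-8")):
        return 200, "OK"
    return 401, AUTH_URL


def client_flow(username: str, password: str) -> Tuple[int, str]:
    """Docker client/daemon: request, authenticate if challenged, then retry."""
    status, body = registry_request(token=None)       # steps 1-2
    if status == 401:
        token = issue_token(username, password)       # steps 3-6
        if token is None:
            return 401, "invalid credentials"
        status, body = registry_request(token)        # step 7
    return status, body


print(client_flow("alice", "s3cret"))
print(client_flow("alice", "wrong"))
```

The point of the design is the separation of duties: the Registry only validates tokens, while credential checks live entirely in the user management service.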
Authorization, the second part, grants a user the permission
to make a PUSH or PULL request to the user's repository. A
repository is a dedicated, assigned work area where images are
kept. As an example, observe the following Registry path:
registry.mycompany.org:443/project1/alpine:latest
Here, project1 is the repository. This repository can
be created beforehand by using the user management UI.
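To make the parts of such a path explicit, the short Python sketch below splits the example into registry host, repository, image name and tag. It is a simplified parser that assumes the full registry/repository/image:tag form shown above; the function and type names are hypothetical.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    registry: str
    repository: str
    image: str
    tag: str


def parse_image_ref(ref: str) -> ImageRef:
    """Split 'registry/repository/image:tag' into its parts (simplified)."""
    # The registry host (with optional port) is everything before the first "/".
    registry, _, remainder = ref.partition("/")
    # The image name and tag form the last path component.
    repository, _, name_and_tag = remainder.rpartition("/")
    image, _, tag = name_and_tag.partition(":")
    # Docker falls back to the "latest" tag when none is given.
    return ImageRef(registry, repository, image, tag or "latest")


ref = parse_image_ref("registry.mycompany.org:443/project1/alpine:latest")
print(ref.registry, ref.repository, ref.image, ref.tag)
```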
Through the Registry notification system, the user
management can intercept a request and determine whether a
particular user is authorized to pull an image from or push an
image to the repository. We designed the authorization
portion to allow the owner of images to decide the visibility of
their repository and the images in it. The visibility levels are
defined as follows:
TABLE I: REPOSITORY VISIBILITY
Visibility     Description
PUBLIC         Any member or non-member can pull and push images into this repository
PULL ONLY      Only members can pull images from the repository
PUSH & PULL    Only members can pull and push images into the repository
PRIVATE        Only the repository/workspace owner has the permission to pull and push images
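The visibility rules of Table I can be expressed as a small decision function. The Python sketch below is one possible reading of the table, not the actual implementation: the names are hypothetical, and it assumes that the repository owner keeps full access at every level and that pushing to a PULL ONLY repository is reserved for the owner.

```python
from enum import Enum


class Visibility(Enum):
    PUBLIC = "PUBLIC"
    PULL_ONLY = "PULL ONLY"
    PUSH_AND_PULL = "PUSH & PULL"
    PRIVATE = "PRIVATE"


def is_authorized(visibility: Visibility, action: str,
                  is_member: bool = False, is_owner: bool = False) -> bool:
    """Decide whether a 'pull' or 'push' request on a repository is allowed."""
    if action not in ("pull", "push"):
        raise ValueError(f"unknown action: {action}")
    if is_owner:
        return True  # assumption: the owner keeps full access at every level
    if visibility is Visibility.PUBLIC:
        return True  # members and non-members may pull and push
    if visibility is Visibility.PULL_ONLY:
        return is_member and action == "pull"
    if visibility is Visibility.PUSH_AND_PULL:
        return is_member
    return False  # PRIVATE: owner only


print(is_authorized(Visibility.PULL_ONLY, "pull", is_member=True))
print(is_authorized(Visibility.PULL_ONLY, "push", is_member=True))
print(is_authorized(Visibility.PRIVATE, "pull", is_member=True))
```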
C. Image vulnerability scanning
We use a tool called Anchore in this setup [14]. Anchore is
an open source Docker image inspection, analysis and
certification tool. It runs standalone or can be integrated
within an orchestration platform. There is also an enterprise
version of Anchore that adds a graphical UI for all its
management and back-end control, but in this
implementation we adopt the open source version.
By utilizing Docker Registry notifications, we set up a
messaging service that stores the names of images that are
pushed into the Registry. From here, we register
each name with Anchore through a REST request. Anchore in
this setup is configured to pull these images from the Registry
and scan them. The results of the scans are kept
within Anchore and can be reached via the Anchore CLI or a REST
request. Users are then presented, through a UI, with their images
in the Registry together with the results of the scanning.
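The first half of this pipeline, turning a Registry notification into a list of image names to scan, might look like the Python sketch below. It assumes the notification body is a JSON envelope with an `events` list whose entries carry `action` and `target` fields; these field names, the media type strings and the sample payload are assumptions that should be checked against the Registry version in use. Registering the names with Anchore is left out, since its REST interface is not described here.

```python
import json
from typing import List

MANIFEST_MARKER = "manifest"  # substring expected in manifest media types


def pushed_images(envelope_json: str, registry_host: str) -> List[str]:
    """Extract an image reference for every manifest push in one notification."""
    envelope = json.loads(envelope_json)
    images = []
    for event in envelope.get("events", []):
        target = event.get("target", {})
        if event.get("action") != "push":
            continue  # ignore pulls and other actions
        if MANIFEST_MARKER not in target.get("mediaType", ""):
            continue  # a layer (blob) push does not name a complete image
        repository = target.get("repository")
        if not repository:
            continue
        tag, digest = target.get("tag"), target.get("digest")
        if tag:
            images.append(f"{registry_host}/{repository}:{tag}")
        elif digest:
            images.append(f"{registry_host}/{repository}@{digest}")
    return images


# Hypothetical notification with one manifest push, one layer push and one pull.
sample = json.dumps({"events": [
    {"action": "push", "target": {
        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
        "repository": "project1/alpine", "tag": "latest"}},
    {"action": "push", "target": {
        "mediaType": "application/octet-stream",
        "repository": "project1/alpine", "digest": "sha256:abc"}},
    {"action": "pull", "target": {
        "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
        "repository": "project1/alpine", "tag": "latest"}},
]})

scan_queue = pushed_images(sample, "registry.mycompany.org:443")
print(scan_queue)
```

Filtering on manifest pushes matters because a single image push produces several events, and only the manifest event identifies the image as a whole.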
V. CONCLUSION AND DISCUSSION
The importance of Docker Registry can be observed by
looking at the adoption rate of Docker containers. According
to Datadog, based on the companies that use their product, nearly
one quarter of companies have adopted Docker and 20% of
the hosts they monitor are Docker hosts [15]. In 2018, a 75%
growth in adoption was observed. With this
positive trend in Docker usage, having a scalable and secure
Docker Registry is no longer a mere suggestion; it is
crucial.
In this paper, we outline three of the most crucial
implementations, which increase productivity, performance and
security: productivity through managing authentication and
authorization for valid users, performance through scaling the
Registry into multi-host image servers, and security through
scanning images for vulnerabilities before using them.
Looking forward, the features of Docker Registry will
continue to evolve, but in retrospect, our paper addresses the
core features that should be enhanced.
REFERENCES
[1] D. Bernstein, “Containers and Cloud: From LXC to Docker to
Kubernetes,” IEEE Cloud Comput., vol. 1, no. 3, pp. 81–84, 2014.
[2] B. I. Ismail et al., “Evaluation of Docker as Edge computing platform,”
in 2015 IEEE Conference on Open Systems (ICOS), 2015, pp. 130–
135.
[3] C. Anderson, “Docker, Software Engineering,” in IEEE Software,
2015, vol. 32, no. 3, pp. 102–105.
[4] M. Satyanarayanan, P. Bahl, R. Caceres, and N. Davies, “The Case for
VM-Based Cloudlets in Mobile Computing,” Pervasive Comput., vol. 8,
no. 4, pp. 14–23, 2009.
[5] M. G. Xavier, M. V Neves, F. D. Rossi, T. C. Ferreto, T. Lange, and C.
a F. De Rose, “Performance Evaluation of Container-based
Virtualization for High Performance Computing Environments,” Proc.
2013 21st Euromicro Int. Conf. Parallel, Distrib. Network-Based
Process., no. LXC, pp. 233–240, 2013.
[6] W. Felter, A. Ferreira, R. Rajamony, and J. Rubio, “An Updated
Performance Comparison of Virtual Machines and Linux Containers,”
Technology, vol. 25482, 2014.
[7] M. G. Xavier, I. C. De Oliveira, F. D. Rossi, R. D. Dos Passos, K. J.
Matteussi, and C. a. F. De Rose, “A Performance Isolation Analysis of
Disk-Intensive Workloads on Container-Based Clouds,” 2015 23rd
Euromicro Int. Conf. Parallel, Distrib. Network-Based Process., no.
FEBRUARY, pp. 253–260, 2015.
[8] Docker Inc, “About Registry.” [Online]. Available:
https://docs.docker.com/registry/introduction/. [Accessed: 06-Apr-
2020].
[9] E. Mostajeran, M. N. M. Mydin, M. F. Khalid, B. I. Ismail, R. Kandan,
and O. H. Hoe, “Quantitative risk assessment of container based cloud
platform,” in 2017 IEEE Conference on Application, Information and
Network Security (AINS), 2017, pp. 19–24.
[10] E. Mostajeran, M. F. Khalid, M. N. M. Mydin, B. I. Ismail, and H. H.
Ong, “Multifaceted Trust Assessment Framework for Container based
Edge Computing Platform,” in Fifth International Conference On
Advances in Computing, Control and Networking - ACCN 2016, 2016.
[11] S. Nathan, R. Ghosh, T. Mukherjee, and K. Narayanan, “CoMICon: A
Co-Operative Management System for Docker Container Images,” in
2017 IEEE International Conference on Cloud Engineering (IC2E),
2017, pp. 116–126.
[12] M. Littley et al., “Bolt: Towards a Scalable Docker Registry via
Hyperconvergence,” in 2019 IEEE 12th International Conference on
Cloud Computing (CLOUD), 2019, pp. 358–366.
[13] F5 Networks Inc, “What is NGINX?” [Online]. Available:
https://www.nginx.com/resources/glossary/nginx/. [Accessed: 03-
Mar-2020].
[14] I. Anchore, “Anchore Engine AN OPEN SOURCE TOOL FOR DEEP
IMAGE INSPECTION AND VULNERABILITY SCANNING.”
[Online]. Available: https://anchore.com/opensource/. [Accessed: 03-
Mar-2020].
[15] DataDog, “8 surprising facts about real docker adoption,” 2018.
[Online]. Available: https://www.datadoghq.com/docker-adoption/.
[Accessed: 02-Apr-2020].