2. “Brief History of Containers”
[Timeline, 2001 to 2005; years inferred from the slide build order]
2001: First implementation of containers based on syscall interposition (Columbia)
2002: First research paper on Linux Containers (OSDI ’02)
2004: First container-based distributed checkpointing (HP Labs)
2005: Enterprise Linux Container solution (Meiosys)
2005: IBM acquires Meiosys; focus shifted to AIX
Most core kernel changes finally made it into the Linux mainline
12. Why not Virtual Machines?
Application / hardware misalignment
[Diagram: an application atop a hypervisor vs. atop a container host; on the hypervisor, an unwelcome Guest OS sits between the application and the hardware]
Applications have round edges: the system call interface
Hypervisors expose square holes: the hardware interface
Containers give a lightweight abstraction without IO overhead or startup latency
14. Why not Virtual Machines?
Layers of Intermediate Software
[Diagram: VM IO path: Application, Guest File System, Guest Driver, VM Exit (context switch), Virtual Device, Image Format Interpreter, iSCSI/NFS, Host. Container IO path: Application, Host.]
High IO overhead due to the many intermediate layers
15. Why not Virtual Machines?
The Unwelcome Guest OS
Slow startup time
Guest OS licensing and maintenance burden
Poor scalability
High resource consumption due to duplication
Obfuscated network / storage / compute topologies
Application semantic information is lost
18. Containers on YARN
Node Manager spawns tasks as containers
[Diagram: a Node Manager hosting container-virtualized tasks: Customer A (Task 1, Task 2), Customer B (Task 1), Customer C (Task 1)]
Tasks belonging to the same job share the same container
19. Containers on YARN
Advantages:
Secure multitenancy
Performance isolation
Better utilization via co-scheduling of IO-bound and CPU-bound tasks
Consistent cluster environment
Isolation of software dependencies and configuration
Reproducible way to define the app environment (see the configuration sketch below)
Rapid provisioning
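The deck itself shows no configuration. As a rough sketch only: Docker support in YARN (tracked in YARN-1964, referenced later) ended up being enabled through the LinuxContainerExecutor's Docker runtime, with yarn-site.xml properties along these lines; the property names follow the upstream Hadoop documentation, not anything in this talk.

<!-- yarn-site.xml (sketch): enable the Docker runtime under the
     LinuxContainerExecutor; property names per upstream Hadoop docs -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.runtime.linux.allowed-runtimes</name>
  <value>default,docker</value>
</property>

Individual jobs then opt in per container, e.g. by setting YARN_CONTAINER_RUNTIME_TYPE=docker and YARN_CONTAINER_RUNTIME_DOCKER_IMAGE in the container's environment.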
20. Privilege Isolation through UID Namespaces
❏ Recent addition to the kernel
❏ Superuser in the container maps to a regular user on the host
❏ Docker support for UID virtualization
[Diagram: UID virtualization: container root (UID 0) maps to a regular user (UID 100) on the host; host root keeps UID 0]
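As a minimal sketch of the kernel mechanism (not Docker's implementation): a process can enter a fresh user namespace and install a mapping so that UID 0 inside corresponds to an ordinary UID outside, mirroring the slide's diagram.

/* uidmap_demo.c: minimal sketch of UID virtualization with a Linux
 * user namespace (not Docker's implementation). Requires a kernel
 * with user namespaces enabled; compile with: cc uidmap_demo.c */
#define _GNU_SOURCE
#include <fcntl.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void) {
    uid_t host_uid = getuid();
    printf("on host:      uid=%d\n", (int)host_uid);

    /* Enter a fresh user namespace. */
    if (unshare(CLONE_NEWUSER) != 0) { perror("unshare"); return 1; }

    /* Map UID 0 inside the namespace to our original host UID.
     * Line format: <inside-uid> <outside-uid> <count> */
    char map[64];
    snprintf(map, sizeof(map), "0 %d 1\n", (int)host_uid);
    int fd = open("/proc/self/uid_map", O_WRONLY);
    if (fd < 0) { perror("open uid_map"); return 1; }
    write(fd, map, strlen(map));
    close(fd);

    /* This process is now "container root": UID 0 here, but just a
     * regular user from the host's point of view. */
    printf("in namespace: uid=%d\n", (int)getuid());
    return 0;
}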
21. References
❏ Blog post describing UID virtualization support in Docker
❏ https://www.altiscale.com/making-docker-work-yarn/
❏ Apache wiki page tracking work status across Docker and YARN projects
❏ https://wiki.apache.org/hadoop/dineshs/IsolatingYarnAppsInDockerContainers
❏ JIRA tracking Docker integration into YARN
❏ https://issues.apache.org/jira/browse/YARN-1964
❏ Related Docker tickets
❏ Several tickets linked from: https://github.com/dotcloud/docker/pull/4572
dineshs@altiscale.com
Questions?
24. Hadoop on Separate Physical Clusters
[Diagram: three dedicated clusters. Customer 1: utilization 6, spare 0, unused 3. Customer 2: utilization 1, spare 6, unused 2. Customer 3: utilization 4, spare 3, unused 2.]
Cannot scale the business this way!
Poor utilization
Host platform is a huge maintenance burden:
❖ Customer 1 needs R
❖ Customer 2 needs Matlab
❖ Customer 3 needs ß∂ø…
25. Container Clusters to Decouple Host from Customer
Each customer gets a container image
❖ Encapsulates customer-specific software and configuration
❖ Host platform remains lean and simple
[Diagram: the same three clusters, now container-based. Customer 1: utilization 6, spare 0, unused 3. Customer 2: utilization 1, spare 6, unused 2. Customer 3: utilization 4, spare 3, unused 2.]
Still poor utilization
26. Container Clusters to Drive Utilization
Each customer gets a container image
❖ Encapsulates customer-specific software and configuration
❖ Host platform remains lean and simple
Densely pack containers together
[Diagram: one global pool of resources. Global utilization: 11, spare: 16, unused: 0.]
27. Containers with Fine-grain Resources
[Diagram: global pool of resources]
❖ Container resource levels adjusted dynamically per customer
➢ As dictated by business policy
❖ Fractional resource allocation (see the cgroup sketch below)
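The deck doesn't name the mechanism behind fractional, dynamically adjustable limits, but this is commonly implemented with Linux cgroups. A rough sketch (hypothetical cgroup path, cgroup v1 CPU controller): a controller process can resize a running container's CPU share on the fly.

/* cgroup_resize.c: sketch of dynamically adjusting a container's CPU
 * quota via the cgroup v1 CPU controller. The group path is
 * hypothetical; real systems (Docker, YARN's LinuxContainerExecutor)
 * manage their own hierarchies. Quota/period gives fractional CPUs:
 * 50000/100000 = half a core. */
#include <stdio.h>

static int write_value(const char *path, long value) {
    FILE *f = fopen(path, "w");
    if (!f) { perror(path); return -1; }
    fprintf(f, "%ld\n", value);
    return fclose(f);
}

int main(void) {
    const char *base = "/sys/fs/cgroup/cpu/customer1";  /* hypothetical */
    char path[256];

    snprintf(path, sizeof(path), "%s/cpu.cfs_period_us", base);
    write_value(path, 100000);           /* 100 ms scheduling period */

    snprintf(path, sizeof(path), "%s/cpu.cfs_quota_us", base);
    write_value(path, 50000);            /* 0.5 CPU for this group */
    return 0;
}

New values take effect immediately, which is what lets per-customer levels track business policy without restarting tasks.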
28. Disaggregated Compute and Storage
[Diagram: global pool of resources; DataNode (DN) and NodeManager (NM) roles placed independently on each node]
❖ Add more storage to the Customer 1 cluster from a storage-rich node
➢ While a compute-intensive job from Customer 2 utilizes the available compute capacity on the same node
Independently scale compute and storage
Editor's Notes
Loss of locality etc. doesn’t make a material difference
Suboptimal scheduling
No sharing (IA use case: universities sharing data over a common HDFS)