This session introduces High Performance Computing (HPC) and outlines the challenges of fitting those workloads into containers. It then touches on the community solutions before showing an approach based on proper Docker. The talk wraps up with an outlook on how containers can foster scientific discoveries by making HPC usable by everyone.
6. Responding to Natural Disasters with IT
● Extreme weather and natural disasters occurring at greater frequencies
● Containers and cloud services for disaster recovery
● Real-time monitoring and social media aiding in predicting damage and loss of life
8. High Performance Computing?
1. Computation has to be done sequentially
[diagram: time steps t=0 → t=1 → t=2]
9. High Performance Computing?
1. Computation has to be done sequentially
2. Decomposition to compute on multiple cores
3. Domains exchange intermediate results
10. High Performance Computing?
1. Computation has to be done sequentially
2. Decomposition to compute on multiple cores
3. Domains exchange intermediate results
4. Fit into compute node(s)
[diagram: domains mapped onto compute0 and compute1]
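The steps on these slides can be sketched in plain Python. This is a toy 1-D averaging stencil: the list slices stand in for domains that MPI ranks would own on real compute nodes, and the halo values stand in for the exchanged intermediate results. All names here are illustrative, not from the talk.

```python
# Toy illustration of domain decomposition with halo exchange.
# A real HPC code would run each domain as an MPI rank on its own
# core/node; here the "domains" are just slices of one list.

def step(domain, left_halo, right_halo):
    """One time step: a simple 3-point averaging stencil."""
    padded = [left_halo] + domain + [right_halo]
    return [(padded[i - 1] + padded[i] + padded[i + 1]) / 3.0
            for i in range(1, len(padded) - 1)]

def run(field, n_domains, steps):
    size = len(field) // n_domains            # assumes an even split
    domains = [field[i * size:(i + 1) * size] for i in range(n_domains)]
    for _ in range(steps):                    # 1. time steps are sequential
        halos = []
        for d in range(n_domains):            # 3. exchange intermediate results
            left = domains[d - 1][-1] if d > 0 else domains[d][0]
            right = domains[d + 1][0] if d < n_domains - 1 else domains[d][-1]
            halos.append((left, right))
        domains = [step(domains[d], *halos[d])  # 2. each domain on its own core
                   for d in range(n_domains)]
    return [x for d in domains for x in d]
```

The halo exchange is the part that forces HPC nodes to communicate every step, which is why node placement and interconnects matter so much in the scheduling discussion later.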
18. Current Solutions
HPC runtimes were given birth to because Docker (CE, back then) did not provide the features necessary to run on HPC systems.
[diagram: Development → Build → Ship (hub.docker.com) → HPC Runtimes]
HPC Runtimes: Pull Image → Extract File-System → Store on /share → chroot /container
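The workflow on this slide can be sketched as a command sequence. This helper only composes the commands for illustration (the function name, image name, and paths are assumptions; real HPC runtimes implement this pipeline natively rather than shelling out to Docker):

```python
def hpc_image_setup(image, share_dir="/share"):
    """Sketch of the classic HPC workaround: pull an image, flatten its
    filesystem onto shared storage, then chroot into it.
    Returns the shell commands instead of executing them."""
    rootfs = f"{share_dir}/{image.replace('/', '_').replace(':', '_')}"
    return [
        f"docker pull {image}",                        # Pull Image
        f"docker create --name tmp {image}",           # container to export from
        f"mkdir -p {rootfs}",
        f"docker export tmp | tar -xf - -C {rootfs}",  # Extract File-System, Store on /share
        "docker rm tmp",
        f"chroot {rootfs} /bin/sh",                    # chroot /container
    ]
```

Note how the flattening step throws away the layered OCI image structure, which is exactly why these runtimes lose the supply-chain properties criticized on the next slides.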
19. Service/Batch Scheduling
Traditionally, container workloads are scheduled declaratively, as a task (pod) on a worker. HPC schedules a workload as a batch job across multiple nodes.
[diagram: left, the Docker Engine with SWARM/Kubernetes placing process1 and process2 on a shared system; right, an HPC controller/manager dispatching one distributed process as job-processes via agents on node0 … nodeN]
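The contrast above can be made concrete with a small sketch (all names invented): a service scheduler places each task independently wherever a node is free, while an HPC batch scheduler gang-schedules, running a job only when every rank can get a node at once.

```python
def service_schedule(tasks, free_nodes):
    """Service model: place each task independently on any free node;
    tasks beyond the available nodes simply wait."""
    return {t: n for t, n in zip(tasks, free_nodes)}

def batch_schedule(job_ranks, free_nodes):
    """Batch model: all-or-nothing. The job runs only if every rank
    gets a node simultaneously; otherwise it stays queued."""
    if len(free_nodes) < job_ranks:
        return None
    return {f"rank{r}": free_nodes[r] for r in range(job_ranks)}
```

The all-or-nothing placement is what the halo-exchange pattern demands: a distributed process cannot make progress with only some of its ranks running.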
21. Current Solutions [cont]
HPC-specific workarounds
+ Drop-in replacement, as it wraps the job
- Not OCI compliant
- No secure supply chain
- No integration with the upstream ecosystem
[diagram: controller/manager dispatching job-processes through an HPC runtime via agents on node0 … nodeN of a shared system]
23. Kernel-bypassing Devices
To achieve the highest performance, the kernel got squeezed out of the equation for some technologies.
[diagram: userland application reaching hardware, with ETH going through the OS kernel's TCP/IP stack, while GPU (via CUDA) and IB (via OFED/libnet) bypass the kernel]
24. Scientific Environments
Scientific end-users expect the environment to be set up for them, without prior knowledge about the specific cluster.
[diagram: a service cluster and a compute cluster sharing storage (/home/, /proj/), with an Engine per node running ranks rank0 … rank2 and AI workloads]
28. Leveraging HPC in the Enterprise
[diagram: a control plane connecting an image registry (security scan & sign) and the docker store to traditional, third-party, and HPC workloads]
29. Convergence of AI and HPC
A ladder of rising complexity and maturity:
- beginner: single node, local storage (non-GPU)
- intermediate: single node, shared storage (device passthrough, shared file-system)
- advanced: multi node, shared storage (MPI), a.k.a. HPC!