Univa webinar: High Performance Computing (HPC) in the Cloud?

www.univa.com
February 2016
High Performance
Computing in the
Cloud?
RECORDING
Q&A – Part 1 & Part 2

Agenda
• Commoditized cloud applications
• Latency: The gating factor for HPC-in-the-Cloud
• A very brief taxonomy of clouds
• Latency notwithstanding …
  ✓ Embarrassingly parallel HPC applications
  ✓ The impact of large data volumes
  ✓ The impact of execution time
• Distributed-memory parallel HPC applications - MPI and beyond
• The impact of containerization
• The past, present and future viability of HPC in the cloud

http://img04.deviantart.net/bd14/i/2013/046/3/d/space_the_final_frontier_by_unusualsuspex-d5v0h8m.jpg

Latency is physically a consequence of the
limited velocity with which any physical
interaction can propagate.
https://en.wikipedia.org/wiki/Latency_(engineering)
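
As a rough illustration of that physical limit, the sketch below computes the smallest possible round-trip time over a given distance. The distances and the two-thirds-of-c propagation speed for optical fiber are illustrative assumptions, not figures from the webinar.

```python
# Lower bound on round-trip latency imposed by signal propagation speed.
# Assumption: signals in optical fiber travel at roughly 2/3 the speed of light.

SPEED_OF_LIGHT_M_PER_S = 299_792_458  # exact, by definition of the metre
FIBER_FRACTION = 2 / 3                # assumed typical value for optical fiber


def min_round_trip_ms(distance_km: float, fraction_of_c: float = FIBER_FRACTION) -> float:
    """Return the smallest possible round-trip time in milliseconds."""
    one_way_s = (distance_km * 1000) / (SPEED_OF_LIGHT_M_PER_S * fraction_of_c)
    return 2 * one_way_s * 1000


if __name__ == "__main__":
    # Hypothetical distances: inside a rack, across a data center, to a remote cloud region.
    for label, km in [("same rack", 0.002), ("same data center", 0.5), ("remote cloud region", 4000)]:
        print(f"{label:>20}: {min_round_trip_ms(km):.6f} ms")
```

No switch, NIC, or software stack can do better than this floor; real systems add their own delays on top of it.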

http://www.cartoonsidrew.com/2011/05/einsteins-speed-limit.html

Current consumer devices have appallingly bad latency …
It's the Latency, Stupid
https://rescomp.stanford.edu/~cheshire/rants/Latency.html

… if you have a network link with low bandwidth then
it's an easy matter of putting several in parallel to make
a combined link with higher bandwidth, but if you have
a network link with bad latency then no amount of
money can turn any number of them into a link with
good latency.
It's the Latency, Stupid
https://rescomp.stanford.edu/~cheshire/rants/Latency.html
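
The point of the quote can be sketched with a simple cost model, time = latency + size / bandwidth. The 50 ms latency and 1 MB/s per-link bandwidth below are made-up values chosen only to show the shape of the effect.

```python
# Transfer-time model: time = latency + size / bandwidth.
# Putting N links in parallel multiplies bandwidth by N but leaves latency unchanged.


def transfer_time_s(size_bytes: float, latency_s: float,
                    bandwidth_bytes_per_s: float, links: int = 1) -> float:
    """Time to move one message over `links` identical links used in parallel."""
    return latency_s + size_bytes / (bandwidth_bytes_per_s * links)


if __name__ == "__main__":
    latency = 0.050        # assumed 50 ms latency
    bandwidth = 1_000_000  # assumed 1 MB/s per link
    for size in (100, 10_000_000):  # a tiny message and a bulk transfer
        for links in (1, 10, 100):
            t = transfer_time_s(size, latency, bandwidth, links)
            print(f"size={size:>10} B  links={links:>3}  time={t:.4f} s")
```

In this model the bulk transfer speeds up as links are added, while the tiny message never drops below the latency term, however many links are bought.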

Latency
… and the best we can do? Try to ‘HIDE’ it!

Cloud Taxonomy
• Private Clouds
  o Use containers and VMs to increase data center workflow by dynamically optimizing the configuration of the cluster based on job priority
• Hybrid Clouds
  o Combine servers in the cloud with a company’s data center servers, making it look like one seamless cluster
• Public Clouds
  o Quickly provision a cluster in the Cloud, and pay only for what you need

Use Cases
• Building a physical Univa Grid Engine cluster
• Creating a Univa Grid Engine cluster on Google Compute, Amazon EC2, Azure, OpenStack, …
• Mixed clusters with more than one Cloud provider
• Creating a mixed physical and VMware virtual Univa Grid Engine cluster on your own hardware
• Creating an internal cluster that can ‘burst out’ to the Cloud on demand
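
The ‘burst out’ use case can be pictured as a small sizing rule: use local slots first, and ask the cloud only for the overflow. This is a hypothetical sketch; `ClusterState`, `cloud_nodes_to_request`, and their parameters are invented for illustration and are not part of Univa Grid Engine.

```python
# Hypothetical burst-out policy: keep work on-premise while slots are free,
# and request cloud nodes only for the overflow, up to a spending cap.
# None of these names come from Univa Grid Engine; they are illustrative only.
import math
from dataclasses import dataclass


@dataclass
class ClusterState:
    free_local_slots: int   # idle slots in the internal cluster
    pending_job_slots: int  # slots requested by jobs waiting in the queue


def cloud_nodes_to_request(state: ClusterState, slots_per_cloud_node: int,
                           max_cloud_nodes: int) -> int:
    """Return how many cloud nodes to provision for the current backlog."""
    overflow = max(0, state.pending_job_slots - state.free_local_slots)
    needed = math.ceil(overflow / slots_per_cloud_node)
    return min(needed, max_cloud_nodes)


if __name__ == "__main__":
    backlog = ClusterState(free_local_slots=16, pending_job_slots=100)
    print(cloud_nodes_to_request(backlog, slots_per_cloud_node=8, max_cloud_nodes=50))
```

A real policy would also weigh data location and job run time, two of the agenda items, before sending work off-site.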

http://c59951.r51.cf2.rackcdn.com/4994-1182-lumb.pdf

Case Study: The Broad Institute
Challenge: Augment on-premise HPC resources with a cost-effective, scalable, cloud-based offering for bioinformatics workloads
Solution: 50K cores on Google Compute Engine via Cycle Computing and Univa Grid Engine
Results
• Ran 30 years of cancer research calculations in just a few hours
• Made use of 1.4 million sequenced or genotyped biological samples
http://www.nextplatform.com/2015/09/08/google-cycle-computing-pair-for-broad-genomics-effort/
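
A back-of-the-envelope check of the headline result, assuming that "30 years" means 30 years of single-core compute time and that the workload is embarrassingly parallel with no scheduling or data-transfer overhead. Both are assumptions; the slide does not state them.

```python
# Rough consistency check for "30 years of calculations in a few hours".
# Assumptions (not from the slide): 30 years is single-core time, and the
# work divides perfectly across cores.

HOURS_PER_YEAR = 365 * 24


def wall_clock_hours(core_years: float, cores: int) -> float:
    """Ideal wall-clock time when core_years of work is spread over `cores`."""
    return core_years * HOURS_PER_YEAR / cores


if __name__ == "__main__":
    print(f"{wall_clock_hours(30, 50_000):.2f} hours")  # prints "5.26 hours"
```

Under those assumptions, 50K cores bring the run down to roughly five hours, which fits the "just a few hours" claim.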

Univa Short Jobs: Architecture
Workflow
Submission
• Policy control through launchers
• Pull vs. push → fast
Copyright © 2016 Univa Corporation, All Rights Reserved.

http://www.mellanox.com/page/performance_infiniband

MPI Apps Remain a Challenge …
• … for
  ✓ cloud use
  ✓ containerization
• Constrain MPI apps to mitigate concerns with latency
  ✓ Run HPC on-premise OR in a cloud, but not between
  ✓ Containers?
    o Just say no???
• Seek alternatives
  ✓ Apache Spark ???
  ✓ Message buses ???
  ✓ Shifter ???
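
Why latency hits tightly coupled MPI codes so hard can be sketched with a toy model: each step does a slice of compute, then waits on a number of message exchanges. The latencies and workload sizes below are placeholder assumptions, not measurements of any interconnect or cloud.

```python
# Toy model of a tightly coupled (MPI-style) iteration:
#   step time = compute / ranks + messages_per_step * latency
# All numbers below are illustrative placeholders, not benchmarks.


def parallel_efficiency(serial_compute_s: float, ranks: int,
                        messages_per_step: int, latency_s: float) -> float:
    """Fraction of each step spent on useful compute rather than waiting."""
    compute = serial_compute_s / ranks
    communication = messages_per_step * latency_s
    return compute / (compute + communication)


if __name__ == "__main__":
    # Assumed workload: 10 ms of serial work per step, 100 ranks, 10 messages per step.
    for label, latency in [("low-latency fabric (assumed 1 us)", 1e-6),
                           ("commodity cloud network (assumed 100 us)", 1e-4)]:
        eff = parallel_efficiency(0.010, 100, 10, latency)
        print(f"{label}: efficiency {eff:.1%}")
```

In this model, adding ranks shrinks only the compute term, so the latency term comes to dominate. That is the reasoning behind keeping an MPI job entirely on-premise or entirely in one cloud rather than spanning both.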

https://insights.sei.cmu.edu/assets/content/VM-Diagram.png

Full Docker Integration
• Docker test scripts are useful, however:
  ✓ No interactive containers
  ✓ No runtime resource usage for containers
  ✓ No accounting for containers
• Complete Docker integration in progress:
  ✓ Integrated into the Execution Daemon
  ✓ Beta shipping now!!! Will ship in November 2015
    o If you are interested, please contact us!

Docker with Univa Grid Engine
• Launch Docker Container on best machine in cluster
  ✓ Reduces the time wasted (it can be minutes) waiting for the Docker image to download from the Docker registry. Container runs faster, increasing throughput in the cluster.
• Run Docker Containers in a Univa Grid Engine Cluster
  ✓ Business-critical containers are prioritized over other containers. Increases efficiency of the overall organization.
• Job Control and Limits for Docker Containers
  ✓ Provides user and administrator control over containers running on Grid Engine Hosts.
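
One way to picture "launch on the best machine" is a placement rule that prefers hosts where the requested image is already cached, so nothing has to be downloaded from the registry. This is a hypothetical sketch; `Host` and `pick_host` are invented names and do not reflect how Univa Grid Engine implements placement.

```python
# Hypothetical image-aware placement: prefer a host that already holds the
# Docker image locally, then break ties by free capacity.
# Illustrative only; not the Univa Grid Engine scheduler.
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Host:
    name: str
    free_slots: int
    cached_images: Set[str] = field(default_factory=set)


def pick_host(hosts: List[Host], image: str) -> Optional[Host]:
    """Return the preferred host for `image`, or None if no host has a free slot."""
    candidates = [h for h in hosts if h.free_slots > 0]
    if not candidates:
        return None
    # Tuples compare element by element: a cached image wins first, then free slots.
    return max(candidates, key=lambda h: (image in h.cached_images, h.free_slots))


if __name__ == "__main__":
    cluster = [Host("node-a", 8), Host("node-b", 2, {"solver:1.0"}), Host("node-c", 0, {"solver:1.0"})]
    chosen = pick_host(cluster, "solver:1.0")
    print(chosen.name if chosen else "no capacity")
```

Here the lightly loaded `node-a` loses to `node-b`, because `node-b` can start the container without a pull.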

Docker Integration with Univa
• Accounting for Docker Containers
  ✓ Keeps track of containers. Share policies require accounting.
• Data File Management for Docker Containers
  ✓ Transparent access to input, output and error files. Simplifies the management of input and output files for Docker Containers and ensures any output or error files are moved to a location where the user can access them.
• Interactive Docker Containers
  ✓ Good for debugging when containers don’t work correctly!

HPC as a Containerized Cloud Based Service
http://insidehpc.com/2015/11/ubercloud-delivers-cae-as-a-service-with-univa-grid-engine-container-edition/

Cloud Native Computing Foundation (CNCF)
• For current applications and services
  ✓ Uptake of cloud computing remains an afterthought from a systems-architecture perspective
• CNCF aims to introduce a cloud-native paradigm shift that emphasizes:
  ✓ Containerization
  ✓ Dynamic scheduling
  ✓ Orientation around microservices
• Making use of Kubernetes as a ‘seed technology’
  ✓ #1 priority: Integrate the orchestration layer of the container ecosystem
• Univa is a Founding Member
  ✓ Along with Google, IBM, Intel, Red Hat and numerous others ...
• Prototype implementations becoming available
https://cncf.io/

THANK YOU
Ian Lumb
Solutions Architect
+1 630 303 9068 ilumb@univa.com

http://www0.cloudbootcamp.com/node/660946

https://c2.staticflickr.com/8/7174/6406442009_70cc52d8aa_b.jpg

http://runge.math.smu.edu/SMUHPC_workshop_Summer14/_images/flynn.png
Flynn’s Taxonomy

GPUs in the Cloud? The Top Four Reasons
1. You can realize possibilities using the cloud
   a. You can scale up and scale out
2. You still realize the promise of GPU programmability
   a. … via HPC in the cloud
3. Your use of the cloud is transparent
   a. You’ve found ways to ‘hide’ latency
      i. Constraints apply for MPI apps
4. Your go-to apps still work in the cloud
http://info.brightcomputing.com/Blog/bid/196290/The-Top-4-Reasons-You-Should-Try-Cloud-Based-GPUs-for-HPC

https://aws.amazon.com/ec2/instance-types/

Docker
• What is Docker?
  ✓ Docker is a tool that packages an application, filesystem, and all other dependencies into an easily distributable software package that can be installed and run on any modern Linux server.
• What is a Software Container?
  ✓ Similar to a Virtual Machine, but a single Operating System is shared.
    o Faster than Virtual Machines
    o Less overhead than Virtual Machines
    o You can run more Software Containers on a machine than VMs.
  ✓ Not a new concept: Sun Microsystems has ‘Solaris Zones’.
• Why is Docker different?

http://dockone.io/uploads/article/20150329/aa61c8ee04d815507d575c9d0a3c162f.png

Docker on Google Trends
Interest in Docker (US only)
Rapid growth since the end
of 2013 … continues …

Kubernetes
• What is Kubernetes?
  ✓ Kubernetes is a workload and service orchestration tool for containerized applications and services running on a cluster or cloud infrastructure.
• Where did it come from?
  ✓ It is derived from research work Google has been doing (called Omega), drawing from the experience Google has gained with its own in-house orchestration system (Borg) over the past 10+ years.
• Why is it important?
  ✓ Google wants Kubernetes to become a standard container orchestration platform for Clouds and Enterprises.
  ✓ Running multiple containers on multiple machines is hard; you need Kubernetes

“The wonderful thing
about standards is
that there are so
many of them to
choose from.”
https://en.wikiquote.org/wiki/Grace_Hopper

Cloud Computing
is bereft of standards!!!
...but, FLUSH with implementations!!!
