Exploring the Performance Impact of Virtualization on an HPC Cloud

Nuttapong Chakthranont†, Phonlawat Khunphet†, Ryousei Takano‡, Tsutomu Ikegami‡
†King Mongkut’s University of Technology North Bangkok, Thailand
‡National Institute of Advanced Industrial Science and Technology, Japan
IEEE CloudCom 2014@Singapore, 18 Dec. 2014
Outline
•  HPC Cloud
•  AIST Super Green Cloud (ASGC)
•  Experiment
•  Conclusion
HPC Cloud
•  HPC users are beginning to take an interest in the cloud.
–  e.g., CycleCloud, Penguin on Demand
•  Virtualization is a key technology.
–  Pro: a customized software environment, elasticity, etc.
–  Con: a large overhead, especially degraded I/O performance.
•  VMM-bypass I/O technologies, e.g., PCI passthrough and SR-IOV, can significantly mitigate the overhead.
Motivating Observation
•  Performance evaluation of an HPC cloud
–  (Para-)virtualized I/O incurs a large overhead.
–  PCI passthrough significantly mitigates the overhead.
[Bar chart: execution time (seconds) of the NAS Parallel Benchmarks 3.3.1, class C, 64 processes, on the kernels BT, CG, EP, FT, and LU, comparing BMM (IB), BMM (10GbE), KVM (IB), and KVM (virtio). BMM: Bare Metal Machine. The chart shows the overhead of I/O virtualization and the improvement by PCI passthrough.]
[Diagrams: with KVM (virtio), the guest driver in the guest OS reaches the 10GbE NIC through the VMM's physical driver; with KVM (IB), the physical driver inside the guest OS bypasses the VMM and accesses the IB QDR HCA directly.]
HPC Cloud
We quantitatively assess the performance impact of virtualization to demonstrate the feasibility of the HPC Cloud.
Outline
•  HPC Cloud
•  AIST Super Green Cloud (ASGC)
•  Experiment
•  Conclusion
Easy to Use Supercomputer
- Usage Model of AIST Cloud -

Allow users to customize their virtual clusters (HPC + ease of use):
1. Choose a template image of a virtual machine (HPC, Big Data, or Web apps).
2. Add required software packages, launching a virtual machine only when necessary.
3. Save the user-customized template image (take snapshots into the VM template image repository; deploy it later as a virtual cluster).
Easy to Use Supercomputer
- Elastic Virtual Cluster -

[Diagram: from the login node, sgc-tools creates a virtual cluster using the image repository. The virtual cluster consists of a frontend node (VM) running a job scheduler and NFSd, plus compute ("cmp.") nodes connected via InfiniBand. Users submit jobs to the frontend, and the cluster scales in/out on demand. Note: a single VM runs on each node.]
ASGC Hardware Spec.

Compute Node:
–  CPU: Intel Xeon E5-2680v2, 2.8 GHz (10 cores) x 2 CPUs
–  Memory: 128 GB DDR3-1866
–  InfiniBand: Mellanox ConnectX-3 (FDR)
–  Ethernet: Intel X520-DA2 (10 GbE)
–  Disk: Intel SSD DC S3500, 600 GB

•  The 155-node cluster consists of Cray H2312 blade servers.
•  The theoretical peak performance is 69.44 TFLOPS.
•  Operation started in July 2014.
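As a cross-check, the quoted peak follows directly from the node spec above, assuming 8 double-precision flops per cycle per Ivy Bridge core (AVX):

\[
R_{\mathrm{peak}} = 155\ \text{nodes} \times 2\ \text{CPUs} \times 10\ \text{cores} \times 2.8\ \text{GHz} \times 8\ \tfrac{\text{flops}}{\text{cycle}} = 69.44\ \text{TFLOPS}
\]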
ASGC Software Stack
Management Stack
–  CentOS 6.5 (QEMU/KVM 0.12.1.2)
–  Apache CloudStack 4.3 + our extensions
•  PCI passthrough/SR-IOV support (KVM only)
•  sgc-tools: Virtual cluster construction utility
–  RADOS cluster storage
HPC Stack (Virtual Cluster)
–  Intel Compiler/Math Kernel Library SP1 1.1.106
–  Open MPI 1.6.5
–  Mellanox OFED 2.1
–  Torque job scheduler
Outline
•  HPC Cloud
•  AIST Super Green Cloud (ASGC)
•  Experiment
•  Conclusion
Benchmark Programs

Micro benchmark
–  Intel MPI Benchmarks (IMB) version 3.2.4

Application-level benchmarks
–  HPC Challenge (HPCC) version 1.4.3
•  G-HPL
•  EP-STREAM
•  G-RandomAccess
•  G-FFT
–  OpenMX version 3.7.4
–  Graph 500 version 2.1.4
IMB: MPI Point-to-point Communication

[Plot: throughput (GB/s, log scale) vs. message size (1 KB to 1024 KB) for the physical and virtual clusters; peak throughput is 5.85 GB/s (physical) vs. 5.69 GB/s (virtual).]

The overhead is less than 3% with large messages, though it is up to 25% with small messages.
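As a concrete reference for what this test measures, here is a minimal MPI ping-pong throughput sketch in the spirit of the IMB PingPong benchmark; the 4 MB message size, repetition count, and output format are illustrative choices, not IMB's.

/* Minimal ping-pong bandwidth test between ranks 0 and 1 (illustrative). */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    const int reps = 100;
    const int size = 4 * 1024 * 1024;   /* 4 MB message */
    char *buf = malloc(size);
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < reps; i++) {
        if (rank == 0) {                /* one round trip per iteration */
            MPI_Send(buf, size, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, size, MPI_CHAR, 1, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
        } else if (rank == 1) {
            MPI_Recv(buf, size, MPI_CHAR, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            MPI_Send(buf, size, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    double t1 = MPI_Wtime();

    if (rank == 0) {
        /* one-way throughput = message size / (half the round-trip time) */
        double gbps = 2.0 * size * reps / (t1 - t0) / 1e9;
        printf("throughput: %.2f GB/s\n", gbps);
    }
    free(buf);
    MPI_Finalize();
    return 0;
}

Run with one rank on each of two nodes (e.g., mpirun -np 2 -npernode 1 ./pingpong) so the traffic actually crosses the InfiniBand fabric.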
IMB: MPI Collectives (64 bytes)

[Plots: execution time (usec) vs. number of nodes (up to 128) for Allgather, Allreduce, and Alltoall on the physical and virtual clusters. At 128 nodes the virtual cluster's overheads reach +77%, +88%, and +43% (for Allgather, Allreduce, and Alltoall, respectively, as labeled in the figure).]

The overhead becomes significant as the number of nodes increases. A possible cause is load imbalance.
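A minimal sketch of the corresponding Allreduce measurement, assuming a 64-byte payload of 8 doubles; the repetition count is an illustrative choice.

/* Time MPI_Allreduce on a 64-byte payload, IMB Allreduce style (illustrative). */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    const int reps = 1000;
    double in[8], out[8];               /* 8 doubles = 64 bytes */
    int rank;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    for (int i = 0; i < 8; i++)
        in[i] = rank + i;

    MPI_Barrier(MPI_COMM_WORLD);
    double t0 = MPI_Wtime();
    for (int i = 0; i < reps; i++)
        MPI_Allreduce(in, out, 8, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("average Allreduce time: %.1f usec\n",
               (t1 - t0) / reps * 1e6);
    MPI_Finalize();
    return 0;
}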
HPCC: G-HPL (LINPACK)

[Plot: performance (TFLOPS, up to 60) vs. number of nodes (up to 128) for the physical and virtual clusters.]

Performance degradation: 5.4 - 6.6%
Efficiency* on 128 nodes: Physical 90%, Virtual 84%
*) Rmax / Rpeak
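For scale, combining these efficiencies with the 448 GFLOPS per-node peak derived earlier gives approximately:

\[
R_{\mathrm{peak}}^{128} = 128 \times 448\ \text{GFLOPS} \approx 57.3\ \text{TFLOPS},\qquad
R_{\mathrm{max}} \approx 0.90 \times 57.3 \approx 51.6\ \text{TFLOPS (physical)},\quad
0.84 \times 57.3 \approx 48.2\ \text{TFLOPS (virtual)}
\]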
HPCC: EP-STREAM and G-FFT

[Plots: EP-STREAM performance (GB/s) and G-FFT performance (GFLOPS) vs. number of nodes (up to 128) for the physical and virtual clusters.]

The overheads are negligible, even though EP-STREAM is memory intensive with no communication, and G-FFT performs all-to-all communication with large messages.
Graph500 (replicated-csc, scale 26)

[Plot: performance (TEPS, 1.0E+07 to 1.0E+10, log scale) vs. number of nodes (up to 64) for the physical and virtual clusters.]

Performance degradation: 2% (64 nodes)
Graph500 is a hybrid parallel program (MPI + OpenMP). We used a combination of 2 MPI processes and 10 OpenMP threads per node.
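A minimal hybrid MPI + OpenMP skeleton matching that configuration is sketched below; the mpirun line in the comment is illustrative Open MPI 1.6.x syntax, and the binary name is hypothetical.

/* Hybrid MPI+OpenMP skeleton: 2 ranks per node x 10 threads per rank.
 * Illustrative launch on 64 nodes (Open MPI 1.6.x):
 *   mpirun -np 128 -npernode 2 -x OMP_NUM_THREADS=10 ./graph500_like */
#include <mpi.h>
#include <omp.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    int provided, rank;

    /* FUNNELED: only the master thread makes MPI calls. */
    MPI_Init_thread(&argc, &argv, MPI_THREAD_FUNNELED, &provided);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    #pragma omp parallel
    {
        #pragma omp master
        printf("rank %d running %d OpenMP threads\n",
               rank, omp_get_num_threads());
        /* ... per-thread graph traversal work would go here ... */
    }

    MPI_Finalize();
    return 0;
}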
Findings

•  PCI passthrough is effective in improving I/O performance; however, it still cannot match the low communication latency of a physical cluster, due to virtual interrupt injection.
•  VCPU pinning improves performance for HPC applications (see the sketch after this list).
•  Almost all MPI collectives suffer from the scalability issue.
•  The overhead of virtualization has less impact on actual applications.
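As an illustration of what host-side VCPU pinning involves, here is a minimal sketch using the libvirt C API; the domain name "vm0" is hypothetical, and the one-to-one vCPU-to-core map assumes a 20-vCPU VM on a 20-core ASGC node, which is just one reasonable policy.

/* Pin each vCPU of a guest to the matching physical core (libvirt C API).
 * Build with: gcc pin.c -lvirt */
#include <libvirt/libvirt.h>
#include <stdio.h>

int main(void)
{
    virConnectPtr conn = virConnectOpen("qemu:///system");
    if (!conn)
        return 1;

    virDomainPtr dom = virDomainLookupByName(conn, "vm0"); /* hypothetical name */
    if (!dom) {
        virConnectClose(conn);
        return 1;
    }

    const int ncpus = 20;                 /* physical cores per node */
    int maplen = (ncpus + 7) / 8;         /* bytes in the CPU bitmask */

    for (int vcpu = 0; vcpu < ncpus; vcpu++) {
        unsigned char cpumap[3] = {0};
        VIR_USE_CPU(cpumap, vcpu);        /* set bit: vCPU i -> pCPU i */
        if (virDomainPinVcpu(dom, vcpu, cpumap, maplen) < 0)
            fprintf(stderr, "failed to pin vCPU %d\n", vcpu);
    }

    virDomainFree(dom);
    virConnectClose(conn);
    return 0;
}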
Outline
•  HPC Cloud
•  AIST Super Green Cloud (ASGC)
•  Experiment
•  Conclusion
Conclusion and Future work

•  HPC Cloud is promising.
–  Micro benchmarks: MPI collectives show a scalability issue.
–  Application-level benchmarks: the negative impact is limited; the virtualization overhead is about 5%.
–  Our HPC Cloud has been in operation since July 2014.
•  Virtualization can contribute to improvements in system utilization.
–  SR-IOV
–  VM placement optimization based on the workloads of virtual clusters
Questions?
Thank you for your attention!

Acknowledgments:
The authors would like to thank Assoc. Prof. Vara Varavithya, KMUTNB, and Dr. Yoshio Tanaka, AIST, for valuable guidance and advice. In addition, the authors would also like to thank the ASGC support team for the preparation and troubleshooting of the experiments. This work was partly supported by JSPS KAKENHI Grant Number 24700040.
