Toward a practical “HPC Cloud”:
  Performance tuning of a virtualized HPC cluster


Ryousei Takano

Information Technology Research Institute,
National Institute of Advanced Industrial Science and Technology (AIST), Japan

SC2011@Seattle, Nov.15 2011
Outline
•  What is HPC Cloud?
•  Performance tuning method for HPC Cloud
  –  PCI passthrough
  –  NUMA affinity
  –  VMM noise reduction
•  Performance evaluation




HPC Cloud
HPC Cloud utilizes cloud resources in High
Performance Computing (HPC) applications
[Figure: virtualized clusters running on top of a physical cluster]
•  Users require resources according to their needs
•  The provider allocates each user a dedicated virtual cluster on demand
HPC Cloud (cont’d)
•  Pros:
   –  User side: easy deployment
   –  Provider side: high resource utilization
•  Cons:
   –  Performance degradation?

  No method for performance tuning in a virtualized
  environment has been established yet.
Toward a practical HPC Cloud
[Figure: tuning steps that lead from the current HPC Cloud to a “true” HPC Cloud]
•  Current HPC Cloud: its performance is poor and unstable
•  Tuning steps
  –  Use PCI passthrough: the guest OS in the VM drives the NIC with the physical driver
  –  Set NUMA affinity: the VCPU threads of the VM (QEMU process) are placed on the physical CPUs of a CPU socket by the Linux kernel (KVM)
  –  Reduce VMM noise (not completed): reduce the overhead of interrupt virtualization, and disable unnecessary services on the host OS (e.g., ksmd)
•  “True” HPC Cloud: its performance is close to that of bare metal
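The slide names ksmd as one host service that can be disabled. The following is a minimal sketch of how that could be checked and done, assuming a Linux host that exposes kernel samepage merging (KSM) through the sysfs file `/sys/kernel/mm/ksm/run` (0 stops ksmd, 1 runs it, 2 stops it and unmerges all pages). The function names `describe_ksm` and `stop_ksm` are hypothetical and are not taken from the presentation.

```python
from pathlib import Path

# Assumption: the Linux KSM sysfs control file; not taken from the slides.
KSM_RUN = Path("/sys/kernel/mm/ksm/run")

# Meaning of the values the kernel accepts in the "run" file.
KSM_STATES = {
    "0": "stopped (already merged pages stay merged)",
    "1": "running",
    "2": "stopped (all merged pages unmerged)",
}


def describe_ksm(run_file: Path = KSM_RUN) -> str:
    """Return a readable KSM state, or 'unavailable' if the file cannot be read."""
    try:
        value = run_file.read_text().strip()
    except OSError:
        return "unavailable"
    return KSM_STATES.get(value, f"unknown value {value!r}")


def stop_ksm(run_file: Path = KSM_RUN) -> None:
    """Ask the kernel to stop ksmd; needs root privileges on a real host."""
    run_file.write_text("0\n")


if __name__ == "__main__":
    # Read-only check; call stop_ksm() explicitly to change the setting.
    print("KSM:", describe_ksm())
```

Both functions take the path as a parameter so that the logic can be tried against an ordinary file before touching a real host.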
PCI passthrough
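[Figure: three ways to give VM1 and VM2 access to a NIC]
•  IO emulation: the guest OS uses a guest driver; the VMM contains a vSwitch and the physical driver
•  PCI passthrough: the guest OS uses the physical driver
•  SR-IOV: each guest OS uses the physical driver; the NIC contains a switch (VEB)
[Table: VM sharing and performance compared for IO emulation, PCI passthrough, and SR-IOV; the cell values are not preserved in the text]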
Virtual CPU scheduling
[Figure: virtual CPU scheduling in Xen and KVM, layered as virtual machine, virtual machine monitor (VMM), and hardware]
•  Xen: guest OS threads run on the VCPUs (V0 to V3) of a VM (Xen DomU), next to the Dom0 VM; the domain scheduler of the Xen hypervisor maps VCPUs to physical CPUs (P0 to P3)
  –  A guest OS cannot run numactl
•  KVM: a VM is a QEMU process; its VCPU threads (V0 to V3) are mapped to physical CPUs (P0 to P3) in a CPU socket by the process scheduler of the Linux kernel (KVM)
NUMA affinity
[Figure: NUMA affinity on bare metal and on KVM]
•  Bare metal: numactl binds Linux threads, placed by the process scheduler, to the physical CPUs (P0 to P3) of a CPU socket and its memory
•  KVM: affinity is set at two levels
  –  In the guest OS, numactl binds threads to a vSocket
  –  On the host, taskset pins each vCPU thread of the VM (QEMU process) to a physical CPU (Vn = Pn)
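The host-side half of this scheme (pin vCPU n to physical CPU n) can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's `os.sched_setaffinity`, which sets the same kind of CPU affinity mask that taskset manipulates, and it assumes the vCPU thread IDs are already known (for example from the QEMU monitor). The helper names `pin_plan` and `apply_plan` and the thread IDs in the usage note are hypothetical.

```python
import os


def pin_plan(vcpu_tids, physical_cpus):
    """Map the n-th vCPU thread to the n-th physical CPU (Vn = Pn)."""
    if len(vcpu_tids) > len(physical_cpus):
        raise ValueError("more vCPU threads than physical CPUs")
    return {tid: cpu for tid, cpu in zip(vcpu_tids, physical_cpus)}


def apply_plan(plan):
    """Restrict each thread to exactly one CPU (Linux only)."""
    for tid, cpu in plan.items():
        os.sched_setaffinity(tid, {cpu})


if __name__ == "__main__" and hasattr(os, "sched_setaffinity"):
    # Demonstration on the calling process only (ID 0 means "myself"):
    # pin it to one CPU it may already use, then restore the old mask.
    original = os.sched_getaffinity(0)
    apply_plan(pin_plan([0], [min(original)]))
    print("pinned to", os.sched_getaffinity(0))
    os.sched_setaffinity(0, original)
```

For a four-vCPU guest, `pin_plan([4101, 4102, 4103, 4104], [0, 1, 2, 3])` would yield the one-to-one mapping shown in the figure; the thread IDs here are made up.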
Evaluation
 Evaluation of HPC applications on a 16-node cluster
 (part of the AIST Green Cloud Cluster)

 Compute node: Dell PowerEdge M610
   CPU          Intel quad-core Xeon E5540/2.53GHz x2
   Chipset      Intel 5520
   Memory       48 GB DDR3
   InfiniBand   Mellanox ConnectX (MT26428)

 Blade switch
   InfiniBand   Mellanox M3601Q (QDR 16 ports)

 Host machine environment
   OS             Debian 6.0.1
   Linux kernel   2.6.32-5-amd64
   KVM            0.12.50
   Compiler       gcc/gfortran 4.4.5
   MPI            Open MPI 1.4.2

 VM environment
   VCPU     8
   Memory   45 GB
MPI Point-to-Point communication performance
[Figure: log-log plot of bandwidth [MB/sec] (1 to 10000, higher is better) against message size [byte] (1 to 1G) for Bare Metal and KVM]
•  PCI passthrough brings MPI communication throughput close to that of bare metal machines
•  Bare Metal: non-virtualized cluster
NUMA affinity
Execution time on a single node: NPB multi-zone
(Computational Fluid Dynamics) and Bloss (non-linear
eigensolver)

                   SP-MZ [sec]     BT-MZ [sec]     Bloss [min]
   Bare Metal      94.41 (1.00)    138.01 (1.00)   21.02 (1.00)
   KVM             104.57 (1.11)   141.69 (1.03)   22.12 (1.05)
   KVM (w/ bind)   96.14 (1.02)    139.32 (1.01)   21.28 (1.01)

(Values in parentheses are relative to Bare Metal.)


NUMA affinity is an important performance factor not only
on bare metal machines but also on virtual machines.
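
The parenthesized ratios in the table can be reproduced by dividing each execution time by the bare metal time for the same benchmark. A short sketch, with the times copied from the table and the names `TIMES` and `relative_times` chosen here for illustration:

```python
# Execution times from the table (SP-MZ and BT-MZ in seconds, Bloss in minutes).
TIMES = {
    "Bare Metal": {"SP-MZ": 94.41, "BT-MZ": 138.01, "Bloss": 21.02},
    "KVM": {"SP-MZ": 104.57, "BT-MZ": 141.69, "Bloss": 22.12},
    "KVM (w/ bind)": {"SP-MZ": 96.14, "BT-MZ": 139.32, "Bloss": 21.28},
}


def relative_times(times, baseline="Bare Metal"):
    """Normalize every execution time to the baseline configuration."""
    base = times[baseline]
    return {
        config: {bench: round(t / base[bench], 2) for bench, t in row.items()}
        for config, row in times.items()
    }


if __name__ == "__main__":
    for config, row in relative_times(TIMES).items():
        print(f"{config:14s}", row)
```

With binding, the overhead relative to bare metal drops from 11% to 2% for SP-MZ, from 3% to 1% for BT-MZ, and from 5% to 1% for Bloss.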


NPB BT-MZ: Parallel efficiency
[Figure: performance [Gop/s total] (left axis, 0 to 300) and parallel efficiency (PE) [%] (right axis, 0 to 100, higher is better) against number of nodes (1, 2, 4, 8, 16) for Bare Metal, KVM, and Amazon EC2]
•  Degradation of PE: KVM: 2%, EC2: 14%
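The slides do not state how parallel efficiency (PE) is computed. A common definition, assumed here, is the aggregate throughput on n nodes divided by n times the single-node throughput. The sketch below applies that definition to made-up throughput values; the numbers are hypothetical and are not read from the plot.

```python
def parallel_efficiency(perf_by_nodes):
    """PE(n) in percent, assuming PE(n) = P(n) / (n * P(1)) for throughput P."""
    base = perf_by_nodes[1]
    return {
        n: 100.0 * perf / (n * base)
        for n, perf in sorted(perf_by_nodes.items())
    }


if __name__ == "__main__":
    # Hypothetical throughput in Gop/s total, NOT measured values from the talk.
    example = {1: 20.0, 2: 39.0, 4: 76.0, 8: 144.0, 16: 272.0}
    for n, pe in parallel_efficiency(example).items():
        print(f"{n:2d} nodes: PE = {pe:.1f}%")
```

Under this definition, perfect scaling gives 100% at every node count, and the "degradation of PE" figures would be differences between two such curves.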
Bloss: Parallel efficiency
•  Bloss: non-linear internal eigensolver
  –  Hierarchical parallel program by MPI and OpenMP
[Figure: parallel efficiency [%] (0 to 120) against number of nodes (1, 2, 4, 8, 16) for Bare Metal, KVM, Amazon EC2, and Ideal]
•  Overhead of communication and virtualization (plot annotation)
•  Degradation of PE: KVM: 8%, EC2: 22%
Summary
HPC Cloud is promising!
•  The performance of coarse-grained parallel
   applications is comparable to that on bare metal
   machines
•  We plan to operate a private cloud service
   “AIST Cloud” for HPC users
•  Open issues
  –  VMM noise reduction
  –  VMM-bypass device-aware VM scheduling
  –  Live migration with VMM-bypass devices
LINPACK Efficiency
[Figure: LINPACK efficiency (%) (0 to 100) against TOP500 rank for the TOP500 June 2011 list; systems are grouped by interconnect (InfiniBand, Gigabit Ethernet, 10 Gigabit Ethernet) and GPGPU machines are marked]
•  Efficiency = Rmax / Rpeak (Rmax: maximum LINPACK performance; Rpeak: theoretical peak performance)
•  InfiniBand: 79%, 10 Gigabit Ethernet: 74%, Gigabit Ethernet: 54%
•  #451: Amazon EC2 cluster compute instances
•  Virtualization causes the performance degradation!
Bloss: Parallel efficiency
•  Bloss: non-linear internal eigensolver
  –  Hierarchical parallel program by MPI and OpenMP
[Figure: parallel efficiency [%] (0 to 120) against number of nodes (1, 2, 4, 8, 16) for Bare Metal, KVM, KVM (w/ bind), Amazon EC2, and Ideal]
•  Binding threads to physical CPUs can make execution sensitive to VMM noise and degrade the performance

Mais conteúdo relacionado

Mais procurados

XCP: The Art of Open Virtualization for the Enterprise and the Cloud
XCP: The Art of Open Virtualization for the Enterprise and the CloudXCP: The Art of Open Virtualization for the Enterprise and the Cloud
XCP: The Art of Open Virtualization for the Enterprise and the CloudThe Linux Foundation
 
Xen Project Update LinuxCon Brazil
Xen Project Update LinuxCon BrazilXen Project Update LinuxCon Brazil
Xen Project Update LinuxCon BrazilThe Linux Foundation
 
Vmware management-with-vcli-5.0
Vmware management-with-vcli-5.0Vmware management-with-vcli-5.0
Vmware management-with-vcli-5.0Sathishkumar A
 
Xen Cloud Platform at Build a Cloud Day at SCALE 10x
Xen Cloud Platform at Build a Cloud Day at SCALE 10x Xen Cloud Platform at Build a Cloud Day at SCALE 10x
Xen Cloud Platform at Build a Cloud Day at SCALE 10x The Linux Foundation
 
9sept2009 concept electronics
9sept2009 concept electronics9sept2009 concept electronics
9sept2009 concept electronicsAgora Group
 
Xen cloud platform v1.1 (given at Build a Cloud Day in Antwerp)
Xen cloud platform v1.1 (given at Build a Cloud Day in Antwerp)Xen cloud platform v1.1 (given at Build a Cloud Day in Antwerp)
Xen cloud platform v1.1 (given at Build a Cloud Day in Antwerp)The Linux Foundation
 
Membase Meetup Chicago - january 2011
Membase Meetup Chicago - january 2011Membase Meetup Chicago - january 2011
Membase Meetup Chicago - january 2011Membase
 
Avnet & Rorke Data - Open Compute Summit '13
Avnet & Rorke Data - Open Compute Summit '13Avnet & Rorke Data - Open Compute Summit '13
Avnet & Rorke Data - Open Compute Summit '13DaWane Wanek
 
Why Choose Xen For Your Cloud?
Why Choose Xen For Your Cloud? Why Choose Xen For Your Cloud?
Why Choose Xen For Your Cloud? Todd Deshane
 
Securing your cloud with Xen's advanced security features
Securing your cloud with Xen's advanced security featuresSecuring your cloud with Xen's advanced security features
Securing your cloud with Xen's advanced security featuresThe Linux Foundation
 
Cots moves to multicore: AMD
Cots moves to multicore: AMDCots moves to multicore: AMD
Cots moves to multicore: AMDKonrad Witte
 
Virtualization in the Cloud @ Build a Cloud Day SFO May 2012
Virtualization in the Cloud @ Build a Cloud Day SFO May 2012Virtualization in the Cloud @ Build a Cloud Day SFO May 2012
Virtualization in the Cloud @ Build a Cloud Day SFO May 2012The Linux Foundation
 
SLES 11 SP2 PerformanceEvaluation for Linux on System z
SLES 11 SP2 PerformanceEvaluation for Linux on System zSLES 11 SP2 PerformanceEvaluation for Linux on System z
SLES 11 SP2 PerformanceEvaluation for Linux on System zIBM India Smarter Computing
 

Mais procurados (20)

XCP: The Art of Open Virtualization for the Enterprise and the Cloud
XCP: The Art of Open Virtualization for the Enterprise and the CloudXCP: The Art of Open Virtualization for the Enterprise and the Cloud
XCP: The Art of Open Virtualization for the Enterprise and the Cloud
 
Xen Project Update LinuxCon Brazil
Xen Project Update LinuxCon BrazilXen Project Update LinuxCon Brazil
Xen Project Update LinuxCon Brazil
 
XS Boston 2008 ARM
XS Boston 2008 ARMXS Boston 2008 ARM
XS Boston 2008 ARM
 
XS Boston 2008 Memory Overcommit
XS Boston 2008 Memory OvercommitXS Boston 2008 Memory Overcommit
XS Boston 2008 Memory Overcommit
 
Xen in the Cloud at SCALE 10x
Xen in the Cloud at SCALE 10xXen in the Cloud at SCALE 10x
Xen in the Cloud at SCALE 10x
 
cinder-agent
cinder-agentcinder-agent
cinder-agent
 
XS Boston 2008 XenLoop
XS Boston 2008 XenLoopXS Boston 2008 XenLoop
XS Boston 2008 XenLoop
 
Vmware management-with-vcli-5.0
Vmware management-with-vcli-5.0Vmware management-with-vcli-5.0
Vmware management-with-vcli-5.0
 
Xen Cloud Platform at Build a Cloud Day at SCALE 10x
Xen Cloud Platform at Build a Cloud Day at SCALE 10x Xen Cloud Platform at Build a Cloud Day at SCALE 10x
Xen Cloud Platform at Build a Cloud Day at SCALE 10x
 
9sept2009 concept electronics
9sept2009 concept electronics9sept2009 concept electronics
9sept2009 concept electronics
 
Xen cloud platform v1.1 (given at Build a Cloud Day in Antwerp)
Xen cloud platform v1.1 (given at Build a Cloud Day in Antwerp)Xen cloud platform v1.1 (given at Build a Cloud Day in Antwerp)
Xen cloud platform v1.1 (given at Build a Cloud Day in Antwerp)
 
Membase Meetup Chicago - january 2011
Membase Meetup Chicago - january 2011Membase Meetup Chicago - january 2011
Membase Meetup Chicago - january 2011
 
Avnet & Rorke Data - Open Compute Summit '13
Avnet & Rorke Data - Open Compute Summit '13Avnet & Rorke Data - Open Compute Summit '13
Avnet & Rorke Data - Open Compute Summit '13
 
Why Choose Xen For Your Cloud?
Why Choose Xen For Your Cloud? Why Choose Xen For Your Cloud?
Why Choose Xen For Your Cloud?
 
Linux PV on HVM
Linux PV on HVMLinux PV on HVM
Linux PV on HVM
 
XS Japan 2008 Xen Mgmt English
XS Japan 2008 Xen Mgmt EnglishXS Japan 2008 Xen Mgmt English
XS Japan 2008 Xen Mgmt English
 
Securing your cloud with Xen's advanced security features
Securing your cloud with Xen's advanced security featuresSecuring your cloud with Xen's advanced security features
Securing your cloud with Xen's advanced security features
 
Cots moves to multicore: AMD
Cots moves to multicore: AMDCots moves to multicore: AMD
Cots moves to multicore: AMD
 
Virtualization in the Cloud @ Build a Cloud Day SFO May 2012
Virtualization in the Cloud @ Build a Cloud Day SFO May 2012Virtualization in the Cloud @ Build a Cloud Day SFO May 2012
Virtualization in the Cloud @ Build a Cloud Day SFO May 2012
 
SLES 11 SP2 PerformanceEvaluation for Linux on System z
SLES 11 SP2 PerformanceEvaluation for Linux on System zSLES 11 SP2 PerformanceEvaluation for Linux on System z
SLES 11 SP2 PerformanceEvaluation for Linux on System z
 

Destaque

Unix _linux_fundamentals_for_hpc-_b
Unix  _linux_fundamentals_for_hpc-_bUnix  _linux_fundamentals_for_hpc-_b
Unix _linux_fundamentals_for_hpc-_bMohammad Reza Beygi
 
visual resume- ronie maydan
visual resume- ronie maydanvisual resume- ronie maydan
visual resume- ronie maydanronie_
 
Big Data and High Performance Computing Solutions in the AWS Cloud
Big Data and High Performance Computing Solutions in the AWS CloudBig Data and High Performance Computing Solutions in the AWS Cloud
Big Data and High Performance Computing Solutions in the AWS CloudAmazon Web Services
 
Laura Gainor Utilizing Social Media
Laura Gainor Utilizing Social MediaLaura Gainor Utilizing Social Media
Laura Gainor Utilizing Social MediaLaura Gainor
 
Visual Resume: Carlos Segura
Visual Resume: Carlos SeguraVisual Resume: Carlos Segura
Visual Resume: Carlos SeguraCarlos Segura
 
Vipindas - Visual Resume
Vipindas - Visual ResumeVipindas - Visual Resume
Vipindas - Visual ResumeVipindas S
 
Visual Resume 2.0
Visual Resume 2.0Visual Resume 2.0
Visual Resume 2.0mmp151
 
Visual Resume (Draft)
Visual Resume (Draft)Visual Resume (Draft)
Visual Resume (Draft)Corbin Otwell
 
Pieces Of Me: My Visual Resume
Pieces Of Me: My Visual ResumePieces Of Me: My Visual Resume
Pieces Of Me: My Visual ResumeMariehdb
 
Jeremy Baker's Visual Resume
Jeremy Baker's Visual ResumeJeremy Baker's Visual Resume
Jeremy Baker's Visual ResumeJeremy Baker
 
Superteacher Visual Resume
Superteacher Visual Resume Superteacher Visual Resume
Superteacher Visual Resume Chiara Ojeda
 
Rethinking Resumes
Rethinking ResumesRethinking Resumes
Rethinking ResumesKarla Wiles
 
10 minutes of me: Giordano Scalzo's Visual Resume
10 minutes of me: Giordano Scalzo's Visual Resume10 minutes of me: Giordano Scalzo's Visual Resume
10 minutes of me: Giordano Scalzo's Visual ResumeGiordano Scalzo
 
Tyler's Presentation Resume
Tyler's Presentation ResumeTyler's Presentation Resume
Tyler's Presentation ResumeTyler Totman
 

Destaque (20)

Unix _linux_fundamentals_for_hpc-_b
Unix  _linux_fundamentals_for_hpc-_bUnix  _linux_fundamentals_for_hpc-_b
Unix _linux_fundamentals_for_hpc-_b
 
HPC in the Cloud
HPC in the CloudHPC in the Cloud
HPC in the Cloud
 
visual resume- ronie maydan
visual resume- ronie maydanvisual resume- ronie maydan
visual resume- ronie maydan
 
Big Data and High Performance Computing Solutions in the AWS Cloud
Big Data and High Performance Computing Solutions in the AWS CloudBig Data and High Performance Computing Solutions in the AWS Cloud
Big Data and High Performance Computing Solutions in the AWS Cloud
 
Laura Gainor Utilizing Social Media
Laura Gainor Utilizing Social MediaLaura Gainor Utilizing Social Media
Laura Gainor Utilizing Social Media
 
HPC Market Update from IDC
HPC Market Update from IDCHPC Market Update from IDC
HPC Market Update from IDC
 
Visual Resume: Carlos Segura
Visual Resume: Carlos SeguraVisual Resume: Carlos Segura
Visual Resume: Carlos Segura
 
Wence's DigiResume
Wence's DigiResumeWence's DigiResume
Wence's DigiResume
 
Vipindas - Visual Resume
Vipindas - Visual ResumeVipindas - Visual Resume
Vipindas - Visual Resume
 
Visual Resume 2.0
Visual Resume 2.0Visual Resume 2.0
Visual Resume 2.0
 
Visual Resume
Visual Resume Visual Resume
Visual Resume
 
Visual Resume (Draft)
Visual Resume (Draft)Visual Resume (Draft)
Visual Resume (Draft)
 
UX designer visual resume
UX designer visual resumeUX designer visual resume
UX designer visual resume
 
Pieces Of Me: My Visual Resume
Pieces Of Me: My Visual ResumePieces Of Me: My Visual Resume
Pieces Of Me: My Visual Resume
 
Jeremy Baker's Visual Resume
Jeremy Baker's Visual ResumeJeremy Baker's Visual Resume
Jeremy Baker's Visual Resume
 
Superteacher Visual Resume
Superteacher Visual Resume Superteacher Visual Resume
Superteacher Visual Resume
 
Rethinking Resumes
Rethinking ResumesRethinking Resumes
Rethinking Resumes
 
Visual Resume
Visual ResumeVisual Resume
Visual Resume
 
10 minutes of me: Giordano Scalzo's Visual Resume
10 minutes of me: Giordano Scalzo's Visual Resume10 minutes of me: Giordano Scalzo's Visual Resume
10 minutes of me: Giordano Scalzo's Visual Resume
 
Tyler's Presentation Resume
Tyler's Presentation ResumeTyler's Presentation Resume
Tyler's Presentation Resume
 

Semelhante a Practical HPC Cloud: Performance tuning virtualized cluster

Cooperative VM Migration for a virtualized HPC Cluster with VMM-bypass I/O de...
Cooperative VM Migration for a virtualized HPC Cluster with VMM-bypass I/O de...Cooperative VM Migration for a virtualized HPC Cluster with VMM-bypass I/O de...
Cooperative VM Migration for a virtualized HPC Cluster with VMM-bypass I/O de...Ryousei Takano
 
I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜Ryousei Takano
 
ARMvisor @ Linux Symposium 2012
ARMvisor @ Linux Symposium 2012ARMvisor @ Linux Symposium 2012
ARMvisor @ Linux Symposium 2012Peter Chang
 
LCA 2013 - Baremetal Provisioning with Openstack
LCA 2013 - Baremetal Provisioning with OpenstackLCA 2013 - Baremetal Provisioning with Openstack
LCA 2013 - Baremetal Provisioning with OpenstackDevananda Van Der Veen
 
Nova for Physicalization and Virtualization compute models
Nova for Physicalization and Virtualization compute modelsNova for Physicalization and Virtualization compute models
Nova for Physicalization and Virtualization compute modelsopenstackindia
 
Virtualization Primer for Java Developers
Virtualization Primer for Java DevelopersVirtualization Primer for Java Developers
Virtualization Primer for Java DevelopersRichard McDougall
 
Hardware supports for Virtualization
Hardware supports for VirtualizationHardware supports for Virtualization
Hardware supports for VirtualizationYoonje Choi
 
Am 04 track1--salvatore orlando--openstack-apac-2012-final
Am 04 track1--salvatore orlando--openstack-apac-2012-finalAm 04 track1--salvatore orlando--openstack-apac-2012-final
Am 04 track1--salvatore orlando--openstack-apac-2012-finalOpenCity Community
 
Realtime scheduling for virtual machines in SKT
Realtime scheduling for virtual machines in SKTRealtime scheduling for virtual machines in SKT
Realtime scheduling for virtual machines in SKTThe Linux Foundation
 
VMware Nova Compute Driver
VMware Nova Compute DriverVMware Nova Compute Driver
VMware Nova Compute DriverSean Chen
 
virtualization tutorial at ACM bangalore Compute 2009
virtualization tutorial at ACM bangalore Compute 2009virtualization tutorial at ACM bangalore Compute 2009
virtualization tutorial at ACM bangalore Compute 2009ACMBangalore
 
Virtualization Technology Overview
Virtualization Technology OverviewVirtualization Technology Overview
Virtualization Technology OverviewOpenCity Community
 
HPC Cloud: Clouds on supercomputers for HPC
HPC Cloud: Clouds on supercomputers for HPCHPC Cloud: Clouds on supercomputers for HPC
HPC Cloud: Clouds on supercomputers for HPCRyousei Takano
 
Windows server 8 hyper v networking (aidan finn)
Windows server 8 hyper v networking (aidan finn)Windows server 8 hyper v networking (aidan finn)
Windows server 8 hyper v networking (aidan finn)hypervnu
 
Windsor: Domain 0 Disaggregation for XenServer and XCP
	Windsor: Domain 0 Disaggregation for XenServer and XCP	Windsor: Domain 0 Disaggregation for XenServer and XCP
Windsor: Domain 0 Disaggregation for XenServer and XCPThe Linux Foundation
 
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...
DPDK Summit - 08 Sept 2014 - Futurewei - Jun Xu - Revisit the IP Stack in Lin...Jim St. Leger
 
Windows Server 8 Hyper V Networking
Windows Server 8 Hyper V NetworkingWindows Server 8 Hyper V Networking
Windows Server 8 Hyper V NetworkingAidan Finn
 
Apache Hadoop on Virtual Machines
Apache Hadoop on Virtual MachinesApache Hadoop on Virtual Machines
Apache Hadoop on Virtual MachinesDataWorks Summit
 
Windows Server 2008 Web Workload Overview
Windows Server 2008 Web Workload OverviewWindows Server 2008 Web Workload Overview
Windows Server 2008 Web Workload OverviewDavid Chou
 

Semelhante a Practical HPC Cloud: Performance tuning virtualized cluster (20)

Cooperative VM Migration for a virtualized HPC Cluster with VMM-bypass I/O de...
Cooperative VM Migration for a virtualized HPC Cluster with VMM-bypass I/O de...Cooperative VM Migration for a virtualized HPC Cluster with VMM-bypass I/O de...
Cooperative VM Migration for a virtualized HPC Cluster with VMM-bypass I/O de...
 
I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜I/O仮想化最前線〜ネットワークI/Oを中心に〜
I/O仮想化最前線〜ネットワークI/Oを中心に〜
 
ARMvisor @ Linux Symposium 2012
ARMvisor @ Linux Symposium 2012ARMvisor @ Linux Symposium 2012
ARMvisor @ Linux Symposium 2012
 
LCA 2013 - Baremetal Provisioning with Openstack
LCA 2013 - Baremetal Provisioning with OpenstackLCA 2013 - Baremetal Provisioning with Openstack
LCA 2013 - Baremetal Provisioning with Openstack
 
Nova for Physicalization and Virtualization compute models
Nova for Physicalization and Virtualization compute modelsNova for Physicalization and Virtualization compute models
Nova for Physicalization and Virtualization compute models
 
Virtualization Primer for Java Developers
Virtualization Primer for Java DevelopersVirtualization Primer for Java Developers
Virtualization Primer for Java Developers
 
Hardware supports for Virtualization
Hardware supports for VirtualizationHardware supports for Virtualization
Hardware supports for Virtualization
 
Am 04 track1--salvatore orlando--openstack-apac-2012-final
Practical HPC Cloud: Performance tuning virtualized cluster

  • 1. Toward a practical “HPC Cloud”: Performance tuning of a virtualized HPC cluster. Ryousei Takano, Information Technology Research Institute, National Institute of Advanced Industrial Science and Technology (AIST), Japan. SC2011@Seattle, Nov. 15, 2011.
  • 2. Outline: What is HPC Cloud? Performance tuning method for HPC Cloud (PCI passthrough, NUMA affinity, VMM noise reduction). Performance evaluation.
  • 3. HPC Cloud. HPC Cloud utilizes cloud resources in High Performance Computing (HPC) applications. [Figure: virtualized clusters running on top of a physical cluster.] Users request resources according to their needs; the provider allocates each user a dedicated virtual cluster on demand.
  • 4. HPC Cloud (cont’d). Pros: easy deployment (user side); high resource utilization (provider side). Cons: performance degradation? A method of performance tuning for virtualized environments has not yet been established.
  • 5. Toward a practical HPC Cloud. [Figure: a VM (QEMU process) with guest OS threads and VCPU threads running on the Linux kernel with KVM, above the physical CPUs of a CPU socket, with a NIC driven by a physical driver in the guest OS.] Current KVM HPC Cloud: its performance is poor and unstable. Tuning steps: use PCI passthrough; set NUMA affinity; reduce VMM noise (not completed), that is, reduce the overhead of interrupt virtualization and disable unnecessary services on the host OS (e.g., ksmd). Goal: a “true” HPC Cloud whose performance is close to that of bare metal.
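  • 6. PCI passthrough. [Figure: three I/O virtualization approaches, each shown with VM1, VM2, a VMM, and a NIC. IO emulation: a guest driver in the guest OS, with a vSwitch and the physical driver in the VMM. PCI passthrough: the physical driver runs in the guest OS. SR-IOV: the physical driver runs in the guest OS and the NIC provides a switch (VEB). The three approaches are compared in terms of VM sharing and performance.]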
  • 7. Virtual CPU scheduling. [Figure: Bare Metal, Xen, and KVM. Xen: a VM (Xen DomU) runs guest OS threads on VCPUs V0 to V3, which the Xen hypervisor's domain scheduler maps onto physical CPUs P0 to P3; Dom0 is a separate VM. KVM: a VM is a QEMU process whose guest OS threads run on VCPU threads V0 to V3, which the Linux kernel's process scheduler maps onto physical CPUs P0 to P3 of a CPU socket.] A guest OS cannot run numactl.
  • 8. NUMA affinity. [Figure: Bare Metal vs. KVM, each with physical CPUs P0 to P3 and memory on a CPU socket.] Bare metal (Linux): numactl binds threads to a CPU socket through the process scheduler. KVM: inside the guest OS, numactl binds threads to a vSocket (VCPU threads V0 to V3); on the host, taskset pins each vCPU to a physical CPU (Vn = Pn).
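The Vn = Pn pinning described on this slide can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's tooling: the thread IDs are hypothetical placeholders (on a real host they would be the VCPU thread IDs of the QEMU process), and the helper names `build_pin_map` and `taskset_commands` are invented for this example. The sketch only builds `taskset` command lines; it does not execute them.

```python
def build_pin_map(vcpu_tids, physical_cpus):
    """Map the n-th VCPU thread to the n-th physical CPU (Vn = Pn)."""
    if len(vcpu_tids) > len(physical_cpus):
        raise ValueError("more VCPU threads than physical CPUs")
    return {tid: physical_cpus[n] for n, tid in enumerate(vcpu_tids)}


def taskset_commands(pin_map):
    """Render one `taskset -pc <cpu> <tid>` command line per VCPU thread."""
    return [f"taskset -pc {cpu} {tid}" for tid, cpu in pin_map.items()]


if __name__ == "__main__":
    # Hypothetical thread IDs standing in for VCPUs V0..V3 of one QEMU process.
    vcpu_tids = [4101, 4102, 4103, 4104]
    for command in taskset_commands(build_pin_map(vcpu_tids, [0, 1, 2, 3])):
        print(command)
```

On Linux, `os.sched_setaffinity(tid, {cpu})` would apply the same mapping from Python directly. Binding threads to a vSocket inside the guest would still be done with numactl, as the slide describes.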
  • 9. Evaluation. Evaluation of HPC applications on a 16-node cluster (part of the AIST Green Cloud Cluster).
      Compute node: Dell PowerEdge M610
        CPU: Intel quad-core Xeon E5540/2.53GHz x2
        Chipset: Intel 5520
        Memory: 48 GB DDR3
        InfiniBand: Mellanox ConnectX (MT26428)
      Blade switch
        InfiniBand: Mellanox M3601Q (QDR 16 ports)
      Host machine environment
        OS: Debian 6.0.1
        Linux kernel: 2.6.32-5-amd64
        KVM: 0.12.50
        Compiler: gcc/gfortran 4.4.5
        MPI: Open MPI 1.4.2
      VM environment
        VCPU: 8
        Memory: 45 GB
  • 10. MPI Point-to-Point communication performance. [Figure: bandwidth [MB/sec] versus message size [byte] on logarithmic axes, for Bare Metal and KVM; higher is better.] PCI passthrough improves MPI communication throughput close to that of bare metal machines. Bare Metal: non-virtualized cluster.
  • 11. NUMA affinity. Execution time on a single node: NPB multi-zone (Computational Fluid Dynamics) and Bloss (non-linear eigensolver).
                       SP-MZ [sec]      BT-MZ [sec]      Bloss [min]
      Bare Metal       94.41 (1.00)     138.01 (1.00)    21.02 (1.00)
      KVM              104.57 (1.11)    141.69 (1.03)    22.12 (1.05)
      KVM (w/ bind)    96.14 (1.02)     139.32 (1.01)    21.28 (1.01)
    NUMA affinity is an important performance factor not only on bare metal machines but also on virtual machines.
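The values in parentheses appear to be execution times normalized to the bare metal run. The short sketch below recomputes them from the times on the slide; the function name `relative_times` is made up for this illustration.

```python
# Execution times from the slide (SP-MZ and BT-MZ in seconds, Bloss in minutes).
TIMES = {
    "Bare Metal":    {"SP-MZ": 94.41,  "BT-MZ": 138.01, "Bloss": 21.02},
    "KVM":           {"SP-MZ": 104.57, "BT-MZ": 141.69, "Bloss": 22.12},
    "KVM (w/ bind)": {"SP-MZ": 96.14,  "BT-MZ": 139.32, "Bloss": 21.28},
}


def relative_times(times, baseline="Bare Metal"):
    """Divide each execution time by the baseline's time for the same benchmark."""
    base = times[baseline]
    return {
        config: {bench: round(t / base[bench], 2) for bench, t in row.items()}
        for config, row in times.items()
    }


if __name__ == "__main__":
    for config, row in relative_times(TIMES).items():
        print(config, row)
```

With these numbers, binding reduces the SP-MZ slowdown under KVM from 11% to 2% relative to bare metal.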
  • 12. NPB BT-MZ: Parallel efficiency (higher is better). [Figure: performance [Gop/s total] and parallel efficiency (PE) [%] versus number of nodes (1, 2, 4, 8, 16) for Bare Metal, KVM, and Amazon EC2.] Degradation of PE: KVM: 2%, EC2: 14%.
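The slides do not spell out how parallel efficiency is computed. A common definition, assumed here, is the measured throughput on n nodes divided by n times the single-node throughput. The sketch below applies that definition to hypothetical throughput figures, which are not taken from the slides.

```python
def parallel_efficiency(perf_by_nodes):
    """Return parallel efficiency [%] per node count, relative to the 1-node run.

    Assumed definition: PE(n) = 100 * perf(n) / (n * perf(1)).
    """
    base = perf_by_nodes[1]
    return {
        nodes: 100.0 * perf / (nodes * base)
        for nodes, perf in sorted(perf_by_nodes.items())
    }


if __name__ == "__main__":
    # Hypothetical total throughput per node count (illustrative only).
    sample = {1: 10.0, 2: 19.0, 4: 36.0, 8: 64.0, 16: 112.0}
    for nodes, pe in parallel_efficiency(sample).items():
        print(f"{nodes:2d} nodes: {pe:5.1f}%")
```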
  • 13. Bloss: Parallel efficiency. Bloss: non-linear internal eigensolver, a hierarchical parallel program using MPI and OpenMP. [Figure: parallel efficiency [%] versus number of nodes (1, 2, 4, 8, 16) for Bare Metal, KVM, Amazon EC2, and Ideal.] Overhead of communication and virtualization. Degradation of PE: KVM: 8%, EC2: 22%.
  • 14. Summary: HPC Cloud is promising! The performance of coarse-grained parallel applications is comparable to that on bare metal machines. We plan to operate a private cloud service, “AIST Cloud”, for HPC users. Open issues: VMM noise reduction; VMM-bypass device-aware VM scheduling; live migration with VMM-bypass devices.
  • 15. LINPACK Efficiency (TOP500, June 2011). [Figure: efficiency (%) versus TOP500 rank for InfiniBand, 10 Gigabit Ethernet, Gigabit Ethernet, and GPGPU machines.] InfiniBand: 79%; 10 Gigabit Ethernet: 74%; Gigabit Ethernet: 54%. #451: Amazon EC2 cluster compute instances. Virtualization causes the performance degradation! Efficiency = Rmax / Rpeak, where Rmax is the maximum LINPACK performance and Rpeak is the theoretical peak performance.
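The efficiency metric on this slide is the ratio of Rmax to Rpeak. A minimal sketch of that calculation follows; the function name and the sample values are hypothetical and are not TOP500 data.

```python
def linpack_efficiency(rmax, rpeak):
    """Return LINPACK efficiency in percent: 100 * Rmax / Rpeak."""
    if rpeak <= 0:
        raise ValueError("Rpeak must be positive")
    return 100.0 * rmax / rpeak


if __name__ == "__main__":
    # Hypothetical system: Rmax and Rpeak in the same unit (e.g., TFlop/s).
    print(f"{linpack_efficiency(rmax=79.0, rpeak=100.0):.1f}%")
```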
  • 16. Bloss: Parallel efficiency. Bloss: non-linear internal eigensolver, a hierarchical parallel program using MPI and OpenMP. [Figure: parallel efficiency [%] versus number of nodes (1, 2, 4, 8, 16) for Bare Metal, KVM, KVM (w/ bind), Amazon EC2, and Ideal.] Binding threads and physical CPUs can be sensitive to VMM noise and degrade the performance.