IN4392 Cloud Computing
Multi-Tenancy, including Virtualization




Cloud Computing (IN4392)

D.H.J. Epema and A. Iosup

2012-2013

                                          1



Parallel and Distributed Systems Group
Terms for Today’s Discussion




 mul·ti-te·nan·cy noun ˌməl-ti-ˌte-nan(t)-sē
  an IT sharing model of how physical and virtual
  resources are used by possibly concurrent tenants
2012-2013                                             2
Characteristics of Multi-Tenancy

1. Isolation = separation of services provided to each tenant
   (the noisy neighbor)
2. Scaling conveniently with the number and size of tenants
   (max weight in the elevator)
3. Meeting SLAs for each tenant
4. Support for per-tenant service customization
5. Support for value-adding ops, e.g., backup, upgrade
6. Secure data processing and storage
   (the snoopy neighbor)
7. Compliance with regulatory law (per legislator, per tenant)

2012-2013                                              3
Benefits of Multi-Tenancy (the Promise)


• Cloud operator
       • Economy of scale
       • Market-share and branding (for the moment)
• Users
       •    Flexibility
       •    Focus on core expertise
       •    Reduced cost
       •    Reduced time-to-market
• Overall
       • Reduced cost of IT deployment and operation
2012-2013                                              4
Agenda


1.     Introduction
2.     Multi-Tenancy in Practice (The Problem)
3.     Architectural Models for Multi-Tenancy in Clouds
4.     Shared Nothing: Fairness
5.     Shared Hardware: Virtualization
6.     Sharing Other Operational Levels
7.     Summary



2012-2013                                                 5
Problems with Multi-Tenancy [1/5]
A List of Concerns
• Users
       •    Performance isolation (and variability) for all resources
       •    Scalability with the number of tenants (per resource)
       •    Support for value-added ops for each application type
       •    Security concerns (too many to list)
• Owners
       •    Up-front and operational costs
       •    Human management of multi-tenancy
       •    Development effort and required skills
       •    Time-to-market
• The law: think health management applications
2012-2013                                                               6
Problems with Multi-Tenancy [2/5]
   Load Imbalance


• Overall workload imbalance: normalized daily load (5:1)
• Temporary workload imbalance: hourly load (1000:1)
[Figure: daily load per cluster, annotated with the overall
 imbalance (left) and the temporary imbalance (right)]
   2012-2013                                          7
Problems with Multi-Tenancy [3/5]
 Practical Achievable Utilization


 • Enterprise: <15% [McKinsey’12]
 • Parallel production environments: 60-70% [Nitzberg’99]
 • Grids: 15-30% average cluster,
         >90% busy clusters
 • Today’s clouds: ???




 2012-2013                                           8


Iosup and Epema: Grid Computing Workloads.
   IEEE Internet Computing 15(2): 19-26 (2011)
Problems with Multi-Tenancy [4/5]
  (Catastrophic) Cascading Failures


 • Parallel production environments: one failure kills one or
   more parallel jobs
 • Grids: correlated failures
 • Today’s clouds: Amazon, Facebook, etc.
   had catastrophic failures
   in the past 2-3 years

 [Figure: CDF of the size of correlated failures;
  average = 11 nodes, range = 1-339 nodes]

 2012-2013                                                    9


Iosup et al. : On the dynamic resource
   availability in grids. GRID 2007: 26-33
Problems with Multi-Tenancy [5/5]
       Economics


 • Up-front: a shared approach is more difficult to develop than
   an isolated approach; may also require expensive skills




       2012-2013                                          10



Source:
www.capcloud.org/TechGate/Multitenancy_Magic.pptx
Agenda


1.     Introduction
2.     Multi-Tenancy in Practice (The Problem)
3.     Architectural Models for Multi-Tenancy in Clouds
4.     Shared Nothing: Fairness
5.     Shared Hardware: Virtualization
6.     Sharing Other Operational Levels
7.     Summary



2012-2013                                           11
2012-2013   12
Agenda


1.     Introduction
2.     Multi-Tenancy in Practice (The Problem)
3.     Architectural Models for Multi-Tenancy in Clouds
4.     Shared Nothing: Fairness
5.     Shared Hardware: Virtualization
6.     Sharing Other Operational Levels
7.     Summary



2012-2013                                                 13
Fairness




• Intuitively, the distribution of goods (distributive justice)
• Different people, different perceptions of justice
       • Everyone pays the same vs.
         the rich pay proportionally higher taxes
       • I only need to pay a few years later than everyone else
2012-2013                                                          14
The VL-e project: application areas

[Figure: VL-e layered architecture. Application areas (Philips Medical
 Diagnosis & Imaging, Bio-Diversity, Bio-Informatics, IBM Data-Intensive
 Science, Unilever Food Informatics, Dutch Telescience) submit
 Bags-of-Tasks to the Virtual Laboratory (VL) with Application-Oriented
 Services, which runs on Grid Services that harness multi-domain
 distributed resources and manage communication & computing.]

15
The VL-e project: application areas

[Figure: the same VL-e layered architecture as the previous slide,
 annotated "Fairness for all!"]

Task (groups of 5, 5 minutes):
discuss fairness for this scenario.

Task (inter-group discussion):
discuss fairness for this scenario.

    16
Research Questions

Q1: What is the design space for BoT scheduling
    in large-scale, distributed, fine-grained computing?

Q2: What is the performance of BoT schedulers
    in this setting?

 2012-2013                                 17
Scheduling Model [1/4]
      Overview

      •    System Model
           1. Clusters execute jobs
           2. Resource managers coordinate job execution
           3. Resource management architectures route jobs among resource managers
           4. Task selection policies create the eligible set (Fairness for all!)
           5. Task scheduling policies schedule the eligible set
      18


Iosup et al.: The performance of bags-of-tasks in large-
   scale distributed systems. HPDC 2008: 97-108              Q1
Scheduling Model [2/4]
      Resource Management Architectures
      route jobs among resource managers




           [Figure: three resource management architectures:
            Centralized (csp), Separated Clusters (sep-c),
            and Decentralized (fcondor)]

      19


Iosup et al.: The performance of bags-of-tasks in large-
   scale distributed systems. HPDC 2008: 97-108                            Q1
Scheduling Model [3/4]
      Task Selection Policies
      create the eligible set
      •    Age-based:
           1. S-T: Select Tasks in the order of their arrival.
           2. S-BoT: Select BoTs in the order of their arrival.
      •    User priority based:
           3. S-U-Prio: Select the tasks of the User with the highest Priority.
      •    Based on fairness in resource consumption:
            4.   S-U-T: Select the Tasks of the User with the lowest resource consumption.
            5.   S-U-BoT: Select the BoTs of the User with the lowest resource consumption.
            6.   S-U-GRR: Select Users Round-Robin; take all tasks of the selected user.
            7.   S-U-RR: Select Users Round-Robin; take one task of the selected user.
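
      A minimal sketch of two of these selection policies in Python,
      assuming each queued task is a small dict with hypothetical
      user and arrival fields, and that per-user resource consumption
      is tracked separately:

      def select_s_t(queue):
          """S-T: pick the task that arrived first (age-based)."""
          return min(queue, key=lambda t: t["arrival"])

      def select_s_u_t(queue, consumed):
          """S-U-T: pick a task of the user with the lowest resource
          consumption so far (fairness in resource consumption)."""
          users = {t["user"] for t in queue}
          poorest = min(users, key=lambda u: consumed.get(u, 0.0))
          candidates = [t for t in queue if t["user"] == poorest]
          return min(candidates, key=lambda t: t["arrival"])

      queue = [{"user": "u1", "arrival": 0}, {"user": "u2", "arrival": 1}]
      print(select_s_u_t(queue, {"u1": 100.0, "u2": 5.0}))  # u2's task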
      20


Iosup et al.: The performance of bags-of-tasks in large-
   scale distributed systems. HPDC 2008: 97-108                              Q1
Context: System Model [4/4]
      Task Scheduling Policies
      schedule the eligible set
      • Information availability (for task and resource information):
           Known (K), Historical records (H), Unknown (U)
      • Policies, by resource-information availability:
           • K: ECT, ECT-P, FPF, FPLT
           • H: DFPLT, MQD
           • U: STFR, RR, WQR
      • Sample policies:
           • Earliest Completion Time (with Prediction of Runtimes) (ECT(-P))
           • Fastest Processor First (FPF)
           • (Dynamic) Fastest Processor Largest Task ((D)FPLT)
           • Shortest Task First w/ Replication (STFR)
           • Work Queue w/ Replication (WQR)
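
      A minimal sketch of the ECT idea in Python, assuming known task
      sizes and known relative machine speeds (all names illustrative):

      def ect_assign(task_size, machines):
          """Earliest Completion Time: place the task on the machine
          that would finish it soonest, given each machine's current
          backlog (seconds) and relative speed."""
          def completion(m):
              return m["free_at"] + task_size / m["speed"]
          best = min(machines, key=completion)
          best["free_at"] = completion(best)   # book the machine
          return best["name"]

      machines = [{"name": "fast", "speed": 1.75, "free_at": 0.0},
                  {"name": "slow", "speed": 1.00, "free_at": 0.0}]
      print(ect_assign(10.0, machines))  # -> fast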
      21


Iosup et al.: The performance of bags-of-tasks in large-
   scale distributed systems. HPDC 2008: 97-108                               Q1
Design Space Exploration [1/3]
      Overview

      • Design space exploration: time to understand
        how our solutions fit into the complete system.
     s x 7P x I x S x A x (environment) → >2M design points

      • Study the impact of:
           •   The Task Scheduling Policy (s policies)
           •   The Workload Characteristics (P characteristics)
           •   The Dynamic System Information (I levels)
           •   The Task Selection Policy (S policies)
           •   The Resource Management Architecture (A policies)
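
      The design-point count is just the product of the dimension
      sizes; a toy computation, with all counts assumed purely for
      illustration (the slide's actual counts multiply out to >2M):

      from math import prod

      dims = {"task scheduling policies": 14,
              "workload characteristics": 7,
              "information levels": 3,
              "task selection policies": 7,
              "architectures": 3,
              "environments": 10}
      print(prod(dims.values()))  # 61740 with these toy counts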
      22


Iosup et al.: The performance of bags-of-tasks in large-
   scale distributed systems. HPDC 2008: 97-108                    Q2
Design Space Exploration [2/3]
      Experimental Setup
      • Simulator:
           • DGSim [IosupETFL SC’07, IosupSE EuroPar’08]
      • System:
           • DAS + Grid’5000 [Cappello & Bal CCGrid’07]
           • >3,000 CPUs: relative perf. 1-1.75
      • Metrics:
           • Makespan
           • Normalized Schedule Length ~ speed-up
      • Workloads:
           • Real: DAS + Grid’5000
           • Realistic: system load 20-95% (from workload model)
      23


Iosup et al.: The performance of bags-of-tasks in large-
   scale distributed systems. HPDC 2008: 97-108                    Q2
Design Space Exploration [3/3]
      Task Selection, including Fair Policies
      • Task selection policies matter only for busy systems
      • Naïve user priority can lead to poor performance
      • Fairness, in general, reduces performance

      [Figure: performance of the S-U-Prio policy vs. the fair
       S-U-* policies (S-U-T, S-U-BoT, …)]
      24


Iosup et al.: The performance of bags-of-tasks in large-
   scale distributed systems. HPDC 2008: 97-108                     Q2
Quincy: Microsoft’s Fair Scheduler


      • Fairness in Microsoft’s Dryad data centers
              • Large jobs (30 minutes or longer) should not monopolize the
                whole cluster (similar: Bounded Slowdown [Feitelson et al.’97])
              • A job that takes t seconds with exclusive access to the cluster
                should take at most J x t seconds when J jobs run concurrently
                (sketch after the list below).


      • Challenges
             1. Support fairness
             2. Improve data locality: use data center’s network and storage
                architecture to reduce job response time
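
      The fairness goal above is easy to state as a predicate; a
      minimal sketch (all names hypothetical):

      def within_quincy_bound(observed_s, exclusive_s, concurrent_jobs):
          """True if a job needing exclusive_s seconds alone finished
          within J * exclusive_s seconds among J concurrent jobs."""
          return observed_s <= concurrent_jobs * exclusive_s

      print(within_quincy_bound(observed_s=35.0, exclusive_s=10.0,
                                concurrent_jobs=4))  # True: 35 <= 40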

      2012-2013                                                           25


Isard et al.: Quincy: fair scheduling for distributed
   computing clusters. SOSP 2009: 261-276
Dryad Workloads




      2012-2013                                         26


Isard et al.: Quincy: fair scheduling for distributed
   computing clusters. SOSP 2009: 261-276
Dryad Workloads                                    Q: Worst-case scenario?




        2012-2013                                                                    27



Source:
http://sigops.org/sosp/sosp09/slides/quincy/QuincyTestPage.html
Dryad Workloads                                           Q: Is this fair?




        2012-2013                                                                    28



Source:
http://sigops.org/sosp/sosp09/slides/quincy/QuincyTestPage.html
Dryad Workloads                                           Q: Is this fair?




        2012-2013                                                                    29



Source:
http://sigops.org/sosp/sosp09/slides/quincy/QuincyTestPage.html
Dryad Workloads                                           Q: Is this fair?




        2012-2013                                                                    30



Source:
http://sigops.org/sosp/sosp09/slides/quincy/QuincyTestPage.html
Quincy
      Cluster Architecture: Racks and Computers




      2012-2013                                         31


Isard et al.: Quincy: fair scheduling for distributed
   computing clusters. SOSP 2009: 261-276
Quincy
      Main Idea: Graph Min-Cost Flow

      • From scheduling to Graph Min-Cost Flow
             • Feasible schedule = min-cost flow
      • Graph construction
             • Graph from job tasks to computers, passing through cluster
               headnodes and racks (toy construction after this list)
             • Edges weighted by cost function (scheduling constraints, e.g.,
               fairness)
      • Pros and Cons
             • From per-job (local) decisions to workload (global) decisions
             • Complex graph construction
             • Edge weight assumes all constraints can be normalized
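
      A toy version of the graph construction, using networkx's
      min-cost-flow solver; the node names, capacities, and edge
      costs are illustrative stand-ins for Quincy's locality and
      fairness costs, not its actual cost model:

      import networkx as nx

      G = nx.DiGraph()
      tasks, computers = ["t1", "t2"], ["c1", "c2"]
      for t in tasks:
          G.add_node(t, demand=-1)            # each task sends one unit of flow
      G.add_node("sink", demand=len(tasks))   # all placements drain here
      for c in computers:
          G.add_edge(c, "sink", capacity=1, weight=0)  # one task per computer
      # Task-to-computer edges; lower weight = better (e.g., data locality).
      G.add_edge("t1", "c1", capacity=1, weight=1)
      G.add_edge("t1", "c2", capacity=1, weight=5)
      G.add_edge("t2", "c1", capacity=1, weight=4)
      G.add_edge("t2", "c2", capacity=1, weight=1)

      flow = nx.min_cost_flow(G)              # feasible schedule = min-cost flow
      for t in tasks:
          print(t, "->", [c for c, f in flow[t].items() if f == 1][0])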
      2012-2013                                                            32


Isard et al.: Quincy: fair scheduling for distributed
   computing clusters. SOSP 2009: 261-276
Quincy Operation [1/6]

              [Figure: workflow tasks on the left; the cluster, racks,
               and computers on the right]




      2012-2013                                                 33


Isard et al.: Quincy: fair scheduling for distributed
   computing clusters. SOSP 2009: 261-276
Quincy Operation [2/6]

                         [Figure: each job also has an "unscheduled" node,
                          so tasks can remain queued]




      2012-2013                                         34


Isard et al.: Quincy: fair scheduling for distributed
   computing clusters. SOSP 2009: 261-276
Quincy Operation [3/6]
                                        Q: How easy is it to encode
                                        heterogeneous resources?

                  [Figure: weighted edges connect tasks to computers,
                   racks, and the whole cluster]




      2012-2013                                               35


Isard et al.: Quincy: fair scheduling for distributed
   computing clusters. SOSP 2009: 261-276
Quincy Operation [4/6]




                  [Figure: the root task gets one computer]


      2012-2013                                         36


Isard et al.: Quincy: fair scheduling for distributed
   computing clusters. SOSP 2009: 261-276
Quincy Operation [5/6]
      Dynamic Schedule for One Job




      2012-2013                                         37


Isard et al.: Quincy: fair scheduling for distributed
   computing clusters. SOSP 2009: 261-276
Quincy Operation [6/6]
      Dynamic Schedule for Two Jobs




                  Q: How compute-intensive is the
                         Quincy scheduler,
                  for many jobs and/or computers?




      2012-2013                                         38


Isard et al.: Quincy: fair scheduling for distributed
   computing clusters. SOSP 2009: 261-276
Quincy
      Experimental Setup

      • Schedulers
             • Encoded two fair variants (w/ and w/o pre-emption)
             • Encoded two unfair variants (w/ and w/o pre-emption)
             • Comparison with Greedy Algorithm (Queue-Based)
      • Typical Dryad jobs
             • Workload includes worst-case scenario
      • Environment
             • 1 cluster
             • 8 racks
             • 240 nodes
      2012-2013                                                       39


Isard et al.: Quincy: fair scheduling for distributed
   computing clusters. SOSP 2009: 261-276
Quincy
      Experimental Results [1/5]




      2012-2013                                         40


Isard et al.: Quincy: fair scheduling for distributed
   computing clusters. SOSP 2009: 261-276
Quincy
      Experimental Results [2/5]




      2012-2013                                         41


Isard et al.: Quincy: fair scheduling for distributed
   computing clusters. SOSP 2009: 261-276
Quincy, Experimental Results [3/5]
      No Fairness




      2012-2013                                         42


Isard et al.: Quincy: fair scheduling for distributed
   computing clusters. SOSP 2009: 261-276
Quincy, Experimental Results [4/5]
      Queue-Based Fairness




      2012-2013                                         43


Isard et al.: Quincy: fair scheduling for distributed
   computing clusters. SOSP 2009: 261-276
Quincy, Experimental Results [5/5]
      Quincy Fairness




      2012-2013                                         44


Isard et al.: Quincy: fair scheduling for distributed
   computing clusters. SOSP 2009: 261-276
Mesos
Dominant Resource Fairness

• Multiple resource types
• Max-Min fairness = maximize minimum per-user allocation

• Paper 1 in Topic 4
• Dominant Resource Fairness is better explained in:
    Ghodsi et al., Dominant Resource Fairness: Fair Allocation
       of Multiple Resource Types, Usenix NSDI 2011.
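
A minimal sketch of the DRF allocation loop, using the running example
from Ghodsi et al. (9 CPUs, 18 GB; user A demands <1 CPU, 4 GB> per
task, user B demands <3 CPU, 1 GB>); the loop is simplified to stop at
the first task that no longer fits:

capacity = {"cpu": 9.0, "mem": 18.0}
demands = {"A": {"cpu": 1.0, "mem": 4.0},
           "B": {"cpu": 3.0, "mem": 1.0}}
used = {u: {r: 0.0 for r in capacity} for u in demands}
total = {r: 0.0 for r in capacity}

def dominant_share(u):
    """A user's share of the resource they use most heavily."""
    return max(used[u][r] / capacity[r] for r in capacity)

while True:
    u = min(demands, key=dominant_share)  # max-min on dominant shares
    if any(total[r] + demands[u][r] > capacity[r] for r in capacity):
        break                             # simplification: stop at first misfit
    for r in capacity:
        used[u][r] += demands[u][r]
        total[r] += demands[u][r]

print({u: round(dominant_share(u), 2) for u in used})  # {'A': 0.67, 'B': 0.67}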




2012-2013                                                        45
Agenda


1.     Introduction
2.     Multi-Tenancy in Practice (The Problem)
3.     Architectural Models for Multi-Tenancy in Clouds
4.     Shared Nothing: Fairness
5.     Shared Hardware: Virtualization
6.     Sharing Other Operational Levels
7.     Summary



2012-2013                                                 46
Virtualization

        • Merriam-Webster




        • Popek and
          Goldberg, 1974




        2012-2013                                       47



Source: Waldspurger, Introduction to Virtual Machines
http://labs.vmware.com/download/52/
Characteristics of Virtualization
              Q: Why not do all these in the OS?

1. Fidelity* = ability to run application unmodified
2. Performance* close to that of native hardware
3. Safety* = all hardware resources managed by virtualization
   manager, never directly accessible to application
4. Isolation of performance, or failures, etc.
5. Portability = ability to run VM on any hardware
    (support for value-adding ops, e.g., migration)
6. Encapsulation = ability to capture VM state
   (support for value-adding ops, e.g., backup, clone)
7. Transparency in operation
  2012-2013                                             48

* Classic virtualization
  (Popek and Goldberg 1974)
Benefits of Virtualization (the Promise)


•    Simplified management of physical resources
•    Increased utilization of physical resources (consolidation)
•    Better isolation of (catastrophic) failures
•    Better isolation of security leaks (?)
•    Support for multi-tenancy
       • Derived benefit: reduced cost of IT deployment and operation




2012-2013                                                          49
A List of Concerns

• Users
       • Performance isolation
• Owners
       •    Performance loss vs native hardware
       •    Support for exotic devices, especially on the versatile x86
       •    Porting OS and applications, for some virtualization flavors
       •    Implement VMM—application integration?
            (Loss of portability vs increased performance.)
       • Install hardware with support for virtualization?
            (Certification of new hardware vs increased performance.)
• The Law: security, reliability, …

2012-2013                                                                  50
Depth of Virtualization

• NO virtualization (actually, virtual memory)
       • Most grids, enterprise data centers until 2000
       • Facebook now
       Q: Are all our machines virtualized anyway, by the modern OS?

• Single-level virtualization
  (we zoom into this next)

• Nested virtualization
       • VM embedded in a VM embedded in a VM emb …
       • Q: Why is this virtualization model useful?
         It’s turtles all the way down…
    Ben-Yehuda et al.: The Turtles Project: Design and Implementation
       of Nested Virtualization. OSDI 2010: 423-436
2012-2013                                                     51
Single-Level Virtualization and The Full IaaS Stack

[Figure: two VM instances side by side, each stacking Applications on a
 Guest OS on Virtual Resources inside a Virtual Machine. The Virtual
 Machine Monitor runs beneath the VMs; a Virtual Infrastructure Manager
 manages the VMMs on top of the Physical Infrastructure.]

     February 20, 2013                                              52
Single-Level Virtualization

[Figure: two VMs, each stacking Applications (MusicWave, OtherApp) on a
 Guest OS on Virtual Resources. Q: What to do now? Below the VMs sits
 the Virtual Machine Monitor (the hypervisor), running on a Host OS
 that may not exist.]

February 20, 2013                                                    53
Three VMM Models

[Figure: Classic VMM* runs applications (MWave, App2) on a VMM directly
 on the hardware. Hosted VMM runs the VMM on a Host OS. Hybrid VMM runs
 the VMM alongside a Host OS, which services I/O for the VM (MWave on a
 Guest OS).]

  2012-2013                                           54

* Classic virtualization
  (Popek and Goldberg 1974)
Single-Level Virtualization
   Implementing the Classic Virtualization Model
   • General technique*, similar to simulation/emulation
         • Code for computer X runs on general-purpose machine G.
         • If X=G (virtualization), slowdown in software simulation may be
           20:1. If X≠G (emulation), slowdown may be 1000:1.
         • If X=G (virtualization), code may execute directly on hardware


   • Privileged vs user code*
         • Trap-and-emulate as the main (but not strictly necessary) approach
           (sketch after this list)
         • Ring deprivileging, ring aliasing, address-space compression, other niceties**



   • Specific approaches for each virtualized resource***
         • Virtualized CPU, memory, I/O (disk, network, graphics, …)
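
   A minimal sketch of the trap-and-emulate control loop, over a toy
   instruction stream; a real VMM dispatches hardware traps, and all
   names here are illustrative:

   PRIVILEGED = {"set_cr3", "out", "hlt"}   # toy privileged instructions

   def vmm_run(guest):
       """Let guest code run directly until a privileged instruction
       traps, then emulate it against the guest's *virtual* state."""
       for instr, arg in guest["code"]:
           if instr in PRIVILEGED:
               emulate(guest, instr, arg)   # trap into the VMM
           else:
               guest["regs"][instr] = arg   # stand-in for direct execution

   def emulate(guest, instr, arg):
       if instr == "set_cr3":
           guest["virt_cr3"] = arg  # update the virtual page-table base,
                                    # never the real hardware register

   guest = {"code": [("r1", 7), ("set_cr3", 0x1000)], "regs": {}, "virt_cr3": 0}
   vmm_run(guest)
   print(hex(guest["virt_cr3"]))  # 0x1000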
  2012-2013                                                                                    55

* (Goldberg 1974) ** (Intel 2006)
*** (Rosenblum and Garfinkel 2005)
Single-Level Virtualization
Refinements to the Classic Virtualization Model*
   • Enhancing VMM—guest OS interface (paravirtualization)
          • Guest OS is re-coded (ported) to the VMM, for performance gains
            (e.g., by avoiding some privileged operations)
          • Guest OS can provide information to VMM, for performance gains
           • Loses or loosens the "Fidelity" characteristic**
          • 2010 onwards: paravirtualization other than I/O seems to wane

   • Enhancing hardware—VMM interface (HW support)
          • New hardware execution modes for Guest OSs, so no need for
            VMM to trap all privileged operations, so performance gains
                • IBM’s System 370 introduced interpretive execution (1972), Intel VT-x and VT-i (2006)
          • Passthrough I/O virtualization with low CPU overhead
               • Isolated DMA: Intel VT-d and AMD IOMMU; I/O device partitions: PCI-SIG IOV spec

   2012-2013                                                                                    56

* (Adams and Agesen 2006)
** (Popek and Goldberg 1974)
Single-Level Virtualization
        Trap-and-Emulate

        Q: What are the challenges?
        Q: What are the challenges for x86 architectures?

        [Figure: the Guest OS + Application runs unprivileged; traps
         (Page Fault, Undef Instr, vIRQ) drop into the privileged
         Virtual Machine Monitor, which handles them through MMU, CPU,
         and I/O emulation.]
        2012-2013                                                                          57



Source: Waldspurger, Introduction to Virtual Machines
http://labs.vmware.com/download/52/
Single-Level Virtualization
  Processor Virtualization Techniques*
• Binary Translation (sketch after the next list)
    • Static BT: execute guest instructions in an interpreter, to prevent
      unlawful access to privileged state
    • Dynamic/Adaptive BT: detect instructions that trap frequently and
      adapt their translation, to eliminate traps from non-privileged
      instructions accessing sensitive data (e.g., load/store in page tables)

• Hardware virtualization
    • Co-design VM and Hardware: HW with non-standard ISA, shadow
      memory, optimization of instructions for selected applications
    • Intel VT-*, AMD SVM: in-memory data structure for state, guest
      mode, a less privileged execution mode + vmrun, etc.
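
  A minimal sketch of a dynamic-binary-translation cache over toy
  instruction tuples; the names and the rewriting rule are illustrative:

  translation_cache = {}   # guest PC -> translated block

  def translate(block):
      """Rewrite sensitive instructions to call into the VMM instead
      of trapping; copy everything else unchanged."""
      return [("call_vmm", op, arg) if op in {"mov_cr3", "sti"} else (op, arg)
              for op, arg in block]

  def run_block(pc, fetch_block):
      if pc not in translation_cache:       # translate on first use...
          translation_cache[pc] = translate(fetch_block(pc))
      return translation_cache[pc]          # ...then reuse the cached code

  print(run_block(0x40, lambda pc: [("add", 1), ("mov_cr3", 0x1000)]))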

  2012-2013                                                           58

* (Adams and Agesen 2006)
Agenda


1.     Introduction
2.     Multi-Tenancy in Practice (The Problem)
3.     Architectural Models for Multi-Tenancy in Clouds
4.     Shared Nothing: Fairness
5.     Shared Hardware: Virtualization
6.     Sharing Other Operational Levels
7.     Summary



2012-2013                                                 59
Support for Specific Services and/or Platforms
      Database Multi-Tenancy [1/3]


      1.     Isolation = separation of services provided to each tenant
      2.     Scaling conveniently with the number and size of tenants
      3.     Meeting SLAs for each tenant
      4.     Support for per-tenant service customization
      5.     Support for value-adding ops, e.g., backup, upgrade
      6.     Secure data processing and storage
      7.     Compliance with regulatory law (per legislator, per tenant)


      2012-2013                                                  60




* Platform-specific (database-specific) issues
Support for Specific Services and/or Platforms
        Database Multi-Tenancy [2/3]




       2012-2013                                         61



Source:
http://msdn.microsoft.com/en-us/library/aa479086.aspx
Support for Specific Services and/or Platforms
       Database Multi-Tenancy [3/3]




       [Figure: schema-mapping techniques for multi-tenant databases:
        private tables per tenant; extension tables; a rigid, shared
        table; a universal table; datatype-specific pivot tables; and
        a universal table with XML documents]
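
       A minimal sketch of two of these mappings (the rigid shared
       table and an extension table) in SQLite; the table and column
       names are illustrative:

       import sqlite3

       db = sqlite3.connect(":memory:")
       db.executescript("""
       CREATE TABLE accounts (     -- rigid, shared table: one schema for all tenants
           tenant_id  INTEGER NOT NULL,
           account_id INTEGER NOT NULL,
           name       TEXT,
           PRIMARY KEY (tenant_id, account_id)
       );
       CREATE TABLE accounts_ext ( -- extension table: per-tenant custom fields
           tenant_id  INTEGER NOT NULL,
           account_id INTEGER NOT NULL,
           col_name   TEXT NOT NULL,
           col_value  TEXT
       );
       """)
       db.execute("INSERT INTO accounts VALUES (42, 1, 'Acme')")
       db.execute("INSERT INTO accounts_ext VALUES (42, 1, 'region', 'EU')")
       # Isolation is by convention: every query must filter on tenant_id.
       print(db.execute("SELECT name FROM accounts WHERE tenant_id = 42").fetchall())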

       2012-2013                                                                       62



Source: Bobrowksi
www.capcloud.org/TechGate/Multitenancy_Magic.pptx
Agenda


1.     Introduction
2.     Multi-Tenancy in Practice (The Problem)
3.     Architectural Models for Multi-Tenancy in Clouds
4.     Shared Nothing: Fairness
5.     Shared Hardware: Virtualization
6.     Sharing Other Operational Levels
7.     Summary



2012-2013                                                 63
Conclusion Take-Home Message
• Multi-Tenancy = reduced cost of IT
• 7 architectural models for multi-tenancy
   •     Shared Nothing—fairness is a key challenge
   •     Shared Hardware—virtualization is a key challenge
   •     Other levels—optimizing for specific applications is a key challenge
   •     Many trade-offs
• Virtualization
   •     Enables multi-tenancy + many other benefits
   •     3 depth models, 3 VMM models
   •     A whole new dictionary: hypervisor, paravirtualization, ring deprivileging
   •     Main trade-off: performance cost vs benefits
• Reality check: virtualization is now (2012) very popular
       February 20, 2013                                                64

Source: http://www.flickr.com/photos/dimitrisotiropoulos/4204766418/
Reading Material
•   Workloads
     •      James Patton Jones, Bill Nitzberg: Scheduling for Parallel Supercomputing: A Historical Perspective of Achievable Utilization. JSSPP 1999:
            1-16
     •      Alexandru Iosup, Dick H. J. Epema: Grid Computing Workloads. IEEE Internet Computing 15(2): 19-26 (2011)
     •      Alexandru Iosup, Mathieu Jan, Omer Ozan Sonmez, Dick H. J. Epema: On the dynamic resource availability in grids. GRID 2007: 26-33
     •      D. Feitelson, L. Rudolph, U. Schwiegelshohn, K. Sevcik, and P. Wong. Theory and practice in parallel job scheduling. In JSSPP, pages 1-34, 1997
•   Fairness
     •      Alexandru Iosup, Omer Ozan Sonmez, Shanny Anoep, Dick H. J. Epema: The performance of bags-of-tasks in large-scale distributed
            systems. HPDC 2008: 97-108
     •      Michael Isard, Vijayan Prabhakaran, Jon Currey, Udi Wieder, Kunal Talwar, Andrew Goldberg: Quincy: fair scheduling for distributed
            computing clusters. SOSP 2009: 261-276
     •      A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, and I. Stoica, Dominant Resource Fairness: Fair Allocation of Multiple
            Resource Types, Usenix NSDI 2011.
•   Virtualization
     •      Gerald J. Popek, Robert P. Goldberg, Formal Requirements for Virtualizable Third Generation Architectures, Communications of the ACM,
            July 1974.
     •      Robert P. Goldberg, Survey of Virtual Machine Research, IEEE Computer Magazine, June 1974
     •      Mendel Rosenblum and Tal Garfinkel, Virtual Machine Monitors: Current Technology and Future Trends, IEEE Computer Magazine, May
            2005
     •      Keith Adams, Ole Agesen: A comparison of software and hardware techniques for x86 virtualization. ASPLOS 2006: 2-13
     •      Gill Neiger, Amy Santoni, Felix Leung, Dion Rodgers, Rich Uhlig, Intel Virtualization Technology: Hardware Support for Efficient Processor
            Virtualization. Intel Technology Journal, Vol.10(3), Aug 2006.
     •      Muli Ben-Yehuda, Michael D. Day, Zvi Dubitzky, Michael Factor, Nadav Har'El, Abel Gordon, Anthony Liguori, Orit Wasserman, Ben-Ami
            Yassour: The Turtles Project: Design and Implementation of Nested Virtualization. OSDI 2010: 423-436


         2012-2013                                                                                                                       65

Mais conteúdo relacionado

Mais procurados

Data mining & data warehousing (ppt)
Data mining & data warehousing (ppt)Data mining & data warehousing (ppt)
Data mining & data warehousing (ppt)Harish Chand
 
Architecture Challenges In Cloud Computing
Architecture Challenges In Cloud ComputingArchitecture Challenges In Cloud Computing
Architecture Challenges In Cloud ComputingIndicThreads
 
Introduction to Cloud Computing and Cloud Infrastructure
Introduction to Cloud Computing and Cloud InfrastructureIntroduction to Cloud Computing and Cloud Infrastructure
Introduction to Cloud Computing and Cloud InfrastructureSANTHOSHKUMARKL1
 
Big data
Big dataBig data
Big datahsn99
 
Online analytical processing (olap) tools
Online analytical processing (olap) toolsOnline analytical processing (olap) tools
Online analytical processing (olap) toolskulkarnivaibhav
 
System models for distributed and cloud computing
System models for distributed and cloud computingSystem models for distributed and cloud computing
System models for distributed and cloud computingpurplesea
 
multi dimensional data model
multi dimensional data modelmulti dimensional data model
multi dimensional data modelmoni sindhu
 
Map reduce in BIG DATA
Map reduce in BIG DATAMap reduce in BIG DATA
Map reduce in BIG DATAGauravBiswas9
 
System Development Life Cycle & Implementation of MIS
System Development Life Cycle & Implementation of MISSystem Development Life Cycle & Implementation of MIS
System Development Life Cycle & Implementation of MISGeorge V James
 
Cloud computing and service models
Cloud computing and service modelsCloud computing and service models
Cloud computing and service modelsPrateek Soni
 
Cloud Computing Principles and Paradigms: 5 virtual machines provisioning and...
Cloud Computing Principles and Paradigms: 5 virtual machines provisioning and...Cloud Computing Principles and Paradigms: 5 virtual machines provisioning and...
Cloud Computing Principles and Paradigms: 5 virtual machines provisioning and...Majid Hajibaba
 
Virtualization in cloud
Virtualization in cloudVirtualization in cloud
Virtualization in cloudAshok Kumar
 

Mais procurados (20)

ERP software architecture
ERP software architectureERP software architecture
ERP software architecture
 
Aspects of data mart
Aspects of data martAspects of data mart
Aspects of data mart
 
It infrastructure
It infrastructureIt infrastructure
It infrastructure
 
Data mining & data warehousing (ppt)
Data mining & data warehousing (ppt)Data mining & data warehousing (ppt)
Data mining & data warehousing (ppt)
 
Architecture Challenges In Cloud Computing
Architecture Challenges In Cloud ComputingArchitecture Challenges In Cloud Computing
Architecture Challenges In Cloud Computing
 
Introduction to Cloud Computing and Cloud Infrastructure
Introduction to Cloud Computing and Cloud InfrastructureIntroduction to Cloud Computing and Cloud Infrastructure
Introduction to Cloud Computing and Cloud Infrastructure
 
Big data
Big dataBig data
Big data
 
Online analytical processing (olap) tools
Online analytical processing (olap) toolsOnline analytical processing (olap) tools
Online analytical processing (olap) tools
 
System models for distributed and cloud computing
System models for distributed and cloud computingSystem models for distributed and cloud computing
System models for distributed and cloud computing
 
multi dimensional data model
multi dimensional data modelmulti dimensional data model
multi dimensional data model
 
Map reduce in BIG DATA
Map reduce in BIG DATAMap reduce in BIG DATA
Map reduce in BIG DATA
 
Cloud computing
Cloud computingCloud computing
Cloud computing
 
Big data Analytics
Big data AnalyticsBig data Analytics
Big data Analytics
 
System Development Life Cycle & Implementation of MIS
System Development Life Cycle & Implementation of MISSystem Development Life Cycle & Implementation of MIS
System Development Life Cycle & Implementation of MIS
 
Unit 4
Unit 4Unit 4
Unit 4
 
Cloud computing and service models
Cloud computing and service modelsCloud computing and service models
Cloud computing and service models
 
Cloud Computing Principles and Paradigms: 5 virtual machines provisioning and...
Cloud Computing Principles and Paradigms: 5 virtual machines provisioning and...Cloud Computing Principles and Paradigms: 5 virtual machines provisioning and...
Cloud Computing Principles and Paradigms: 5 virtual machines provisioning and...
 
Data warehousing
Data warehousingData warehousing
Data warehousing
 
Virtualization in cloud
Virtualization in cloudVirtualization in cloud
Virtualization in cloud
 
Business Analytics
 Business Analytics  Business Analytics
Business Analytics
 

Destaque

Cloud Workload Suitability
Cloud Workload SuitabilityCloud Workload Suitability
Cloud Workload SuitabilityVedanta Barooah
 
Production cloudscape
Production cloudscapeProduction cloudscape
Production cloudscapejjmarino
 
Corporate-Overview-Slides
Corporate-Overview-SlidesCorporate-Overview-Slides
Corporate-Overview-SlidesRISC Networks
 
RISC Networks CloudScape Product Overview
RISC Networks CloudScape Product OverviewRISC Networks CloudScape Product Overview
RISC Networks CloudScape Product OverviewRISC Networks
 
Multi-Tenant SOA Middleware for Cloud Computing
Multi-Tenant SOA Middleware for Cloud ComputingMulti-Tenant SOA Middleware for Cloud Computing
Multi-Tenant SOA Middleware for Cloud ComputingSrinath Perera
 
The 2014 AWS Enterprise Summit - Where to Begin
The 2014 AWS Enterprise Summit - Where to BeginThe 2014 AWS Enterprise Summit - Where to Begin
The 2014 AWS Enterprise Summit - Where to BeginAmazon Web Services
 
Social Media, Cloud Computing and architecture
Social Media, Cloud Computing and architectureSocial Media, Cloud Computing and architecture
Social Media, Cloud Computing and architectureRick Mans
 
Cloud Application Rationalization- The Cloud, the Enterprise, and Making the ...
Cloud Application Rationalization- The Cloud, the Enterprise, and Making the ...Cloud Application Rationalization- The Cloud, the Enterprise, and Making the ...
Cloud Application Rationalization- The Cloud, the Enterprise, and Making the ...Chad Lawler
 
Cloud Assessment and Readiness Tool (CART)
Cloud Assessment and Readiness Tool (CART)Cloud Assessment and Readiness Tool (CART)
Cloud Assessment and Readiness Tool (CART)HCL Technologies
 
RightScale Webinar: Key Considerations For Cloud Migration and Portability
RightScale Webinar:  Key Considerations For Cloud Migration and PortabilityRightScale Webinar:  Key Considerations For Cloud Migration and Portability
RightScale Webinar: Key Considerations For Cloud Migration and PortabilityRightScale
 
A cloud readiness assessment framework
A cloud readiness assessment frameworkA cloud readiness assessment framework
A cloud readiness assessment frameworkCarlo Colicchio
 
Where to Begin? Application Portfolio Migration
Where to Begin? Application Portfolio MigrationWhere to Begin? Application Portfolio Migration
Where to Begin? Application Portfolio MigrationAmazon Web Services
 
The Cloud Adoption Program for Financial Services
The Cloud Adoption Program for Financial ServicesThe Cloud Adoption Program for Financial Services
The Cloud Adoption Program for Financial ServicesAmazon Web Services
 
Multi Tenancy In The Cloud
Multi Tenancy In The CloudMulti Tenancy In The Cloud
Multi Tenancy In The Cloudrohit_ainapure
 
Data Center Migration to the AWS Cloud
Data Center Migration to the AWS CloudData Center Migration to the AWS Cloud
Data Center Migration to the AWS CloudTom Laszewski
 
Capgemini Cloud Assessment - A Pathway to Enterprise Cloud Migration
Capgemini Cloud Assessment - A Pathway to Enterprise Cloud MigrationCapgemini Cloud Assessment - A Pathway to Enterprise Cloud Migration
Capgemini Cloud Assessment - A Pathway to Enterprise Cloud MigrationFloyd DCosta
 
Assessing Your Company's Cloud Readiness
Assessing Your Company's Cloud ReadinessAssessing Your Company's Cloud Readiness
Assessing Your Company's Cloud ReadinessAmazon Web Services
 

Destaque (18)

Cloud Workload Suitability
Cloud Workload SuitabilityCloud Workload Suitability
Cloud Workload Suitability
 
Production cloudscape
Production cloudscapeProduction cloudscape
Production cloudscape
 
Corporate-Overview-Slides
Corporate-Overview-SlidesCorporate-Overview-Slides
Corporate-Overview-Slides
 
RISC Networks CloudScape Product Overview
RISC Networks CloudScape Product OverviewRISC Networks CloudScape Product Overview
RISC Networks CloudScape Product Overview
 
Multi-Tenant SOA Middleware for Cloud Computing
Multi-Tenant SOA Middleware for Cloud ComputingMulti-Tenant SOA Middleware for Cloud Computing
Multi-Tenant SOA Middleware for Cloud Computing
 
The 2014 AWS Enterprise Summit - Where to Begin
The 2014 AWS Enterprise Summit - Where to BeginThe 2014 AWS Enterprise Summit - Where to Begin
The 2014 AWS Enterprise Summit - Where to Begin
 
Social Media, Cloud Computing and architecture
Social Media, Cloud Computing and architectureSocial Media, Cloud Computing and architecture
Social Media, Cloud Computing and architecture
 
Cloud Application Rationalization- The Cloud, the Enterprise, and Making the ...
Cloud Application Rationalization- The Cloud, the Enterprise, and Making the ...Cloud Application Rationalization- The Cloud, the Enterprise, and Making the ...
Cloud Application Rationalization- The Cloud, the Enterprise, and Making the ...
 
Cloud Assessment and Readiness Tool (CART)
Cloud Assessment and Readiness Tool (CART)Cloud Assessment and Readiness Tool (CART)
Cloud Assessment and Readiness Tool (CART)
 
Cloud Migration
Cloud MigrationCloud Migration
Cloud Migration
 
RightScale Webinar: Key Considerations For Cloud Migration and Portability
RightScale Webinar:  Key Considerations For Cloud Migration and PortabilityRightScale Webinar:  Key Considerations For Cloud Migration and Portability
RightScale Webinar: Key Considerations For Cloud Migration and Portability
 
A cloud readiness assessment framework
A cloud readiness assessment frameworkA cloud readiness assessment framework
A cloud readiness assessment framework
 
Where to Begin? Application Portfolio Migration
Where to Begin? Application Portfolio MigrationWhere to Begin? Application Portfolio Migration
Where to Begin? Application Portfolio Migration
 
The Cloud Adoption Program for Financial Services
The Cloud Adoption Program for Financial ServicesThe Cloud Adoption Program for Financial Services
The Cloud Adoption Program for Financial Services
 
Multi Tenancy In The Cloud
Multi Tenancy In The CloudMulti Tenancy In The Cloud
Multi Tenancy In The Cloud
 
Data Center Migration to the AWS Cloud
Data Center Migration to the AWS CloudData Center Migration to the AWS Cloud
Data Center Migration to the AWS Cloud
 
Capgemini Cloud Assessment - A Pathway to Enterprise Cloud Migration
Capgemini Cloud Assessment - A Pathway to Enterprise Cloud MigrationCapgemini Cloud Assessment - A Pathway to Enterprise Cloud Migration
Capgemini Cloud Assessment - A Pathway to Enterprise Cloud Migration
 
Assessing Your Company's Cloud Readiness
Assessing Your Company's Cloud ReadinessAssessing Your Company's Cloud Readiness
Assessing Your Company's Cloud Readiness
 

Semelhante a Multi-Tenancy and Virtualization in Cloud Computing

Data-intensive bioinformatics on HPC and Cloud
Data-intensive bioinformatics on HPC and CloudData-intensive bioinformatics on HPC and Cloud
Data-intensive bioinformatics on HPC and CloudOla Spjuth
 
RECAP at ETSI Experiential Network Intelligence (ENI) Meeting
RECAP at ETSI Experiential Network Intelligence (ENI) MeetingRECAP at ETSI Experiential Network Intelligence (ENI) Meeting
RECAP at ETSI Experiential Network Intelligence (ENI) MeetingRECAP Project
 
Cloud Programming Models: eScience, Big Data, etc.
Cloud Programming Models: eScience, Big Data, etc.Cloud Programming Models: eScience, Big Data, etc.
Cloud Programming Models: eScience, Big Data, etc.Alexandru Iosup
 
Internet of Things (IoT) is a King, Big data is a Queen and Cloud is a Palace
Internet of Things (IoT) is a King, Big data is a Queen and Cloud is a PalaceInternet of Things (IoT) is a King, Big data is a Queen and Cloud is a Palace
Internet of Things (IoT) is a King, Big data is a Queen and Cloud is a PalaceDr.-Ing Abdur Rahim Biswas
 
David Loureiro - Presentation at HP's HPC & OSL TES
David Loureiro - Presentation at HP's HPC & OSL TESDavid Loureiro - Presentation at HP's HPC & OSL TES
David Loureiro - Presentation at HP's HPC & OSL TESSysFera
 
Ukd2008 18-9-08 andrea
Ukd2008 18-9-08 andreaUkd2008 18-9-08 andrea
Ukd2008 18-9-08 andreaAndrea Zaza
 
Challenges and complexities in application of LCA approaches in the case of I...
Challenges and complexities in application of LCA approaches in the case of I...Challenges and complexities in application of LCA approaches in the case of I...
Challenges and complexities in application of LCA approaches in the case of I...Reza Farrahi Moghaddam, PhD, BEng
 
Dealing with Semantic Heterogeneity in Real-Time Information
Dealing with Semantic Heterogeneity in Real-Time InformationDealing with Semantic Heterogeneity in Real-Time Information
Dealing with Semantic Heterogeneity in Real-Time InformationEdward Curry
 
CloudLighting - A Brief Overview
CloudLighting - A Brief OverviewCloudLighting - A Brief Overview
CloudLighting - A Brief OverviewCloudLightning
 
Research portfolio
Research portfolio Research portfolio
Research portfolio Mehdi Bennis
 
Cloud Presentation and OpenStack case studies -- Harvard University
Cloud Presentation and OpenStack case studies -- Harvard UniversityCloud Presentation and OpenStack case studies -- Harvard University
Cloud Presentation and OpenStack case studies -- Harvard UniversityBarton George
 
Harvard university i tv3.2
Harvard university i tv3.2Harvard university i tv3.2
Harvard university i tv3.2kevin_donovan
 
Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...David Wallom
 
CloudLightning - Project and Architecture Overview
CloudLightning - Project and Architecture OverviewCloudLightning - Project and Architecture Overview
CloudLightning - Project and Architecture OverviewCloudLightning
 
Pistoia Alliance Sequence Services Programme Phase 2
Pistoia Alliance Sequence Services Programme Phase 2Pistoia Alliance Sequence Services Programme Phase 2
Pistoia Alliance Sequence Services Programme Phase 2Pistoia Alliance
 
20120605 icse zurich
20120605 icse zurich20120605 icse zurich
20120605 icse zurichArian Zwegers
 

Semelhante a Multi-Tenancy and Virtualization in Cloud Computing (20)

Data-intensive bioinformatics on HPC and Cloud
Data-intensive bioinformatics on HPC and CloudData-intensive bioinformatics on HPC and Cloud
Data-intensive bioinformatics on HPC and Cloud
 
Chapter 1
Chapter 1Chapter 1
Chapter 1
 
RECAP at ETSI Experiential Network Intelligence (ENI) Meeting
RECAP at ETSI Experiential Network Intelligence (ENI) MeetingRECAP at ETSI Experiential Network Intelligence (ENI) Meeting
RECAP at ETSI Experiential Network Intelligence (ENI) Meeting
 
Cloud Programming Models: eScience, Big Data, etc.
Cloud Programming Models: eScience, Big Data, etc.Cloud Programming Models: eScience, Big Data, etc.
Cloud Programming Models: eScience, Big Data, etc.
 
Internet of Things (IoT) is a King, Big data is a Queen and Cloud is a Palace
Internet of Things (IoT) is a King, Big data is a Queen and Cloud is a PalaceInternet of Things (IoT) is a King, Big data is a Queen and Cloud is a Palace
Internet of Things (IoT) is a King, Big data is a Queen and Cloud is a Palace
 
David Loureiro - Presentation at HP's HPC & OSL TES
David Loureiro - Presentation at HP's HPC & OSL TESDavid Loureiro - Presentation at HP's HPC & OSL TES
David Loureiro - Presentation at HP's HPC & OSL TES
 
Ukd2008 18-9-08 andrea
Ukd2008 18-9-08 andreaUkd2008 18-9-08 andrea
Ukd2008 18-9-08 andrea
 
Challenges and complexities in application of LCA approaches in the case of I...
Challenges and complexities in application of LCA approaches in the case of I...Challenges and complexities in application of LCA approaches in the case of I...
Challenges and complexities in application of LCA approaches in the case of I...
 
Dealing with Semantic Heterogeneity in Real-Time Information
Dealing with Semantic Heterogeneity in Real-Time InformationDealing with Semantic Heterogeneity in Real-Time Information
Dealing with Semantic Heterogeneity in Real-Time Information
 
Session19 Globus
Session19 GlobusSession19 Globus
Session19 Globus
 
Overview of CloudLightning
Overview of CloudLightningOverview of CloudLightning
Overview of CloudLightning
 
CloudLighting - A Brief Overview
CloudLighting - A Brief OverviewCloudLighting - A Brief Overview
CloudLighting - A Brief Overview
 
Research portfolio
Research portfolio Research portfolio
Research portfolio
 
Cloud Presentation and OpenStack case studies -- Harvard University
Cloud Presentation and OpenStack case studies -- Harvard UniversityCloud Presentation and OpenStack case studies -- Harvard University
Cloud Presentation and OpenStack case studies -- Harvard University
 
Paderborn
PaderbornPaderborn
Paderborn
 
Harvard university i tv3.2
Harvard university i tv3.2Harvard university i tv3.2
Harvard university i tv3.2
 
Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...Utilising Cloud Computing for Research through Infrastructure, Software and D...
Utilising Cloud Computing for Research through Infrastructure, Software and D...
 
CloudLightning - Project and Architecture Overview
CloudLightning - Project and Architecture OverviewCloudLightning - Project and Architecture Overview
CloudLightning - Project and Architecture Overview
 
Pistoia Alliance Sequence Services Programme Phase 2
Pistoia Alliance Sequence Services Programme Phase 2Pistoia Alliance Sequence Services Programme Phase 2
Pistoia Alliance Sequence Services Programme Phase 2
 
20120605 icse zurich
20120605 icse zurich20120605 icse zurich
20120605 icse zurich
 

Multi-Tenancy and Virtualization in Cloud Computing

  • 1. IN4392 Cloud Computing Multi-Tenancy, including Virtualization Cloud Computing (IN4392) D.H.J. Epema and A. Iosup 2012-2013 1 Parallel and Distributed Group/
  • 2. Terms for Today’s Discussion mul·ti-te·nan·cy noun ˌməl-ti-ˌte-nan(t)-sē an IT sharing model of how physical and virtual resources are used by possibly concurrent tenants 2012-2013 2
  • 3. Characteristics of Multi-Tenancy 1. Isolation = separation of services provided to each tenant (the noisy neighbor) 2. Scaling conveniently with the number and size of tenants (max weight in the elevator) 3. Meet SLAs for each tenant 4. Support for per-tenant service customization 5. Support for value-adding ops, e.g., backup, upgrade 6. Secure data processing and storage (the snoopy neighbor) 7. Support for regulatory law (per legislator, per tenant) 2012-2013 3
  • 4. Benefits of Multi-Tenancy (the Promise) • Cloud operator • Economy of scale • Market-share and branding (for the moment) • Users • Flexibility • Focus on core expertise • Reduced cost • Reduced time-to-market • Overall • Reduced cost of IT deployment and operation 2012-2013 4
  • 5. Agenda 1. Introduction 2. Multi-Tenancy in Practice (The Problem) 3. Architectural Models for Multi-Tenancy in Clouds 4. Shared Nothing: Fairness 5. Shared Hardware: Virtualization 6. Sharing Other Operational Levels 7. Summary 2012-2013 5
  • 6. Problems with Multi-Tenancy [1/5] A List of Concerns • Users • Performance isolation (and variability) for all resources • Scalability with the number of tenants (per resource) • Support for value-added ops for each application type • Security concerns (too many to list) • Owners • Up-front and operational costs • Human management of multi-tenancy • Development effort and required skills • Time-to-market • The law: think health management applications 2012-2013 6
  • 7. Problems with Multi-Tenancy [2/5] Load Imbalance • Overall workload imbalance: normalized daily load (5:1) • Temporary workload imbalance: hourly load (1000:1) Overall imbalance Temporary imbalance 2012-2013 7
  • 8. Problems with Multi-Tenancy [3/5] Practical Achievable Utilization • Enterprise: <15% [McKinsey’12] • Parallel production environments: 60-70% [Nitzberg’99] • Grids: 15-30% average cluster, >90% busy clusters • Today’s clouds: ??? 2012-2013 8 Iosup and Epema: Grid Computing Workloads. IEEE Internet Computing 15(2): 19-26 (2011)
  • 9. Problems with Multi-Tenancy [4/5] (Catastrophic) Cascading Failures • Parallel production environments: one failure kills one or more parallel jobs • Grids: correlated failures • Today’s clouds: Amazon, Facebook, etc. had catastrophic failures in the past 2-3 years Average = 11 nodes Range = 1—339 nodes CDF 2012-2013 Size of correlated failures 9 Iosup et al. : On the dynamic resource availability in grids. GRID 2007: 26-33
  • 10. Problems with Multi-Tenancy [5/5] Economics • Up-front: a shared approach is more difficult to develop than an isolated approach; may also require expensive skills 2012-2013 10 Source: www.capcloud.org/TechGate/Multitenancy_Magic.pptx
  • 11. Agenda 1. Introduction 2. Multi-Tenancy in Practice (The Problem) 3. Architectural Models for Multi-Tenancy in Clouds 4. Shared Nothing: Fairness 5. Shared Hardware: Virtualization 6. Sharing Other Operational Levels 7. Summary 2012-2013 11
  • 12. 2012-2013 12
  • 13. Agenda 1. Introduction 2. Multi-Tenancy in Practice (The Problem) 3. Architectural Models for Multi-Tenancy in Clouds 4. Shared Nothing: Fairness 5. Shared Hardware: Virtualization 6. Sharing Other Operational Levels 7. Summary 2012-2013 13
  • 14. Fairness • Intuitively, distribution of goods (distributive justice) • Different people, different perception of justice • Each one will pay the same vs The rich should pay proportionally higher taxes • I only need to pay a few years later than everyone else 2012-2013 14
  • 15. The VL-e project: application areas Philips IBM Unilever Medical Diagnosis & Bags-of-Tasks Bio- Bio- Data Intensive Food Dutch Diversity Informatics Informatics Telescience Imaging Science Virtual Laboratory (VL) Management of comm. & Application Oriented Services computing Grid Services Harness multi-domain distributed resources 15
  • 16. The VL-e project: application areas Philips IBM Unilever Medical Bio- Bio- Data Food Dutch Diagnosis & Diversity Informatics Intensive Informatics Telescience Imaging Science Bags-of-Tasks Fairness for all! Virtual Laboratory (VL) Management of comm. & Application Oriented Services Task (groups of 5, 5 minutes): computing discuss fairness for this scenario. Grid Services Harness multi-domain distributed resources Task (inter-group discussion): discuss fairness for this scenario. 16
  • 17. Research Questions Q1 What is the design space for BoT scheduling in large-scale, distributed, fine-grained computing? Q2 What is the performance of BoT schedulers in this setting? 2012-2013 17
  • 18. Scheduling Model [1/4] Overview • System Model 1. Clusters execute jobs 2. Resource managers coordinate job execution 3. Resource management architectures route jobs among resource managers 4. Task selection policies create the eligible set Fairness for all! 5. Task scheduling policies schedule the eligible set 18 Iosup et al.: The performance of bags-of-tasks in large- scale distributed systems. HPDC 2008: 97-108 Q1
  • 19. Scheduling Model [2/4] Resource Management Architectures route jobs among resource managers Centralized Separated Clusters Decentralized (csp) (sep-c) (fcondor) 19 Iosup et al.: The performance of bags-of-tasks in large- scale distributed systems. HPDC 2008: 97-108 Q1
• 20. Scheduling Model [3/4] Task Selection Policies create the eligible set • Age-based: 1. S-T: Select Tasks in the order of their arrival. 2. S-BoT: Select BoTs in the order of their arrival. • User-priority-based: 3. S-U-Prio: Select the tasks of the User with the highest Priority. • Based on fairness in resource consumption: 4. S-U-T: Select the Tasks of the User with the lowest resource consumption. 5. S-U-BoT: Select the BoTs of the User with the lowest resource consumption. 6. S-U-GRR: Select Users Round-Robin; take all tasks of the selected user. 7. S-U-RR: Select Users Round-Robin; take one task of the selected user. 20 Iosup et al.: The performance of bags-of-tasks in large-scale distributed systems. HPDC 2008: 97-108 Q1
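To make the selection policies concrete, here is a minimal Python sketch (our illustration, not the paper's code; the queue representation is an assumption) of two of them: S-T, which picks tasks in arrival order, and S-U-RR, which visits users round-robin and takes one task per visit.

```python
# Hypothetical queue state: (arrival_order, user, task_id) triples.
from collections import deque
from itertools import cycle

tasks = [
    (0, "alice", "a1"), (1, "alice", "a2"),
    (2, "bob", "b1"), (3, "carol", "c1"),
]

def select_s_t(tasks, k):
    """S-T: select the k tasks that arrived earliest."""
    return sorted(tasks)[:k]

def select_s_u_rr(tasks, k):
    """S-U-RR: visit users round-robin, taking one task per visit."""
    per_user = {}
    for t in sorted(tasks):
        per_user.setdefault(t[1], deque()).append(t)
    eligible, users = [], cycle(sorted(per_user))
    while len(eligible) < k and any(per_user.values()):
        q = per_user[next(users)]
        if q:
            eligible.append(q.popleft())
    return eligible

print(select_s_t(tasks, 2))     # the two earliest arrivals (both alice's)
print(select_s_u_rr(tasks, 3))  # one task each for alice, bob, carol
```

The contrast is the point of the slide: S-T lets a heavy user fill the eligible set, while S-U-RR spreads slots across users at some cost to the earliest-arrived work.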
• 21. Context: System Model [4/4] Task Scheduling Policies schedule the eligible set • Information availability (per task and per resource): Known (K), Unknown (U), Historical records (H) • [Table: policies arranged by task-information vs. resource-information availability] • Sample policies: • Earliest Completion Time (with Prediction of Runtimes) (ECT(-P)) • Fastest Processor First (FPF) • (Dynamic) Fastest Processor Largest Task ((D)FPLT) • MQD, RR • Shortest Task First w/ Replication (STFR) • Work Queue w/ Replication (WQR) 21 Iosup et al.: The performance of bags-of-tasks in large-scale distributed systems. HPDC 2008: 97-108 Q1
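The scheduling policies then map the eligible set onto processors. Below is a hedged sketch of greedy ECT (Earliest Completion Time); the two-processor setup and the runtime estimates are invented for illustration, and ECT-P would plug in runtimes predicted from historical records instead of known ones.

```python
# Assumed system state: when each processor frees up, and relative speeds
# (cf. the DAS + Grid'5000 setup with relative performance 1-1.75).
proc_free_at = {"fast": 0.0, "slow": 0.0}
proc_speed   = {"fast": 1.75, "slow": 1.0}

def ect_schedule(eligible, runtime):
    """Greedy ECT: place tasks one by one on the processor that would
    complete them earliest, given the current queue state."""
    plan = []
    for task in eligible:
        best = min(proc_free_at,
                   key=lambda p: proc_free_at[p] + runtime[task] / proc_speed[p])
        finish = proc_free_at[best] + runtime[task] / proc_speed[best]
        proc_free_at[best] = finish
        plan.append((task, best, finish))
    return plan

# t1 goes to the fast node, t2 to the now-cheaper slow node, t3 back to fast.
print(ect_schedule(["t1", "t2", "t3"], {"t1": 7.0, "t2": 7.0, "t3": 2.0}))
```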
• 22. Design Space Exploration [1/3] Overview • Design space exploration: time to understand how our solutions fit into the complete system. s x 7P x I x S x A x (environment) → >2M design points • Study the impact of: • The Task Scheduling Policy (s policies) • The Workload Characteristics (P characteristics) • The Dynamic System Information (I levels) • The Task Selection Policy (S policies) • The Resource Management Architecture (A policies) 22 Iosup et al.: The performance of bags-of-tasks in large-scale distributed systems. HPDC 2008: 97-108 Q2
• 23. Design Space Exploration [2/3] Experimental Setup • Simulator: • DGSim [IosupETFL SC’07, IosupSE EuroPar’08] • System: • DAS + Grid’5000 [Cappello & Bal CCGrid’07] • >3,000 CPUs: relative perf. 1-1.75 • Metrics: • Makespan • Normalized Schedule Length ~ speed-up • Workloads: • Real: DAS + Grid’5000 • Realistic: system load 20-95% (from workload model) 23 Iosup et al.: The performance of bags-of-tasks in large-scale distributed systems. HPDC 2008: 97-108 Q2
• 24. Design Space Exploration [3/3] Task Selection, including Fair Policies • The task selection policy matters only for busy systems • Naïve user priority can lead to poor performance • Fairness, in general, reduces performance • [Chart annotations: S-U-Prio; S-U-*: S-U-T, S-U-BoT, …] 24 Iosup et al.: The performance of bags-of-tasks in large-scale distributed systems. HPDC 2008: 97-108 Q2
• 25. Quincy: Microsoft’s Fair Scheduler • Fairness in Microsoft’s Dryad data centers • Large jobs (30 minutes or longer) should not monopolize the whole cluster (similar: Bounded Slowdown [Feitelson et al.’97]) • A job that takes t seconds in an exclusive-access run should require at most J x t seconds when J jobs run concurrently in the cluster • Challenges 1. Support fairness 2. Improve data locality: use the data center’s network and storage architecture to reduce job response time 2012-2013 25 Isard et al.: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276
  • 26. Dryad Workloads 2012-2013 26 Isard et al.: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276
  • 27. Dryad Workloads Q: Worst-case scenario? 2012-2013 27 Source: http://sigops.org/sosp/sosp09/slides/quincy/QuincyTestPage.html
  • 28. Dryad Workloads Q: Is this fair? 2012-2013 28 Source: http://sigops.org/sosp/sosp09/slides/quincy/QuincyTestPage.html
  • 29. Dryad Workloads Q: Is this fair? 2012-2013 29 Source: http://sigops.org/sosp/sosp09/slides/quincy/QuincyTestPage.html
  • 30. Dryad Workloads Q: Is this fair? 2012-2013 30 Source: http://sigops.org/sosp/sosp09/slides/quincy/QuincyTestPage.html
  • 31. Quincy Cluster Architecture: Racks and Computers 2012-2013 31 Isard et al.: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276
  • 32. Quincy Main Idea: Graph Min-Cost Flow • From scheduling to Graph Min-Cost Flow • Feasible schedule = min-cost flow • Graph construction • Graph from job tasks to computers, passing through cluster headnodes and racks • Edges weighted by cost function (scheduling constraints, e.g., fairness) • Pros and Cons • From per-job (local) decisions to workload (global) decisions • Complex graph construction • Edge weight assumes all constraints can be normalized 2012-2013 32 Isard et al.: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276
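A toy rendering of the min-cost-flow idea, using networkx's max_flow_min_cost as a stand-in for Quincy's own solver (the topology and costs below are invented; real Quincy also models unscheduled-task nodes, fairness terms, and preemption costs in the edge weights):

```python
import networkx as nx

G = nx.DiGraph()
tasks = ["t1", "t2"]
racks = {"rack1": ["c1", "c2"], "rack2": ["c3"]}

for t in tasks:                                     # each task needs one computer
    G.add_edge("source", t, capacity=1, weight=0)
    G.add_edge(t, "cluster", capacity=1, weight=5)  # fallback: run anywhere, costly
for rack, comps in racks.items():                   # cluster -> rack -> computer
    G.add_edge("cluster", rack, capacity=len(comps), weight=0)
    for c in comps:
        G.add_edge(rack, c, capacity=1, weight=0)
        G.add_edge(c, "sink", capacity=1, weight=0)

# Cheap edges where each task's input data lives (data locality).
G.add_edge("t1", "c1", capacity=1, weight=1)
G.add_edge("t2", "c3", capacity=1, weight=1)

flow = nx.max_flow_min_cost(G, "source", "sink")
placement = {t: c for t in tasks for c, f in flow[t].items() if f}
print(placement)   # {'t1': 'c1', 't2': 'c3'}: the data-local computers win
```

This shows the slide's pro and con in miniature: the solver makes a globally optimal assignment across all tasks at once, but every constraint had to be squeezed into a single scalar edge weight.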
  • 33. Quincy Operation [1/6] Workflow tasks Cluster, Racks, Computers 2012-2013 33 Isard et al.: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276
  • 34. Quincy Operation [2/6] unscheduled 2012-2013 34 Isard et al.: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276
  • 35. Quincy Operation [3/6] Q: How easy to encode heterogeneous resources? Weighted edges 2012-2013 35 Isard et al.: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276
  • 36. Quincy Operation [4/6] Root task gets one computer 2012-2013 36 Isard et al.: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276
  • 37. Quincy Operation [5/6] Dynamic Schedule for One Job 2012-2013 37 Isard et al.: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276
  • 38. Quincy Operation [6/6] Dynamic Schedule for Two Jobs Q: How compute-intensive is the Quincy scheduler, for many jobs and/or computers? 2012-2013 38 Isard et al.: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276
  • 39. Quincy Experimental Setup • Schedulers • Encoded two fair variants (w/ and w/o pre-emption) • Encoded two unfair variants (w/ and w/o pre-emption) • Comparison with Greedy Algorithm (Queue-Based) • Typical Dryad jobs • Workload includes worst-case scenario • Environment • 1 cluster • 8 racks • 240 nodes 2012-2013 39 Isard et al.: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276
  • 40. Quincy Experimental Results [1/5] 2012-2013 40 Isard et al.: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276
  • 41. Quincy Experimental Results [2/5] 2012-2013 41 Isard et al.: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276
  • 42. Quincy, Experimental Results [3/5] No Fairness 2012-2013 42 Isard et al.: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276
  • 43. Quincy, Experimental Results [4/5] Queue-Based Fairness 2012-2013 43 Isard et al.: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276
  • 44. Quincy, Experimental Results [5/5] Quincy Fairness 2012-2013 44 Isard et al.: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276
• 45. Mesos Dominant Resource Fairness • Multiple resource types • Max-Min fairness = maximize the minimum per-user allocation • Paper 1 in Topic 4 • Dominant Resource Fairness better explained in: Ghodsi et al., Dominant Resource Fairness: Fair Allocation of Multiple Resource Types, Usenix NSDI 2011. 2012-2013 45
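The DRF allocation loop is simple enough to sketch. The demand vectors below follow the running example in Ghodsi et al. (9 CPUs, 18 GB; user A asks <1 CPU, 4 GB> per task, user B <3 CPU, 1 GB>); the single fixed demand per user and the stop condition are simplifications of the paper's Algorithm 1.

```python
# DRF progressive filling: repeatedly give a task to the user with the
# lowest dominant share (their max share across resource types).
capacity = {"cpu": 9.0, "mem": 18.0}
demands = {"A": {"cpu": 1.0, "mem": 4.0},   # user A: memory-heavy tasks
           "B": {"cpu": 3.0, "mem": 1.0}}   # user B: CPU-heavy tasks

used = {r: 0.0 for r in capacity}
alloc = {u: {r: 0.0 for r in capacity} for u in demands}
dominant_share = {u: 0.0 for u in demands}

while True:
    u = min(demands, key=dominant_share.get)         # lowest dominant share first
    d = demands[u]
    if any(used[r] + d[r] > capacity[r] for r in capacity):
        break                                        # chosen user no longer fits
    for r in capacity:
        used[r] += d[r]
        alloc[u][r] += d[r]
    dominant_share[u] = max(alloc[u][r] / capacity[r] for r in capacity)

# As in the paper: A ends with 3 tasks (3 CPU, 12 GB), B with 2 (6 CPU, 2 GB),
# and both dominant shares equalize at 2/3.
print(alloc)
```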
  • 46. Agenda 1. Introduction 2. Multi-Tenancy in Practice (The Problem) 3. Architectural Models for Multi-Tenancy in Clouds 4. Shared Nothing: Fairness 5. Shared Hardware: Virtualization 6. Sharing Other Operational Levels 7. Summary 2012-2013 46
  • 47. Virtualization • Merriam-Webster • Popek and Goldberg, 1974 2012-2013 47 Source: Waldspurger, Introduction to Virtual Machines http://labs.vmware.com/download/52/
  • 48. Characteristics of Virtualization Q: Why not do all these in the OS? 1. Fidelity* = ability to run application unmodified 2. Performance* close to hardware ability 3. Safety* = all hardware resources managed by virtualization manager, never directly accessible to application 4. Isolation of performance, or failures, etc. 5. Portability = ability to run VM on any hardware (support for value-adding ops, e.g., migration) 6. Encapsulation = ability to capture VM state (support for value-adding ops, e.g., backup, clone) 7. Transparency in operation 2012-2013 48 * Classic virtualization (Popek and Goldberg 1974)
  • 49. Benefits of Virtualization (the Promise) • Simplified management of physical resources • Increased utilization of physical resources (consolidation) • Better isolation of (catastrophic) failures • Better isolation of security leaks (?) • Support for multi-tenancy • Derived benefit: reduced cost of IT deployment and operation 2012-2013 49
  • 50. A List of Concerns • Users • Performance isolation • Owners • Performance loss vs native hardware • Support for exotic devices, especially on the versatile x86 • Porting OS and applications, for some virtualization flavors • Implement VMM—application integration? (Loss of portability vs increased performance.) • Install hardware with support for virtualization? (Certification of new hardware vs increased performance.) • The Law: security, reliability, … 2012-2013 50
• 51. Depth of Virtualization • NO virtualization (actually, virtual memory) • Most grids, enterprise data centers until 2000 • Facebook now • Q: Are all our machines virtualized anyway, by the modern OS? • Single-level virtualization (we zoom into this next) • Nested virtualization • VM embedded in a VM embedded in a VM emb… • Q: Why is this virtualization model useful? It’s all turtles all the way down… Ben-Yehuda et al.: The Turtles Project: Design and Implementation of Nested Virtualization. OSDI 2010: 423-436 2012-2013 51
• 52. Single-Level Virtualization and the Full IaaS Stack • [Stack diagram, top to bottom: Applications → Guest OS → Virtual Resources, each set forming a VM Instance (Virtual Machine) → Virtual Machine Monitor → Virtual Infrastructure Manager → Physical Infrastructure] February 20, 2013 52
• 53. Single-Level Virtualization • [Diagram: two VMs, one running MusicWave and one running OtherApps, each with Applications, Guest OS, and Virtual Resources, on top of the Virtual Machine Monitor (Hypervisor); the Host OS below may not exist] • Q: What to do now? February 20, 2013 53
• 54. Three VMM Models • Classic VMM*: the VMM runs directly on the hardware, with the VMs (e.g., MWave, App2) on top • Hosted VMM: the VMM runs as an application on a Host OS • Hybrid VMM: the VMM runs alongside the Host OS; an I/O VM with its own Guest OS services I/O requests 2012-2013 54 * Classic virtualization (Popek and Goldberg 1974)
  • 55. Single-Level Virtualization Implementing the Classic Virtualization Model • General technique*, similar to simulation/emulation • Code for computer X runs on general-purpose machine G. • If X=G (virtualization), slowdown in software simulation may be 20:1. If X≠G (emulation), slowdown may be 1000:1. • If X=G (virtualization), code may execute directly on hardware • Privileged vs user code* • Trap-and-emulate as main (but not necessary) approach • Ring deprivileging, ring aliasing, address-space compression, other niceties** • Specific approaches for each virtualized resource*** • Virtualized CPU, memory, I/O (disk, network, graphics, …) 2012-2013 55 * (Goldberg 1974) ** (Intel 2006) *** (Rosenblum and Garfinkel 2005)
• 56. Single-Level Virtualization Refinements to the Classic Virtualization Model* • Enhancing the VMM–guest OS interface (paravirtualization) • Guest OS is re-coded (ported) to the VMM, for performance gains (e.g., by avoiding some privileged operations) • Guest OS can provide information to the VMM, for performance gains • Loses or loosens the "Fidelity" characteristic** • 2010 onwards: paravirtualization other than I/O seems to wane • Enhancing the hardware–VMM interface (HW support) • New hardware execution modes for Guest OSs, so no need for the VMM to trap all privileged operations, so performance gains • IBM’s System 370 introduced interpretive execution (1972); Intel VT-x and VT-i (2006) • Passthrough I/O virtualization with low CPU overhead • Isolated DMA: Intel VT-d and AMD IOMMU; I/O device partitions: PCI-SIG IOV spec 2012-2013 56 * (Adams and Agesen 2006) ** (Popek and Goldberg 1974)
• 57. Single-Level Virtualization Trap-and-Emulate • [Diagram: Guest OS + Application run unprivileged; traps (Page Fault, Undef Instr, vIRQ) enter the privileged Virtual Machine Monitor, which performs MMU, CPU, and I/O emulation] • Q: What are the challenges? • Q: What are the challenges for x86 architectures? 2012-2013 57 Source: Waldspurger, Introduction to Virtual Machines http://labs.vmware.com/download/52/
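A purely illustrative toy of the trap-and-emulate control flow (the instruction names and virtual state are invented; a real VMM works at the ISA level, not on Python tuples):

```python
# Assumed privileged operations: page-table load, port I/O, halt.
PRIVILEGED = {"load_cr3", "out", "hlt"}

vcpu = {"cr3": 0, "halted": False, "io_log": []}   # VMM-maintained virtual state

def vmm_emulate(op, arg):
    """The VMM's emulation handlers: MMU, I/O, and CPU emulation."""
    if op == "load_cr3":
        vcpu["cr3"] = arg           # MMU emulation: new (shadow) page-table root
    elif op == "out":
        vcpu["io_log"].append(arg)  # I/O emulation: write to a virtual device
    elif op == "hlt":
        vcpu["halted"] = True       # CPU emulation: halt the virtual CPU

def run_guest(program):
    for op, arg in program:
        if op in PRIVILEGED:
            vmm_emulate(op, arg)    # hardware trap -> VMM handler
        # else: unprivileged, executes directly on hardware at native speed.
        # x86 caveat: some sensitive x86 instructions (e.g., popf) do NOT
        # trap when unprivileged, which breaks pure trap-and-emulate.

run_guest([("add", 1), ("load_cr3", 0x1000), ("out", "hi"), ("hlt", None)])
print(vcpu)   # {'cr3': 4096, 'halted': True, 'io_log': ['hi']}
```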
• 58. Single-Level Virtualization Processor Virtualization Techniques* • Binary Translation • Static BT: execute guest instructions in an interpreter, to prevent unlawful access to privileged state • Dynamic/Adaptive BT: detect instructions that trap frequently and adapt their translation, to eliminate traps from non-privileged instructions accessing sensitive data (e.g., load/store in page tables) • Hardware virtualization • Co-design VM and hardware: HW with non-standard ISA, shadow memory, optimization of instructions for selected applications • Intel VT-*, AMD SVM: in-memory data structure for state; guest mode, a less privileged execution mode; vmrun, etc. 2012-2013 58 * (Adams and Agesen 2006)
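And a matching toy of static binary translation: rewrite sensitive instructions into explicit VMM calls before execution, so they never need to trap (adaptive BT would additionally re-translate instructions observed to trap frequently). The instruction set is invented, continuing the sketch above:

```python
# Assumed set of sensitive instructions for this toy ISA.
SENSITIVE = {"popf", "load_cr3"}

def translate(block):
    """Rewrite a basic block: sensitive ops become vmm_call pseudo-ops
    that jump straight into the VMM, with no hardware trap needed."""
    return [("vmm_call", (op, arg)) if op in SENSITIVE else (op, arg)
            for op, arg in block]

print(translate([("add", 1), ("popf", 0x2), ("load_cr3", 0x1000)]))
# [('add', 1), ('vmm_call', ('popf', 2)), ('vmm_call', ('load_cr3', 4096))]
```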
  • 59. Agenda 1. Introduction 2. Multi-Tenancy in Practice (The Problem) 3. Architectural Models for Multi-Tenancy in Clouds 4. Shared Nothing: Fairness 5. Shared Hardware: Virtualization 6. Sharing Other Operational Levels 7. Summary 2012-2013 59
  • 60. Support for Specific Services and/or Platforms Database Multi-Tenancy [1/3] 1. Isolation = separation of services provided to each tenant 2. Scaling conveniently with the number and size of tenants 3. Meet SLAs for each tenant 4. Support for per-tenant service customization 5. Support for value-adding ops, e.g., backup, upgrade 6. Secure data processing and storage 7. Support for regulatory law (per legislator, per tenant) 2012-2013 60 * Platform-specific (database-specific) issues
  • 61. Support for Specific Services and/or Platforms Database Multi-Tenancy [2/3] 2012-2013 61 Source: http://msdn.microsoft.com/en-us/library/aa479086.aspx
• 62. Support for Specific Services and/or Platforms Database Multi-Tenancy [3/3] • Schema-mapping options: private tables; extension tables; rigid, shared table; universal table; datatype-specific pivot tables; XML document 2012-2013 62 Source: Bobrowski www.capcloud.org/TechGate/Multitenancy_Magic.pptx
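A minimal sqlite3 sketch (schema and names invented) contrasting two of these schema-mapping options: private tables per tenant versus one rigid, shared table with a tenant_id discriminator column.

```python
import sqlite3

db = sqlite3.connect(":memory:")

# Option 1: private tables -- strong isolation and easy per-tenant
# customization, but one table set per tenant limits scaling in #tenants.
db.execute("CREATE TABLE orders_tenant_a (id INTEGER, item TEXT)")
db.execute("CREATE TABLE orders_tenant_b (id INTEGER, item TEXT)")

# Option 2: rigid, shared table -- scales well, but isolation exists only
# by predicate, and all tenants share one fixed schema.
db.execute("CREATE TABLE orders (tenant_id TEXT, id INTEGER, item TEXT)")
db.execute("INSERT INTO orders VALUES ('a', 1, 'disk'), ('b', 1, 'cpu')")

# Every tenant-facing query MUST filter on tenant_id (the snoopy neighbor).
rows = db.execute(
    "SELECT id, item FROM orders WHERE tenant_id = ?", ("a",)).fetchall()
print(rows)   # [(1, 'disk')]
```

The other options on the slide (extension tables, universal table, pivot tables, XML) sit between these two extremes, trading schema flexibility against query complexity.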
  • 63. Agenda 1. Introduction 2. Multi-Tenancy in Practice (The Problem) 3. Architectural Models for Multi-Tenancy in Clouds 4. Shared Nothing: Fairness 5. Shared Hardware: Virtualization 6. Sharing Other Operational Levels 7. Summary 2012-2013 63
  • 64. Conclusion Take-Home Message • Multi-Tenancy = reduced cost of IT • 7 architectural models for multi-tenancy • Shared Nothing—fairness is a key challenge • Shared Hardware—virtualization is a key challenge • Other levels—optimizing for specific application is a key challenge • Many trade-offs • Virtualization • Enables multi-tenancy + many other benefits • 3 depth models, 3 VMM models • A whole new dictionary: hypervisor, paravirtualization, ring deprivileging • Main trade-off: performance cost vs benefits • Reality check: virtualization is now (2012) very popular February 20, 2013 http://www.flickr.com/photos/dimitrisotiropoulos/4204766418/ 64
• 65. Reading Material • Workloads • James Patton Jones, Bill Nitzberg: Scheduling for Parallel Supercomputing: A Historical Perspective of Achievable Utilization. JSSPP 1999: 1-16 • Alexandru Iosup, Dick H. J. Epema: Grid Computing Workloads. IEEE Internet Computing 15(2): 19-26 (2011) • Alexandru Iosup, Mathieu Jan, Omer Ozan Sonmez, Dick H. J. Epema: On the dynamic resource availability in grids. GRID 2007: 26-33 • D. Feitelson, L. Rudolph, U. Schwiegelshohn, K. Sevcik, and P. Wong: Theory and practice in parallel job scheduling. JSSPP 1997: 1-34 • Fairness • Alexandru Iosup, Omer Ozan Sonmez, Shanny Anoep, Dick H. J. Epema: The performance of bags-of-tasks in large-scale distributed systems. HPDC 2008: 97-108 • Michael Isard, Vijayan Prabhakaran, Jon Currey, Udi Wieder, Kunal Talwar, Andrew Goldberg: Quincy: fair scheduling for distributed computing clusters. SOSP 2009: 261-276 • A. Ghodsi, M. Zaharia, B. Hindman, A. Konwinski, S. Shenker, and I. Stoica: Dominant Resource Fairness: Fair Allocation of Multiple Resource Types. Usenix NSDI 2011 • Virtualization • Gerald J. Popek, Robert P. Goldberg: Formal Requirements for Virtualizable Third Generation Architectures. Communications of the ACM, July 1974 • Robert P. Goldberg: Survey of Virtual Machine Research. IEEE Computer Magazine, June 1974 • Mendel Rosenblum, Tal Garfinkel: Virtual Machine Monitors: Current Technology and Future Trends. IEEE Computer Magazine, May 2005 • Keith Adams, Ole Agesen: A comparison of software and hardware techniques for x86 virtualization. ASPLOS 2006: 2-13 • Gil Neiger, Amy Santoni, Felix Leung, Dion Rodgers, Rich Uhlig: Intel Virtualization Technology: Hardware Support for Efficient Processor Virtualization. Intel Technology Journal 10(3), Aug 2006 • Muli Ben-Yehuda, Michael D. Day, Zvi Dubitzky, Michael Factor, Nadav Har'El, Abel Gordon, Anthony Liguori, Orit Wasserman, Ben-Ami Yassour: The Turtles Project: Design and Implementation of Nested Virtualization. OSDI 2010: 423-436 2012-2013 65

Editor's Notes

1. Comparison:
- Classic allows code execution to "run through" to the raw hardware; very efficient for I/O
- Classic requires a virtualizable CPU, in the sense described by (Popek and Goldberg 1974) – see earlier slide "Characteristics of Virtualization"
- Classic is not possible on x86 architectures without hardware virtualization support
- Hosted offers better I/O support through use of the Host OS drivers, but much worse I/O performance due to I/O ops going through Guest OS, VMM, and Host OS before reaching the raw hardware
- Hosted has slow I/O, so it cannot be used for most servers (Web, etc.)
- Hosted is easy to install and maintain (it's a regular app for the Host OS), so it is good for desktops
- Hosted has problems with maintaining complete isolation
- (VMware's) Hybrid runs the VMM at the same level as the Host OS; the I/O VMM can perform graphics and other I/O ops for generic I/O devices, which are then translated to real hardware by the Host OS → works, but can be very slow