SlideShare uma empresa Scribd logo
1 de 40
Baixar para ler offline
Jiannan Ouyang, John Lange
Department of Computer Science
University of Pittsburgh
VEE ’13
03/17/2013
Preemptable Ticket Spinlocks
Improving Consolidated Performance in the Cloud
Motivation
2
—  VM interference in overcommitted environments
—  OS synchronization overhead
—  Lock holder preemption (LHP)
—  Lock Waiter Preemption
—  significance analysis of lock waiter preemption
—  PreemptableTicket Spinlock
—  implementation inside Linux
—  Evaluation
—  significant speedup over Linux
Contributions
3
Spinlocks
4
—  Basics
—  lock() & unlock()
—  Busy waiting lock
—  generic spinlock: random order, unfair (starvation)
—  ticket spinlock: FIFO order, fair
—  Designed for fast mutual exclusion
—  busy waiting vs. sleep/wakeup
—  spinlocks for short & fast critical sections (~1us)
—  OS assumptions
—  use spinlocks for short critical section only
—  never preempt a thread holding or waiting a kernel spinlock
Preemption in VMs
5
—  Lock Holder Preemption (LHP)
—  virtualization breaks the OS assumption
—  vCPU holding a lock is unscheduled byVMM
—  preemption prolongs critical section (~1m v.s. ~1us)
—  Proposed Solutions
—  Co-scheduling and variants
—  Hardware-assisted scheme (Pause Loop Exiting)
—  Paravirtual spinlocks
Preemption in Ticket Lock
6
0 1 2
head = 0 tail = 2
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
Preemption in Ticket Lock
7
0 1 2
head = 0 tail = 2
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
Preemption in Ticket Lock
8
0 1 2 3
head = 0 tail = 3
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
Preemption in Ticket Lock
9
0 1 2 3 4
head = 0 tail = 4
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
Preemption in Ticket Lock
10
1 2 3 4
tail = 4head = 1
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
Preemption in Ticket Lock
11
1 2 3 4
tail = 4head = 1
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
Lock Holder Preemption!
Preemption in Ticket Lock
12
1 2 3 4
tail = 4head = 1
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
Preemption in Ticket Lock
13
1 2 3 4
tail = 4head = 1
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
Preemption in Ticket Lock
14
2 3 4
tail = 4head = 2
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
Preemption in Ticket Lock
15
3 4
head = 3
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
tail = 4
Preemption in Ticket Lock
16
3 4 5
head = 3 tail = 5
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
Preemption in Ticket Lock
17
3 4 5 6
head = 3 tail = 6
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
LockWaiter Preemption
wait on available resource
Lock Waiter Preemption
18
—  Lock waiter is preempted
—  Later waiters wait on an available lock
—  Possible to adapt to it, if we
—  detect preempted waiter
—  acquire lock out of order
How significant is it??
Waiter Preemption Dominates
19
LHP + LWP LWP ​ 𝐋 𝐖𝐏/ 𝐋𝐇𝐏
+𝐋𝐖𝐏 
hackbench x1 1089 452 41.5%
hackbench x2 44342 39221 88.5%
ebizzy x1 294 166 56.5%
ebizzy x2 1017 980 96.4%
Table 2: LockWaiter Preemption Problem in the Linux Kernel
Lock waiter preemption dominates in
overcommitted environments
Challenges & Approach
20
—  How to identify a preempted waiter?
—  timeout threshold
—  How to violate order constraints?
—  allow timed out waiters get the lock randomly
—  ensure mutual exclusion between them
—  How NOT to break the whole ordering mechanism?
—  timeout threshold proportional to queue position
Queue Position Index
21
N = ticket – queue_head
—  ticket: copy of queue tail value upon enqueue
—  N: number of earlier waiters
n n+1 n+2
head = n tail = n+2
ticket = n+2
N = 2
Proportional Timeout Threshold
22
T = N x t
—  t is a constant parameter
—  large enough to avoid false detection
—  small enough to save waiting time
—  Performance is NOT t value sensitive
—  most locks take ~1us & most spinning time wasted on locks
that wait ~1ms
—  larger t does not harm & smaller t does not gain much
n n+1 n+2
head = n tail = n+2
0 t 2tTimeout
Threshold
Preemptable Ticket Spinlock
23
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
0 1 2 3 4 5
head = 0 tail = 5
0
Timeout
Threshold
t 2t 3t 4t 5t
Preemptable Ticket Spinlock
24
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
1 2 3 4 5
head = 1 tail = 5
0
Timeout
Threshold
t 2t 3t 4t
Preemptable Ticket Spinlock
25
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
1 2 3 4 5
head = 1 tail = 5
0
Timeout
Threshold
t 2t 3t 4t
Preemptable Ticket Spinlock
26
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
1 3 4 5
head = 2 tail = 5
Timeout
Threshold
t 2t 3t
N = ticket – head
Preemptable Ticket Spinlock
27
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
1 3 4 5
head = 2 tail = 5
Timeout
Threshold
t 2t 3t
Preemptable Ticket Spinlock
28
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
1 3 5
head = 3 tail = 5
Timeout
Threshold
0 2t
Preemptable Ticket Spinlock
29
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
1 3 5
head = 3 tail = 5
Timeout
Threshold
0 2t
Preemptable Ticket Spinlock
30
0
1
a scheduled waiter with ticket 0
a preempted waiter with ticket 1
1 3 5
head = 3 tail = 5
Timeout
Threshold
0 2t
Summary
31
—  PreemptableTicket Lock adapts to preemption
—  preserve order in absence of preemption
—  violate order upon preemption
—  PreemptableTicket Lock preserves fairness
—  order violations are restricted
—  priority is always given to timed out waiters
—  timed out waiters bounded by vCPU numbers of aVM
Implementation
32
—  Drop-in replacement
—  lock(), unlock(), is_locked(), trylock(), etc.
—  Correct
—  race condition free: atomic updates
—  Fast
—  performance is sensitive to lock efficiency
—  ~60 lines of C/inline-assembly in Linux 3.5.0
Paravirtual Spinlocks
33
—  Lock holder preemption is unaddressed
—  semantic gap between guest and host
—  paravirtualization: guest/host cooperation
—  signal long waiting lock / put a vCPU to sleep
—  notify to wake up a vCPU / wake up a vCPU
—  paravirtual preemptable ticket spinlock
—  sleep when waiting too long after timed out
—  wake up all sleeping waiters upon lock releasing
Evaluation
34
—  Host
—  8 core 2.6GHz Intel Core i7 CPU, 8 GB RAM, 1Gbit NIC,
Fedora 17 (Linux 3.5.0)
—  Guest
—  8 core, 1G RAM, Fedora 17 (Linux 3.5.0)
—  Benchmarks
—  hackbench, ebizzy, dell dvd store
—  Lock implementations
—  baseline: ticket lock, paravirtual ticket lock (pv-lock)
—  preemptable ticket lock
—  paravirtual (pv) preemptable ticket lock
Hackbench
35
—  Average Speedup
—  preemptable-lock vs. ticket lock: 4.82X
—  pv-preemptable-lock v.s. ticket lock: 7.08X
—  pv-preemptable-lock v.s. pv-lock: 1.03X
Ebizzy
36
Less variance over ticket lock and pv-lock
—  in-VM preemption adaptivity
—  lessVM interference
variance
80.36 vs. 10.94
variance
131.62 vs. 16.09
Dell DVD Store (apache/mysql)
37
—  Average Speedup
—  preemptable-lock vs. ticket lock: 11.68X
—  pv-preemptable-lock v.s. ticket lock: 19.52X
—  pv-preemptable-lock v.s. pv-lock: 1.11X
Evaluation Summary
38
—  PreemptableTicket Spinlocks speedup
—  5.32X over ticket lock
—  Paravirtual PreemptableTicket Spinlocks speedup
—  7.91X over ticket lock
—  1.08X over paravirtual ticket lock
Average speedup across cases for all benchmarks
—  LockWaiter Preemption
—  most significant preemption problem in queue based lock under
overcommitted environment
—  PreemptableTicket Spinlock
—  Implementation with ~60 lines of code in Linux
—  Better performance in overcommitted environment
—  5.32X average speedup up over ticket lock w/oVMM support
—  1.08X average speedup over pv-lock with less variance
Conclusion
39
Thank You
40
— Jiannan Ouyang
—  ouyang@cs.pitt.edu
—  http://www.cs.pitt.edu/~ouyang/
1 2 3 4 5
0 t 2t 3t 4t
PreemptableTicket Spinlock

Mais conteúdo relacionado

Mais procurados

主機自保指南
主機自保指南主機自保指南
主機自保指南維泰 蔡
 
CONFidence 2017: Escaping the (sand)box: The promises and pitfalls of modern ...
CONFidence 2017: Escaping the (sand)box: The promises and pitfalls of modern ...CONFidence 2017: Escaping the (sand)box: The promises and pitfalls of modern ...
CONFidence 2017: Escaping the (sand)box: The promises and pitfalls of modern ...PROIDEA
 
DTMF Decoder Shield for Arduino
DTMF Decoder Shield for ArduinoDTMF Decoder Shield for Arduino
DTMF Decoder Shield for ArduinoRaghav Shetty
 
Austin c-c++-meetup-feb2018-spectre
Austin c-c++-meetup-feb2018-spectreAustin c-c++-meetup-feb2018-spectre
Austin c-c++-meetup-feb2018-spectreKim Phillips
 
Roll your own toy unix clone os
Roll your own toy unix clone osRoll your own toy unix clone os
Roll your own toy unix clone oseramax
 
Playing CTFs for Fun & Profit
Playing CTFs for Fun & ProfitPlaying CTFs for Fun & Profit
Playing CTFs for Fun & Profitimpdefined
 
Creating "Secure" PHP applications, Part 2, Server Hardening
Creating "Secure" PHP applications, Part 2, Server HardeningCreating "Secure" PHP applications, Part 2, Server Hardening
Creating "Secure" PHP applications, Part 2, Server Hardeningarchwisp
 

Mais procurados (13)

Exploiting buffer overflows
Exploiting buffer overflowsExploiting buffer overflows
Exploiting buffer overflows
 
主機自保指南
主機自保指南主機自保指南
主機自保指南
 
Pres
PresPres
Pres
 
CONFidence 2017: Escaping the (sand)box: The promises and pitfalls of modern ...
CONFidence 2017: Escaping the (sand)box: The promises and pitfalls of modern ...CONFidence 2017: Escaping the (sand)box: The promises and pitfalls of modern ...
CONFidence 2017: Escaping the (sand)box: The promises and pitfalls of modern ...
 
Proxy arp
Proxy arpProxy arp
Proxy arp
 
Prosess accouting
Prosess accoutingProsess accouting
Prosess accouting
 
DTMF Decoder Shield for Arduino
DTMF Decoder Shield for ArduinoDTMF Decoder Shield for Arduino
DTMF Decoder Shield for Arduino
 
The propeller
The propellerThe propeller
The propeller
 
Austin c-c++-meetup-feb2018-spectre
Austin c-c++-meetup-feb2018-spectreAustin c-c++-meetup-feb2018-spectre
Austin c-c++-meetup-feb2018-spectre
 
Roll your own toy unix clone os
Roll your own toy unix clone osRoll your own toy unix clone os
Roll your own toy unix clone os
 
Ethereum A to Z
Ethereum A to ZEthereum A to Z
Ethereum A to Z
 
Playing CTFs for Fun & Profit
Playing CTFs for Fun & ProfitPlaying CTFs for Fun & Profit
Playing CTFs for Fun & Profit
 
Creating "Secure" PHP applications, Part 2, Server Hardening
Creating "Secure" PHP applications, Part 2, Server HardeningCreating "Secure" PHP applications, Part 2, Server Hardening
Creating "Secure" PHP applications, Part 2, Server Hardening
 

Destaque

Achieving Performance Isolation with Lightweight Co-Kernels
Achieving Performance Isolation with Lightweight Co-KernelsAchieving Performance Isolation with Lightweight Co-Kernels
Achieving Performance Isolation with Lightweight Co-KernelsJiannan Ouyang, PhD
 
Shoot4U: Using VMM Assists to Optimize TLB Operations on Preempted vCPUs
Shoot4U: Using VMM Assists to Optimize TLB Operations on Preempted vCPUsShoot4U: Using VMM Assists to Optimize TLB Operations on Preempted vCPUs
Shoot4U: Using VMM Assists to Optimize TLB Operations on Preempted vCPUsJiannan Ouyang, PhD
 
Dead Lock Analysis of spin_lock() in Linux Kernel (english)
Dead Lock Analysis of spin_lock() in Linux Kernel (english)Dead Lock Analysis of spin_lock() in Linux Kernel (english)
Dead Lock Analysis of spin_lock() in Linux Kernel (english)Sneeker Yeh
 
Basic Concept of Pixel and MPEG data structure (english)
Basic Concept of Pixel and MPEG data structure (english)Basic Concept of Pixel and MPEG data structure (english)
Basic Concept of Pixel and MPEG data structure (english)Sneeker Yeh
 
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONS
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONSENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONS
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONSStephan Cadene
 
Denser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microserversDenser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microserversJez Halford
 
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running LinuxLinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linuxbrouer
 
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...Eric Van Hensbergen
 
SDN - OpenFlow + OpenVSwitch + Quantum
SDN - OpenFlow + OpenVSwitch + QuantumSDN - OpenFlow + OpenVSwitch + Quantum
SDN - OpenFlow + OpenVSwitch + QuantumThe Linux Foundation
 
Q2.12: Research Update on big.LITTLE MP Scheduling
Q2.12: Research Update on big.LITTLE MP SchedulingQ2.12: Research Update on big.LITTLE MP Scheduling
Q2.12: Research Update on big.LITTLE MP SchedulingLinaro
 
Effect of Virtualization on OS Interference
Effect of Virtualization on OS InterferenceEffect of Virtualization on OS Interference
Effect of Virtualization on OS InterferenceEric Van Hensbergen
 
DOXLON November 2016: Facebook Engineering on cgroupv2
DOXLON November 2016: Facebook Engineering on cgroupv2DOXLON November 2016: Facebook Engineering on cgroupv2
DOXLON November 2016: Facebook Engineering on cgroupv2Outlyer
 
reference_guide_Kernel_Crash_Dump_Analysis
reference_guide_Kernel_Crash_Dump_Analysisreference_guide_Kernel_Crash_Dump_Analysis
reference_guide_Kernel_Crash_Dump_AnalysisBuland Singh
 
Linux Device Driver parallelism using SMP and Kernel Pre-emption
Linux Device Driver parallelism using SMP and Kernel Pre-emptionLinux Device Driver parallelism using SMP and Kernel Pre-emption
Linux Device Driver parallelism using SMP and Kernel Pre-emptionHemanth Venkatesh
 
Memory Barriers in the Linux Kernel
Memory Barriers in the Linux KernelMemory Barriers in the Linux Kernel
Memory Barriers in the Linux KernelDavidlohr Bueso
 
Linux cgroups and namespaces
Linux cgroups and namespacesLinux cgroups and namespaces
Linux cgroups and namespacesLocaweb
 
How Ceph performs on ARM Microserver Cluster
How Ceph performs on ARM Microserver ClusterHow Ceph performs on ARM Microserver Cluster
How Ceph performs on ARM Microserver ClusterAaron Joue
 
SFO15-407: Performance Overhead of ARM Virtualization
SFO15-407: Performance Overhead of ARM VirtualizationSFO15-407: Performance Overhead of ARM Virtualization
SFO15-407: Performance Overhead of ARM VirtualizationLinaro
 

Destaque (20)

Achieving Performance Isolation with Lightweight Co-Kernels
Achieving Performance Isolation with Lightweight Co-KernelsAchieving Performance Isolation with Lightweight Co-Kernels
Achieving Performance Isolation with Lightweight Co-Kernels
 
Shoot4U: Using VMM Assists to Optimize TLB Operations on Preempted vCPUs
Shoot4U: Using VMM Assists to Optimize TLB Operations on Preempted vCPUsShoot4U: Using VMM Assists to Optimize TLB Operations on Preempted vCPUs
Shoot4U: Using VMM Assists to Optimize TLB Operations on Preempted vCPUs
 
Dead Lock Analysis of spin_lock() in Linux Kernel (english)
Dead Lock Analysis of spin_lock() in Linux Kernel (english)Dead Lock Analysis of spin_lock() in Linux Kernel (english)
Dead Lock Analysis of spin_lock() in Linux Kernel (english)
 
Basic Concept of Pixel and MPEG data structure (english)
Basic Concept of Pixel and MPEG data structure (english)Basic Concept of Pixel and MPEG data structure (english)
Basic Concept of Pixel and MPEG data structure (english)
 
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONS
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONSENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONS
ENERGY EFFICIENCY OF ARM ARCHITECTURES FOR CLOUD COMPUTING APPLICATIONS
 
Cache profiling on ARM Linux
Cache profiling on ARM LinuxCache profiling on ARM Linux
Cache profiling on ARM Linux
 
Docker by demo
Docker by demoDocker by demo
Docker by demo
 
Denser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microserversDenser, cooler, faster, stronger: PHP on ARM microservers
Denser, cooler, faster, stronger: PHP on ARM microservers
 
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running LinuxLinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
LinuxCon2009: 10Gbit/s Bi-Directional Routing on standard hardware running Linux
 
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
Balance, Flexibility, and Partnership: An ARM Approach to Future HPC Node Arc...
 
SDN - OpenFlow + OpenVSwitch + Quantum
SDN - OpenFlow + OpenVSwitch + QuantumSDN - OpenFlow + OpenVSwitch + Quantum
SDN - OpenFlow + OpenVSwitch + Quantum
 
Q2.12: Research Update on big.LITTLE MP Scheduling
Q2.12: Research Update on big.LITTLE MP SchedulingQ2.12: Research Update on big.LITTLE MP Scheduling
Q2.12: Research Update on big.LITTLE MP Scheduling
 
Effect of Virtualization on OS Interference
Effect of Virtualization on OS InterferenceEffect of Virtualization on OS Interference
Effect of Virtualization on OS Interference
 
DOXLON November 2016: Facebook Engineering on cgroupv2
DOXLON November 2016: Facebook Engineering on cgroupv2DOXLON November 2016: Facebook Engineering on cgroupv2
DOXLON November 2016: Facebook Engineering on cgroupv2
 
reference_guide_Kernel_Crash_Dump_Analysis
reference_guide_Kernel_Crash_Dump_Analysisreference_guide_Kernel_Crash_Dump_Analysis
reference_guide_Kernel_Crash_Dump_Analysis
 
Linux Device Driver parallelism using SMP and Kernel Pre-emption
Linux Device Driver parallelism using SMP and Kernel Pre-emptionLinux Device Driver parallelism using SMP and Kernel Pre-emption
Linux Device Driver parallelism using SMP and Kernel Pre-emption
 
Memory Barriers in the Linux Kernel
Memory Barriers in the Linux KernelMemory Barriers in the Linux Kernel
Memory Barriers in the Linux Kernel
 
Linux cgroups and namespaces
Linux cgroups and namespacesLinux cgroups and namespaces
Linux cgroups and namespaces
 
How Ceph performs on ARM Microserver Cluster
How Ceph performs on ARM Microserver ClusterHow Ceph performs on ARM Microserver Cluster
How Ceph performs on ARM Microserver Cluster
 
SFO15-407: Performance Overhead of ARM Virtualization
SFO15-407: Performance Overhead of ARM VirtualizationSFO15-407: Performance Overhead of ARM Virtualization
SFO15-407: Performance Overhead of ARM Virtualization
 

Último

Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLionel Briand
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfStefano Stabellini
 
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...Akihiro Suda
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...confluent
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identityteam-WIBU
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZABSYZ Inc
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Mater
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Rob Geurden
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commercemanigoyal112
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalLionel Briand
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsAhmed Mohamed
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfFerryKemperman
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxAndreas Kunz
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfMarharyta Nedzelska
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsChristian Birchler
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作qr0udbr0
 

Último (20)

2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
Large Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and RepairLarge Language Models for Test Case Evolution and Repair
Large Language Models for Test Case Evolution and Repair
 
Xen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdfXen Safety Embedded OSS Summit April 2024 v4.pdf
Xen Safety Embedded OSS Summit April 2024 v4.pdf
 
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
20240415 [Container Plumbing Days] Usernetes Gen2 - Kubernetes in Rootless Do...
 
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
Catch the Wave: SAP Event-Driven and Data Streaming for the Intelligence Ente...
 
Post Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on IdentityPost Quantum Cryptography – The Impact on Identity
Post Quantum Cryptography – The Impact on Identity
 
Salesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZSalesforce Implementation Services PPT By ABSYZ
Salesforce Implementation Services PPT By ABSYZ
 
Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)Ahmed Motair CV April 2024 (Senior SW Developer)
Ahmed Motair CV April 2024 (Senior SW Developer)
 
Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...Simplifying Microservices & Apps - The art of effortless development - Meetup...
Simplifying Microservices & Apps - The art of effortless development - Meetup...
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
Advantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your BusinessAdvantages of Odoo ERP 17 for Your Business
Advantages of Odoo ERP 17 for Your Business
 
Cyber security and its impact on E commerce
Cyber security and its impact on E commerceCyber security and its impact on E commerce
Cyber security and its impact on E commerce
 
Precise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive GoalPrecise and Complete Requirements? An Elusive Goal
Precise and Complete Requirements? An Elusive Goal
 
Unveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML DiagramsUnveiling Design Patterns: A Visual Guide with UML Diagrams
Unveiling Design Patterns: A Visual Guide with UML Diagrams
 
Introduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdfIntroduction Computer Science - Software Design.pdf
Introduction Computer Science - Software Design.pdf
 
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptxUI5ers live - Custom Controls wrapping 3rd-party libs.pptx
UI5ers live - Custom Controls wrapping 3rd-party libs.pptx
 
A healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdfA healthy diet for your Java application Devoxx France.pdf
A healthy diet for your Java application Devoxx France.pdf
 
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving CarsSensoDat: Simulation-based Sensor Dataset of Self-driving Cars
SensoDat: Simulation-based Sensor Dataset of Self-driving Cars
 
英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 

Preemptable ticket spinlocks: improving consolidated performance in the cloud

  • 1. Jiannan Ouyang, John Lange Department of Computer Science University of Pittsburgh VEE ’13 03/17/2013 Preemptable Ticket Spinlocks Improving Consolidated Performance in the Cloud
  • 2. Motivation 2 —  VM interference in overcommitted environments —  OS synchronization overhead —  Lock holder preemption (LHP)
  • 3. —  Lock Waiter Preemption —  significance analysis of lock waiter preemption —  PreemptableTicket Spinlock —  implementation inside Linux —  Evaluation —  significant speedup over Linux Contributions 3
  • 4. Spinlocks 4 —  Basics —  lock() & unlock() —  Busy waiting lock —  generic spinlock: random order, unfair (starvation) —  ticket spinlock: FIFO order, fair —  Designed for fast mutual exclusion —  busy waiting vs. sleep/wakeup —  spinlocks for short & fast critical sections (~1us) —  OS assumptions —  use spinlocks for short critical section only —  never preempt a thread holding or waiting a kernel spinlock
  • 5. Preemption in VMs 5 —  Lock Holder Preemption (LHP) —  virtualization breaks the OS assumption —  vCPU holding a lock is unscheduled byVMM —  preemption prolongs critical section (~1m v.s. ~1us) —  Proposed Solutions —  Co-scheduling and variants —  Hardware-assisted scheme (Pause Loop Exiting) —  Paravirtual spinlocks
  • 6. Preemption in Ticket Lock 6 0 1 2 head = 0 tail = 2 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1
  • 7. Preemption in Ticket Lock 7 0 1 2 head = 0 tail = 2 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1
  • 8. Preemption in Ticket Lock 8 0 1 2 3 head = 0 tail = 3 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1
  • 9. Preemption in Ticket Lock 9 0 1 2 3 4 head = 0 tail = 4 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1
  • 10. Preemption in Ticket Lock 10 1 2 3 4 tail = 4head = 1 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1
  • 11. Preemption in Ticket Lock 11 1 2 3 4 tail = 4head = 1 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1 Lock Holder Preemption!
  • 12. Preemption in Ticket Lock 12 1 2 3 4 tail = 4head = 1 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1
  • 13. Preemption in Ticket Lock 13 1 2 3 4 tail = 4head = 1 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1
  • 14. Preemption in Ticket Lock 14 2 3 4 tail = 4head = 2 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1
  • 15. Preemption in Ticket Lock 15 3 4 head = 3 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1 tail = 4
  • 16. Preemption in Ticket Lock 16 3 4 5 head = 3 tail = 5 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1
  • 17. Preemption in Ticket Lock 17 3 4 5 6 head = 3 tail = 6 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1 LockWaiter Preemption wait on available resource
  • 18. Lock Waiter Preemption 18 —  Lock waiter is preempted —  Later waiters wait on an available lock —  Possible to adapt to it, if we —  detect preempted waiter —  acquire lock out of order How significant is it??
  • 19. Waiter Preemption Dominates 19 LHP + LWP LWP ​ 𝐋 𝐖𝐏/ 𝐋𝐇𝐏 +𝐋𝐖𝐏  hackbench x1 1089 452 41.5% hackbench x2 44342 39221 88.5% ebizzy x1 294 166 56.5% ebizzy x2 1017 980 96.4% Table 2: LockWaiter Preemption Problem in the Linux Kernel Lock waiter preemption dominates in overcommitted environments
  • 20. Challenges & Approach 20 —  How to identify a preempted waiter? —  timeout threshold —  How to violate order constraints? —  allow timed out waiters get the lock randomly —  ensure mutual exclusion between them —  How NOT to break the whole ordering mechanism? —  timeout threshold proportional to queue position
  • 21. Queue Position Index 21 N = ticket – queue_head —  ticket: copy of queue tail value upon enqueue —  N: number of earlier waiters n n+1 n+2 head = n tail = n+2 ticket = n+2 N = 2
  • 22. Proportional Timeout Threshold 22 T = N x t —  t is a constant parameter —  large enough to avoid false detection —  small enough to save waiting time —  Performance is NOT t value sensitive —  most locks take ~1us & most spinning time wasted on locks that wait ~1ms —  larger t does not harm & smaller t does not gain much n n+1 n+2 head = n tail = n+2 0 t 2tTimeout Threshold
  • 23. Preemptable Ticket Spinlock 23 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1 0 1 2 3 4 5 head = 0 tail = 5 0 Timeout Threshold t 2t 3t 4t 5t
  • 24. Preemptable Ticket Spinlock 24 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1 1 2 3 4 5 head = 1 tail = 5 0 Timeout Threshold t 2t 3t 4t
  • 25. Preemptable Ticket Spinlock 25 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1 1 2 3 4 5 head = 1 tail = 5 0 Timeout Threshold t 2t 3t 4t
  • 26. Preemptable Ticket Spinlock 26 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1 1 3 4 5 head = 2 tail = 5 Timeout Threshold t 2t 3t N = ticket – head
  • 27. Preemptable Ticket Spinlock 27 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1 1 3 4 5 head = 2 tail = 5 Timeout Threshold t 2t 3t
  • 28. Preemptable Ticket Spinlock 28 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1 1 3 5 head = 3 tail = 5 Timeout Threshold 0 2t
  • 29. Preemptable Ticket Spinlock 29 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1 1 3 5 head = 3 tail = 5 Timeout Threshold 0 2t
  • 30. Preemptable Ticket Spinlock 30 0 1 a scheduled waiter with ticket 0 a preempted waiter with ticket 1 1 3 5 head = 3 tail = 5 Timeout Threshold 0 2t
  • 31. Summary 31 —  PreemptableTicket Lock adapts to preemption —  preserve order in absence of preemption —  violate order upon preemption —  PreemptableTicket Lock preserves fairness —  order violations are restricted —  priority is always given to timed out waiters —  timed out waiters bounded by vCPU numbers of aVM
  • 32. Implementation 32 —  Drop-in replacement —  lock(), unlock(), is_locked(), trylock(), etc. —  Correct —  race condition free: atomic updates —  Fast —  performance is sensitive to lock efficiency —  ~60 lines of C/inline-assembly in Linux 3.5.0
  • 33. Paravirtual Spinlocks 33 —  Lock holder preemption is unaddressed —  semantic gap between guest and host —  paravirtualization: guest/host cooperation —  signal long waiting lock / put a vCPU to sleep —  notify to wake up a vCPU / wake up a vCPU —  paravirtual preemptable ticket spinlock —  sleep when waiting too long after timed out —  wake up all sleeping waiters upon lock releasing
  • 34. Evaluation 34 —  Host —  8 core 2.6GHz Intel Core i7 CPU, 8 GB RAM, 1Gbit NIC, Fedora 17 (Linux 3.5.0) —  Guest —  8 core, 1G RAM, Fedora 17 (Linux 3.5.0) —  Benchmarks —  hackbench, ebizzy, dell dvd store —  Lock implementations —  baseline: ticket lock, paravirtual ticket lock (pv-lock) —  preemptable ticket lock —  paravirtual (pv) preemptable ticket lock
  • 35. Hackbench 35 —  Average Speedup —  preemptable-lock vs. ticket lock: 4.82X —  pv-preemptable-lock v.s. ticket lock: 7.08X —  pv-preemptable-lock v.s. pv-lock: 1.03X
  • 36. Ebizzy 36 Less variance over ticket lock and pv-lock —  in-VM preemption adaptivity —  lessVM interference variance 80.36 vs. 10.94 variance 131.62 vs. 16.09
  • 37. Dell DVD Store (apache/mysql) 37 —  Average Speedup —  preemptable-lock vs. ticket lock: 11.68X —  pv-preemptable-lock v.s. ticket lock: 19.52X —  pv-preemptable-lock v.s. pv-lock: 1.11X
  • 38. Evaluation Summary 38 —  PreemptableTicket Spinlocks speedup —  5.32X over ticket lock —  Paravirtual PreemptableTicket Spinlocks speedup —  7.91X over ticket lock —  1.08X over paravirtual ticket lock Average speedup across cases for all benchmarks
  • 39. —  LockWaiter Preemption —  most significant preemption problem in queue based lock under overcommitted environment —  PreemptableTicket Spinlock —  Implementation with ~60 lines of code in Linux —  Better performance in overcommitted environment —  5.32X average speedup up over ticket lock w/oVMM support —  1.08X average speedup over pv-lock with less variance Conclusion 39
  • 40. Thank You 40 — Jiannan Ouyang —  ouyang@cs.pitt.edu —  http://www.cs.pitt.edu/~ouyang/ 1 2 3 4 5 0 t 2t 3t 4t PreemptableTicket Spinlock