Slide 1
Driver Parallelism using SMP and
Kernel Pre-emption
Hemanth V
Slide 2
Prerequisites
• Understanding of Linux device drivers
• Basic understanding of Linux synchronization mechanisms such as semaphores, mutexes and spin locks
Slide 3
Contents
What's Driver Parallelism
Kernel Pre-emption Feature
SMP Architecture
USB Usecase Analysis
Driver Scenarios
Summary
Slide 4
Driver Parallelism
• Parallelism or concurrency arises when a system tries to do more than one thing at once
– Concurrency is when two tasks can start, run, and complete in overlapping time periods. It doesn't necessarily mean they'll ever both be running at the same instant.
– Parallelism is when tasks literally run at the same time
• The goal of parallelism/concurrency is to improve system performance
• The side effect is that it can also lead to race conditions
• The following slides highlight the sources of parallelism/concurrency in Linux device drivers, how to improve performance, and how to avoid race conditions
http://www.fasterj.com/cartoon/cartoon106.shtml
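A userspace analogy may make the race-condition point concrete. The sketch below uses POSIX threads instead of kernel APIs (an assumption, so that it runs as an ordinary program); the names counter, worker and run_counter_demo are illustrative and do not come from the slides. Each thread plays the role of one parallel path into the same critical section.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4
#define INCREMENTS  100000

static long counter;   /* shared data, protected by counter_lock */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* One "parallel path": repeatedly updates the shared counter. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);
        counter++;   /* read-modify-write: not atomic on its own */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Starts the workers, waits for them, and returns the final count. */
long run_counter_demo(void)
{
    pthread_t tid[NUM_THREADS];

    counter = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);
    return counter;
}
```

With the lock in place every increment is preserved. Without it, two threads can read the same old value and one update is lost, which is the race the slide warns about.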
Slide 5
Kernel Preemption
• CONFIG_PREEMPT
– This kernel config option reduces the latency of the kernel by making all kernel code (that is not executing in a critical section) preemptible.
– This allows reaction to interactive events by permitting a low priority process to be preempted involuntarily even if it is in kernel mode executing a system call.
– After an asynchronous event such as an interrupt handler has run, the current process is replaced if a higher priority process is ready to run.
– Useful for embedded systems with latency requirements in the millisecond range.
Slide 6
SMP Architecture
• Evolution of multiprocessor architectures
– Late 60s saw need for more CPU processing power for scientific and
compute intensive applications.
– Two or more CPUs combined to form a single computer
• SMP (Symmetric Multiprocessing) is one such multiprocessor architecture.
• AMP (Asymmetric Multiprocessing) and clusters are others
• The basic idea is to run more tasks in parallel per unit time
Slide 7
SMP Architecture
Fig 1: Logical view of SMP (four CPUs, each with its own cache, sharing a bus with memory and I/O)
In an actual hardware implementation, the caches will not be directly connected to the bus.
Slide 8
SMP Architecture Contd
• The diagram shows a 4-CPU SMP system; all CPUs are symmetric, i.e. they have the same architecture, frequency etc.
• CPU, memory and I/O are tightly coupled by a high-speed interconnect bus, allowing any unit connected to the bus to communicate with any other unit
• A single globally accessible memory is used by all CPUs; there is no local RAM in the CPUs, and data changes are visible to all CPUs
• Access to the global shared memory is symmetric (equal): contents are fully shared, and all CPUs use the same address whenever referring to the same piece of data
• I/O access is also symmetric, i.e. any CPU can initiate I/O
Slide 9
SMP Architecture Cont
• Interrupts are distributed across CPUs by the programmable interrupt controller (PIC)
• Access to the bus and memory has to be arbitrated so that no two CPUs step on each other, and all have guaranteed fair access
• The maximum number of CPUs that can be used depends on bus bandwidth
• Only one instance of the operating system (OS) exists, loaded in main memory
• Kernel data structures are accessed concurrently, hence the kernel needs to be SMP aware
Slide 10
SMP Intricacies: Cache Coherency
Slide 11
SMP Intricacies: Cache Coherency
• In most implementations each CPU stores data in its cache to improve system performance.
• Consider two threads running on two different CPUs in an SMP system. Both use the global variable “Data”. If one of them sets it to 1, the change is reflected only in that CPU's own cache. The values in main memory and in the other CPU's cache are stale, and if the other CPU reads them, the results could be unpredictable. Hence the need to maintain consistency, or coherency, of the caches.
• This problem is typically solved by hardware cache consistency protocols, which include snooping and write-update/write-invalidate
Slide 12
SMP Intricacies: Atomic operations
• Consider two threads trying to obtain the same semaphore simultaneously. Both read the value 0, think it's available, and set it to 1, so both believe they hold it.
• These issues are solved by using the atomic instructions provided by each architecture
• Special instructions provide atomic test-and-set operations. Examples are the load-linked and store-conditional instructions on MIPS and load-exclusive/store-exclusive on ARM
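The test-and-set idea can be sketched in portable C11, assuming a userspace program with POSIX threads rather than kernel code. atomic_flag_test_and_set_explicit() is a standard C11 call; on ARM or MIPS a compiler typically implements it with the exclusive or LL/SC instructions named above. The names lock_word, tas_lock, tas_unlock and run_tas_demo are illustrative.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_flag lock_word = ATOMIC_FLAG_INIT;   /* clear = free, set = held */
static long shared_total;                          /* protected by lock_word */

static void tas_lock(void)
{
    /* Atomically set the flag and get its previous value.
     * If it was already set, another thread holds the lock: spin. */
    while (atomic_flag_test_and_set_explicit(&lock_word, memory_order_acquire))
        ;
}

static void tas_unlock(void)
{
    atomic_flag_clear_explicit(&lock_word, memory_order_release);
}

static void *adder(void *arg)
{
    (void)arg;
    for (int i = 0; i < 50000; i++) {
        tas_lock();
        shared_total++;
        tas_unlock();
    }
    return NULL;
}

/* Runs two competing threads and returns the final total. */
long run_tas_demo(void)
{
    pthread_t a, b;
    int rc;

    shared_total = 0;
    rc = pthread_create(&a, NULL, adder, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, adder, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_total;
}
```

Because the read of the old flag value and the write of the new one happen as one indivisible step, only one of two simultaneous callers can see "free"; the other sees "held" and waits.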
Slide 13
USB Subsystem Analysis
Fig: Simplified view of USB Subsystem
Linux Host (top to bottom): USB Print app and USB Mass Storage app; USB Print class driver and USB Mass Storage class driver; USB Core; EHCI driver; USB host controller
Linux Device (top to bottom): USB Print app; Print gadget driver and Mass Storage gadget driver; UDC driver; USB device controller
Slide 14
USB Subsystem Analysis: No preempt
• Assume the Linux host has initiated a large transfer for USB mass storage.
• The in-kernel transfer would not be pre-empted until the available data is exhausted.
• A high priority, small Print transfer would get scheduled only after the mass storage transfer is complete.
• This affects the end user experience
Slide 15
USB Subsystem Analysis: Preempt Enabled
• Assume the same scenario with kernel preemption enabled.
• The in-kernel mass storage transfer can be preempted and replaced by the Print data transfer, for example after processing a keyboard or timer interrupt
• This opens another parallel path into both the USB core and EHCI drivers, since the mass storage transfer is not complete and the Print transfer has started.
• The Print transfer could re-open the same device, access the same data structures for initiating the transfer, and could even disconnect the device.
Slide 16
USB Subsystem Analysis: Preempt Enabled (Cont.)
• Hence driver design needs to determine all parallel paths and the points at which it's safe to be pre-empted, while still enabling parallelism.
• For example, it could be safe to pre-empt once a URB request is queued, but it might not be safe to pre-empt while DMA is in progress, since the DMA configuration registers could be overwritten.
Slide 17
USB Subsystem Analysis: SMP
• Assume the previous scenario on an SMP system
• In this case the scheduler need not pre-empt the running mass storage transfer, but can schedule the Print transfer on another CPU.
• This too opens a new parallel path into the drivers, and both would be executing at the same instant of time.
• Hence if parallelism has been taken care of in the drivers, they are to a large extent SMP safe.
• In SMP systems the interrupt handler and driver code could run concurrently on different CPUs.
• Hence the need to protect data shared with interrupt handlers using spin locks
Slide 18
Driver Scenarios
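static LIST_HEAD(ts_list);

int process_ts_entries(void)
{
        /* struct ts_entry: assumed entry type with a list_head member 'node' */
        struct ts_entry *ts, *tmp;

        local_irq_disable();
        /* _safe variant, because entries are deleted while iterating */
        list_for_each_entry_safe(ts, tmp, &ts_list, node) {
                /* Process list element */
                list_del(&ts->node);
        }
        local_irq_enable();
        return 0;
}

irqreturn_t ts_isr(int irq, void *dev_id)
{
        struct ts_entry *ts = dev_id;   /* assumed: entry to queue arrives via dev_id */

        /* Process interrupt */
        list_add_tail(&ts->node, &ts_list);
        return IRQ_HANDLED;
}

• local_irq_disable() protects against both the interrupt handler and preemption, but only on the local CPU
• spin_lock_irqsave() needs to be added in both the driver code and the ISR to make this SMP safe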
Slide 19
Driver Scenarios: Cont

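• Locking using a mutex/semaphore doesn't disable pre-emption, but guarantees that the data structure is not corrupted on pre-emption
• Both SMP safe and pre-empt safe

static LIST_HEAD(ts_list);
static DEFINE_MUTEX(ts_lock);   /* protects ts_list */

int process_ts_entries(void)
{
        /* struct ts_entry: assumed entry type with a list_head member 'node' */
        struct ts_entry *ts, *tmp;

        if (mutex_lock_interruptible(&ts_lock))
                return -EINTR;
        /* _safe variant, because entries are deleted while iterating */
        list_for_each_entry_safe(ts, tmp, &ts_list, node) {
                /* Process list element */
                list_del(&ts->node);
        }
        mutex_unlock(&ts_lock);
        return 0;
}

int process_rest_entries(void)
{
        struct ts_entry *ts;

        if (mutex_lock_interruptible(&ts_lock))
                return -EINTR;
        list_for_each_entry(ts, &ts_list, node) {
                /* Process remaining elements */
        }
        mutex_unlock(&ts_lock);
        return 0;
}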
Slide 20
Driver Scenarios: Cont

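• Functions process_ts_entries() and process_rest_entries() could deadlock if one is pre-empted while holding its first lock and the other then takes the second lock
• Locks need to be obtained in the same order everywhere to avoid deadlock

static LIST_HEAD(ts_list);
static LIST_HEAD(tc_list);
static DEFINE_MUTEX(ts_lock);   /* protects ts_list */
static DEFINE_MUTEX(tc_lock);   /* protects tc_list */

/* Deadlock-prone as written: the two functions take the locks in opposite order. */
/* Return-value checks are omitted to keep the lock order visible. */
int process_ts_entries(void)
{
        mutex_lock_interruptible(&ts_lock);
        /* Some processing */
        mutex_lock_interruptible(&tc_lock);
        mutex_unlock(&tc_lock);
        mutex_unlock(&ts_lock);
        return 0;
}

int process_rest_entries(void)
{
        mutex_lock_interruptible(&tc_lock);
        /* Some processing */
        mutex_lock_interruptible(&ts_lock);
        mutex_unlock(&ts_lock);
        mutex_unlock(&tc_lock);
        return 0;
}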
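The fix for the scenario on this slide is a fixed lock order. The sketch below is a userspace stand-in using POSIX threads (an assumption; the slide's code uses kernel mutexes), with counters ts_count and tc_count standing in for the two lists. Both functions take ts_lock first and tc_lock second, so neither can end up holding one lock while waiting for a thread that holds the other. run_lock_order_demo is an illustrative name.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define ROUNDS 10000

static pthread_mutex_t ts_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t tc_lock = PTHREAD_MUTEX_INITIALIZER;
static int ts_count, tc_count;   /* stand-ins for ts_list and tc_list */

/* Lock order rule for this file: ts_lock first, then tc_lock. */
static void *process_ts_entries(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        pthread_mutex_lock(&ts_lock);
        pthread_mutex_lock(&tc_lock);
        ts_count++;
        pthread_mutex_unlock(&tc_lock);
        pthread_mutex_unlock(&ts_lock);
    }
    return NULL;
}

static void *process_rest_entries(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        pthread_mutex_lock(&ts_lock);   /* same order as above */
        pthread_mutex_lock(&tc_lock);
        tc_count++;
        pthread_mutex_unlock(&tc_lock);
        pthread_mutex_unlock(&ts_lock);
    }
    return NULL;
}

/* Runs both functions in parallel; returns the combined count. */
int run_lock_order_demo(void)
{
    pthread_t a, b;
    int rc;

    ts_count = tc_count = 0;
    rc = pthread_create(&a, NULL, process_ts_entries, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, process_rest_entries, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return ts_count + tc_count;
}
```

If process_rest_entries() took tc_lock first instead, the two threads could each hold one lock and wait forever for the other, and the joins would never return.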
Slide 21
Driver Scenarios: Cont
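In some cases it might be better to access a resource from a single function, rather than have locks spread throughout the code

static LIST_HEAD(ts_list);
static DEFINE_MUTEX(ts_lock);   /* protects ts_list */

int process_ts_entries(void)
{
        /* struct ts_entry: assumed entry type with a list_head member 'node' */
        struct ts_entry *ts, *tmp;

        if (mutex_lock_interruptible(&ts_lock))
                return -EINTR;
        list_for_each_entry_safe(ts, tmp, &ts_list, node) {
                /* Process list element */
                list_del(&ts->node);
        }
        mutex_unlock(&ts_lock);
        return 0;
}

/* Two call sites (names are illustrative); neither touches ts_lock directly */
void caller_one(void)
{
        /* Process list elements */
        process_ts_entries();
}

void caller_two(void)
{
        /* Process list elements */
        process_ts_entries();
}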
Slide 22
Driver Scenarios
• Don't use one big lock for everything; it reduces concurrency
• Locks that are too fine-grained increase overhead
• Both aspects need to be balanced
• Reader-writer locks
– Useful when data structures are read more often than they are updated
– Allow multiple read locks to be held simultaneously
– Allow a single write lock to be held, and prevent any read lock from being obtained while the write lock is held
– Available for both spin locks (rwlock_t) and semaphores (struct rw_semaphore)
• Stack variables/structures don't need locking, since each thread of execution has its own stack and therefore works on its own instance
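The reader-writer behaviour can be sketched in userspace with pthread_rwlock_t, used here as a stand-in for the kernel's reader-writer primitives (an assumption made so the example runs as a normal program). The names cfg_a, cfg_b, reader, writer and run_rwlock_demo are illustrative. The writer keeps two fields equal; readers count how often they observe them unequal.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_READERS 3
#define ROUNDS      20000

static pthread_rwlock_t cfg_lock = PTHREAD_RWLOCK_INITIALIZER;
static int cfg_a, cfg_b;   /* invariant under cfg_lock: cfg_a == cfg_b */

/* Readers may hold the lock together; each counts broken invariants it sees. */
static void *reader(void *arg)
{
    long *mismatches = arg;

    for (int i = 0; i < ROUNDS; i++) {
        pthread_rwlock_rdlock(&cfg_lock);
        if (cfg_a != cfg_b)
            (*mismatches)++;
        pthread_rwlock_unlock(&cfg_lock);
    }
    return NULL;
}

/* The writer holds the lock exclusively while the two fields differ. */
static void *writer(void *arg)
{
    (void)arg;
    for (int i = 1; i <= ROUNDS; i++) {
        pthread_rwlock_wrlock(&cfg_lock);
        cfg_a = i;
        cfg_b = i;
        pthread_rwlock_unlock(&cfg_lock);
    }
    return NULL;
}

/* Returns the total number of inconsistent reads observed. */
long run_rwlock_demo(void)
{
    pthread_t readers[NUM_READERS], w;
    long mismatches[NUM_READERS] = {0};
    long total = 0;
    int rc;

    cfg_a = cfg_b = 0;
    rc = pthread_create(&w, NULL, writer, NULL);
    assert(rc == 0);
    for (int i = 0; i < NUM_READERS; i++) {
        rc = pthread_create(&readers[i], NULL, reader, &mismatches[i]);
        assert(rc == 0);
    }
    (void)rc;
    pthread_join(w, NULL);
    for (int i = 0; i < NUM_READERS; i++) {
        pthread_join(readers[i], NULL);
        total += mismatches[i];
    }
    return total;
}
```

Each reader's mismatch counter lives on the stack of run_rwlock_demo() and is handed to exactly one thread, so it needs no lock of its own; only the shared cfg_a/cfg_b pair does.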
Slide 23
Summary
• Concurrency/parallelism needs to be one of the criteria during the driver design phase
• Analysis is required to determine the parallel paths and the protection needed for critical sections
• Drivers that handle concurrency with appropriate locking techniques not only avoid race conditions but also improve performance
• Unit testing could be used to test some of the parallel paths in the driver
– Two different applications that open parallel paths into the same driver.
– Two instances of the same application.
Slide 24
Thank You
hemanth_venkatesh@yahoo.com

Generative AI - Gitex v1Generative AI - Gitex v1.pptxGenerative AI - Gitex v1Generative AI - Gitex v1.pptx
Generative AI - Gitex v1Generative AI - Gitex v1.pptx
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
UiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to HeroUiPath Community: Communication Mining from Zero to Hero
UiPath Community: Communication Mining from Zero to Hero
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
Unleashing Real-time Insights with ClickHouse_ Navigating the Landscape in 20...
 
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...Zeshan Sattar- Assessing the skill requirements and industry expectations for...
Zeshan Sattar- Assessing the skill requirements and industry expectations for...
 
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 

Linux Device Driver parallelism using SMP and Kernel Pre-emption

  • 1. Slide 1 Driver Parallelism using SMP and Kernel Pre-emption, Hemanth V
  • 2. Slide 2 Prerequisites
      • Understanding of Linux device drivers
      • Basic understanding of Linux synchronization mechanisms such as semaphores, mutexes and spin locks
  • 3. Slide 3 Contents
      • What's Driver Parallelism
      • Kernel Pre-emption Feature
      • SMP Architecture
      • USB Usecase Analysis
      • Driver Scenarios
      • Summary
  • 4. Slide 4 Driver Parallelism
      • Parallelism or concurrency arises when the system tries to do more than one thing at once
          – Concurrency is when two tasks can start, run, and complete in overlapping time periods. It doesn't necessarily mean they'll ever both be running at the same instant.
          – Parallelism is when tasks literally run at the same time
      • The goal of parallelism/concurrency is to improve system performance
      • The side effect is that it can also lead to race conditions
      • The following slides highlight the sources of parallelism/concurrency, and how to improve performance and avoid race conditions in Linux device drivers
      • Cartoon: http://www.fasterj.com/cartoon/cartoon106.shtml
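The race condition mentioned on slide 4 can be sketched in user space. This is only an analogue under stated assumptions: POSIX threads stand in for concurrent paths into a driver, and the names `shared_counter`, `worker` and `run_counter` are hypothetical, not taken from the slides. Each thread performs a read-modify-write on shared data; the mutex makes that step a critical section so no update is lost.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_THREADS 4

/* Hypothetical shared state, standing in for a driver's shared data. */
struct shared_counter {
    pthread_mutex_t lock;
    long value;
};

struct worker_arg {
    struct shared_counter *counter;
    long iterations;
};

static void *worker(void *p)
{
    struct worker_arg *arg = p;

    for (long i = 0; i < arg->iterations; i++) {
        pthread_mutex_lock(&arg->counter->lock);
        arg->counter->value++;      /* read-modify-write: the critical section */
        pthread_mutex_unlock(&arg->counter->lock);
    }
    return NULL;
}

/* Runs NUM_THREADS workers concurrently and returns the final count. */
long run_counter(long iterations)
{
    struct shared_counter counter = { .value = 0 };
    struct worker_arg arg = { &counter, iterations };
    pthread_t threads[NUM_THREADS];

    pthread_mutex_init(&counter.lock, NULL);
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, &arg);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    pthread_mutex_destroy(&counter.lock);
    return counter.value;
}
```

Built with `-pthread`, `run_counter(10000)` returns 40000. If the lock and unlock calls are removed, the increments from different threads can interleave and the result may come out lower, which is the race the slide warns about.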
  • 5. Slide 5 Kernel Preemption
      • CONFIG_PREEMPT
          – This kernel config option reduces kernel latency by making all kernel code that is not executing in a critical section preemptible.
          – This improves reaction to interactive events by permitting a low-priority process to be preempted involuntarily, even while it is executing in kernel mode.
          – After an asynchronous event such as an interrupt handler has run, the current process is replaced if a higher-priority process is ready to run.
          – Useful for embedded systems with latency requirements in the milliseconds range.
  • 6. Slide 6 SMP Architecture
      • Evolution of multiprocessor architectures
          – The late 1960s saw a need for more CPU processing power for scientific and compute-intensive applications.
          – Two or more CPUs were combined to form a single computer
      • SMP (Symmetric Multiprocessing) is one of the multiprocessor architectures
      • AMP and clusters are others
      • Basic idea: more tasks in parallel per unit time
  • 7. Slide 7 SMP Architecture
      • [Figure] Fig 1: Logical view of SMP. Four CPUs, each with its own cache, share a bus with I/O and memory.
      • In an actual hardware implementation, the cache will not be directly connected to the bus.
  • 8. Slide 8 SMP Architecture Contd
      • The diagram shows a 4-CPU SMP system; all CPUs are symmetric, i.e. of the same architecture, frequency, etc.
      • CPU, memory and I/O are tightly coupled by a high-speed interconnect bus, allowing any unit connected to the bus to communicate with any other unit
      • A single globally accessible memory is used by all CPUs; there is no local RAM in the CPUs, and data changes are visible to all CPUs
      • Symmetric (equal) access to the global shared memory: contents are fully shared, and all CPUs use the same address whenever referring to the same piece of data
      • I/O access is also symmetric, i.e. any CPU can initiate I/O
  • 9. Slide 9 SMP Architecture Contd
      • Interrupts are distributed across CPUs by the PIC
      • Access to the bus and memory has to be arbitrated so that no two CPUs step on each other and all have guaranteed fair access
      • The maximum number of CPUs that can be used depends on the bus bandwidth
      • Only one instance of the operating system (OS) runs, loaded in main memory
      • Kernel data structures are accessed concurrently, hence the kernel needs to be SMP aware
  • 10. Slide 10 SMP Intricacies: Cache Coherency
  • 11. Slide 11 SMP Intricacies: Cache Coherency
      • In most implementations each CPU stores data in its cache to improve system performance.
      • Consider two threads running on two different CPUs in an SMP system, both using a global variable “Data”. If one of them sets it to 1, the change is initially reflected only in that CPU's cache. The values in main memory and in the other CPU's cache are stale, and if the other CPU reads them the results could be unpredictable. Hence the need to maintain consistency (coherency) of the caches.
      • This problem is typically solved by hardware cache-consistency protocols, which include snooping and write-update/write-invalidate
  • 12. Slide 12 SMP Intricacies: Atomic operations
      • Consider two threads trying to obtain the same semaphore simultaneously. Both read a value of 0, conclude that it is available, and set it to 1.
      • These issues are solved by using the atomic instructions provided by each architecture
      • Special instructions provide atomic test-and-set operations, for example load-linked and store-conditional in MIPS, and load-exclusive and store-exclusive in ARM
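The semaphore scenario on slide 12 can be sketched with C11 atomics, which compilers typically lower to the architecture's atomic instructions (for example an exclusive load/store loop on ARM). This is a user-space sketch under assumptions: `sem_flag`, `try_acquire` and `race_for_flag` are hypothetical names, and a plain flag stands in for the semaphore. The compare-and-exchange performs the "read 0, write 1" step as one indivisible operation, so only one contender can succeed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CONTENDERS 16

/* 0 = free, 1 = taken; hypothetical flag standing in for the semaphore. */
static atomic_int sem_flag;
static atomic_int winners;

/* Atomically: if *flag is 0, set it to 1 and return true; otherwise false. */
bool try_acquire(atomic_int *flag)
{
    int expected = 0;

    return atomic_compare_exchange_strong(flag, &expected, 1);
}

static void *contender(void *unused)
{
    (void)unused;
    if (try_acquire(&sem_flag))
        atomic_fetch_add(&winners, 1);
    return NULL;
}

/* Starts n threads racing for the flag; returns how many acquired it. */
int race_for_flag(int n)
{
    pthread_t threads[MAX_CONTENDERS];

    assert(n > 0 && n <= MAX_CONTENDERS);
    atomic_store(&sem_flag, 0);
    atomic_store(&winners, 0);
    for (int i = 0; i < n; i++) {
        int rc = pthread_create(&threads[i], NULL, contender, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < n; i++)
        pthread_join(threads[i], NULL);
    return atomic_load(&winners);
}
```

However many threads race, `race_for_flag()` reports exactly one winner. With a separate plain read followed by a plain write, two threads could both observe 0 and both believe they own the semaphore.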
  • 13. Slide 13 USB Subsystem Analysis
      • [Figure] Simplified view of USB Subsystem
          – Linux Host: USB Host Controller, EHCI Driver, USB Core, USB Print Class Driver, USB Mass Storage Class Driver, USB Print App, USB Mass Storage App
          – Linux Device: USB Device Controller, UDC Driver, Mass Storage Gadget Driver, Print Gadget Driver, USB Print App
  • 14. Slide 14 USB Subsystem Analysis: No preempt
      • Assume the Linux host has initiated a large transfer for USB mass storage.
      • The in-kernel transfer would not be pre-empted until the available data is exhausted.
      • A high-priority Print transfer with a small amount of data would get scheduled only after the mass storage transfer is complete.
      • This affects the end-user experience
  • 15. Slide 15 USB Subsystem Analysis: Preempt Enabled
      • Assume the same scenario with kernel preemption enabled.
      • The in-kernel mass storage transfer can be preempted and replaced by the Print data transfer, for example after processing a keyboard or timer interrupt
      • This opens another parallel path into both the USB core and EHCI drivers, since the mass storage transfer is not complete and the Print transfer has started.
      • The Print transfer could re-open the same device, access the same data structures to initiate its transfer, and could even disconnect the device.
  • 16. Slide 16 USB Subsystem Analysis: Preempt Enabled
      • Hence the driver design needs to determine all parallel paths and the points at which it is safe to be pre-empted, while still enabling parallelism.
      • For example, it could be safe to pre-empt once a URB request is queued, but it might not be safe to pre-empt while DMA is in progress, since the DMA configuration registers could be overwritten.
  • 17. Slide 17 USB Subsystem Analysis: SMP
      • Assume the previous scenario on an SMP system
      • In this case the scheduler need not pre-empt the running mass storage transfer, but can schedule the Print transfer on another CPU.
      • This too opens a new parallel path into the drivers, and both paths would be executing at the same instant of time.
      • Hence a driver that already handles parallelism correctly is, to a large extent, SMP safe.
      • On SMP systems the interrupt handler and driver code could run concurrently on different CPUs.
      • Hence the need to protect data shared with interrupt handlers using spin locks
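  • 18. Slide 18 Driver Scenarios
        static LIST_HEAD(ts_list);

        int process_ts_entries(void)
        {
                struct ts_entry *ts, *tmp;  /* ts_entry: hypothetical element type */

                local_irq_disable();
                /* _safe variant, since entries are deleted while iterating */
                list_for_each_entry_safe(ts, tmp, &ts_list, node) {
                        /* Process list elements */
                        list_del(&ts->node);
                }
                local_irq_enable();
                return 0;
        }

        irqreturn_t ts_isr(int irq, void *dev_id)
        {
                /* Process interrupt; node is the new entry's list_head */
                list_add_tail(node, &ts_list);
                return IRQ_HANDLED;
        }
      • local_irq_disable() protects against both the interrupt handler and pre-emption, but only on the local CPU
      • spin_lock_irqsave() needs to be added in both the driver code and the ISR to make this SMP safe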
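Kernel primitives such as spin_lock_irqsave() cannot run outside the kernel, so the SMP point of slide 18 is sketched here as a user-space analogue. Everything in it is an assumption made for illustration: a C11 `atomic_flag` busy-wait lock stands in for the kernel spin lock, a thread (`fake_isr`) stands in for the interrupt handler running on another CPU, and `ts_event`, `drain_events` and `run_isr_demo` are hypothetical names. It cannot model the interrupt-disabling half of spin_lock_irqsave(); it only shows two CPUs updating one list under a spin lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal user-space spin lock; a stand-in for the kernel's spin lock. */
static atomic_flag list_lock = ATOMIC_FLAG_INIT;

static void lock_list(void)
{
    while (atomic_flag_test_and_set_explicit(&list_lock, memory_order_acquire))
        ;   /* busy-wait until the current holder clears the flag */
}

static void unlock_list(void)
{
    atomic_flag_clear_explicit(&list_lock, memory_order_release);
}

struct ts_event {
    struct ts_event *next;
    int sample;
};

static struct ts_event *ts_head;    /* shared list, protected by list_lock */

/* Stand-in for ts_isr(): pushes `count` events onto the shared list. */
static void *fake_isr(void *p)
{
    int count = *(int *)p;

    for (int i = 0; i < count; i++) {
        struct ts_event *ev = malloc(sizeof(*ev));

        assert(ev != NULL);
        ev->sample = i;
        lock_list();
        ev->next = ts_head;
        ts_head = ev;
        unlock_list();
    }
    return NULL;
}

/* Stand-in for process_ts_entries(): detaches the whole list under the
 * lock, then processes it unlocked. Returns the number of events handled. */
static int drain_events(void)
{
    struct ts_event *ev;
    int handled = 0;

    lock_list();
    ev = ts_head;
    ts_head = NULL;
    unlock_list();

    while (ev) {
        struct ts_event *next = ev->next;

        free(ev);
        ev = next;
        handled++;
    }
    return handled;
}

/* Runs producer and consumer concurrently; returns total events consumed. */
int run_isr_demo(int count)
{
    pthread_t isr;
    int handled = 0;
    int rc = pthread_create(&isr, NULL, fake_isr, &count);

    assert(rc == 0);
    (void)rc;
    for (int i = 0; i < 1000; i++)
        handled += drain_events();  /* races with the producer */
    pthread_join(isr, NULL);
    handled += drain_events();      /* pick up anything left over */
    return handled;
}
```

Every event pushed by the producer is consumed exactly once, so `run_isr_demo(5000)` returns 5000. The lock is held only while the list head is swapped, which keeps the busy-wait short; that is the usual design goal for spin-lock critical sections.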
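  • 19. Slide 19 Driver Scenarios: Contd
      • Locking with a mutex/semaphore doesn't disable pre-emption, but guarantees that the data structure is not corrupted on pre-emption
      • Both SMP safe and pre-empt safe (process context only; a mutex may sleep, so it cannot be taken in an interrupt handler)
        static LIST_HEAD(ts_list);
        static DEFINE_MUTEX(ts_lock);   /* protects ts_list */

        int process_ts_entries(void)
        {
                struct ts_entry *ts, *tmp;  /* ts_entry: hypothetical element type */

                if (mutex_lock_interruptible(&ts_lock))
                        return -ERESTARTSYS;    /* interrupted by a signal */
                list_for_each_entry_safe(ts, tmp, &ts_list, node) {
                        /* Process list elements */
                        list_del(&ts->node);
                }
                mutex_unlock(&ts_lock);
                return 0;
        }

        int process_rest_entries(void)
        {
                struct ts_entry *ts;

                if (mutex_lock_interruptible(&ts_lock))
                        return -ERESTARTSYS;
                list_for_each_entry(ts, &ts_list, node) {
                        /* Process remaining elements */
                }
                mutex_unlock(&ts_lock);
                return 0;
        }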
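  • 20. Slide 20 Driver Scenarios: Contd
      • Functions process_ts_entries() and process_rest_entries() could deadlock if one is pre-empted while holding one of the locks and the other then runs
      • Locks need to be obtained in the same order to avoid deadlock
        static LIST_HEAD(ts_list);
        static LIST_HEAD(tc_list);
        static DEFINE_MUTEX(ts_lock);   /* protects ts_list */
        static DEFINE_MUTEX(tc_lock);   /* protects tc_list */

        /* Return-value checks and unlocks omitted, as on the slide */
        int process_ts_entries(void)
        {
                mutex_lock_interruptible(&ts_lock);
                /* Some processing */
                mutex_lock_interruptible(&tc_lock);   /* order: ts, then tc */
        }

        int process_rest_entries(void)
        {
                mutex_lock_interruptible(&tc_lock);
                /* Some processing */
                mutex_lock_interruptible(&ts_lock);   /* order: tc, then ts (deadlock risk) */
        }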
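The lock-ordering rule from slide 20 can be sketched in user space with POSIX mutexes. This is a minimal illustration under assumptions, not the author's code: the helpers `lock_both`, `unlock_both` and `run_ordered` are hypothetical, and the two counters merely stand in for the two lists. Both paths take `ts_lock` before `tc_lock`, so neither can end up holding one lock while waiting for the other in the opposite order.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t ts_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t tc_lock = PTHREAD_MUTEX_INITIALIZER;
static long ts_count, tc_count;     /* stand-ins for ts_list and tc_list */

/* Rule for this sketch: always take ts_lock first, then tc_lock. */
static void lock_both(void)
{
    pthread_mutex_lock(&ts_lock);
    pthread_mutex_lock(&tc_lock);
}

/* Release in the reverse order of acquisition. */
static void unlock_both(void)
{
    pthread_mutex_unlock(&tc_lock);
    pthread_mutex_unlock(&ts_lock);
}

static void *process_ts_entries(void *p)
{
    long rounds = *(long *)p;

    for (long i = 0; i < rounds; i++) {
        lock_both();
        ts_count++;
        tc_count++;
        unlock_both();
    }
    return NULL;
}

static void *process_rest_entries(void *p)
{
    long rounds = *(long *)p;

    for (long i = 0; i < rounds; i++) {
        lock_both();                /* same order as process_ts_entries() */
        ts_count++;
        tc_count--;
        unlock_both();
    }
    return NULL;
}

/* Runs both paths concurrently; returns the final ts_count. */
long run_ordered(long rounds)
{
    pthread_t a, b;
    int rc;

    ts_count = 0;
    tc_count = 0;
    rc = pthread_create(&a, NULL, process_ts_entries, &rounds);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, process_rest_entries, &rounds);
    assert(rc == 0);
    (void)rc;
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return ts_count;
}
```

Routing every acquisition through one helper is one way to make the order hard to get wrong. If `process_rest_entries()` instead took `tc_lock` first, each thread could hold one lock and wait forever for the other.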
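  • 21. Slide 21 Driver Scenarios: Contd
      • In some cases it might be better to access a resource from a single function, rather than have locks spread throughout the code
        static LIST_HEAD(ts_list);
        static DEFINE_MUTEX(ts_lock);   /* protects ts_list */

        int process_ts_entries(void)
        {
                struct ts_entry *ts, *tmp;  /* ts_entry: hypothetical element type */

                if (mutex_lock_interruptible(&ts_lock))
                        return -ERESTARTSYS;
                list_for_each_entry_safe(ts, tmp, &ts_list, node) {
                        /* Process list elements */
                        list_del(&ts->node);
                }
                mutex_unlock(&ts_lock);
                return 0;
        }

        /* Two separate callers (names hypothetical); neither touches the lock */
        void caller_a(void) { /* Process list elements */ process_ts_entries(); }
        void caller_b(void) { /* Process list elements */ process_ts_entries(); }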
  • 22. Slide 22 Driver Scenarios
      • Don't use one big lock for everything; it reduces concurrency
      • Locks that are too fine-grained increase overhead
      • Both aspects need to be balanced
      • Reader-writer locks
          – Useful if data structures are read more often than they are updated
          – Allow multiple read locks to be held simultaneously
          – Allow a single write lock to be held, and prevent any read lock from being obtained while the write lock is held
          – Available for both spin locks and semaphores
      • Stack variables/structures don't need locking, since each thread of execution has its own stack and therefore its own instance
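The reader-writer behaviour described on slide 22 can be sketched with the POSIX `pthread_rwlock_t`, used here as a user-space analogue of the kernel's reader-writer spin locks and semaphores. The names `cfg_a`, `cfg_b` and `run_rwlock_demo` are hypothetical. The writer keeps two fields equal while holding the write lock; readers share the read lock and count how often they see the fields disagree.

```c
#define _POSIX_C_SOURCE 200809L     /* exposes pthread_rwlock_t under strict -std=c11 */
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NUM_READERS 3
#define ROUNDS 20000

static pthread_rwlock_t cfg_lock = PTHREAD_RWLOCK_INITIALIZER;
static long cfg_a, cfg_b;           /* invariant under the lock: cfg_a == cfg_b */

static void *writer(void *unused)
{
    (void)unused;
    for (long i = 0; i < ROUNDS; i++) {
        pthread_rwlock_wrlock(&cfg_lock);   /* exclusive: no readers inside */
        cfg_a++;
        cfg_b++;
        pthread_rwlock_unlock(&cfg_lock);
    }
    return NULL;
}

static void *reader(void *p)
{
    long *mismatches = p;

    for (long i = 0; i < ROUNDS; i++) {
        pthread_rwlock_rdlock(&cfg_lock);   /* shared with other readers */
        if (cfg_a != cfg_b)
            (*mismatches)++;
        pthread_rwlock_unlock(&cfg_lock);
    }
    return NULL;
}

/* Runs one writer and NUM_READERS readers; returns total mismatches seen. */
long run_rwlock_demo(void)
{
    pthread_t w, r[NUM_READERS];
    long mismatches[NUM_READERS] = { 0 };
    long total = 0;
    int rc;

    rc = pthread_create(&w, NULL, writer, NULL);
    assert(rc == 0);
    for (int i = 0; i < NUM_READERS; i++) {
        rc = pthread_create(&r[i], NULL, reader, &mismatches[i]);
        assert(rc == 0);
    }
    (void)rc;
    pthread_join(w, NULL);
    for (int i = 0; i < NUM_READERS; i++) {
        pthread_join(r[i], NULL);
        total += mismatches[i];
    }
    return total;
}
```

Because a reader can never observe the writer halfway through its update, `run_rwlock_demo()` returns 0. Each reader records into its own slot of the stack array, which also illustrates the last bullet: data private to one thread needs no lock.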
  • 23. Slide 23 Summary
      • Concurrency/parallelism needs to be one of the criteria during the driver design phase
      • Analysis is required to determine the parallel paths and the protection needed for critical sections
      • Drivers that handle concurrency with appropriate locking techniques not only avoid race conditions but also improve performance
      • Unit testing can be used to exercise some of the parallel paths in the driver
          – Two different applications that open parallel paths into the same driver
          – Two instances of the same application