POSIX Realtime Evented Patterns
An introduction to AIO, IO reactors, callbacks and context switches




Lourens Naudé
Trade2win Limited
09/12/09

The Call Stack

    Keeps track of subroutine execution (call + return)
    Dynamic, grows up or down depending on machine architecture
    Composed of stack frames, one per active call
    Frequently includes local data storage, params, additional return state etc.
    Optimized for a single action
    Ordered
    Usually a single call stack per process, except ...
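
The frame-per-call behaviour can be observed from Ruby itself. This is a minimal sketch, assuming MRI without tail call optimization; `stack_depth` and `nested` are hypothetical helper names, not part of the talk.

```ruby
# Depth of the call stack as seen from the caller of this helper.
def stack_depth
  caller.size
end

# Each recursive call pushes one more frame before asking for the depth.
def nested(levels)
  return stack_depth if levels.zero?

  nested(levels - 1)
end

extra_frames = nested(5) - nested(0)
puts "5 nested calls added #{extra_frames} frames"
```

The difference between the two measurements is the number of extra frames pushed by the extra calls; every frame is popped again on return.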


Context Switches
Threads + Fiber / Coroutine overheads

    A stack per Thread / Fiber
    Context specific data
    Transient long running contexts are memory hungry
    Guard against transient threads with a pre-initialized Thread pool
    Threaded full stack Web Apps == expensive context switches
    clean_backtrace ?
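
A pre-initialized thread pool could look roughly like the sketch below. `TinyPool` and `squares_via_pool` are hypothetical names used for illustration; the point is only that the threads (and their stacks) are created once, up front, instead of per job.

```ruby
# A fixed pool: worker threads are created once and reused for every job.
class TinyPool
  def initialize(size)
    @jobs = Queue.new
    @workers = Array.new(size) do
      Thread.new do
        # Each worker blocks on the queue until a job or a stop marker arrives.
        while (job = @jobs.pop) != :stop
          job.call
        end
      end
    end
  end

  def submit(&job)
    @jobs << job
  end

  # Send one stop marker per worker, then wait for all of them to finish.
  def shutdown
    @workers.size.times { @jobs << :stop }
    @workers.each(&:join)
  end
end

def squares_via_pool(numbers, pool_size = 4)
  results = Queue.new
  pool = TinyPool.new(pool_size)
  numbers.each { |n| pool.submit { results << n * n } }
  pool.shutdown
  Array.new(results.size) { results.pop }.sort
end

p squares_via_pool([1, 2, 3, 4, 5])
```

Results are sorted because the order in which workers finish is not deterministic.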



Ruby Threads
Green Threads

    Scheduled with a timer (SIGVTALRM) every 10ms
    Each thread is allowed a 10ms time slice
    Not the most efficient scheduler
    Coupled with select for IO multiplexing, for portability
    Can wait for: an fd, select, a PID, a sleep state, a join
    MRI 1.8: Green Threads, cheap to spawn + switch
    MRI 1.9: Native OS threads, GIL, more expensive to spawn
    JRuby: Ruby thread == Java thread

Fibers
Coroutines

    A resumable execution state
    Computes a partial result (generator)
    Yields back to its caller
    Caller resumes
    Facilities for data exchange
    Initial 4k stack size and very fast context switches
    MRI 1.9 and JRuby only
    Cooperative scheduling for IO
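
The generator and data exchange points can be sketched with Ruby's core `Fiber` class. The helper names (`fibonacci_fiber`, `take_from`, `echo_fiber`) are made up for this example.

```ruby
# A generator: the fiber computes one partial result per resume.
def fibonacci_fiber
  Fiber.new do
    a, b = 0, 1
    loop do
      Fiber.yield a      # hand a value back to the caller and pause here
      a, b = b, a + b
    end
  end
end

# The caller drives the fiber by resuming it as often as it needs values.
def take_from(fiber, count)
  Array.new(count) { fiber.resume }
end

# Data exchange works both ways: the argument passed to resume becomes
# the return value of Fiber.yield inside the fiber.
def echo_fiber
  Fiber.new do |first|
    message = first
    loop { message = Fiber.yield("got #{message}") }
  end
end

p take_from(fibonacci_fiber, 8)
echo = echo_fiber
puts echo.resume("a")
puts echo.resume("b")
```

Nothing runs inside the fiber unless the caller resumes it, which is what makes the scheduling cooperative.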



Reactor Pattern
IO Reactor Pattern

    Main loop with a tick quantum (10 to 100ms)
    Operations register themselves with the reactor
    Events: process forked, fd readable, cmd finished, timer fired, IO timeout etc.
    Callbacks and errbacks
    Reactor notified by lower level subsystems: select, epoll, kqueue etc.
    Twisted (Python), EventMachine (Ruby, C++, Java)
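
A toy version of such a loop, built on `IO.select`, might look like this. It is a sketch under simplifying assumptions (readable events only, no timers or errbacks), and `TinyReactor` is a hypothetical class, not EventMachine's API.

```ruby
# A toy reactor: watchers register a callback per IO object, and the main
# loop asks select which of them are readable on every tick.
class TinyReactor
  def initialize(tick = 0.05)
    @tick = tick      # tick quantum in seconds
    @readers = {}     # io => callback
  end

  def on_readable(io, &callback)
    @readers[io] = callback
  end

  def remove(io)
    @readers.delete(io)
  end

  # Runs until no watchers are left.
  def run
    until @readers.empty?
      ready, = IO.select(@readers.keys, nil, nil, @tick)
      next unless ready   # tick elapsed with nothing to do

      ready.each { |io| @readers[io]&.call(io) }
    end
  end
end

def reactor_demo
  reader, writer = IO.pipe
  received = []
  reactor = TinyReactor.new
  reactor.on_readable(reader) do |io|
    received << io.read_nonblock(1024)
    reactor.remove(io)
  end
  writer.write("ping")
  reactor.run
  received
ensure
  reader&.close
  writer&.close
end

p reactor_demo
```

A production reactor would swap `select` for epoll or kqueue where available; the registration and callback flow stays the same.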




Reactor and Contexts
Best Practices for Multi Threading

Reactor and Threads

    Operations fire on the reactor thread
    Enumerated and invoked FIFO
    Blocking operations block the reactor
    Defer: schedule an operation on a background thread
    Schedule: push a deferred context back to the reactor thread
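
The defer / schedule hand-off can be sketched in plain Ruby with a thread and a queue. `LoopWithDefer` is a hypothetical stand-in for a reactor; it only illustrates that the blocking work runs off the loop thread while the callback runs on it.

```ruby
# Defer: run a blocking operation on a background thread.
# Schedule: hand the result back to the loop thread through a FIFO queue.
class LoopWithDefer
  def initialize
    @scheduled = Queue.new   # callbacks waiting to run on the loop thread
    @pending = 0             # deferred operations not yet completed
  end

  def defer(operation, &callback)
    @pending += 1
    Thread.new do
      result = operation.call   # may block; the loop thread is unaffected
      schedule do
        callback.call(result)
        @pending -= 1
      end
    end
  end

  def schedule(&block)
    @scheduled << block
  end

  # Drain scheduled callbacks in FIFO order until no deferred work remains.
  def run
    @scheduled.pop.call while @pending > 0 || !@scheduled.empty?
  end
end

def defer_demo
  loop_thread = Thread.current
  log = []
  reactor = LoopWithDefer.new
  reactor.defer(-> { sleep 0.05; 21 * 2 }) do |value|
    log << [value, Thread.current == loop_thread]
  end
  reactor.run
  log
end

p defer_demo
```

The counter is only touched on the loop thread (in `defer` and inside scheduled blocks), so it needs no lock in this sketch.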




Figure: Blocked Reactor
Figure: Reactor with a deferred operation

System Calls
Syscalls and the Kernel

    Function calls into the OS: read, write, fork, sbrk etc.
    User vs Kernel space context switch, much more expensive than function calls within a process
    Usually implies data transfer between User and Kernel space
    Important to reduce syscalls for high throughput environments
    Some syscalls lift whole workloads into the kernel ...
    sendfile: request a file to be served directly from the kernel without User space overhead
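
The effect of buffer size on the number of syscalls can be shown by counting unbuffered reads. This sketch assumes a regular file on a local filesystem and that each `IO#sysread` maps to one read call; `sysread_calls` and `syscall_demo` are hypothetical helpers.

```ruby
require "tempfile"

# Count the successful low-level reads needed to consume a file with a
# given chunk size. IO#sysread bypasses Ruby's own buffering.
def sysread_calls(path, chunk_size)
  calls = 0
  File.open(path, "rb") do |file|
    loop do
      file.sysread(chunk_size)
      calls += 1
    end
  end
rescue EOFError
  # sysread signals end of file with EOFError; report the count so far.
  calls
end

def syscall_demo
  Tempfile.create("syscalls") do |file|
    file.write("x" * 65_536)
    file.flush
    [sysread_calls(file.path, 1024), sysread_calls(file.path, 65_536)]
  end
end

small_chunks, one_chunk = syscall_demo
puts "1 KiB chunks: #{small_chunks} reads, 64 KiB chunk: #{one_chunk} read"
```

The same 64 KiB of data costs many more user / kernel transitions when read in small chunks, which is the overhead the slide suggests reducing.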


POSIX Realtime (AIO)
POSIX Async IO extensions

    Introduced in Linux Kernel 2.6.x
    Floating spec for a number of years, currently defined in POSIX.1b
    Implementations vary across platforms, much like browser compatibility
    Fallback to blocking operations in most implementations
    Powers popular reverse proxies like Squid, Nginx, Varnish etc.




And then there were Specs

AIO Control Blocks
Figure: Control Block Struct

    File descriptor with proper r/w mode set
    Buffer region for read / write
    Type of operation: read / write
    Priority: higher priority == faster execution
    Offset: Position to read from / write to
    Bytes: Amount of data to transfer
    Callback mechanism: no op, thread or signal
    Best wrapped in a custom struct for embedding domain logic specific to the use cases
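
As an illustration of the wrapping idea, the fields above can be modelled in Ruby. This is only a model, not a binding: the `aio_*` member names follow `struct aiocb` from `<aio.h>`, while `ControlBlock`, `TrackedRead`, `label`, `on_done` and the symbols `:read` and `:none` are hypothetical.

```ruby
# Field names mirror struct aiocb; this is only a Ruby model of it.
ControlBlock = Struct.new(
  :aio_fildes,      # file descriptor, opened with a suitable r/w mode
  :aio_buf,         # buffer region to read into / write from
  :aio_lio_opcode,  # type of operation (used by list operations)
  :aio_reqprio,     # request priority
  :aio_offset,      # position to read from / write to
  :aio_nbytes,      # amount of data to transfer
  :aio_sigevent     # notification mechanism: none, thread or signal
)

# Hypothetical wrapper that keeps domain data next to the control block.
TrackedRead = Struct.new(:control_block, :label, :on_done)

def build_tracked_read(fd, offset, nbytes, label, &on_done)
  block = ControlBlock.new(fd, "\0" * nbytes, :read, 0, offset, nbytes, :none)
  TrackedRead.new(block, label, on_done)
end

request = build_tracked_read(0, 128, 16, "header") { |data| data.bytesize }
puts request.label
puts request.control_block.aio_nbytes
```

In C the same effect is usually achieved by embedding the control block in a larger struct, so a completion handler can get from the block back to the application data.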


AIO Operations

AIO Operations on a single fd

    aio_read: queue an asynchronous read
    aio_write: queue an asynchronous write
    aio_error: error state, if any, for an operation
    Poll aio_error while it returns EINPROGRESS to simulate a blocking operation
    aio_cancel: cancel a submitted job
    aio_suspend: block the caller until one of the given operations completes
    aio_fsync: asynchronously force a file's queued writes to disk
    aio_return: return status from a given operation
    Uniform API, single AIO Control Block as arg
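
The submit / poll / return life cycle can be emulated in Ruby to show the control flow. This is an emulation only: a background thread stands in for the AIO subsystem, and `EmulatedAioRead`, `error`, `result` and `blocking_read` are hypothetical names that merely echo `aio_read`, `aio_error` and `aio_return`. It assumes Ruby 2.5 or later for `IO#pread`.

```ruby
require "tempfile"

# Emulation only: a thread performs the positional read in the background.
class EmulatedAioRead
  def initialize(path, offset, nbytes)
    @worker = Thread.new do
      File.open(path, "rb") { |file| file.pread(nbytes, offset) }
    end
  end

  # Like aio_error: reports :in_progress until the operation has finished.
  def error
    @worker.alive? ? :in_progress : :done
  end

  # Like aio_return: the final result of the operation.
  def result
    @worker.value
  end
end

# Simulate a blocking read by polling the error state until completion.
def blocking_read(path, offset, nbytes)
  operation = EmulatedAioRead.new(path, offset, nbytes)
  sleep 0.001 while operation.error == :in_progress
  operation.result
end

Tempfile.create("aio-demo") do |file|
  file.write("hello world")
  file.flush
  puts blocking_read(file.path, 6, 5)
end
```

With real POSIX AIO, `aio_suspend` would replace the sleep-and-poll loop so that the caller does not burn CPU while waiting.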

AIO List Operations

    The previously mentioned API still has a syscall per call overhead
    lio_listio: submit a batch of control blocks with a single syscall
    Modes: blocking (LIO_WAIT) and non-blocking (LIO_NOWAIT); LIO_NOP marks a control block to be skipped
    Array of control blocks, number of operations and an optional callback as arguments
    Callback fires when all operations are done
    Callbacks from individual control blocks still fire
    Useful for app specific correlation
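
The two levels of callbacks can be illustrated with another emulation. Again, threads stand in for the AIO subsystem; `emulated_listio`, `listio_demo` and the request keys `:path`, `:offset`, `:nbytes` and `:on_done` are hypothetical, and `IO#pread` needs Ruby 2.5 or later.

```ruby
require "tempfile"

# Emulation only: one call takes the whole list of requests. Each request's
# own callback fires, then the batch callback fires once everything is done.
def emulated_listio(requests, &batch_done)
  workers = requests.map do |request|
    Thread.new do
      data = File.open(request[:path], "rb") do |file|
        file.pread(request[:nbytes], request[:offset])
      end
      request[:on_done]&.call(data)   # per-request callback still fires
      data
    end
  end

  # Returns a thread whose value is the list of results, in request order.
  Thread.new do
    results = workers.map(&:value)
    batch_done&.call(results)
    results
  end
end

def listio_demo
  Tempfile.create("listio") do |file|
    file.write("abcdefgh")
    file.flush
    seen = Queue.new
    requests = [0, 4].map do |offset|
      { path: file.path, offset: offset, nbytes: 4,
        on_done: ->(data) { seen << data } }
    end
    batch = emulated_listio(requests) { |results| seen << results.size }
    results = batch.value
    [results, Array.new(seen.size) { seen.pop }.last]
  end
end

p listio_demo
```

The batch callback is where an application can correlate the individual results, for example to assemble one response from several file reads.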

AIO and Syscalls

Figure: 8 files, read
Figure: 8 files, async read

Revisit Threads and Fibers

    Concept from James “raggi” Tucker
    Cheap switching of MRI green threads
    Let's embrace this …
    Stopped threads don't have scheduler overhead




Fibered IO Interpreter

    Thread.stop and Thread#wakeup for pooled or transient threads
    Stopped threads are excluded by the scheduler, which saves 10ms of runtime per stopped thread when IO bound
    Model fits very well with existing threaded servers like Mongrel
    No need for an IO reactor: we delegate this to the OS and syscalls
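
Parking and waking a thread can be sketched with `Thread.stop` and `Thread#wakeup`. In this example the main thread plays the role of the IO completion notification; `parked_worker_demo` is a hypothetical helper.

```ruby
# A worker parks itself with Thread.stop and is woken once its "IO" is done.
def parked_worker_demo
  events = Queue.new
  worker = Thread.new do
    events << :parked
    Thread.stop            # sleep until another thread wakes this one up
    events << :resumed
  end

  events.pop                                   # wait for :parked
  sleep 0.001 until worker.status == "sleep"   # make sure it really stopped
  status_while_parked = worker.status
  worker.wakeup                                # mark the worker runnable again
  worker.join
  [status_while_parked, events.pop]
end

p parked_worker_demo
```

The wait for the `"sleep"` status matters: waking a thread before it has actually stopped would leave it parked with nobody left to wake it.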




Links and References

    A few related projects
    http://github.com/eventmachine/eventmachine
    Event Machine repository
    http://github.com/methodmissing/aio
    Work in progress AIO extension for MRI, API in flux, but usable
    http://github.com/methodmissing/callback
    A native MRI callback object
    http://github.com/methodmissing/channel
    Fixed sized pub sub channels for MRI

Questions?

Thanks!
@methodmissing (github / twitter)

Barcamp PT
