epoll - The I/O Hero 
Evented I/O in Linux 
Mohsin Shafeeque MS080400037
Operating System 
● The processor blindly executes instructions.
● It's the OS that imposes structure and enables a lot:
  o Things like multitasking
  o Interaction with I/O devices
  o Providing APIs and utility functions
● In essence, lots of services!
● Without them, there is no screen, no keyboard, no hard disk.
● But today I'll talk about a single service that has a great impact.
● Or rather, a single facility that made the Linux kernel a whole lot more valuable.
● And that is evented I/O.
A little about I/O 
● All I/O devices are slow compared to the CPU
  o Disks
  o Networks
● Processors are far faster.
  o A disk operation can take 10 ms.
  o In that time the processor can execute millions of instructions.
● Two modes of I/O
  o Blocking
    ▪ The process blocks until the operation completes.
  o Non-blocking
    ▪ The process continues instead of waiting. Asynchronous.
● Non-blocking I/O has many implementations.
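To make the blocking versus non-blocking distinction concrete, here is a minimal C sketch. The helper names set_nonblocking and try_read are hypothetical (they are not from the slides); the calls they wrap, fcntl with O_NONBLOCK and read returning EAGAIN, are standard POSIX.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: switch an open descriptor to non-blocking mode.
 * Returns 0 on success, -1 on failure. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: attempt one read that never blocks.
 * Returns the number of bytes read (0 means end of file),
 * -1 on a real error, or -2 when no data is available yet. */
long try_read(int fd, void *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;   /* a blocking descriptor would have slept here */
    return (long)n;
}
```

With a blocking descriptor the same read call would put the process to sleep until data arrived; here the caller gets control back at once and decides what to do in the meantime.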
Non-blocking I/O
● There are many hardware and software implementations.
● Polling
  o Loops continuously to check the status.
  o Wastes CPU cycles.
● Signals
  o OS-generated software interrupts.
  o Might leave the interrupted code in an inconsistent state.
● Callbacks
  o Pointers to functions, invoked on completion.
  o Stack-deepening issue (a callback that itself issues I/O).
● Interrupts
  o Hardware interrupts, handled in kernel mode.
Web servers - I/O hungry! 
● It's not just disk defragmentation and file copy programs that are I/O hungry.
● Web servers are even hungrier!
● In the age of the Internet, web server performance is critical.
● And all of it comes down to throughput:
● the number of requests/clients served per second.
● There are many server models built around this.
● But one particular facility, epoll, has accelerated all of this.
How does a web server work?
● Before we look at epoll, let's look at servers.
● They open a socket.
● They wait for incoming connections.
● There are three models here. The first two:
  o One process per connection (Apache?)
  o One thread per connection.
● In the first case, a new process is spawned for every request.
● In the second case, a new thread is created for every request.
● Neither is very scalable.
● Third option:
  o One thread, multiple connections!
● Let's see how it works!
Create sockets, then select!
● A single thread creates many sockets.
● Each socket is a file descriptor, so the server holds a set of them.
● The server code calls the select function:
  o int select(int nfds, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout);
● nfds - the highest-numbered file descriptor in any of the sets, plus 1.
● readfds - descriptors to watch for readability.
● writefds - descriptors to watch for writability.
● exceptfds - descriptors to watch for exceptional conditions.
● timeout - the maximum time to sleep.
● The program calls this function, which puts it to sleep.
● The call returns only when some descriptor is ready or the timeout expires.
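The select call described above can be sketched in C roughly as follows. The helper name wait_readable and its return codes are assumptions made for illustration; select, FD_ZERO, FD_SET and FD_ISSET are the standard interface.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for any of the n descriptors
 * in fds to become readable. Returns the first readable descriptor,
 * -1 on timeout, or -2 on error. */
int wait_readable(const int *fds, int n, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;
    int maxfd = -1;

    FD_ZERO(&readfds);
    for (int i = 0; i < n; i++) {
        FD_SET(fds[i], &readfds);   /* each fd must be below FD_SETSIZE */
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* nfds is the highest descriptor number plus one, not a count. */
    int ready = select(maxfd + 1, &readfds, NULL, NULL, &tv);
    if (ready < 0)
        return -2;
    if (ready == 0)
        return -1;

    /* select only says how many are ready; the caller scans the set. */
    for (int i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &readfds))
            return fds[i];
    return -1;
}
```

Note that the set has to be rebuilt before every call, because select overwrites it with the result.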
select - Zooming in 
●From the man page: 
select() and pselect() allow a program to monitor multiple file 
descriptors, waiting until one or more of the file descriptors become 
"ready" for some class of I/O operation (e.g., input possible). A file 
descriptor is considered ready if it is possible to perform the corresponding
I/O operation (e.g., read(2)) without blocking.
Problem with select 
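● It takes O(n) time!
● That is, if 500 file descriptors (sockets) are being watched,
● the kernel checks all 500 on every call, and the program then scans the returned sets to find the fd that's ready for I/O.
● And that's a problem!
● We already pay the user/kernel mode switching overhead on every call.
● And then this O(n) on top of it.
● Solution?
● epoll - introduced in Linux 2.5.44.
● Waiting costs time proportional to the number of ready descriptors, not the number watched: effectively O(1) per event. That's fast!
● Let's have a look.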
epoll details 
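int epoll_create(int size);
Creates an epoll object and returns its file descriptor. (Since Linux 2.6.8 the size argument is ignored, but it must be greater than zero.)
int epoll_ctl(int epfd, int op, int fd, struct epoll_event *event);
Controls (configures) which file descriptors are watched by this object, and for which events.
int epoll_wait(int epfd, struct epoll_event *events, int maxevents, int timeout);
Waits for events on the watched descriptors and returns the number that are ready, waiting at most timeout milliseconds.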
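A minimal sketch of how the three calls fit together, in C. The helpers watch_fd and collect_ready are hypothetical names introduced only for this example, and the fixed batch size of 16 is an arbitrary assumption. Callers would create the epoll object with epoll_create or with the newer epoll_create1(0) variant, which is not mentioned in the slides.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: register fd for read readiness on the epoll
 * object epfd. Pass EPOLLET in extra_flags for edge-triggered mode,
 * or 0 for the default level-triggered mode. Returns 0 or -1. */
int watch_fd(int epfd, int fd, unsigned int extra_flags)
{
    struct epoll_event ev = {0};
    ev.events = EPOLLIN | extra_flags;
    ev.data.fd = fd;                 /* handed back to us by epoll_wait */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical helper: wait up to timeout_ms and copy the ready
 * descriptors into out (at most max, capped at 16 per call).
 * Returns the number of ready descriptors, 0 on timeout, -1 on error. */
int collect_ready(int epfd, int *out, int max, int timeout_ms)
{
    struct epoll_event events[16];
    if (max > 16)
        max = 16;
    int n = epoll_wait(epfd, events, max, timeout_ms);
    for (int i = 0; i < n; i++)
        out[i] = events[i].data.fd;  /* only ready fds are returned */
    return n;
}
```

Unlike select, the interest set is registered once and kept inside the kernel, and epoll_wait hands back only the descriptors that are ready, so there is nothing to rebuild or scan on each call.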
How does it work?
● epoll_wait is called by the server code.
● It returns the fds that are ready (e.g. have data to read).
● Two modes of operation:
  o Edge-triggered.
  o Level-triggered (the default).
● In edge-triggered mode, the process is notified only once per new arrival.
  o E.g. 2 KB received, the process reads only 1 KB: the next epoll_wait call will block until further data arrives, even though 1 KB is still buffered.
● In level-triggered mode, the process is notified for as long as the buffer is not empty.
  o E.g. 2 KB received, 1 KB read: the next epoll_wait call won't block and returns the same descriptor again.
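The 2 KB / 1 KB scenario can be reproduced with a pipe. The C sketch below is an illustration under assumptions: the function name second_wait_events is hypothetical, a pipe stands in for a socket, and a 100 ms timeout replaces the indefinite block so the function always returns.

```c
#include <sys/epoll.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: write 2 KB into a pipe, read back only 1 KB, and
 * return how many events a second epoll_wait reports.
 * mode_flags is 0 for level-triggered or EPOLLET for edge-triggered.
 * Returns -1 if any setup step fails. */
int second_wait_events(unsigned int mode_flags)
{
    static const char data[2048];
    char buf[1024];
    int p[2];
    int result = -1;
    struct epoll_event ev = {0}, out;

    if (pipe(p) == -1)
        return -1;
    int epfd = epoll_create1(0);
    if (epfd == -1) {
        close(p[0]);
        close(p[1]);
        return -1;
    }

    ev.events = EPOLLIN | mode_flags;
    ev.data.fd = p[0];
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, p[0], &ev) == 0 &&
        write(p[1], data, sizeof data) == (ssize_t)sizeof data &&
        epoll_wait(epfd, &out, 1, 100) == 1 &&      /* first notification */
        read(p[0], buf, sizeof buf) == (ssize_t)sizeof buf) /* take 1 KB */
        result = epoll_wait(epfd, &out, 1, 100);    /* 1 KB still buffered */

    close(epfd);
    close(p[0]);
    close(p[1]);
    return result;
}
```

Calling second_wait_events(0) reports one event because data is still buffered, while second_wait_events(EPOLLET) times out with zero events. This is why edge-triggered code normally uses non-blocking descriptors and keeps reading until read fails with EAGAIN.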
So what? 
● This model has enabled servers to handle thousands of requests with a handful of threads.
● The server creates a listening socket; for each connection that arrives, a new file descriptor is created and monitored.
● Event driven!
● Nginx! The second largest web server on the Internet uses this model.
● Written by a Russian developer, Igor Sysoev. Handles 70 million calls a day.
● Nginx is used by WordPress and many others!
● Node.js - the new hotness. Based on the V8 JavaScript engine.
● It uses epoll to handle thousands of requests per thread. Apache, in comparison, uses one thread per request.
Conclusion 
● Operating systems provide many services.
● But sometimes even a single function call can open up a new world of possibilities.
● epoll is such an example.
● I/O is one of the most critical parts of an OS.
● Even on a single-tasking system, efficient I/O is what the system is all about.
● And epoll brings efficient event-driven I/O to Linux.
● Interesting applications are being developed on top of epoll.
References 
● epoll official man page.
● Linux Device Drivers - poll and epoll
● Node.js - Evented I/O for JavaScript
● Nginx - The Russian web server
