Agenda
1. Computer Hardware Basics
2. Computing Fundamental Notions
3. Introduction to Operating Systems
3.1 Memory Management
3.2 Process Management
3.3 Multithreading and Multiprocessing
3.4 Concurrency Control
3.5 Networking
3.6 Other Fundamental Notions
4. Final Notes
A Simplified Computer Model

[Diagram: the CPU is connected through the front-side bus (address, control, and data lines) to the Northbridge (Memory Controller Hub) of the chipset. The Northbridge links to RAM (0x0000 0000 – 0xFFFF FFFF) over the memory bus, to the GPU over the high-speed graphics bus (framebuffer management, OGL / D3D / CUDA, and output controllers for S-Display, HDMI, DVI, etc.), and to the Southbridge (I/O Controller Hub). The Southbridge serves the PCI bus and other buses: IDE, SATA, USB, the BIOS over the ISA or LPC bus, the Super I/O chip (parallel port, serial port, PS/2 keyboard / mouse port, etc.), the sound controller (sound processor, AD / DA converters, speakers), and the Ethernet controller, which implements layers 1 (physical) and 2 (data link) of the OSI stack and drives the Ethernet jack.]
Let’s simplify it further – the model we’ll use

[Diagram: the CPU, the RAM, and the I/O device (hard drive) connected by a bus with address, control, and data lines.]

- RAM is a high-speed data store: an X-bit address (typically 64 bits) maps to a Y-bit data cell (typically 8 bits). Its address space runs from 0x0000 0000 to 0xFFFF FFFF.
- The I/O device (hard drive) holds long-term persistent data, organized in files and managed by a filesystem.
- The CPU can do many things. For instance:
  - Read / write data from / to RAM
  - Interpret an assembly instruction
  - Trigger I/O operations
  - Perform basic arithmetic and logic
  - Generate interrupts
  - Cache data in its own very high-speed memory (L1 – L3 caches)
Fundamental Components

[Diagram: the CPU, RAM, and hard drive on the bus. The hard drive holds partitions and a filesystem (inode table, directory structure, inodes). RAM (0x0000 0000 – 0xFFFF FFFF) holds the processes space and the kernel space: the system call map, the video framebuffer, the kernel process layout, interrupt handlers, TCBs, the physical memory layout, the bootloader, the microcode (software), and the filesystem buffer cache. The CPU contains the registers (MAR, MBR, CBR; the status register SR; data registers AX, BX, CX, DX; IP, SP, BP; index registers SI, DI; control flags OF, DF, IF, TF, SF, ZF, AF, CF; segment registers CS, DS, SS), the core components (ALU, FPU, CU, MMU, AGU / ACU, clock, L1 and other cache levels), and the control unit components (IR, instruction decoder, cycle encoder, control logic circuits).]
Registers

Memory management registers:
- MAR – Memory Address Register – holds the address the CPU emits on the address bus so that the corresponding cell in memory is activated.
- MBR – Memory Buffer Register – contains the data to read / write; also called MDR – Memory Data Register.
- CBR – Control Bus Register – emits the READ or WRITE command on the control bus.

Program execution management registers:
- IP – Instruction Pointer – also called PC – Program Counter – points to the next instruction the CPU should execute.
- SP – Stack Pointer – points to the last value pushed onto the stack.
- BP – Base Pointer – is the base pointer for the current stack frame.

CPU state management register:
- SR – Status Register – contains information about the state of the processor.

Segment registers:
- CS – Code Segment – contains the instructions to be executed.
- DS – Data Segment – contains data, constants and work areas.
- SS – Stack Segment – contains data and return addresses of procedures or subroutines, implemented as a 'stack' data structure; the SS register stores the starting address of the stack.
Registers (cont’d)

Data registers:
- AX is the primary accumulator; it is used in input/output and most arithmetic instructions. For example, in a multiplication operation, one operand is stored in the EAX, AX or AL register according to the size of the operand.
- BX is known as the base register, as it can be used in indexed addressing.
- CX is known as the count register; the ECX / CX registers store the loop count in iterative operations.
- DX is known as the data register. It is also used in input/output operations, and together with AX in multiply and divide operations involving large values.

Index registers:
- SI – Source Index – is used as the source index for string operations.
- DI – Destination Index – is used as the destination index for string operations.

Control registers (flags):
- OF – Overflow Flag – indicates the overflow of the high-order (leftmost) bit of data after a signed arithmetic operation.
- DF – Direction Flag – determines the direction for moving or comparing string data. When DF is 0, string operations go left to right; when it is set to 1, they go right to left.
- IF – Interrupt Flag – determines whether external interrupts (keyboard entry, etc.) are to be ignored or processed. It disables external interrupts when 0 and enables them when set to 1.
- TF – Trap Flag – puts the processor in single-step mode. The DEBUG program sets the trap flag so that execution can be stepped one instruction at a time.
- SF – Sign Flag – shows the sign of the result of an arithmetic operation, indicated by the high-order (leftmost) bit. A positive result clears SF to 0; a negative result sets it to 1.
- ZF – Zero Flag – indicates the result of an arithmetic or comparison operation. A nonzero result clears it to 0; a zero result sets it to 1.
- AF – Auxiliary Carry Flag – contains the carry from bit 3 to bit 4 following an arithmetic operation; used for specialized (e.g. BCD) arithmetic. AF is set when a 1-byte arithmetic operation causes a carry from bit 3 into bit 4.
- PF – Parity Flag – reflects the number of 1-bits in the low byte of the result of an arithmetic operation. An even number of 1-bits sets the parity flag to 1; an odd number of 1-bits clears it to 0.
- CF – Carry Flag – contains the carry (0 or 1) out of the high-order (leftmost) bit after an arithmetic operation. It also stores the last bit shifted out by a shift or rotate operation.
CPU Components

Components:
- ALU – Arithmetic Logic Unit – is a combinational digital circuit that performs arithmetic and bitwise operations on integer binary numbers.
- FPU – Floating-Point Unit – is specially designed to carry out operations on floating-point numbers. Typical operations are addition, subtraction, multiplication, division, and square root. Some FPUs can also compute transcendental functions such as exponentials or trigonometric functions.
- AGU – Address Generation Unit – sometimes also called the address computation unit (ACU), is an execution unit inside the CPU that calculates the addresses the CPU uses to access main memory. Having address calculations handled by separate circuitry that operates in parallel with the rest of the CPU reduces the number of CPU cycles required to execute various machine instructions, improving performance.
- MMU – Memory Management Unit – sometimes called the paged memory management unit (PMMU), has all memory references passed through it, primarily performing the translation of virtual memory addresses to physical addresses. The MMU effectively implements virtual memory management, handling at the same time memory protection, cache control, and bus arbitration.
- CU – Control Unit – directs the operation of the processor. It tells the computer's memory, arithmetic and logic unit, and input and output devices how to respond to the instructions sent to the processor, directing the operation of the other units by providing timing and control signals. Most computer resources are managed by the CU.
- A CPU cache is a hardware cache used by the central processing unit (CPU) to reduce the average cost (time or energy) of accessing data from main memory. A cache is a smaller, faster memory, located closer to a processor core, which stores copies of the data from frequently used main memory locations. Most CPUs have a hierarchy of multiple cache levels (L1, L2, often L3, and rarely even L4), with separate instruction-specific and data-specific caches at level 1.

Control unit components:
- IR – Instruction Register – or current instruction register (CIR), holds the instruction currently being executed or decoded. In simple processors, each instruction to be executed is loaded into the instruction register, which holds it while it is decoded, prepared and ultimately executed, which can take several steps. Modern processors use a pipeline of instruction registers, where each stage of the pipeline does part of the decoding, preparation or execution and then passes the instruction on to the next stage; they can even perform some steps out of order, decoding several instructions in parallel.
- Clock – times how fast the majority of the logic in the computer operates, i.e. how many state changes the computer can perform in a second.
- Instruction decoder / cycle encoder – decodes and interprets the contents of the instruction register, i.e. splits the instruction into fields for the control unit to interpret.
- Control logic circuits – create the control signals themselves, which are then sent around the processor. These signals inform the other components (ALU, MMU, etc.) and the register array which actions and steps they should be performing, which data they should be using, and what should be done with the results.
Computer System Architecture / Standard OS Model

[Diagram: hardware at the bottom (CPU, RAM, I/O). Above it, the software stack: the OS kernel (drivers, process management, memory management, and other subsystems) exposed through system calls; the standard C library (e.g. glibc) and system libraries; application frameworks; the C compiler and other compilers; the graphical user interface; and, on top, the user applications.]
Executing a Program

[Diagram: the CPU, RAM, and hard drive on the bus. The program file lives on the file-system. In RAM, the standard process layout in memory comprises the TEXT (code) segment, the initialized global variables, the uninitialized global variables (BSS), the HEAP, the STACK, and the command line and environment variables, with the kernel and its system call map at the top of the address space.]

To execute the program, the kernel either reads the file through the HDD controller and stores it in memory, or orders the HDD controller to send it to memory using DMA.
Dynamic Memory Allocation

[Diagram: the memory-allocation path from languages down to the kernel. Java (new(), GC), C++ (new() / delete()) and Python (constructors, del() / GC) build on application frameworks; C programs call malloc() / calloc() / free() in the C library, which in turn uses the mmap() / brk() system calls of the kernel's memory management. In RAM, the newly allocated block appears in the HEAP, between the data segments (initialized globals, BSS) and the STACK.]

// Dynamically allocate memory using malloc()
int n = 10;                                  // array of 10 integers
int* ptr = (int*) malloc(n * sizeof(int));

// Free previously allocated memory
free(ptr);
Call stack principles

[Diagram: the CPU's IP, BP, and SP registers point into the process's memory: IP into the TEXT (code) segment, BP and SP into the STACK. A function frame contains, in order: the function arguments, the return address, the saved FP (BP), the saved SP, and the local variables.]
Call stack Example Code

Let’s use a little piece of code as an example:

// Forward declaration so DrawSquare can call DrawLine
void DrawLine(int startX, int startY, int endX, int endY);

// Function to draw a square on the screen
void DrawSquare(int startX, int startY, int length){
    // Let’s draw the square by drawing each of its lines
    DrawLine(startX, startY, startX + length, startY);
    DrawLine(startX, startY, startX, startY + length);
    DrawLine(startX + length, startY, startX + length, startY + length);
    DrawLine(startX, startY + length, startX + length, startY + length);
}

// Function to draw a line on the screen
void DrawLine(int startX, int startY, int endX, int endY){
    // Do whatever needs to be done to draw a line :-)
    ...
}
Call stack Example

[Diagram: three snapshots of the TEXT segment and the STACK while the code runs.
1. Before any call: IP points into DrawSquare's assembly in the TEXT segment; no frame of ours is on the stack.
2. DrawSquare is called: a DrawSquare frame – function arguments, return address, saved FP (BP), saved SP, local variables – is pushed onto the STACK; FP and SP now delimit it while IP advances through DrawSquare's assembly.
3. DrawSquare calls DrawLine: a DrawLine frame is pushed on top of the DrawSquare frame and IP moves into DrawLine's assembly. When DrawLine returns, its frame is popped, the saved registers are restored, and execution resumes in DrawSquare.]
Operating Systems Architecture

[Diagram: Linux operating system architecture, layer by layer.
- Hardware: CPU, RAM, storage (hard drive, DVD, floppy), various terminal equipment, network adapter.
- Kernel (kernel core plus modules): memory manager and scheduler (process and memory management); filesystem-type handlers and block device drivers (filesystem support); character device drivers (graphics, terminals, etc.); network protocols and network device drivers (network functionality). The kernel provides multitasking, virtual memory, and the filesystem.
- Interfaces: the kernel exposes a system calls interface, wrapped by the standard C library (e.g. glibc – the low-level interface) and by system libraries (the high-level interface).
- System software: shell, window system, C compiler and other compilers.
- Applications on top.]
Virtual Memory – principle / wrong model

[Diagram: Process A sees a virtual memory space (0x0000 – 0xFFFF) with the usual layout (system call map, TEXT, data, HEAP, STACK, command line and environment variables). Through the MMU, it maps onto physical RAM (0x0000 0000 – 0xFFFF FFFF), where the kernel and processes A, B, and C each occupy a contiguous region.]
Virtual memory is a combination of hardware and software techniques that gives every process an idealized abstraction of memory, fundamentally independent of physical memory even though it is mapped onto it.
The operating system, using a combination of hardware and software, maps the memory addresses used by a program, called virtual addresses, to physical addresses in computer memory.
Main storage, as seen by a process or task, appears as a contiguous address space or a collection of contiguous segments.
The operating system manages the virtual address spaces and the assignment of real memory to virtual memory.
The address translation hardware in the CPU, the memory management unit (MMU), automatically translates virtual addresses to physical addresses.
Virtual Memory – actual model

[Diagram: Process A's virtual memory (0x0000 – 0xFFFF) is divided into pages 1–8 covering the usual layout (system call map, TEXT, initialized variables, BSS, HEAP, STACK, command line and environment variables). Physical RAM (0x0000 0000 – 0xFFFF FFFF) holds the kernel plus a mix of pages from several processes (Proc A pages 1, 2, 4, 8; Proc B pages 1, 2; Proc C page 1). The SWAP partition (pagefile.sys / swap file) on the hard drive holds Proc A's pages 3, 5, and 7.]

This is the real model:
- A page – memory page or virtual page – is a fixed-length contiguous block of memory, indexed / tracked by the Page Table.
- The most frequently used pages reside in physical memory (RAM).
- The least frequently used pages are "swapped" out to the filesystem.
- Some pages are not even created.
Virtual Memory – Page Fault Management

[Diagram: Process A's virtual memory, the physical RAM with the Page Table, and the SWAP partition (pagefile.sys / swap file) on the hard drive.]

1. For whatever reason the process needs to access its page 7 (e.g. a new function call requiring a new stack block).
2. Since page 7 is not in memory, the MMU generates a Page Fault exception (interrupt) and sends it to the kernel.
3. The memory management module selects the least frequently used (LFU) page to swap out and replace with the missing page.
4. Proc B / page 2 is swapped out while Proc A / page 7 is swapped in. The Page Table is then updated and the faulting instruction is restarted.
Multitasking

[Diagram: processes A, B, and C all reside in virtual memory at the same time. Over one second of CPU clock time (T0 to T1 = T0 + 1 sec), execution alternates A, B, C, A, B, C, … with a context switch between each time slice.]

Multitasking is the concurrent execution of multiple tasks – or processes – over a certain period of time.
The computer interrupts already-started tasks before they finish and executes new tasks, instead of waiting for the former to end (preemptive model / preemption).
As a result, the computer executes segments of multiple tasks in an interleaved manner (time slicing).
Multi-tasking – what happens in the CPU

[Diagram: the same three processes in memory; over one second the CPU's execution jumps between their TEXT segments, with a context switch at each time-slice boundary.]

In a multitasking / time-slicing operation mode, the OS / CPU pair performs thousands of context switches every single second.
Every process has the "illusion" that it is executing continuously … and that it has the CPU all to itself :-)
This is called "concurrent" execution.
A context Switch scenario

[Diagram: processes A and B in memory; over time, execution alternates between user mode and kernel mode (process management).]

- Timer interrupt: A is preempted, context switch to B.
- System call (trap): B starts a blocking IO operation, context switch to A.
- Timer interrupt: no other process is eligible (B's IO is not completed), A keeps the CPU.
- IO device interrupt: B's IO operation is completed, context switch to B.
- Timer interrupt: B still has some execution time quota, no context switch.
How context Switches happen

[Diagram: on each switch, the kernel saves process A's state (IP, BP, SP, and the other registers) into A's PCB, then restores process B's state from B's PCB – and the reverse at the next switch.]

Context Switch / PCB – Process Control Block
During a context switch, the kernel must stop the execution of the running process, copy the values of the hardware registers out to its PCB, and load the hardware registers with the values from the PCB of the new process.
The list of PCBs is stored in the kernel heap space.

A PCB contains:
- PID – Process ID
- Process state
- Process priority
- Accounting information
- IP – Instruction Pointer (PC)
- Other CPU registers
- Memory management info
- I/O information
Task / Process Scheduling

A simple and common approach: priority-based scheduling.
Principle:
- A priority (an integer) is assigned to each process.
- At every scheduling cycle, the kernel selects and executes the process with the highest priority.
Aging approach for a fair round-robin policy:
- A process being executed has its priority decremented at every scheduling cycle.
- A process that is idle (waiting for execution) has its priority incremented at every scheduling cycle.
- This ensures a fair round robin between processes.
- Variant: static base priority / dynamic priority increment.
There are dozens of other approaches across OSes. Some are very simple (FIFO, round robin, etc.), some are really complex (CPU burst prediction, etc.). There are entire books dedicated to process / task scheduling in operating systems.
Non-preemptive scheduling vs. preemptive scheduling

- Non-preemptive scheduling: a process cannot be stopped unless it makes a system call (IO, wait, sleep, etc.). A running process can keep the CPU busy forever, and other processes just have to wait.
- Preemptive scheduling: a process can be stopped even if it doesn't make any system call, for instance when its "time share" or slice has expired.
- Preemptive schedulers are significantly more difficult to design and implement.

Other concerns:
- Preemption means the ability of the OS to preempt (stop or pause) a currently scheduled task in favour of a higher-priority task.
- Preemption is crucial to support real-time processing use cases in OSes (Linux with specific kernel flags, or RTOSes).
- Preemption is key for servers, which most of the time run multiple backends (database, web server, middleware, application servers, etc.).

What can trigger a context switch?
- The time slice has expired (preemptive scheduling only)
- A system call (IO or any other system call)
- An explicit call to yield(), wait() or sleep()
Threads

A process:
- is an independent program or executable running on the computer;
- is isolated from other processes in memory;
- always has at least one thread, usually the main thread;
- is represented by a PCB object in the kernel.

A thread:
- belongs to a process and exists in the context of its owning process;
- shares the heap, data, and code areas with the other threads of the owning process – as such, threads can communicate easily with each other;
- has its own stack;
- is represented by a TCB object in the kernel.

A scheduler typically favours switching to another thread of the same process, because thread context switches are less costly than process context switches.
A thread is a lightweight execution stream within a process.

[Diagram: process A's virtual memory holds one TEXT segment, data, and heap shared by all threads, plus one stack per thread (T1, T2, T3). In kernel memory, process A's PCB points to one TCB per thread.]

A TCB contains:
- TID – Thread ID
- Thread state
- Accounting information
- IP – Instruction Pointer (PC)
- Other CPU registers
- A pointer to the owning PCB
Multiprogramming / Multitasking / Multithreading / ...

Multiprogramming
- Several programs can reside in physical memory at the same time.
- Non-preemptive scheduling: a program keeps the CPU as long as it is not doing an IO or other waiting operation.
- Round-robin approach.
- Multiprogramming systems are usually multi-user: several users can be launching programs.

Multitasking
- Multiple jobs are executed at the "same time" … not really.
- The CPU is shared between processes – a time-sharing / time-slicing approach.
- Preemptive scheduling: the CPU is taken away from the running process when it makes a system call or when its time share has expired.
- Concurrent execution: the fact that several processes access resources at the "same time" requires specific protections.

Multithreading
- An extension of multitasking.
- A new notion exists in addition to processes: threads, a sort of lightweight process.
- A process always has a main thread but can create as many other threads as it wants.
- Every thread has its own stack but shares the other areas (data, code and heap) with the owning process.
- Threads are represented by a TCB – Thread Control Block.

Multiprocessing
- Multiprocessing happens when there are several CPUs in the computer.
- In contrast to multitasking, several processes really are running at the same time.
- Semantically, parallel execution – as happens in multiprocessing – doesn't require protections beyond those already used for concurrent programming (multitasking) – as far as software is concerned, at least.
Concurrency issues: e.g. Lost Update

[Diagram: two threads (T1, T2) of the same process, each with its own stack, sharing the Muller family account balance in memory.]

Scenario (the Muller family account holds 1000 CHF):
1. Mr Muller arrives at the bank desk and debits 200 CHF from his family account. At the same time, Mrs Muller debits 250 CHF from an ATM on the same family account.
2. The desk cash application reads the balance (1000 CHF) and computes the new balance (1000 – 200 = 800 CHF), but loses the CPU before writing it back.
3. The ATM application reads the balance (still 1000 CHF) and computes the new balance (1000 – 250 = 750 CHF), but loses the CPU before writing it back.
4. The desk cash application gets the CPU back and writes the new balance from its perspective: 800 CHF.
5. The ATM application gets the CPU back and writes the new balance from its perspective: 750 CHF. The 200 CHF debit is lost.

Main concurrency issues (data corruption):
- Lost updates (the example above)
- Dirty reads (or invalid data)
- Unrepeatable reads (or inconsistent retrievals / dirty summaries – more common in DBMS)
Concurrency Control

These kinds of problems are called race conditions.
The set of rules, methods, design methodologies, and theories used to maintain the consistency of components operating concurrently while interacting – and thus the consistency and correctness of the whole system – is called Concurrency Control.
Within Concurrency Control, the set of techniques deployed to prevent race conditions is called Mutual Exclusion.
- Mutual exclusion aims at avoiding simultaneous access to a memory address or object by multiple threads.
- In programming languages, mutual exclusion is often achieved using the critical section pattern.
- A critical section is a part of the program where a shared resource is accessed; it is protected in ways that prevent concurrent access.
Simple solution: locking

[Diagram: the same two threads, now serialized by a lock on the account.]

1. The desk cash application LOCKs the account, reads the balance (1000 CHF), and computes the new balance (1000 – 200 = 800 CHF), but loses the CPU before writing it back. The lock is owned by T1.
2. The ATM application tries to lock the account in its turn (TRY LOCK) but can't; it yields.
3. The desk cash application gets the CPU back, writes the new balance from its perspective (800 CHF), then UNLOCKs the account.
4. The ATM application LOCKs the account, reads the balance (800 CHF), and computes the new balance (800 – 250 = 550 CHF), but loses the CPU before writing it back. The lock is owned by T2.
5. The ATM application gets the CPU back, writes the new balance from its perspective (550 CHF), then UNLOCKs the account. This time the final balance is correct.
Kernel primitives

MUTEX
- A MUTEX is a mutually exclusive flag. It acts as a gatekeeper to a section of code, allowing one thread in and blocking access to all others.
- A thread can try to acquire() or wait() to acquire a MUTEX. Once it has it, another thread can't acquire it as long as the first thread doesn't release it.

Semaphores
- A semaphore is a record of how many units of a particular protected resource are available, coupled with operations to adjust that record safely as units are acquired or become free, and, if necessary, to wait until a unit of the resource becomes available.
- A semaphore with a resource count of 1 behaves as a MUTEX.
- If the resource count is 0, a thread calling P() is blocked until another thread signals the release of its resource with V(), which increments the resource count to 1 and wakes up the waiting thread.

[Diagram: threads T1 and T2 contend for a shared resource through a MUTEX (acquire() / wait(), release()) or for a protected resource list through a semaphore (P() – wait / decrease, V() – signal / increase).]
In real life …

Shared-lock / exclusive-lock model (also called the read-write lock model)
- A reading thread typically takes the lock with a shared flag, holding it in a shared way. Other threads that need to read can take the lock concurrently, also in a shared way, without being blocked.
- A writing thread takes the lock with an exclusive flag. As long as at least one reading thread holds a shared lock, the writing thread is blocked and prevented from taking its exclusive lock.
- As soon as all readers have completed, the writing thread can take its exclusive lock, and all reading threads are blocked until it has completed.

Monitor / protected object
- A monitor is most of the time a high-level wrapper around a MUTEX, providing additional conveniences.

Synchronized blocks / critical sections
- Programming-language constructs to perform synchronization without bothering with low-level locks or MUTEXes.

Other approaches: there are alternatives to locking techniques when it comes to concurrency control – optimistic locking, concurrent versions, etc.
Last notes on concurrency control

Typical problems:
- Starvation describes a situation where a thread is unable to gain regular access to shared resources and is unable to make progress. This happens when shared resources are made unavailable for long periods by "greedy" threads.
- Deadlock is a state in which each member of a group waits for another member, including itself, to take action, such as sending a message or, more commonly, releasing a lock.
- Livelock: a thread often acts in response to the action of another thread. If the other thread's action is also a response to the action of another thread, then livelock may result. As with deadlock, livelocked threads are unable to make further progress. However, the threads are not blocked – they are simply too busy responding to each other to resume work.

Notion of atomicity:
- An atomic section or operation is an indivisible and irreducible series of operations such that either all of them occur, or none occurs. A guarantee of atomicity prevents updates to a shared object from occurring only partially, which can cause greater problems than rejecting the whole series outright.
- Critical sections (or synchronized code) make it possible to render a set of instructions atomic.
Notion of socket

[Diagram: a client computer and a server computer, each with a network adapter, network device drivers, and network protocols in the kernel, connected through the network. A socket on each side exposes an IN byte stream and an OUT byte stream to the applications.]

- A network socket is a software structure that serves as an endpoint for sending and receiving data across a network.
- The structure and properties of a socket are defined by an API – Application Programming Interface – for the networking architecture.
- Sockets are created only during the lifetime of a process of an application running on the node.
- Because of the standardization of the TCP/IP protocols in the development of the Internet, the term network socket is most commonly used in the context of the IP – Internet Protocol – suite; it is hence also often called an Internet socket.
- In this context, a socket is externally identified to other hosts by its socket address: the triad of transport protocol, IP address, and port number.
- The term socket is also used for the software endpoint of node-internal inter-process communication (IPC), which often uses the same API as a network socket.
The TCP/IP protocol

[Diagram: the typical TCP call sequence between a client and a server.
- Server: socket() → bind() → listen() → accept(). accept() blocks until a connection from a client comes in.
- Client: socket() → connect(), which performs the connection establishment (TCP 3-way handshake).
- Once the TCP session is established, the client sends a request with sendto(); the server's recvfrom() blocks until some data is received from the client, the server does something with it and answers with sendto(); the client's recvfrom() blocks until the response arrives. Each data segment is acknowledged (ACK).
- The client finishes with close(); the server loops back into accept(), blocking until another connection from a client comes in.]
46. 46
There are hundreds of other
protocols but TPC/IP and
UDP/IP are the most common
The OSI Model - Open Systems
Interconnection Model - is a
conceptual framework used to
describe the functions of a
networking system.
It characterizes computing functions
into a universal set of rules and
requirements in order to support
interoperability between different
products and software.
Every upper protocol is
“wrapped” in lower protocols
datagrams
Standard OSI Model
7. Application
• End User Layer, Application Management
• HTTP, FTP, IRC, SSH, RCP, DNS, VoIP, VoD, etc.
6. Presentation
• Syntax Layer, Data Representation, Encryption, Compression
• SSL, SSH, IMAP, POP3, FTP, SNMP, VoIP, VoD, etc.
5. Session
• Sync, send/receive, Session Management, OS Infrastructure
• Sockets (WinSock, POSIX Sockets, APIs, etc.)
4. Transport
• End-to-End connection management, Transmission
• TCP, UDP, etc.
3. Network
• Packets, Logical Addressing, Routing
• IP, ICMP, ARP, IPSec, IPX/SPX, etc.
2. Data Link
• Physical Frames, Physical Addressing, Switching
• Ethernet, PPP, Wi-Fi, IEEE 802.2, etc.
1. Physical
• Physical Medium / Electric structure
• Coaxial, 10Base-T, Fiber, Ethernet RJ45, Token Ring, etc.
Layers 1-3 are the media layers (implemented by hardware and the kernel); layers 4-7 are the host layers (implemented by the kernel and applications).
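The “wrapping” of upper protocols in lower-protocol datagrams can be sketched with simplified mock headers (not the real TCP/IP wire formats): the transport layer prepends its header to the application payload, and the network layer prepends its header to that segment.

```python
# Sketch of encapsulation down the stack. Header layouts are invented
# simplifications, not real TCP or IP headers.
import socket
import struct

def tcp_wrap(payload, src_port, dst_port):
    # mock transport header: source port, destination port, payload length
    return struct.pack("!HHH", src_port, dst_port, len(payload)) + payload

def ip_wrap(segment, src_ip, dst_ip):
    # mock network header: 4-byte source and destination addresses
    return socket.inet_aton(src_ip) + socket.inet_aton(dst_ip) + segment

app_data = b"GET / HTTP/1.1"                         # layer 7 payload
segment = tcp_wrap(app_data, 51000, 80)              # layer 4 wraps it
packet = ip_wrap(segment, "192.0.2.1", "192.0.2.2")  # layer 3 wraps that
assert packet.endswith(app_data)                     # the payload is innermost
print(len(app_data), len(segment), len(packet))      # prints 14 20 28
```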
48. 48
A modern processor has two different modes: user mode and kernel mode.
The processor switches between the two modes depending on what type of code is running on the processor.
Applications run in user mode, and core operating system components run in kernel mode. While many drivers run in kernel mode, some drivers may
run in user mode.
The system starts in kernel mode when it boots; after the operating system is loaded, it executes applications in user mode.
When an application running in user mode makes a system call, the kernel executes the system call in kernel mode.
User mode
User mode is the normal mode, in which a process has limited access to memory, CPU functions and hardware.
The kernel provides applications with a private virtual address space and a private handle table.
Because an application's virtual address space is private, one application cannot alter data that belongs to another application. Each application
runs in isolation, and if an application crashes, the crash is limited to that one application. In addition to being private, the virtual address space of a
user-mode application is limited.
Some system-level operations of the CPU cannot be performed in user mode.
A transition from user mode to kernel mode occurs when the application makes a system call or an interrupt occurs.
Kernel Mode
If an operating system is properly designed, only kernel code runs on the CPU in kernel mode.
All code that runs in kernel mode shares a single virtual address space.
There are some privileged instructions that can only be executed in kernel mode.
In kernel mode, the kernel has unrestricted access to memory, hardware resources, all CPU functions, etc.
User mode vs Kernel Mode
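The user-to-kernel transition can be observed from any user-mode program: every I/O operation goes through a system call. As a sketch, Python's os.pipe, os.write and os.read are thin wrappers over the pipe(2), write(2) and read(2) syscalls, each of which traps the CPU into kernel mode and back:

```python
# Sketch: user-mode code cannot touch hardware or kernel buffers directly;
# it asks the kernel via system calls.
import os

r, w = os.pipe()            # pipe(2): the kernel creates the buffer
os.write(w, b"via kernel")  # write(2): user mode -> kernel mode -> user mode
data = os.read(r, 16)       # read(2): another user/kernel transition
os.close(r)
os.close(w)
print(data)                 # prints b'via kernel'
```

The data never moves from one user buffer to another directly; it passes through a kernel-owned buffer, which is why each step requires a mode switch.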
50. 50
Hardware
CPU, RAM, Disk, IO, buses, memory addresses and data
Fundamental CPU components, Fundamental physical RAM elements
Computer Fundamental notions
System calls and the Standard C library
Process memory layout in stack, heap, data and text
Dynamic allocation in heap
Behaviour of the stack
Kernel key features:
Virtual Memory
Process scheduling
System calls
Operating Systems fundamental notions:
Processes, PCB, Context Switches, Threads, TCB, simple priority based scheduling
Preemptive scheduling, multiprogramming, multitasking, multithreading, multiprocessing
Concurrent thread execution, parallel thread execution
Thread / Process synchronization, MUTEX, Critical Sections, Semaphores
Network socket, TCP/IP, UDP/IP, OSI Standard Model
Notions we learned
51. 51
Want to know more?
IPC – Inter-Process Communications
Process synchronization
Protection & Security – OS protections, kernel features, standard UNIX security model, SELinux
Mass-storage and filesystems design
More on system calls …
Framebuffer for video display - OpenGL / DirectX
Other essential notions
Data / address / control bus ???
Soundcard / DA / AD
GPU PCI bus (hardware support for OpenGL, Direct3D, CUDA, HLSL, framebuffer support, etc. ) GPU DMA (Direct memory access)
Memory management registers
MAR - Memory Address Register - holds the address that is emitted on the address bus so that the corresponding memory cell is activated.
MBR - Memory Buffer Register - contains the data to read or write.
CBR - Control Bus Register - emits the READ or WRITE command on the control bus.
Program execution management registers:
IP - Instruction Pointer - also called PC - Program Counter - points to the next instruction the CPU should execute.
SP – Stack Pointer – points to the last value pushed onto the stack
BP – Base Pointer - is the base pointer for the current stack frame.
CPU state management register:
- SR - Status Register - contains information about the state of the processor.
EBP is usually set to ESP at the start of a function. Function parameters and local variables are accessed by adding and subtracting, respectively, a constant offset from EBP.
ESP is the current stack pointer, which changes any time a word or address is pushed onto or popped off the stack. EBP gives the compiler a more convenient way to keep track of a function's parameters and local variables than using ESP directly.
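The fixed-offset idea can be sketched with a simulated call stack: once the prologue saves the old base pointer and sets EBP, parameters sit at fixed offsets on one side of EBP and locals on the other. This is a toy model (a Python list standing in for the stack), not a real frame layout:

```python
# Sketch: EBP-relative addressing on a simulated call stack.
# The list grows by append(), so "up" here is the opposite of x86 addresses.
stack = []

def call(params, n_locals):
    for p in reversed(params):   # caller pushes arguments right to left
        stack.append(p)
    stack.append("ret-addr")     # call pushes the return address
    stack.append("saved-ebp")    # prologue: push ebp
    ebp = len(stack) - 1         # prologue: mov ebp, esp
    stack.extend([0] * n_locals) # prologue: sub esp, n (room for locals)
    return ebp

ebp = call([10, 20], n_locals=2)
print(stack[ebp - 2])   # the first parameter ([ebp+8] in x86 terms): prints 10
print(stack[ebp + 1])   # the first local     ([ebp-4] in x86 terms): prints 0
```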
Data Registers:
AX is the primary accumulator; it is used in input/output and most arithmetic instructions. For example, in a multiplication operation, one operand is stored in the EAX, AX or AL register according to the size of the operand.
BX is known as the base register, as it could be used in indexed addressing.
CX is known as the count register; the ECX and CX registers store the loop count in iterative operations.
DX is known as the data register. It is also used in input/output operations, and together with AX for multiply and divide operations involving large values.
Index Registers:
SI - Source Index- is used as source index for string operations.
DI - Destination Index- is used as destination index for string operations.
Control Registers:
OF - Overflow Flag - indicates the overflow of a high-order bit (leftmost bit) of data after a signed arithmetic operation.
DF - Direction Flag - determines left or right direction for moving or comparing string data. When the DF value is 0, the string operation takes left-to-right direction and when the value is set to 1, the string operation takes right-to-left direction.
IF - Interrupt Flag - determines whether the external interrupts like keyboard entry, etc., are to be ignored or processed. It disables the external interrupt when the value is 0 and enables interrupts when set to 1.
TF - Trap Flag - allows setting the operation of the processor in single-step mode. The DEBUG program we used sets the trap flag, so we could step through the execution one instruction at a time.
SF - Sign Flag - shows the sign of the result of an arithmetic operation. This flag is set according to the sign of a data item following the arithmetic operation. The sign is indicated by the high-order (leftmost) bit. A positive result clears SF to 0 and a negative result sets it to 1.
ZF - Zero Flag - indicates the result of an arithmetic or comparison operation. A nonzero result clears the zero flag to 0, and a zero result sets it to 1.
AF - Auxiliary Carry Flag - contains the carry from bit 3 to bit 4 following an arithmetic operation; used for specialized arithmetic. The AF is set when a 1-byte arithmetic operation causes a carry from bit 3 into bit 4.
PF - Parity Flag - indicates the parity of the number of 1-bits in the result obtained from an arithmetic operation. An even number of 1-bits sets the parity flag to 1 and an odd number of 1-bits clears it to 0.
CF – Carry Flag - contains the carry of 0 or 1 from a high-order bit (leftmost) after an arithmetic operation. It also stores the contents of last bit of a shift or rotate operation.
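The flag rules above can be sketched as a toy model of an 8-bit addition (not a CPU emulator): the carry, zero, sign, overflow and parity flags are each computed from the truncated result.

```python
# Sketch: how ZF, SF, CF, OF and PF would be set after an 8-bit ADD.
def add8_flags(a, b):
    full = a + b                     # result before truncation
    res = full & 0xFF                # the 8-bit result that is kept
    cf = 1 if full > 0xFF else 0     # carry out of the high-order bit
    zf = 1 if res == 0 else 0        # result is zero
    sf = (res >> 7) & 1              # sign = high-order (leftmost) bit
    # OF: signed overflow - both operands share a sign the result lacks
    of = 1 if ((a ^ res) & (b ^ res) & 0x80) else 0
    pf = 1 if bin(res).count("1") % 2 == 0 else 0  # even number of 1-bits
    return {"res": res, "CF": cf, "ZF": zf, "SF": sf, "OF": of, "PF": pf}

print(add8_flags(0x7F, 0x01))  # 127 + 1 overflows signed: SF=1, OF=1, CF=0
print(add8_flags(0xFF, 0x01))  # 255 + 1 wraps to 0: ZF=1, CF=1
```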
Segment registers:
CS - Code Segment - contains all the instructions to be executed. A 16-bit Code Segment register or CS register stores the starting address of the code segment.
DS - Data Segment - contains data, constants and work areas. A 16-bit Data Segment register or DS register stores the starting address of the data segment.
SS - Stack Segment - contains data and return addresses of procedures or subroutines. It is implemented as a 'stack' data structure. The Stack Segment register or SS register stores the starting address of the stack.
Components:
ALU – Arithmetic logic unit - is a combinational digital circuit that performs arithmetic and bitwise operations on integer binary numbers
FPU - Floating-point unit - is specially designed to carry out operations on floating-point numbers. Typical operations are addition, subtraction, multiplication, division, and square root. Some FPUs can also perform various transcendental functions such as exponential or trigonometric calculations.
AGU – Address Generation Unit - sometimes also called address computation unit (ACU) is an execution unit inside the CPU that calculates addresses used by the CPU to access main memory. By having address calculations handled by separate circuitry that operates in parallel with the rest of the CPU, the number of CPU cycles required for executing various machine instructions can be reduced, bringing performance improvements.
MMU - Memory Management Unit - sometimes called paged memory management unit (PMMU) has all memory references passed through itself, primarily performing the translation of virtual memory addresses to physical addresses. An MMU effectively performs virtual memory management, handling at the same time memory protection, cache control, and bus arbitration.
CU - Control Unit - directs the operation of the processor. It tells the computer's memory, arithmetic and logic unit and input and output devices how to respond to the instructions that have been sent to the processor. It directs the operation of the other units by providing timing and control signals. Most computer resources are managed by the CU.
Control Unit Components
IR - Instruction register - or current instruction register (CIR) holds the instruction currently being executed or decoded. In simple processors, each instruction to be executed is loaded into the instruction register, which holds it while it is decoded, prepared and ultimately executed, which can take several steps. Modern processors use a pipeline of instruction registers where each stage of the pipeline does part of the decoding, preparation or execution and then passes it to the next stage for its step. Modern processors can even do some of the steps out of order as decoding on several instructions is done in parallel.
Clock - times how fast the majority of the logic in the processor operates, i.e. how many state changes the computer can perform in a second.
Instruction Decoder / Cycle Encoder - decodes and interprets the contents of the Instruction Register, i.e. it splits the whole instruction into fields for the Control Unit to interpret.
Control Logic Circuits - create the control signals themselves, which are then sent around the processor. These signals tell the other components (ALU, MMU, etc.) and the register array what actions and steps they should perform, what data they should use to perform them, and what should be done with the results.
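The interplay of the PC/IP, the instruction register, the decoder and the control logic can be sketched as a toy fetch-decode-execute loop. The instruction set here is invented for illustration; it is not any real ISA:

```python
# Sketch: a toy fetch-decode-execute cycle.
def run(program):
    regs = {"AX": 0, "BX": 0}
    pc = 0                            # instruction pointer / program counter
    while pc < len(program):
        ir = program[pc]              # fetch into the instruction register
        pc += 1                       # PC now points to the next instruction
        op, *fields = ir              # decoder: split the instruction into fields
        if op == "MOV":               # control logic drives the matching unit
            reg, value = fields
            regs[reg] = value
        elif op == "ADD":
            dst, src = fields
            regs[dst] += regs[src]    # the step the ALU would execute
        elif op == "HLT":
            break
    return regs

print(run([("MOV", "AX", 2), ("MOV", "BX", 3), ("ADD", "AX", "BX"), ("HLT",)]))
# prints {'AX': 5, 'BX': 3}
```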