2. Traditional Win32 Processes
A process is the set of resources (system libraries
and primary thread) and the memory allocations
used by a running application.
For each *.exe loaded into memory, the OS creates
a separate and isolated process
The failure of one process does not affect the
functioning of another
Every Win32 process is assigned a unique Process
Identifier or PID
3. Overview of threads
Every Win32 process has at least one thread, the
main thread, which functions as the entry point for
the application
A thread is a path of execution within a process
The first thread, created at the process entry point
(Main()), is termed the 'primary thread'
The primary thread can 'spawn' additional
secondary threads using Win32 API functions like
CreateThread()
4. Overview of threads
Each thread, primary or secondary, is a unique
path of execution in the process and has
concurrent access to all shared points of data
Using too many threads in a process, in a single
CPU system, may actually DEGRADE
performance since the CPU has to switch between
the threads
Single CPU systems use 'time slices' to service
each thread for a unit of time. Windows provides
'thread local storage' so that each thread can keep
its own state between time slices
If a process does not have any foreground threads,
the process ends, even if there are active
background threads
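The foreground/background distinction is set per thread through the IsBackground property. A minimal sketch, assuming a hypothetical helper class named BackgroundThreadSketch (not from the slides):

```csharp
using System.Threading;

public static class BackgroundThreadSketch
{
    // Creates a thread, marks it as background or foreground, runs it to
    // completion, and returns the flag as it was configured.
    public static bool Run(bool background)
    {
        var worker = new Thread(() => Thread.Sleep(10));
        worker.IsBackground = background;   // the default is false (foreground)
        bool configured = worker.IsBackground;
        worker.Start();
        worker.Join();                      // wait so the sketch exits cleanly
        return configured;
    }
}
```

A thread left with IsBackground = false keeps the process alive until it finishes. A background thread is simply stopped when the last foreground thread ends.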
6. Asynchronous delegates
In .NET, the usual pattern for implementing an
asynchronous method call is for some object to
expose two methods, BeginXXX() and EndXXX(),
where XXX is the name of the method
BeginXXX() is the method that is called to start the
operation. It returns immediately, with the operation
left executing, typically on a thread pool thread
EndXXX() is called when the results are required. If
the operation is still executing, EndXXX() waits
until it is completed before returning the values
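The Begin/End pairing can be sketched with FileStream.BeginRead() and EndRead(). The class name BeginEndSketch, the temp file, and the buffer size are assumptions made for illustration only:

```csharp
using System;
using System.IO;
using System.Text;

public static class BeginEndSketch
{
    // Writes a small temp file, then reads it back with BeginRead()/EndRead().
    public static string ReadBack(string text)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, text, Encoding.ASCII);
            byte[] buffer = new byte[256];
            using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                               FileShare.Read, 4096, true))
            {
                // BeginRead() returns immediately with a handle to the pending read.
                IAsyncResult pending = stream.BeginRead(buffer, 0, buffer.Length, null, null);

                // ... other work could run here while the read is in flight ...

                // EndRead() blocks until the read finishes and returns the byte count.
                int bytesRead = stream.EndRead(pending);
                return Encoding.ASCII.GetString(buffer, 0, bytesRead);
            }
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Each BeginXXX() call should be matched by exactly one EndXXX() call, which is also where any exception raised by the operation surfaces.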
7. Asynchronous delegate design
pattern
Some of the .NET classes that inherently
implement this pattern:
System.IO.FileStream (BeginRead() / EndRead())
System.Net.WebRequest (BeginGetResponse() /
EndGetResponse())
System.Windows.Forms.Control (BeginInvoke() /
EndInvoke())
System.Messaging.MessageQueue
(BeginReceive() / EndReceive())
8. Asynchronous delegate design
pattern
You can asynchronously invoke any method in .NET
by wrapping it in a delegate
The compiler generates BeginInvoke() and
EndInvoke() methods for every delegate type
We will use the IAsyncResult interface, which has 4
important properties:
AsyncState is the data passed to the callback method
AsyncWaitHandle is a WaitHandle that is signaled
when the operation completes
CompletedSynchronously is a boolean (completed
synchronously on the calling thread?)
IsCompleted is a boolean (operation completed?)
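The IAsyncResult properties can be observed on any Begin/End operation. The sketch below uses FileStream.BeginRead() rather than a delegate's BeginInvoke(), because delegate BeginInvoke() is only supported on the .NET Framework and throws PlatformNotSupportedException on .NET Core and later. The class name AsyncResultSketch and the state string are hypothetical:

```csharp
using System;
using System.IO;
using System.Threading;

public static class AsyncResultSketch
{
    // Returns "<AsyncState seen in callback>|<IsCompleted after the wait>".
    public static string Inspect()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, new byte[] { 1, 2, 3 });
            string seenState = null;
            bool completed;
            using (var callbackDone = new ManualResetEvent(false))
            using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                               FileShare.Read, 4096, true))
            {
                byte[] buffer = new byte[16];
                IAsyncResult ar = stream.BeginRead(buffer, 0, buffer.Length, result =>
                {
                    // AsyncState carries the last argument passed to BeginRead().
                    seenState = (string)result.AsyncState;
                    callbackDone.Set();
                }, "my-state");

                ar.AsyncWaitHandle.WaitOne();   // signaled when the read completes
                completed = ar.IsCompleted;     // true once the wait has returned
                stream.EndRead(ar);             // collect the result
                callbackDone.WaitOne();         // make sure the callback has run
            }
            return string.Format("{0}|{1}", seenState, completed);
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

CompletedSynchronously is not checked here. It mainly matters to callback code that needs to know whether it is running on the thread that called BeginXXX().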
9. CLR Threads
Currently each logical CLR thread uses one
physical Windows thread
In future the CLR may have its own threads,
independent of the Windows threads
So, .NET programmers should use CLR threads
and not Windows threads
CLR threads can either be created explicitly using
the 'new Thread()' constructor, or implicitly (thread
pool) when we invoke asynchronous operations
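Creating a CLR thread explicitly looks roughly like this. The class name ExplicitThreadSketch and the thread name are hypothetical:

```csharp
using System.Threading;

public static class ExplicitThreadSketch
{
    // Sums 1..n on a dedicated thread and returns the result after Join().
    public static int SumOnNewThread(int n)
    {
        int sum = 0;
        var worker = new Thread(() =>
        {
            for (int i = 1; i <= n; i++) sum += i;
        });
        worker.Name = "sum-worker";   // hypothetical name, handy in a debugger
        worker.Start();               // the secondary thread begins executing
        worker.Join();                // block until the secondary thread finishes
        return sum;
    }
}
```

For example, ExplicitThreadSketch.SumOnNewThread(10) returns 55. The result is read only after Join(), so no further synchronization is needed in this sketch.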
10. CLR Threads
Some processes also use multiple threads for
isolation. For example, the common language
runtime (CLR) has a finalizer thread that wants to
run in a predictable manner regardless of what
some other thread happens to do.
11. History of Windows threads
16 bit versions of Windows were single threaded,
and if one application went into a loop, the entire
system froze
Windows NT 3.1 was the first multi threaded
Windows OS. Each process got its own thread, and
if that process looped, only that process froze
while other processes kept running
12. Efficiency of threads
Threads are an overhead
For each thread, a thread kernel object has to be
allocated and initialized
Creation of each thread reserves 1 MB of address
space for its user mode stack and about another
12 KB for its kernel mode stack
After creating a thread, Windows notifies every
DLL in the process about this new thread
When a thread is destroyed, every DLL is again
notified
13. Efficiency of threads
In a single CPU computer only one thread can run
at a time
So, in single CPU systems, Windows switches to
another thread roughly every 20 milliseconds
This switching is called a 'context switch'
All this overhead makes Windows slower than it
would be if it ran only a single thread
14. Steps in Context switching
Enter kernel mode
Save CPU registers in current thread's kernel
object
Acquire 'spin lock'
Determine which thread to switch to
Release 'spin lock'
Load CPU registers from new thread's kernel
object
Leave kernel mode
15. Moral of story
Limit usage of threads especially on single CPU
systems
Threading on single CPU systems only makes
systems slower due to context switching, and also
takes up more memory for thread maintenance
However, as we begin to use multiple CPU chips
we may have to use threading to extract better
performance
Ideally speaking, there should never be more
threads in existence than there are CPUs in your
computer
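One way to apply this guideline is to cap the number of compute-bound threads at the processor count reported by the runtime. A minimal sketch, where the class and method names are hypothetical:

```csharp
using System;

public static class ThreadBudgetSketch
{
    // Caps a requested thread count at the number of logical processors.
    public static int CapToProcessors(int requested)
    {
        int cpus = Environment.ProcessorCount;   // logical CPUs visible to this process
        return Math.Max(1, Math.Min(requested, cpus));
    }
}
```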
16. Hyper threading and Multi core
Chip makers use hyper threading and multi core as
2 manufacturing techniques
Hyper threading (Intel Xeon and Intel Pentium 4)
has 2 logical CPUs on a single chip
Each logical CPU in hyper threading has its own
CPU registers but shares a CPU cache between the
2 CPUs
Hyper threaded CPUs give a 10 to 30% boost to
performance (not 100%)
17. Multi core
A multi core chip (Intel Pentium D, AMD Athlon
64 X2) has two physical CPUs on it.
Better performance compared to hyper threaded
chips since each CPU has dedicated CPU registers
and CPU cache
In future, chips will come with 4, 8, 16, or even 32
CPUs in them. This is because chips have reached
the limit of their clock speed. The only way to grow
is to have more CPUs per chip.
18. CLR thread pool
Since creating and destroying threads is expensive,
the CLR maintains a thread pool that it uses when
we program asynchronous operations.
One thread pool per process, for all AppDomains
in process
There is a thread pool queue, and if there are no
threads in the pool, the CLR creates one
The CLR reuses the same thread for all requests
until the load crosses some limit. Then another
thread is added to the pool
If a thread pool thread is idle for about 2 minutes,
it is killed.
Thread pool threads are all background threads
19. When to create dedicated thread
If you want the thread to be in a particular state
that a thread pool thread is not normally in
If you want to run at a special priority
If you wanted a foreground thread so that process
does not end till this thread ends
If the compute bound thread would be very long
running
If you wanted to abort it prematurely
20. Thread pool limit
Thread pool has 'worker threads' and 'I/O threads'
Worker threads are used when application asks
thread pool to perform asynchronous compute
bound operation
I/O threads are used to access a file, network
server, database, web service, or other hardware
device.
In .NET 2.0, the default maximum number of
worker threads is 25 per CPU, and the default
maximum number of I/O threads is 1000.
Try to avoid having a worker thread block on an
I/O operation, since the worker thread stays
suspended until the I/O is over
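The limits actually in effect differ between runtime versions, so they are best queried rather than assumed. A minimal sketch using ThreadPool.GetMaxThreads(), where the class name PoolLimitsSketch is hypothetical:

```csharp
using System.Threading;

public static class PoolLimitsSketch
{
    // Returns { max worker threads, max I/O completion threads } for this process.
    public static int[] ReadLimits()
    {
        int workerThreads, ioThreads;
        ThreadPool.GetMaxThreads(out workerThreads, out ioThreads);
        return new[] { workerThreads, ioThreads };
    }
}
```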
21. Asynchronous operations
To queue an asynchronous compute bound
operation to the thread pool
static Boolean QueueUserWorkItem(WaitCallback
callBack);
static Boolean QueueUserWorkItem(WaitCallback
callBack, Object state);
static Boolean
UnsafeQueueUserWorkItem(WaitCallback
callBack, Object state);
A 'work item' is the method identified by the
callBack parameter that will be called by a
thread pool thread
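Queuing a work item with a state argument might look like the following sketch. The class name WorkItemSketch and the squaring workload are assumptions for illustration:

```csharp
using System.Threading;

public static class WorkItemSketch
{
    // Queues a compute-bound work item to the pool and waits for its result.
    public static int SquareOnPool(int value)
    {
        int result = 0;
        using (var done = new ManualResetEvent(false))
        {
            // The second argument arrives in the callback as 'state'.
            ThreadPool.QueueUserWorkItem(state =>
            {
                int n = (int)state;
                result = n * n;
                done.Set();           // tell the caller the work item has finished
            }, value);
            done.WaitOne();
        }
        return result;
    }
}
```

QueueUserWorkItem() gives no handle to the queued work, so the sketch uses a ManualResetEvent to learn when the callback has run.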
22. System.Threading.Timer
When you construct an instance of the Timer class,
you are telling the CLR that you want a method of
yours called back at a specified time by a Thread
pool thread
One of the Timer constructors is
public Timer(TimerCallback callback, Object
state, Int32 dueTime, Int32 period)
The callback parameter is the method that the
thread pool thread calls each time the timer fires
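A minimal sketch of this constructor in use, with dueTime and period given in milliseconds. The class name TimerSketch and the 20 ms period are hypothetical choices:

```csharp
using System.Threading;

public static class TimerSketch
{
    // Lets the timer fire until 'ticksWanted' callbacks have run, then stops it.
    // Returns the number of callbacks observed (at least ticksWanted).
    public static int CountTicks(int ticksWanted)
    {
        int ticks = 0;
        using (var done = new ManualResetEvent(false))
        using (new Timer(state =>
        {
            // Runs on a thread pool thread each time the timer fires.
            if (Interlocked.Increment(ref ticks) == ticksWanted) done.Set();
        }, null, 0, 20))              // dueTime = 0 ms (fire at once), period = 20 ms
        {
            done.WaitOne();
        }                             // disposing the timer stops further callbacks
        return Volatile.Read(ref ticks);
    }
}
```

The Timer object must stay referenced for as long as callbacks are wanted. Here the using block keeps it alive.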
23. Three timers in .NET
System.Threading.Timer performs periodic
background tasks on another thread
System.Windows.Forms.Timer wakes up and sends
messages to the desired callback method.
System.Timers.Timer is used if you want to place a
timer on a design surface. Essentially the same as
System.Threading.Timer.
24. Deadlocks
A deadlock is a situation wherein two or more
competing actions are waiting for the other to
finish, and thus neither ever does. It is often seen
in a paradox like 'the chicken or the egg'.
25. Livelocks
As a real-world example, livelock occurs when
two people meet in a narrow corridor, and each
tries to be polite by moving aside to let the other
pass, but they end up swaying from side to side
without making any progress because they always
both move the same way at the same time.
26. Thread Synchronization
Thread synchronization is required when two or
more threads might access a shared resource at the
same time
A resource can be as simple as a block of memory
or a single object, or it can be much more
complex, like a collection object that contains
thousands of objects inside it, each of which may
contain other objects as well
27. Race conditions
Thread T1 modifies resource R, releases its Write
lock to R, retakes the Read lock to R and uses R.
During the interval between giving up the write
lock and taking the read lock, thread T2 has
modified the state of R.
28. CPU Cache latency
CPUs use caches to improve performance. However,
a value written by one thread may sit in a CPU
cache or register before it reaches main memory.
This can make multiple threads think that a field
has different values at the same time.
Fields marked 'volatile' overcome this problem.
Microsoft's latest JIT compilers are also said to
overcome it even when the volatile keyword is not
used.
29. System.Threading.Interlocked
Since many asynchronous operations share
integer variables, the Interlocked class provides
Increment(ref varName), Decrement(ref
varName), and Add(ref varName, value) static
methods that work in a thread safe manner
It also has Exchange() and CompareExchange()
methods to exchange states
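A sketch of both uses follows: a shared counter incremented from several threads, and a one-time claim built on CompareExchange(). The class and method names are hypothetical:

```csharp
using System.Threading;

public static class InterlockedSketch
{
    // Runs 'threads' threads that each increment a shared counter 'perThread' times.
    public static int CountTo(int threads, int perThread)
    {
        int counter = 0;
        var workers = new Thread[threads];
        for (int t = 0; t < threads; t++)
        {
            workers[t] = new Thread(() =>
            {
                for (int i = 0; i < perThread; i++)
                    Interlocked.Increment(ref counter);   // atomic read-modify-write
            });
            workers[t].Start();
        }
        foreach (var w in workers) w.Join();
        return counter;
    }

    // Flips 'flag' from 0 to 1. Returns true only for the caller that made the change.
    public static bool TryClaim(ref int flag)
    {
        return Interlocked.CompareExchange(ref flag, 1, 0) == 0;
    }
}
```

With a plain counter++ in place of Interlocked.Increment(), some increments could be lost when two threads read the same old value.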
30. System.Threading.Monitor class
Wrap the critical section of code between
Monitor.Enter(Object) and Monitor.Exit(Object)
calls
When a thread calls the Enter() method it waits to
gain exclusive access rights to the object
When it calls Exit(), the next waiting call to Enter()
is serviced
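A minimal sketch of the Enter()/Exit() pairing, assuming a hypothetical MonitorCounter class. The try/finally is there so the lock is released even if the critical section throws:

```csharp
using System.Threading;

public class MonitorCounter
{
    private readonly object gate = new object();
    private int value;

    public void Add(int amount)
    {
        Monitor.Enter(gate);          // blocks until this thread owns the lock
        try
        {
            value += amount;          // critical section
        }
        finally
        {
            Monitor.Exit(gate);       // always release, even if the body throws
        }
    }

    public int Read()
    {
        Monitor.Enter(gate);
        try { return value; }
        finally { Monitor.Exit(gate); }
    }
}
```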
31. The lock C# keyword
An elegant alternative to Monitor.Enter() and
Monitor.Exit()
Syntax is lock (typeof(ClassName)) { code that
needs to be thread safe }
32. SyncRoot pattern
Since Monitor and lock can be applied to an object
from outside its class, interfering with the class's
own locking, it is better to create a private member
within the class, and lock that:
private object instanceSyncRoot = new Object();
lock (instanceSyncRoot) { code that needs to be
thread safe }
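Put together, the pattern might look like this sketch. SafeLog and its members are hypothetical names, and only the instanceSyncRoot field comes from the slide:

```csharp
using System.Collections.Generic;

public class SafeLog
{
    // Private lock object: code outside this class cannot lock on it.
    private readonly object instanceSyncRoot = new object();
    private readonly List<string> lines = new List<string>();

    public void Append(string line)
    {
        lock (instanceSyncRoot)       // expands to Monitor.Enter/Exit in try/finally
        {
            lines.Add(line);
        }
    }

    public int Count
    {
        get { lock (instanceSyncRoot) { return lines.Count; } }
    }
}
```

Locking on a Type object or on this exposes the lock to any other code that can reach the same object, which the private field avoids.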
33. Mutex (Win32 Thread lock
mechanism)
Mutually Exclusive lock
Close to Monitor, with a few differences: the same
mutex can be shared by several processes, and
unlike Monitor, a Mutex can be waited on together
with several other objects
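A named mutex is what makes cross-process locking possible. The sketch below is one hedged way to wrap it. The class name MutexSketch, the 5 second timeout, and the mutex name used by callers are assumptions:

```csharp
using System;
using System.Threading;

public static class MutexSketch
{
    // Runs 'action' while holding a named mutex. Returns false on timeout.
    public static bool RunExclusive(string mutexName, Action action)
    {
        using (var mutex = new Mutex(false, mutexName))
        {
            bool owned = false;
            try
            {
                try
                {
                    owned = mutex.WaitOne(TimeSpan.FromSeconds(5));
                }
                catch (AbandonedMutexException)
                {
                    owned = true;     // previous owner exited without releasing
                }
                if (!owned) return false;
                action();
                return true;
            }
            finally
            {
                if (owned) mutex.ReleaseMutex();   // must run on the acquiring thread
            }
        }
    }
}
```

Two processes that pass the same name share one kernel object, so only one of them runs its action at a time.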
34. Semaphore (Win32 locking)
Similar to Mutex but uses a counter to keep track
of how many threads are accessing a particular
resource. So it allows a certain number of threads
to access a resource simultaneously
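The counting behavior can be sketched with System.Threading.Semaphore. The class name SemaphoreSketch and the 20 ms simulated workload are hypothetical:

```csharp
using System.Threading;

public static class SemaphoreSketch
{
    // Starts 'threads' workers but lets at most 'limit' of them hold the
    // resource at once. Returns the highest concurrency actually observed.
    public static int MaxObserved(int threads, int limit)
    {
        int current = 0, max = 0;
        object gate = new object();
        using (var slots = new Semaphore(limit, limit))
        {
            var workers = new Thread[threads];
            for (int t = 0; t < threads; t++)
            {
                workers[t] = new Thread(() =>
                {
                    slots.WaitOne();                 // take a slot (count goes down)
                    try
                    {
                        lock (gate) { current++; if (current > max) max = current; }
                        Thread.Sleep(20);            // simulate using the resource
                        lock (gate) { current--; }
                    }
                    finally
                    {
                        slots.Release();             // give the slot back (count goes up)
                    }
                });
                workers[t].Start();
            }
            foreach (var w in workers) w.Join();
        }
        return max;
    }
}
```

With limit set to 1 a semaphore behaves much like a mutex, except that any thread may release it.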
35. Windows kernel objects for thread
synchronization
The CLR exposes Win32 kernel objects for thread
synchronization. However, these are best avoided
where possible, since the transition from managed
to unmanaged (kernel) code is slow
WaitHandle
Mutex
Semaphore
EventWaitHandle
AutoResetEvent
ManualResetEvent
36. Events
Use ThreadPool.RegisterWaitForSingleObject() to
have a thread pool thread call your callback
method when a kernel object becomes signaled
Microsoft realized that many threads are spawned
just to wait on other threads. Registered waits are
meant to handle this kind of case.
RegisterWaitForSingleObject() can act on a
Semaphore, a Mutex, an AutoResetEvent or a
ManualResetEvent object
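A minimal sketch of a registered wait on an AutoResetEvent. The class name RegisteredWaitSketch and the 5000 ms timeout are hypothetical:

```csharp
using System.Threading;

public static class RegisteredWaitSketch
{
    // Returns true if the callback ran because the event was signaled
    // rather than because the wait timed out.
    public static bool WaitViaPool()
    {
        bool signaled = false;
        using (var trigger = new AutoResetEvent(false))
        using (var callbackDone = new ManualResetEvent(false))
        {
            RegisteredWaitHandle registration = ThreadPool.RegisterWaitForSingleObject(
                trigger,
                (state, timedOut) => { signaled = !timedOut; callbackDone.Set(); },
                null,      // state passed to the callback
                5000,      // timeout in milliseconds
                true);     // executeOnlyOnce

            trigger.Set();                 // signal the kernel object
            callbackDone.WaitOne();        // wait for the pool thread's callback
            registration.Unregister(null); // cancel the registration
        }
        return signaled;
    }
}
```

No thread is dedicated to the wait itself. The pool only supplies a thread once the object is signaled or the timeout elapses.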
37. Thread synchronization
Adding thread synchronization to your code makes
the code run slower, hurting performance and
reducing scalability
Writing thread synchronization code is difficult,
and doing it incorrectly can lead to resources in
inconsistent states causing unpredictable behavior