This document discusses parallel and asynchronous programming. It begins by explaining how processors are getting smaller while networks are getting worse, requiring more efficient parallel programming approaches. It then covers different parallel programming models in .NET like data parallelism using PLINQ, task parallelism using TPL, asynchronous programming with async/await, and concurrent collections. It also discusses challenges like cancellation, progress reporting, and synchronization, and how modern .NET addresses these.
3. • Processors are getting smaller
• Networks are getting worse
• Operating Systems demand it
• Only a subset of the code can run in parallel
Why
4. • Once, a single-thread process could use 100%
of the CPU
• 16% MAX on a quad-core LAPTOP with
HyperThreading
• 8% MAX on an 8-core server
Processors are getting smaller
5. • Hand-coded threads and synchronization
• BackgroundWorker
Heavy, cumbersome, single threaded, inadequate progress reporting
• EAP: From event to event
Complicated, loss of continuity
• APM: BeginXXX/EndXXX
Cumbersome, imagine socket programming with Begin/End!
or rather ...
What we used to have
7. • Collisions
Reduced throughput
Deadlocks
• Solution: Limit the number of threads
ThreadPools
Extreme: Stackless Python
Copy data instead of shared access
Extreme: Immutable programming
The problem with threads
8. • How can I speed-up my algorithm?
• Which parts can run in parallel?
• How can I partition my data?
Why should I care about
threads?
10. • Beat the yolks with 2/3 of sugar until fluffy
• Beat the whites with 1/3 of sugar to stiff meringue
• and add half the mixture to the yolk mixture.
• Mix semolina with flour and ground coconut ,
• add rest of meringue and mix
• Mix and pour in cake pan
• Bake in pre-heated oven at 170°C for 20-25 mins.
• Allow to cool
• Prepare syrup, boil water, sugar, lemon for 3 mins.
• Pour warm syrup over revani
• Sprinkle with ground coconut.
Synchronous Revani
11. Parallel Revani
• Beat yolks • Beat Whites
• Add half mixture
• Mix semolina
• Add rest of meringue
• Mix
• Pour in cake pan
• Pour syrup
• Sprinkle
• Bake • Prepare syrup
12. • Support for multiple concurrency scenarios
• Overall improvements in threading
• Highly Concurrent collections
What we have now
13. Scenarios
• Faster processing of large data
• Number crunching
• Execute long operations
• Serve high volume of requests
• Social Sites, Web sites, Billing, Log aggregators
• Tasks with frequent blocking
• REST clients, IT management apps
15. • Partition the data
• Implement the algorithm in a function
• TPL creates the necessary tasks
• The tasks are assigned to threads
• I DON’T have to define the number of
Tasks/Threads!
Data Parallelism – Recipe
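The recipe above can be sketched roughly as follows. This is a minimal illustration, not code from the talk: the class name `DataParallelismSketch` and the sum-of-squares workload are made up for the example. The point is that `Parallel.For` partitions the index range and schedules the work without any thread or task count being specified.

```csharp
using System;
using System.Threading.Tasks;

public static class DataParallelismSketch
{
    // The "algorithm in a function": a CPU-bound computation for one element.
    // (Hypothetical workload chosen only for illustration.)
    public static long SumOfSquaresUpTo(int n)
    {
        long sum = 0;
        for (int i = 1; i <= n; i++) sum += (long)i * i;
        return sum;
    }

    // Parallel.For partitions the range [0, inputs.Length) and assigns
    // the iterations to worker threads; no thread count is given here.
    public static long[] ProcessAll(int[] inputs)
    {
        var results = new long[inputs.Length];
        Parallel.For(0, inputs.Length, i =>
        {
            // Each iteration writes only to its own slot, so no locking is needed.
            results[i] = SumOfSquaresUpTo(inputs[i]);
        });
        return results;
    }
}
```

Calling `DataParallelismSketch.ProcessAll(new[] { 1, 2, 3 })` blocks until every element has been processed.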
17. • Parallel execution of lambdas
• Blocking calls!
• We specify
Cancellation Token
Maximum number of Threads
Task Scheduler
Parallel class Methods
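A short sketch of the three settings listed above, passed through `ParallelOptions`. The class name, the even-number workload and the limit of two threads are assumptions made for the example.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelOptionsSketch
{
    public static int CountEven(int[] numbers, CancellationToken token)
    {
        var options = new ParallelOptions
        {
            CancellationToken = token,              // lets the caller stop the loop
            MaxDegreeOfParallelism = 2,             // arbitrary cap for this example
            TaskScheduler = TaskScheduler.Default   // the thread-pool scheduler
        };

        int evens = 0;
        // Blocking call: returns only when all items are processed, or throws
        // OperationCanceledException if the token is cancelled.
        Parallel.ForEach(numbers, options, n =>
        {
            if (n % 2 == 0) Interlocked.Increment(ref evens);
        });
        return evens;
    }
}
```

The shared counter is updated with `Interlocked.Increment` because the lambda may run on several threads at once.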
18. • LINQ Queries
• Potentially multiple threads
• Parallel operators
• Unordered results
• Beware of races
List<int> list = new List<int>();
var q = src.AsParallel()
    // RACE: List<T>.Add is not thread-safe, yet this lambda may run on several threads
    .Select(x => { list.Add(x); return x; })
    .Where(x => true)
    .Take(100);
PLINQ
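One way to avoid the race shown above is to keep the query free of side effects and let PLINQ build the result. The sketch below is an assumption about what a safe variant could look like; the squaring step and the names are illustrative only. `AsOrdered` is added because PLINQ results are otherwise unordered.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class PlinqSketch
{
    public static List<int> FirstHundredSquares(IEnumerable<int> src)
    {
        return src.AsParallel()
                  .AsOrdered()          // preserve source order (costs some speed)
                  .Select(x => x * x)   // pure function: no shared state is touched
                  .Take(100)
                  .ToList();            // PLINQ merges the partial results safely
    }
}
```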
19. • Doesn’t use SSE instructions
• Doesn’t use the GPU
• Isn’t using the CPU at 100%
What it can’t do
22. • Break the problem into steps
• Convert each step to a function
• Combine steps with Continuations
• TPL assigns tasks to threads as needed
• I DON’T have to define number of
Tasks/Threads!
• Cancellation of the entire task chain
Task Parallelism – Recipe
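The recipe can be sketched with `ContinueWith`. The three steps below (double, add one, format) are placeholders invented for the example; passing the same token to every step is what allows the whole chain to be cancelled.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class TaskChainSketch
{
    public static Task<string> RunPipeline(int input, CancellationToken token)
    {
        return Task.Run(() => input * 2, token)                     // step 1
            .ContinueWith(t => t.Result + 1, token,                 // step 2
                          TaskContinuationOptions.OnlyOnRanToCompletion,
                          TaskScheduler.Default)
            .ContinueWith(t => "Result: " + t.Result, token,        // step 3
                          TaskContinuationOptions.OnlyOnRanToCompletion,
                          TaskScheduler.Default);
    }
}
```

With `OnlyOnRanToCompletion`, a failed or cancelled step cancels the steps after it instead of running them.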
24. • Problem: How do you cancel multiple tasks
without leaving trash behind?
• Solution: Everyone monitors a
CancellationToken
TPL cancels subsequent Tasks or Parallel operations
Created by a CancellationTokenSource
Can execute code when Cancel is called
Cancellation
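A minimal sketch of cooperative cancellation, assuming a simple loop as the work. The method name and the "cancel after N items" trigger are made up so the example is deterministic; in real code another thread or a UI button would call `Cancel`.

```csharp
using System;
using System.Threading;

public static class CancellationSketch
{
    // Returns how many items were processed before cancellation was observed.
    public static int ProcessUntilCancelled(int itemCount, int cancelAfterItems)
    {
        using (var cts = new CancellationTokenSource())
        {
            bool cleanedUp = false;
            // Code that runs when Cancel is called, e.g. to release resources.
            cts.Token.Register(() => cleanedUp = true);

            int processed = 0;
            try
            {
                for (int i = 0; i < itemCount; i++)
                {
                    // The worker monitors the token between units of work.
                    cts.Token.ThrowIfCancellationRequested();
                    processed++;
                    if (processed == cancelAfterItems) cts.Cancel();
                }
            }
            catch (OperationCanceledException)
            {
                Console.WriteLine("Cancelled; cleanup callback ran: " + cleanedUp);
            }
            return processed;
        }
    }
}
```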
25. • Problem: How do you update the UI from inside
a task?
• Solution: Using an IProgress<T> object
Out-of-the-box Progress<T> posts updates to the captured SynchronizationContext
Any type can be a message
Replace with our own implementation
Progress Reporting
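To illustrate "replace with our own implementation", here is a sketch of a custom `IProgress<int>` that records reports in a list. `ListProgress` and `CopyBlocks` are hypothetical names; the worker depends only on the interface, so the built-in `Progress<T>` could be passed instead in a UI application.

```csharp
using System;
using System.Collections.Generic;

// A hand-written IProgress<T>: collects the reported values.
public sealed class ListProgress : IProgress<int>
{
    public List<int> Reports { get; } = new List<int>();

    public void Report(int value)
    {
        lock (Reports) Reports.Add(value);   // Report may be called from any thread
    }
}

public static class ProgressSketch
{
    // The worker knows nothing about the UI; it only reports a percentage.
    public static void CopyBlocks(int blockCount, IProgress<int> progress)
    {
        for (int i = 1; i <= blockCount; i++)
        {
            // ... the real work for block i would go here ...
            if (progress != null) progress.Report(i * 100 / blockCount);
        }
    }
}
```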
26. • Calculate a value only when needed
• Lazy<T>(Func<T> …)
• Synchronous or Asynchronous calculation
Lazy.Value
Lazy.GetValueAsync<T>()
Lazy Initialization
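A sketch of both flavours using only types from the base class library. The asynchronous variant here wraps a task in `Lazy<Task<int>>`, which is an assumption on my part and not necessarily the helper named on the slide. The value 42 and the counter are illustrative.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LazySketch
{
    public static int Computations;   // counts how often the factory actually ran

    // Synchronous: the factory runs on the first access to .Value, and only once.
    public static readonly Lazy<int> Answer = new Lazy<int>(() =>
    {
        Interlocked.Increment(ref Computations);
        return 6 * 7;
    });

    // Asynchronous: the task is started on first access and can then be awaited.
    public static readonly Lazy<Task<int>> AnswerAsync =
        new Lazy<Task<int>>(() => Task.Run(() => 6 * 7));
}
```

Reading `LazySketch.Answer.Value` twice runs the factory a single time; `await LazySketch.AnswerAsync.Value` yields the same number without blocking the caller.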
27. • Since .NET 2.0!
• Hides Winforms, WPF, ASP.NET
SynchronizationContext.Post/Send instead of Dispatcher.Invoke etc
Synchronous and Asynchronous version
• Automatically created by the environment
SynchronizationContext.Current
• Can create our own
E.g. for a command-line application
Synchronization Context
29. • Support at the language level
• Debugging support
• Exception Handling
• After await return to original “thread”
Beware of servers and libraries
• Does NOT always execute asynchronously
Only when a task is encountered or the thread yields
Task.Yield
Async/Await
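The sketch below illustrates two of the points above, under assumed names. An `async` method runs synchronously until it awaits something that is not yet complete, and library code can use `ConfigureAwait(false)` so that it does not return to the original context.

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncSketch
{
    public static async Task<int> GetAsync(bool cached)
    {
        // No incomplete await on this path: the method finishes synchronously
        // and the returned task is already completed.
        if (cached) return 42;

        // Library-style await: do not capture the caller's context.
        await Task.Delay(10).ConfigureAwait(false);
        return 42;
    }
}
```

`AsyncSketch.GetAsync(true).IsCompleted` is true as soon as the call returns, which shows that marking a method `async` does not by itself move work to another thread.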
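30. private static async Task<T>
Retry<T>(Func<T> func, int retryCount) {
  while (true) {
    try {
      // Run func on a thread-pool thread and wait for it without blocking
      var result = await Task.Run(func);
      return result;
    }
    catch {
      // Out of retries: rethrow the last exception
      if (retryCount == 0)
        throw;
      retryCount--;
    }
  }
}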
Asynchronous Retry
32. • Highly concurrent
• Thread-safe
• Not only for TPL/PLINQ
• Producer/Consumer scenarios
More Goodies - Collections
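A producer/consumer sketch using `BlockingCollection<T>`, which by default wraps a `ConcurrentQueue<T>`. The bounded capacity of 10 and the summing consumer are arbitrary choices for the example.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ProducerConsumerSketch
{
    public static int ProduceAndSum(int itemCount)
    {
        // Bounded capacity: Add blocks when the consumer falls behind.
        using (var queue = new BlockingCollection<int>(boundedCapacity: 10))
        {
            var producer = Task.Run(() =>
            {
                for (int i = 1; i <= itemCount; i++) queue.Add(i);
                queue.CompleteAdding();   // tells the consumer no more items will come
            });

            var consumer = Task.Run(() =>
            {
                int sum = 0;
                // Blocks while the queue is empty, ends after CompleteAdding.
                foreach (int item in queue.GetConsumingEnumerable()) sum += item;
                return sum;
            });

            producer.Wait();
            return consumer.Result;
        }
    }
}
```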
34. • Duplicates allowed
• List per Thread
• Reduced collisions for each thread’s Add/Take
• BAD for Producer/Consumer
The Odd one - ConcurrentBag
35. • NOT faster than plain collections in low
concurrency scenarios
• DO NOT consume less memory
• DO NOT provide thread safe enumeration
• DO NOT ensure atomic operations on content
• DO NOT fix unsafe code
Concurrent Collections -
Gotchas
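The "atomic operations" gotcha can be sketched with a counter in a `ConcurrentDictionary`. Each individual call is thread-safe, but a separate read followed by a write is not one atomic step. The key name and the method are made up for the example.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ConcurrentGotchaSketch
{
    public static int CountHits(int hits)
    {
        var counters = new ConcurrentDictionary<string, int>();
        Parallel.For(0, hits, i =>
        {
            // NOT safe, even on a concurrent collection, because the read and
            // the write are two separate operations:
            //   counters["page"] = counters.GetOrAdd("page", 0) + 1;

            // Safe: the read-modify-write is retried until it applies cleanly.
            counters.AddOrUpdate("page", 1, (key, current) => current + 1);
        });

        int total;
        return counters.TryGetValue("page", out total) ? total : 0;
    }
}
```

Note that the update delegate may be invoked more than once under contention, so it should be free of side effects.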
36. • Visual Studio 2012
• Async Targeting package
• System.Net.HttpClient package
Also in .NET 4
37. • F# async
• C++ Parallel Patterns Library
• C++ Concurrency Runtime
• C++ Agents
• C++ AMP
Other Technologies
38. • Object storage similar to Amazon S3/Azure Blob
storage
• A Service of Synnefo – IaaS by GRNet
• Written in Python
• Clients for Web, Windows, iOS, Android, Linux
• Versioning, Permissions, Sharing
40. • REST API based on CloudFiles by Rackspace
Compatible with CyberDuck etc
• Block storage
• Uploads only using blocks
• Uses Merkle Hashing
Pithos API
41. • Multiple accounts per machine
• Synchronize local folder to a Pithos account
• Detect local changes and upload
• Detect server changes and download
• Calculate Merkle Hash for each file
Pithos Client for Windows
43. • .NET 4, due to Windows XP compatibility
• Visual Studio 2012 + Async Targeting Pack
• UI - Caliburn.Micro
• Concurrency - TPL, Parallel, Dataflow
• Network – HttpClient
• Hashing - OpenSSL – Faster than native provider for hashing
• Storage - NHibernate, SQLite/SQL Server Compact
• Logging - log4net
Technologies
44. • Handle potentially hundreds of file events
• Hashing of many/large files
• Multiple slow calls to the server
• Unreliable network
• And yet it shouldn’t hang
• Update the UI with enough information
The challenges
45. • Use producer/consumer pattern
• Store events in ConcurrentQueue
• Process ONLY after idle timeout
Events Handling
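One possible shape for "process only after idle timeout" is sketched below. This is not the Pithos client's code: `IdleEventQueue` and its members are hypothetical, and the current time is passed in explicitly so the behaviour is easy to reason about. A timer or background loop would call `DrainIfIdle` periodically.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

public sealed class IdleEventQueue
{
    private readonly ConcurrentQueue<string> _events = new ConcurrentQueue<string>();
    private readonly TimeSpan _idleTimeout;
    private long _lastEventTicks;

    public IdleEventQueue(TimeSpan idleTimeout) { _idleTimeout = idleTimeout; }

    // Producer side: called for every file-system event.
    public void Enqueue(string path, DateTime nowUtc)
    {
        _events.Enqueue(path);
        Interlocked.Exchange(ref _lastEventTicks, nowUtc.Ticks);
    }

    // Consumer side: returns the queued events only if nothing new has
    // arrived for at least the idle timeout, otherwise an empty list.
    public List<string> DrainIfIdle(DateTime nowUtc)
    {
        var batch = new List<string>();
        long last = Interlocked.Read(ref _lastEventTicks);
        if (nowUtc.Ticks - last < _idleTimeout.Ticks) return batch;

        string path;
        while (_events.TryDequeue(out path)) batch.Add(path);
        return batch;
    }
}
```

Waiting for a quiet period lets a burst of events for the same files be handled as one batch.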
46. • Why I hate Game of Thrones
• Asynchronous reading of blocks
• Parallel Hashing of each block
• Use of OpenSSL for its SSE support
• Concurrency Throttling
• Beware of memory consumption!
Merkle Hashing
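A rough sketch of parallel block hashing with throttling. Several details are assumptions and not the Pithos algorithm: SHA-256 from the .NET class library stands in for the OpenSSL hashing mentioned above, an unpaired node is carried up unchanged, and the blocks are already in memory, which is exactly the memory cost the last bullet warns about.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Threading.Tasks;

public static class MerkleSketch
{
    private static byte[] Sha256(byte[] data)
    {
        using (var sha = SHA256.Create()) return sha.ComputeHash(data);
    }

    public static byte[] ComputeRoot(IList<byte[]> blocks, int maxParallelism)
    {
        if (blocks.Count == 0) return Sha256(new byte[0]);

        // Leaf level: hash every block in parallel, throttled by maxParallelism.
        var level = new byte[blocks.Count][];
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelism };
        Parallel.For(0, blocks.Count, options, i => { level[i] = Sha256(blocks[i]); });

        // Upper levels: combine neighbouring hashes until one root remains.
        while (level.Length > 1)
        {
            var next = new byte[(level.Length + 1) / 2][];
            for (int i = 0; i < next.Length; i++)
            {
                byte[] left = level[2 * i];
                // Assumption: an unpaired last node is carried up unchanged.
                next[i] = 2 * i + 1 < level.Length
                    ? Sha256(left.Concat(level[2 * i + 1]).ToArray())
                    : left;
            }
            level = next;
        }
        return level[0];
    }
}
```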
47. • Each call a task
• Concurrent REST calls per account and share
• Task.WhenAll to process results
Multiple slow calls
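A sketch of "each call a task" combined with `Task.WhenAll`. The `fetchLength` delegate is a hypothetical stand-in for a real REST call, so the example runs without a network.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class WhenAllSketch
{
    public static async Task<int[]> FetchAllAsync(
        IEnumerable<string> urls, Func<string, Task<int>> fetchLength)
    {
        // ToList forces the Select to run now, so every call is started
        // before any of them is awaited.
        List<Task<int>> calls = urls.Select(fetchLength).ToList();

        // Completes when all calls are done; results keep the input order.
        return await Task.WhenAll(calls).ConfigureAwait(false);
    }
}
```

If any call fails, awaiting the combined task rethrows one of the exceptions, which pairs naturally with a retry wrapper around each individual call.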
48. • Use System.Net.Http.HttpClient
• Store blocks in a cache folder
• Check and reuse orphans
• Asynchronous Retry of calls
Unreliable network
49. • Use Transactional NTFS if available
Thanks MS for killing it!
• Update a copy and File.Replace otherwise
Resilience to crashes
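The "update a copy and File.Replace" fallback might look like the sketch below. The `.tmp` suffix and the method name are assumptions for illustration. The existing file is never edited in place, so a crash in the middle of a write leaves the old contents readable.

```csharp
using System;
using System.IO;

public static class SafeWriteSketch
{
    public static void SafeWrite(string path, string contents)
    {
        // Write the new version next to the target first.
        string temp = path + ".tmp";
        File.WriteAllText(temp, contents);

        if (File.Exists(path))
            File.Replace(temp, path, null);   // swap in the copy, no backup file
        else
            File.Move(temp, path);            // first write: nothing to replace
    }
}
```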
50. • Use of independent agents
• Asynchronous operations wherever possible
Should not hang
51. • Use WPF, MVVM
• Use Progress to update the UI
Provide Sufficient user feedback
52. • Create Windows 8 Desktop and WinRT client
• Use Reactive Framework
Next Steps
VOLUNTEERS WANTED
54. • Avoid Side Effects
• Use Functional Style
• Clean Coding
• THE BIG SECRET:
Use existing, tested algorithms
• IEEE, ACM Journals and libraries
Clever Tricks
55. • Simplify asynchronous or parallel code
• Use out-of-the-box libraries
• Scenarios that SUIT Task or Data Parallelism
YES TPL
56. • To accelerate “bad” algorithms
• To “accelerate” database access
Use proper SQL and Indexes!
Avoid Cursors
• Reporting DBs, Data Warehouse, OLAP Cubes
NO TPL
57. • Functional languages like F#, Scala
• Distributed Frameworks like Hadoop, {m}brace
When TPL is not enough
58. • C# 5 in a Nutshell, O’Reilly
• Parallel Programming with .NET, Microsoft
• Pro Parallel Programming with C#, Wiley
• Concurrent Programming on Windows, Pearson
• The Art of Concurrency, O’Reilly
Books
59. • Parallel FX Team:
http://blogs.msdn.com/b/pfxteam/
• IEEE Computer Society
http://www.computer.org
• ACM http://www.acm.org
Useful Links