Overview Of Parallel Development - Ericnel

  1. Overview of Parallel Development: Visual Studio 2010 + a little on Axum and Concurrent Basic. Eric Nelson, Eric.nelson@microsoft.com, http://geekswithblogs.net/iupdateable, http://blogs.msdn.com/goto100, http://twitter.com/ericnel
  2. Microsoft UK MSDN Flash Newsletter
     - Every two weeks, pure joy enters your Inbox: http://msdn.microsoft.com/uk/flash
     - MSDN Flash Podcast Pilot, for feedback: http://bit.ly/flashpod1
     - MSDN Flash eBook, 13 of the "Best Technical Articles of 2008": http://bit.ly/flashebook1
     - Technical Authors wanted for the Flash, 400 to 500 words. Fancy it?
  3. Agenda
     - Overview of what we are up to
     - Drill down into parallel programming for managed developers
     - If we have time, a "heads up" on Axum and CB
  4. Things I learnt...
     - We have a very large investment in parallel computing. We have "something for everyone". It is not all synced, it is sometimes overlapping.
     - It is a big topic: managed vs native vs client vs server vs task vs data...
     - Even with the investment, design/code/test for parallel is far harder: locking, deadlocks, livelocks
     - It is about getting ready for the future. Code today, run better tomorrow?
     - VS2010 CTP is not a great place for parallel: single core in guest, unsupported route to use Hyper-V
     - Easiest route to dabble: Microsoft Parallel Extensions June CTP for VS2008
  5. Buying a new Processor (£100 - £300): 64-bit, 2-3GHz, 2 cores or 4. [Diagram: processor with two cores]
  6. Buying a new Processor (£200 - £500): 64-bit, 2-3GHz, 4 cores with HT, Memory Controller, QuickPath Interconnect. [Diagram: processor with four cores]
  7. Where will it all end?
  8. Was it a wise purchase? [Diagram: App 1 (My Code, .NET Framework, .NET CLR) alongside App 2, ..., all running on Windows OS]
  9. Was it a wise purchase?
     - Some environments scale to take advantage of additional CPU cores (mostly server-side) ... ASP.NET Web Forms/Services, WCF Services, WF Engine, .NET ThreadPool or Custom Threading Strategy
     - A lot of code does not (mostly client-side). This code will see little benefit from future hardware advances.
  10. What happened to "The Free Lunch"?
     - Bad sequential code will run faster on a faster processor
     - Bad parallel code WILL NOT run faster on more cores
     - Just using parallel code is not enough
     - [Chart: Speedup (vertical axis, 0 to 3) plotted against 1, 2, 4, 8, 16, 32 on the horizontal axis]
  11. Applications Can Scale Well. [Chart: Parallel Speedup (0 to 64) against Cores (0 to 64). Series: Production Fluid, Production Face, Production Cloth, Game Fluid, Game Rigid Body, Game Cloth, Marching Cubes, Sports Video Analysis, Video Cast Indexing, Home Video Editing, Text Indexing, Ray Tracing, Foreground Estimation, Human Body Tracker, Portfolio Management, Geometric Mean. Categories: Graphics Rendering, Physical Simulation, Vision, Data Mining, Analytics]
  12. What's The Problem?
     - Multithreaded programming is "hard" today. It is doable by only a subgroup of senior specialists. Parallel patterns are not prevalent, well known, nor easy to implement.
     - So many potential problems: races, deadlocks, livelocks, lock convoys, cache coherency overheads, lost event notifications, broken serializability, priority inversion, and so on...
     - Businesses have little desire to "go deep". Best developers should focus on business value, not concurrency.
     - Need simple ways to allow all developers to write concurrent code
  13. Matrix multiplication, sequential version:

          void MatrixMult(int size, double** m1, double** m2, double** result)
          {
              for (int i = 0; i < size; i++) {
                  for (int j = 0; j < size; j++) {
                      result[i][j] = 0;
                      for (int k = 0; k < size; k++) {
                          result[i][j] += m1[i][k] * m2[k][j];
                      }
                  }
              }
          }
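  14. Static partitioning. Callouts on the slide: synchronization knowledge, error prone, lots of boilerplate, tricks, lack of thread reuse, heavy synchronization.

          #include <windows.h>
          #include <thread>

          // NUMPROCS is assumed to be defined elsewhere (number of processors).
          void MatrixMult(int size, double** m1, double** m2, double** result)
          {
              int N = size;
              int P = 2 * NUMPROCS;
              int Chunk = N / P;
              HANDLE hEvent = CreateEvent(NULL, TRUE, FALSE, NULL);
              long counter = P;  // number of threads still running
              for (int c = 0; c < P; c++) {
                  std::thread t([&, c] {
                      // The last chunk also takes the rows left over by the division.
                      for (int i = c * Chunk; i < (c + 1 == P ? N : (c + 1) * Chunk); i++) {
                          for (int j = 0; j < size; j++) {
                              result[i][j] = 0;
                              for (int k = 0; k < size; k++) {
                                  result[i][j] += m1[i][k] * m2[k][j];
                              }
                          }
                      }
                      // The last thread to finish signals the event.
                      if (InterlockedDecrement(&counter) == 0) SetEvent(hEvent);
                  });
                  t.detach();  // a joinable std::thread that is destroyed calls std::terminate
              }
              WaitForSingleObject(hEvent, INFINITE);
              CloseHandle(hEvent);
          }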
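For comparison, here is a minimal portable sketch of the same static partitioning idea using only the C++ standard library. It is not from the slides: the name `MatrixMultStatic`, the `Matrix` alias, and the `numThreads` parameter are hypothetical, and it joins the worker threads instead of using a Win32 event and an interlocked counter.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

using Matrix = std::vector<std::vector<double>>;  // hypothetical alias for a square matrix

// Multiplies two square matrices, giving each thread one contiguous band of rows.
// numThreads is an illustrative parameter; 0 means "ask the hardware".
Matrix MatrixMultStatic(const Matrix& m1, const Matrix& m2, unsigned numThreads = 0) {
    const std::size_t n = m1.size();
    Matrix result(n, std::vector<double>(n, 0.0));
    if (n == 0) return result;

    if (numThreads == 0) numThreads = std::max(1u, std::thread::hardware_concurrency());
    numThreads = static_cast<unsigned>(std::min<std::size_t>(numThreads, n));
    const std::size_t chunk = n / numThreads;

    std::vector<std::thread> workers;
    workers.reserve(numThreads);
    for (unsigned c = 0; c < numThreads; ++c) {
        const std::size_t begin = c * chunk;
        // The last band also takes the rows left over by the integer division.
        const std::size_t end = (c + 1 == numThreads) ? n : begin + chunk;
        workers.emplace_back([&, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                for (std::size_t j = 0; j < n; ++j) {
                    double sum = 0.0;
                    for (std::size_t k = 0; k < n; ++k) sum += m1[i][k] * m2[k][j];
                    result[i][j] = sum;  // each thread writes only its own rows
                }
            }
        });
    }
    for (auto& t : workers) t.join();  // joining replaces the event and the counter
    return result;
}
```

Even in this shorter form the partitioning arithmetic and thread management are still hand-written, and a new set of threads is created on every call, which is the "lack of thread reuse" the slide points at.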
  15. Microsoft Parallel Computing Technologies. [Diagram: technologies placed on two axes, Task Concurrency to Data Parallelism and Local Computing to Distributed/Cloud Computing]
     - Technologies shown: WCF, CCR, WF, TPL / PPL, Maestro (aka Axum), Cluster SOA, PLINQ, Cluster PLINQ, OpenMP, Cluster TPL, MPI / MPI.Net, Compute Shader, CDS
     - Example workloads shown: robotics-based manufacturing assembly line; automotive control system; Internet-based photo services; Silverlight Olympics viewer; ultrasound imaging equipment; enterprise search, OLTP, collab; animation / CGI rendering; media encode/decode; weather forecasting; image processing/enhancement; seismic monitoring; oil exploration; data visualization
  16. Visual Studio 2010: Tools / Programming Models / Runtimes. [Diagram: layered stack; key: Managed Library, Native Library, Tools]
     - Programming models: PLINQ, Task Parallel Library, Parallel Pattern Library, Agents Library
     - Data Structures
     - Concurrency Runtime: ThreadPool, Task Scheduler, Resource Manager
     - Integrated tooling: Parallel Debugger tool, Profiler Concurrency Analysis
     - Operating System Threads
  17. Explicit Tasking Support
     - .NET 4.0, Task Parallel Library: Task, TaskFactory; Parallel.For; Parallel.ForEach; Parallel.Invoke; concurrent data structures
     - Visual Studio 2010 C++, Parallel Pattern Library: task, task_group; parallel_for; parallel_for_each; parallel_invoke; concurrent data structures; primitives for message passing; user-mode locks
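To give a feel for what an "invoke" style construct does, the sketch below runs two independent pieces of work and returns only when both have finished. It uses standard C++ `std::async` rather than the TPL or PPL APIs named on the slide, and `InvokeBoth` is a hypothetical helper name, so treat it as an illustration of the fork/join idea only.

```cpp
#include <future>
#include <utility>

// Runs two independent callables, potentially on different threads,
// and returns once both have completed.
template <typename F1, typename F2>
void InvokeBoth(F1&& first, F2&& second) {
    // Fork: start the first callable on another thread.
    auto pending = std::async(std::launch::async, std::forward<F1>(first));
    // Run the second callable on the calling thread in the meantime.
    second();
    // Join: wait for the first callable; get() also rethrows any exception it threw.
    pending.get();
}
```

A caller would pass two lambdas that touch disjoint data, for example one filling the left half of a buffer and one filling the right half.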
  18. Task Parallel Library (TPL)
  19. Task: No Threading to Threading to Tasks
  20. User Mode Scheduler: CLR Thread Pool. [Diagram: the program thread feeds a global queue, which is serviced by worker threads 1 to p]
  21. User Mode Scheduler For Tasks: CLR Thread Pool with Work-Stealing. [Diagram: the program thread, a global queue, one local queue per worker, worker threads 1 to p, and Tasks 1 to 6 spread across the queues]
  22. Tasks revisited: More on Tasks
  23. Debugger Support. Supports both managed and native: (1) Parallel Tasks, (2) Parallel Stacks
  24. Higher Level Constructs
     - Even with Task there are common patterns that build into higher level abstractions
     - The Parallel class: Invoke, For, For<T>, ForEach
     - Care needs to be taken with state and ordering
     - "This is not your Father's for loop"
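The warning about state can be illustrated with a parallel sum. If every iteration added to one shared variable, the threads would race on it; the sketch below instead gives each worker its own accumulator and combines them after the loop. It uses plain `std::thread` rather than the Parallel class, and `ParallelSum` is a hypothetical name.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sums data using numThreads workers. Each worker accumulates into a
// thread-private local and writes one distinct slot of `partials`,
// so no two threads ever write the same variable.
long long ParallelSum(const std::vector<int>& data, unsigned numThreads) {
    if (numThreads == 0) numThreads = 1;
    std::vector<long long> partials(numThreads, 0);
    std::vector<std::thread> workers;
    workers.reserve(numThreads);
    // Round up so that every element falls into some chunk.
    const std::size_t chunk = (data.size() + numThreads - 1) / numThreads;
    for (unsigned t = 0; t < numThreads; ++t) {
        const std::size_t begin = std::min(data.size(), static_cast<std::size_t>(t) * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        workers.emplace_back([&data, &partials, t, begin, end] {
            long long local = 0;  // private to this thread
            for (std::size_t i = begin; i < end; ++i) local += data[i];
            partials[t] = local;  // single write to this thread's own slot
        });
    }
    for (auto& w : workers) w.join();
    // Combine the per-thread results on the calling thread.
    return std::accumulate(partials.begin(), partials.end(), 0LL);
}
```

Note that the chunks finish in no particular order. That is harmless for a sum, but it is the reason ordering needs care whenever the loop body has visible side effects.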
  25. Parallel: Parallel.ForEach, Parallel.Invoke
  26. Declarative Data Parallelism: Parallel LINQ-to-Objects (PLINQ)
     - Enables LINQ devs to leverage multiple cores
     - Fully supports all .NET standard query operators
     - Minimal impact to existing LINQ model

          var q = from p in people.AsParallel()
                  where p.Name == queryInfo.Name &&
                        p.State == queryInfo.State &&
                        p.Year >= yearStart &&
                        p.Year <= yearEnd
                  orderby p.Year ascending
                  select p;

  27. Parallel LINQ
  28. Sequential loop:

          IEnumerable<BabyInfo> babies = ...;
          var results = new List<BabyInfo>();
          foreach (var baby in babies)
          {
              if (baby.Name == queryName &&
                  baby.State == queryState &&
                  baby.Year >= yearStart &&
                  baby.Year <= yearEnd)
              {
                  results.Add(baby);
              }
          }
          results.Sort((b1, b2) => b1.Year.CompareTo(b2.Year));
  29. Hand-written parallel version:

          IEnumerable<BabyInfo> babies = …;
          var results = new List<BabyInfo>();
          int partitionsCount = Environment.ProcessorCount;
          int remainingCount = partitionsCount;
          var enumerator = babies.GetEnumerator();
          try
          {
              using (var done = new ManualResetEvent(false))
              {
                  for (int i = 0; i < partitionsCount; i++)
                  {
                      ThreadPool.QueueUserWorkItem(delegate
                      {
                          while (true)
                          {
                              BabyInfo baby;
                              lock (enumerator)
                              {
                                  if (!enumerator.MoveNext()) break;
                                  baby = enumerator.Current;
                              }
                              if (baby.Name == queryName &&
                                  baby.State == queryState &&
                                  baby.Year >= yearStart &&
                                  baby.Year <= yearEnd)
                              {
                                  lock (results) results.Add(baby);
                              }
                          }
                          if (Interlocked.Decrement(ref remainingCount) == 0) done.Set();
                      });
                  }
                  done.WaitOne();
                  results.Sort((b1, b2) => b1.Year.CompareTo(b2.Year));
              }
          }
          finally
          {
              if (enumerator is IDisposable) ((IDisposable)enumerator).Dispose();
          }
  30. PLINQ version:

          var results = from baby in babies.AsParallel()
                        where baby.Name == queryName &&
                              baby.State == queryState &&
                              baby.Year >= yearStart &&
                              baby.Year <= yearEnd
                        orderby baby.Year ascending
                        select baby;
  31. Coordination Data Structures
     - Thread-safe collections: ConcurrentStack<T> ...
     - Locks: SpinLock, SpinWait, SemaphoreSlim ...
     - Work Exchange: BlockingCollection<T> ...
     - Phased Operation: CountdownEvent ...
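As a rough illustration of the "work exchange" category, the sketch below is a small blocking queue in standard C++: producers add items, consumers block until an item arrives or the queue is marked complete. It is only loosely modelled on the idea behind BlockingCollection<T> and is not its API; `BlockingQueue`, `TryTake`, and `SumViaQueue` are hypothetical names.

```cpp
#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>

template <typename T>
class BlockingQueue {
public:
    // Adds an item and wakes one waiting consumer.
    void Add(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            items_.push(std::move(item));
        }
        ready_.notify_one();
    }

    // Signals that no more items will be added and wakes all consumers.
    void CompleteAdding() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            completed_ = true;
        }
        ready_.notify_all();
    }

    // Blocks until an item is available. Returns false only when the
    // queue is empty and CompleteAdding has been called.
    bool TryTake(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return !items_.empty() || completed_; });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        return true;
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::queue<T> items_;
    bool completed_ = false;
};

// Example: one producer thread sends 1..count, the calling thread sums them.
int SumViaQueue(int count) {
    BlockingQueue<int> queue;
    std::thread producer([&queue, count] {
        for (int i = 1; i <= count; ++i) queue.Add(i);
        queue.CompleteAdding();
    });
    int total = 0;
    int value = 0;
    while (queue.TryTake(value)) total += value;
    producer.join();
    return total;
}
```

The consumer never polls: it sleeps on the condition variable until the producer either adds an item or completes, which is the coordination the slide's data structures package up.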
  32. Coordination Data Structures
  33. What Next?
     - Slides and links: http://geekswithblogs.net/iupdateable
     - http://blogs.msdn.com/pfxteam/
     - http://msdn.com/concurrency
     - Wait for the Beta of Visual Studio 2010
     - Or, for the most impatient, download the VS 2010 CTP (remember to set the clock back)
     - Or download the Parallel Extensions June 2008 CTP for VS2008
  34. Appendix
  35. Heads up: Axum
     - Previously called Maestro
     - Incubation project!
     - New programming language
     - Lets you take advantage of parallelism without "thinking about it"
     - Agent based programming vs Object based programming
     - Model agents and their interactions via messages
     - No public methods, fields
  36. Axum "Hello World"

          using System;

          agent Program : Microsoft.Axum.ConsoleApplication
          {
              override int Run(String[] args)
              {
                  Console.WriteLine("Hello, World!");
              }
          }
  37. Channels and Agents

          using System;
          using System.Concurrency;
          using Microsoft.Axum;

          channel Adder
          {
              input int Num1;
              input int Num2;
              output int Sum;
          }

          agent AdderAgent : channel Adder
          {
              public AdderAgent()
              {
                  int result = receive(PrimaryChannel::Num1) +
                               receive(PrimaryChannel::Num2);
                  PrimaryChannel::Sum <-- result;
              }
          }

          agent MainAgent : channel Microsoft.Axum.Application
          {
              public MainAgent()
              {
                  var adder = AdderAgent.CreateInNewDomain();
                  adder::Num1 <-- 10;
                  adder::Num2 <-- 20;
                  // do something useful ...
                  var sum = receive(adder::Sum);
                  Console.WriteLine(sum);
                  PrimaryChannel::ExitCode <-- 0;
              }
          }
  38. Heads up: Concurrent Basic
     - Research Project: http://channel9.msdn.com/shows/Going+Deep/Claudio-Russo-and-Lucian-Wischik-Inside-Concurrent-Basic/
     - Added message passing primitives: channels

          Module Buffer
              Public Asynchronous Put(ByVal s As String)
              Public Synchronous Take() As String
              Private Function CaseTakeAndPut(ByVal s As String) As String When Take, Put
                  Return s
              End Function
          End Module

          ' Thread1:
          Put("Hello")
          ' Thread2:
          result = Take()
