dmitrinesteruk@gmail.com
• Quant
• Programmer (C++, .NET, MATLAB)
• Microsoft MVP Visual C# (since 2009)
• Pluralsight course author
(MATLAB, CUDA, D, Boost,…)
• Technical Evangelist @ JetBrains
• An overview of available technologies for
computation
• A look at managed vs. unmanaged code
• How to leverage capabilities of x86 architecture
• What COTS and specialized acceleration h/w exists
and how to use it
• Native code
• Managed code
• More portable. But C++ is also portable, provided you do
not use platform-specific things.
• In theory gets optimized for various platforms. In
practice, this isn’t great.
• Does not permit low-level interaction with the processor.
• Additional safety ("managed") – array bound checks,
type conversion checks, etc.
• Not always portable (e.g. .NET is only partially
portable, excluding UI, WCF, …)
• Typically supports garbage collection.
• Has ways of interacting with native code (JNI,
P/Invoke, C++/CLI).
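A minimal sketch of the native side of such interop, assuming a routine that a managed program would call through something like P/Invoke. The function name `dot_product` is hypothetical and not from the talk; the only real mechanism shown is `extern "C"`, which turns off C++ name mangling so the symbol can be found by name.

```cpp
#include <cassert>

// Hypothetical native routine (name chosen for illustration only).
// extern "C" gives the function C linkage, so its symbol name is not
// mangled and a managed runtime can look it up by name.
extern "C" double dot_product(const double* a, const double* b, int n)
{
    assert(a != nullptr && b != nullptr && n >= 0);
    double sum = 0.0;
    for (int i = 0; i < n; ++i)
        sum += a[i] * b[i];  // plain scalar loop; the compiler may vectorize it
    return sum;
}
```

To be callable from managed code, this would be built into a shared library; export attributes and calling conventions are platform-specific and omitted here.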
• Developer vs. software productivity?
• Managed languages simpler to use
• This talk focuses on CPU bound problems
• Some problems bottleneck on I/O
• SSD made things a lot better
• Optimization mechanisms
• Don’t expect CPU clock speed to pick up
• PC/server architecture does not scale
• The only way to accelerate computation is to provide
more entities to compute on.
• Instruction-level
• Thread-level
• Machine-level
• Via inline assembly
• Via ‘intrinsics’
• Compiler vectorization
• Use magical compilers (e.g. the Intel SPMD Program Compiler, ispc)
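The 'intrinsics' route above can be sketched as follows, assuming an x86 target with SSE. The function name `add_arrays` is made up for this example; `_mm_loadu_ps`, `_mm_add_ps` and `_mm_storeu_ps` are standard SSE intrinsics from `<xmmintrin.h>`. On targets without SSE the sketch falls back to the scalar loop.

```cpp
#include <cassert>
#include <cstddef>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_HAS_SSE 1
#endif

// Element-wise addition: out[i] = a[i] + b[i].
void add_arrays(const float* a, const float* b, float* out, std::size_t n)
{
    assert(a != nullptr && b != nullptr && out != nullptr);
    std::size_t i = 0;
#ifdef SKETCH_HAS_SSE
    // Process four floats per iteration in one 128-bit register.
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   // unaligned load of 4 floats
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
#endif
    // Scalar tail (or the whole array when SSE is unavailable).
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
```

A loop this simple is often vectorized by the compiler on its own; intrinsics tend to pay off when the access pattern is too irregular for automatic vectorization.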
• SIMD things
• Processing data in an array
• OpenMP
• Intel Threading Building Blocks/
Parallel Patterns Library (MS)
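As a sketch of the OpenMP option for processing data in an array, the loop below is split across threads with a reduction. The function name `sum_of_squares` is hypothetical. The code assumes the compiler is invoked with OpenMP enabled; without it the pragma is ignored and the loop runs sequentially with the same result.

```cpp
#include <vector>

// Sums data[i]^2 over the whole array.
double sum_of_squares(const std::vector<double>& data)
{
    double total = 0.0;
    const long n = static_cast<long>(data.size());

    // Each thread accumulates a private partial sum; OpenMP combines
    // the partial sums into 'total' when the loop finishes.
    #pragma omp parallel for reduction(+ : total)
    for (long i = 0; i < n; ++i)
        total += data[i] * data[i];

    return total;
}
```

Because floating-point addition is not associative, the parallel result can differ from the sequential one in the last bits for general inputs.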
• GPGPU
• Expansion boards
• Custom chips
• Hardware platforms – NVIDIA, ATI (now AMD)
• Software platforms for computation – CUDA,
OpenCL, C++ AMP
• Typically 2 GPUs per machine; effectiveness drops off after that
• PCI bus congestion, but depends on usage patterns
• CUDA is the principal commercially successful GPGPU platform
• CUDA is supported by many software manufacturers
(Photoshop, MATLAB, etc.)
• In many domains (e.g. video transcoding), GPU support in
software is still poor
• In terms of performance, it is thought that NVIDIA has better
floating-point and AMD better integer math
• CUDA is actually a managed technology
• CUDA is not device-independent
• CUDA C is the primary development language
• A GPU has several streaming multiprocessors (SM)
• Each SM has lots of processors (SP)
• We can launch a large number of threads in parallel
• The very large number of SPs means that, even at lower
clock speeds, a GPU can outperform a CPU on data-parallel work
• A look at CUDA development
• GPU does not support ordinary x86.
• Running several tasks on a GPU is difficult
• Branch divergence – when threads that execute in lockstep take
different sides of a branch (even a simple if), the two paths run
one after the other instead of in parallel.
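The cost of branch divergence can be illustrated with a small CPU-side emulation. This is not CUDA code and not how a GPU is programmed; it is a sketch that assumes a group of 32 lanes executing in lockstep and counts how many branch bodies the group has to issue. The names `run_warp` and `kWarpSize` are invented for this example.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kWarpSize = 32;  // assumed number of lockstep lanes

// Emulates one lockstep group running:
//     out = (in is even) ? in * 2 : in + 1;
// Returns the number of branch bodies issued: 1 if all lanes agree,
// 2 if the lanes diverge and both paths must be executed in turn.
int run_warp(const std::array<int, kWarpSize>& in,
             std::array<int, kWarpSize>& out)
{
    std::array<bool, kWarpSize> takes_then{};
    bool any_then = false;
    bool any_else = false;
    for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
        takes_then[lane] = (in[lane] % 2 == 0);
        any_then = any_then || takes_then[lane];
        any_else = any_else || !takes_then[lane];
    }

    int passes = 0;
    if (any_then) {  // pass over the 'then' body; other lanes sit idle
        ++passes;
        for (std::size_t lane = 0; lane < kWarpSize; ++lane)
            if (takes_then[lane]) out[lane] = in[lane] * 2;
    }
    if (any_else) {  // pass over the 'else' body; other lanes sit idle
        ++passes;
        for (std::size_t lane = 0; lane < kWarpSize; ++lane)
            if (!takes_then[lane]) out[lane] = in[lane] + 1;
    }
    return passes;
}
```

With uniform input the group issues one pass; with mixed even and odd input it issues two, which is the serialization the bullet describes.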
• How do you plug a few more CPUs into a
motherboard? You cannot. The architecture doesn’t
scale. (And never will.)
• An alternative is to put a coprocessor on the PCI bus
• Commercial coprocessor
implementation from Intel
• PCI board with 60 cores
• Supports x86!
• Supports different technologies
• Runs its own micro Linux (not a driver)
• Can be used in either independent or offload mode
• Requires special development tools (Intel C++ compiler)
• Intel makes a lot of tools for C++ developers
• To work with Xeon Phi, you need
• Offload mode
• Native execution mode
• Symmetric execution
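Offload mode can be sketched with standard OpenMP `target` directives. This is an assumption made for illustration: the talk does not say which offload syntax is used, and Intel's compiler also has its own offload pragmas. The function name `saxpy_offload` is hypothetical. If no offload device or OpenMP support is available, the loop simply runs on the host.

```cpp
#include <cassert>

// Computes y[i] = a * x[i] + y[i], with the loop marked for offload.
void saxpy_offload(float a, const float* x, float* y, int n)
{
    assert(x != nullptr && y != nullptr && n >= 0);

    // map(to: ...) copies x to the device before the region runs;
    // map(tofrom: ...) copies y there and back again afterwards.
    #pragma omp target map(to: x[0:n]) map(tofrom: y[0:n])
    #pragma omp parallel for
    for (int i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

The data transfer in the `map` clauses is the price of offloading, so the approach only pays off when the computation outweighs the copying.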
• Programming the Xeon Phi
• 60 cores
• 4 hardware threads per core
• 8 GB memory
• 512-bit SIMD
• Same as in ordinary PCs, i.e.,
• OpenMP, MPI
• pthreads
• Other models coming soon
• FPGA – Field Programmable
Gate Array
• Design your own
processing logic in hardware
• Middle ground between
hard-wired ASIC and very
flexible general-purpose CPU
• Uses special hardware description
languages (HDL) – VHDL, Verilog. There are others (SystemC,
OpenCL) and higher-level solutions (e.g., MATLAB, Embeddr).
• Intrinsically parallel
• Low-power
• Better scalability
• Not a COTS solution
• FPGA lets us offload some tasks from the CPU
• FPGA is a lot less flexible. Not so good for math.
• FPGA is a low-level construct.
• FPGAs are relatively expensive to operate.
• FPGAs do not directly compete with ordinary CPUs
• Gain an advantage due to a highly asynchronous
nature
• The goal is to pre-program an FPGA to solve a
single problem very quickly
• E.g., protocol parsing in hardware (a so-called ‘feed
handler’)
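To make the feed handler idea concrete, here is a software sketch of the kind of fixed-layout decoding that would be moved into hardware. The message format is entirely hypothetical (one side byte followed by two 32-bit big-endian fields), as are the names `Quote`, `read_be32` and `parse_quote`; real market data protocols differ.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical decoded message.
struct Quote {
    char side;               // e.g. 'B' or 'S'
    std::uint32_t price;     // integer price in some fixed unit
    std::uint32_t quantity;
};

// Reads a 32-bit big-endian unsigned integer from four bytes.
inline std::uint32_t read_be32(const unsigned char* p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// Decodes one 9-byte message: [side][price: 4 bytes][quantity: 4 bytes].
// Returns false if the buffer is missing or too short.
bool parse_quote(const unsigned char* buf, std::size_t len, Quote& out)
{
    if (buf == nullptr || len < 9)
        return false;
    out.side = static_cast<char>(buf[0]);
    out.price = read_be32(buf + 1);
    out.quantity = read_be32(buf + 5);
    return true;
}
```

Each field sits at a fixed offset, so every field can be extracted independently, which is the property that makes this kind of parsing a natural fit for dedicated hardware.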
• JetBrains is working on the C++ IDE
• And C++ support in ReSharper
• Questions?

High-Performance Computing with C++
