Lec12 debugging
Taras Zakharchenko
1. Debugging
Perhaad Mistry & Dana Schaa, Northeastern University Computer Architecture Research Lab, with Benedict R. Gaster, AMD © 2011
2. Instructor Notes
• GPU debugging is still immature, but it is being improved daily
• Check the latest available options before giving this lecture
3. Debugging Techniques
• Compiling for x86 CPU
• Debugging with GDB
• GPU printf
• Live debuggers
  • Parallel Nsight
  • gDEBugger
4. CPU Debugging
• OpenCL allows the same code to run on different types of devices
  • Compiling to run on a CPU provides some extra facilities for debugging
  • Additional forms of I/O (such as writing to disk) are still not available from the kernel
• AMD's OpenCL implementation recognizes any x86 processor as a target device
  • Simply select the CPU as the target device when executing the program
• Systems that use NVIDIA's OpenCL implementation can also target x86 CPUs if AMD's installable client driver (ICD) is installed
5. CPU Debugging with GDB
• Setting up for GDB
  • Pass the compiler the "-g" flag in one of two ways:
    • Pass "-g" to clBuildProgram()
    • Set the environment variable CPU_COMPILER_OPTIONS="-g"
  • Avoid non-deterministic execution by setting the environment variable CPU_MAX_COMPUTE_UNITS=1
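The setup steps above could be done from the host program itself. The following is a minimal sketch, assuming a POSIX system (for setenv) and assuming the variables are set before the OpenCL runtime is first used. The helper names make_debug_build_options and set_cpu_debug_environment are hypothetical and are not part of any OpenCL API.

```c
#define _POSIX_C_SOURCE 200112L  /* for setenv() */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Append "-g" to an existing build-options string (hypothetical helper).
   The result is intended for the `options` argument of clBuildProgram().
   Returns 0 on success, -1 if `out` is too small. */
int make_debug_build_options(const char *base, char *out, size_t out_size)
{
    int n;

    assert(out != NULL && out_size > 0);
    if (base == NULL || base[0] == '\0')
        n = snprintf(out, out_size, "-g");
    else
        n = snprintf(out, out_size, "%s -g", base);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}

/* Set the two environment variables named on the slide (hypothetical helper).
   Returns 0 on success, -1 on failure. */
int set_cpu_debug_environment(void)
{
    if (setenv("CPU_COMPILER_OPTIONS", "-g", 1) != 0)
        return -1;
    if (setenv("CPU_MAX_COMPUTE_UNITS", "1", 1) != 0)
        return -1;
    return 0;
}
```

Only one of the two ways of passing "-g" is needed; the sketch shows both so that either can be copied. Exporting the variables in the shell before launching gdb works as well.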
6. CPU Debugging with GDB
• Run gdb with the OpenCL executable
    > gdb a.out
• Breakpoints can be set by line number, function name, or kernel name
• To break at the kernel hello within gdb, enter:
    (gdb) b __OpenCL_hello_kernel
  • The prefix and suffix are required for kernel names
• OpenCL kernel symbols are not known until the kernel is loaded, so setting a breakpoint at clEnqueueNDRangeKernel() is helpful
    (gdb) b clEnqueueNDRangeKernel
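The required prefix and suffix are easy to mistype, so a small helper can build the breakpoint symbol from a kernel name. This is a sketch that assumes the __OpenCL_<name>_kernel naming shown on the slide; the function name kernel_breakpoint_symbol is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the gdb breakpoint symbol for `kernel_name` into `out`.
   Returns 0 on success, -1 if `out` is too small. */
int kernel_breakpoint_symbol(const char *kernel_name, char *out, size_t out_size)
{
    static const char prefix[] = "__OpenCL_";
    static const char suffix[] = "_kernel";
    size_t needed;

    assert(kernel_name != NULL && out != NULL);
    /* +1 for the terminating '\0' */
    needed = strlen(prefix) + strlen(kernel_name) + strlen(suffix) + 1;
    if (needed > out_size)
        return -1;
    snprintf(out, out_size, "%s%s%s", prefix, kernel_name, suffix);
    return 0;
}
```

For the kernel hello, the helper produces the same symbol used in the gdb command above.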
7. CPU Debugging with GDB
• To break on a certain thread, introduce a conditional statement in the kernel and set the breakpoint inside the conditional body
  • gdb commands can then be used to view the thread state at this point

    ...
    if (get_global_id(1) == 20 && get_global_id(0) == 34) {
        ; // Set breakpoint on this line
    }
8. GPU Printf
• AMD GPUs support printing during execution using printf()
  • NVIDIA does not currently support printing for OpenCL kernels (though they do with CUDA/C)
• AMD requires the OpenCL extension cl_amd_printf to be enabled in the kernel
• printf() closely matches the definition found in the C99 standard
9. GPU Printf
• printf() can be used to print information about threads or to help track down bugs
• The following example prints information about threads trying to perform an improper memory access

    int myIdx = ... // index for addressing a matrix
    if (myIdx < 0 || myIdx >= rows || myIdx >= cols) {
        // get_global_id() returns size_t, so cast to int to match %d
        printf("Thread %d,%d: bad index (%d)\n",
               (int)get_global_id(1), (int)get_global_id(0), myIdx);
    }
10. GPU Printf
• printf() works by buffering output until the end of execution and transferring the output back to the host
  • It is important that a kernel completes in order to retrieve printed information
  • Commenting out code following printf() is a good technique if the kernel is crashing
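The buffering behavior described above explains why output from a crashing kernel never appears. The following host-only toy model illustrates the idea; it is not AMD's implementation, and the names device_log, log_printf, and run_and_collect are hypothetical.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define LOG_CAPACITY 256

/* Hypothetical stand-in for the device-side printf buffer. */
typedef struct {
    char   data[LOG_CAPACITY];
    size_t used;
} device_log;

/* Append formatted text to the buffer instead of printing it directly. */
void log_printf(device_log *log, const char *fmt, ...)
{
    va_list args;
    size_t room;
    int n;

    assert(log != NULL && log->used < LOG_CAPACITY);
    room = LOG_CAPACITY - log->used;   /* includes space for '\0' */
    va_start(args, fmt);
    n = vsnprintf(log->data + log->used, room, fmt, args);
    va_end(args);
    if (n > 0)
        log->used += ((size_t)n < room) ? (size_t)n : room - 1;
}

/* Run a fake kernel. The buffered text reaches host_out only if the
   kernel runs to completion. Returns the number of bytes delivered. */
size_t run_and_collect(int crash_midway, char *host_out, size_t host_size)
{
    device_log log;

    assert(host_out != NULL && host_size > 0);
    log.used = 0;
    log.data[0] = '\0';
    host_out[0] = '\0';

    log_printf(&log, "step 1 done\n");
    if (crash_midway)
        return 0;   /* the transfer below is never reached: text is lost */
    log_printf(&log, "step 2 done\n");

    /* Kernel finished: transfer the whole buffer back to the host. */
    strncpy(host_out, log.data, host_size - 1);
    host_out[host_size - 1] = '\0';
    return strlen(host_out);
}
```

In the model, a run that stops midway delivers nothing, even though "step 1 done" had already been buffered. Commenting out the code after the printf() call lets the kernel finish, so the buffered text is delivered.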
11. gDEBugger
• Developed by Graphic Remedy
  • Cost: not free
• Debugger, profiler, memory analyzer
• Integrated with AMD/ATI and NVIDIA performance counters
12. gDEBugger
• Displays information about OpenCL platforms and devices present in the system
13. gDEBugger
• Can step through OpenCL calls and view arguments
  • Links to programs, kernels, etc. when possible in the function call view
14. gDEBugger
• Automatically detects OpenCL errors and memory leaks
15. gDEBugger
• Displays contents of buffers and images present on OpenCL devices
  • View live
  • Export to disk
16. Summary
• GPU debugging is still immature
  • NVIDIA has a live debugger for Windows only
  • AMD and NVIDIA allow restrictive printing from the GPU
  • AMD allows code to be compiled and run with gdb on the CPU
  • Graphic Remedy (gDEBugger) provides online memory analysis and is integrated with performance counters, but cannot debug on a thread-by-thread basis
Editor's Notes
AMD’s OpenCL programming guide has a section dedicated to debugging with GDB