SlideShare uma empresa Scribd logo
1 de 16
Debugging Perhaad Mistry & Dana Schaa, Northeastern University Computer Architecture Research Lab, with Benedict R. Gaster, AMD © 2011
Instructor Notes GPU debugging is still immature, but being improved daily.  You should definitely check to see the latest options available before giving this lecture. 2 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
Debugging Techniques Compiling for x86 CPU Debugging with GDB GPU printf Live debuggers Parallel Nsight gDEBugger 3 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
CPU Debugging OpenCL allows the same code to run on different types of devices Compiling to run on a CPU provides some extra facilities for debugging Additional forms of IO (such as writing to disk) are still not available from the kernel AMD’s OpenCL implementation recognizes any x86 processor as a target device Simply select the CPU as the target device when executing the program NVIDIA’s OpenCL implementation can support compiling to x86 CPUs if AMD’s installable client driver is installed 4 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
CPU Debugging with GDB Setting up for GDB Pass the compiler the “-g” flag Pass “-g” to clBuildProgram() Set an environment variable CPU_COMPILER_OPTIONS=“-g” Avoid non-deterministic execution by setting an environment variable CPU_MAX_COMPUTE_UNITS=1 5 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
CPU Debugging with GDB Run gdb with the OpenCL executable > gdba.out Breakpoints can be set by line number, function name, or kernel name To break at the kernel hello within gdb, enter: (gdb) b __OpenCL_hello_kernel The prefix and suffix are required for kernel names OpenCL kernel symbols are not known until the kernel is loaded, so setting a breakpoint at clEnqueueNDRangeKernel() is helpful 	(gdb) bclEnqueueNDRangeKernel 6 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
CPU Debugging with GDB To break on a certain thread, introduce a conditional statement in the kernel and set the breakpoint inside the conditional body Can use gdb commands to view thread state at this point ... if(get_global_id(1) == 20 &&    get_global_id(0) == 34) {    ;  // Set breakpoint on this line } 7 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
GPU Printf AMD GPUs support printing during execution using printf() NVIDIA does not currently support printing for OpenCL kernels (though they do with CUDA/C) AMD requires the OpenCL extension cl_amd_printf to be enabled in the kernel printf() closely matches the definition found in the C99 standard 8 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
GPU Printf printf() can be used to print information about threads or check help track down bugs The following example prints information about threads trying to perform an improper memory access intmyIdx = ... // index for addressing a matrix if(myIdx < 0 || myIdx >= rows || myIdx >= cols) { printf(“Thread %d,%d: bad index (%d)”,  		get_global_id(1), get_global_id(0), myIdx)); 	} 9 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
GPU Printf printf() works by buffering output until the end of execution and transferring the output back to the host It is important that a kernel completes in order to retrieve printed information Commenting out code following printf() is a good technique if the kernel is crashing 10 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
gDEBugger Developed by Graphic Remedy Cost: not free Debugger, profiler, memory analyzer Integrated with AMD/ATI and NVIDIA performance counters  11 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
gDEBugger Displays information about OpenCL platforms and devices present in the system 12 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
gDEBugger Can step through OpenCL calls, and view arguments Links to programs, kernels, etc. when possible in the function call view 13 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
gDEBugger Automatically detects OpenCL errors and memory leaks 14 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
gDEBugger Displays contents of buffers and images present on OpenCL devices View live Export to disk 15 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
Summary GPU debugging is still immature NVIDIA has a live debugger for Windows only AMD and NVIDIA allow restrictive printing from the GPU AMD allows code to be compiled and run with gdb on the CPU Graphic Remedy (gDEBugger) provides online memory analysis and is integrated with performance counters, but cannot debug on a thread-by-thread basis  16 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011

Mais conteúdo relacionado

Mais procurados

Profiling deep learning network using NVIDIA nsight systems
Profiling deep learning network using NVIDIA nsight systemsProfiling deep learning network using NVIDIA nsight systems
Profiling deep learning network using NVIDIA nsight systemsJack (Jaegeun) Han
 
Development of Signal Processing Algorithms using OpenCL for FPGA based Archi...
Development of Signal Processing Algorithms using OpenCL for FPGA based Archi...Development of Signal Processing Algorithms using OpenCL for FPGA based Archi...
Development of Signal Processing Algorithms using OpenCL for FPGA based Archi...Pradeep Singh
 
Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...
Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...
Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...Intel® Software
 
Kernel Recipes 2018 - XDP: a new fast and programmable network layer - Jesper...
Kernel Recipes 2018 - XDP: a new fast and programmable network layer - Jesper...Kernel Recipes 2018 - XDP: a new fast and programmable network layer - Jesper...
Kernel Recipes 2018 - XDP: a new fast and programmable network layer - Jesper...Anne Nicolas
 
PT-4102, Simulation, Compilation and Debugging of OpenCL on the AMD Southern ...
PT-4102, Simulation, Compilation and Debugging of OpenCL on the AMD Southern ...PT-4102, Simulation, Compilation and Debugging of OpenCL on the AMD Southern ...
PT-4102, Simulation, Compilation and Debugging of OpenCL on the AMD Southern ...AMD Developer Central
 
Webinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop IntelligenceWebinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop IntelligenceAMD Developer Central
 
助教が吼える! 各界の若手研究者大集合「ハードウェアはやわらかい」
助教が吼える! 各界の若手研究者大集合「ハードウェアはやわらかい」助教が吼える! 各界の若手研究者大集合「ハードウェアはやわらかい」
助教が吼える! 各界の若手研究者大集合「ハードウェアはやわらかい」Shinya Takamaeda-Y
 
Tdd with python unittest for embedded c
Tdd with python unittest for embedded cTdd with python unittest for embedded c
Tdd with python unittest for embedded cBenux Wei
 
AGDK tutorial step by step
AGDK tutorial step by stepAGDK tutorial step by step
AGDK tutorial step by stepJungsoo Nam
 
2015 10-30 面試心得分享
2015 10-30 面試心得分享2015 10-30 面試心得分享
2015 10-30 面試心得分享Hung Liu
 
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...AMD Developer Central
 
xapp744-HIL-Zynq-7000
xapp744-HIL-Zynq-7000xapp744-HIL-Zynq-7000
xapp744-HIL-Zynq-7000Umang Parekh
 
Intel® Graphics Performance Analyzers
Intel® Graphics Performance AnalyzersIntel® Graphics Performance Analyzers
Intel® Graphics Performance AnalyzersIntel® Software
 
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...Intel® Software
 
Optimizing Total War*: WARHAMMER II
Optimizing Total War*: WARHAMMER IIOptimizing Total War*: WARHAMMER II
Optimizing Total War*: WARHAMMER IIIntel® Software
 
Cloud Deep Learning Chips Training & Inference
Cloud Deep Learning Chips Training & InferenceCloud Deep Learning Chips Training & Inference
Cloud Deep Learning Chips Training & InferenceMr. Vengineer
 

Mais procurados (20)

Profiling deep learning network using NVIDIA nsight systems
Profiling deep learning network using NVIDIA nsight systemsProfiling deep learning network using NVIDIA nsight systems
Profiling deep learning network using NVIDIA nsight systems
 
The GPGPU Continuum
The GPGPU ContinuumThe GPGPU Continuum
The GPGPU Continuum
 
Development of Signal Processing Algorithms using OpenCL for FPGA based Archi...
Development of Signal Processing Algorithms using OpenCL for FPGA based Archi...Development of Signal Processing Algorithms using OpenCL for FPGA based Archi...
Development of Signal Processing Algorithms using OpenCL for FPGA based Archi...
 
Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...
Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...
Fast Insights to Optimized Vectorization and Memory Using Cache-aware Rooflin...
 
Kernel Recipes 2018 - XDP: a new fast and programmable network layer - Jesper...
Kernel Recipes 2018 - XDP: a new fast and programmable network layer - Jesper...Kernel Recipes 2018 - XDP: a new fast and programmable network layer - Jesper...
Kernel Recipes 2018 - XDP: a new fast and programmable network layer - Jesper...
 
PT-4102, Simulation, Compilation and Debugging of OpenCL on the AMD Southern ...
PT-4102, Simulation, Compilation and Debugging of OpenCL on the AMD Southern ...PT-4102, Simulation, Compilation and Debugging of OpenCL on the AMD Southern ...
PT-4102, Simulation, Compilation and Debugging of OpenCL on the AMD Southern ...
 
Webinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop IntelligenceWebinar: Whats New in Java 8 with Develop Intelligence
Webinar: Whats New in Java 8 with Develop Intelligence
 
助教が吼える! 各界の若手研究者大集合「ハードウェアはやわらかい」
助教が吼える! 各界の若手研究者大集合「ハードウェアはやわらかい」助教が吼える! 各界の若手研究者大集合「ハードウェアはやわらかい」
助教が吼える! 各界の若手研究者大集合「ハードウェアはやわらかい」
 
Tdd with python unittest for embedded c
Tdd with python unittest for embedded cTdd with python unittest for embedded c
Tdd with python unittest for embedded c
 
NVIDIA CUDA
NVIDIA CUDANVIDIA CUDA
NVIDIA CUDA
 
AGDK tutorial step by step
AGDK tutorial step by stepAGDK tutorial step by step
AGDK tutorial step by step
 
2015 10-30 面試心得分享
2015 10-30 面試心得分享2015 10-30 面試心得分享
2015 10-30 面試心得分享
 
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
The Small Batch (and other) solutions in Mantle API, by Guennadi Riguer, Mant...
 
Debugging CUDA applications
Debugging CUDA applicationsDebugging CUDA applications
Debugging CUDA applications
 
Syntutic
SyntuticSyntutic
Syntutic
 
xapp744-HIL-Zynq-7000
xapp744-HIL-Zynq-7000xapp744-HIL-Zynq-7000
xapp744-HIL-Zynq-7000
 
Intel® Graphics Performance Analyzers
Intel® Graphics Performance AnalyzersIntel® Graphics Performance Analyzers
Intel® Graphics Performance Analyzers
 
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...
Bring Intelligent Motion Using Reinforcement Learning Engines | SIGGRAPH 2019...
 
Optimizing Total War*: WARHAMMER II
Optimizing Total War*: WARHAMMER IIOptimizing Total War*: WARHAMMER II
Optimizing Total War*: WARHAMMER II
 
Cloud Deep Learning Chips Training & Inference
Cloud Deep Learning Chips Training & InferenceCloud Deep Learning Chips Training & Inference
Cloud Deep Learning Chips Training & Inference
 

Destaque (15)

Visual Studio 2010 Testing & Lab Management Tools
Visual Studio 2010 Testing & Lab Management ToolsVisual Studio 2010 Testing & Lab Management Tools
Visual Studio 2010 Testing & Lab Management Tools
 
Lec05 buffers basic_examples
Lec05 buffers basic_examplesLec05 buffers basic_examples
Lec05 buffers basic_examples
 
Lec08 optimizations
Lec08 optimizationsLec08 optimizations
Lec08 optimizations
 
Lec02 03 opencl_intro
Lec02 03 opencl_introLec02 03 opencl_intro
Lec02 03 opencl_intro
 
Lec11 timing
Lec11 timingLec11 timing
Lec11 timing
 
Lec13 multidevice
Lec13 multideviceLec13 multidevice
Lec13 multidevice
 
Lec09 nbody-optimization
Lec09 nbody-optimizationLec09 nbody-optimization
Lec09 nbody-optimization
 
Notes on Debugging
Notes on DebuggingNotes on Debugging
Notes on Debugging
 
Lec07 threading hw
Lec07 threading hwLec07 threading hw
Lec07 threading hw
 
Debugging
DebuggingDebugging
Debugging
 
Online computer lab management system
Online computer lab management systemOnline computer lab management system
Online computer lab management system
 
Debugging
DebuggingDebugging
Debugging
 
GPU - Basic Working
GPU - Basic WorkingGPU - Basic Working
GPU - Basic Working
 
Lec04 gpu architecture
Lec04 gpu architectureLec04 gpu architecture
Lec04 gpu architecture
 
Welcome To Your Computer Lab Ppt
Welcome To Your Computer Lab PptWelcome To Your Computer Lab Ppt
Welcome To Your Computer Lab Ppt
 

Semelhante a Lec12 debugging

Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)Rob Gillen
 
PGI Compilers & Tools Update- March 2018
PGI Compilers & Tools Update- March 2018PGI Compilers & Tools Update- March 2018
PGI Compilers & Tools Update- March 2018NVIDIA
 
gpuprogram_lecture,architecture_designsn
gpuprogram_lecture,architecture_designsngpuprogram_lecture,architecture_designsn
gpuprogram_lecture,architecture_designsnARUNACHALAM468781
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)Kohei KaiGai
 
Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.J On The Beach
 
Parallel and Distributed Computing Chapter 8
Parallel and Distributed Computing Chapter 8Parallel and Distributed Computing Chapter 8
Parallel and Distributed Computing Chapter 8AbdullahMunir32
 
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...AMD Developer Central
 
clCaffe*: Unleashing the Power of Intel Graphics for Deep Learning Acceleration
clCaffe*: Unleashing the Power of Intel Graphics for Deep Learning AccelerationclCaffe*: Unleashing the Power of Intel Graphics for Deep Learning Acceleration
clCaffe*: Unleashing the Power of Intel Graphics for Deep Learning AccelerationIntel® Software
 
Python и программирование GPU (Ивашкевич Глеб)
Python и программирование GPU (Ивашкевич Глеб)Python и программирование GPU (Ивашкевич Глеб)
Python и программирование GPU (Ивашкевич Глеб)IT-Доминанта
 
LAS16-403 - GDB Linux Kernel Awareness
LAS16-403 - GDB Linux Kernel Awareness LAS16-403 - GDB Linux Kernel Awareness
LAS16-403 - GDB Linux Kernel Awareness Peter Griffin
 
LAS16-403: GDB Linux Kernel Awareness
LAS16-403: GDB Linux Kernel AwarenessLAS16-403: GDB Linux Kernel Awareness
LAS16-403: GDB Linux Kernel AwarenessLinaro
 
Introduction to Accelerators
Introduction to AcceleratorsIntroduction to Accelerators
Introduction to AcceleratorsDilum Bandara
 
Revisiting Co-Processing for Hash Joins on the Coupled Cpu-GPU Architecture
Revisiting Co-Processing for Hash Joins on the CoupledCpu-GPU ArchitectureRevisiting Co-Processing for Hash Joins on the CoupledCpu-GPU Architecture
Revisiting Co-Processing for Hash Joins on the Coupled Cpu-GPU Architecturemohamedragabslideshare
 

Semelhante a Lec12 debugging (20)

Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)
 
GPU Programming with Java
GPU Programming with JavaGPU Programming with Java
GPU Programming with Java
 
Cuda intro
Cuda introCuda intro
Cuda intro
 
PGI Compilers & Tools Update- March 2018
PGI Compilers & Tools Update- March 2018PGI Compilers & Tools Update- March 2018
PGI Compilers & Tools Update- March 2018
 
Cuda materials
Cuda materialsCuda materials
Cuda materials
 
gpuprogram_lecture,architecture_designsn
gpuprogram_lecture,architecture_designsngpuprogram_lecture,architecture_designsn
gpuprogram_lecture,architecture_designsn
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)GPGPU Accelerates PostgreSQL (English)
GPGPU Accelerates PostgreSQL (English)
 
Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.
 
Parallel and Distributed Computing Chapter 8
Parallel and Distributed Computing Chapter 8Parallel and Distributed Computing Chapter 8
Parallel and Distributed Computing Chapter 8
 
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
 
clCaffe*: Unleashing the Power of Intel Graphics for Deep Learning Acceleration
clCaffe*: Unleashing the Power of Intel Graphics for Deep Learning AccelerationclCaffe*: Unleashing the Power of Intel Graphics for Deep Learning Acceleration
clCaffe*: Unleashing the Power of Intel Graphics for Deep Learning Acceleration
 
Python и программирование GPU (Ивашкевич Глеб)
Python и программирование GPU (Ивашкевич Глеб)Python и программирование GPU (Ивашкевич Глеб)
Python и программирование GPU (Ивашкевич Глеб)
 
LAS16-403 - GDB Linux Kernel Awareness
LAS16-403 - GDB Linux Kernel Awareness LAS16-403 - GDB Linux Kernel Awareness
LAS16-403 - GDB Linux Kernel Awareness
 
LAS16-403: GDB Linux Kernel Awareness
LAS16-403: GDB Linux Kernel AwarenessLAS16-403: GDB Linux Kernel Awareness
LAS16-403: GDB Linux Kernel Awareness
 
Introduction to Accelerators
Introduction to AcceleratorsIntroduction to Accelerators
Introduction to Accelerators
 
Revisiting Co-Processing for Hash Joins on the Coupled Cpu-GPU Architecture
Revisiting Co-Processing for Hash Joins on the CoupledCpu-GPU ArchitectureRevisiting Co-Processing for Hash Joins on the CoupledCpu-GPU Architecture
Revisiting Co-Processing for Hash Joins on the Coupled Cpu-GPU Architecture
 

Último

Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhisoniya singh
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsEnterprise Knowledge
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 

Último (20)

Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | DelhiFULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
FULL ENJOY 🔝 8264348440 🔝 Call Girls in Diplomatic Enclave | Delhi
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
IAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI SolutionsIAC 2024 - IA Fast Track to Search Focused AI Solutions
IAC 2024 - IA Fast Track to Search Focused AI Solutions
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 

Lec12 debugging

  • 1. Debugging Perhaad Mistry & Dana Schaa, Northeastern University Computer Architecture Research Lab, with Benedict R. Gaster, AMD © 2011
  • 2. Instructor Notes GPU debugging is still immature, but being improved daily. You should definitely check to see the latest options available before giving this lecture. 2 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
  • 3. Debugging Techniques Compiling for x86 CPU Debugging with GDB GPU printf Live debuggers Parallel Nsight gDEBugger 3 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
  • 4. CPU Debugging OpenCL allows the same code to run on different types of devices Compiling to run on a CPU provides some extra facilities for debugging Additional forms of IO (such as writing to disk) are still not available from the kernel AMD’s OpenCL implementation recognizes any x86 processor as a target device Simply select the CPU as the target device when executing the program NVIDIA’s OpenCL implementation can support compiling to x86 CPUs if AMD’s installable client driver is installed 4 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
  • 5. CPU Debugging with GDB Setting up for GDB Pass the compiler the “-g” flag Pass “-g” to clBuildProgram() Set an environment variable CPU_COMPILER_OPTIONS=“-g” Avoid non-deterministic execution by setting an environment variable CPU_MAX_COMPUTE_UNITS=1 5 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
  • 6. CPU Debugging with GDB Run gdb with the OpenCL executable > gdba.out Breakpoints can be set by line number, function name, or kernel name To break at the kernel hello within gdb, enter: (gdb) b __OpenCL_hello_kernel The prefix and suffix are required for kernel names OpenCL kernel symbols are not known until the kernel is loaded, so setting a breakpoint at clEnqueueNDRangeKernel() is helpful (gdb) bclEnqueueNDRangeKernel 6 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
  • 7. CPU Debugging with GDB To break on a certain thread, introduce a conditional statement in the kernel and set the breakpoint inside the conditional body Can use gdb commands to view thread state at this point ... if(get_global_id(1) == 20 && get_global_id(0) == 34) { ; // Set breakpoint on this line } 7 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
  • 8. GPU Printf AMD GPUs support printing during execution using printf() NVIDIA does not currently support printing for OpenCL kernels (though they do with CUDA/C) AMD requires the OpenCL extension cl_amd_printf to be enabled in the kernel printf() closely matches the definition found in the C99 standard 8 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
  • 9. GPU Printf printf() can be used to print information about threads or check help track down bugs The following example prints information about threads trying to perform an improper memory access intmyIdx = ... // index for addressing a matrix if(myIdx < 0 || myIdx >= rows || myIdx >= cols) { printf(“Thread %d,%d: bad index (%d)”, get_global_id(1), get_global_id(0), myIdx)); } 9 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
  • 10. GPU Printf printf() works by buffering output until the end of execution and transferring the output back to the host It is important that a kernel completes in order to retrieve printed information Commenting out code following printf() is a good technique if the kernel is crashing 10 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
  • 11. gDEBugger Developed by Graphic Remedy Cost: not free Debugger, profiler, memory analyzer Integrated with AMD/ATI and NVIDIA performance counters 11 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
  • 12. gDEBugger Displays information about OpenCL platforms and devices present in the system 12 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
  • 13. gDEBugger Can step through OpenCL calls, and view arguments Links to programs, kernels, etc. when possible in the function call view 13 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
  • 14. gDEBugger Automatically detects OpenCL errors and memory leaks 14 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
  • 15. gDEBugger Displays contents of buffers and images present on OpenCL devices View live Export to disk 15 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011
  • 16. Summary GPU debugging is still immature NVIDIA has a live debugger for Windows only AMD and NVIDIA allow restrictive printing from the GPU AMD allows code to be compiled and run with gdb on the CPU Graphic Remedy (gDEBugger) provides online memory analysis and is integrated with performance counters, but cannot debug on a thread-by-thread basis 16 Perhaad Mistry & Dana Schaa, Northeastern Univ Computer Architecture Research Lab, with Ben Gaster, AMD © 2011

Notas do Editor

  1. AMD’s OpenCL programming guide has a section dedicated to debugging with GDB