As medical technology develops, the volume of data to be processed is expanding rapidly, and computation time is growing with it because of factors such as 3D and 4D treatment planning, the increasing sophistication of MRI pulse sequences and the growing complexity of algorithms. Graphics processing units (GPUs) address these problems through high computation throughput, high memory bandwidth, support for floating-point arithmetic and low cost. Compute unified device architecture (CUDA) is a popular GPU programming model introduced by NVIDIA for parallel computing. This review briefly discusses the need for GPU CUDA computing in medical image analysis. The GPU performance of existing algorithms is analyzed and the computational gain is discussed. Open issues, hardware configurations and optimization principles of existing methods are also discussed. The survey concludes with a few optimization techniques for medical imaging algorithms on the GPU. Finally, the limitations and future scope of GPU programming are discussed.
1. Survey of using GPU CUDA programming model in medical image analysis
Armin shoughi
May 2019
2. Introduction
Computed tomography (CT), magnetic resonance imaging (MRI), positron emission
tomography (PET) and ultrasound are widely used medical imaging modalities. They
produce 2D, 3D and 4D images that guide diagnosis and treatment planning.
A GPU is a highly parallel, multithreaded, many-core processor with high memory
bandwidth, which makes it well suited to computationally demanding image processing.
3. Overview of GPU computing model – CUDA
• NVIDIA introduced its massively parallel architecture, compute unified
device architecture (CUDA), in 2006, which changed the GPU programming
model.
• CUDA is an extension of the C programming language.
• A CUDA program consists of phases that are executed on either the host
(CPU) or the device (GPU).
• The host code contains no data parallelism; the data-parallel phases run
on the device.
4. GPU - CUDA architecture
1. programming model
2. memory model
3. CUDA work flow
5. GPU - CUDA programming model
The CUDA programming model is built from three main parts that together
make full use of the GPU's computing capability: grids, blocks and
threads. A kernel launch creates a grid, a grid is divided into blocks,
and each block contains threads.
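
As a minimal sketch of how this hierarchy maps onto data, the plain-Python snippet below (a CPU stand-in, not CUDA code) enumerates a 1D grid and computes the global index each thread would obtain from CUDA's built-in `blockIdx.x`, `blockDim.x` and `threadIdx.x`. The function names and the launch sizes are illustrative assumptions, not part of the surveyed work.

```python
def global_thread_ids(grid_dim, block_dim):
    """Enumerate (block, thread, global index) for a 1D grid of 1D blocks.

    Mirrors the usual CUDA expression
        i = blockIdx.x * blockDim.x + threadIdx.x
    """
    ids = []
    for block_idx in range(grid_dim):          # blocks in the grid
        for thread_idx in range(block_dim):    # threads in each block
            ids.append((block_idx, thread_idx, block_idx * block_dim + thread_idx))
    return ids


def blocks_needed(n_items, block_dim):
    """Number of blocks so that grid_dim * block_dim >= n_items (ceiling division)."""
    return (n_items + block_dim - 1) // block_dim


if __name__ == "__main__":
    n = 10          # hypothetical number of pixels
    block_dim = 4   # hypothetical threads per block
    grid_dim = blocks_needed(n, block_dim)
    for b, t, i in global_thread_ids(grid_dim, block_dim):
        status = "works on element %d" % i if i < n else "idle (out of range)"
        print("block %d, thread %d: %s" % (b, t, status))
```

Because the grid is rounded up to whole blocks, the last block usually contains threads whose index falls past the data, which is why kernels normally guard with a bounds check.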
6. GPU - CUDA memory model
A GPU has M streaming multiprocessors (SMs), each containing N streaming
processor cores (SPs). Each thread can access variables in its own local
memory and registers. Registers offer the highest bandwidth, so frequently
accessed variables are stored there.
7. CUDA work flow model
A CUDA program starts executing on the host. Launching a kernel function
generates a large number of threads that exploit data parallelism.
Before the kernel is launched, all the necessary data is transferred from
host memory to allocated device memory.
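
The order of these steps can be sketched in plain Python, with two separate lists standing in for host and device memory. This is a sequential simulation under stated assumptions, not CUDA code: `scale_kernel`, `launch` and `run_on_device` are hypothetical names, and the final copy back to the host is the usual practice rather than something stated above. The comments name the CUDA runtime calls that typically perform each step.

```python
def scale_kernel(block_idx, thread_idx, block_dim, d_in, d_out, factor, n):
    """Work done by ONE thread: scale one element of the device buffer."""
    i = block_idx * block_dim + thread_idx
    if i < n:                       # guard threads that fall past the data
        d_out[i] = d_in[i] * factor


def launch(kernel, grid_dim, block_dim, *args):
    """Sequential stand-in for a kernel launch; a GPU runs these threads in parallel."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, thread_idx, block_dim, *args)


def run_on_device(h_in, factor, block_dim=4):
    n = len(h_in)
    # 1. Allocate device memory (cudaMalloc in CUDA C).
    d_in = [0.0] * n
    d_out = [0.0] * n
    # 2. Transfer the input from host to device (cudaMemcpy, host to device).
    d_in[:] = h_in
    # 3. Launch the kernel with enough blocks to cover all n elements.
    grid_dim = (n + block_dim - 1) // block_dim
    launch(scale_kernel, grid_dim, block_dim, d_in, d_out, factor, n)
    # 4. Transfer the result back to the host (cudaMemcpy, device to host).
    h_out = list(d_out)
    return h_out


if __name__ == "__main__":
    print(run_on_device([1.0, 2.0, 3.0, 4.0, 5.0], 2.0))
```

On real hardware the two transfers are often a significant part of the total run time, so keeping data on the device across several kernels is a common design choice.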
8. GPU computation for medical image analysis
• Denoising
• Registration
• Segmentation
• Visualization
9. Image denoising
Image denoising is an important task in medical imaging
applications, used to enhance the data and recover hidden details
from it.
Solving this problem can improve diagnosis and surgical
procedures.
The most commonly used denoising algorithms in the medical
domain are:
• adaptive filtering
• anisotropic diffusion
• bilateral filtering
• non-local means filter
10. Adaptive filtering
This denoising approach uses an adaptive filter introduced by
Knutsson et al. in 1983.
An adaptive filter is a self-modifying digital filter that adjusts its filter
coefficients in an attempt to minimize an error function.
Adaptive filtering is a direct method that does not need to be
iterated.
11. Anisotropic diffusion
The anisotropic diffusion filter is an iterative algorithm introduced by
Perona and Malik in 1987.
The algorithm aims to reduce image noise without affecting
significant parts of the image content, such as edges, lines or other
details.
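
A minimal sketch of one explicit Perona-Malik iteration on a 2D image follows, written in plain Python as a CPU reference. The exponential edge-stopping function, the parameter names `kappa` and `lam`, and their default values are assumptions chosen for illustration; the surveyed GPU implementations may differ.

```python
import math


def perona_malik_step(img, kappa=10.0, lam=0.2):
    """One explicit Perona-Malik iteration on a 2D image (list of rows).

    Each output pixel depends only on its four neighbors in the input,
    so on a GPU one thread could compute one pixel.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(h):
        for x in range(w):
            center = img[y][x]
            flux = 0.0
            for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w:      # no flux across the border
                    diff = img[ny][nx] - center
                    # Edge-stopping function: close to 0 for large differences (edges).
                    g = math.exp(-(diff / kappa) ** 2)
                    flux += g * diff
            out[y][x] = center + lam * flux
    return out


def anisotropic_diffusion(img, iterations=5, kappa=10.0, lam=0.2):
    """Repeat the step; every iteration reads the previous iteration's output."""
    for _ in range(iterations):
        img = perona_malik_step(img, kappa, lam)
    return img
```

Small intensity differences (noise) diffuse almost freely, while differences much larger than `kappa` (edges) are nearly untouched. For this four-neighbor explicit scheme, `lam` is usually kept at or below 0.25 for stability.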
12. Registration
Medical image registration determines the spatial
alignment between a reference image and a spatially
transformed (moving) image.
The two images may be acquired from the same modality
or from different modalities.
Two popular registration algorithms are:
• block matching algorithm (BMA)
• rigid transformation estimation (RTE)
13. Block matching algorithm
The block-matching algorithm (BMA) is the most popular
method for motion estimation from an image sequence.
It splits an image into blocks and estimates the
displacement of each block.
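
The idea can be sketched as an exhaustive search in plain Python. The sum of absolute differences (SAD) as the similarity measure, the block size, the search range and the function names are assumptions made for this illustration; other similarity measures are also used in practice.

```python
def sad(ref, mov, by, bx, dy, dx, size):
    """Sum of absolute differences between a reference block and a shifted block."""
    total = 0
    for y in range(size):
        for x in range(size):
            total += abs(ref[by + y][bx + x] - mov[by + y + dy][bx + x + dx])
    return total


def block_matching(ref, mov, size=4, search=2):
    """Return {(block_row, block_col): (dy, dx)} minimizing SAD for each block."""
    h, w = len(ref), len(ref[0])
    vectors = {}
    for by in range(0, h - size + 1, size):
        for bx in range(0, w - size + 1, size):
            best = None
            for dy in range(-search, search + 1):
                for dx in range(-search, search + 1):
                    # Skip candidates that would read outside the moving image.
                    if not (0 <= by + dy and by + dy + size <= h
                            and 0 <= bx + dx and bx + dx + size <= w):
                        continue
                    cost = sad(ref, mov, by, bx, dy, dx, size)
                    # On equal cost, prefer the smaller displacement.
                    candidate = (cost, abs(dy) + abs(dx), dy, dx)
                    if best is None or candidate < best:
                        best = candidate
            vectors[(by, bx)] = (best[2], best[3])
    return vectors
```

Each block's search is independent of every other block's, which is what makes the method attractive for a GPU: one thread (or one thread block) can be assigned to each image block.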
14. Rigid transformation estimation
Rigid transformation estimation (RTE) is one of the simplest forms of image
registration in medical imaging. RTE finds the transformation between the
reference and moving images using the displacement vectors given by
the BMA. A rigid transformation (T) represents the linear and/or angular
displacement of a rigid body.
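
As a hedged illustration, the sketch below fits a 2D rigid transformation (one rotation angle plus a translation) to a set of block displacement vectors by ordinary least squares. The 2D restriction, the plain least-squares criterion and the name `estimate_rigid_2d` are assumptions for brevity; the surveyed methods work on 3D volumes and may use a more robust estimator.

```python
import math


def estimate_rigid_2d(points, vectors):
    """Least-squares 2D rigid fit: find theta, (tx, ty) with q ~ R(theta) p + t.

    points  : list of (x, y) block centers in the reference image
    vectors : list of (vx, vy) displacements for those blocks, so q = p + v
    """
    n = len(points)
    targets = [(px + vx, py + vy) for (px, py), (vx, vy) in zip(points, vectors)]
    # Centroids of both point sets.
    pcx = sum(p[0] for p in points) / n
    pcy = sum(p[1] for p in points) / n
    qcx = sum(q[0] for q in targets) / n
    qcy = sum(q[1] for q in targets) / n
    # Accumulate the dot and cross terms of the centered point pairs.
    s_dot = s_cross = 0.0
    for (px, py), (qx, qy) in zip(points, targets):
        ax, ay = px - pcx, py - pcy
        bx, by = qx - qcx, qy - qcy
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation maps the rotated reference centroid onto the target centroid.
    tx = qcx - (c * pcx - s * pcy)
    ty = qcy - (s * pcx + c * pcy)
    return theta, (tx, ty)
```

The sums over blocks are reductions, so a GPU version would typically compute per-block terms in parallel and then combine them.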
15. Segmentation
Many segmentation methods are computationally expensive
when run on the large datasets produced by medical imaging
modalities.
Image segmentation in medical imaging is often used to
segment brain structures, blood vessels, tumors and bones.
Widely used segmentation methods:
• thresholding
• region growing
16. Thresholding
Thresholding labels each pixel or voxel by comparing it with
one or more threshold values. It is the simplest technique to
parallelize, because data parallelism follows directly from
using one thread per voxel in a 3D image or one thread per
pixel in a 2D image.
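
A minimal CPU sketch of this per-pixel rule is given below in plain Python. The labeling convention (the label is the number of thresholds a value reaches) and the function names are assumptions for illustration.

```python
def threshold_pixel(value, thresholds):
    """Label for one pixel or voxel: the number of thresholds it reaches."""
    return sum(1 for t in thresholds if value >= t)


def threshold_image(img, thresholds):
    """Apply threshold_pixel to every pixel; each call is independent of the others."""
    return [[threshold_pixel(v, thresholds) for v in row] for row in img]


if __name__ == "__main__":
    image = [[10, 120], [200, 50]]           # hypothetical intensities
    print(threshold_image(image, [100]))      # binary segmentation
    print(threshold_image(image, [100, 180])) # three classes
```

Since `threshold_pixel` reads one value and writes one label, its body corresponds directly to what a single GPU thread would execute.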
17. Region growing
Region growing is a commonly used medical image
segmentation technique. It starts from an initial seed
point inside the object, chosen either manually or
automatically using prior knowledge, and grows the region
by adding neighboring pixels that satisfy a similarity criterion.
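
One way to sketch region growing on a 2D image is a breadth-first search from the seed, shown here in plain Python. The similarity criterion (intensity within `tol` of the seed intensity), the 4-connectivity and the name `region_grow` are assumptions for this illustration.

```python
from collections import deque


def region_grow(img, seed, tol):
    """Return the set of (row, col) pixels connected to seed within tol of its intensity."""
    h, w = len(img), len(img[0])
    sy, sx = seed
    seed_val = img[sy][sx]
    region = {seed}
    frontier = deque([seed])
    while frontier:
        y, x = frontier.popleft()
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):   # 4-connected neighbors
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w
                    and (ny, nx) not in region
                    and abs(img[ny][nx] - seed_val) <= tol):
                region.add((ny, nx))
                frontier.append((ny, nx))
    return region


if __name__ == "__main__":
    image = [[10, 11, 50],
             [12, 60, 50],
             [10, 10, 52]]                   # hypothetical intensities
    print(sorted(region_grow(image, (0, 0), 5)))
```

Unlike thresholding, the result for a pixel depends on which neighbors were accepted earlier, so parallel versions generally process the whole frontier at once and iterate until no new pixels are added.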
18. Visualization
Medical image processing combined with visualization offers new,
computer-assisted ways to diagnose and to evaluate the effect of a
patient's treatment more accurately and reliably.
Image visualization methods fall into two groups:
• surface rendering
• volume rendering