SlideShare uma empresa Scribd logo
1 de 33
Rohit khatana
Parallel Computing With GPU
Rohit Khatana
4344
Seminar guide
Prof. Aparna Joshi
ARMY INSTITUE OF TECHNOLOGY
Rohit khatana
Content
1.What is parallel computing?
2.Gpu
3.CUDA
4.Application
Rohit khatana
What is Parallel Computing?
Performing or Executing a task/program
on more than one machine or processor.
In simple way dividing a job in a group.
Rohit khatana
For example
Rohit khatana
What kind of processors will we
build?
(major design constraint: power)
Cpu: - Complex Control Hardware
Flexibility + Performance
Expensive in Terms of Power
GPU: - Simpler Control Hardware
More H/W for Computation
Potentially More power Efficient (ops/watt)
More Restrictive Programming Model
Modern GPU has more ALU’s
Graphics Logical Pipeline
• The GPU receives geometry information
from the CPU as an input and provides
a picture as an output
• Let’s see how that happens
Host Interface
• The host interface is the communication bridge
between the CPU and the GPU
• It receives commands from the CPU and also
pulls geometry information from system
memory
• It outputs a stream of vertices in object space
with all their associated information (normals,
texture coordinates, per vertex color etc)
Vertex Processing
• The vertex processing stage receives vertices from the
host interface in object space and outputs them in screen
space
• This may be a simple linear transformation, or a complex
operation involving morphing effects
• No new vertices are created in this stage, and no
vertices are discarded (input/output has 1:1 mapping)
Triangle Setup
• In this stage geometry information becomes raster
information (screen space geometry is the input,
pixels are the output)
• Prior to rasterization, triangles that are backfacing
or are located outside the viewing frustrum are
rejected
Triangle Setup
• A fragment is generated if and only if its center
is inside the triangle
• Every fragment generated has its attributes
computed to be the perspective correct
interpolation of the three vertices that make up
the triangle
Fragment Processing
• Each fragment provided by triangle setup is fed
into fragment processing as a set of attributes
(position, normal, texcoord etc), which are used to
compute the final color for this pixel
• The computations taking place here include
texture mapping and math operations
Memory Interface
• Fragments provided by the last step are written to
the framebuffer.
• Before the final write occurs, some fragments are
rejected by the zbuffer, stencil and alpha tests
Memory Model of GPU
Basic Architecture of GPU
CUDA(compute unified device
Architecture)
• CUDA is a parallel computing platform and
programming model.
• Created by NVIDIA and implemented by the
GPUs that they produce.
CUDA
• CUDA gives developers access to the
virtual instruction set and memory of the
parallel computational elements in CUDA
GPUs.
• CUDA supports standard programming
languages , including C++,python , Fortran.
Programming Model
• Threads are organized into blocks.
• Blocks are organized into a grid.
• A multiprocessor executes one block at a
time.
• A warp is the set of threads executed in
parallel.
• 32 threads in a warp.
Typical CUDA/GPU Program
1. CPU allocates storage on GPU (cudaMalloc).
2. CPU copies input data from CPU GPU
(cudaMemcpy).
3. CPU launches kernel on GPU to process the data.
(Kernel function<<<no of threads>>>(parameter))
4. CPU copies results back to CPU from GPU
(cudaMemcpy)
simply squaring the elements of an array
__global__ void square(float * d_out, float * d_in){
// Todo: Fill in this function
int idx = threadIdx.x;
float f = d_in[idx];
d_out[idx] = f*f
}
theadIdx.x =gives the current thread number
GPU/CUDA programming
Main program
int main(int argc, char **argv){
……………………
…………………….
float h_out[ARRAY_SIZE];
//declare GPU pointer
float * d_in;
float * d_out;
// allocate GPU memory
cudaMalloc( (void*) &d_in, ARRAY_BYTES);
cudaMalloc( (void*) &d_out, ARRAY_BYTES);
Main program(cont.)
// transfer the array to the GPU
cudaMemcpy(d_in, h_in, ARRAY_BYTES, cudaMemcpyHostToDevice);
// launch the kernel
square<<<1, ARRAY_SIZE>>>(d_out, d_in);
// copy back the result array to the CPU
cudaMemcpy(h_out, d_out, ARRAY_BYTES, cudaMemcpyDeviceToHost);
// print out the resulting array
for (int i =0; i < ARRAY_SIZE; i++) {
printf("%f", h_out[i]);
}
Programming Model
GPU vs CPU Code
Conclusion
• GPU computing is a good choice for fine-
grained data-parallel programs with limited
communication
• GPU computing is not so good for coarse-
grained program with a lot of communication
• The GPU has become a co-processor to the
CPU.
References
• 1.[‘IEEE’] Accelerating image processing capability using
graphics processors Jason. Dalea, Gordon. Caina, Brad.
ZellbaVision4ce Ltd. Crowthorne Enterprise Center,
Crowthorne, Berkshire, UK, RG45 6AWbVision4ce LLC
Severna Park, USA, MD2114
•
• 2.Udacity cs344,Intro to parallel Programming with GPU
• 3.Wikipedia
• 4.Nividia docs

Mais conteúdo relacionado

Mais procurados

GRAPHICS PROCESSING UNIT (GPU)
GRAPHICS PROCESSING UNIT (GPU)GRAPHICS PROCESSING UNIT (GPU)
GRAPHICS PROCESSING UNIT (GPU)self employed
 
GPU - An Introduction
GPU - An IntroductionGPU - An Introduction
GPU - An IntroductionDhan V Sagar
 
CPU vs. GPU presentation
CPU vs. GPU presentationCPU vs. GPU presentation
CPU vs. GPU presentationVishal Singh
 
GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)Fatima Qayyum
 
Graphics processing unit (GPU)
Graphics processing unit (GPU)Graphics processing unit (GPU)
Graphics processing unit (GPU)Amal R
 
Graphic Processing Unit (GPU)
Graphic Processing Unit (GPU)Graphic Processing Unit (GPU)
Graphic Processing Unit (GPU)Jafar Khan
 
Graphic Processing Unit
Graphic Processing UnitGraphic Processing Unit
Graphic Processing UnitKamran Ashraf
 
Gpu presentation
Gpu presentationGpu presentation
Gpu presentationJosiah Lund
 
Introduction to parallel computing using CUDA
Introduction to parallel computing using CUDAIntroduction to parallel computing using CUDA
Introduction to parallel computing using CUDAMartin Peniak
 
Graphics Processing Unit - GPU
Graphics Processing Unit - GPUGraphics Processing Unit - GPU
Graphics Processing Unit - GPUChetan Gole
 
Deep learning: Hardware Landscape
Deep learning: Hardware LandscapeDeep learning: Hardware Landscape
Deep learning: Hardware LandscapeGrigory Sapunov
 
Architecture of TPU, GPU and CPU
Architecture of TPU, GPU and CPUArchitecture of TPU, GPU and CPU
Architecture of TPU, GPU and CPUGlobalLogic Ukraine
 
Graphics processing unit ppt
Graphics processing unit pptGraphics processing unit ppt
Graphics processing unit pptSandeep Singh
 

Mais procurados (20)

GRAPHICS PROCESSING UNIT (GPU)
GRAPHICS PROCESSING UNIT (GPU)GRAPHICS PROCESSING UNIT (GPU)
GRAPHICS PROCESSING UNIT (GPU)
 
GPU - An Introduction
GPU - An IntroductionGPU - An Introduction
GPU - An Introduction
 
Introduction to GPU Programming
Introduction to GPU ProgrammingIntroduction to GPU Programming
Introduction to GPU Programming
 
CPU vs. GPU presentation
CPU vs. GPU presentationCPU vs. GPU presentation
CPU vs. GPU presentation
 
GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)GPU Architecture NVIDIA (GTX GeForce 480)
GPU Architecture NVIDIA (GTX GeForce 480)
 
Graphics processing unit (GPU)
Graphics processing unit (GPU)Graphics processing unit (GPU)
Graphics processing unit (GPU)
 
Graphic Processing Unit (GPU)
Graphic Processing Unit (GPU)Graphic Processing Unit (GPU)
Graphic Processing Unit (GPU)
 
Graphic Processing Unit
Graphic Processing UnitGraphic Processing Unit
Graphic Processing Unit
 
Gpu presentation
Gpu presentationGpu presentation
Gpu presentation
 
Introduction to parallel computing using CUDA
Introduction to parallel computing using CUDAIntroduction to parallel computing using CUDA
Introduction to parallel computing using CUDA
 
Lec04 gpu architecture
Lec04 gpu architectureLec04 gpu architecture
Lec04 gpu architecture
 
Cuda
CudaCuda
Cuda
 
Graphics processing unit
Graphics processing unitGraphics processing unit
Graphics processing unit
 
NVIDIA CUDA
NVIDIA CUDANVIDIA CUDA
NVIDIA CUDA
 
Graphics Processing Unit - GPU
Graphics Processing Unit - GPUGraphics Processing Unit - GPU
Graphics Processing Unit - GPU
 
Deep learning: Hardware Landscape
Deep learning: Hardware LandscapeDeep learning: Hardware Landscape
Deep learning: Hardware Landscape
 
CPU vs GPU Comparison
CPU  vs GPU ComparisonCPU  vs GPU Comparison
CPU vs GPU Comparison
 
Architecture of TPU, GPU and CPU
Architecture of TPU, GPU and CPUArchitecture of TPU, GPU and CPU
Architecture of TPU, GPU and CPU
 
Cuda Architecture
Cuda ArchitectureCuda Architecture
Cuda Architecture
 
Graphics processing unit ppt
Graphics processing unit pptGraphics processing unit ppt
Graphics processing unit ppt
 

Semelhante a Parallel computing with Gpu

Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Monte Carlo on GPUs
Monte Carlo on GPUsMonte Carlo on GPUs
Monte Carlo on GPUsfcassier
 
Introduction to Accelerators
Introduction to AcceleratorsIntroduction to Accelerators
Introduction to AcceleratorsDilum Bandara
 
Gpu with cuda architecture
Gpu with cuda architectureGpu with cuda architecture
Gpu with cuda architectureDhaval Kaneria
 
lecture_GPUArchCUDA02-CUDAMem.pdf
lecture_GPUArchCUDA02-CUDAMem.pdflecture_GPUArchCUDA02-CUDAMem.pdf
lecture_GPUArchCUDA02-CUDAMem.pdfTigabu Yaya
 
A beginner’s guide to programming GPUs with CUDA
A beginner’s guide to programming GPUs with CUDAA beginner’s guide to programming GPUs with CUDA
A beginner’s guide to programming GPUs with CUDAPiyush Mittal
 
The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computingbakers84
 
Parallel program design
Parallel program designParallel program design
Parallel program designZongYing Lyu
 
lecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxlecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxssuser413a98
 
Challenges in GPU compilers
Challenges in GPU compilersChallenges in GPU compilers
Challenges in GPU compilersAnastasiaStulova
 
Kato Mivule: An Overview of CUDA for High Performance Computing
Kato Mivule: An Overview of CUDA for High Performance ComputingKato Mivule: An Overview of CUDA for High Performance Computing
Kato Mivule: An Overview of CUDA for High Performance ComputingKato Mivule
 
Parallel and Distributed Computing Chapter 8
Parallel and Distributed Computing Chapter 8Parallel and Distributed Computing Chapter 8
Parallel and Distributed Computing Chapter 8AbdullahMunir32
 

Semelhante a Parallel computing with Gpu (20)

Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Monte Carlo on GPUs
Monte Carlo on GPUsMonte Carlo on GPUs
Monte Carlo on GPUs
 
Introduction to Accelerators
Introduction to AcceleratorsIntroduction to Accelerators
Introduction to Accelerators
 
Gpu with cuda architecture
Gpu with cuda architectureGpu with cuda architecture
Gpu with cuda architecture
 
lecture_GPUArchCUDA02-CUDAMem.pdf
lecture_GPUArchCUDA02-CUDAMem.pdflecture_GPUArchCUDA02-CUDAMem.pdf
lecture_GPUArchCUDA02-CUDAMem.pdf
 
A beginner’s guide to programming GPUs with CUDA
A beginner’s guide to programming GPUs with CUDAA beginner’s guide to programming GPUs with CUDA
A beginner’s guide to programming GPUs with CUDA
 
Exploring Gpgpu Workloads
Exploring Gpgpu WorkloadsExploring Gpgpu Workloads
Exploring Gpgpu Workloads
 
Programar para GPUs
Programar para GPUsProgramar para GPUs
Programar para GPUs
 
Cuda intro
Cuda introCuda intro
Cuda intro
 
The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computing
 
Parallel program design
Parallel program designParallel program design
Parallel program design
 
lecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxlecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptx
 
Can FPGAs Compete with GPUs?
Can FPGAs Compete with GPUs?Can FPGAs Compete with GPUs?
Can FPGAs Compete with GPUs?
 
Challenges in GPU compilers
Challenges in GPU compilersChallenges in GPU compilers
Challenges in GPU compilers
 
Kato Mivule: An Overview of CUDA for High Performance Computing
Kato Mivule: An Overview of CUDA for High Performance ComputingKato Mivule: An Overview of CUDA for High Performance Computing
Kato Mivule: An Overview of CUDA for High Performance Computing
 
Lecture 04
Lecture 04Lecture 04
Lecture 04
 
Parallel and Distributed Computing Chapter 8
Parallel and Distributed Computing Chapter 8Parallel and Distributed Computing Chapter 8
Parallel and Distributed Computing Chapter 8
 

Último

Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...itnewsafrica
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Nikki Chapple
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI AgeCprime
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxLoriGlavin3
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch TuesdayIvanti
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityIES VE
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxLoriGlavin3
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxLoriGlavin3
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Kaya Weers
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxLoriGlavin3
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesKari Kakkonen
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Hiroshi SHIBATA
 

Último (20)

Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...Abdul Kader Baba- Managing Cybersecurity Risks  and Compliance Requirements i...
Abdul Kader Baba- Managing Cybersecurity Risks and Compliance Requirements i...
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
Microsoft 365 Copilot: How to boost your productivity with AI – Part one: Ado...
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
Transcript: New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
A Framework for Development in the AI Age
A Framework for Development in the AI AgeA Framework for Development in the AI Age
A Framework for Development in the AI Age
 
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptxDigital Identity is Under Attack: FIDO Paris Seminar.pptx
Digital Identity is Under Attack: FIDO Paris Seminar.pptx
 
2024 April Patch Tuesday
2024 April Patch Tuesday2024 April Patch Tuesday
2024 April Patch Tuesday
 
Decarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a realityDecarbonising Buildings: Making a net-zero built environment a reality
Decarbonising Buildings: Making a net-zero built environment a reality
 
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptxMerck Moving Beyond Passwords: FIDO Paris Seminar.pptx
Merck Moving Beyond Passwords: FIDO Paris Seminar.pptx
 
The State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptxThe State of Passkeys with FIDO Alliance.pptx
The State of Passkeys with FIDO Alliance.pptx
 
Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)Design pattern talk by Kaya Weers - 2024 (v2)
Design pattern talk by Kaya Weers - 2024 (v2)
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptxThe Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
The Fit for Passkeys for Employee and Consumer Sign-ins: FIDO Paris Seminar.pptx
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Testing tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examplesTesting tools and AI - ideas what to try with some tool examples
Testing tools and AI - ideas what to try with some tool examples
 
Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024Long journey of Ruby standard library at RubyConf AU 2024
Long journey of Ruby standard library at RubyConf AU 2024
 

Parallel computing with Gpu

  • 1. Rohit khatana Parallel Computing With GPU Rohit Khatana 4344 Seminar guide Prof. Aparna Joshi ARMY INSTITUE OF TECHNOLOGY
  • 2. Rohit khatana Content 1.What is parallel computing? 2.Gpu 3.CUDA 4.Application
  • 3. Rohit khatana What is Parallel Computing? Performing or Executing a task/program on more than one machine or processor. In simple way dividing a job in a group.
  • 6.
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12. What kind of processors will we build? (major design constraint: power) Cpu: - Complex Control Hardware Flexibility + Performance Expensive in Terms of Power GPU: - Simpler Control Hardware More H/W for Computation Potentially More power Efficient (ops/watt) More Restrictive Programming Model
  • 13. Modern GPU has more ALU’s
  • 14. Graphics Logical Pipeline • The GPU receives geometry information from the CPU as an input and provides a picture as an output • Let’s see how that happens
  • 15. Host Interface • The host interface is the communication bridge between the CPU and the GPU • It receives commands from the CPU and also pulls geometry information from system memory • It outputs a stream of vertices in object space with all their associated information (normals, texture coordinates, per vertex color etc)
  • 16. Vertex Processing • The vertex processing stage receives vertices from the host interface in object space and outputs them in screen space • This may be a simple linear transformation, or a complex operation involving morphing effects • No new vertices are created in this stage, and no vertices are discarded (input/output has 1:1 mapping)
  • 17. Triangle Setup • In this stage geometry information becomes raster information (screen space geometry is the input, pixels are the output) • Prior to rasterization, triangles that are backfacing or are located outside the viewing frustrum are rejected
  • 18. Triangle Setup • A fragment is generated if and only if its center is inside the triangle • Every fragment generated has its attributes computed to be the perspective correct interpolation of the three vertices that make up the triangle
  • 19. Fragment Processing • Each fragment provided by triangle setup is fed into fragment processing as a set of attributes (position, normal, texcoord etc), which are used to compute the final color for this pixel • The computations taking place here include texture mapping and math operations
  • 20. Memory Interface • Fragments provided by the last step are written to the framebuffer. • Before the final write occurs, some fragments are rejected by the zbuffer, stencil and alpha tests
  • 23. CUDA(compute unified device Architecture) • CUDA is a parallel computing platform and programming model. • Created by NVIDIA and implemented by the GPUs that they produce.
  • 24. CUDA • CUDA gives developers access to the virtual instruction set and memory of the parallel computational elements in CUDA GPUs. • CUDA supports standard programming languages , including C++,python , Fortran.
  • 25. Programming Model • Threads are organized into blocks. • Blocks are organized into a grid. • A multiprocessor executes one block at a time. • A warp is the set of threads executed in parallel. • 32 threads in a warp.
  • 26. Typical CUDA/GPU Program 1. CPU allocates storage on GPU (cudaMalloc). 2. CPU copies input data from CPU GPU (cudaMemcpy). 3. CPU launches kernel on GPU to process the data. (Kernel function<<<no of threads>>>(parameter)) 4. CPU copies results back to CPU from GPU (cudaMemcpy)
  • 27. simply squaring the elements of an array __global__ void square(float * d_out, float * d_in){ // Todo: Fill in this function int idx = threadIdx.x; float f = d_in[idx]; d_out[idx] = f*f } theadIdx.x =gives the current thread number GPU/CUDA programming
  • 28. Main program int main(int argc, char **argv){ …………………… ……………………. float h_out[ARRAY_SIZE]; //declare GPU pointer float * d_in; float * d_out; // allocate GPU memory cudaMalloc( (void*) &d_in, ARRAY_BYTES); cudaMalloc( (void*) &d_out, ARRAY_BYTES);
  • 29. Main program(cont.) // transfer the array to the GPU cudaMemcpy(d_in, h_in, ARRAY_BYTES, cudaMemcpyHostToDevice); // launch the kernel square<<<1, ARRAY_SIZE>>>(d_out, d_in); // copy back the result array to the CPU cudaMemcpy(h_out, d_out, ARRAY_BYTES, cudaMemcpyDeviceToHost); // print out the resulting array for (int i =0; i < ARRAY_SIZE; i++) { printf("%f", h_out[i]); }
  • 31. GPU vs CPU Code
  • 32. Conclusion • GPU computing is a good choice for fine- grained data-parallel programs with limited communication • GPU computing is not so good for coarse- grained program with a lot of communication • The GPU has become a co-processor to the CPU.
  • 33. References • 1.[‘IEEE’] Accelerating image processing capability using graphics processors Jason. Dalea, Gordon. Caina, Brad. ZellbaVision4ce Ltd. Crowthorne Enterprise Center, Crowthorne, Berkshire, UK, RG45 6AWbVision4ce LLC Severna Park, USA, MD2114 • • 2.Udacity cs344,Intro to parallel Programming with GPU • 3.Wikipedia • 4.Nividia docs