SlideShare uma empresa Scribd logo
1 de 20
Baixar para ler offline
GPU Computing with Ruby



      SpeedGo Computing

         Chung Shin Yee
 shinyee@speedgocomputing.com
CPU vs GPU Architecture
        6 Core vs 1024 Core
6 GB/s vs 300 GB/s Memory Bandwidth




       By CUDA C Programming Guide
CUDA Programming Model



                              .
                              .
                              .
                              .




By CUDA C Programming Guide
Existing Programming Tools
●   Cg
●   BrookGPU
●   GLSL (OpenGL Shading Language)
●   Nvidia CUDA C/C++
●   OpenCL
●   PyCUDA     Where is the Red Ruby ?
Bridging Ruby & CUDA C/C++
●   Ruby C extension
       –   Hard to manipulate Ruby objects in C.
       –   Compilation problems.
●   Ruby FFI
       –   Bridging purely in Ruby.
       –   Support multiple Ruby implementations.
Ruby Bridge Sample
Developing SGC Ruby CUDA
●   Object-oriented API.
●   Start with crucial operations.
       –   Memory allocation.
       –   Memory transfer.
       –   Kernel launch.
       –   Wrapper for structures.
●   Documented with YARD.
Driver vs Runtime API
●   CUDA Driver API
      –   For system developers.
      –   Supported by PyCUDA.
●   CUDA Runtime API
      –   For computation centric developers.


          We going to support both API !
Using SGC Ruby CUDA
●   Kernel program in CUDA C.
Using SGC Ruby CUDA
●   Compiling kernel into PTX.
       –   nvcc --ptx vadd.cu
Using SGC Ruby CUDA
●   Setup
        require 'rubycu'
        include SGC::CU
        CUInit.init
        d = CUDevice.get(0)
        c = CUContext.create(d)
        m = CUModule.new.load(“vadd.ptx”)
        f = m.function(“vadd”)
Using SGC Ruby CUDA
●   Memory allocations
        da = CUDevice.malloc(10*4)
        db = CUDevice.malloc(10*4)
        dc = CUDevice.malloc(10*4)
        ha = Buffer.new(:int, 10)
        hb = Buffer.new(:int, 10)
        hc = Buffer.new(:int, 10)
Using SGC Ruby CUDA
●   Initialization
         (0...10).each { |i|
                ha[i] = i
                hb[i] = 1
                hc[i] = ha[i] + hb[i]
                hd[i] = 0
         }
Using SGC Ruby CUDA
●   Transfer inputs to the GPU
        CUMemory.memcpy_htod(da, ha, 4*10)
        CUMemory.memcpy_htod(db, hb, 4*10)
        CUMemory.memcpy_htod(dc, hc, 4*10)
Using SGC Ruby CUDA
●    Launch kernel on GPU
            # Launch with 1x1x1 grid,
            # 10x1x1 blocks,
            params = [da, db, dc, 10]
            f.launch_kernel(1, 1, 1, 10, 1, 1, 0, 0, params)




    By CUDA C Programming Guide       By CUDA C Programming Guide
Using SGC Ruby CUDA
●   Transfer results back to system memory
         CUMemory.memcpy_dtoh(hd, dc, 4*10)
●   Verify results
         (0...10).each { |i|
               assert_equal(hc[i], hd[i])
         }
Problematic CUDA Runtime API
●   For use in a CUDA C/C++ program.
●   Workaround
       –   CUDA C/C++ effectively uses C/C++
            bindings.
       –   Create dynamic library for the kernel
            programs.
       –   Load the library at runtime.
Current Limitations
●   Support limited data types.
       –   Fixnum   → int
       –   ??       → long
       –   Float    → float
       –   ??       → double
●   No supports for CUDA C++ templates.
●   No Ruby in a kernel program.
To Support
●   Texture memory.
●   New features in CUDA 4.0
       –   Multi-GPU.
       –   Unified Virtual Memory.
●   More C data types.
●   Mac platform.
Try It Now! Thank You ~
git clone git://github.com/xman/sgc-ruby-cuda.git
cd sgc-ruby-cuda
gem install ffi yard
rake test
rake yard

Mais conteúdo relacionado

Mais procurados

Introduction to parallel computing using CUDA
Introduction to parallel computing using CUDAIntroduction to parallel computing using CUDA
Introduction to parallel computing using CUDAMartin Peniak
 
Nvidia cuda tutorial_no_nda_apr08
Nvidia cuda tutorial_no_nda_apr08Nvidia cuda tutorial_no_nda_apr08
Nvidia cuda tutorial_no_nda_apr08Angela Mendoza M.
 
Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)Rob Gillen
 
Kato Mivule: An Overview of CUDA for High Performance Computing
Kato Mivule: An Overview of CUDA for High Performance ComputingKato Mivule: An Overview of CUDA for High Performance Computing
Kato Mivule: An Overview of CUDA for High Performance ComputingKato Mivule
 
NVidia CUDA Tutorial - June 15, 2009
NVidia CUDA Tutorial - June 15, 2009NVidia CUDA Tutorial - June 15, 2009
NVidia CUDA Tutorial - June 15, 2009Randall Hand
 
Introduction to CUDA C: NVIDIA : Notes
Introduction to CUDA C: NVIDIA : NotesIntroduction to CUDA C: NVIDIA : Notes
Introduction to CUDA C: NVIDIA : NotesSubhajit Sahu
 
NVidia CUDA for Bruteforce Attacks - DefCamp 2012
NVidia CUDA for Bruteforce Attacks - DefCamp 2012NVidia CUDA for Bruteforce Attacks - DefCamp 2012
NVidia CUDA for Bruteforce Attacks - DefCamp 2012DefCamp
 
GPGPU programming with CUDA
GPGPU programming with CUDAGPGPU programming with CUDA
GPGPU programming with CUDASavith Satheesh
 
How to Burn Multi-GPUs using CUDA stress test memo
How to Burn Multi-GPUs using CUDA stress test memoHow to Burn Multi-GPUs using CUDA stress test memo
How to Burn Multi-GPUs using CUDA stress test memoNaoto MATSUMOTO
 
Applying of the NVIDIA CUDA to the video processing in the task of the roundw...
Applying of the NVIDIA CUDA to the video processing in the task of the roundw...Applying of the NVIDIA CUDA to the video processing in the task of the roundw...
Applying of the NVIDIA CUDA to the video processing in the task of the roundw...Ural-PDC
 
qCUDA-ARM : Virtualization for Embedded GPU Architectures
 qCUDA-ARM : Virtualization for Embedded GPU Architectures  qCUDA-ARM : Virtualization for Embedded GPU Architectures
qCUDA-ARM : Virtualization for Embedded GPU Architectures 柏瑀 黃
 
Intro to GPGPU Programming with Cuda
Intro to GPGPU Programming with CudaIntro to GPGPU Programming with Cuda
Intro to GPGPU Programming with CudaRob Gillen
 
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...AMD Developer Central
 
Computing using GPUs
Computing using GPUsComputing using GPUs
Computing using GPUsShree Kumar
 
GS-4108, Direct Compute in Gaming, by Bill Bilodeau
GS-4108, Direct Compute in Gaming, by Bill BilodeauGS-4108, Direct Compute in Gaming, by Bill Bilodeau
GS-4108, Direct Compute in Gaming, by Bill BilodeauAMD Developer Central
 

Mais procurados (20)

Introduction to parallel computing using CUDA
Introduction to parallel computing using CUDAIntroduction to parallel computing using CUDA
Introduction to parallel computing using CUDA
 
Nvidia cuda tutorial_no_nda_apr08
Nvidia cuda tutorial_no_nda_apr08Nvidia cuda tutorial_no_nda_apr08
Nvidia cuda tutorial_no_nda_apr08
 
Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)Intro to GPGPU with CUDA (DevLink)
Intro to GPGPU with CUDA (DevLink)
 
Kato Mivule: An Overview of CUDA for High Performance Computing
Kato Mivule: An Overview of CUDA for High Performance ComputingKato Mivule: An Overview of CUDA for High Performance Computing
Kato Mivule: An Overview of CUDA for High Performance Computing
 
GPU: Understanding CUDA
GPU: Understanding CUDAGPU: Understanding CUDA
GPU: Understanding CUDA
 
CUDA
CUDACUDA
CUDA
 
NVidia CUDA Tutorial - June 15, 2009
NVidia CUDA Tutorial - June 15, 2009NVidia CUDA Tutorial - June 15, 2009
NVidia CUDA Tutorial - June 15, 2009
 
Introduction to CUDA C: NVIDIA : Notes
Introduction to CUDA C: NVIDIA : NotesIntroduction to CUDA C: NVIDIA : Notes
Introduction to CUDA C: NVIDIA : Notes
 
Cuda tutorial
Cuda tutorialCuda tutorial
Cuda tutorial
 
NVidia CUDA for Bruteforce Attacks - DefCamp 2012
NVidia CUDA for Bruteforce Attacks - DefCamp 2012NVidia CUDA for Bruteforce Attacks - DefCamp 2012
NVidia CUDA for Bruteforce Attacks - DefCamp 2012
 
Debugging CUDA applications
Debugging CUDA applicationsDebugging CUDA applications
Debugging CUDA applications
 
GPGPU programming with CUDA
GPGPU programming with CUDAGPGPU programming with CUDA
GPGPU programming with CUDA
 
How to Burn Multi-GPUs using CUDA stress test memo
How to Burn Multi-GPUs using CUDA stress test memoHow to Burn Multi-GPUs using CUDA stress test memo
How to Burn Multi-GPUs using CUDA stress test memo
 
Applying of the NVIDIA CUDA to the video processing in the task of the roundw...
Applying of the NVIDIA CUDA to the video processing in the task of the roundw...Applying of the NVIDIA CUDA to the video processing in the task of the roundw...
Applying of the NVIDIA CUDA to the video processing in the task of the roundw...
 
qCUDA-ARM : Virtualization for Embedded GPU Architectures
 qCUDA-ARM : Virtualization for Embedded GPU Architectures  qCUDA-ARM : Virtualization for Embedded GPU Architectures
qCUDA-ARM : Virtualization for Embedded GPU Architectures
 
Intro to GPGPU Programming with Cuda
Intro to GPGPU Programming with CudaIntro to GPGPU Programming with Cuda
Intro to GPGPU Programming with Cuda
 
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
PT-4057, Automated CUDA-to-OpenCL™ Translation with CU2CL: What's Next?, by W...
 
Cuda intro
Cuda introCuda intro
Cuda intro
 
Computing using GPUs
Computing using GPUsComputing using GPUs
Computing using GPUs
 
GS-4108, Direct Compute in Gaming, by Bill Bilodeau
GS-4108, Direct Compute in Gaming, by Bill BilodeauGS-4108, Direct Compute in Gaming, by Bill Bilodeau
GS-4108, Direct Compute in Gaming, by Bill Bilodeau
 

Destaque

Ruby Supercomputing - Using The GPU For Massive Performance Speedup v1.1
Ruby Supercomputing - Using The GPU For Massive Performance Speedup v1.1Ruby Supercomputing - Using The GPU For Massive Performance Speedup v1.1
Ruby Supercomputing - Using The GPU For Massive Performance Speedup v1.1Preston Lee
 
Hardware Acceleration of Computional Fluid Dynamics SImulations in an Oxygena...
Hardware Acceleration of Computional Fluid Dynamics SImulations in an Oxygena...Hardware Acceleration of Computional Fluid Dynamics SImulations in an Oxygena...
Hardware Acceleration of Computional Fluid Dynamics SImulations in an Oxygena...HAMSproject
 
Sun Tzu: The Art Of...Business?
Sun Tzu: The Art Of...Business?Sun Tzu: The Art Of...Business?
Sun Tzu: The Art Of...Business?Preston Lee
 
Computación paralela con gp us cuda
Computación paralela con gp us cudaComputación paralela con gp us cuda
Computación paralela con gp us cudaJavier Zarco
 
FAST MAP PROJECTION ON CUDA.ppt
FAST MAP PROJECTION ON CUDA.pptFAST MAP PROJECTION ON CUDA.ppt
FAST MAP PROJECTION ON CUDA.pptgrssieee
 
OpenHPI - Parallel Programming Concepts - Week 4
OpenHPI - Parallel Programming Concepts - Week 4OpenHPI - Parallel Programming Concepts - Week 4
OpenHPI - Parallel Programming Concepts - Week 4Peter Tröger
 
CPU vs. GPU presentation
CPU vs. GPU presentationCPU vs. GPU presentation
CPU vs. GPU presentationVishal Singh
 
GPU - An Introduction
GPU - An IntroductionGPU - An Introduction
GPU - An IntroductionDhan V Sagar
 

Destaque (9)

Ruby Supercomputing - Using The GPU For Massive Performance Speedup v1.1
Ruby Supercomputing - Using The GPU For Massive Performance Speedup v1.1Ruby Supercomputing - Using The GPU For Massive Performance Speedup v1.1
Ruby Supercomputing - Using The GPU For Massive Performance Speedup v1.1
 
Hardware Acceleration of Computional Fluid Dynamics SImulations in an Oxygena...
Hardware Acceleration of Computional Fluid Dynamics SImulations in an Oxygena...Hardware Acceleration of Computional Fluid Dynamics SImulations in an Oxygena...
Hardware Acceleration of Computional Fluid Dynamics SImulations in an Oxygena...
 
Sun Tzu: The Art Of...Business?
Sun Tzu: The Art Of...Business?Sun Tzu: The Art Of...Business?
Sun Tzu: The Art Of...Business?
 
Equipo 2 gpus
Equipo 2 gpusEquipo 2 gpus
Equipo 2 gpus
 
Computación paralela con gp us cuda
Computación paralela con gp us cudaComputación paralela con gp us cuda
Computación paralela con gp us cuda
 
FAST MAP PROJECTION ON CUDA.ppt
FAST MAP PROJECTION ON CUDA.pptFAST MAP PROJECTION ON CUDA.ppt
FAST MAP PROJECTION ON CUDA.ppt
 
OpenHPI - Parallel Programming Concepts - Week 4
OpenHPI - Parallel Programming Concepts - Week 4OpenHPI - Parallel Programming Concepts - Week 4
OpenHPI - Parallel Programming Concepts - Week 4
 
CPU vs. GPU presentation
CPU vs. GPU presentationCPU vs. GPU presentation
CPU vs. GPU presentation
 
GPU - An Introduction
GPU - An IntroductionGPU - An Introduction
GPU - An Introduction
 

Semelhante a GPU Computing with Ruby

lecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxlecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxssuser413a98
 
CuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPUCuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPUShohei Hido
 
Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.J On The Beach
 
CUDA by Example : The Final Countdown : Notes
CUDA by Example : The Final Countdown : NotesCUDA by Example : The Final Countdown : Notes
CUDA by Example : The Final Countdown : NotesSubhajit Sahu
 
Hybrid Map Task Scheduling for GPU-based Heterogeneous Clusters
Hybrid Map Task Scheduling for GPU-based Heterogeneous ClustersHybrid Map Task Scheduling for GPU-based Heterogeneous Clusters
Hybrid Map Task Scheduling for GPU-based Heterogeneous ClustersKoichi Shirahata
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
Utilizing AMD GPUs: Tuning, programming models, and roadmap
Utilizing AMD GPUs: Tuning, programming models, and roadmapUtilizing AMD GPUs: Tuning, programming models, and roadmap
Utilizing AMD GPUs: Tuning, programming models, and roadmapGeorge Markomanolis
 
DockerCon EU '17 - Dockerizing Aurea
DockerCon EU '17 - Dockerizing AureaDockerCon EU '17 - Dockerizing Aurea
DockerCon EU '17 - Dockerizing AureaŁukasz Piątkowski
 
[HKOSCon x COSCUP 2020][20200801][Ansible: From VM to Kubernetes]
[HKOSCon x COSCUP 2020][20200801][Ansible: From VM to Kubernetes][HKOSCon x COSCUP 2020][20200801][Ansible: From VM to Kubernetes]
[HKOSCon x COSCUP 2020][20200801][Ansible: From VM to Kubernetes]Wong Hoi Sing Edison
 
Introduction to cuda geek camp singapore 2011
Introduction to cuda   geek camp singapore 2011Introduction to cuda   geek camp singapore 2011
Introduction to cuda geek camp singapore 2011Raymond Tay
 
lecture_GPUArchCUDA02-CUDAMem.pdf
lecture_GPUArchCUDA02-CUDAMem.pdflecture_GPUArchCUDA02-CUDAMem.pdf
lecture_GPUArchCUDA02-CUDAMem.pdfTigabu Yaya
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computingArka Ghosh
 
NVIDIA GTC 2019: Red Hat and the NVIDIA DGX: Tried, Tested, Trusted
NVIDIA GTC 2019:  Red Hat and the NVIDIA DGX: Tried, Tested, TrustedNVIDIA GTC 2019:  Red Hat and the NVIDIA DGX: Tried, Tested, Trusted
NVIDIA GTC 2019: Red Hat and the NVIDIA DGX: Tried, Tested, TrustedJeremy Eder
 
Pycon2014 GPU computing
Pycon2014 GPU computingPycon2014 GPU computing
Pycon2014 GPU computingAshwin Ashok
 
Building a continuous delivery platform for the biggest spike in e-commerce -...
Building a continuous delivery platform for the biggest spike in e-commerce -...Building a continuous delivery platform for the biggest spike in e-commerce -...
Building a continuous delivery platform for the biggest spike in e-commerce -...Puppet
 
The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computingbakers84
 
S0333 gtc2012-gmac-programming-cuda
S0333 gtc2012-gmac-programming-cudaS0333 gtc2012-gmac-programming-cuda
S0333 gtc2012-gmac-programming-cudamistercteam
 

Semelhante a GPU Computing with Ruby (20)

lecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptxlecture11_GPUArchCUDA01.pptx
lecture11_GPUArchCUDA01.pptx
 
CuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPUCuPy: A NumPy-compatible Library for GPU
CuPy: A NumPy-compatible Library for GPU
 
Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.Using GPUs to handle Big Data with Java by Adam Roberts.
Using GPUs to handle Big Data with Java by Adam Roberts.
 
CUDA by Example : The Final Countdown : Notes
CUDA by Example : The Final Countdown : NotesCUDA by Example : The Final Countdown : Notes
CUDA by Example : The Final Countdown : Notes
 
Getting started with AMD GPUs
Getting started with AMD GPUsGetting started with AMD GPUs
Getting started with AMD GPUs
 
Hybrid Map Task Scheduling for GPU-based Heterogeneous Clusters
Hybrid Map Task Scheduling for GPU-based Heterogeneous ClustersHybrid Map Task Scheduling for GPU-based Heterogeneous Clusters
Hybrid Map Task Scheduling for GPU-based Heterogeneous Clusters
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
Utilizing AMD GPUs: Tuning, programming models, and roadmap
Utilizing AMD GPUs: Tuning, programming models, and roadmapUtilizing AMD GPUs: Tuning, programming models, and roadmap
Utilizing AMD GPUs: Tuning, programming models, and roadmap
 
DockerCon EU '17 - Dockerizing Aurea
DockerCon EU '17 - Dockerizing AureaDockerCon EU '17 - Dockerizing Aurea
DockerCon EU '17 - Dockerizing Aurea
 
[HKOSCon x COSCUP 2020][20200801][Ansible: From VM to Kubernetes]
[HKOSCon x COSCUP 2020][20200801][Ansible: From VM to Kubernetes][HKOSCon x COSCUP 2020][20200801][Ansible: From VM to Kubernetes]
[HKOSCon x COSCUP 2020][20200801][Ansible: From VM to Kubernetes]
 
Introduction to cuda geek camp singapore 2011
Introduction to cuda   geek camp singapore 2011Introduction to cuda   geek camp singapore 2011
Introduction to cuda geek camp singapore 2011
 
lecture_GPUArchCUDA02-CUDAMem.pdf
lecture_GPUArchCUDA02-CUDAMem.pdflecture_GPUArchCUDA02-CUDAMem.pdf
lecture_GPUArchCUDA02-CUDAMem.pdf
 
Vpu technology &gpgpu computing
Vpu technology &gpgpu computingVpu technology &gpgpu computing
Vpu technology &gpgpu computing
 
NVIDIA GTC 2019: Red Hat and the NVIDIA DGX: Tried, Tested, Trusted
NVIDIA GTC 2019:  Red Hat and the NVIDIA DGX: Tried, Tested, TrustedNVIDIA GTC 2019:  Red Hat and the NVIDIA DGX: Tried, Tested, Trusted
NVIDIA GTC 2019: Red Hat and the NVIDIA DGX: Tried, Tested, Trusted
 
Pycon2014 GPU computing
Pycon2014 GPU computingPycon2014 GPU computing
Pycon2014 GPU computing
 
Building a continuous delivery platform for the biggest spike in e-commerce -...
Building a continuous delivery platform for the biggest spike in e-commerce -...Building a continuous delivery platform for the biggest spike in e-commerce -...
Building a continuous delivery platform for the biggest spike in e-commerce -...
 
The Rise of Parallel Computing
The Rise of Parallel ComputingThe Rise of Parallel Computing
The Rise of Parallel Computing
 
S0333 gtc2012-gmac-programming-cuda
S0333 gtc2012-gmac-programming-cudaS0333 gtc2012-gmac-programming-cuda
S0333 gtc2012-gmac-programming-cuda
 

Último

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWERMadyBayot
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024The Digital Insurer
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusZilliz
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native ApplicationsWSO2
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdflior mazor
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfOverkill Security
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbuapidays
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Zilliz
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024The Digital Insurer
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 

Último (20)

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
A Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source MilvusA Beginners Guide to Building a RAG App Using Open Source Milvus
A Beginners Guide to Building a RAG App Using Open Source Milvus
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Ransomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdfRansomware_Q4_2023. The report. [EN].pdf
Ransomware_Q4_2023. The report. [EN].pdf
 
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu SubbuApidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
Apidays Singapore 2024 - Modernizing Securities Finance by Madhu Subbu
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 

GPU Computing with Ruby

  • 1. GPU Computing with Ruby SpeedGo Computing Chung Shin Yee shinyee@speedgocomputing.com
  • 2. CPU vs GPU Architecture 6 Core vs 1024 Core 6 GB/s vs 300 GB/s Memory Bandwidth By CUDA C Programming Guide
  • 3. CUDA Programming Model . . . . By CUDA C Programming Guide
  • 4. Existing Programming Tools ● Cg ● BrookGPU ● GLSL (OpenGL Shading Language) ● Nvidia CUDA C/C++ ● OpenCL ● PyCUDA Where is the Red Ruby ?
  • 5. Bridging Ruby & CUDA C/C++ ● Ruby C extension – Hard to manipulate Ruby objects in C. – Compilation problems. ● Ruby FFI – Bridging purely in Ruby. – Support multiple Ruby implementations.
  • 7. Developing SGC Ruby CUDA ● Object-oriented API. ● Start with crucial operations. – Memory allocation. – Memory transfer. – Kernel launch. – Wrapper for structures. ● Documented with YARD.
  • 8. Driver vs Runtime API ● CUDA Driver API – For system developers. – Supported by PyCUDA. ● CUDA Runtime API – For computation centric developers. We going to support both API !
  • 9. Using SGC Ruby CUDA ● Kernel program in CUDA C.
  • 10. Using SGC Ruby CUDA ● Compiling kernel into PTX. – nvcc --ptx vadd.cu
  • 11. Using SGC Ruby CUDA ● Setup require 'rubycu' include SGC::CU CUInit.init d = CUDevice.get(0) c = CUContext.create(d) m = CUModule.new.load(“vadd.ptx”) f = m.function(“vadd”)
  • 12. Using SGC Ruby CUDA ● Memory allocations da = CUDevice.malloc(10*4) db = CUDevice.malloc(10*4) dc = CUDevice.malloc(10*4) ha = Buffer.new(:int, 10) hb = Buffer.new(:int, 10) hc = Buffer.new(:int, 10)
  • 13. Using SGC Ruby CUDA ● Initialization (0...10).each { |i| ha[i] = i hb[i] = 1 hc[i] = ha[i] + hb[i] hd[i] = 0 }
  • 14. Using SGC Ruby CUDA ● Transfer inputs to the GPU CUMemory.memcpy_htod(da, ha, 4*10) CUMemory.memcpy_htod(db, hb, 4*10) CUMemory.memcpy_htod(dc, hc, 4*10)
  • 15. Using SGC Ruby CUDA ● Launch kernel on GPU # Launch with 1x1x1 grid, # 10x1x1 blocks, params = [da, db, dc, 10] f.launch_kernel(1, 1, 1, 10, 1, 1, 0, 0, params) By CUDA C Programming Guide By CUDA C Programming Guide
  • 16. Using SGC Ruby CUDA ● Transfer results back to system memory CUMemory.memcpy_dtoh(hd, dc, 4*10) ● Verify results (0...10).each { |i| assert_equal(hc[i], hd[i]) }
  • 17. Problematic CUDA Runtime API ● For use in a CUDA C/C++ program. ● Workaround – CUDA C/C++ effectively uses C/C++ bindings. – Create dynamic library for the kernel programs. – Load the library at runtime.
  • 18. Current Limitations ● Support limited data types. – Fixnum → int – ?? → long – Float → float – ?? → double ● No supports for CUDA C++ templates. ● No Ruby in a kernel program.
  • 19. To Support ● Texture memory. ● New features in CUDA 4.0 – Multi-GPU. – Unified Virtual Memory. ● More C data types. ● Mac platform.
  • 20. Try It Now! Thank You ~ git clone git://github.com/xman/sgc-ruby-cuda.git cd sgc-ruby-cuda gem install ffi yard rake test rake yard