Accelerating High Performance Applications
Strategic Focus on Applications

- Senior-level relationship and market managers
- Dedicated technical resources
- More than 150 people devoted to libraries, tools, application porting and market development
- Worldwide focus
Reaching a Broad Range of Markets




  Scientific computing   Creative pro   Education / research
Strategic Partners
CAD/CAM/CAID            CAE/EDA             Computational  Computational  Defence &     Digital Content  Physical      Seismic processing
                                            chemistry      Finance        Intelligence  creation         Sciences      and visualization

Autodesk                Ansys               Amber          MATLAB         Ikena         Adobe            Quda (L-QCD)  Schlumberger

Dassault Systemes:      Dassault Systemes:  NAMD           Mathematica    Intergraph    Autodesk M&E     WRF           Landmark
CATIA, Solidworks       Simulia

PTC                     Nastran             Gromacs        NAG            ESRI          Avid             ACUSA         Paradigm

Siemens                 LSTC                Lammps         Murex          Manifold      MainConcept      HOMME

                        Synopsys            GAMESS                                      Sony             HYCOM
Leading MD Applications


Application  Features Supported          GPU Perf  Release Status         Notes

AMBER        PMEMD: Explicit & Implicit  8X        V11 Released           Single and multi-GPUs. Expect 2x more
             Solvent                                                      performance in V11 patch release (shortly)

GROMACS      Implicit (5x), Explicit     2x-5x     Single GPU released,   Next release: 2H2011.
             (2x) Solvent                          Version 4.5.4          Better Explicit, MPI

LAMMPS       Lennard-Jones, Gay-Berne    6x        Released               Single and multi-GPU.

NAMD         Non-bond force calculation  2x-7x     Released, v2.8         Single and multi-GPU.

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.
Additional MD/MM Applications Ramping

Application  Features Supported        GPU Perf              Release Status       Notes

Abalone      TBD, “Simulations”        4-29X (on 1060 GPU)   Released             Single GPU. Agile Molecule, Inc.

ACEMD        Written for use on GPUs   “µ-sec long           Released             Production bio-molecular dynamics (MD)
                                       trajectories on                            software specially optimized to run on
                                       workstation”                               single and multi-GPUs

DL_POLY      Two-body Forces,          4x                    V 4.0 Source only,   Next release: 2H2011.
             Link-cell Pairs, Ewald                          Results Published    Multi-GPU, multi-node supported
             SPME forces, Shake VV

HOOMD-Blue   Written for use on GPUs   2X (32 CPU cores vs.  Released, Version    Single and multi-GPU.
                                       2 10XX GPUs)          0.9.2

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.
Viz and “Docking” Applications

Related Applications  Features Supported                      GPU Perf     Release Status           Notes

Amira 5®              3D visualization of volumetric data     N/A          Released, Version 5.3.3  Visualization from Visage Imaging. Next
                      and surfaces                                                                  release, 5.4, will use GPU for general
                                                                                                    purpose processing in some functions

Core Hopping          GPU accelerated application             Up to 5000X  Released, Suite 2011     Single and multi-GPUs. Schrodinger, Inc.

FastROCS              Real-time shape similarity              800-3000X    Released                 Single and multi-GPUs.
                      searching/comparison                                                          OpenEye Scientific Software

VMD                   High quality rendering, large           100-125X or  Released, Version 1.9    Visualization from University of Illinois
                      structures (100 million atoms), GPU     greater                               at Urbana-Champaign
                      acceleration for computationally
                      demanding analysis and visualization
                      tasks, multiple GPU support for very
                      fast display of molecular orbitals
                      arising in quantum chemistry
                      calculations

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.
Quantum Chemistry
Application  Features Supported                  GPU Perf         Release Status          Notes

GAMESS-US    Libqc with Rys Quadrature           2.5X             Released                Single GPU supported in 10/1/10 release.
             Algorithm, integral evaluation,                                              Multi-GPU supported in July 2011 release.
             closed shell Fock matrix
             construction

NWChem       Triples part of Reg-CCSD(T), CCSD   3-8X projected   Date TBA,               Development GPGPU benchmarks:
             & EOMCCSD task schedulers                            in development          www.nwchem-sw.org

Q-CHEM       Various features including RI-MP2   8-14x projected  Date TBA,               Significant porting already
                                                                  In development

TeraChem     “Full GPU-based solution”           44-650X vs.      Version 1.45 released   Single and Multi-GPU. Completely
                                                 GAMESS CPU ver.                          redesigned to exploit massive GPU
                                                                                          parallelism

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.
Material Science



Application             Features Supported                  GPU Perf  Release Status        Notes

Abinit                  BigDFT - 50% of the program (short  6-30X     Released June 2009    http://inac.cea.fr/L_Sim/BigDFT/news.html
                        convolutions)

Quantum-Espresso/PWscf  PWscf package: linear algebra       TBD       Released May 5, 2011  Created by Irish Centre for High-End Computing
                        (matrix multiply), explicit
                        computational kernels, 3D FFTs

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.
Bioinformatics


CUDA-BLASTP                 HEX Protein Docking
CUDA-EC                     Jacket (MATLAB Plugin)
CUDA-MEME                   MUMmerGPU
CUDASW++ (Smith-Waterman)   MUMmerGPU++
DNADist                     SARUMAN
GPU Blast                   SeqNFind
GPU-HMMER                   UGENE


                            Additional details can be found at Tesla Bio Workbench:
                            http://www.nvidia.com/object/tesla_bio_workbench.html
Structural Mechanics
    Application      GPU Features               GPU Perf               Release Status                        Notes
ANSYS Mechanical     Linear eqn solvers           2x Total            Today, release 13 SP2          FE implicit, single-GPU

 Abaqus/Standard     Linear eqn solver            2x Total             Today, release 6.11           FE implicit, single-GPU

  IMPETUS Afea       Explicit solver, SPH   10x SPH, 2x Total           Today, release 1.0           FE explicit, multi-GPU


 LS-DYNA implicit    Linear eqn solver            3x Total              Planned for 2011             FE implicit, multi-GPU


   MD Nastran        Linear eqn solvers          2x Solver              Planned for 2011             FE implicit, multi-GPU


       Marc          Linear eqn solver           1.5x Total             Planned for 2011             FE implicit, single-GPU

 RADIOSS Implicit    Linear eqn solver           1.5x Total               Demonstration              FE implicit, single-GPU

PAM-CRASH implicit   Linear eqn solver           1.5x Total               Demonstration              FE implicit, single-GPU

   NX Nastran        Linear eqn solver           1.4x Total               Demonstration              FE implicit, single-GPU
                                   GPU Perf compared against Multi-core x86 CPU socket.
                                   GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison
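
The table above mixes whole-application figures ("2x Total") with solver-only figures ("2x Solver"), and the note warns that some numbers may be kernel-to-kernel comparisons. As a rough illustration of why the two kinds of figure differ, the sketch below applies Amdahl's law. The function name and the 60% solver share are hypothetical values chosen for illustration; they are not taken from the slides.

```python
def total_speedup(accelerated_fraction: float, kernel_speedup: float) -> float:
    """Amdahl's law: whole-application speedup when only a fraction of the
    original runtime is accelerated by a factor of `kernel_speedup`."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    remaining = 1.0 - accelerated_fraction               # share of runtime left on the CPU
    accelerated = accelerated_fraction / kernel_speedup  # share of runtime after GPU offload
    return 1.0 / (remaining + accelerated)


if __name__ == "__main__":
    solver_fraction = 0.6  # hypothetical: solver takes 60% of the CPU-only runtime
    for kernel in (2.0, 3.0, 10.0):
        total = total_speedup(solver_fraction, kernel)
        print(f"{kernel:.0f}x solver -> {total:.2f}x total")
```

Under this assumed 60% share, the total speedup can never reach 2.5x however fast the solver becomes, which is why a solver-only or kernel-only figure should not be read as an end-to-end gain.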
Fluid Dynamics
Application                       GPU Features        GPU Perf             Release Status       Notes
Altair AcuSolve                   Linear eqn solver   2x Total             Today, release 1.8   FE unstructured NS, multi-GPU
Autodesk Moldflow                 Linear eqn solver   2x Total             Today, release 2011  FE unstructured NS, single-GPU
FluiDyna LBultra                  LBM, particle CFD   20x Total            Today, release 1.0   Structured LBM, multi-GPU
FluiDyna Culises-OpenFOAM Solver  Linear eqn solvers  3x Solver            Today, release 1.0   Unstructured NS, single-GPU
Vratis SpeedIT-OpenFOAM Solver    Linear eqn solvers  3x Solver            Today, release 1.2   Unstructured NS, multi-GPU
Prometech Particleworks           MPS, particle CFD   4x-9x Total          Q3CY11 release 2.5   Particle based, multi-GPU
Sandia NL S3D                     Chemistry kernel    8x SP, 5x DP kernel  Demonstration        Structured grid DNS, multi-GPU
Turbostream                       Explicit solver     19x Total            Today, release 2.0   Structured grid NS, multi-GPU
SD++ (Jameson)                    Explicit solver     16x Total            Planned for 2011     FE unstructured NS, multi-GPU
FEFLO (Lohner)                    Explicit solver     2x Total             Planned for 2011     FE unstructured NS, multi-GPU

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.
Electromagnetics

Application           Features Supported            GPU Perf        Release Status    Notes

Agilent EMPro         FDTD                          6X              2011.07 Released  Single & multi-GPU; EMPro 2011 PR

CST Microwave Studio  Transient (FIT) solver;       9X on 1 GPU to  2011 Released     Single & multi-GPU; www.cst.com/perf
                      Combined MPI & GPU computing  20X+ on 4 GPUs

Remcom XFdtd          FDTD                          30-300X         XF7 Released      Single and multi-GPU; XStream GPU acceleration

SPEAG SEMCAD X        FDTD; Acceleware              100X            14.4.3 Released   Single and multi-GPU; www.speag.com/perf

GPU Performance compared against quad-core x86 CPU socket;
Remcom XFdtd GPU performance compared against single core CPU.
Climate/ Weather/ Ocean
Application  GPU Features                         GPU Perf                 Production Status     Notes
WRF          WSM5, WSM3, Ice Microphysics models  4x-6x Models             Today, release 3.2    single-GPU
ASUCA        Most routines                        12x Total                In production at JMA  multi-GPU
NIM          Most routines                        7x Dynamics              Limited production    multi-GPU
HIRLAM       Dynamical core                       3x Solver                Planned for 2011      multi-GPU
HOMME        Models                               3x Models                Planned for 2011      single-GPU
CAM          Linear eqn solver                    2x Solver                Planned for 2011      single-GPU
GEOS-5       Most routines                        10x Models, 3x Dynamics  Demonstration         multi-GPU
MITgcm       Linear eqn solver                    3x solver                Demonstration         single-GPU
HYCOM        Linear eqn solver                    2x solver                Demonstration         single-GPU

GPU Perf compared against Multi-core x86 CPU socket.
GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.

2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 

Último (20)

[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 

Nvidia Cuda Apps Jun27 11

  • 2. Strategic Focus on Applications: senior-level relationship and market managers; dedicated technical resources; more than 150 people devoted to libraries, tools, application porting and market development; worldwide focus.
  • 3. Reaching a Broad Range of Markets: scientific computing; creative pro; education / research.
  • 4. Strategic Partners
      CAD/CAM/CAID: Autodesk; Dassault Systemes (CATIA, Solidworks); PTC; Siemens
      CAE/EDA: Ansys; Dassault Systemes (Simulia); Nastran; LSTC; Synopsys
      Computational chemistry: Amber; NAMD; Gromacs; Lammps; GAMESS
      Computational finance: MATLAB; Mathematica; NAG; Murex
      Defence & intelligence: Ikena; Intergraph; ESRI; Manifold
      Digital content creation: Adobe; Autodesk M&E; Avid; MainConcept; Sony
      Physical sciences: Quda (L-QCD); WRF; ACUSA; HOMME; HYCOM
      Seismic processing and visualization: Schlumberger; Landmark; Paradigm
  • 5. Leading MD Applications
      Application | Features Supported | GPU Perf | Release Status | Notes
      AMBER | PMEMD: Explicit & Implicit Solvent | 8X | V11 Released | Single and multi-GPUs. Expect 2x more performance in V11 patch release (shortly)
      GROMACS | Implicit (5x), Explicit (2x) Solvent | 2x-5x | Single GPU released, Version 4.5.4 | Next release: 2H2011. Better Explicit, MPI
      LAMMPS | Lennard-Jones, Gay-Berne | 6x | Released | Single and multi-GPU.
      NAMD | Non-bond force calculation | 2x-7x | Released, v2.8 | Single and multi-GPU.
      GPU Perf compared against multi-core x86 CPU socket. GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.
  • 6. Additional MD/MM Applications Ramping
      Application | Features Supported | GPU Perf | Release Status | Notes
      Abalone | TBD, “Simulations” | 4-29X (on 1060 GPU) | Released | Single GPU. Agile Molecule, Inc.
      ACEMD | Written for use on GPUs | “µ-sec long trajectories on workstation” | Released | Production bio-molecular dynamics (MD) software specially optimized to run on single and multi-GPUs
      DL_POLY | Two-body Forces, Link-cell Pairs, Ewald SPME forces, Shake VV | 4x | V 4.0 Source only, Results Published | Next release: 2H2011. Multi-GPU, multi-node supported
      HOOMD-Blue | Written for use on GPUs | 2X (32 CPU cores vs. 2 10XX GPUs) | Released, Version 0.9.2 | Single and multi-GPU.
      GPU Perf compared against multi-core x86 CPU socket. GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.
  • 7. Viz and “Docking” Applications
      Related Applications | Features Supported | GPU Perf | Release Status | Notes
      Amira 5® | 3D visualization of volumetric data and surfaces | N/A | Released, Version 5.3.3 | Visualization from Visage Imaging. Next release, 5.4, will use GPU for general purpose processing in some functions
      Core Hopping | GPU accelerated application | Up to 5000X | Released, Suite 2011 | Single and multi-GPUs. Schrodinger, Inc.
      FastROCS | Real-time shape similarity searching/comparison | 800-3000X | Released | Single and multi-GPUs. Open Eyes Scientific Software
      VMD | High quality rendering, large structures (100 million atoms), GPU acceleration for computationally demanding analysis and visualization tasks, multiple GPU support for very fast display of molecular orbitals arising in quantum chemistry calculations | 100-125X or greater | Released, Version 1.9 | Visualization from University of Illinois at Urbana-Champaign
      GPU Perf compared against multi-core x86 CPU socket. GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.
  • 8. Quantum Chemistry
      Application | Features Supported | GPU Perf | Release Status | Notes
      GAMESS-US | Libqc with Rys Quadrature Algorithm, integral evaluation, closed shell Fock matrix construction | 2.5X | Released | Single GPU supported in 10/1/10 release. Multi-GPU supported in July 2011 release.
      NWChem | Triples part of Reg-CCSD(T), CCSD & EOMCCSD task schedulers | 3-8X projected | Date TBA, in development | Development GPGPU benchmarks: www.nwchem-sw.org
      Q-CHEM | Various features including RI-MP2 | 8-14x projected | Date TBA, in development | Significant porting already
      TeraChem | “Full GPU-based solution” | 44-650X vs. GAMESS CPU ver. | Version 1.45 released | Single and multi-GPU. Completely redesigned to exploit massive GPU parallelism
      GPU Perf compared against multi-core x86 CPU socket. GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.
  • 9. Material Science
      Application | Features Supported | GPU Perf | Release Status | Notes
      Abinit | BigDFT - 50% of the program (short convolutions) | 6-30X | Released June 2009 | http://inac.cea.fr/L_Sim/BigDFT/news.html
      Quantum-Espresso/PWscf | PWscf package: linear algebra (matrix multiply), explicit computational kernels, 3D FFTs | TBD | Released May 5, 2011 | Created by Irish Centre for High-End Computing
      GPU Perf compared against multi-core x86 CPU socket. GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.
  • 10. Bioinformatics: CUDA-BLASTP; CUDA-EC; CUDA-MEME; CUDASW++ (Smith-Waterman); DNADist; GPU Blast; GPU-HMMER; HEX Protein Docking; Jacket (MATLAB Plugin); MUMmerGPU; MUMmerGPU++; SARUMAN; SeqNFind; UGENE.
      Additional details can be found at Tesla Bio Workbench: http://www.nvidia.com/object/tesla_bio_workbench.html
  • 11. Structural Mechanics
      Application | GPU Features | GPU Perf | Release Status | Notes
      ANSYS Mechanical | Linear eqn solvers | 2x Total | Today, release 13 SP2 | FE implicit, single-GPU
      Abaqus/Standard | Linear eqn solver | 2x Total | Today, release 6.11 | FE implicit, single-GPU
      IMPETUS Afea | Explicit solver, SPH | 10x SPH, 2x Total | Today, release 1.0 | FE explicit, multi-GPU
      LS-DYNA implicit | Linear eqn solver | 3x Total | Planned for 2011 | FE implicit, multi-GPU
      MD Nastran | Linear eqn solvers | 2x Solver | Planned for 2011 | FE implicit, multi-GPU
      Marc | Linear eqn solver | 1.5x Total | Planned for 2011 | FE implicit, single-GPU
      RADIOSS Implicit | Linear eqn solver | 1.5x Total | Demonstration | FE implicit, single-GPU
      PAM-CRASH implicit | Linear eqn solver | 1.5x Total | Demonstration | FE implicit, single-GPU
      NX Nastran | Linear eqn solver | 1.4x Total | Demonstration | FE implicit, single-GPU
      GPU Perf compared against multi-core x86 CPU socket. GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.
  • 12. Fluid Dynamics
      Application | GPU Features | GPU Perf | Release Status | Notes
      Altair AcuSolve | Linear eqn solver | 2x Total | Today, release 1.8 | FE unstructured NS, multi-GPU
      Autodesk Moldflow | Linear eqn solver | 2x Total | Today, release 2011 | FE unstructured NS, single-GPU
      FluiDyna LBultra | LBM, particle CFD | 20x Total | Today, release 1.0 | Structured LBM, multi-GPU
      FluiDyna Culises-OpenFOAM Solver | Linear eqn solvers | 3x Solver | Today, release 1.0 | Unstructured NS, single-GPU
      Vratis SpeedIT-OpenFOAM Solver | Linear eqn solvers | 3x Solver | Today, release 1.2 | Unstructured NS, multi-GPU
      Prometech Particleworks | MPS, particle CFD | 4x-9x Total | Q3CY11 release 2.5 | Particle based, multi-GPU
      Sandia NL S3D | Chemistry kernel | 8x SP, 5x DP kernel | Demonstration | Structured grid DNS, multi-GPU
      Turbostream | Explicit solver | 19x Total | Today, release 2.0 | Structured grid NS, multi-GPU
      SD++ (Jameson) | Explicit solver | 16x Total | Planned for 2011 | FE unstructured NS, multi-GPU
      FEFLO (Lohner) | Explicit solver | 2x Total | Planned for 2011 | FE unstructured NS, multi-GPU
      GPU Perf compared against multi-core x86 CPU socket. GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.
  • 13. Electromagnetics
      Application | Features Supported | GPU Perf | Release Status | Notes
      Agilent EMPro | FDTD | 6X | 2011.07 Released | Single & multi-GPU; EMPro 2011 PR
      CST Microwave Studio | Transient (FIT) solver; Combined MPI & GPU computing | 9X on 1 GPU to 20X+ on 4 GPUs | 2011 Released | Single & multi-GPU; www.cst.com/perf
      Remcom XFdtd | FDTD | 30-300X | XF7 Released | Single and multi-GPU; XStream GPU acceleration
      SPEAG SEMCAD X | FDTD; Acceleware | 100X | 14.4.3 Released | Single and multi-GPU; www.speag.com/perf
      GPU performance compared against quad-core x86 CPU socket; Remcom XFdtd GPU performance compared against single core CPU.
  • 14. Climate/Weather/Ocean
      Application | GPU Features | GPU Perf | Production Status | Notes
      WRF | WSM5, WSM3, Ice Microphysics models | 4x-6x Models | Today, release 3.2 | single-GPU
      ASUCA | Most routines | 12x Total | In production at JMA | multi-GPU
      NIM | Most routines | 7x Dynamics | Limited production | multi-GPU
      HIRLAM | Dynamical core | 3x Solver | Planned for 2011 | multi-GPU
      HOMME | Models | 3x Models | Planned for 2011 | single-GPU
      CAM | Linear eqn solver | 2x Solver | Planned for 2011 | single-GPU
      GEOS-5 | Most routines | 10x Models, 3x Dynamics | Demonstration | multi-GPU
      MITgcm | Linear eqn solver | 3x solver | Demonstration | single-GPU
      HYCOM | Linear eqn solver | 2x solver | Demonstration | single-GPU
      GPU Perf compared against multi-core x86 CPU socket. GPU Perf benchmarked on GPU supported features and may be a kernel to kernel perf comparison.