2. 2
WHAT IS OPENACC

Add Simple Compiler Directive:

main()
{
  <serial code>
  #pragma acc kernels
  {
    <parallel code>
  }
}

SIMPLE, POWERFUL & PORTABLE
Directives-based programming model for parallel computing
Designed for performance and portability on CPUs and GPUs
Open Specification Developed by OpenACC.org Consortium
3. 3
OPENACC GROWING MOMENTUM
Wide Adoption Across Key HPC Codes
[Chart: VASP speedup on silica IFPEN benchmark, RMM-DIIS on P100]
ANSYS Fluent
Gaussian
VASP
LSDalton
MPAS
GAMERA
GTC
XGC
ACME
FLASH
COSMO
Numeca
OVER 100 APPS* USING OpenACC
Prof. Georg Kresse
Computational Materials Physics
University of Vienna
"For VASP, OpenACC is the way forward for GPU acceleration. Performance is similar to CUDA, and OpenACC dramatically decreases GPU development and maintenance efforts. We're excited to collaborate with NVIDIA and PGI as an early adopter of Unified Memory."
VASP
Top Quantum Chemistry and Material Science Code
* Applications in production and development
4. 4
SINGLE CODE FOR MULTIPLE PLATFORMS
OpenACC - Performance Portable Programming Model for HPC
Platforms: x86 CPU, x86 Xeon Phi, OpenPOWER, Sunway, PEZY-SC, NVIDIA GPU, AMD GPU
[Chart: Speedup vs. single Haswell core. Intel 2018 OpenMP and PGI 18.1 OpenACC on multicore Haswell, Broadwell, and Skylake (roughly 7.6x to 15x); PGI 18.1 OpenACC on Kepler, Pascal, and Volta V100 GPUs (roughly 40x to 142x)]
AWE Hydrodynamics CloverLeaf mini-App, bm32 data set
Systems: Haswell: 2x16 core Haswell server, four K80s, CentOS 7.2 (perf-hsw10); Broadwell: 2x20 core Broadwell server, eight P100s (dgx1-prd-01); Broadwell server, eight V100s (dgx07); Skylake: 2x20 core Xeon Gold server (sky-4).
Compilers: Intel 2018.0.128, PGI 18.1
Benchmark: CloverLeaf v1.3 downloaded from http://uk-mac.github.io/CloverLeaf the week of November 7 2016; CloverLeaf_Serial; CloverLeaf_ref (MPI+OpenMP); CloverLeaf_OpenACC (MPI+OpenACC)
Data compiled by PGI February 2018.
5. 5
OPENACC IN THE NEWS
InsideHPC: "Accelerating HPC Applications on NVIDIA GPUs with OpenACC" [READ MORE]
The Next Platform: "OpenACC Developments: Past, Present, and Future" [READ MORE]
6. 6
USER GROUP MEETING
Tuesday, March 27th at 7:30 PM
INVITED SPEAKERS | FOOD, DRINKS & FUN
VMD with OpenACC: John Stone, Senior Research Programmer, Theoretical and Computational Biophysics Group and NIH Center for Macromolecular Modeling and Bioinformatics
GCC OpenACC Updates: Randy Allen, Director of Advanced Research at Mentor Graphics and lead developer of the OpenACC GCC implementation
REGISTER HERE
7. 7
OPENACC AT GTC 2018 – MARCH 26-29TH
Talks, Tutorials, Labs, User Group Meeting

Featured Talk | Speaker
Accelerating Molecular Modeling Tasks on Desktop and Pre-Exascale Supercomputers | John Stone, Senior Research Programmer, University of Illinois
An Agile Approach to Building a GPU-enabled and Performance-portable Global Cloud-resolving Atmospheric Model | Richard Loft - CO, National Center for Atmospheric Research
Analysis of Performance Gap Between OpenACC and the Native Approach on P100 GPU and SW26010: A Case Study with GTC-P | Stephen Wang, GPU Specialist, Shanghai Jiao Tong University
Porting VASP to GPUs with OpenACC | Markus Wetzstein, HPC DevTech Engineer, NVIDIA; Stefan Maintz, DevTech Engineer, NVIDIA
LEARN MORE
8. 8
TUTORIALS AND LABS AT GTC

Tutorial/Lab | Instructor
In-depth Performance Analysis for OpenACC/CUDA/OpenCL Applications with Score-P and Vampir | Robert Henschel, Director Science Community Tools, Indiana University; Guido Juckeland, Head of Computational Science Group, Helmholtz-Zentrum Dresden-Rossendorf
Best GPU Code Practices Combining OpenACC, CUDA, and OmpSs | Antonio J. Peña, Sr. Researcher, Barcelona Supercomputing Center (BSC)
Programming GPU-Accelerated OpenPOWER Systems with OpenACC | Andreas Herten, Post-Doctoral Researcher GPUs in HPC, Jülich Supercomputing Centre
Fundamentals of Accelerated Computing with OpenACC | Jeff Larkin, Senior DevTech Software Engineer, NVIDIA
LEARN MORE
9. 9
RESOURCES
Paper: Accelerating a Landscape Evolution Model with
Parallelism
By Richard Barnes, Energy & Resources Group, Berkeley, USA
“Solving inverse problems and achieving statistical rigour in landscape
evolution models requires running many model realizations...The new
algorithm runs 43x faster (70 s vs. 3,000 s on a 10,000 x 10,000 input)
than the previous state of the art and exhibits sublinear scaling with
input size...Tips for parallelization and a step-by-step guide to
achieving it are given to help others achieve good performance with
their own code. Complete, well-commented, easily adaptable source
code for all versions of the algorithm is available as a supplement and
on Github.”
READ NOW
10. 10
UPCOMING EVENTS
COMPLETE LIST OF EVENTS
Event | Date
GTC 2018, San Jose, California, USA | March 26-29, 2018
HPC Advisory Council: Swiss Conference 2018 | April 9-12, 2018
OpenACC Workshop, BSC, Barcelona, Spain | April 12-13, 2018
Pawsey Hackathon, Perth, Australia | April 16-20, 2018
CSCS Directive Based GPU Programming | May 14-15, 2018
GPU Hackathon: UC Boulder | June 4-8, 2018
1st HPC Summer School in Colombia | June 5-9, 2018
International Workshop on OpenPOWER for HPC | June 28, 2018
GPU Hackathon: Brookhaven National Lab | July 9-13, 2018
CSCS-USI SUMMER SCHOOL 2018 | July 15-27, 2018
11. 11
CALL FOR PAPERS
Event | Date
CU Boulder Hackathon, Boulder, CO, USA | March 31, 2018
3rd International Workshop on Performance Portable Programming Models for Accelerators (P^3MA) | April 3, 2018
So what is OpenACC? OpenACC is a directives-based programming model designed for performance and portability on CPUs and GPUs. It was created with scientists and engineers in mind: those who want to port their codes to a wide variety of architectures and are looking for a solution that helps them do so with minimal effort.
With OpenACC, users can achieve significant acceleration of their codes within days or weeks by simply adding compiler directives, leaving their original code mostly untouched. As a result, more time goes to science and less to programming. In addition to being a simple way to start with GPUs, OpenACC also allows the same code base to be used on multiple platforms, which saves a tremendous amount of time for scientists working on machines with different architectures.
OpenACC is an open specification developed by the OpenACC organization, which includes 20 members, with NVIDIA as one of the founding members.
If you are interested in learning more about the OpenACC organization and its members, please check out the openacc.org website.
OpenACC is an established programming model that has been adopted by over 100 applications, including leading HPC codes.
In fact, 3 of the top 5 HPC applications, as ranked by Intersect360 Research, have adopted OpenACC.
Gaussian, the leading quantum chemistry code, has deployed OpenACC in production across the code base.
ANSYS Fluent, the top computational fluid dynamics code, uses OpenACC.
VASP, a material science and quantum chemistry code, is in the process of developing an OpenACC version.
Oak Ridge National Lab, through its Center for Accelerated Application Readiness (CAAR) program, has selected OpenACC for 5 of the 13 codes to be ported to GPUs and made ready for the Summit supercomputer.
We will cover key OpenACC applications in more detail later in this presentation.
OpenACC is unique because, unlike other programming models, it delivers performance portability: the same code runs on multiple platforms with high performance.
Implementations of OpenACC now support a variety of architectures, from x86 and OpenPOWER CPUs and NVIDIA GPUs to Sunway and PEZY-SC chips.
To illustrate performance portability, we ran the CloverLeaf mini-app on multiple platforms using OpenACC and native CPU compilers. CloverLeaf achieves the same performance on x86 CPUs with the native Intel compiler as with the PGI 18.1 OpenACC implementation. Moreover, the OpenACC code stayed the same across the CPU and GPU runs. And yes, Pascal and Volta GPUs deliver the highest performance for CloverLeaf.