SpeedIT FLOW: a GPU success story
A GPU Success Story in 5 steps* 
* Compare with Accelerated ANSYS Fluent presentation by Robert Strzodka (NVIDIA).
Step 1: Identify the application 
OpenFOAM Computational Fluid Dynamics 
Why OpenFOAM*:
Free to use and open source, and increasingly popular in the CFD community.
Large user base across engineering industries, e.g. automotive, oil & gas,
materials, aviation, etc.
OPENFOAM® is a registered trademark of ESI Group. 
This offering is not approved or endorsed by ESI Group, the 
producer of the OpenFOAM software and owner of the 
OPENFOAM® and OpenCFD® trademarks.
Step 2: Identify the bottlenecks 
◼Partial acceleration (see Fig.) is not effective.
◼ofgpu (from Symscape), Culises (Fluidyna) and nvAMG (NVIDIA) offer only
mild acceleration for non-linear problems.
◼Maximal acceleration vs. Intel Xeon: ca. 1.8x (nvAMG).
Problem: Amdahl's Law cannot be ignored.
Solution: Implement the whole solver on GPU.
Fig. Navier-Stokes equation calculation scheme: Assemble Linear System of
Equations -> Solve Linear System of Equations -> Converged? (figure labels:
GPU; ca. 40% of runtime; ca. 60% of runtime).
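The ca. 1.8x ceiling is roughly what Amdahl's Law predicts. A minimal sketch, assuming that the ca. 60% label in the figure belongs to the linear solve and that the solve is the only part moved to the GPU; the function name is illustrative and not part of any library:

```python
def amdahl_speedup(accelerated_fraction, factor):
    """Overall speedup when only `accelerated_fraction` of the runtime
    is sped up by `factor` (Amdahl's Law)."""
    serial = 1.0 - accelerated_fraction
    return 1.0 / (serial + accelerated_fraction / factor)


# Assumption: the linear solve takes ca. 60% of the runtime (figure label).
SOLVE_FRACTION = 0.6

# Upper bound: an infinitely fast solver still leaves assembly on the CPU.
upper_bound = amdahl_speedup(SOLVE_FRACTION, float("inf"))

# Solver-only speedup that would yield the quoted overall 1.8x,
# obtained by inverting Amdahl's formula for `factor`.
overall = 1.8
implied_solver_factor = SOLVE_FRACTION / (1.0 / overall - (1.0 - SOLVE_FRACTION))

print(f"upper bound with solver-only acceleration: {upper_bound:.2f}x")
print(f"solver speedup implied by 1.8x overall: {implied_solver_factor:.2f}x")
```

Under these assumptions the bound is 2.5x no matter how fast the GPU solver is, which is why the deck moves assembly to the GPU as well.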
Step 3: Parallelize the Algorithm 
◼Implement non-linear solvers fully on GPU:
  ◼PISO and SIMPLE flow solvers*
  ◼AMG preconditioner (CUSP).
  ◼kOmegaSST turbulence model.
  ◼Various boundary conditions.
* Acceleration of Iterative Navier-Stokes solvers on GPUs, Int J Comp Fluid Dynamics 07/2013; 27(4-5):201-209 
Fig. The whole scheme on GPU: Assemble Linear System of Equations -> Solve
Linear System of Equations -> Converged?
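The assemble / solve / converged loop from the figure can be sketched on a toy problem. This is not the SpeedIT FLOW implementation and it is neither PISO nor SIMPLE: it is a plain fixed-point (Picard) iteration on a hypothetical 2x2 system whose matrix depends on the solution, so the matrix must be reassembled on every pass. All names are illustrative.

```python
def assemble(u):
    """Build the linearized system A(u) x = b from the current iterate u."""
    a = [[2.0 + u[0] ** 2, -1.0],
         [-1.0, 2.0 + u[1] ** 2]]
    b = [1.0, 1.0]
    return a, b


def solve_2x2(a, b):
    """Solve a 2x2 linear system with Cramer's rule."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    x0 = (b[0] * a[1][1] - a[0][1] * b[1]) / det
    x1 = (a[0][0] * b[1] - a[1][0] * b[0]) / det
    return [x0, x1]


def picard(u0, tol=1e-10, max_iter=200):
    """Repeat assemble -> solve until the iterate stops changing."""
    u = list(u0)
    for iteration in range(1, max_iter + 1):
        a, b = assemble(u)            # step 1: assemble linear system
        u_new = solve_2x2(a, b)       # step 2: solve linear system
        change = max(abs(n - o) for n, o in zip(u_new, u))
        u = u_new
        if change < tol:              # step 3: converged?
            return u, iteration
    raise RuntimeError("Picard iteration did not converge")


solution, iterations = picard([0.0, 0.0])
print(f"solution={solution}, iterations={iterations}")
```

In a real flow solver both steps operate on large sparse matrices; if either one stays on the CPU, its share of the runtime caps the overall gain.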
Step 4: Create Production-ready Library
◼Team of GPU specialists and physicists (Vratis, Wroclaw universities).
◼Understand the algorithm, parallelize with caution, adapt to GPU hardware 
limitations. 
◼Invest in validation and testing with industrial partners. 
◼Have fun working with industry leaders!
Step 5: Enjoy the acceleration 
Motorbike (OpenFOAM tutorial)
6.5M cells
External, aero, turbulent flow
simpleFoam, double precision
Meshing on CPU, solving on GPU
3.18x acceleration
Step 5: Enjoy the acceleration 
Aero* (based on Motorbike)
3M cells
External, aero, turbulent flow
simpleFoam, double precision
Meshing on CPU, solving on GPU
2.76x acceleration
* Geometry obtained from 4-id network.
Step 5: Enjoy the acceleration 
Solar Car* (based on Motorbike)
3.7M cells
External, aero, turbulent flow
simpleFoam, double precision
Meshing on CPU, solving on GPU
2.85x acceleration
* Geometry obtained from 4-id network.
SpeedIT FLOW: Hardware 
CPU: 2 x Intel Xeon E5649 (12M Cache, 2.53 GHz, 5.86 GT/s Intel® QPI),
96GB RAM 1333 MHz
# cores: 2x6, # threads: 2x12
Software: OpenFOAM 2.1
GPU: Quadro K6000, 12GB GDDR5 SDRAM
Software: SpeedIT FLOW
OS: Ubuntu 12.04.4 LTS, 64 bit
SpeedIT FLOW: OpenFOAM accelerator 
Use your GPU as an external accelerator for your OpenFOAM cases and use
the CPU as usual.
4 Aero simulations on CPU: 4 x 5493 sec (12 CPU cores occupied)
4 Aero simulations on GPU: 4 x 2100 sec (1 CPU core occupied)
… and your CPU is still not busy! You can use it for meshing, visualization
or other simulations (Fig: 3xGPU + 1x11 CPU cores is 3.2x faster!).
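The per-case gain behind these timings is simple arithmetic. A small sketch that uses only the two timings quoted on this slide and assumes the four cases run one after another on each device; it does not try to reproduce the 3.2x figure, which depends on the mixed GPU/CPU run shown in the figure:

```python
CPU_SECONDS_PER_CASE = 5493   # from the slide, 12 CPU cores occupied
GPU_SECONDS_PER_CASE = 2100   # from the slide, 1 CPU core occupied
CASES = 4

# Assumption: cases run sequentially on each device.
cpu_total = CASES * CPU_SECONDS_PER_CASE
gpu_total = CASES * GPU_SECONDS_PER_CASE
speedup = cpu_total / gpu_total

print(f"CPU batch: {cpu_total} s, GPU batch: {gpu_total} s")
print(f"GPU batch is {speedup:.2f}x faster and leaves 11 CPU cores idle")
```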
What is SpeedIT FLOW?
SpeedIT FLOW 1.0 
◼Integration:
Dynamically linked library with a public C interface and an easy-to-use API.
◼Reader/Writer of OpenFOAM* cases.
◼GPU accelerated solvers: gico, gsimple, gpiso.
◼Turbulence model: kOmegaSST.
◼Solves problems with geometries of up to 7 million cells on a single card
(with 12GB GPU RAM).
◼Supports AMD** and NVIDIA cards.
* Tested with OpenFOAM ver. 2.0.1 
** SpeedIT FLOWCL to be released.
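As a rough sizing aid, the 7 million cells on 12GB data point can be scaled to other cards. This is only a back-of-the-envelope sketch: it assumes memory use grows linearly with cell count, which the slides do not state, and the helper names are made up for illustration.

```python
REFERENCE_CELLS = 7_000_000   # from the slide
REFERENCE_GPU_GB = 12         # from the slide


def max_cells(gpu_gb):
    """Estimated cell limit for a card with `gpu_gb` of memory.
    Assumption: memory use scales linearly with the number of cells."""
    return int(REFERENCE_CELLS * gpu_gb / REFERENCE_GPU_GB)


def fits(cells, gpu_gb):
    """True if a case of `cells` cells is within the estimated limit."""
    return cells <= max_cells(gpu_gb)


# Cell counts of the three benchmark cases from the deck.
for name, cells in [("Motorbike", 6_500_000),
                    ("Aero", 3_000_000),
                    ("Solar Car", 3_700_000)]:
    print(f"{name}: fits on 12GB -> {fits(cells, 12)}, on 6GB -> {fits(cells, 6)}")
```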
SpeedIT FLOW: usage 
◼Run from command line 
simpleFoamToSpeedITflow -case ./myCaseInOF   # OpenFOAM -> SpeedIT FLOW format conversion
gpiso ./myCaseInOF &                         # solve the case on GPU; use the CPU for other tasks
paraFoam -case ./myCaseInOF                  # visualize the result
◼Use API
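The command-line steps above can also be driven from a script. A minimal sketch, assuming the three tools are on PATH and accept exactly the arguments shown on the slide; no other flags are assumed, the function names are hypothetical, and the SpeedIT FLOW API itself is not used here.

```python
import subprocess


def build_commands(case_dir):
    """Return the three command lines from the slide for one case."""
    return {
        "convert": ["simpleFoamToSpeedITflow", "-case", case_dir],
        "solve": ["gpiso", case_dir],
        "view": ["paraFoam", "-case", case_dir],
    }


def run_case(case_dir):
    """Convert, solve on the GPU in the background, then visualize."""
    commands = build_commands(case_dir)
    subprocess.run(commands["convert"], check=True)

    # Popen returns immediately, like the trailing '&' in the shell version,
    # so the CPU is free for meshing or other work while the GPU solves.
    solver = subprocess.Popen(commands["solve"])
    solver.wait()
    if solver.returncode != 0:
        raise RuntimeError(f"solver exited with code {solver.returncode}")

    subprocess.run(commands["view"], check=True)
```

Calling `run_case("./myCaseInOF")` mirrors the slide, except that this sketch waits for the solver to finish before opening the viewer.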
Don’t wait. Just SpeedIT. 
Vratis 
Muchoborska 18 
50-424 Wroclaw, Poland 
Questions? 
Contact us at: info@vratis.com
