[Chart: TeraFLOPS, GPU vs. CPU, 2003 to 2013; vertical axis 0 to 5]
GTC — GROWING AND EXPANDING 
2010: 397
2012: 429
2014: 729
FASTEST GROWING TOPICS 
Big Data Analytics 
Machine Learning 
Computer Vision 
FASTEST GROWING TOPICS 
Energy Exploration 
Life Science & Genomics 
Molecular Dynamics 
#1 TOPIC 
HPC / Supercomputing
2012 
2013 
2014 
FOSTERING THE GPU ECOSYSTEM Big Data / Cloud / Computer Vision 
AudioStreamTV
CUDA EVERYWHERE
Takayuki Aoki 
Global Scientific Information and Computing Center Tokyo Institute of Technology 
“Large-scale CFD Applications and a Full GPU Implementation of a Weather Prediction Code on the TSUBAME Supercomputer”
BANDWIDTH BOTTLENECKS
PCI Express (CPU to GPU): 16GB/sec
CPU Memory: 60GB/sec
GPU Memory: 288GB/sec
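To make the bottleneck concrete, the following minimal Python sketch estimates how long it takes to move a buffer at each of the three rates on this slide. It reads the figures in slide order as PCI Express 16GB/sec, CPU memory 60GB/sec and GPU memory 288GB/sec. The 12 GB buffer size and the function names are assumptions chosen for illustration, and the numbers are peak rates, so real transfers would likely be slower.

```python
# Peak rates from the slide, in GB/sec.
# The pairing of labels to numbers is assumed from the order on the slide.
BANDWIDTH_GB_PER_SEC = {
    "PCI Express": 16.0,
    "CPU Memory": 60.0,
    "GPU Memory": 288.0,
}


def transfer_seconds(size_gb, link):
    """Idealized time to stream size_gb gigabytes over the named link."""
    return size_gb / BANDWIDTH_GB_PER_SEC[link]


def slowdown_vs_gpu_memory(link):
    """How many times slower the named link is than GPU memory."""
    return BANDWIDTH_GB_PER_SEC["GPU Memory"] / BANDWIDTH_GB_PER_SEC[link]


if __name__ == "__main__":
    size_gb = 12.0  # hypothetical buffer size, not from the slides
    for link in BANDWIDTH_GB_PER_SEC:
        print(f"{link}: {transfer_seconds(size_gb, link):.3f} s, "
              f"{slowdown_vs_gpu_memory(link):.1f}x slower than GPU memory")
```

Under these peak figures, data crossing PCI Express moves 18 times slower than data already resident in GPU memory.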
INTRODUCING NVLINK 
CPU 
GPU 
PCIe 
Differential with embedded clock 
PCIe programming model (w/ DMA+) 
Unified Memory 
Cache coherency in Gen 2.0 
5 to 12X PCIe
5X More Bandwidth for Multi-GPU Scaling 
GPU 
PCIe SWITCH 
CPU 
GPU 
GPU 
GPU
3D MEMORY 
3D Chip-on-Wafer integration 
Many X bandwidth 
2.5X capacity 
4X energy efficiency 
[Chart: Memory Bandwidth, 2008 to 2016; vertical axis 0 to 1200]
Blaise Pascal 
1623-1662 
Mechanical Calculator 
Probability Theory 
Pascal’s Theorem 
Pascal’s Law
PASCAL
NVLink: 5 to 12X PCIe 3.0
3D Memory: 2 to 4X memory BW & size
Module: 1/3 size of PCIe card
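The multipliers on this slide can be turned into rough absolute ranges. The sketch below is illustrative only. It assumes the baselines are the 16GB/sec PCI Express and 288GB/sec GPU memory figures from the Bandwidth Bottlenecks slide. The deck does not say which baselines the "5 to 12X" and "2 to 4X" claims refer to, so the results are an estimate rather than a specification.

```python
# Assumed baselines, taken from the "Bandwidth Bottlenecks" slide (GB/sec).
PCIE_GB_PER_SEC = 16.0
GPU_MEMORY_GB_PER_SEC = 288.0


def scaled_range(baseline, low_factor, high_factor):
    """Return the (low, high) range implied by a 'low to high X' claim."""
    return (baseline * low_factor, baseline * high_factor)


# "5 to 12X PCIe 3.0"
nvlink_range = scaled_range(PCIE_GB_PER_SEC, 5, 12)
# "2 to 4X memory BW"
memory_range = scaled_range(GPU_MEMORY_GB_PER_SEC, 2, 4)

print(f"NVLink: {nvlink_range[0]:.0f} to {nvlink_range[1]:.0f} GB/sec")
print(f"3D Memory: {memory_range[0]:.0f} to {memory_range[1]:.0f} GB/sec")
```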
GPU ROADMAP
[Chart: SGEMM / W Normalized, 2008 to 2016; vertical axis 0 to 20]
Tesla: CUDA
Fermi: FP64
Kepler: Dynamic Parallelism
Maxwell: DX12
Pascal: Unified Memory, 3D Memory, NVLink
MACHINE LEARNING 
Branch of Artificial Intelligence 
Computers that learn from data 
Image labels: person, car, helmet, motorcycle, bird, frog, person, dog, chair, person, hammer, flower pot, power drill
Machine Learning using Deep Neural Networks 
Input 
Result
Building High-level Features Using Large Scale Unsupervised Learning 
Q. Le, M. Ranzato, R. Monga, M. Devin, K. Chen, G. Corrado, J. Dean, A. Ng 
Stanford / Google 
1 billion connections 
10 million 200x200 pixel images 
1,000 machines (16,000 cores) 
3 days
1,000 CPU Servers 2,000 CPUs • 16,000 cores 
600 kWatts 
$5,000,000 
GOOGLE BRAIN 
Today’s Largest Networks 
1B connections 
10M images 
~3 days 
~30 ExaFLOPS 
Human Brain 
~100B neurons x 1000 connections 
500M images 
5,000,000X “Google Brain” 
~150 YottaFLOPS 
~40,000 “Google Brain-Years” 
SOURCE: Ian Goodfellow
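The slide does not show how the 5,000,000X factor is derived. One reading that reproduces all three human-brain figures is to multiply the ratio of connections by the ratio of training images. The Python sketch below works through that arithmetic. The derivation is an assumption, and "ExaFLOPS" is treated as a total operation count, as the slide appears to use it.

```python
EXA = 1e18
YOTTA = 1e24

# Figures from the slide.
brain_connections = 100e9 * 1000   # ~100B neurons x 1000 connections
google_connections = 1e9           # 1B connections
brain_images = 500e6               # 500M images
google_images = 10e6               # 10M images
google_total_ops = 30 * EXA        # "~30 ExaFLOPS"
google_days = 3                    # "~3 days"

# Assumed derivation: work scales with connections and with images.
scale = (brain_connections / google_connections) * (brain_images / google_images)
brain_total_ops = google_total_ops * scale
brain_years = google_days * scale / 365

print(f"scale factor: {scale:,.0f}X")
print(f"total work: about {brain_total_ops / YOTTA:.0f} Yotta operations")
print(f"Google Brain-years: about {brain_years:,.0f}")
```

Under this reading the scale factor is exactly 5,000,000, the total is about 150 Yotta operations, and the run time is about 41,000 years, which the slide rounds to ~40,000.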
Deep Learning with COTS HPC Systems 
A. Coates, B. Huval, T. Wang, D. Wu, A. Ng, B. Catanzaro 
Stanford / NVIDIA • ICML 2013 
STANFORD AI LAB 
3 GPU-Accelerated Servers 12 GPUs • 18,432 cores 
4 kWatts 
$33,000 
“Now You Can Build Google’s $1M Artificial Brain on the Cheap” -Wired
1,000 CPU Servers 2,000 CPUs • 16,000 cores 
600 kWatts 
$5,000,000 
GOOGLE BRAIN
DEMO: MACHINE LEARNING, SIMPLE TRAINING SET
DEMO: MACHINE LEARNING, NYU OVERFEAT
Image training set: 1.2M
Classes: 1000
Weeks of training: 2
GPUs: 7
EXAFLOPS total to train: 25
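These training figures imply an average sustained rate per GPU. The sketch below is a back-of-the-envelope estimate, not a measured result. It assumes that "25 EXAFLOPS total to train" means 25 x 10^18 floating-point operations in total, and that all 7 GPUs ran continuously for the full 2 weeks.

```python
EXA = 1e18
TERA = 1e12
SECONDS_PER_WEEK = 7 * 24 * 60 * 60

# Figures from the slide.
total_ops = 25 * EXA     # assumed to be a total operation count
weeks_of_training = 2
gpus = 7


def sustained_tflops_per_gpu(ops, weeks, gpu_count):
    """Average rate per GPU, assuming continuous use for the whole period."""
    seconds = weeks * SECONDS_PER_WEEK
    return ops / seconds / gpu_count / TERA


per_gpu = sustained_tflops_per_gpu(total_ops, weeks_of_training, gpus)
print(f"about {per_gpu:.2f} TFLOPS sustained per GPU")
```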
CUDA for MACHINE LEARNING 
Talks @ GTC 
Image Detection 
Face Recognition 
Gesture Recognition 
Video Search & Analytics 
Speech Recognition & Translation 
Recommendation Engines 
Indexing & Search 
Use Cases 
Early Adopters 
Image Analytics for Creative Cloud 
Image Classification 
Speech/Image Recognition 
Recommendation 
Hadoop 
Search Rankings
Big Data & Infinite Compute Turbocharge Deep Learning 
SOURCE: KPCB/Mary Meeker, company data. Unstructured data: IDC's Digital Universe Study. 
800M photos uploaded per day 
100 hours of video uploaded per minute 
Unstructured data exploding 
[Chart: photos uploaded per day (Millions), 2007 to 2014; series: Facebook, Instagram, Snapchat, Flickr; vertical axis 0 to 900]
[Chart: Hours (YouTube), 2007 to 2013; vertical axis 0 to 120]
[Chart: Exabytes of data: 1,104 in 2010, 5,379 in 2015]
DEMO: TITAN Z REVEAL
5,760 CUDA cores 
12GB memory 
8 TeraFLOPS 
$2999
STANFORD AI LAB 
1 Titan Z-Accelerated Server 3 Titan Zs • 17,280 cores 
2 kWatts $12,000 
1,000 CPU Servers 2,000 CPUs • 16,000 cores 
600 kWatts 
$5,000,000 
GOOGLE BRAIN 
300X energy efficiency 
400X lower cost 
Fits next to a desk
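The headline ratios on this slide follow from the power and cost figures listed for the two systems. The Python sketch below recomputes them. The `System` class and `ratio` helper are hypothetical names introduced for illustration. The comparison covers power and price only, and treats the two systems as doing equivalent work, which the slide implies but does not quantify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class System:
    name: str
    kilowatts: float
    cost_usd: float
    cores: int


# Figures from the slide.
google_brain = System("Google Brain", kilowatts=600, cost_usd=5_000_000, cores=16_000)
titan_z_server = System("Titan Z server", kilowatts=2, cost_usd=12_000, cores=17_280)

# Consistency check: 3 Titan Zs at 5,760 CUDA cores each.
assert 3 * 5_760 == titan_z_server.cores


def ratio(baseline, other, field):
    """Return baseline.field divided by other.field."""
    return getattr(baseline, field) / getattr(other, field)


power_ratio = ratio(google_brain, titan_z_server, "kilowatts")
cost_ratio = ratio(google_brain, titan_z_server, "cost_usd")

print(f"power: {power_ratio:.0f}X lower")
print(f"cost: {cost_ratio:.0f}X lower")
```

The power ratio comes out at exactly 300X. The cost ratio is about 417X, which the slide rounds down to 400X.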
RenderMan with programmable shading 
1.5 hours to render each frame 
CCI 6/32 minicomputer 
First CGI Film Nominated for an Academy Award®
State-of-the-art water simulator
48 hours to simulate the base water
250 hours to render each frame
2013 Academy Award® Winner BEST VISUAL EFFECTS
DEMO: WHALE
DEMO: FLEX
DEMO: FLAMEWORKS
DEMO: UE4
One is a photo, One is Iray…
Bunkspeed 
Maya 
Catia 
3ds Max 
IRAY VCA SCALABLE GPU RENDERING APPLIANCE
GPUs: 8 Kepler-class
GPU memory: 12GB per GPU
CUDA cores: 23,040
Network: 2 x 1GigE, 2 x 10GigE, 1 x InfiniBand
DEMO: IRAY / HONDA
[Chart: Relative Performance, vertical axis 0 to 80; bars: CPU-only Workstation, Quadro K5000 Workstation, Iray VCA]
Bunkspeed 
Maya 
Catia 
3ds Max 
IRAY VCA SCALABLE GPU RENDERING APPLIANCE 
MSRP $50,000
GRID GPU in the Cloud
Ben Fathi 
Chief Technology Officer 
Horizon DaaS Platform
Mobile CUDA
“10 of the Top 10” Greenest Supercomputers Powered by CUDA GPUs
Unify GPU and Tegra Architecture 
192 fully programmable CUDA cores 
326 GFLOPS 
4X energy efficiency over A15 
TEGRA K1 Mobile Super Chip 
MOBILE ARCHITECTURE 
Maxwell 
Kepler 
Tesla 
Fermi 
Tegra 3 
Tegra 4 
Tegra K1 
GPU ARCHITECTURE
Computer Vision on CUDA
Feature Detection / Tracking: ~30 GFLOPS @ 30 Hz
Object Recognition / Tracking: ~180 GFLOPS @ 30 Hz
3D Scene Interpretation: ~280 GFLOPS @ 30 Hz
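These workload figures can be set against the 326 GFLOPS quoted for Tegra K1 earlier in the deck. The sketch below is a rough budget calculation. It pairs each workload with a figure in slide order and compares against peak throughput, so it says nothing about sustained performance on real vision code.

```python
TEGRA_K1_PEAK_GFLOPS = 326.0  # from the Tegra K1 slide
FRAME_RATE_HZ = 30.0

# Approximate requirements from the slide, in GFLOPS at 30 Hz.
WORKLOADS_GFLOPS = {
    "Feature Detection / Tracking": 30.0,
    "Object Recognition / Tracking": 180.0,
    "3D Scene Interpretation": 280.0,
}


def gflop_per_frame(gflops):
    """Work available per frame at the slide's 30 Hz frame rate."""
    return gflops / FRAME_RATE_HZ


def fraction_of_peak(gflops):
    """Share of Tegra K1 peak throughput the workload would need."""
    return gflops / TEGRA_K1_PEAK_GFLOPS


for name, gflops in WORKLOADS_GFLOPS.items():
    print(f"{name}: {gflop_per_frame(gflops):.2f} GFLOP per frame, "
          f"{fraction_of_peak(gflops):.0%} of peak")
```

All three workloads fit under the quoted peak, with 3D scene interpretation needing roughly 86% of it.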
JETSON TK1 1st MOBILE SUPERCOMPUTER FOR EMBEDDED SYSTEMS 
192 CUDA cores 
326 GFLOPS 
VisionWorks SDK 
$192
VISIONWORKS 
COMPUTER VISION ON CUDA 
Driver Assistance Computational Photography 
Augmented Reality Robotics 
CUDA 
Jetson TK1 
VisionWorks Primitives 
Your Code 
Sample Pipelines 
Object Detection / Tracking
Structure from Motion … 
Classifier Corner Detection …
TEGRA ROADMAP
[Chart: Single Precision GFLOPS / W Normalized, 2011 to 2015; vertical axis 0 to 80]
Tegra 2
Tegra 3
Tegra 4
Tegra K1: Kepler GPU, CUDA, 64b & 32b CPU
Erista: Maxwell GPU
Andreas Reich 
Head of Audi Pre-Development
VIDEO: AUDI ADAS
CUDA EVERYWHERE 
PASCAL 
PC 
CLOUD 
MOBILE
DEMO: PORTAL ON SHIELD
GPU Technology Conference 2014 Keynote
