2. Tesla K40
FASTER: 1.4 TF | 2880 Cores | 288 GB/s
LARGER: 2x Memory (6GB to 12GB) Enables More Apps: Fluid Dynamics, Rendering, Seismic Analysis
SMARTER: Unlock Extra Performance Using Power Headroom
[Chart: AMBER Benchmark, ns/day (axis 0 to 5), for CPU, K20X, and K40]
AMBER Benchmark: SPFP-Nucleosome
CPU: Dual E5-2687W @ 3.10GHz, 64GB System Memory, CentOS 6.2, GPU systems: Single Tesla K20X or Single Tesla K40
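The headline figures on this slide (1.4 TF, 2880 cores, 288 GB/s) can be roughly cross-checked with back-of-the-envelope arithmetic. The sketch below is not from the slides: it assumes the 1.4 TF figure is double-precision peak at the 745 MHz base clock quoted on the GPU Boost slide, that there is one double-precision unit per three CUDA cores, that a fused multiply-add counts as 2 FLOPs, and that the memory interface is 384 bits wide at about 6 GT/s effective. All of those are labeled as assumptions in the code.

```python
# Figures taken from the slides.
CUDA_CORES = 2880
BASE_CLOCK_MHZ = 745

# ASSUMPTIONS (not stated in the slides).
FLOPS_PER_FMA = 2        # one fused multiply-add counted as 2 FLOPs
CORES_PER_DP_UNIT = 3    # one double-precision unit per 3 CUDA cores
BUS_WIDTH_BITS = 384     # memory interface width
MEM_RATE_GTPS = 6.0      # effective transfer rate per pin, in GT/s


def peak_tflops(units: int, clock_mhz: float) -> float:
    """Peak TFLOPS for `units` FMA units running at `clock_mhz`."""
    return units * FLOPS_PER_FMA * clock_mhz * 1e6 / 1e12


def peak_bandwidth_gbs(bus_bits: int, rate_gtps: float) -> float:
    """Peak memory bandwidth in GB/s: bytes per transfer times GT/s."""
    return bus_bits / 8 * rate_gtps


dp_units = CUDA_CORES // CORES_PER_DP_UNIT
print(f"Double precision: {peak_tflops(dp_units, BASE_CLOCK_MHZ):.2f} TF")
print(f"Single precision: {peak_tflops(CUDA_CORES, BASE_CLOCK_MHZ):.2f} TF")
print(f"Memory bandwidth: {peak_bandwidth_gbs(BUS_WIDTH_BITS, MEM_RATE_GTPS):.0f} GB/s")
```

Under these assumptions the double-precision and bandwidth results land close to the slide's 1.4 TF and 288 GB/s. These are theoretical peaks, not measured application performance.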
3. Tesla K40 : Acceleration for Large Problems
[Chart: speedup by application, axis 0 to 12; legend: E5-2687W @ 3.10GHz, Tesla K20X, Tesla K40]
Structural Mechanics: ANSYS 14 SMP-V14sp-4, 1.66x
Physics: Chroma, 8.12x
Molecular Dynamics: AMBER - SPFP-Cellulose_production_NPT, 8.67x
Material Science: QMCPACK 4x4x1, 10.23x
Earth Science: SPECFEM3D, 10.29x
4. Bigger Challenges – Less Time
CFD: Larger models, higher throughput
Neural Networks: Larger training sets
High Energy Physics: More advanced event triggers
Graph Analytics: Accelerate larger graphs
Material Science: Larger ion/electron systems
M&E: More complex scenes, accelerates color grading
Molecular Dynamics: Larger problems, more acceleration
Quantum Chemistry: Larger problems, more acceleration
Bioinformatics: Newer algorithms, apps
5. Tesla K40 in Media and Entertainment
Creation
Color grading for film and video
Distribution
3D Rendering
Video Frame Rate Conversion
Transcoding and Encoding broadcast video
6. Tesla K40 : Interactive and Real-time Analysis
1 Billion Tweets
8 Tesla K40
Live Streaming and Analysis
Faster Decisions
To learn more: register for the map-D webinar on 29th Jan @ 9am PST
7.
8. Power Envelope
[Chart: Board Power (Watts), axis 0 to 235W; bars show Avg GPU Power in Watts for Real Applications on K20X]
Power headroom to higher Performance
9. GPU Boost on Tesla K40
Convert Power Headroom to Higher Performance
Base Clock: 745 MHz, Workload #1 (worst-case reference app), 235W
Boost Clock #1: 810 MHz, Workload #2 (e.g. AMBER), 235W
Boost Clock #2: 875 MHz, Workload #3 (e.g. ANSYS Fluent), 235W
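To put the three clock levels in proportion, the short sketch below computes each boost clock as a multiple of the base clock, using only the MHz values on this slide. The function name `clock_uplift` is made up for illustration.

```python
# Clock values in MHz, as listed on the slide.
BASE_MHZ = 745
BOOST_MHZ = {"Boost Clock #1": 810, "Boost Clock #2": 875}


def clock_uplift(boost_mhz: float, base_mhz: float = BASE_MHZ) -> float:
    """Return the boost clock as a multiple of the base clock."""
    return boost_mhz / base_mhz


for name, mhz in BOOST_MHZ.items():
    ratio = clock_uplift(mhz)
    print(f"{name}: {mhz} MHz = {ratio:.3f}x base (+{(ratio - 1) * 100:.1f}%)")
```

The clock ratio is only a rough ceiling on what a purely compute-bound workload could gain from boosting. Comparisons against the K20X involve a different board, so they are not directly comparable to these ratios.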
10. Real Apps Run Up to 1.4x faster with GPU Boost
[Chart: Tesla K40 Performance Relative to Tesla K20X, axis 0.0 to 1.6; series: K20X, K40@base, K40 @ boost]
Applications: ANSYS 14 SMP-V14sp-4, LAMMPS-EAM, NAMD 2.9-APOA1, AMBER-SPFP-Nucleosome, LSMS-Fe32, QMCPACK 3x3x1, CUBLAS DGEMM
Bar labels (per-application mapping not recoverable): 1.40, 1.27, 1.27, 1.23, 1.28, 1.26, 1.25, 1.07
11. Compute Workload Behavior with GPU Boost
Non-Tesla vs. Tesla K40
GPU clock behavior: Non-Tesla = Automatic clock switching (Base Clock, Boost Clock #1, Boost Clock #2); Tesla K40 = Deterministic Clocks
Default @ Shipping: Non-Tesla = Boost; Tesla K40 = Base
Preset Options: Non-Tesla = Lock to base clock; Tesla K40 = 3 Levels: Base, Boost1 or Boost2
Boost Interface: Non-Tesla = Control Panel; Tesla K40 = NV-SMI, NVML
Target duration for boost clocks: Non-Tesla = ~50% of run-time; Tesla K40 = 100% of workload run time (must-have for HPC workload)
12. Using GPU Boost on Tesla K40
View the clocks:
nvidia-smi -q -d CLOCK,SUPPORTED_CLOCKS
Set the Boost clocks:
nvidia-smi -ac <MEM clock,Graphics clock>
End user selects the clocks (Host -> GPU)
Boost all 2880 Cores
Higher memory b/w
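The two commands above can be wrapped in a small script. The following is a minimal sketch, not an official tool: the helper names are hypothetical, the graphics clocks are the three levels from the GPU Boost slide, and the memory clock of 3004 MHz is an assumption that should be replaced with whatever `nvidia-smi -q -d SUPPORTED_CLOCKS` reports on the actual board. It defaults to a dry run that only prints the commands.

```python
import shlex
import subprocess
from typing import List

# Graphics clocks in MHz, from the GPU Boost slide.
GRAPHICS_CLOCKS_MHZ = {"base": 745, "boost1": 810, "boost2": 875}

# ASSUMPTION: memory clock in MHz. Check SUPPORTED_CLOCKS on your own board.
ASSUMED_MEM_CLOCK_MHZ = 3004


def query_clocks_cmd() -> List[str]:
    """Command that lists current and supported clocks."""
    return ["nvidia-smi", "-q", "-d", "CLOCK,SUPPORTED_CLOCKS"]


def set_clocks_cmd(level: str, mem_mhz: int = ASSUMED_MEM_CLOCK_MHZ) -> List[str]:
    """Command that sets application clocks as <MEM clock,Graphics clock>."""
    graphics_mhz = GRAPHICS_CLOCKS_MHZ[level]
    return ["nvidia-smi", "-ac", f"{mem_mhz},{graphics_mhz}"]


def run(cmd: List[str], dry_run: bool = True) -> None:
    """Print the command; execute it only when dry_run is False."""
    print(shlex.join(cmd))
    if not dry_run:
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run(query_clocks_cmd())
    run(set_clocks_cmd("boost2"))
```

Passing `dry_run=False` executes the commands for real. Changing application clocks typically needs administrator privileges, so the dry run is a safer default when trying this out.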
13. Customer Feedback on K40 w/GPU Boost
http://www.eyesopen.com/fastrocs
17% Faster
13% Faster
http://blog.xcelerit.com/benchmarks-nvidia-tesla-k40-vsk20x-gpu/
11% Faster
K40 w/GPU Boost 40% higher perf
*Tesla K40 Performance Relative to Tesla K20X
14. Tesla Resources
§ Want to know more about Tesla Products
http://www.nvidia.com/object/tesla-servers.html
http://www.nvidia.com/object/tesla-workstations.html
§ Need help on using GPU Boost on Tesla K40
http://www.nvidia.com/object/tesla_product_literature.html
§ Product details, specs, etc.
http://www.nvidia.com/object/tesla_product_literature.html
§ Where to buy
http://www.nvidia.com/object/where-to-buy-tesla.html
15. Test Drive the World’s Fastest GPU
1. Sign up for FREE GPU Test Drive
visit: http://www.Nvidia.com/GPUTestDrive
2. Accelerate your apps on the latest K40 GPUs
3. Tell us how K40 and GPU Boost worked for you
16. Upcoming GTC Express Webinars
January 29: map-D: A GPU Database for Real-time Big Data Analytics and Interactive Visualization
January 30: Debugging CUDA Fortran using Allinea DDT
February 5: OpenMM - Accelerating and Customizing Molecular Dynamics Simulations on GPUs
February 25: Using GPUs to Supercharge Visualization and Analysis of Molecular Dynamics Simulations with VMD
Register at www.gputechconf.com/gtcexpress
17. GTC 2014 Registration is Open
Hundreds of sessions in the areas of
§ Science and research
§ Professional graphics
§ Mobile computing
§ Automotive applications
§ Game development
§ Cloud computing
Register with GM20EXP for 20% discount
www.gputechconf.com