Deep Dive on Amazon EC2
Accelerated Computing Instances
Clinton Ford
Sr. Product Manager, AWS
August 7th, 2018
Amazon EC2 Instance Types
General Purpose: M5, T2
Compute Optimized: C5, C4
Storage Optimized: H1, I3, D2
Memory Optimized: X1e, R4
Accelerated Computing: F1, P3, G3
EC2 Accelerated Computing Instances
F1: FPGA instance
• Up to 8 Xilinx Virtex® UltraScale+™ VU9P FPGAs in a single instance. Programmable via
VHDL, Verilog, or OpenCL. Growing marketplace of pre-built application accelerations.
• Designed for hardware-accelerated applications including financial computing, genomics,
accelerated search, and image processing
G3: GPU Graphics Instance
• Up to 4 NVIDIA M60 GPUs, with GRID Virtual Workstation features and licenses
• Designed for workloads such as 3D rendering, 3D visualizations, graphics-intensive remote
workstations, video encoding, and virtual reality applications
P3: GPU Compute Instance
• Up to 8 NVIDIA V100 GPUs in a single instance, with NVLink for peer-to-peer GPU
communication
• Supporting a wide variety of use cases including deep learning, HPC simulations, financial
computing, and batch rendering
CPUs vs GPUs vs FPGA for Compute

CPU
• 10s-100s of processing cores
• Pre-defined instruction set & datapath widths
• Optimized for general-purpose computing

GPU
• 1,000s of processing cores
• Pre-defined instruction set and datapath widths
• Highly effective at parallel execution

FPGA
• Millions of programmable digital logic cells
• No predefined instruction set or datapath widths
• Hardware-timed execution

[Block diagrams comparing CPU and GPU architectures: arrangements of Control, ALU, Cache, and DRAM blocks.]
AWS EC2 F1 Instances for
Custom Hardware Acceleration
Parallel Processing in FPGAs

An FPGA is effective at processing data of many types in parallel, for example creating a complex pipeline of parallel, multistage operations on a video stream, or performing massive numbers of dependent or independent calculations for a complex financial model…

• An FPGA does not have an instruction set!
• Data can be any bit-width (9-bit integer? No problem!)
• Complex control logic (such as a state machine) is easy to implement in an FPGA

Each FPGA in F1 has more than 2 million of these programmable logic cells.
F1 FPGA instance types on AWS

• Up to 8 Xilinx UltraScale+ 16nm VU9P FPGA devices in a single instance
• The f1.16xlarge size provides:
  • 8 FPGAs, each with over 2 million customer-accessible FPGA programmable logic cells and over 5,000 programmable DSP blocks
  • Each of the 8 FPGAs has 4 DDR-4 interfaces, with each interface accessing a 16 GiB, 72-bit wide, ECC-protected memory

Instance Size | FPGAs | FPGA Memory (GB) | vCPUs | Instance Memory (GB) | NVMe Instance Storage (GB) | Network Bandwidth
f1.2xlarge    | 1     | 64               | 8     | 122                  | 1 x 470                    | Up to 10 Gbps
f1.16xlarge   | 8     | 512              | 64    | 976                  | 4 x 940                    | 25 Gbps
3 methods to use F1 instances

1. Hardware Engineers/Developers
• Developers who are comfortable programming FPGAs
• Use the F1 Hardware Development Kit (HDK) to develop and deploy custom FPGA accelerations using Verilog and VHDL (a sketch of registering the resulting design as an AFI follows this list)

2. Software Engineers/Developers
• Developers who are not proficient in FPGA design
• Use OpenCL to create custom accelerations

3. Software Engineers/Developers
• Developers who are not proficient in FPGA design
• Use pre-built, ready-to-use accelerations available in the AWS Marketplace
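Whichever development path is used, a custom design ultimately has to be registered with EC2 as an Amazon FPGA Image (AFI) before it can be loaded onto an F1 instance. A minimal sketch of that registration call with boto3 is shown below; the bucket names, keys, and image name are placeholders, and in practice the aws-fpga tooling wraps this step.

```python
# Hedged sketch: register a synthesized design checkpoint (DCP) as an AFI.
# Bucket names, keys, and the image name below are placeholders, not real resources.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.create_fpga_image(
    Name="example-accelerator",                  # placeholder AFI name
    Description="Example F1 acceleration",       # free-form description
    InputStorageLocation={                       # where the tarred DCP was uploaded
        "Bucket": "my-fpga-bucket",
        "Key": "dcp/example.Developer_CL.tar",
    },
    LogsStorageLocation={                        # where AFI creation logs are written
        "Bucket": "my-fpga-bucket",
        "Key": "logs/",
    },
)

# The global AFI ID (AGFI) is what the FPGA management tools on the
# F1 instance use when loading the image onto an FPGA slot.
print(response["FpgaImageId"], response["FpgaImageGlobalId"])
```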
AWS EC2 G3 Instances for
Graphics Acceleration
AWS G3 GPU instances

• Up to four NVIDIA M60 GPUs
• Includes GRID Virtual Workstation features and licenses, supports up to four monitors with 4096x2160 (4K) resolution
• Includes NVIDIA GRID Virtual Application capabilities for application virtualization software like Citrix XenApp Essentials and VMware Horizon, supporting up to 25 concurrent users per GPU
• Hardware encoding to support up to 10 H.265 (HEVC) 1080p30 streams and up to 18 H.264 1080p30 streams per GPU
• Designed for workloads such as 3D rendering, 3D visualizations, graphics-intensive remote workstations, video encoding, and virtual reality applications

Instance Size | GPUs | vCPUs | Memory (GiB) | Linux price per hour (IAD) | Windows price per hour (IAD)
g3.4xlarge    | 1    | 16    | 122          | $1.14                      | $1.88
g3.8xlarge    | 2    | 32    | 244          | $2.28                      | $3.75
g3.16xlarge   | 4    | 64    | 488          | $4.56                      | $7.50
4 Modes of using G3 instances

g3.4xlarge example: 16 vCPUs, 1 x M60 GPU, 122 GB memory, up to 10 Gbps network.

• EC2 instance with NVIDIA drivers & libraries: graphics rendering, simulations, video encoding
• EC2 instance with NVIDIA GRID Virtual Workstation: professional workstation (single user)
• EC2 instance with NVIDIA GRID Virtual Application: virtual apps (25 concurrent users)
• EC2 instance with NVIDIA GRID for Gaming: gaming services
G3 Use Cases

Highlighted examples: M&E content creation, automotive car configurators, E&P analytics.

• Seismic analysis, energy E&P, cloud GPU rendering & visualization, such as high-end car configurators, AR/VR
• Desktop and application virtualization
• Productivity and consumer apps
• Design and engineering
• Media and entertainment post-production
• Media and entertainment: video playout/broadcast, encoding/transcoding
• Cloud gaming
AWS EC2 P3 Instances for
Compute Acceleration
Amazon EC2 P3 Instances (October 2017)

One of the fastest, most powerful GPU instances in the cloud

• Up to eight NVIDIA Tesla V100 GPUs
• 1 petaflop of computational performance – up to 14x better than P2
• 300 GB/s GPU-to-GPU communication (NVLink) – 9X better than P2
• 16 GB GPU memory with 900 GB/sec peak GPU memory bandwidth
Use Cases for P3 Instances

Machine Learning/AI: natural language processing, image and video recognition, autonomous vehicle systems, recommendation systems
High Performance Computing: computational fluid dynamics, financial and data analytics, weather simulation, computational chemistry
P3 Instances Details

Instance Size | GPUs | GPU Peer to Peer | vCPUs | Memory (GB) | Network Bandwidth | EBS Bandwidth | On-Demand Price/hr* | 1-yr RI Effective Hourly* | 3-yr RI Effective Hourly*
p3.2xlarge    | 1    | No               | 8     | 61          | Up to 10 Gbps     | 1.7 Gbps      | $3.06               | $1.99 (35% disc.)         | $1.23 (60% disc.)
p3.8xlarge    | 4    | NVLink           | 32    | 244         | 10 Gbps           | 7 Gbps        | $12.24              | $7.96 (35% disc.)         | $4.93 (60% disc.)
p3.16xlarge   | 8    | NVLink           | 64    | 488         | 25 Gbps           | 14 Gbps       | $24.48              | $15.91 (35% disc.)        | $9.87 (60% disc.)
Regional Availability
P3 instances are generally available in the AWS US East (Northern Virginia), US East (Ohio), US West (Oregon), EU (Ireland), Asia Pacific (Seoul), Asia Pacific (Tokyo), AWS GovCloud (US), and China (Beijing) Regions.

Framework Support
P3 instances and their V100 GPUs are supported across all major frameworks (such as TensorFlow, MXNet, PyTorch, Caffe2, and CNTK).
AWS P3 vs P2 Instance

GPU Performance Comparison

• P2 instances use the K80 accelerator (Kepler architecture)
• P3 instances use the V100 accelerator (Volta architecture)

[Bar charts comparing K80, P100, and V100: FP32 Perf (TFLOPS), FP64 Perf (TFLOPS), and Mixed/FP16 Perf (TFLOPS), with call-outs of 1.7X, 2.6X, and 14X respectively (the last measured against the K80's max FP32 performance).]
ResNet-50 Training Performance (Using Synthetic Data, MXNet)

Accelerators | P2 (1 Accelerator = 2 GPUs) Images/s | P3 (1 Accelerator = 1 GPU) Images/s | Speedup
1            | 112                                  | 820                                 | 7.3X
4            | 430                                  | 3,240                               | 7.5X
8            | 846                                  | 6,300                               | 7.4X
P3 Instances Details

Instance Size | GPUs | GPU Peer to Peer | vCPUs | Memory (GB) | Network Bandwidth | EBS Bandwidth | On-Demand Price/hr* | 1-yr RI Effective Hourly* | 3-yr RI Effective Hourly*
p3.2xlarge    | 1    | No               | 8     | 61          | Up to 10 Gbps     | 1.7 Gbps      | $3.06               | $1.99 (35% disc.)         | $1.23 (60% disc.)
p3.8xlarge    | 4    | NVLink           | 32    | 244         | 10 Gbps           | 7 Gbps        | $12.24              | $7.96 (35% disc.)         | $4.93 (60% disc.)
p3.16xlarge   | 8    | NVLink           | 64    | 488         | 25 Gbps           | 14 Gbps       | $24.48              | $15.91 (35% disc.)        | $9.87 (60% disc.)
• P3 instances provide GPU-to-GPU data transfer over NVLink
• P2 instances provide GPU-to-GPU data transfer over PCI Express
P3 vs P2 Peer-to-Peer Configurations

Description                                             | P3.16xlarge             | P2.16xlarge            | P3 GPU Performance Improvement
Number of GPUs                                          | 8                       | 16                     | -
Number of Accelerators                                  | 8 (V100)                | 8 (K80)                | -
GPU Peer to Peer                                        | NVLink – 300 GB/s       | PCI-Express – 32 GB/s  | 9.4X
CPU to GPU Throughput (PCIe throughput per GPU)         | 8 GB/s                  | 1 GB/s                 | 8X
CPU to GPU Throughput (total instance PCIe throughput)  | 64 GB/s (Four x16 Gen3) | 16 GB/s (One x16 Gen3) | 4X
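The NVLink paths on the multi-GPU P3 sizes show up to applications as GPU peer access. As an illustration only (the deck does not prescribe a tool), a minimal sketch of checking this at runtime with PyTorch:

```python
# Hedged sketch: enumerate GPUs and report which pairs support direct
# peer-to-peer transfers (NVLink on p3.8xlarge/p3.16xlarge, PCIe otherwise).
import torch

def report_peer_access() -> None:
    n = torch.cuda.device_count()
    for src in range(n):
        for dst in range(n):
            if src == dst:
                continue
            ok = torch.cuda.can_device_access_peer(src, dst)
            print(f"GPU{src} -> GPU{dst}: peer access {'enabled' if ok else 'unavailable'}")

if __name__ == "__main__":
    report_peer_access()
```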
P3 PCIe and NVLink Configurations

[Topology diagram: CPU0 and CPU1 connected by QPI; each CPU attaches to four GPUs (GPU0–GPU3 and GPU4–GPU7) through PCIe switches; the GPUs are interconnected with NVLink.]
P3 PCIe and NVLink Configurations (continued)

[The same topology diagram, repeated with 0xFF annotations on the GPUs.]
AWS Storage Options

Amazon S3: Secure, durable, highly scalable object storage. Fast access, low cost. For long-term durable storage of data in a readily accessible get/put access format. Primary durable and scalable storage for data.

Amazon Glacier: Secure, durable, long-term, highly cost-effective object storage. For long-term storage and archival of data that is infrequently accessed. Use for long-term, lower-cost archival of data.

EC2+EBS: Create a single-AZ shared file system using EC2 and EBS, with third-party or open source software (e.g., ZFS, Intel Lustre, etc.). For near-line storage of files optimized for high I/O performance. Use for high-IOPS, temporary working storage.

EFS: Highly available, multi-AZ, fully managed network-attached elastic file system. For near-line, highly available storage of files in a traditional NFS format (NFSv4). Use for read-often, temporary working storage.
Data Ingestion Options

• Within a P3 instance, we have maxed out the data throughput into the GPUs (PCI Express to/from the host CPUs) and between GPUs (NVLink)
• To keep the GPUs highly utilized, you need a high-throughput data stream coming into the P3 instance
• Option 1: Use multiple EBS volumes
  • Each Provisioned IOPS SSD (io1) EBS volume can provide about 500 MB/s of read or write throughput (it needs to be provisioned with 20,000 IOPS)
  • Customers can use independent EBS volumes or combine multiple volumes via RAID into a single logical volume (5 io1 volumes can support 1.65 GB/s)
  • http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/raid-config.html
• Option 2: Amazon S3 -> EC2
  • Data transfer from Amazon S3 directly into EC2 has been increased from 5 Gbps to 25 Gbps
  • You need to parallelize connections to Amazon S3, for example by using the TransferManager available in the AWS SDK for Java (a Python sketch of the same idea follows this list)
  • https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/examples-s3-transfermanager.html
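The slide points to the Java SDK's TransferManager; purely as an illustration, the same parallel-transfer idea can be sketched in Python with boto3's managed transfers (the bucket, key, and local path below are placeholders):

```python
# Hedged sketch: parallelized S3 download to feed a P3 instance.
# Bucket, key, and local path are placeholders, not real resources.
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Multipart transfers with many concurrent threads approximate what the
# Java SDK's TransferManager does for large objects.
config = TransferConfig(
    multipart_threshold=64 * 1024 * 1024,   # split objects larger than 64 MiB
    multipart_chunksize=64 * 1024 * 1024,   # 64 MiB parts
    max_concurrency=16,                     # parallel part downloads
)

s3.download_file(
    Bucket="my-training-data-bucket",
    Key="imagenet/train-shard-0001.rec",
    Filename="/data/train-shard-0001.rec",
    Config=config,
)
```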
Software Support for P3

Required Drivers & Libraries
• Hardware driver version 384.81 or newer
• CUDA 9 or newer
• cuDNN 7 or newer & NCCL 2.0 or newer (generally packaged with CUDA)

Machine Learning Frameworks
• To take advantage of the new Tensor Cores in V100 GPUs, customers will need to use the latest releases of their ML framework (a quick version-check sketch follows below)
• All major frameworks have formally released support for V100 GPUs (e.g., TensorFlow, MXNet, PyTorch, Caffe)
• http://docs.nvidia.com/deeplearning/sdk/pdf/Training-Mixed-Precision-User-Guide.pdf
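As a quick sanity check that an instance has the expected CUDA/cuDNN stack and that the framework actually sees the V100s, something like the following PyTorch snippet can be used (an illustrative sketch, not part of the original deck):

```python
# Hedged sketch: verify the GPU software stack on a P3 instance.
import torch

print("CUDA available:", torch.cuda.is_available())
print("CUDA version (framework build):", torch.version.cuda)
print("cuDNN version:", torch.backends.cudnn.version())

for i in range(torch.cuda.device_count()):
    # Expect a Tesla V100 device name on P3 instances.
    print(f"GPU{i}:", torch.cuda.get_device_name(i))
```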
AWS Deep Learning AMI
• Get started quickly with easy-to-launch tutorials
• Hassle-free setup and configuration
• Pay only for what you use – no additional charge for
the AMI
• Accelerate your model training and deployment
• Support for popular deep learning frameworks
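The Deep Learning AMI can be launched from the console, CLI, or any SDK; a minimal boto3 sketch is shown below. The AMI ID, key pair, and security group are placeholders (the actual Deep Learning AMI ID varies by region and release).

```python
# Hedged sketch: launch a p3.2xlarge running the AWS Deep Learning AMI.
# The AMI ID, key name, and security group below are placeholders.
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # look up the current Deep Learning AMI for your region
    InstanceType="p3.2xlarge",
    MinCount=1,
    MaxCount=1,
    KeyName="my-key-pair",
    SecurityGroupIds=["sg-0123456789abcdef0"],
)

instance_id = response["Instances"][0]["InstanceId"]
print("Launched", instance_id)
```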
Amazon SageMaker

Build, train, and deploy machine learning models at scale

• End-to-end machine learning platform
• Zero setup
• Flexible model training
• Pay by the second
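For completeness, a hedged sketch of starting a training job on a V100-backed instance with the SageMaker Python SDK (v2-style parameter names) is shown below; the container image URI, IAM role ARN, and S3 paths are placeholders and not taken from the deck.

```python
# Hedged sketch: start a SageMaker training job on a V100-backed instance.
# The container image, IAM role, and S3 locations are placeholders.
import sagemaker
from sagemaker.estimator import Estimator

session = sagemaker.Session()

estimator = Estimator(
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/my-training-image:latest",
    role="arn:aws:iam::123456789012:role/MySageMakerRole",
    instance_count=1,
    instance_type="ml.p3.2xlarge",        # V100-backed training instance
    output_path="s3://my-bucket/model-artifacts/",
    sagemaker_session=session,
)

# Billing is per second of training time, as highlighted on the slide.
estimator.fit({"training": "s3://my-bucket/training-data/"})
```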
EC2 Accelerated Computing Instances
F1: FPGA instance
• Up to 8 Xilinx Virtex® UltraScale+™ VU9P FPGAs in a single instance. Programmable via
VHDL, Verilog, or OpenCL. Growing marketplace of pre-built application accelerations.
• Designed for hardware-accelerated applications including financial computing, genomics,
accelerated search, and image processing
G3: GPU Graphics Instance
• Up to 4 NVIDIA M60 GPUs, with GRID Virtual Workstation features and licenses
• Designed for workloads such as 3D rendering, 3D visualizations, graphics-intensive remote
workstations, video encoding, and virtual reality applications
P3: GPU Compute Instance
• Up to 8 NVIDIA V100 GPUs in a single instance, with NVLink for peer-to-peer GPU
communication
• Supporting a wide variety of use cases including deep learning, HPC simulations, financial
computing, and batch rendering
Thank You!