Scientific Computing
Science is one of the greatest areas of
computation and can benefit from the
democratization of cost and the global
accessibility that the cloud brings.
It’s also where we think Amazon can
make a huge, really disruptive impact
on the world by participating, which is, at
the most basic level, what we are about
as a company.
Existing
1. Oregon
2. California
3. Virginia
4. Dublin
5. Frankfurt
6. Singapore
7. Sydney
8. Seoul
9. Tokyo
10. Sao Paulo
11. Beijing
12. US GovCloud
1. Ohio
2. India
3. UK
4. Canada
5. China+1
AWS Region
Availability Zone
Regions are sovereign: your data never
leaves the region you put it in.
Meeeeelions of uncorrelated workloads
[Chart: cores against time]
Collective action
When everyone comes
together in the cloud to
share the resource, and
only pays for what they
use, the efficiency is
huge.
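The efficiency claim above can be illustrated with a small simulation. This is a minimal sketch, assuming each workload's demand is random and independent of the others; the workload count, step count and 0 to 10 core range are invented for illustration and are not AWS figures.

```python
import random


def simulate(num_workloads=200, num_steps=500, seed=0):
    """Return (sum_of_peaks, peak_of_sum) for independent random workloads."""
    rng = random.Random(seed)
    # Hypothetical demand: each workload needs 0-10 cores at each time step.
    demand = [[rng.randint(0, 10) for _ in range(num_steps)]
              for _ in range(num_workloads)]
    # Capacity needed if every workload buys hardware for its own peak.
    sum_of_peaks = sum(max(series) for series in demand)
    # Capacity needed if all workloads share one pool of cores.
    peak_of_sum = max(sum(step) for step in zip(*demand))
    return sum_of_peaks, peak_of_sum


if __name__ == "__main__":
    separate, shared = simulate()
    print(f"separate: {separate} cores, shared: {shared} cores")
```

Because the peaks of uncorrelated workloads rarely line up, the shared pool needs noticeably fewer cores than the sum of the individual peaks.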
Spot Market
[Chart: cores against time]
Our ultimate space
filler.
Spot Instances allow you
to name your own price
for spare AWS EC2
computing capacity.
Great for workloads that
aren’t time sensitive, and
especially popular in
research (hint: it’s really
cheap).
Spot Market Behavior
Spot Bid Advisor
The Spot Bid Advisor
analyzes Spot price history
to help you determine a bid
price that suits your needs.
You should weigh your
application’s tolerance for
interruption and your cost
saving goals when selecting
a Spot instance and bid
price.
The lower your frequency of
being outbid, the longer
your Spot instances are
likely to run without
interruption.
https://aws.amazon.com/ec2/spot/bid-advisor/
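The relationship between bid price and frequency of being outbid can be sketched from a price history. This is an illustration only, not how the Spot Bid Advisor itself works; the function name and the price list are hypothetical.

```python
def outbid_frequency(price_history, bid):
    """Fraction of observations in which the Spot price was above the bid."""
    if not price_history:
        raise ValueError("price_history must not be empty")
    outbid = sum(1 for price in price_history if price > bid)
    return outbid / len(price_history)


# Hypothetical hourly Spot prices in USD, invented for illustration.
history = [0.40, 0.42, 0.45, 0.90, 0.43, 0.41, 1.20, 0.44]

for bid in (0.45, 1.00, 1.50):
    freq = outbid_frequency(history, bid)
    print(f"bid ${bid:.2f}: outbid in {freq:.1%} of observations")
```

A higher bid lowers the outbid frequency, which is the trade-off against cost that the advisor asks you to weigh.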
Bid Price & Savings
Your bid price affects your
ranking when it comes to
acquiring resources in the
Spot market, and is the
maximum price you will pay.
But frequently you’ll pay a
lot less.
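A rough sketch of why you often pay less than your bid, assuming a simplified model in which you are charged the market price for every hour the price is at or below your bid and do not run otherwise. The prices and the function name are hypothetical.

```python
def spot_cost(price_history, bid):
    """Return (hours_run, total_cost), assuming one price observation per hour."""
    hours_run = 0
    total_cost = 0.0
    for price in price_history:
        if price <= bid:
            hours_run += 1
            total_cost += price  # charged the market price, not the bid
    return hours_run, total_cost


# Hypothetical hourly Spot prices in USD, invented for illustration.
prices = [0.40, 0.42, 0.45, 0.90, 0.43]
bid = 0.50

hours, cost = spot_cost(prices, bid)
print(f"ran {hours} hours, paid ${cost:.2f}, bid ceiling ${hours * bid:.2f}")
```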
Breakthrough discoveries in the Cloud
The CHILES project astronomers have detected radio emissions from
hydrogen in a galaxy more than 5 billion light years away, almost doubling the
previous record distance. This has important implications for our
understanding of how galaxies have evolved over time.
The team at ICRAR in Western Australia estimates that the amount of
compute capacity required to shift and crunch this data would otherwise have made this
work infeasible.
By using AWS, they were able to quickly and cheaply build their new
pipelines, and then scale them as massive amounts of data arrived from their
instruments.
AWS Building blocks
TECHNICAL & BUSINESS SUPPORT: Account Management, Support, Professional Services, Solutions Architects, Training & Certification, Security & Pricing Reports, Partner Ecosystem
AWS MARKETPLACE: Backup, Big Data & HPC, Business Apps, Databases, Development, Industry Solutions, Security
MANAGEMENT TOOLS: Queuing, Notifications, Search, Orchestration, Email
ENTERPRISE APPS: Virtual Desktops, Storage Gateway, Sharing & Collaboration, Email & Calendaring, Directories
HYBRID CLOUD MANAGEMENT: Backups, Deployment, Direct Connect, Identity Federation, Integrated Management
SECURITY & MANAGEMENT: Virtual Private Networks, Identity & Access, Encryption Keys, Configuration, Monitoring, Dedicated
INFRASTRUCTURE SERVICES: Regions, Availability Zones, Compute, Storage (Objects, Blocks, Files), Databases (SQL, NoSQL, Caching), CDN, Networking
PLATFORM SERVICES
• App: Mobile & Web Front-end, Functions, Identity, Data Store, Real-time
• Development: Containers, Source Code, Build Tools, Deployment, DevOps
• Mobile: Sync, Identity, Push Notifications, Mobile Analytics, Mobile Backend
• Analytics: Data Warehousing, Hadoop, Streaming, Data Pipelines, Machine Learning
EC2
There are a couple dozen
EC2 compute instance
types alone, each of
which is optimized for
different things.
One size does not fit
all.
C4
Intel Xeon E5-2666 v3, custom built for AWS.
Intel Haswell, 16 FLOPS/tick
2.9 GHz, turbo to 3.5 GHz
http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/c4-instances.html
Feature                                                    Specification
Processor Number                                           E5-2666 v3
Intel® Smart Cache                                         25 MiB
Instruction Set                                            64-bit
Instruction Set Extensions                                 AVX 2.0
Lithography                                                22 nm
Processor Base Frequency                                   2.9 GHz
Max All Core Turbo Frequency                               3.2 GHz
Max Turbo Frequency                                        3.5 GHz (available on c4.2xLarge)
Intel® Turbo Boost Technology                              2.0
Intel® vPro Technology                                     Yes
Intel® Hyper-Threading Technology                          Yes
Intel® Virtualization Technology (VT-x)                    Yes
Intel® Virtualization Technology for Directed I/O (VT-d)   Yes
Intel® VT-x with Extended Page Tables (EPT)                Yes
Intel® 64                                                  Yes
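The "16 FLOPS/tick" and clock figures above combine into a theoretical peak as cores × clock × FLOPS per tick. A minimal sketch of that arithmetic follows; the 18-core count in the usage line is a hypothetical value for illustration, and real applications will achieve less than this peak.

```python
FLOPS_PER_TICK = 16      # per core, as quoted on the slide
BASE_CLOCK_HZ = 2.9e9    # 2.9 GHz base frequency, as quoted on the slide


def peak_gflops(cores, clock_hz=BASE_CLOCK_HZ, flops_per_tick=FLOPS_PER_TICK):
    """Theoretical peak in GFLOPS: cores x clock x FLOPS per tick."""
    return cores * clock_hz * flops_per_tick / 1e9


print(f"one core at base clock: {peak_gflops(1):.1f} GFLOPS")
# Hypothetical core count, chosen only to show how the figure scales.
print(f"18 cores at base clock: {peak_gflops(18):.1f} GFLOPS")
```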
cfnCluster - provision an HPC cluster in minutes
#cfncluster
https://github.com/awslabs/cfncluster
cfncluster is a sample code framework that deploys and maintains clusters on AWS. It is reasonably
agnostic to what the cluster is for and can easily be extended to support different frameworks. The
CLI is stateless; everything is done using CloudFormation or resources within AWS.
10 minutes
http://boofla.io/u/cfnCluster – (Boof’s HOWTO slides)
AWS Marketplace
• 750+ popular scientific applications, available immediately
Introducing Alces Flight - self-scaling HPC clusters instantly ready to compute, billed by the hour and using
the AWS Spot market by default to achieve supercomputing for ~1c per core per hour.
Self-service HPC … 2016
http://boofla.io/u/alcesFlight
Requirements for Launching your HPC cluster
• An Amazon Web Services (AWS) account
• An SSH key-pair in your AWS region
• An SSH client
• Optionally – a VNC client
• A workload to process
Alces Gridware Application library
• Over 850 application, library and MPI versions
• Pre-optimized and stored in S3
• Option to compile and optimize on-demand
• Includes modules environment management
• Gridware project keeps pace with latest versions
• Support for commercial and licensed applications
• http://tiny.cc/gridware
Using Storage Services
• Cluster includes large
storage volume for
data and apps
• Tools to manage data
held in object storage
• Store your data in
AWS S3 quickly and
easily
Cluster job scheduler
• Choice of HPC cluster job
schedulers
• Automate job processing on
your HPC cluster
• Queue jobs for processing
when nodes are available
• Auto-scaling compute nodes
within user-defined limits
• Automatically rerun any jobs
stopped when the Spot price
exceeds your bid
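The auto-scaling and rerun behaviour in the list above can be sketched in a few lines. This is an illustration of the idea only, not the logic of any particular scheduler; the function names, the jobs-per-node figure and the limits are hypothetical.

```python
import math
from collections import deque


def desired_nodes(queued_jobs, jobs_per_node, min_nodes, max_nodes):
    """Nodes needed to drain the queue, clamped to user-defined limits."""
    needed = math.ceil(queued_jobs / jobs_per_node)
    return max(min_nodes, min(max_nodes, needed))


def requeue_interrupted(queue, running, interrupted_ids):
    """Put jobs from reclaimed Spot nodes back at the front of the queue."""
    for job_id in interrupted_ids:
        if job_id in running:
            running.remove(job_id)
            queue.appendleft(job_id)


# Hypothetical state: one job waiting, two running, one node reclaimed.
queue = deque(["job-c"])
running = {"job-a", "job-b"}
requeue_interrupted(queue, running, ["job-a"])
print(list(queue), sorted(running))
print(desired_nodes(queued_jobs=100, jobs_per_node=36, min_nodes=1, max_nodes=8))
```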
Landsat Satellite mapping data
• Continuous record of Earth’s
surface
• Data from the 1970s to
present day
• Public data set available to
everyone
• Stored on object storage,
including AWS S3
Workload
• Survey of cloud cover around Northern Tropic
• Task-array job running 360 degrees around the Earth
• Measures average cloud cover in each image
• Generates a deck of sample images
• Uploads deck to S3 object storage
• Uses 360 compute cores
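The task-array structure above can be sketched as one task per degree of longitude, each computing an average over its image. This is a minimal sketch under stated assumptions: the mapping from task index to longitude and the 0/1 cloud mask are hypothetical, and reading real Landsat scenes is not shown.

```python
def longitude_for_task(task_id, num_tasks=360):
    """Map a 1-based task index to a longitude in [-180, 180)."""
    if not 1 <= task_id <= num_tasks:
        raise ValueError("task_id out of range")
    return -180 + (task_id - 1) * 360 / num_tasks


def average_cloud_cover(cloud_mask):
    """Fraction of pixels flagged as cloud in a 2D mask of 0/1 values."""
    pixels = [p for row in cloud_mask for p in row]
    if not pixels:
        raise ValueError("cloud_mask must not be empty")
    return sum(pixels) / len(pixels)


# Hypothetical 2x4 mask standing in for one image; 1 means cloud.
mask = [[1, 0, 0, 0],
        [1, 1, 0, 0]]
print(longitude_for_task(1), average_cloud_cover(mask))
```

In a real task-array job the scheduler supplies the task index to each task, so every one of the 360 tasks can run the same script on a different slice.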
Approximate costs
• 360 jobs each taking ~5 mins
• Total CPU time = 30 core hours
• Cost of 36 core hours in AWS spot market* = $0.44
• Cost of one T2 login node for 1 hour* = $0.12
• Cost of 100GB EBS volume for apps* = <$0.01
• Alces Flight software cost = $0.00
• Total cost per daily run = $0.60 / 45p
• Cost for one year of research = $219 / £168
* based on C4.8xlarge spot rate in EU-West region; T2.large on-demand instance; EBS st1 volume; excludes S3 storage costs and sales tax where applicable
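The arithmetic behind these figures can be checked with a short script. It uses only the numbers listed above; the "<$0.01" EBS item is entered as its upper bound, which is an assumption. The listed items sum to about $0.57, which the slide rounds to $0.60 before scaling to a year.

```python
JOBS = 360
MINUTES_PER_JOB = 5

# Total CPU time used by the task array, in core hours.
core_hours = JOBS * MINUTES_PER_JOB / 60

# Line items in USD, as listed on the slide.
daily_items = {
    "36 core hours in the Spot market": 0.44,
    "T2 login node, 1 hour": 0.12,
    "100GB EBS volume": 0.01,   # slide says "<$0.01"; upper bound assumed
    "Alces Flight software": 0.00,
}
daily_total = sum(daily_items.values())

# The slide rounds the daily total to $0.60 before multiplying by 365.
yearly_total = round(0.60 * 365, 2)

print(f"{core_hours:.0f} core hours, ${daily_total:.2f} per run, "
      f"${yearly_total:.0f} per year")
```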
OpenFOAM CFD
• Computational Fluid Dynamics workload
• Simulates liquid and air-flow for engineering projects
• Open-source software available to all
• Commercial support available from CFD Direct Ltd.
• Run as a parallel job across multiple compute nodes
Workload
• Generate a mesh representing the problem
• Decomposition of the problem into sections
• Processing of the sections
• Visualization of the solution
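The decomposition step above splits the mesh so that each process works on its own section. The sketch below shows the simplest form of that idea, dividing cell indices into contiguous, near-equal ranges. It is an illustration only and not OpenFOAM's own decomposition method; the function name and cell count are hypothetical.

```python
def decompose(num_cells, num_parts):
    """Split cell indices 0..num_cells-1 into num_parts contiguous ranges."""
    if num_parts < 1:
        raise ValueError("num_parts must be at least 1")
    base, extra = divmod(num_cells, num_parts)
    ranges = []
    start = 0
    for part in range(num_parts):
        # The first `extra` parts take one additional cell each.
        size = base + (1 if part < extra else 0)
        ranges.append(range(start, start + size))
        start += size
    return ranges


# Hypothetical mesh of 1,000,000 cells split across 128 processes.
parts = decompose(1_000_000, 128)
print(len(parts), len(parts[0]), len(parts[-1]))
```

Real decomposition tools also try to minimise the boundary shared between sections, since that boundary determines how much the processes must communicate.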
Approximate costs (full solve)
• 1 job using 128 cores taking 4 hours
• Total CPU time = 1024 core hours
• Cost of 1024 core hours in AWS spot market* = $7.04
• Cost of one T2 login node for 4 hours* = $0.45
• Cost of 100GB EBS volume for apps* = $0.02
• Alces Flight software cost = $0.00
• Total cost per simulation = $7.51 / £5.75
* based on C4.8xlarge spot rate in EU-West region; T2.large on-demand instance; EBS st1 volume; excludes sales tax where applicable
Filesystems in the marketplace, too
BeeGFS is a scalable parallel cluster filesystem
developed by the Fraunhofer Institute, with a strong
focus on performance and designed for easy
installation and management.
Intel Lustre® Cloud Edition is a scalable, parallel file
system purpose-built for HPC, with a long history in
the field supporting a range of workloads.
There’s more to come - the AWS Marketplace is
growing all the time and new offerings are added
frequently. Watch this space.
There are cluster filesystem options, too, for when you need extreme I/O scaling.