Researchers from around the world are increasingly using AWS for a wide array of use cases. This presentation describes how AWS facilitates scientific collaboration and powers some of the world's largest scientific efforts, including real-world examples from NASA JPL, the European Space Agency (ESA) and CERN's CMS particle detector.
1. Scientific Computing on AWS: NASA/JPL, ESA and CERN
Jamie Kinney
Principal Solutions Architect
World Wide Public Sector
jkinney@amazon.com
@jamiekinney
2. ?
How do researchers use AWS today?
Can you run HPC on AWS?
Should everything run on the cloud?
How does AWS facilitate scientific collaboration?
21. Use Cases
•Science-as-a-Service
•Large-scale HTC (100,000+ core clusters)
•Large-scale MapReduce (Hadoop/Spark/Shark) using EMR or EC2
•Small to medium-scale MPI clusters (hundreds of nodes)
•Many small MPI clusters working in parallel to explore parameter space
•GPGPU workloads
•Dev/test of MPI workloads prior to submitting to supercomputing centers
•Collaborative research environments
•On-demand academic training/lab environments
23. ESA Gaia Mission Overview
ESA’s Gaia is an ambitious mission to chart a three-dimensional
map of the Milky Way Galaxy in order to reveal the composition,
formation and evolution of our Galaxy.
Gaia will repeatedly analyze and record the positions and
magnitude of approximately one billion stars over the course of
several years.
1 billion stars x 80 observations x 10 readouts = ~1 x 10^12 samples.
1 ms processing time per sample = more than 30 years of processing
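
The arithmetic on this slide can be checked with a short sketch. The constant and function names below are illustrative assumptions, not part of the presentation. Note that the exact product is 8 x 10^11 samples, which the slide rounds up to about 10^12; the "more than 30 years" figure follows from the rounded value.

```python
# Back-of-the-envelope check of the Gaia processing estimate.
# All names here are illustrative; the input figures come from the slide.

STARS = 1_000_000_000            # approximately one billion stars
OBSERVATIONS_PER_STAR = 80
READOUTS_PER_OBSERVATION = 10

SECONDS_PER_YEAR = 365.25 * 24 * 3600


def total_samples() -> int:
    """Exact product of the three factors quoted on the slide."""
    return STARS * OBSERVATIONS_PER_STAR * READOUTS_PER_OBSERVATION


def serial_years(samples: int, ms_per_sample: float = 1.0) -> float:
    """Years needed to process every sample one after another on one core."""
    seconds = samples * ms_per_sample / 1000.0
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    exact = total_samples()
    rounded = 10**12  # the slide's rounded figure
    print(f"exact samples:   {exact:.1e}")
    print(f"serial years (exact count):   {serial_years(exact):.1f}")
    print(f"serial years (rounded count): {serial_years(rounded):.1f}")
```

Either way, a single processor would need decades, which is the motivation for spreading the work across many instances.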
24. Gaia Solution Overview
• Traditional: Purchase at the beginning of the mission for the anticipated high-water mark
• AWS: Pay as you go. Launch what you need, as you need it, and turn instances off when you’re done

• Traditional: Purchase additional systems for redundancy
• AWS: If an instance fails, turn it off and launch a replacement at no additional charge

• Traditional: Large-scale data reprocessing is constrained to available infrastructure. No way to accelerate jobs without additional CapEx
• AWS: Need to reprocess the data within a few hours? Simply launch more instances. 100 machines running for 1 hour cost the same as 1 machine running for 100 hours

• Traditional: Performance is constrained to the processor/disk/memory available at time of procurement, for a multi-year mission
• AWS: AWS frequently launches new instance types running the latest hardware. Simply restart your instances on a newer instance type and stop paying for less-capable infrastructure

• Traditional: Data transfer and security policies make it difficult to collaborate with researchers located elsewhere
• AWS: Easily and securely collaborate with researchers around the world
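
The point that 100 machines for 1 hour cost the same as 1 machine for 100 hours can be illustrated with a minimal sketch. It assumes simple linear per-instance-hour billing, a job that parallelizes perfectly, and no startup overhead; the hourly rate is a hypothetical placeholder, not a real AWS price.

```python
# Sketch of the pay-as-you-go trade-off: same cost, much shorter wall-clock time.
# HYPOTHETICAL_HOURLY_RATE is an illustrative number, not an actual AWS price.

HYPOTHETICAL_HOURLY_RATE = 0.50  # dollars per instance-hour (assumed)


def cluster_cost(instances: int, hours_each: float, hourly_rate: float) -> float:
    """Total cost under linear per-instance-hour billing."""
    return instances * hours_each * hourly_rate


def wall_clock_hours(total_instance_hours: float, instances: int) -> float:
    """Elapsed time if the work splits evenly across all instances."""
    return total_instance_hours / instances


if __name__ == "__main__":
    work = 100  # instance-hours of reprocessing
    for n in (1, 10, 100):
        hours = wall_clock_hours(work, n)
        cost = cluster_cost(n, hours, HYPOTHETICAL_HOURLY_RATE)
        print(f"{n:>3} instances: {hours:>5.1f} h elapsed, ${cost:.2f}")
```

Under these assumptions the cost is identical in every row and only the elapsed time changes. Real workloads rarely scale perfectly, so treat this as an upper bound on the benefit.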
28. MSL Distributed Operations
• JPL: Pasadena, CA
• CDSCC: Canberra Deep Space Communication Complex
• MDSCC: Madrid Deep Space Communication Complex
• GDSCC: Goldstone Deep Space Communication Complex
• ARC (CheMin): Moffett Field, CA
• MSSS (MARDI, MAHLI, MastCam): San Diego, CA
• KSC
• IKI (DAN): Moscow, Russia
• INTA (REMS): Madrid, Spain
• LANL (ChemCam): Los Alamos, NM
• UofGuelph (APXS): Guelph, Ontario
• SwRI (RAD): Boulder, CO
• GSFC (SAM): Greenbelt, MD
• Plus hundreds of other sites around the world for Co-Is and Colleagues
29. Data Locality Challenges
Scientist 1 retrieves data from L.A.
Scientist 1 returns data to L.A.
Scientist 2 retrieves data from L.A.
Scientist 2 returns data to L.A.
32. Data Locality Challenges
Researcher in L.A. uploads data to the cloud
Scientist 1 uses cloud resources to process data
Scientist 2 retrieves data products from edge network
Scientist 2 uses cloud resources to process data
Global collaboration