AWS Government, Education, and Nonprofit Symposium
Washington, DC | June 25-26, 2015
NASA Goddard:
Head in the Clouds
Dan Duffy, NASA
Steve Orrin, Intel
Tim Carroll, Cycle Computing
©2015, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Fastest growing workloads
Fraud Detection
Risk Modeling
Drug Design
Genomics
Modeling and Simulation
Unstructured Data Analysis, Data Lakes
Most resource intensive
1 core 8 cores 8 servers 10–10000 servers
Great, so…
what’s the problem?
The challenge of fixed capacity
Time
Capability
Internal Capacity
System Organization
Transform/life sciences
The problem in 2013:
• Cancer research needed 50,000 cores, not available in-house
The options they didn’t choose:
• Buy infrastructure: Spend $2M, wait 6 months
• Write software for 9–12 months for this one app
Solution:
• Created 10,600 server cluster
• 39.5 years of computing in 8 hours
• Found 3 potential drug candidates!
• Total infrastructure bill: $4,372
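As a rough check on the figures above, the following sketch converts 39.5 years of computing into core-hours and derives the implied average number of busy cores and the cost per core-hour. It assumes that "years of computing" means single-core years and that a year is 365.25 days; neither assumption is stated on the slide, and `run_summary` is a hypothetical helper name.

```python
HOURS_PER_YEAR = 365.25 * 24  # assumption: average calendar year


def run_summary(compute_years, wall_hours, bill_usd):
    """Return (core_hours, average busy cores, USD per core-hour)."""
    core_hours = compute_years * HOURS_PER_YEAR
    avg_cores = core_hours / wall_hours
    usd_per_core_hour = bill_usd / core_hours
    return core_hours, avg_cores, usd_per_core_hour


# Figures taken from the slide: 39.5 years, 8 hours, $4,372.
core_hours, avg_cores, usd_per_core_hour = run_summary(39.5, 8, 4372)
print(f"{core_hours:,.0f} core-hours")
print(f"{avg_cores:,.0f} cores busy on average")
print(f"${usd_per_core_hour:.4f} per core-hour")
```

Under these assumptions the run averages roughly 43,000 busy cores, which is in the neighborhood of the 50,000 cores the research needed.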
Cycle powers cloud BigData and BigCompute
Data Workflow
Cloud Orchestration
Analytics
Modeling
Internal
Compute
Compute Burst
Software required to drive analytics and simulation at scale:
• Easy access
• Highly automated
• On-demand
• Ask the right questions
Best way to try it… try it
Tim@cyclecomputing.com
Measure Woody Biomass on South Side of the Sahara at the 40–50 cm Scale Using AWS
Overview of the NASA Head in the Clouds Project presented at the Amazon Web Services Public Summit 2015
Daniel Duffy daniel.q.duffy@nasa.gov and on Twitter @dqduffy
High Performance Computing Lead at the
NASA Center for Climate Simulation (NCCS) – http://www.nccs.nasa.gov and @NASA_NCCS
Goddard Space Flight Center (GSFC) – http://www.nasa.gov/centers/goddard/home/
ESD Project Won Intel Head in Clouds Challenge Award to Estimate Biomass in South Sahara
Project Goal
• Using NGA data to estimate tree and bush biomass over the entire arid and semi-arid zone on the south side of the Sahara
Project Summary
• Estimate carbon stored in trees and bushes in arid and semi-arid south Sahara
• Establish carbon baseline for later research on expected CO2 uptake on the south side of the Sahara
Principal Investigators
• Dr. Compton J. Tucker, NASA Goddard Space Flight Center
• Dr. Paul Morin, University of Minnesota
Figure: NGA 40 cm imagery showing automated tree and shrub recognition (labels: Tree, Crown, Shadow)
Partners and Resources
Intel
• Professional Services and Funding for AWS Resources
Amazon Web Services (AWS)
• Compute and storage
• Support to set up environment
Cycle Computing
• Cloud Resource Management Software
• Services to install and configure the software
Climate Model Data Services (CDS – GSFC Code 600)
• NGA data support
NASA Center for Climate Simulation (NCCS – GSFC Code 606.2)
• System administration, application support, and data movement
NASA CIO
• General cloud consulting and coordination support
Existing Sub-Saharan Arid and Semi-Arid Sub-Meter Commercial Imagery
9600 Strips (~80TB) to Be Delivered to GSFC
~1600 strips (~20TB) at GSFC
Area Of Interest (AOI) for Sub-Saharan Arid and Semi-Arid Africa
The DigitalGlobe Constellation
The Entire Archive is Licensed to the USG
Geoeye
Quickbird
Ikonos
Worldview 1
Worldview 2
Worldview 3 (Available Q1 2015)
Panchromatic and multispectral mapping at the 40- and 50-cm scale
Use Niger as the test case
NGA data over Niger
• Currently have about 16,000 total scenes covering Niger (the data is already orthorectified)
• For this test case, approximately 3,120 scenes need to be processed to generate the vegetation index
• Each scene is approximately 30,000 x 30,000 data points (pixels)
• Will break each scene up into 100 tiles (3,000 x 3,000)
Where is the data?
• Data currently resides within the NCCS and in AWS
Additional data
• If we are successful and have additional time and resources, other African areas can be studied.
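The tiling step described above can be sketched as follows. This is a minimal illustration, not the project's code: `tile_windows` is a hypothetical helper, and it assumes a square scene addressed by pixel offsets. Because scenes are only approximately 30,000 x 30,000 pixels, the sketch clips tiles at the scene edge.

```python
def tile_windows(scene_size=30_000, tile_size=3_000):
    """Yield (row_off, col_off, height, width) for each tile of a square scene."""
    for row_off in range(0, scene_size, tile_size):
        for col_off in range(0, scene_size, tile_size):
            # Clip the last row/column of tiles if the scene is not an exact multiple.
            height = min(tile_size, scene_size - row_off)
            width = min(tile_size, scene_size - col_off)
            yield row_off, col_off, height, width


windows = list(tile_windows())
print(len(windows), "tiles; first:", windows[0], "last:", windows[-1])
```

Each window is independent of the others, so every tile can be handed to a separate job.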
Processing requirements
Based on the tests run in the NCCS private cloud, the following processing requirements were estimated:
• The tests were run on a single-core (Intel E5-2670 2.5 GHz processor) virtual machine with 2 GB of memory
• Each of the 3,120 scenes is broken up into 100 tiles
• Each tile took 24 minutes
• Hence, one scene takes 24 × 100 = 2,400 minutes of total processor time (about 40 wall-clock hours on a single core)
• Tiles and scenes can be run in parallel
• Total tiles to process = 3,120 × 100 = 312,000
• Total compute hours = 312,000 × 24 / 60 = 124,800
Target completion time
• Finishing in 1 month will take between 175 and 200 virtual machines running non-stop
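The arithmetic behind these estimates can be reproduced with a short sketch. It assumes a 30-day month, one single-core virtual machine working on one tile at a time, and perfect utilization; `estimate` is a hypothetical helper name.

```python
import math

SCENES = 3_120
TILES_PER_SCENE = 100
MINUTES_PER_TILE = 24


def estimate(days=30):
    """Return (total tiles, total core-hours, VMs needed to finish in `days`)."""
    tiles = SCENES * TILES_PER_SCENE
    core_hours = tiles * MINUTES_PER_TILE / 60
    # Assumption: every VM is busy around the clock for the whole period.
    vms = math.ceil(core_hours / (days * 24))
    return tiles, core_hours, vms


tiles, core_hours, vms = estimate()
print(f"{tiles:,} tiles, {core_hours:,.0f} core-hours, at least {vms} VMs")
```

The result is a lower bound under ideal conditions; the 175 to 200 range leaves headroom for startup time, data staging, and idle gaps.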
Input and output data
Input data
• Total input of about 8 TB for the 3,120 scenes
• Average of about 2.63 GB of data per scene
• Average of about 26.3 MB of data per tile
Intermediate data products
• It is not yet known how much intermediate data will be needed; this will affect the amount of temporary space required for each run
Output data products
• Total output data is estimated to be 25% of the input data
• Estimated total output is about 2 to 3 TB
• Output data will be transferred back to the NCCS
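The per-scene, per-tile, and output figures above follow from simple division, sketched below. The unit conventions are an assumption chosen to reproduce the rounded values on the slide (1 TB taken as 1,024 GB, 1 GB taken as 1,000 MB); `data_budget` is a hypothetical helper name.

```python
TB_IN_GB = 1024  # assumption: reproduces the slide's 2.63 GB per scene
GB_IN_MB = 1000  # assumption: reproduces the slide's 26.3 MB per tile


def data_budget(total_input_tb=8, scenes=3_120, tiles_per_scene=100,
                output_fraction=0.25):
    """Return (GB per scene, MB per tile, estimated output in TB)."""
    gb_per_scene = total_input_tb * TB_IN_GB / scenes
    mb_per_tile = gb_per_scene * GB_IN_MB / tiles_per_scene
    output_tb = total_input_tb * output_fraction
    return gb_per_scene, mb_per_tile, output_tb


gb_per_scene, mb_per_tile, output_tb = data_budget()
print(f"{gb_per_scene:.2f} GB/scene, {mb_per_tile:.1f} MB/tile, {output_tb:.0f} TB output")
```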
Cluster configuration requirements
| Category | Description | Requirement |
| --- | --- | --- |
| Number of Cores | How many cores are required on a single node for the application? | 1 per tile |
| Amount of Memory (RAM) | How much memory on a node (or per core) is required for the application? | 2 GB per tile |
| Operating System (OS) | What operating system does the application need? | Linux (CentOS or Debian) |
| Libraries/Tools/Software | What additional libraries, tools, and software are needed to be installed? Compilers? Commercial software? | None |
| Parallelization | Can the application run in a parallel manner? If so, how (threaded, MPI, or multiple instances of the application)? | Inherently parallel processing of each scene and/or tile |
| Cluster | If the application runs in parallel across many nodes, how many nodes are required? | 175 – 200 to complete in 1 month; more can be used |
| Storage | How much storage space will be required for each run (input, intermediate, and output files)? | Total Input – 8 TB (approx. 2.6 GB for each scene); Intermediate – To be determined; Total Output Back to NCCS – 2 TB (approx. 25% of total input) |
| Shared Storage | Does this storage have to be shared across all nodes? | No |
Workflow
• DataMan (Cycle Computing data transfer software): The Cycle Computing DataMan software will be used to transfer the data into Amazon S3.
• NGA data external to NASA (PGC, Digital Globe): Data to be copied into the NCCS science cloud NGA data repository.
• NCCS Science Cloud (internal cloud), with a shared file system holding the NGA data at NASA: Virtual machines (NCCS/NASA VMs) in the internal cloud can read the data directly from the shared disk in the NASA internal cloud. No additional data movement is required.
• Amazon S3: Data to be processed is staged into Amazon S3. Data will be moved to the local storage of the VMs for processing. Products could be stored in S3 for transfer to the NCCS at a later time.
• Cycle Computing system and batch queue system: A resource manager (batch queue) will be running in AWS. Scientists will interact and launch jobs through the Cycle Computing system directly in AWS.
• AWS VMs, each with local data: Virtual machines will be launched in AWS. After the job is completed, the results will be copied back to the NCCS.
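The per-job data path in this workflow (stage in from object storage to VM-local disk, process locally, stage the product back out) can be sketched as below. This is only an illustration under stated assumptions: plain directories stand in for Amazon S3 and for the VM's local disk, `process_tile` is a placeholder rather than the real vegetation-index code, and none of the names come from the Cycle Computing software.

```python
import shutil
import tempfile
from pathlib import Path


def process_tile(src: Path, dst: Path) -> None:
    # Placeholder computation: the real analysis code is not shown in the slides.
    data = src.read_bytes()
    dst.write_bytes(data[: max(1, len(data) // 4)])


def run_job(tile_name: str, staging: Path, scratch: Path, products: Path) -> Path:
    local_in = scratch / tile_name
    shutil.copy(staging / tile_name, local_in)   # stage in: storage -> local disk
    local_out = scratch / (tile_name + ".out")
    process_tile(local_in, local_out)            # compute on local data only
    final = products / local_out.name
    shutil.copy(local_out, final)                # stage out: local disk -> storage
    local_in.unlink()                            # free scratch space for the next job
    local_out.unlink()
    return final


def demo():
    with tempfile.TemporaryDirectory() as root:
        staging, scratch, products = (Path(root) / n for n in ("staging", "scratch", "products"))
        for d in (staging, scratch, products):
            d.mkdir()
        (staging / "tile_000.bin").write_bytes(b"x" * 400)
        out = run_job("tile_000.bin", staging, scratch, products)
        return out.name, out.stat().st_size, len(list(scratch.iterdir()))


print(demo())
```

Keeping each job's reads and writes on local disk matches the requirement that storage does not need to be shared across nodes.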
Timeline
Schedule chart covering Dec through Sep, with one row per task:
Bi-Weekly Tag Ups
Requirements/Scope
Setup/Configuration
Test Runs
Transfer Data to S3
Configure S3 Buckets
Production Runs
Analysis
Final Report
Why use Cycle Computing and AWS?
• The bigger goal is to analyze the entire arid and semi-arid zone on the south side of the Sahara
– About 80 TB
– 10x the data that the initial project will analyze
• On 200 virtual machines, this will take about 10 months!
– How can we accelerate this?
• Can easily scale up the number of virtual machines using the Cycle Computing software and the AWS resources
– Once the data is in AWS, 80 TB of data can be analyzed in approximately the same amount of time as 8 TB of data
– Scientists really love this part!
• Might need longer, since the data transfers may take time; data transfers and computation can be overlapped
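The scaling argument can be checked with a small sketch. It assumes that compute time grows linearly with data volume at the rate measured for the initial 8 TB (124,800 core-hours), that a month is 720 hours, and that data transfer time is ignored; `months_needed` is a hypothetical helper name.

```python
CORE_HOURS_PER_TB = 124_800 / 8  # from the 8 TB estimate; assumes linear scaling


def months_needed(input_tb, vms, hours_per_month=720):
    """Months of non-stop processing for `input_tb` of data on `vms` single-core VMs."""
    return input_tb * CORE_HOURS_PER_TB / (vms * hours_per_month)


for tb, vms in [(8, 200), (80, 200), (80, 2000)]:
    print(f"{tb} TB on {vms} VMs: {months_needed(tb, vms):.1f} months")
```

Under these assumptions, ten times the data on the same 200 virtual machines takes ten times as long (a little under nine months by this estimate, the same order as the slide's round figure), while ten times the virtual machines brings the larger job back to about the original duration.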
Thanks go to the following…
NASA
• Dr. Compton Tucker (Co-PI)
• Katherine Melocik (GSFC)
• Jennifer Small (GSFC)
• Dr. Tsengdar Lee (HQ)
• Daniel Duffy (GSFC)
• Mark McInerney (GSFC)
• Hoot Thompson (GSFC)
• Garrison Vaughn (GSFC)
• Brittany Wills (GSFC)
• Scott Sinno (GSFC)
• Ray Obrien (ARC)
• Richard Schroeder (ARC)
• Milton Checchi (ARC)
University Partners
• Paul Morin (Co-PI, Univ. Minnesota)
• Claire Porter (Univ. Minnesota)
• Jamon Van Den Hoek (Oak Ridge)
Cycle Computing
• Tim Carroll
• Michael Requa
• Carl Chesal
• Bob Nordlund
• Glen Otero
• Rob Futrick
AWS
• Jamie Baker
• Jeff Layton
There are others… My apologies to those I missed. These are typically the ones on our conference calls!
Thank You.
This presentation will be loaded to SlideShare the week following the Symposium.
http://www.slideshare.net/AmazonWebServices
AWS Government, Education, and Nonprofit Symposium
Washington, DC I June 25-26, 2015

Mais conteúdo relacionado

Mais procurados

3. 195883 open gis data slides jw_edit_js-mh
3. 195883 open gis data slides jw_edit_js-mh3. 195883 open gis data slides jw_edit_js-mh
3. 195883 open gis data slides jw_edit_js-mh
Amazon Web Services
 
An Update on the AWS/FedRAMP TIC Overlay Pilot
An Update on the AWS/FedRAMP TIC Overlay PilotAn Update on the AWS/FedRAMP TIC Overlay Pilot
An Update on the AWS/FedRAMP TIC Overlay Pilot
Amazon Web Services
 
AWS GovCloud (US) – A Deep Dive into Compliance
AWS GovCloud (US) – A Deep Dive into ComplianceAWS GovCloud (US) – A Deep Dive into Compliance
AWS GovCloud (US) – A Deep Dive into Compliance
Amazon Web Services
 
DevOps in the Public Sector: How the Democratic Party Implemented DevOps to M...
DevOps in the Public Sector: How the Democratic Party Implemented DevOps to M...DevOps in the Public Sector: How the Democratic Party Implemented DevOps to M...
DevOps in the Public Sector: How the Democratic Party Implemented DevOps to M...
Amazon Web Services
 
C2S Tech Tips: Rapid Prototyping
C2S Tech Tips: Rapid PrototypingC2S Tech Tips: Rapid Prototyping
C2S Tech Tips: Rapid Prototyping
Amazon Web Services
 

Mais procurados (20)

Introduction to AWS Services and Cloud Computing
Introduction to AWS Services and Cloud ComputingIntroduction to AWS Services and Cloud Computing
Introduction to AWS Services and Cloud Computing
 
3. 195883 open gis data slides jw_edit_js-mh
3. 195883 open gis data slides jw_edit_js-mh3. 195883 open gis data slides jw_edit_js-mh
3. 195883 open gis data slides jw_edit_js-mh
 
Enterprise Cloud Adoption Strategies in Higher Education
Enterprise Cloud Adoption Strategies in Higher EducationEnterprise Cloud Adoption Strategies in Higher Education
Enterprise Cloud Adoption Strategies in Higher Education
 
An Update on the AWS/FedRAMP TIC Overlay Pilot
An Update on the AWS/FedRAMP TIC Overlay PilotAn Update on the AWS/FedRAMP TIC Overlay Pilot
An Update on the AWS/FedRAMP TIC Overlay Pilot
 
AWS GovCloud (US) – A Deep Dive into Compliance
AWS GovCloud (US) – A Deep Dive into ComplianceAWS GovCloud (US) – A Deep Dive into Compliance
AWS GovCloud (US) – A Deep Dive into Compliance
 
Moving Workloads into AWS GovCloud (US) - AWS Symposium 2014 - Washington D.C.
Moving Workloads into AWS GovCloud (US) - AWS Symposium 2014 - Washington D.C. Moving Workloads into AWS GovCloud (US) - AWS Symposium 2014 - Washington D.C.
Moving Workloads into AWS GovCloud (US) - AWS Symposium 2014 - Washington D.C.
 
Big Data in The Cloud: Architecting a Better Platform
Big Data in The Cloud: Architecting a Better PlatformBig Data in The Cloud: Architecting a Better Platform
Big Data in The Cloud: Architecting a Better Platform
 
Hybrid IT Approach and Technologies on AWS
Hybrid IT Approach and Technologies on AWSHybrid IT Approach and Technologies on AWS
Hybrid IT Approach and Technologies on AWS
 
DevOps in the Public Sector: How the Democratic Party Implemented DevOps to M...
DevOps in the Public Sector: How the Democratic Party Implemented DevOps to M...DevOps in the Public Sector: How the Democratic Party Implemented DevOps to M...
DevOps in the Public Sector: How the Democratic Party Implemented DevOps to M...
 
C2S Tech Tips: Rapid Prototyping
C2S Tech Tips: Rapid PrototypingC2S Tech Tips: Rapid Prototyping
C2S Tech Tips: Rapid Prototyping
 
Using AWS Services to Go “All In” on AWS
Using AWS Services to Go “All In” on AWSUsing AWS Services to Go “All In” on AWS
Using AWS Services to Go “All In” on AWS
 
AWS GovCloud (US) - An Overview
AWS GovCloud (US) - An OverviewAWS GovCloud (US) - An Overview
AWS GovCloud (US) - An Overview
 
AWS Cost Management Lessons from the Private Sector
AWS Cost Management Lessons from the Private SectorAWS Cost Management Lessons from the Private Sector
AWS Cost Management Lessons from the Private Sector
 
Acquisition Strategies and Contract Vehicles in the Public Sector
Acquisition Strategies and Contract Vehicles in the Public SectorAcquisition Strategies and Contract Vehicles in the Public Sector
Acquisition Strategies and Contract Vehicles in the Public Sector
 
Federal Compliance Deep Dive: FISMA, FedRAMP, and Beyond - AWS Symposium 2014...
Federal Compliance Deep Dive: FISMA, FedRAMP, and Beyond - AWS Symposium 2014...Federal Compliance Deep Dive: FISMA, FedRAMP, and Beyond - AWS Symposium 2014...
Federal Compliance Deep Dive: FISMA, FedRAMP, and Beyond - AWS Symposium 2014...
 
Security & Privacy: Using AWS to Meet Requirements for HIPAA, CJIS, and FERPA
Security & Privacy: Using AWS to Meet Requirements for HIPAA, CJIS, and FERPASecurity & Privacy: Using AWS to Meet Requirements for HIPAA, CJIS, and FERPA
Security & Privacy: Using AWS to Meet Requirements for HIPAA, CJIS, and FERPA
 
C2S: What’s Next
C2S: What’s NextC2S: What’s Next
C2S: What’s Next
 
Scaling by Design: AWS Web Services Patterns
Scaling by Design:AWS Web Services PatternsScaling by Design:AWS Web Services Patterns
Scaling by Design: AWS Web Services Patterns
 
(ISM206) Modern IT Governance Through Transparency and Automation
(ISM206) Modern IT Governance Through Transparency and Automation(ISM206) Modern IT Governance Through Transparency and Automation
(ISM206) Modern IT Governance Through Transparency and Automation
 
AWS Deployment Best Practices - AWS Symposium 2014 - Washington D.C.
AWS Deployment Best Practices - AWS Symposium 2014 - Washington D.C. AWS Deployment Best Practices - AWS Symposium 2014 - Washington D.C.
AWS Deployment Best Practices - AWS Symposium 2014 - Washington D.C.
 

Semelhante a NASA Goddard: Head in the Clouds

AWS Canberra WWPS Summit 2013 - Opening Keynote
AWS Canberra WWPS Summit 2013 - Opening KeynoteAWS Canberra WWPS Summit 2013 - Opening Keynote
AWS Canberra WWPS Summit 2013 - Opening Keynote
Amazon Web Services
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
confluent
 

Semelhante a NASA Goddard: Head in the Clouds (20)

Disaster Recovery of On-Premises IT Infrastructure with AWS
Disaster Recovery of On-Premises IT Infrastructure with AWSDisaster Recovery of On-Premises IT Infrastructure with AWS
Disaster Recovery of On-Premises IT Infrastructure with AWS
 
Big Data and Analytics on AWS
Big Data and Analytics on AWS Big Data and Analytics on AWS
Big Data and Analytics on AWS
 
ModernizationAWS.pdf
ModernizationAWS.pdfModernizationAWS.pdf
ModernizationAWS.pdf
 
Using the Open Science Data Cloud for Data Science Research
Using the Open Science Data Cloud for Data Science ResearchUsing the Open Science Data Cloud for Data Science Research
Using the Open Science Data Cloud for Data Science Research
 
Networking: New Capabilities for Amazon Virtual Private Cloud
Networking: New Capabilities for Amazon Virtual Private CloudNetworking: New Capabilities for Amazon Virtual Private Cloud
Networking: New Capabilities for Amazon Virtual Private Cloud
 
AWS Canberra WWPS Summit 2013 - Opening Keynote
AWS Canberra WWPS Summit 2013 - Opening KeynoteAWS Canberra WWPS Summit 2013 - Opening Keynote
AWS Canberra WWPS Summit 2013 - Opening Keynote
 
Self-Service Supercomputing
Self-Service SupercomputingSelf-Service Supercomputing
Self-Service Supercomputing
 
Time to Science/Time to Results: Transforming Research in the Cloud
Time to Science/Time to Results: Transforming Research in the CloudTime to Science/Time to Results: Transforming Research in the Cloud
Time to Science/Time to Results: Transforming Research in the Cloud
 
Advanced Strategies for Leveraging AWS for Disaster Recovery
Advanced Strategies for Leveraging AWS for Disaster Recovery   Advanced Strategies for Leveraging AWS for Disaster Recovery
Advanced Strategies for Leveraging AWS for Disaster Recovery
 
Understanding The Azure Platform November 09
Understanding The Azure Platform   November 09Understanding The Azure Platform   November 09
Understanding The Azure Platform November 09
 
Tooling Up for Efficiency: DIY Solutions @ Netflix - ABD319 - re:Invent 2017
Tooling Up for Efficiency: DIY Solutions @ Netflix - ABD319 - re:Invent 2017Tooling Up for Efficiency: DIY Solutions @ Netflix - ABD319 - re:Invent 2017
Tooling Up for Efficiency: DIY Solutions @ Netflix - ABD319 - re:Invent 2017
 
Vancouver keynote - AWS Innovate - Sam Elmalak
Vancouver keynote - AWS Innovate - Sam ElmalakVancouver keynote - AWS Innovate - Sam Elmalak
Vancouver keynote - AWS Innovate - Sam Elmalak
 
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
Event Driven Architecture with a RESTful Microservices Architecture (Kyle Ben...
 
Course 3 : Types of data and opportunities by Nikolaos Deligiannis
Course 3 : Types of data and opportunities by Nikolaos DeligiannisCourse 3 : Types of data and opportunities by Nikolaos Deligiannis
Course 3 : Types of data and opportunities by Nikolaos Deligiannis
 
Webinar: SQL for Machine Data?
Webinar: SQL for Machine Data?Webinar: SQL for Machine Data?
Webinar: SQL for Machine Data?
 
Jeff Kratz - Cloud Computing
Jeff Kratz - Cloud ComputingJeff Kratz - Cloud Computing
Jeff Kratz - Cloud Computing
 
Hybrid Cloud Solutions to Transform Your Organization
Hybrid Cloud Solutions to Transform Your OrganizationHybrid Cloud Solutions to Transform Your Organization
Hybrid Cloud Solutions to Transform Your Organization
 
Tapping the cloud for real time data analytics
 Tapping the cloud for real time data analytics Tapping the cloud for real time data analytics
Tapping the cloud for real time data analytics
 
Key Database Criteria for Cloud Applications
Key Database Criteria for Cloud ApplicationsKey Database Criteria for Cloud Applications
Key Database Criteria for Cloud Applications
 
Snowflake Best Practices for Elastic Data Warehousing
Snowflake Best Practices for Elastic Data WarehousingSnowflake Best Practices for Elastic Data Warehousing
Snowflake Best Practices for Elastic Data Warehousing
 

Mais de Amazon Web Services

Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
Amazon Web Services
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
Amazon Web Services
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
Amazon Web Services
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
Amazon Web Services
 

Mais de Amazon Web Services (20)

Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
Come costruire servizi di Forecasting sfruttando algoritmi di ML e deep learn...
 
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
Big Data per le Startup: come creare applicazioni Big Data in modalità Server...
 
Esegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS FargateEsegui pod serverless con Amazon EKS e AWS Fargate
Esegui pod serverless con Amazon EKS e AWS Fargate
 
Costruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWSCostruire Applicazioni Moderne con AWS
Costruire Applicazioni Moderne con AWS
 
Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot Come spendere fino al 90% in meno con i container e le istanze spot
Come spendere fino al 90% in meno con i container e le istanze spot
 
Open banking as a service
Open banking as a serviceOpen banking as a service
Open banking as a service
 
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
Rendi unica l’offerta della tua startup sul mercato con i servizi Machine Lea...
 
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...OpsWorks Configuration Management: automatizza la gestione e i deployment del...
OpsWorks Configuration Management: automatizza la gestione e i deployment del...
 
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows WorkloadsMicrosoft Active Directory su AWS per supportare i tuoi Windows Workloads
Microsoft Active Directory su AWS per supportare i tuoi Windows Workloads
 
Computer Vision con AWS
Computer Vision con AWSComputer Vision con AWS
Computer Vision con AWS
 
Database Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatareDatabase Oracle e VMware Cloud on AWS i miti da sfatare
Database Oracle e VMware Cloud on AWS i miti da sfatare
 
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJSCrea la tua prima serverless ledger-based app con QLDB e NodeJS
Crea la tua prima serverless ledger-based app con QLDB e NodeJS
 
API moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e webAPI moderne real-time per applicazioni mobili e web
API moderne real-time per applicazioni mobili e web
 
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatareDatabase Oracle e VMware Cloud™ on AWS: i miti da sfatare
Database Oracle e VMware Cloud™ on AWS: i miti da sfatare
 
Tools for building your MVP on AWS
Tools for building your MVP on AWSTools for building your MVP on AWS
Tools for building your MVP on AWS
 
How to Build a Winning Pitch Deck
How to Build a Winning Pitch DeckHow to Build a Winning Pitch Deck
How to Build a Winning Pitch Deck
 
Building a web application without servers
Building a web application without serversBuilding a web application without servers
Building a web application without servers
 
Fundraising Essentials
Fundraising EssentialsFundraising Essentials
Fundraising Essentials
 
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
AWS_HK_StartupDay_Building Interactive websites while automating for efficien...
 
Introduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container ServiceIntroduzione a Amazon Elastic Container Service
Introduzione a Amazon Elastic Container Service
 

Último

Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
vu2urc
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
?#DUbAI#??##{{(☎️+971_581248768%)**%*]'#abortion pills for sale in dubai@
 

Último (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time AutomationFrom Event to Action: Accelerate Your Decision Making with Real-Time Automation
From Event to Action: Accelerate Your Decision Making with Real-Time Automation
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 

NASA Goddard: Head in the Clouds

  • 1. AWS Government, Education, and Nonprofit Symposium, Washington, DC | June 25-26, 2015
      NASA Goddard: Head in the Clouds
      Dan Duffy, NASA; Steve Orrin, Intel; Tim Carroll, Cycle Computing
      ©2015, Amazon Web Services, Inc. or its affiliates. All rights reserved.
  • 2. Fastest growing workloads: Fraud Detection; Risk Modeling; Drug Design; Genomics; Modeling and Simulation; Unstructured Data Analysis, Data Lakes
  • 3. Most resource intensive: 1 core; 8 cores; 8 servers; 10–10,000 servers
  • 4. Great, so… what’s the problem?
  • 5. The challenge of fixed capacity
      [Chart; recoverable labels: Time, Capability, Internal Capacity, System Organization]
  • 6. Transform/life sciences
      The problem in 2013:
        • Cancer research needed 50,000 cores, not available in-house
      The options they didn’t choose:
        • Buy infrastructure: spend $2M, wait 6 months
        • Write software for 9–12 months for this 1 app
      Solution:
        • Created a 10,600-server cluster
        • 39.5 years of computing in 8 hours
        • Found 3 potential drug candidates!
        • Total infrastructure bill: $4,372
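The figures on slide 6 can be cross-checked with a little arithmetic. The sketch below assumes that "39.5 years of computing" means 39.5 years of single-core time and that a year is 365 days; neither is stated on the slide, and the variable names are illustrative only.

```python
# Rough consistency check of the slide 6 figures.
# Assumptions (not stated on the slide): "years of computing" means
# single-core years, and one year is 365 days.
HOURS_PER_YEAR = 365 * 24

compute_years = 39.5          # from the slide
wall_clock_hours = 8          # from the slide
infrastructure_bill = 4372    # US dollars, from the slide

compute_hours = compute_years * HOURS_PER_YEAR

# Average number of cores that must be busy to finish in 8 hours.
average_concurrency = compute_hours / wall_clock_hours

# Implied cost of one hour of compute.
cost_per_compute_hour = infrastructure_bill / compute_hours

print(f"compute hours:         {compute_hours:,.0f}")
print(f"average concurrency:   {average_concurrency:,.1f} cores")
print(f"cost per compute hour: ${cost_per_compute_hour:.4f}")
```

Under these assumptions the average concurrency comes out a little below the 50,000 cores the slide says were needed, which is plausible if the cluster was not fully loaded for the entire run.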
  • 7. Cycle powers cloud BigData and BigCompute
      [Diagram; recoverable labels: Data Workflow, Cloud Orchestration, Analytics, Modeling, Internal Compute, Compute Burst]
      Software required to drive analytics and simulation at scale:
        • Easy access
        • Highly automated
        • On-demand
        • Ask the right questions
  • 8. Best way to try it… try it. Tim@cyclecomputing.com
  • 9. Measure Woody Biomass on South Side of the Sahara at the 40–50 cm Scale Using AWS
      Overview of the NASA Head in the Clouds Project, presented at the Amazon Web Services Public Summit 2015
      Daniel Duffy, daniel.q.duffy@nasa.gov, on Twitter @dqduffy
      High Performance Computing Lead at the NASA Center for Climate Simulation (NCCS): http://www.nccs.nasa.gov and @NASA_NCCS
      Goddard Space Flight Center (GSFC): http://www.nasa.gov/centers/goddard/home/
  • 10. ESD Project Won Intel Head in Clouds Challenge Award to Estimate Biomass in South Sahara
      Project Goal
        • Using NGA data to estimate tree and bush biomass over the entire arid and semi-arid zone on the south side of the Sahara
      Project Summary
        • Estimate carbon stored in trees and bushes in arid and semi-arid south Sahara
        • Establish carbon baseline for later research on expected CO2 uptake on the south side of the Sahara
      Principal Investigators
        • Dr. Compton J. Tucker, NASA Goddard Space Flight Center
        • Dr. Paul Morin, University of Minnesota
      [Image; labels: Tree Crown Shadow; caption: NGA 40 cm imagery representing tree and shrub automated recognition]
  • 11. Partners and Resources
      Intel
        • Professional Services and Funding for AWS Resources
      Amazon Web Services (AWS)
        • Compute and storage
        • Support to set up environment
      Cycle Computing
        • Cloud Resource Management Software
        • Services to install and configure the software
      Climate Model Data Services (CDS, GSFC Code 600)
        • NGA data support
      NASA Center for Climate Simulation (NCCS, GSFC Code 606.2)
        • System administration, application support, and data movement
      NASA CIO
        • General cloud consulting and coordination support
  • 12. Existing Sub-Saharan Arid and Semi-Arid Sub-Meter Commercial Imagery
        • 9,600 strips (~80 TB) to be delivered to GSFC
        • ~1,600 strips (~20 TB) at GSFC
      [Map: Area of Interest (AOI) for Sub-Saharan Arid and Semi-Arid Africa]
  • 13. The DigitalGlobe Constellation: The Entire Archive is Licensed to the USG
      Satellites: GeoEye, QuickBird, Ikonos, WorldView 1, WorldView 2, WorldView 3 (Available Q1 2015)
  • 14. Panchromatic and multispectral mapping at the 40- and 50-cm scale
  • 15. Use Niger as the test case
      NGA data over Niger
        • Currently have about 16,000 total scenes covering Niger (the data is already orthorectified)
        • For this test case, approximately 3,120 scenes need to be processed to generate the vegetation index
        • Each scene is approximately 30,000 x 30,000 data points (pixels)
        • Will break each scene up into 100 tiles (3,000 x 3,000)
      Where is the data?
        • Data currently resides within the NCCS and in AWS
      Additional data
        • If we are successful and have additional time and resources, other African areas can be studied.
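The tiling described on slide 15 (a 30,000 x 30,000 pixel scene cut into 100 tiles of 3,000 x 3,000) can be sketched as a generator of pixel windows. This is a minimal illustration, not the project's actual tiling code; the function name `tile_windows` and the tuple layout are assumptions.

```python
from typing import Iterator, Tuple

SCENE_SIZE = 30_000  # pixels per side, from the slide
TILE_SIZE = 3_000    # pixels per side, from the slide

Window = Tuple[int, int, int, int]  # (row_start, row_stop, col_start, col_stop)


def tile_windows(scene_size: int = SCENE_SIZE,
                 tile_size: int = TILE_SIZE) -> Iterator[Window]:
    """Yield half-open pixel windows that cover a square scene, row by row."""
    for row in range(0, scene_size, tile_size):
        for col in range(0, scene_size, tile_size):
            # min() clips the last tile if the scene is not an exact multiple.
            yield (row, min(row + tile_size, scene_size),
                   col, min(col + tile_size, scene_size))


windows = list(tile_windows())
print(len(windows), "tiles; first:", windows[0], "last:", windows[-1])
```

Because each window is independent of the others, every tile can be handed to a separate single-core worker, which is what makes the workload easy to spread across many virtual machines.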
  • 16. Processing requirements
      Based on the tests run in the NCCS private cloud, the following processing requirements were estimated
        • The tests were run on a single-core (Intel E5-2670 2.5 GHz processor) virtual machine with 2 GB of memory
        • Each of the 3,120 scenes is broken up into 100 tiles
        • Each tile took 24 minutes
        • Hence, one scene will take 24 * 100 = 2,400 minutes of total processor time (about 40 wall clock hours)
        • Tiles and scenes can be run in parallel
        • Total tiles to process = 312,000
        • Total compute hours = 124,800
      Target completion time
        • Finishing in 1 month will take between 175 and 200 virtual machines running non-stop
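The estimate on slide 16 follows directly from the per-tile timing. The sketch below reproduces the arithmetic, assuming a 30-day month and one tile at a time per single-core virtual machine; the constant names are illustrative.

```python
import math

SCENES = 3_120            # from the slide
TILES_PER_SCENE = 100     # from the slide
MINUTES_PER_TILE = 24     # measured on one core, from the slide
HOURS_PER_MONTH = 30 * 24 # assumption: a 30-day month of non-stop running

total_tiles = SCENES * TILES_PER_SCENE
minutes_per_scene = TILES_PER_SCENE * MINUTES_PER_TILE
total_core_hours = total_tiles * MINUTES_PER_TILE / 60

# Minimum number of single-core VMs needed to finish within one month.
vms_for_one_month = math.ceil(total_core_hours / HOURS_PER_MONTH)

print(f"tiles:             {total_tiles:,}")
print(f"minutes per scene: {minutes_per_scene:,}")
print(f"core hours:        {total_core_hours:,.0f}")
print(f"VMs for one month: {vms_for_one_month}")
```

The computed minimum sits just under the slide's 175 to 200 range, which leaves some headroom for startup time, failed jobs, and data movement.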
  • 17. Input and output data
      Input data
        • Total input of about 8 TB for the 3,120 scenes
        • Average of about 2.63 GB of data per scene
        • Average of about 26.3 MB of data per tile
      Intermediate data products
        • Unsure of how much intermediate data is needed; this will impact the amount of temporary space required for each run
      Output data products
        • Total output data is estimated to be 25% of the input data
        • Estimated total output is about 2 to 3 TB
        • Output data will be transferred back to the NCCS
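The per-scene and per-tile averages on slide 17 can be reproduced from the 8 TB total. The unit conventions below are assumptions chosen because they match the slide's rounded figures: 1 TB = 1024 GB for the per-scene number, and 1 GB = 1000 MB for the per-tile number.

```python
INPUT_TB = 8             # from the slide
SCENES = 3_120           # from the slide
TILES_PER_SCENE = 100    # from the slide
OUTPUT_FRACTION = 0.25   # from the slide

GB_PER_TB = 1024         # assumption: reproduces the slide's 2.63 GB per scene
MB_PER_GB = 1000         # assumption: reproduces the slide's 26.3 MB per tile

gb_per_scene = INPUT_TB * GB_PER_TB / SCENES
mb_per_tile = gb_per_scene / TILES_PER_SCENE * MB_PER_GB
output_tb = INPUT_TB * OUTPUT_FRACTION

print(f"per scene: {gb_per_scene:.2f} GB")
print(f"per tile:  {mb_per_tile:.1f} MB")
print(f"output:    {output_tb:.1f} TB")
```

At roughly 26 MB of input per tile, a tile fits comfortably on the local disk of a small virtual machine, consistent with the requirement that storage need not be shared across nodes.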
  • 18. Cluster configuration requirements
      Category | Description | Requirement
      Number of Cores | How many cores are required on a single node for the application? | 1 per tile
      Amount of Memory (RAM) | How much memory on a node (or per core) is required for the application? | 2 GB per tile
      Operating System (OS) | What operating system does the application need? | Linux (CentOS or Debian)
      Libraries/Tools/Software | What additional libraries, tools, and software need to be installed? Compilers? Commercial software? | None
      Parallelization | Can the application run in a parallel manner? If so, how (threaded, MPI, or multiple instances of the application)? | Inherently parallel processing of each scene and/or tile
      Cluster | If the application runs in parallel across many nodes, how many nodes are required? | 175–200 to complete in 1 month; more can be used
      Storage | How much storage space will be required for each run (input, intermediate, and output files)? | Total input: 8 TB (approx. 2.6 GB for each scene); Intermediate: to be determined; Total output back to NCCS: 2 TB (approx. 25% of total input)
      Shared Storage | Does this storage have to be shared across all nodes? | No
  • 19. Workflow
      [Diagram; components: NGA Data at NASA; NGA Data External to NASA (PGC, Digital Globe); NCCS Science Cloud (Internal Cloud) with Shared File System and VMs; DataMan (Cycle Computing Data Transfer Software); Amazon S3; AWS VMs with local data; Batch Queue System; Cycle Computing System]
        • Data is to be copied into the NCCS science cloud NGA data repository.
        • Virtual machines in the internal cloud can read the data directly from the shared disk in the NASA internal cloud. No additional data movement is required.
        • The Cycle Computing DataMan software will be used to transfer the data into Amazon S3.
        • Data to be processed is staged into Amazon S3. Data will be moved to the local storage of the VMs for processing. Products could be stored in S3 for transfer to the NCCS at a later time.
        • A resource manager (batch queue) will be running in AWS. Scientists will interact and launch jobs through the Cycle Computing system directly in AWS. Virtual machines will be launched in AWS. After the job is completed, the results will be copied back to the NCCS.
  • 20. Timeline
      [Gantt chart; columns: Dec, Jan, Feb, Mar, Apr, May, Jun, Jul, Aug, Sep; the bars did not survive extraction]
      Categories: Bi-Weekly Tag Ups; Requirements/Scope; Setup/Configuration; Test Runs; Transfer Data to S3; Configure S3 Buckets; Production Runs; Analysis; Final Report
  • 21. Why use Cycle Computing and AWS?
        • The bigger goal is to analyze the entire arid and semi-arid zone on the south side of the Sahara
            - About 80 TB, 10x the data that the initial project will analyze
        • On 200 virtual machines, this will take 10 months!
            - How can we accelerate this?
        • Can easily scale up the number of virtual machines using the Cycle Computing software and the AWS resources
            - Once the data is in AWS, 80 TB of data can be analyzed in approximately the same amount of time as 8 TB of data
            - Scientists really love this part!
        • Might need longer, given that the data transfers may take time
            - Can overlap data transfers and computation
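The scaling argument on slide 21 can be made concrete with the compute estimate from slide 16. The sketch below assumes that compute time grows linearly with data volume, that each virtual machine processes one tile at a time, that a month is 30 days, and that data transfer time is ignored; the helper names are hypothetical.

```python
import math

BASE_CORE_HOURS = 124_800  # estimate for the 8 TB Niger test case (slide 16)
DATA_SCALE = 10            # 80 TB is 10x the initial 8 TB
HOURS_PER_MONTH = 30 * 24  # assumption: a 30-day month of non-stop running


def months_to_finish(core_hours: float, vms: int) -> float:
    """Months needed when `vms` single-core VMs run non-stop."""
    return core_hours / (vms * HOURS_PER_MONTH)


def vms_needed(core_hours: float, months: float) -> int:
    """Smallest number of single-core VMs that finishes within `months`."""
    return math.ceil(core_hours / (months * HOURS_PER_MONTH))


# Assumption: compute scales linearly with the amount of data.
full_core_hours = BASE_CORE_HOURS * DATA_SCALE

months_on_200 = months_to_finish(full_core_hours, 200)
vms_for_one_month = vms_needed(full_core_hours, 1)

print(f"80 TB on 200 VMs:   {months_on_200:.1f} months")
print(f"80 TB in one month: {vms_for_one_month} VMs")
```

Under these assumptions 200 machines need close to nine months, in line with the slide's round figure of 10, while scaling out by roughly a factor of ten brings the full 80 TB back to about one month.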
  • 22. Thanks go to the following…
      NASA
        • Dr. Compton Tucker (Co-PI)
        • Katherine Melocik (GSFC)
        • Jennifer Small (GSFC)
        • Dr. Tsengdar Lee (HQ)
        • Daniel Duffy (GSFC)
        • Mark McInerney (GSFC)
        • Hoot Thompson (GSFC)
        • Garrison Vaughn (GSFC)
        • Brittany Wills (GSFC)
        • Scott Sinno (GSFC)
        • Ray Obrien (ARC)
        • Richard Schroeder (ARC)
        • Milton Checchi (ARC)
      University Partners
        • Paul Morin (Co-PI, Univ. Minnesota)
        • Claire Porter (Univ. Minnesota)
        • Jamon Van Den Hoek (Oak Ridge)
      Cycle Computing
        • Tim Carroll
        • Michael Requa
        • Carl Chesal
        • Bob Nordlund
        • Glen Otero
        • Rob Futrick
      AWS
        • Jamie Baker
        • Jeff Layton
      There are others… My apologies to those I missed. These are typically the ones on our conference calls!
  • 23. Thank You. This presentation will be loaded to SlideShare the week following the Symposium. http://www.slideshare.net/AmazonWebServices

Editor's Notes

  1. Key Points: Multi-billion dollar corps committed to getting better answers faster (key on the hook)