Enabling Cloud Bursting for
Life Sciences within Galaxy
Enis Afgan
Johns Hopkins University
Galaxy Team
Slides available at bit.ly/gxy-bursting
What is Galaxy?
•  A data analysis and integration tool
•  A (free for everyone) web service integrating a wealth of tools, compute resources, terabytes of reference data and permanent storage
•  Open source software that makes integrating your own tools and data and customizing for your own site simple
usegalaxy.org
or any of the other 60+ public servers
$ hg clone https://bitbucket.org/galaxy/galaxy-dist
$ sh run.sh
[Diagram. Labels: Galaxy with /Tools, /Data, /Indices, DB, and compute resources; separate Galaxy instances for RNA-Seq, Assembly, and Quality Control (QC).]
[Diagram "Local / Federated". Labels: Galaxy with DB and an Object Store interface (S3, Swift); Tools A, Data A, Indices A; Pulsar with Tools B, Data B, Indices B; local Pulsar with Tools C, Data C, Indices C; artifact & job provenance; RNA-Seq, Assembly, QC; Galaxy; CloudMan.]
Focus on Cloud Bursting
Peak usage scenarios
Resource heterogeneity
Software licensing
Software installation restrictions
National cyber infrastructure resource access
Per-user, merit-based resource access
Burst Triggers
When?
Resource capacity
Job requirements
Data locality
System configuration
User preferences
Where?
Remote resource availability
Cost
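The "When?" and "Where?" triggers above could be combined into a single decision, sketched below in Python. This is only an illustration: every name in it (`LocalState`, `Job`, `Remote`, `should_burst`, `pick_remote`) and the 50 GB transfer threshold are assumptions made for the example, not part of Galaxy.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LocalState:
    free_cpus: int           # resource capacity
    allow_bursting: bool     # system configuration


@dataclass
class Job:
    cpus_needed: int         # job requirements
    input_gb: float          # data locality: how much data would have to move
    user_allows_cloud: bool  # user preferences


@dataclass
class Remote:
    name: str
    available: bool          # remote resource availability
    cost_per_hour: float     # cost


def should_burst(local: LocalState, job: Job, max_transfer_gb: float = 50.0) -> bool:
    """'When?': burst only if local capacity is short and nothing forbids it."""
    if not (local.allow_bursting and job.user_allows_cloud):
        return False
    if job.input_gb > max_transfer_gb:
        return False  # too much data to move; keep the job near its data
    return job.cpus_needed > local.free_cpus


def pick_remote(remotes: List[Remote]) -> Optional[Remote]:
    """'Where?': the cheapest remote resource that is currently available."""
    candidates = [r for r in remotes if r.available]
    return min(candidates, key=lambda r: r.cost_per_hour) if candidates else None
```

Keeping the two questions in separate functions mirrors the split on the slide: the first depends only on local state and the job, the second only on what is available remotely.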
Burst Architecture
1.  Galaxy dynamic job destination framework
2.  Galaxy CloudMan cluster with Pulsar
3.  A job destination mapper function
[Diagram. Labels: Galaxy <dynamic job destination framework/>; f(mapper); Local DRM; two CloudMan clusters, each running Pulsar.]
Pulsar
A standalone job manager server for Galaxy
Can be deployed on dedicated or transient servers (even MS Windows!)
Handles data staging and remote job execution
Pulsar job
Stage data
Submit job
Monitor job
Send back the data
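The four steps above (stage data, submit job, monitor job, send back the data) can be mimicked locally in a few lines of Python. This sketch does not use Pulsar or its API: a temporary directory stands in for the remote server, a toy subprocess stands in for the tool, and the function name `run_remotely` is hypothetical.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_remotely(input_file: Path, results_dir: Path) -> Path:
    """Walk through the four steps, with a temp directory as the 'remote' side."""
    with tempfile.TemporaryDirectory() as remote:
        remote_dir = Path(remote)

        # 1. Stage data: copy the input to the remote working directory.
        staged = remote_dir / input_file.name
        shutil.copy(input_file, staged)

        # 2. Submit job: a toy job that upper-cases the staged input.
        output = remote_dir / "output.txt"
        script = (
            "import sys; "
            "open(sys.argv[2], 'w').write(open(sys.argv[1]).read().upper())"
        )
        proc = subprocess.Popen(
            [sys.executable, "-c", script, str(staged), str(output)]
        )

        # 3. Monitor job: block until the job exits and check its status.
        if proc.wait() != 0:
            raise RuntimeError("remote job failed")

        # 4. Send back the data: copy the output to the caller's results directory.
        results_dir.mkdir(parents=True, exist_ok=True)
        return Path(shutil.copy(output, results_dir / output.name))
```

A real deployment would replace the local copies with network transfers and the `wait()` call with status polling, but the order of the steps stays the same.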
1. Galaxy dynamic job destination framework
Define job execution properties
•  Runners: local, Slurm, HTCondor, DRMAA, Pulsar, …
•  Destinations: resource & job properties (e.g., DRM queue, wall time)
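The relationship between runners and destinations can be pictured with a small Python model. This is a simplified stand-in, not Galaxy's actual job configuration format: the destination ids, parameter names, and the example URL are all made up for illustration.

```python
# Runners: the mechanisms that can execute a job (a subset of those listed above).
RUNNERS = {"local", "slurm", "pulsar"}

# Destinations: a runner plus the resource and job properties to use with it.
# All ids, parameter names, and values here are hypothetical.
DESTINATIONS = {
    "local_default": {"runner": "local", "params": {}},
    "cluster_long": {
        "runner": "slurm",
        "params": {"queue": "long", "walltime": "24:00:00"},
    },
    "cloud_pulsar": {
        "runner": "pulsar",
        "params": {"url": "https://pulsar.example.org"},
    },
}


def resolve(destination_id):
    """Return the (runner, params) pair a job sent to this destination would use."""
    dest = DESTINATIONS[destination_id]
    if dest["runner"] not in RUNNERS:
        raise ValueError("unknown runner: %s" % dest["runner"])
    return dest["runner"], dest["params"]
```

Several destinations can share one runner, which is what lets the same cluster expose, say, a short queue and a long queue as separate targets.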
2. CloudMan with Pulsar
A.  Launch a Galaxy on the Cloud instance
B.  Enable Pulsar service
C.  Add the instance as a destination in job config
Tool availability
•  Direct tool install
•  Docker images
3. Job mapper function
Determine job destination at runtime
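import pyslurm

def cloud_burst():
    """Pick a job destination based on the current state of the Slurm nodes."""
    # Query Slurm for the state of every node in the cluster.
    nodes_state = pyslurm.node().get()
    # Keep the nodes that report a positive 'total_cpus' value.
    available_nodes = [node for node in nodes_state.values()
                       if node['total_cpus'] > 0]
    if not available_nodes:
        # Nothing available locally: burst to the remote Pulsar destination.
        return 'pulsar_nectar_galaxy'
    # Otherwise keep the job on the local cluster.
    return 'drmaa_runner'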
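The decision in the mapper can also be separated from the Slurm query, which makes it possible to exercise the logic without a cluster. The sketch below assumes the node state is a dictionary of per-node dictionaries carrying a `total_cpus` entry, as in the slide's code; the function name `choose_destination` and its keyword arguments are hypothetical, while the two destination ids are the ones used on the slide.

```python
def choose_destination(nodes_state,
                       local="drmaa_runner",
                       remote="pulsar_nectar_galaxy"):
    """Return the local destination if any node reports CPUs, else the remote one."""
    has_capacity = any(
        node.get("total_cpus", 0) > 0 for node in nodes_state.values()
    )
    return local if has_capacity else remote


# Hand-written node states standing in for what the cluster would report.
busy = {"n1": {"total_cpus": 0}, "n2": {"total_cpus": 0}}
idle = {"n1": {"total_cpus": 0}, "n2": {"total_cpus": 8}}

print(choose_destination(busy))  # pulsar_nectar_galaxy
print(choose_destination(idle))  # drmaa_runner
```

Passing the node state in as an argument keeps the rule a pure function, so the burst policy can be tested with plain dictionaries before it is wired to a live scheduler.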
An outcome?
[Chart, usegalaxy.org, by week from 2013-41 to 2015-03: jobs run to completion (count) and average wait time (minutes). Annotations: "Start bursting", "No job wait", "More jobs".]
An outcome?
[Chart, usegalaxy.org, by week from 2013-41 to 2015-03: jobs deleted while queued (% of jobs submitted), axis from 0% to 14%. Caption: "User frustration level".]