Containers in Science:
neuroimaging use cases
Chris Gorgolewski
Software in
Science:
current
limitations
Test coverage
Portability
Reproducibility
Reusability
Documentation
User support
Portability
“I coded up my analysis on my laptop – how can I run it
on that fancy cluster you mentioned?”
Portability
“I want to send my analysis to a collaborator – how can I
do it without writing an essay on software installation?”
Portability
“Just got a new shiny laptop – how can I keep working
on my analysis without having to spend hours setting up
everything?”
Reproducibility
“My paper was under review for 7 months; I need to
rerun my analyses, but my laptop got stolen in the
meantime.”
Reproducibility
“I’m trying to replicate results from a paper; got the
data, but software configuration details from the paper
are missing.”
Reproducibility
“I want my analysis method to work the same way for all
scientists who use it.”
reproducibility
==
portability in time
Software Containers:
quick refresh
your analysis code
binary dependencies
configuration files
environment variables
data dependencies
Software Containers:
quick refresh
Everything above the kernel level
captured in one convenient package.
Software Containers:
quick refresh
Same container runs on:
Windows
Mac
Linux
HPCs
No need to port code.
Software Containers
Software containers greatly improve
portability and reproducibility*
*within some limits
Container technology vs.
implementation
There are many implementations of software containers. We use
two:
Docker
(for single-user Windows, Mac, and Linux machines)
Singularity
(for multi-user HPCs)
Docker vs. Singularity
Kurtzer GM, Sochat V, Bauer MW (2017) Singularity: Scientific containers for mobility of compute. PLoS ONE 12(5): e0177459.
Singularity workflow
Docker images can be converted with docker2singularity:
https://github.com/singularityware/docker2singularity
Kurtzer GM, Sochat V, Bauer MW (2017) Singularity: Scientific containers for mobility of compute. PLoS ONE 12(5): e0177459.
How do we use containers?
Everyday custom data analysis
Development and distribution of complex pipelines
Aggregating collections of portable pipelines
Deploying analysis pipelines on a Science as a Service platform
Everyday data analysis
1. Prototype analysis on a small subset of data (use Docker on a laptop)
2. Convert Docker image to Singularity (docker2singularity)
3. Copy the image to a cluster
4. Run at scale
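
The four steps above can be sketched as a small Python script that only assembles the shell commands, so they can be inspected before anything is run. This is a rough sketch, not the presenter's actual setup: the image name, cluster address, paths, and the `/data /out` arguments passed to the container are all hypothetical placeholders. The docker2singularity invocation follows that project's README as best I recall, so check it against the repository before relying on it.

```python
import shlex

# Hypothetical placeholders; none of these names come from the slides.
IMAGE = "mylab/myanalysis:0.1.0"        # Docker image built on the laptop
CLUSTER = "user@cluster.example.org"    # login node of the HPC system


def prototype_cmd(data_dir, out_dir):
    """Step 1: run the Docker image on a small subset of the data."""
    return ["docker", "run", "--rm",
            "-v", f"{data_dir}:/data:ro",
            "-v", f"{out_dir}:/out",
            IMAGE, "/data", "/out"]


def convert_cmd(image_dir):
    """Step 2: convert the Docker image with docker2singularity.

    The converter itself runs as a Docker container and writes the
    Singularity image into the directory mounted at /output.
    """
    return ["docker", "run", "--privileged", "-t", "--rm",
            "-v", "/var/run/docker.sock:/var/run/docker.sock",
            "-v", f"{image_dir}:/output",
            "singularityware/docker2singularity", IMAGE]


def copy_cmd(image_path, remote_dir):
    """Step 3: copy the converted image file to the cluster."""
    return ["scp", image_path, f"{CLUSTER}:{remote_dir}/"]


def cluster_cmd(image_path, data_dir, out_dir):
    """Step 4: run the same analysis on the cluster with Singularity."""
    return ["singularity", "run",
            "-B", f"{data_dir}:/data",
            "-B", f"{out_dir}:/out",
            image_path, "/data", "/out"]


def workflow():
    """Return the four commands in order; nothing is executed here."""
    # The converter chooses the output file name; this one is a placeholder.
    image_file = "myanalysis.img"
    return [
        prototype_cmd("/home/me/subset", "/home/me/out"),
        convert_cmd("/home/me/images"),
        copy_cmd(f"/home/me/images/{image_file}", "/scratch/me"),
        cluster_cmd(f"/scratch/me/{image_file}", "/scratch/me/full", "/scratch/me/out"),
    ]


if __name__ == "__main__":
    for cmd in workflow():
        print(shlex.join(cmd))
```

Printing the commands instead of running them keeps the sketch safe to try on any machine. On a real cluster the last command would typically be wrapped in a job script for the local scheduler.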
FMRIPREP
http://fmriprep.readthedocs.io
MRIQC
http://mriqc.readthedocs.io
Development and distribution of
complex pipelines
MRIQC and FMRIPREP depend on a lot of software:
1. AFNI
2. FSL
3. FreeSurfer
4. ANTs
5. Nipype
6. Nilearn
7. Etc…
Development and distribution of
complex pipelines
Setting up all of the binary dependencies is
a major roadblock for users.
Development and distribution of
complex pipelines
Containers solve two problems:
1. Ease of installation for users
2. Consistent environment for developers
Aggregating collections of
portable pipelines
Containers can also be used to distribute collections of
processing pipelines.
portability + ease of use
==
containers + data standards
BIDS Apps
BIDS Apps is a natural combination of container
technology and neuroimaging dataset
description.
BIDS Apps
The goal: choose a data analysis pipeline from a library and
quickly run it on your data
Components:
1. Input data standard: Brain Imaging Data Structure
2. Command line interface standard
3. Containers
Simple parallelization scheme – map/reduce
Gorgolewski KJ, Alfaro-Almagro F, Auer T, Bellec P, Capotă M, Chakravarty MM, et al. (2017) BIDS apps: Improving ease of use, accessibility, and
reproducibility of neuroimaging data analysis methods. PLoS Comput Biol 13(3)
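
A minimal sketch of how the command line standard and the map/reduce scheme fit together is shown below. It assumes the positional arguments described in the BIDS Apps paper (`bids_dir`, `output_dir`, `analysis_level`) plus the optional `--participant_label`. The two processing functions are hypothetical stand-ins for real pipeline steps; in practice each participant-level call would be a separate container run.

```python
import argparse
from concurrent.futures import ThreadPoolExecutor


def build_parser():
    """Command line interface in the BIDS Apps style."""
    parser = argparse.ArgumentParser(description="Hypothetical example BIDS App")
    parser.add_argument("bids_dir", help="input dataset organized according to BIDS")
    parser.add_argument("output_dir", help="directory where results are written")
    parser.add_argument("analysis_level", choices=["participant", "group"],
                        help="'participant' is the map step, 'group' the reduce step")
    parser.add_argument("--participant_label", nargs="+", default=[],
                        help="participants to process (labels without 'sub-')")
    return parser


def run_participant(label):
    """Map step (placeholder): process one participant independently."""
    # A real app would read sub-<label> from bids_dir and write derivatives.
    return {"label": label, "score": float(int(label))}


def run_group(participant_results):
    """Reduce step (placeholder): combine all participant-level outputs."""
    scores = [result["score"] for result in participant_results]
    return sum(scores) / len(scores)


def map_reduce(labels, max_workers=4):
    """Run participants in parallel, then aggregate once at the group level."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        participant_results = list(pool.map(run_participant, labels))
    return run_group(participant_results)


if __name__ == "__main__":
    args = build_parser().parse_args(
        ["/data/ds001", "/data/out", "participant", "--participant_label", "01", "02"]
    )
    print(args.analysis_level, args.participant_label)
    print(map_reduce(["01", "02", "03"]))
```

Because every participant-level run depends only on its own input, the map step can be spread across cluster nodes without coordination. Only the group-level step needs to see all of the outputs.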
BIDS Apps
Organic growth
Fewer restrictions than other software
distribution schemes (e.g. Debian)
Developers are in control
BIDS Apps: misusing Docker
Strong versioning and testing require careful planning.
Modern Continuous Integration services are essential.
bids-apps.neuroimaging.io
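
One small check that a Continuous Integration job could run is to confirm that an image reference carries an explicit version rather than the floating `latest` tag. The function below is an illustrative sketch, not part of the BIDS Apps tooling, and the image names used with it are hypothetical.

```python
def is_pinned(image_ref):
    """Return True if the reference has a digest or an explicit non-'latest' tag."""
    if "@" in image_ref:
        # Digest form such as name@sha256:<hash> always identifies one image.
        return True
    # Only look at the last path component so a registry port is not
    # mistaken for a tag (e.g. localhost:5000/name).
    last_component = image_ref.rsplit("/", 1)[-1]
    if ":" not in last_component:
        # Without a tag, Docker falls back to 'latest'.
        return False
    tag = last_component.split(":", 1)[1]
    return tag != "latest"


if __name__ == "__main__":
    for ref in ["example/pipeline:1.2.0", "example/pipeline", "example/pipeline:latest"]:
        print(ref, is_pinned(ref))
```

Recording a pinned reference alongside the results makes it possible to rerun exactly the same software later.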
Using containers in a Science as a
Service platform
The ultimate goal:
Making more data available to
more researchers
OpenNeuro
The carrot: cutting-edge, computationally
expensive analysis pipelines at the click of a
button.
The “price”: the input data and the analysis
results become publicly available after a grace
period
How? Containers!
Summary
Containers are useful for:
Individual scientists
Pipeline developers
Science as a Service platforms
