
From Zero to Nextflow 2017



Over the years, the Bioinformatics Core Facility has implemented a number of procedures and pipelines to provide high-quality results to an increasing number of users. Here we present our experience with migrating some of our extensively used pipelines to the Nextflow framework and creating Docker/Singularity containers for reproducibility.

Published in: Science


  1. From Zero to Nextflow. Luca Cozzuto, Bioinformatics Core Facility
  2. Bioinformatics Unit@CRG: Julia Ponomarenko, Luca Cozzuto, Toni Hermoso, Sarah Bonnin
  3. Core facility typical workflow [diagram]. User requests are either standardised analyses (ChIP-seq, RNA-seq, SNP calling, …) or non-standard analyses (building a database, reproducing an analysis, …).
  4. Core facility typical workflow [diagram]. Standardised analyses (ChIP-seq, RNA-seq, SNP calling, …): semi-automatic pipelines. Non-standard analyses: building a database, reproducing an analysis, …
  5. Core facility typical workflow [diagram]. Standardised analyses (ChIP-seq, RNA-seq, SNP calling, …): semi-automatic pipelines. Non-standard analyses (building a database, reproducing an analysis, …): a bunch of tools, custom scripts, R magic, etc.
  6. Core facility typical workflow [diagram]. Standardised analyses (semi-automatic pipelines): 50%. Non-standard analyses (a bunch of tools, custom scripts, R magic, etc.): 50%.
  7. Core facility typical workflow [chart]. Hours by type of project (2015 & 2016): Genomics 39%, Database 15%, Microbiome 12%, RNA-seq 18%, ChIP-seq 13%, Microarray & HTqPCR 3%.
  9. Why Nextflow?
     • Standard analysis:
       • Automation, parallelization, portability, reproducibility (together with containers).
       • NF allows adding new steps without pain (thanks to the isolation of processes) in a collaborative way.
     • [After 2 years] Can you redo the SAME analysis with new samples?
  10. Why Nextflow?
     • Standard analysis:
       • Automation, parallelization, portability, reproducibility (together with containers).
       • NF allows adding new steps without pain (thanks to the isolation of processes) in a collaborative way.
     • Non-standard analysis can benefit too:
       • NF code is easy to reuse and modify. It is polyglot!
       • Using containers prevents several problems, such as portability issues, OS upgrades, and library / version mismatches.
  11. Our experience: first day with Nextflow
  12. Our experience [plot: progression in coding vs. time]: documentation / examples
  13. Our experience [plot: progression in coding vs. time]
  14. Our experience [plot: progression in coding vs. time]: invite Paolo for a coffee
  15. Our experience [plot: progression in coding vs. time]
  16. Our experience [plot: progression in coding vs. time]: second coffee
  17. Our experience [plot: progression in coding vs. time]: start using the Gitter chat
  18. Our experience [plot: progression in coding vs. time]: asking the Singularity Google Group
  19. RNAseq pipeline ver 0.1
  20. RNAseq pipeline ver 0.2
  21. Our experience. Now we are developing / have developed:
     • Standard analysis: ChIPseq pipeline, RNAseq pipeline, small RNAseq pipeline, SNP calling procedures (based on GATK)
  22. Our experience. Now we are developing / have developed:
     • Standard analysis: ChIPseq pipeline, RNAseq pipeline, small RNAseq pipeline, SNP calling procedures (based on GATK)
     • Non-standard analysis: pipeline for the analysis of single cell transcriptomes, detection of plant resistance genes, …
  23. Future ideas
     • Semi-automatic reports
     • A CMS able to mine the Nextflow log file and store both metadata and logs
     • Maybe a simple graphical interface to compete with or complement Galaxy?
  24. Thanks! Bioinformatics Unit@CRG: Julia Ponomarenko, Luca Cozzuto, Toni Hermoso, Sarah Bonnin
  25. Our experience: we started developing a pipeline for single cell sequencing.
  26. A concrete case: analysis of single cell sequencing [workflow diagram]. Inputs: pair 1 (cell), pair 2 (gene), and a genome file. Steps: indexing the genome file into an index; splitting the reads into chunks; filtering / alignment of the chunks in parallel, giving mapped tags per chunk; joining the results into expression per cell.
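
The split, parallel filtering / alignment, and join steps of the single cell case on the last slide follow a scatter-gather pattern. Below is a minimal Python sketch of that pattern under stated assumptions. The toy reads, the toy genome, all function names, and the exact-match "alignment" are made up for illustration. This is not the authors' pipeline, which was written in Nextflow, and threads stand in here for the parallel jobs a workflow engine would launch.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def build_index(genome):
    """Indexing step: map each reference sequence to its gene name."""
    return {sequence: gene for gene, sequence in genome.items()}


def split_into_chunks(read_pairs, chunk_size):
    """Split step: cut the list of read pairs into fixed-size chunks."""
    return [read_pairs[i:i + chunk_size]
            for i in range(0, len(read_pairs), chunk_size)]


def map_chunk(index, chunk):
    """Filtering / alignment step for one chunk (hypothetical exact match):
    keep pairs whose gene read is in the index, return (cell, gene) tags."""
    return [(cell, index[read]) for cell, read in chunk if read in index]


def expression_per_cell(genome, read_pairs, chunk_size=2, workers=2):
    """Run index -> split -> parallel mapping -> join, then count tags."""
    index = build_index(genome)
    chunks = split_into_chunks(read_pairs, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = pool.map(partial(map_chunk, index), chunks)
        # Joining step: merge the per-chunk tags into one count table.
        counts = Counter(tag for tags in mapped for tag in tags)
    return dict(counts)


# Toy, made-up data: pair 1 carries the cell barcode, pair 2 the gene read.
genome = {"geneA": "ACGT", "geneB": "TTGA"}
read_pairs = [
    ("cell1", "ACGT"),
    ("cell1", "TTGA"),
    ("cell2", "ACGT"),
    ("cell1", "ACGT"),
    ("cell2", "NNNN"),  # unmappable read, dropped by the filter
]
result = expression_per_cell(genome, read_pairs)
print(result)
```

Because each chunk is mapped independently of the others, the chunk size and the number of workers can change without touching the join step, which is the property that makes this layout easy to parallelize.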
