Introduction   MEGAN           Metadata         Pooling Datasets     Summary & Conclusion




           Pooling metagenomes in MEGAN based on
                   environmental parameters

                       Hans-Joachim Ruscheweyh

                   Center for Bioinformatics, Tuebingen University


                                 June 15, 2011




1 / 27            Hans-Joachim Ruscheweyh   Pooling metagenomes


         1     Introduction Metagenomics
                  Unculturable Microbes
                  Typical Metagenomic Samples
                  Pipeline
         2     MEGAN
                  MEGAN Introduction
                  Taxonomic & Functional Analysis
                  Comparison Analysis
                  PostgreSQL
         3     Metadata
                  What is Metadata?
                  Using Metadata to pool Datasets
         4     Pooling Datasets
                  Basic Idea
                  Combined Datasets
                  MetaData Analyzer
         5     Summary & Conclusion



 Metagenomics




               The study of DNA of uncultured organisms
               > 99% of all microbes cannot be cultured
               A genome is the entire genetic information of a single
               organism
               A metagenome is the entire genetic information of an
               assemblage of organisms







 Typical Metagenomic Samples



               Human microbiome
               Soil samples
               Sea water samples
               Seabed samples
               Air samples
               Medical samples
               Ancient bones







 Metagenomic Pipeline




         A primer on metagenomics; Wooley et al. (2010)




 MEGAN Introduction




         Interactive tool for metagenomic analysis - www-ab.informatik.uni-tuebingen.de/software/megan




 Taxonomic Analysis

               The tree reflects the NCBI taxonomy
               Reads are compared against a reference database, e.g. NR
               Reads are mapped onto the tree from the comparison results
               using the lowest common ancestor (LCA) algorithm
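
The LCA placement described on this slide can be illustrated with a minimal sketch. It assumes a toy taxonomy stored as child-to-parent links and a hypothetical table of hits per read; it is a naive lowest-common-ancestor computation, not MEGAN's actual implementation, which may apply additional score thresholds before placing a read.

```python
# Toy taxonomy as child -> parent links; the selection of taxa is illustrative only.
PARENT = {
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Gammaproteobacteria": "Proteobacteria",
    "Escherichia": "Gammaproteobacteria",
    "Vibrio": "Gammaproteobacteria",
    "Firmicutes": "Bacteria",
    "Bacillus": "Firmicutes",
}


def path_to_root(taxon):
    """Return the lineage of a taxon, ordered from the root down to the taxon."""
    path = [taxon]
    while path[-1] != "root":
        path.append(PARENT[path[-1]])
    return path[::-1]


def lca(taxa):
    """Return the lowest node shared by the lineages of all given taxa."""
    lineages = [path_to_root(t) for t in taxa]
    ancestor = "root"
    # Walk down from the root while every lineage agrees on the next node.
    for level in zip(*lineages):
        if len(set(level)) != 1:
            break
        ancestor = level[0]
    return ancestor


# Hypothetical reads with the taxa of their reference database hits.
hits = {
    "read_1": ["Escherichia"],
    "read_2": ["Escherichia", "Vibrio"],
    "read_3": ["Escherichia", "Bacillus"],
}
assignment = {read: lca(taxa) for read, taxa in hits.items()}
print(assignment)
```

A read whose hits all fall in one genus stays at that genus, while a read with hits in distant taxa moves up to a higher node of the tree.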






 Functional Analysis - SEED


               The tree contains the nodes of the SEED classification
               Reads are mapped onto the SEED classification

               www.theSEED.org







 Functional Analysis - KEGG




               KEGG: Kanehisa et al., Nucleic Acids Res. 38, D355-D360 (2010)
               http://www.genome.jp/kegg/







 Comparing Datasets


               Based on the (normalized) number of reads assigned to each node
               Each color denotes a dataset
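
One way to make read counts comparable across datasets of different sizes is sketched below. The slide does not state which normalization MEGAN applies, so scaling every dataset down to the size of the smallest one is an assumption, as are the function name and the example counts.

```python
def normalize(counts_by_dataset):
    """Scale per-node read counts so that every dataset has the same total.

    Assumes every dataset has at least one assigned read.
    """
    totals = {name: sum(c.values()) for name, c in counts_by_dataset.items()}
    target = min(totals.values())  # assumption: scale to the smallest dataset
    return {
        name: {node: round(n * target / totals[name]) for node, n in counts.items()}
        for name, counts in counts_by_dataset.items()
    }


# Hypothetical read counts per taxonomy node.
counts = {
    "winter": {"Proteobacteria": 600, "Bacteroidetes": 400},
    "summer": {"Proteobacteria": 300, "Bacteroidetes": 200},
}
normalized = normalize(counts)
print(normalized)
```

After scaling, both hypothetical datasets show the same profile, which is what a comparison view should reveal when only the sequencing depth differs.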







 DB Extension - PostgreSQL



               MEGAN communicates with a PostgreSQL database
               Many datasets are available in one database instance
               Many users can operate on the same database instance
               This avoids redundant copies of datasets that are often large

               http://www.postgresql.org/







 What is Metadata?


          Metadata are, for example, environmental parameters recorded
          together with the actual metagenomic sample, e.g. collection
          date, gender, health status, ...

          Dataset          Month      Salinity    Ammonia
          January_2PM      January    33.3        0.0
          January_10PM     January    34.2        0.0
          August_4AM       August     33.3        0.14
          August_10AM      August     32.1        0.06
          Datasets taken from: The taxonomic and functional diversity of microbes at a temperate coastal site: a ’multi-omic’
          study of the seasonal and diel temporal variation; Gilbert et al. (2010)








 Using Metadata to pool Datasets

          January_2PM, January_10PM:  Month ∈ {Dec, Jan, Feb}  →  Winter
          August_4AM, August_10AM:    Month ∈ {Jun, Jul, Aug}  →  Summer
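
The pooling rule on this slide amounts to evaluating a Boolean expression on each dataset's metadata. A minimal sketch follows, using the four datasets and values from the metadata table; the dictionary layout, the `pool` helper and the third example expression are assumptions made for illustration and do not reflect MEGAN's interface.

```python
# Metadata values as given on the slide; the dict layout is an assumption.
METADATA = {
    "January_2PM": {"Month": "January", "Salinity": 33.3, "Ammonia": 0.0},
    "January_10PM": {"Month": "January", "Salinity": 34.2, "Ammonia": 0.0},
    "August_4AM": {"Month": "August", "Salinity": 33.3, "Ammonia": 0.14},
    "August_10AM": {"Month": "August", "Salinity": 32.1, "Ammonia": 0.06},
}

WINTER = {"December", "January", "February"}
SUMMER = {"June", "July", "August"}


def pool(metadata, predicate):
    """Return the names of all datasets whose metadata satisfy the predicate."""
    return sorted(name for name, row in metadata.items() if predicate(row))


winter = pool(METADATA, lambda row: row["Month"] in WINTER)
summer = pool(METADATA, lambda row: row["Month"] in SUMMER)
# Expressions can combine several parameters (hypothetical thresholds).
salty_no_ammonia = pool(
    METADATA, lambda row: row["Salinity"] > 33.0 and row["Ammonia"] == 0.0
)
print(winter, summer, salty_no_ammonia)
```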







 Basic Idea



               Create two new datasets (winter, summer) from the four
               BLAST files
               Problems:
                   Doubles space consumption
                   Is time inefficient
               Idea:
                   Use database technology to avoid redundancy, save time
                   and space







 Primary & Combined Datasets in the Database



               A primary dataset is a dataset created from the original
               BLAST output and the reads file
               A combined dataset is created from primary datasets
               A combined dataset is created by using:
                   References to read and match data of the primary datasets
                   Optionally also the classification data of the primary
                   datasets
               Hence, a combined dataset can be created efficiently in both
               time and space
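
The idea of building a combined dataset from references rather than copies can be sketched with a small relational schema. The table and column names below are hypothetical (the slides do not show MEGAN's schema), and Python's built-in SQLite stands in for PostgreSQL so that the example runs without a server.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE dataset (id INTEGER PRIMARY KEY, name TEXT, is_combined INTEGER);
    CREATE TABLE reads (id INTEGER PRIMARY KEY, dataset_id INTEGER, taxon TEXT);
    -- A combined dataset stores only links to its primary datasets.
    CREATE TABLE combined_member (combined_id INTEGER, primary_id INTEGER);
""")
con.executemany("INSERT INTO dataset VALUES (?, ?, ?)", [
    (1, "January_2PM", 0),
    (2, "January_10PM", 0),
    (3, "August_4AM", 0),
    (4, "Winter", 1),
])
con.executemany("INSERT INTO reads (dataset_id, taxon) VALUES (?, ?)", [
    (1, "Proteobacteria"),
    (1, "Bacteroidetes"),
    (2, "Proteobacteria"),
    (3, "Cyanobacteria"),
])
# Creating "Winter" adds two small link rows and copies no read data.
con.executemany("INSERT INTO combined_member VALUES (?, ?)", [(4, 1), (4, 2)])

# Reads of the combined dataset are resolved through the links at query time.
rows = con.execute("""
    SELECT r.taxon, COUNT(*)
    FROM combined_member AS m
    JOIN reads AS r ON r.dataset_id = m.primary_id
    WHERE m.combined_id = ?
    GROUP BY r.taxon
    ORDER BY r.taxon
""", (4,)).fetchall()
print(rows)
```

Because the combined dataset consists of link rows only, its cost grows with the number of pooled datasets rather than with the number of reads and matches they contain.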







 Creating Combined Datasets in MEGAN







 Analysis


               Input: 8 primary datasets; altogether ~100,000 reads, ~4
               million matches, ~4.5 GB of space
               Loading these datasets into the database takes ~50 minutes
               Three combined datasets (winter, spring, summer) are
               created
               Their creation takes ~30 seconds and needs ~40 MB of
               additional space
               Alternatively, combined datasets can be created on the fly.
               This takes less than a second and needs no additional
               space
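
The figures on this slide can be put in proportion with a short calculation. It uses only the approximate numbers stated above; treating 1 GB as 1000 MB and using the load time of the primary datasets as the point of comparison are assumptions.

```python
load_seconds = 50 * 60      # ~50 minutes to load the 8 primary datasets
combine_seconds = 30        # ~30 seconds to create the combined datasets
primary_mb = 4.5 * 1000     # ~4.5 GB, assuming 1 GB = 1000 MB
combined_mb = 40            # ~40 MB of additional space

time_ratio = load_seconds / combine_seconds
overhead_percent = 100 * combined_mb / primary_mb
print(f"creation is about {time_ratio:.0f}x shorter than loading")
print(f"additional space is about {overhead_percent:.1f}% of the primary data")
```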






 Comparing all Datasets







 Comparing by Season







 Summary & Conclusion


               MEGAN communicates with a PostgreSQL database
               This gives the user access to many datasets
               Many users can work on the database simultaneously
               Primary datasets can be pooled to create combined
               datasets
               The MetaData Analyzer creates combined datasets by
               evaluating Boolean expressions over the assigned metadata
               This technique is highly space- and time-efficient








               MEGAN v4 is freely available from
               www-ab.informatik.uni-tuebingen.de/software/megan
               Integrative analysis of environmental sequences using
               MEGAN4; Daniel H. Huson, Suparna Mitra, Hans-Joachim
               Ruscheweyh, Nico Weber, Stephan C. Schuster; submitted
               2011
               Thanks go to Daniel Huson, Suparna Mitra, Nico Weber and
               Stephan Schuster




          Thank you for your attention!
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant scienceVenice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
Venice Juanillas at #ICG13: Rice Galaxy: an open resource for plant science
 
Stefan Prost at #ICG13: Genome analyses show strong selection on coloration, ...
Stefan Prost at #ICG13: Genome analyses show strong selection on coloration, ...Stefan Prost at #ICG13: Genome analyses show strong selection on coloration, ...
Stefan Prost at #ICG13: Genome analyses show strong selection on coloration, ...
 
Lisa Johnson at #ICG13: Re-assembly, quality evaluation, and annotation of 67...
Lisa Johnson at #ICG13: Re-assembly, quality evaluation, and annotation of 67...Lisa Johnson at #ICG13: Re-assembly, quality evaluation, and annotation of 67...
Lisa Johnson at #ICG13: Re-assembly, quality evaluation, and annotation of 67...
 
Chris Armit at IDW2018: Democratising Data Publishing: A Global Perspective
Chris Armit at IDW2018: Democratising Data Publishing: A Global PerspectiveChris Armit at IDW2018: Democratising Data Publishing: A Global Perspective
Chris Armit at IDW2018: Democratising Data Publishing: A Global Perspective
 
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
EMBL OA Week: FAIR or unfair? Principled publishing for more Open & Democrati...
 
Reproducible method and benchmarking publishing for the data (and evidence) d...
Reproducible method and benchmarking publishing for the data (and evidence) d...Reproducible method and benchmarking publishing for the data (and evidence) d...
Reproducible method and benchmarking publishing for the data (and evidence) d...
 
Mary Ann Tuli: What MODs can learn from Journals – a GigaDB curator’s perspec...
Mary Ann Tuli: What MODs can learn from Journals – a GigaDB curator’s perspec...Mary Ann Tuli: What MODs can learn from Journals – a GigaDB curator’s perspec...
Mary Ann Tuli: What MODs can learn from Journals – a GigaDB curator’s perspec...
 

Último

WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure servicePooja Nehwal
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdfhans926745
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 

Último (20)

WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure serviceWhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
WhatsApp 9892124323 ✓Call Girls In Kalyan ( Mumbai ) secure service
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
Transcript: #StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 

Hans-Joachim Ruscheweyh: Pooling Metagenomes in MEGAN Based on Environmental Parameters

  • 1. Introduction MEGAN Metadata Pooling Datasets Summary & Conclusion Pooling metagenomes in MEGAN based on environmental parameters Hans-Joachim Ruscheweyh Center for Bioinformatics, Tuebingen University June 15, 2011 1 / 27 Hans-Joachim Ruscheweyh Pooling metagenomes
  • 2. Outline:
      1 Introduction Metagenomics: Unculturable Microbes; Typical Metagenomic Samples; Pipeline
      2 MEGAN: MEGAN Introduction; Taxonomic & Functional Analysis; Comparison Analysis; PostgreSQL
      3 Metadata: What is Metadata?; Using Metadata to pool Datasets
      4 Pooling Datasets: Basic Idea; Combined Datasets; MetaData Analyzer
      5 Summary & Conclusion
  • 4. Metagenomics: The study of the DNA of uncultured organisms. More than 99% of all microbes cannot be cultured. A genome is the entire genetic information of a single organism. A metagenome is the entire genetic information of an assemblage of organisms.
  • 5. Typical Metagenomic Samples: human microbiome, soil samples, sea water samples, seabed samples, air samples, medical samples, ancient bones.
  • 6. Metagenomic Pipeline. A primer on metagenomics; Wooley et al. (2010)
  • 8. MEGAN Introduction: Interactive tool for metagenomic analysis - www-ab.informatik.uni-tuebingen.de/software/megan
  • 9. Taxonomic Analysis: The tree reflects the NCBI taxonomy. Reads are compared against a reference database, e.g. NR. Reads are mapped onto the tree using the comparison results, based on the LCA algorithm.
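The LCA-based mapping on slide 9 can be illustrated with a minimal sketch. It assumes the plain lowest-common-ancestor rule: a read is placed on the lowest tree node that is an ancestor of every taxon it matched. The toy taxonomy, the `PARENT` map and all function names are hypothetical, and this is not MEGAN's actual implementation.

```python
# Toy taxonomy as a child -> parent map (hypothetical, not the NCBI taxonomy).
PARENT = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Escherichia": "Proteobacteria",
    "Salmonella": "Proteobacteria",
    "Firmicutes": "Bacteria",
    "Bacillus": "Firmicutes",
}


def path_to_root(taxon):
    """Return the taxa from `taxon` up to the root, inclusive."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = PARENT[taxon]
    return path


def lca(taxa):
    """Lowest common ancestor of a non-empty collection of taxa."""
    taxa = list(taxa)
    common = set(path_to_root(taxa[0]))
    for taxon in taxa[1:]:
        common &= set(path_to_root(taxon))
    # Walking up from the first taxon, the first shared node is the LCA.
    for node in path_to_root(taxa[0]):
        if node in common:
            return node
    raise ValueError("taxa do not share a root")


def assign_reads(hits_per_read):
    """Map each read to the LCA of the taxa hit by its matches."""
    return {read: lca(taxa) for read, taxa in hits_per_read.items()}


# Invented example hits: read -> taxa of its database matches.
hits = {
    "read1": ["Escherichia"],
    "read2": ["Escherichia", "Salmonella"],
    "read3": ["Escherichia", "Bacillus"],
}
assignment = assign_reads(hits)
```

A read with a single, specific hit stays at that taxon, while a read whose hits span several taxa moves up the tree to their shared ancestor.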
  • 10. Functional Analysis - SEED: The tree contains the nodes of the SEED classification. Reads are mapped onto the SEED classification. www.theSEED.org
  • 11. Functional Analysis - KEGG. KEGG: Kanehisa et al., Nucleic Acids Res. 38, D355-D360 (2010) http://www.genome.jp/kegg/
  • 12. Comparing Datasets: The comparison is based on the (normalized) number of reads assigned to each node. Each color represents a dataset.
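The slide does not say how read counts are normalized. As a sketch under the assumption that each dataset is scaled down to the size of the smallest one, a comparison could be prepared as follows. The function name and the counts are invented for illustration.

```python
def normalize(counts_per_dataset):
    """Scale per-node read counts so that every dataset sums to the same total.

    Assumption: the common total is the size of the smallest dataset.
    """
    totals = {
        name: sum(counts.values())
        for name, counts in counts_per_dataset.items()
    }
    target = min(totals.values())
    return {
        name: {
            node: count * target / totals[name]
            for node, count in counts.items()
        }
        for name, counts in counts_per_dataset.items()
    }


# Invented read counts per taxonomy node.
counts = {
    "January_2PM": {"Bacteria": 600, "Archaea": 200},
    "August_4AM": {"Bacteria": 300, "Archaea": 100},
}
normalized = normalize(counts)
```

After scaling, the two datasets show identical profiles here, although one has twice as many reads as the other. This is the kind of size effect that normalization is meant to remove before datasets are compared.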
  • 13. DB Extension - PostgreSQL: MEGAN communicates with a PostgreSQL database. Many datasets are available in one database instance. Many users can operate on the same database instance. This avoids redundant copies of datasets, which are often large. http://www.postgresql.org/
  • 15. What is Metadata? Metadata are, for example, environmental parameters recorded together with the actual metagenomic sample, e.g. collection date, gender, health status, ...

        Dataset        Month     Salinity   Ammonia
        January_2PM    January   33.3       0.0
        January_10PM   January   34.2       0.0
        August_4AM     August    33.3       0.14
        August_10AM    August    32.1       0.06

      Datasets taken from: The taxonomic and functional diversity of microbes at a temperate coastal site: a 'multi-omic' study of the seasonal and diel temporal variation; Gilbert et al. (2010)
  • 16. Month ∈ {Dec, Jan, Feb} → Winter: January_2PM, January_10PM. Month ∈ {Jun, Jul, Aug} → Summer: August_4AM, August_10AM.
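The grouping on slide 16 amounts to evaluating one boolean expression per pool against each dataset's metadata. The sketch below uses the rows of the metadata table on slide 15. The names `METADATA`, `POOLS` and `pool_datasets` are hypothetical, and full month names are used because that is how the table spells them.

```python
# Metadata rows taken from the table on slide 15.
METADATA = {
    "January_2PM": {"Month": "January", "Salinity": 33.3, "Ammonia": 0.0},
    "January_10PM": {"Month": "January", "Salinity": 34.2, "Ammonia": 0.0},
    "August_4AM": {"Month": "August", "Salinity": 33.3, "Ammonia": 0.14},
    "August_10AM": {"Month": "August", "Salinity": 32.1, "Ammonia": 0.06},
}

# One boolean predicate per pool (hypothetical representation).
POOLS = {
    "Winter": lambda row: row["Month"] in {"December", "January", "February"},
    "Summer": lambda row: row["Month"] in {"June", "July", "August"},
}


def pool_datasets(metadata, pools):
    """Return, for each pool, the datasets whose metadata satisfy its predicate."""
    return {
        pool: [name for name, row in metadata.items() if predicate(row)]
        for pool, predicate in pools.items()
    }


pooled = pool_datasets(METADATA, POOLS)
```

Any other recorded parameter can be used in the same way. For example, a predicate such as `lambda row: row["Salinity"] > 33.0` would pool the samples by salinity instead of season.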
  • 18. Basic Idea: Create two new datasets (winter, summer) from the four BLAST files. Problems: this doubles space consumption and is time-inefficient. Idea: use database technology to avoid redundancy and to save time and space.
  • 19. Primary & Combined Datasets in the Database: A primary dataset is a dataset created from the original BLAST output and the reads file. A combined dataset is created from primary datasets, using references to the read and match data of the primary datasets and, optionally, also the classification data of the primary datasets. Hence, a combined dataset can be created in a time- and space-efficient way.
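The slides do not show the database schema, so the following is only an in-memory sketch of the idea of building a combined dataset from references rather than copies. The class names and the read identifiers are hypothetical.

```python
from dataclasses import dataclass, field
from itertools import chain


@dataclass
class PrimaryDataset:
    """Holds the actual data; `reads` stands in for read and match data."""
    name: str
    reads: list


@dataclass
class CombinedDataset:
    """Holds only references to primary datasets, never copies of their data."""
    name: str
    members: list = field(default_factory=list)

    def reads(self):
        # Iterate lazily over the members' data instead of duplicating it.
        return chain.from_iterable(p.reads for p in self.members)

    def num_reads(self):
        return sum(len(p.reads) for p in self.members)


jan_2pm = PrimaryDataset("January_2PM", ["r1", "r2"])
jan_10pm = PrimaryDataset("January_10PM", ["r3"])
winter = CombinedDataset("Winter", [jan_2pm, jan_10pm])
winter_reads = list(winter.reads())
```

Because `winter` stores two references and nothing else, creating it costs almost no time or space, whatever the size of the primary datasets.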
  • 20. Creating Combined Datasets in MEGAN
  • 26. Analysis: Input: 8 primary datasets, altogether ~100,000 reads, ~4 million matches and ~4.5 GB of space. It takes ~50 minutes to load these datasets into the database. Three combined datasets (winter, spring, summer) are created. Their creation takes ~30 seconds and needs ~40 MB of additional space. Alternatively, combined datasets can be created on the fly. This takes less than a second and needs no additional space.
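One plausible reading of the two variants on slide 26 is the difference between a stored result and a query evaluated on demand. The sketch below illustrates that difference with Python's built-in `sqlite3` module as a stand-in for PostgreSQL. The table layout, the names and the rows are invented, and nothing here reflects MEGAN's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE reads (dataset TEXT, read_id TEXT, taxon TEXT);
INSERT INTO reads VALUES
  ('January_2PM',  'r1', 'Bacteria'),
  ('January_10PM', 'r2', 'Archaea'),
  ('August_4AM',   'r3', 'Bacteria');

-- Stored variant: the pooled counts are written to a new table.
CREATE TABLE winter_stored AS
  SELECT taxon, COUNT(*) AS n FROM reads
  WHERE dataset IN ('January_2PM', 'January_10PM')
  GROUP BY taxon;

-- On-the-fly variant: a view stores only the query, not the rows.
CREATE VIEW winter_view AS
  SELECT taxon, COUNT(*) AS n FROM reads
  WHERE dataset IN ('January_2PM', 'January_10PM')
  GROUP BY taxon;
""")

query = "SELECT taxon, n FROM {} ORDER BY taxon"
stored = conn.execute(query.format("winter_stored")).fetchall()
on_the_fly = conn.execute(query.format("winter_view")).fetchall()

# Adding a read to a primary dataset changes the view but not the stored table.
conn.execute("INSERT INTO reads VALUES ('January_2PM', 'r4', 'Bacteria')")
stored_after = conn.execute(query.format("winter_stored")).fetchall()
view_after = conn.execute(query.format("winter_view")).fetchall()
```

The stored table costs extra space and has to be built once, while the view costs nothing to create and is recomputed from the primary data whenever it is queried.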
  • 27. Comparing all Datasets
  • 28. Comparing by Season
  • 32. Summary & Conclusion: MEGAN communicates with a PostgreSQL database. This gives the user access to many datasets. Many users can work on the database simultaneously. Primary datasets can be pooled to create combined datasets. The MetaData Analyzer allows one to create combined datasets by applying boolean expressions to the assigned metadata. This technique is highly space- and time-efficient.
  • 33. MEGAN v4 is freely available from www-ab.informatik.uni-tuebingen.de/software/megan. Integrative analysis of environmental sequences using MEGAN4, Daniel H. Huson, Suparna Mitra, Hans-Joachim Ruscheweyh, Nico Weber, Stephan C. Schuster; submitted 2011. Thanks go to Daniel Huson, Suparna Mitra, Nico Weber and Stephan Schuster. Thank you for your attention!